General Feedback - FHIR Search: Challenges with Deep Chaining: Tagged Patients pattern #84

bwalsh · 2024-12-05T16:31:37Z

The Challenge of Deep Chaining in FHIR Searches

FHIR’s RESTful API provides powerful mechanisms like chaining and reverse chaining for searching across interconnected resources. However, deep chaining, i.e. searching through multiple levels of relationships (e.g., ResearchStudy → ResearchSubject → Patient → Specimen), often runs into practical and architectural limitations.

The Issue with Deep Chaining

Performance Concerns:
- Deeply chained searches require traversing multiple resource relationships, which can involve significant database joins or recursive queries. This impacts query performance, especially in large datasets with complex relationships.
Limited Server Support:
- Not all FHIR servers support chaining beyond one or two levels, leaving users unable to construct queries that navigate through deeply nested relationships.
Ambiguous Results:
- Complex chaining can result in ambiguous or overly broad results, requiring additional client-side filtering, which negates the efficiency of server-side processing.
Usability Challenges:
- Querying deeply nested relationships is not intuitive for users. Writing and debugging such queries can be cumbersome, especially without robust documentation or testing tools.

Example Problem

A researcher may want to count all Specimens tied to a ResearchStudy. This involves the following chain:

ResearchStudy → ResearchSubject → Patient → Specimen

A direct query like this is not supported by many FHIR servers due to depth limitations.
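To make the chain concrete, here is a minimal Python sketch that walks ResearchStudy → ResearchSubject → Patient → Specimen over a handful of in-memory resources. The resources, ids, and the `count_specimens_for_study` helper are invented for illustration, and the reference element names (`study`, `subject`) are assumed to follow R5, so they may differ on other FHIR versions. Each hop corresponds to one join a server would have to perform.

```python
# Toy resources; ids and contents are made up for illustration only.
research_subjects = [
    {"resourceType": "ResearchSubject", "id": "rs1",
     "study": {"reference": "ResearchStudy/study-a"},
     "subject": {"reference": "Patient/p1"}},
    {"resourceType": "ResearchSubject", "id": "rs2",
     "study": {"reference": "ResearchStudy/study-a"},
     "subject": {"reference": "Patient/p2"}},
    {"resourceType": "ResearchSubject", "id": "rs3",
     "study": {"reference": "ResearchStudy/study-b"},
     "subject": {"reference": "Patient/p3"}},
]
specimens = [
    {"resourceType": "Specimen", "id": "s1", "subject": {"reference": "Patient/p1"}},
    {"resourceType": "Specimen", "id": "s2", "subject": {"reference": "Patient/p1"}},
    {"resourceType": "Specimen", "id": "s3", "subject": {"reference": "Patient/p2"}},
    {"resourceType": "Specimen", "id": "s4", "subject": {"reference": "Patient/p3"}},
]


def count_specimens_for_study(study_ref, research_subjects, specimens):
    """Count specimens whose subject is enrolled in the given study."""
    # Hop 1: ResearchStudy -> ResearchSubject -> Patient
    patient_refs = {
        rs["subject"]["reference"]
        for rs in research_subjects
        if rs["study"]["reference"] == study_ref
    }
    # Hop 2: Patient -> Specimen
    return sum(1 for sp in specimens if sp["subject"]["reference"] in patient_refs)


print(count_specimens_for_study("ResearchStudy/study-a", research_subjects, specimens))  # 3
```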

A Potential Workaround: Tagging Patients

To address this, we can leverage tags or extensions on Patient resources to simplify queries:

Tagging Patients:
- Tag or extend Patient resources with metadata indicating their association with a specific ResearchStudy.
- Example: Add a ResearchStudy identifier as a tag or extension to all related Patient resources.
Simplified Query:
- Instead of deep chaining, filter on the Patient's tag through a single-level chain from Specimen to its subject (Specimen references Patient via `subject`, so this is a forward chain rather than `_has`):
```
GET [base]/Specimen?subject:Patient._tag=ResearchStudy/[study-identifier]
```
Benefits:
- Simplifies queries by reducing chaining depth.
- Improves performance since fewer resource relationships need to be traversed.
- Provides flexibility for analytics use cases without requiring extensive server-side changes.
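As a rough illustration, the tagged-Patient lookup can be built as a one-level forward chain on `Specimen.subject`. The sketch below only constructs the URL; the base URL, tag system, and the `tagged_specimen_query` helper are hypothetical, and whether a given server supports chaining on `_tag` is an assumption that would need checking.

```python
from urllib.parse import urlencode


def tagged_specimen_query(base, tag_system, tag_code, count_only=True):
    """Build a Specimen search that filters on a tag carried by the subject Patient."""
    # _tag is a token parameter, so the value takes the form system|code.
    params = [("subject:Patient._tag", f"{tag_system}|{tag_code}")]
    if count_only:
        params.append(("_summary", "count"))
    # Keep ':' and '|' readable; everything else is percent-encoded.
    return f"{base}/Specimen?{urlencode(params, safe=':|')}"


url = tagged_specimen_query(
    "https://example.org/fhir",            # hypothetical server
    "http://example.org/fhir/study-tags",  # hypothetical tag system
    "TCGA-KIRC",
)
print(url)
```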

Considerations for Tagging

Governance: Establish clear guidelines for tagging to maintain consistency across resources.
Scalability: Ensure that tagging does not introduce additional performance overhead, data redundancy or ETL complexity.

By strategically tagging or extending resources, FHIR implementers can address the limitations of deep chaining, enabling efficient and effective data queries while maintaining compliance with the FHIR standard.


bwalsh · 2024-12-05T16:33:08Z

Example:

Count a cohort from Patient->ResearchSubject->ResearchStudy:

```
# get count of patients in a study
curl -s "$FHIR_BASE/Patient?_has:ResearchSubject:subject:study.identifier=TCGA-KIRC&_summary=count"
# returns "total": 537
```

However, extending the query to Specimen->Patient->ResearchSubject->ResearchStudy seems to be unsupported:

```
curl -v -s "$FHIR_BASE/Specimen?subject=Patient:_has:ResearchSubject:subject:study.identifier=TCGA-KIRC&_total=accurate"
# returns "total": 0
# no errors/warnings in server logs
```

Your mileage may vary, but I see many examples in the spec for queries of the form:

[parameter]=[type]/[id]

However, I can't see any examples in the spec for:

[parameter]=[type]:_has
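For comparison, the reverse-chaining section of the search spec does appear to show `_has` used inside a chain, in the form `Encounter?patient._has:Group:member:_id=102`, where `_has` follows a `.` on the parameter name rather than `=[type]:` in the value. By analogy, the Specimen query might be written as sketched below. This is untested: whether any particular server accepts it is an assumption, and `specimen_count_url` is a hypothetical helper that only builds the URL.

```python
from urllib.parse import quote


def specimen_count_url(base, study_identifier):
    """Build a Specimen count query that chains into a reverse chain (untested form)."""
    # Forward chain Specimen.subject -> Patient, then reverse chain from Patient
    # to the ResearchSubject that points at it, then chain to the study identifier.
    name = "subject:Patient._has:ResearchSubject:subject:study.identifier"
    return f"{base}/Specimen?{name}={quote(study_identifier, safe='')}&_summary=count"


print(specimen_count_url("https://example.org/fhir", "TCGA-KIRC"))
```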

bwalsh · 2024-12-05T16:36:19Z

@teslajoy Can you review and comment?
@RobertJCarroll FYI

teslajoy · 2024-12-06T20:54:49Z

LGTM 👍 confirming, don't see :_has

RadixSeven · 2024-12-10T17:32:21Z

Tags are a poor choice because of their odd update semantics

An extension and a supported search parameter are better for linking to other resources than tags. This is because update operations for tags do not allow removing tags by default. Instead, the result is the union of old and new tags. I have also heard of servers that resurrect tags on a resource when it is deleted and re-created. The same problem exists for security labels. This is rarely an issue since security labels seldom change. Still, it is serious if you need to change them since it has implications for maintaining subject consent and privacy requirements.

If you have access to your server's source code, the standard allows you to deviate from this default behavior, but we wouldn't want to limit the NCPI IG only to servers that users could customize in this way.
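The union behavior is easy to model. The sketch below is a toy simulation of the default update semantics described above, not server code; `update_meta_tags` and the tag system are made up for illustration.

```python
def update_meta_tags(stored_tags, submitted_tags):
    """Model the default update behaviour: the result is the union of old and new tags."""
    merged = list(stored_tags)
    for tag in submitted_tags:
        if tag not in merged:
            merged.append(tag)
    return merged


# Tags as (system, code) pairs; the system URL is hypothetical.
stored = [("http://example.org/study", "TCGA-KIRC")]
submitted = [("http://example.org/study", "TCGA-BRCA")]

result = update_meta_tags(stored, submitted)
# The old study tag survives even though the update no longer carries it.
print(result)
```

Under these semantics, moving a Patient from one study tag to another by a plain update leaves it tagged with both, which is the failure mode for a tag-based cohort count.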

Use extensions

Instead, I would want custom extensions. For example, one experimental internal system we set up has a "partOfStudy" extension of type "Reference(ResearchStudy)" and an associated search parameter. This can be searched directly or can be used for chaining. We put this on all resources derived from a particular study. This would reduce the example query to {base}/Specimen?part-of-study=ResearchStudy/TCGA-KIRC&_total=accurate - requiring no joins.
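A minimal sketch of what that could look like on the data side, assuming a hypothetical extension URL and the `part-of-study` search parameter code from the query above. The helper names are invented, and the real extension definition used in the internal system is not shown in this thread.

```python
# Hypothetical canonical URL for the partOfStudy extension.
PART_OF_STUDY_URL = "http://example.org/fhir/StructureDefinition/part-of-study"


def with_part_of_study(resource, study_id):
    """Return a copy of the resource with a partOfStudy extension appended."""
    extension = {
        "url": PART_OF_STUDY_URL,
        "valueReference": {"reference": f"ResearchStudy/{study_id}"},
    }
    return {**resource, "extension": [*resource.get("extension", []), extension]}


def part_of_study_query(base, resource_type, study_id):
    """Build the direct (no chaining) search for resources derived from a study."""
    return f"{base}/{resource_type}?part-of-study=ResearchStudy/{study_id}&_total=accurate"


specimen = with_part_of_study({"resourceType": "Specimen", "id": "s1"}, "TCGA-KIRC")
print(specimen["extension"][0]["valueReference"]["reference"])
print(part_of_study_query("https://example.org/fhir", "Specimen", "TCGA-KIRC"))
```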

Governance

Using extensions also helps ensure good governance over the available "tags" - since we'd define an extension for each one. This is something commonly specified in an IG, so users would know exactly where to look and whether the IG handles a case or not. However, governance is not unique to this solution. One can also limit tags in an IG by sub-classing the Meta element and using the appropriate subclass in your IG-specialized resources.

A note on chaining

Some servers have limited reverse chaining (_has) and forward chaining (.) implementations. In particular, Google Cloud Healthcare limits the number of results to 100 joined resources. So, we should put less weight on chaining for essential use cases.

bwalsh · 2024-12-10T18:56:04Z

> This would reduce the example query to {base}/Specimen?part-of-study=ResearchStudy/TCGA-KIRC&_total=accurate - requiring no joins

@RadixSeven Thanks Eric. Agreed, you had me at requiring no joins. Anything that will simplify querying for downstream analysts.

Question: When you set this up, did you need to create a SearchParameter?

Something like:

```
{
  "resourceType": "SearchParameter",
  "id": "patient-partOfStudy",
  "url": "http://example.org/fhir/SearchParameter/patient-partOfStudy",
  "version": "1.0.0",
  "name": "PartOfStudy",
  "status": "active",
  "publisher": "Example Organization",
  "description": "Search for Patients who are part of a specific ResearchStudy.",
  "code": "partOfStudy",
  "base": ["Patient"],
  "type": "reference",
  "expression": "Patient.extension.where(url='http://example.org/fhir/StructureDefinition/patient-partOfStudy').value.ofType(Reference)",
  "target": ["ResearchStudy"]
}
```

RadixSeven · 2024-12-11T14:09:46Z

> Question: When you set this up, did you need to create a SearchParameter?

Yes. We created several SearchParameter resources. If you'd like more details, I'll need to check with the developer who did the work.
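Since the extension goes on every resource type derived from a study, one plausible way to end up with "several SearchParameter resources" is to generate one per base type. The sketch below is a guess at that shape, not the actual setup: the URLs, ids, and `part_of_study_search_parameter` helper are hypothetical. Note the expression uses `value` rather than `valueReference`, since FHIRPath addresses choice elements by their base name.

```python
import json

# Hypothetical canonical URL for the partOfStudy extension.
EXT_URL = "http://example.org/fhir/StructureDefinition/partOfStudy"


def part_of_study_search_parameter(resource_type):
    """Build a SearchParameter indexing the partOfStudy extension on one resource type."""
    slug = f"{resource_type.lower()}-part-of-study"
    return {
        "resourceType": "SearchParameter",
        "id": slug,
        "url": f"http://example.org/fhir/SearchParameter/{slug}",
        "name": f"{resource_type}PartOfStudy",
        "status": "active",
        "description": f"Search for {resource_type} resources that are part of a specific ResearchStudy.",
        "code": "part-of-study",
        "base": [resource_type],
        "type": "reference",
        "expression": f"{resource_type}.extension.where(url='{EXT_URL}').value.ofType(Reference)",
        "target": ["ResearchStudy"],
    }


for resource_type in ("Patient", "Specimen"):
    print(json.dumps(part_of_study_search_parameter(resource_type), indent=2))
```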

bwalsh assigned JamedFV Dec 5, 2024

bwalsh changed the title ~~General Feedback - FHIR Search: Challenges with Deep Chaining: Tag Patients~~ General Feedback - FHIR Search: Challenges with Deep Chaining: Tagged Patients pattern Dec 5, 2024

bwalsh self-assigned this Dec 5, 2024

bwalsh mentioned this issue Jan 19, 2025

Feature/condition graph FHIR-Aggregator/fhir-query#7

Merged

JamedFV added this to FHIR IG v2 Feedback and Development Jan 23, 2025
