Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Feature: Support for datetime calculations in cohort definitions #2886

Open
TomWhite-MedStar opened this issue Sep 27, 2023 · 2 comments · May be fixed by OHDSI/circe-be#200 or #2984
Open

New Feature: Support for datetime calculations in cohort definitions #2886

TomWhite-MedStar opened this issue Sep 27, 2023 · 2 comments · May be fixed by OHDSI/circe-be#200 or #2984
Assignees
Labels

Comments

@TomWhite-MedStar
Copy link

Problem Statement

Multiple US electronic Clinical Quality Measure (eCQM) require datetime calculations when doing computations from electronic health records. Public domain examples are published here. The trend for needing datetime logic appears to be increasing.

In many cases, it is possible to create versions of those eCQM as Atlas cohorts, and then adapt the SQL to use datetime calculations instead of datetime. Although this may work for single institutions, it does not lend itself to developing these as phenotypes that can be run in a network study.

Examples:

  1. Hospital Harm - Severe Hyperglycemia. Checks for days when very high glucose measurements are present, but based upon rolling 24 hour periods starting from the time of ED or hospital admission (instead of calendar dates). Thus, the index date is actually a datetime.
  2. Hospital Harm - Severe Hypoglycemia. Checks whether follow-up glucose measurements occurred within 5 minutes after critically low measurements.
  3. Hospital Harm - Opiate-Related Adverse Events. Checks whether opioid antagonists had to be administered within 12 hours after administration of an opioid.

Current Behavior

The Atlas GUI for selecting timespans (e.g. between two dates) only supports whole-number date logic.

Desired behavior

Augment the Atlas GUI to allow for whole number datetime interval logic, with options to specify "seconds", "minutes", "hours", or "days" instead of only "days".

For simplicity to the users, the internal logic should know to use datetime interval logic instead of date logic whenever "seconds", "minutes", or "hours" are selected. This would eliminate the need for cohort authors to choose between "index start date" and "index start datetime".

For example, if you want to know that an event occurred within 24 hours of admission, you would use an interval of 24 hours (which would use datetime logic) rather than 1 day (which would use date logic).

Here is example where I'd want to use 24 hours:
image

And here is example where I'd want to use minutes:
image

Feature Request Scope

  1. Make the word "days" a drop-down box like "Before/After" than has choices {days, hours, minutes, seconds)
  2. Augment the JSON to support this. I presume you want backwards compatibility. So, perhaps keep the existing syntax, but modify it to add optional "TimeUnit" and "TimeUnitValue" fields, where "TimeUnit" can have values "Days", "Hours", "Minutes", "Seconds"; and "TimeUnitValue" would have the desired value (that is currently show in the "Days" field? Below is example of how such time windows are currently represented in JSON:
                "StartWindow": {
                  "Start": {
                    "Days": 9,
                    "Coeff": -1
                  },
                  "End": {
                    "Days": 1,
                    "Coeff": -1
                  },
  1. Augment the generated SQL to use time interval logic whenever a TimeUnit other than "Days" is selected. Retain the current logic when "Days" is selected. Note. I have opened a feature enhancement issue with SqlRender to see if are able to generate interval time translations across all supported databases. Per GPT-4, the translations looks pretty straight forward.
  2. Are augmentations to WebAPI needed?

Note, although not all data source have datetime-level data, the 5.3 OMOP data model and above all require datetime fields. By convention, those are populated with the same value that are in the matching data field. So, if this enhancement is added, phenotypes using seconds/minutes/hours should work (without breaking) on any 5.3 and above OMOP datasets. Data contributors would need to clarify which subset (if any) of their data has true datetime values, and Researchers would need to take that into consideration.

@TomWhite-MedStar
Copy link
Author

Note, SqlRender already supports DATEADD and some DATEDIFF calculations for non-day intervals. There is a pull request to add support for Spark and Oracle - and it should be possible to add support for other databases by simply updating the replacementPatterns.csv file (e.g. no R or Java changes needed).

So, the level of effort to modify the SQL calls may be relatively small. Any place that currently uses DATEADD(start,end) could change to DATEADD(interval,start,end) where interval in DAY, SECOND, MINUTE, HOUR, MONTH, YEAR. Similarly, DATEDIFF(start,end) would change to DATEDIFF(interval,start,end)

@anthonysena anthonysena added this to the v2.15 milestone Oct 3, 2023
@TomWhite-MedStar
Copy link
Author

Based upon what I'm seeing while implementing the hypoglycemia measure, there are a few more small tweaks that may be needed or desirable.

As context, I created cohorts using Atlas as-is, then edited the generated SQL as follows:

  1. Bulk find/replace all all _date to _datetime
  2. Then revert all observation_period_start_datetime and observation_period_end_datetime back to date values
  3. Change final insert from generated final_cohort to use date instead of datetime values.
  4. Change any examples of DATEADD() that required datetime to use interval logic. This was typically in the Inclusion_NN tables.

In my example, I want to know every day when the patient is an outlier for the quality measure -- so one record per day in the final cohort table. I did this by setting event persistence to be one day (offset = 0). However, the generated SQL uses DATEADD(start,0), which casts the value to date. This causes all of my records to drop except for those where the start datetime is exactly midnight.

So, additional modification needed:
(1) Allow time unit for Cohort Exit:
image
(2) Also allow time unit for Cohort Eras(?) (I don't use these yet, so not sure whether needed):
image

One last thing to consider is whether to use datetime values for cohort_start_date and cohort_end_date in the cohort table instead of just date. This would enable two capabilities:

  1. Know exact datetime of cohort entry - so can support sub-day time-to-event analyses
  2. Allow for multiple cohort entries for a given person on a given day. I'm already seeing the value of this for drug safety compliance events.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: 📋 Backlog
4 participants