-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enabling calculations in between two related Measurements by adding a new criteria attribute #210
base: master
Are you sure you want to change the base?
Conversation
Added dateAdjust to print friendly and other corrections.
@@ -38,6 +45,34 @@ protected String getQueryTemplate() { | |||
return MEASUREMENT_TEMPLATE; | |||
} | |||
|
|||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be deleted
@@ -151,7 +151,9 @@ c.visitType??><#local temp>a visit occurrence that is: <@inputTypes.ConceptList | |||
measurement<#if isPlural && !(c.first!false)>s</#if> of ${utils.codesetName(c.codesetId!"", "any measurement")}<#if | |||
c.measurementSourceConcept??> (including ${utils.codesetName(c.measurementSourceConcept, "any measurement")} source concepts)</#if><#if | |||
c.first!false> for the first time in the person's history</#if><#if attrs?size gt 0>, ${attrs?join("; ")}</#if><#if | |||
c.CorrelatedCriteria??>; <@Group group=c.CorrelatedCriteria level=level indexLabel=utils.codesetName(c.codesetId!"", "any measurement") /><#else>.</#if></#macro> | |||
c.CorrelatedCriteria??>; <@Group group=c.CorrelatedCriteria level=level indexLabel=utils.codesetName(c.codesetId!"", "any measurement") /><#else>.</#if> | |||
<#if c.measurementOperand??>numeric value calculated as <@MeasurementOperand c=c.measurementOperand level=level+1 /></#if><#nested /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may need to enhance it so that the sentence is better readable
} | ||
|
||
protected abstract String getTableColumnForCriteriaColumn(CriteriaColumn column); | ||
|
||
protected String getAdditionalColumns(List<CriteriaColumn> columns) { | ||
protected String getAdditionalColumns(T criteria, List<CriteriaColumn> columns) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was the criteria parameter added to this function when it's not even used in it?
I wanted to offer some test case implementations but I noticed this is on a fork that I might not have access to push branches, so I'll see about requesting access (or if you could grant) and then I can push up PRs for your branch on your fork (I'd rather not fork-a-fork in order to collaborate). |
if (filteredColumns.size() > 0) { | ||
query = StringUtils.replace(query, "@additionalColumns", ", " + this.getAdditionalColumns(filteredColumns)); | ||
return StringUtils.replace(query, "@additionalColumns", ", " + this.getAdditionalColumns(criteria, filteredColumns)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe changing the 3 query assignments into 3 individual returns doesn't improve anything with the code, and makes it harder to debug when you just want to know what the query result that will be returned.
whereClauses.add(buildNumericRangeClause( | ||
MessageFormat.format("(C.value_as_number {0} SC.value_as_number)", criteria.measurementOperand.operator), | ||
criteria.measurementOperand.valueAsNumber, ".4f") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried that we embed the operator (ie the sql operator) as the string in the payload.
I feel like the operators should be a distinct list of supported operators, and based on the user selection we translate it into the sql operation.
And, it's not quite clear how this is used. The template is:
(C.value_as_number {0} SC.value_as_number)
So can the operator be >
or ==
?
Or is the operator supposed to be an arithmetic operation (+, - , / *)
Unfortunately, I can't find documentation on this feature about how it's supposed to be used and what functions it supports.
@JsonProperty("ValueAsNumber") | ||
public NumericRange valueAsNumber; | ||
@JsonProperty("SameVisit") | ||
public Boolean sameVisit = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this needs to be true by default. Most of the time, we default boolean values to false.
if (criteria.measurementOperand != null && criteria.measurementOperand.measurement != null) { | ||
BuilderOptions options = new BuilderOptions(); | ||
options.additionalColumns = Collections.singletonList(CriteriaColumn.VALUE_AS_NUMBER); | ||
String subquery = getCriteriaSql((T) criteria.measurementOperand.measurement, options); | ||
joinClauses.add(MessageFormat.format("JOIN (SELECT E.person_id, E.event_id, E.visit_occurrence_id, E.value_as_number, row_number() over (PARTITION BY E.person_id ORDER BY E.start_date, E.event_id {1}) as ordinal FROM (\n{0}\n) E) SC on C.person_id = SC.person_id", | ||
subquery, | ||
Optional.ofNullable(criteria.measurementOperand.limit).map(limit -> limit.equals("Last") ? "DESC" : "").orElse("") | ||
)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be time-bounding to looking for these measurements (either prior to or post the 'index event'). I believe you would want to specify this somehow, and maybe the extent of the prior/post time to select records from. I understand 'same visit' addresses some of this, but I don't think you can depend on Measurement domains to come from the same visit, especially if the idea is to compare 2 measurements that could be spread across different visits.
Hi @wivern and @alex-odysseus , Thanks very much for the efforts you put into this feature. As it stands, I think there's necessary additions that are going to be necessary in order to accept this PR: Time bounding measurementsAs I was writing tests, I imagined an index measurement and then an associated measurements being located before and after this index within a time window. So I put my target measurement in the data, and then scattered other measurements before and after at 1mo, 3mo, 6mo time. When i wrote my usecase to use the earliest event that came from 3/6/9 months prior to the measurement, I noticed there wasn't an attribute to define this. This meant that a associated measurements could appear anywhere in the patient history and I'm pretty sure this would never be useful. I think the 'same visit' function would narrow the context to find the correlated measurement but frequently there is no visit-to-measurement relationship in CDMs, so depending on that behavior is not a good strategy. So limiting finding measurements to a time horizon is necessary. Operators are free-form textI was wondering what sort of operations are allowed in the MeasurementOperands operator field, and it looks like it's going to inject any string the user inputs into the SQL. Sql injection attacks aside, we would want to define what the valid operators are (maybe through an enum) and map user choices to the available operator, and then build the correct sql statement based on that choice (as well as the correct print-friendly output) Measurements can refer to itselfConsider a patient with exactly 1 measurement, and you create the measurement criteria to look for latest other measurement and subtract the values. Should it use itself as the 'other' measurement in these operations? I'm thinking 'no'. Joins lead to multiple rows to count when used in the CriteriaGroupsThis might be hard to describe, but given the index measurement event, you want it to appear once for counting when you are dealing with criteria groups (which say 'having at least one or exactly 0'). So if you do this measurement calculation, you could return multiple rows (if you don't specify first or last). I should confirm if you must specify the first or last in the criteria, but assuming you can let multiple measurement records into the query, if you have N number of related measurements that meet the NumericRange criteria, then the index event will yield multiple times from the JOIN statement. I'm wondering if it would be more appropriate to use If it is the case that the first/last is requred such that it will only evaluate 1 record from the subquery and therefore never return more than 1 record, then I think the change here would be to add aggregated functions to the co-measurement query to find the AVG, MIN, MAX of the records so you can say things like 'what was the average measurements values compared to this index measurement's value' or 'what was the min value in history compared to the index measurement's value'. NamingI'm not sure if ConclusionThis is as far as I got on the review, so there still may be more to come (ie: the above is not a complete review). It would have really helped if the intention of how this functionality was supposted to work and what the tradeoffs / limitations were in implementation so that I would have a better understanding about what to look for and what to accept as limitations (such as it was perfectly fine to not limit associated measurements to time windows because the default behavior is to use same-visit). Thank you again for your hard work on this. Being able to make assertions about associated or correlated measurements is a powerful feature (such as looking for people who experienced weight gain (at least 30 lbs) could be defined as taking a weight measurement and determing if the min() weight in the past year - index weight > 30). We just need some minimum functionality before I would want to incorporate it into the library. |
MeasurementOperand attribute is possible to configure for a Measurement so that calculation formula will be possible to construct in between two related Measurements
If multiple related Measurement events are at place it is possible to specify that the earliest/latest event should be used, another configuration option is that Measurements should belong to the same Visit only
The formula result value is treated as a number then and compared to some value(s) of interest