MeasureQualifierCode definitions #358

Closed
6 tasks
cristinamullin opened this issue Nov 14, 2023 · 1 comment · Fixed by #369
Labels: Good First Issue, Usability

Comments

@cristinamullin
Collaborator

Is your feature request related to a problem? Please describe.

MeasureQualifierCodes are hard to interpret.

Describe the solution you'd like

Leverage the TADA WQXMeasureQualifierCodeRef.csv to add a definition for each MeasureQualifierCode. Use TADA_GetMeasureQualifierCodeRef() (https://usepa.github.io/TADA/reference/TADA_GetMeasureQualifierCodeRef.html) to get the reference table. I recommend creating a new TADA.MeasureQualifierCode column upon data retrieval, as part of autoclean, that concatenates each MeasureQualifierCode from the dataset with its associated description from the MeasureQualifierCodeRef.

New features should include all of the following work:

  • Create the function/code.

  • Document all code using comments to describe what it does.

  • Create tests in tests folder.

  • Create help file using roxygen2 above code.

  • Create working examples in help file (via roxygen2).

  • Add to appropriate vignette (or create new one).

cristinamullin added the Good First Issue and Usability labels Nov 14, 2023
hillarymarler self-assigned this Dec 15, 2023
@hillarymarler
Collaborator

hillarymarler commented Dec 18, 2023

This one has been a little trickier than I expected as some results have multiple qualifiers associated.

This is what I have working so far:
(1) Create concatenated MeasureQualifierCodes and descriptions using MeasureQualifierCodeRef.
(2) Separate and unnest any instances of multiple MeasureQualifierCodes per result.
(3) Merge the two by MeasureQualifierCode to add the concatenated descriptions to the data.frame of WQP data.
(4) Group by ResultIdentifier and use summarize() to return only one row per ResultIdentifier.
(5) Assign to TADA.MeasureQualifierCode in .data by ResultIdentifier.

I tried a few approaches, and this is the most efficient one I have come up with so far. It works, but there may be a more efficient option. Any suggestions for other strategies?

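  # Requires dplyr, tidyr, and stringr.
  # (1) Build a "Code - Description" string for each qualifier code in the reference table.
  mqc.ref <- utils::read.csv(system.file("extdata", "WQXMeasureQualifierCodeRef.csv", package = "TADA")) %>%
    select(Code, Description) %>%
    group_by(Code) %>%
    mutate(Concat = paste(Code, "-", Description, collapse = "")) %>%
    select(Code, Concat) %>%
    rename(MeasureQualifierCode = Code)

  # (2)-(4) Split multi-qualifier results into one row per code, attach the
  # concatenated descriptions, then collapse back to one row per ResultIdentifier.
  mqc.TADA <- .data %>%
    mutate(MeasureQualifierCode = str_split(MeasureQualifierCode, ";")) %>%
    unnest(MeasureQualifierCode) %>%
    merge(mqc.ref, by = "MeasureQualifierCode") %>%
    group_by(ResultIdentifier) %>%
    summarize(TADA.MeasureQualifierCode = paste(Concat, collapse = "; "))

  # (5) Assign the concatenated descriptions back to .data by ResultIdentifier.
  .data$TADA.MeasureQualifierCode <- mqc.TADA$TADA.MeasureQualifierCode[match(.data$ResultIdentifier, mqc.TADA$ResultIdentifier)]

  # Remove intermediate objects.
  rm(mqc.ref, mqc.TADA)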

This data set gives a few examples of results with multiple qualifiers:

MT_ex <- TADA_DataRetrieval(startDate = "2013-07-01",
                            endDate = "2013-07-20",
                            organization = "MTWTRSHD_WQX")
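
For a quick check that does not need a WQP download, steps (1) through (5) can also be run on a toy table. This is only a sketch under stated assumptions: the reference table, its descriptions, and the `toy` data frame are made up for illustration and are not taken from WQXMeasureQualifierCodeRef.csv. It also swaps in `dplyr::inner_join()` for `merge()`, since `inner_join()` keeps the row order of the left table.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical stand-in for WQXMeasureQualifierCodeRef.csv (descriptions are invented).
mqc.ref <- data.frame(
  Code = c("A", "B"),
  Description = c("first example qualifier", "second example qualifier")
) %>%
  mutate(Concat = paste(Code, "-", Description)) %>%
  select(MeasureQualifierCode = Code, Concat)

# Hypothetical stand-in for .data: two qualifiers, one qualifier, no qualifier.
toy <- data.frame(
  ResultIdentifier = c("R1", "R2", "R3"),
  MeasureQualifierCode = c("A;B", "B", NA)
)

# Split on ";", unnest to one row per code, join descriptions,
# then collapse back to one row per ResultIdentifier.
mqc.toy <- toy %>%
  mutate(MeasureQualifierCode = str_split(MeasureQualifierCode, ";")) %>%
  unnest(MeasureQualifierCode) %>%
  inner_join(mqc.ref, by = "MeasureQualifierCode") %>%
  group_by(ResultIdentifier) %>%
  summarize(TADA.MeasureQualifierCode = paste(Concat, collapse = "; "))

# Results without a matching qualifier code (R3 here) are left as NA.
toy$TADA.MeasureQualifierCode <- mqc.toy$TADA.MeasureQualifierCode[
  match(toy$ResultIdentifier, mqc.toy$ResultIdentifier)
]

print(toy)
```

In this toy case R1 ends up with both descriptions joined by "; ", R2 with one, and R3 with NA. `tidyr::separate_longer_delim()` (tidyr 1.3.0 or later) might be another way to do the split-and-unnest step, though it has not been tried against the real data here.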

hillarymarler linked a pull request Dec 20, 2023 that will close this issue