Feature #3024 and #3030 Series-Analysis GRAD #3036

Merged
merged 24 commits into from
Jan 16, 2025


JohnHalleyGotway (Collaborator) commented Dec 12, 2024

This pull request implements the enhancements described in issues MET#3024 and MET#3030. I originally made the changes for MET#3024 on a branch named feature_3024_GRAD and then created this new feature_3030_series_analysis_GRAD branch from it. I am combining them into one PR to make the review process more efficient.
This PR includes all the following changes:

For MET#3024:

  • Adds 4 new columns to the GRAD line type written by Grid-Stat: FGMAG, OGMAG, MAG_RMSE, LAPLACE_RMSE
  • Updates Stat-Analysis to parse the new columns when reading GRAD lines.
  • Updates the documentation:
    • Adds 4 new rows to the GRAD line type table in the Grid-Stat chapter.
    • Adds equations to Appendix C to define their computation.
    • Adds reference to the DRAFT PAPER about sharpness (Please advise if/how this reference should be updated!).
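
To make the new magnitude columns concrete, here is a minimal sketch of how FGMAG, OGMAG, and MAG_RMSE could be computed. It is an assumption based on the column names, not the Appendix C equations or the MET source: it takes gradients as forward differences over offsets dx and dy, treats FGMAG and OGMAG as mean gradient magnitudes, and MAG_RMSE as the root-mean-squared difference of those magnitudes. The function names are hypothetical, `None` stands in for missing data, and LAPLACE_RMSE is left out.

```python
import math


def gradient_magnitudes(grid, dx=1, dy=1):
    """Forward-difference gradient magnitude at each point with valid neighbors.

    Assumption: the gradient is the difference between a point and its
    neighbors dx columns and dy rows away. None marks missing data.
    """
    rows, cols = len(grid), len(grid[0])
    mags = []
    for j in range(rows - dy):
        for i in range(cols - dx):
            here, x_nbr, y_nbr = grid[j][i], grid[j][i + dx], grid[j + dy][i]
            if None in (here, x_nbr, y_nbr):
                mags.append(None)  # propagate missing data
            else:
                mags.append(math.hypot(x_nbr - here, y_nbr - here))
    return mags


def magnitude_stats(fcst, obs, dx=1, dy=1):
    """Return assumed FGMAG, OGMAG, and MAG_RMSE over points valid in both fields."""
    pairs = [
        (f, o)
        for f, o in zip(gradient_magnitudes(fcst, dx, dy),
                        gradient_magnitudes(obs, dx, dy))
        if f is not None and o is not None
    ]
    if not pairs:
        return None  # no valid pairs, so no statistics
    n = len(pairs)
    return {
        "TOTAL": n,
        "FGMAG": sum(f for f, _ in pairs) / n,
        "OGMAG": sum(o for _, o in pairs) / n,
        "MAG_RMSE": math.sqrt(sum((f - o) ** 2 for f, o in pairs) / n),
    }
```

Calling `magnitude_stats(fcst, obs, dx=3, dy=3)` would give the same statistics for a wider gradient. Points with missing data in either field are dropped from both means, which is one reasonable choice but may differ from what MET does.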

For MET#3030:

  • Adds new gradient dictionary and output_stats.grad entry to the default Series-Analysis config file.
  • Updates all Series-Analysis config files with these changes.
  • Updates logic of Series-Analysis to compute gradient statistics for each gradient requested in the gradient dictionary.
  • Updates the documentation:
    • Moves description of the gradient dictionary from the Grid-Stat chapter to the "common config entries" chapter.
    • Notes that output_stats.grad can be set to "ALL" to facilitate aggregation across multiple runs.
  • Updates the testing by re-configuring an existing config test for precip to request that 2 gradients (sizes 1 and 3) be computed. Note that gradient stats aren't all that meaningful for precip, but the missing data and 0 values make it really good for software testing.
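
For illustration, the Series-Analysis settings described above might look roughly like the fragment below. This is a sketch only: it assumes the gradient dictionary keeps the same dx/dy layout it has in the Grid-Stat config and that output_stats.grad takes a list of column names like the other output_stats entries. The values mirror the reconfigured precip test (sizes 1 and 3).

```
// Hypothetical excerpt of a Series-Analysis config file
gradient = {
   dx = [ 1, 3 ];
   dy = [ 1, 3 ];
}

output_stats = {
   // other line type entries unchanged
   grad = [ "ALL" ];
}
```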

Expected Differences

  • Do these changes introduce new tools, command line arguments, or configuration file options? [Yes]

    If yes, please describe:

    In Series-Analysis config file, adds new gradient dictionary and output_stats.grad option.

  • Do these changes modify the structure of existing or add new output data types (e.g. statistic line types or NetCDF variables)? [Yes]

    If yes, please describe:

  • Adds 4 new columns (FGMAG, OGMAG, MAG_RMSE, LAPLACE_RMSE) to the end of the existing GRAD line type, written by Grid-Stat.

  • Enhances Series-Analysis to compute/write GRAD statistics to its NetCDF output.

Pull Request Testing

  • Describe testing already performed for these changes:

    Manually ran Grid-Stat to confirm the logic for computing GRAD stats in a single run, using the existing unit tests.
    Manually ran Series-Analysis to confirm the logic for GRAD stats in a single run, and when aggregating them across multiple runs.

  • Recommend testing for the reviewer(s) to perform, including the location of input datasets, and any additional instructions:

  • Several things:

    • Confirm with @bgbrowntollerud that the implementation in MET matches the logic described in the source paper and that the new equations in Appendix C are correct.
    • Inspect the differences flagged by the regression test for this PR to confirm that the modified output from Grid-Stat and Series-Analysis makes sense and that all differences are expected.
    • Review the documentation updates for clarity and accuracy.
  • Please find this feature branch compiled/available for testing on seneca in:

/d1/projects/MET/MET_pull_requests/met-12.1.0/beta1/MET-feature_3030_series_analysis_GRAD/bin
  • Do these changes include sufficient documentation updates, ensuring that no errors or warnings exist in the build of the documentation? [Yes]

  • Do these changes include sufficient testing updates? [Yes]
    Adds no new tests, but reconfigures existing ones, which causes differences in the output.

  • Will this PR result in changes to the MET test suite? [Yes]

    If yes, describe the new output and/or changes to the existing output:

  • 4 new columns added to all instances of the GRAD line type.

  • Modified output from 2 Series-Analysis runs that now include new gradient output variables.

Note that I inspected the differences flagged in this GHA testing workflow run. Differences exist in the following 9 files:

egrep -i "file1:|file2:|ERROR" comp_dir.log  | egrep -i -B 2 ERROR | grep file1 | cut -d':' -f2
 /data/output/met_test_truth/climatology_1.5deg/grid_stat_WMO_CLIMO_1.5DEG_240000L_20120410_000000V.stat
 /data/output/met_test_truth/grid_stat/grid_stat_GRIB1_NAM_STAGE4_120000L_20120409_120000V.stat
 /data/output/met_test_truth/grid_stat/grid_stat_GRIB1_NAM_STAGE4_120000L_20120409_120000V_grad.txt
 /data/output/met_test_truth/met_test_scripts/grid_stat/grid_stat_120000L_20050807_120000V.stat
 /data/output/met_test_truth/met_test_scripts/grid_stat/grid_stat_120000L_20050807_120000V_grad.txt
 /data/output/met_test_truth/met_test_scripts/stat_analysis/job_aggregate_GRAD.stat
 /data/output/met_test_truth/met_test_scripts/stat_analysis/stat_analysis.out
 /data/output/met_test_truth/series_analysis/series_analysis_AGGR_CMD_LINE_APCP_06_2012040900_to_2012041018.nc
 /data/output/met_test_truth/series_analysis/series_analysis_CMD_LINE_APCP_06_2012040900_to_2012041000.nc
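
For anyone who prefers to script this check, the pipeline above can be approximated in Python. This is a rough sketch: the function name is hypothetical, and the log layout (a `file1:` line and a `file2:` line followed by an `ERROR` line when the pair differs) is inferred from the pipeline itself rather than from the comparison tool's documentation.

```python
import re


def files_with_errors(log_text):
    """Approximate:
    egrep -i 'file1:|file2:|ERROR' | egrep -i -B 2 ERROR | grep file1 | cut -d':' -f2
    """
    keep = re.compile(r"file1:|file2:|ERROR", re.IGNORECASE)
    error = re.compile(r"ERROR", re.IGNORECASE)

    # Stage 1: keep only the file1/file2/ERROR lines.
    lines = [ln for ln in log_text.splitlines() if keep.search(ln)]

    # Stage 2: each ERROR line plus the two lines before it (-B 2), each line once.
    selected = set()
    for idx, ln in enumerate(lines):
        if error.search(ln):
            selected.update(range(max(0, idx - 2), idx + 1))

    # Stages 3 and 4: keep file1 lines and take the second ':'-delimited field.
    result = []
    for idx in sorted(selected):
        ln = lines[idx]
        if "file1" in ln:
            parts = ln.split(":")
            result.append(parts[1] if len(parts) > 1 else ln)
    return result
```

Like `cut -d':' -f2`, this keeps the leading space after the colon and would truncate any path that itself contains a colon.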
  • I used vimdiff on seneca to look through the diffs in all .txt and .stat files and confirmed that they're all due to the 4 new columns being added to the end of the GRAD line type.
  • For the NetCDF Series-Analysis output, I see that the existing TRUTH output has 50 gridded fields and the updated OUTPUT now has 78, with 28 being added by setting output_stats.grad = "ALL" in the config file. That is 14 columns in the GRAD line type (TOTAL ... LAPLACE_RMSE) x 2 gradients: dx,dy = (1,1) and (3,3).
> ncdump -h series_analysis/series_analysis_CMD_LINE_APCP_06_2012040900_to_2012041000_TRUTH.nc  | grep "float series" | wc -l
50
> ncdump -h series_analysis/series_analysis_CMD_LINE_APCP_06_2012040900_to_2012041000_OUTPUT.nc  | grep "float series" | wc -l
78
  • So all of these differences are consistent with the code changes for this PR.
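
The variable-count arithmetic above can be double-checked with a short script. This is a sketch under stated assumptions: the names between TOTAL and FGMAG are my recollection of the existing GRAD line type and should be verified against the Grid-Stat documentation, and `count_series_vars` is a hypothetical helper that mirrors `grep "float series" | wc -l` on the text of an `ncdump -h` header.

```python
# GRAD line type columns. The first and the last four names come from this PR;
# the ones in between are assumed from the existing GRAD line type.
GRAD_COLUMNS = [
    "TOTAL", "FGBAR", "OGBAR", "MGBAR", "EGBAR", "S1", "S1_OG",
    "FGOG_RATIO", "DX", "DY", "FGMAG", "OGMAG", "MAG_RMSE", "LAPLACE_RMSE",
]

# The two gradients requested in the reconfigured test: (dx, dy) pairs.
GRADIENTS = [(1, 1), (3, 3)]

TRUTH_FIELD_COUNT = 50  # gridded fields in the existing TRUTH file


def count_series_vars(header_text):
    """Count header lines containing 'float series', like grep | wc -l."""
    return sum(1 for line in header_text.splitlines() if "float series" in line)


added_fields = len(GRAD_COLUMNS) * len(GRADIENTS)
expected_fields = TRUTH_FIELD_COUNT + added_fields
print(f"added={added_fields} expected={expected_fields}")
```

Feeding the header text of the OUTPUT file to `count_series_vars` and comparing the result against `expected_fields` reproduces the manual check.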

  • Will this PR result in changes to existing METplus Use Cases? [Yes]

    If yes, create a new Update Truth METplus issue to describe them.
    The output from any METplus use case that writes the GRAD line type will also change.

  • Do these changes introduce new SonarQube findings? [No]

    If yes, please describe:
    The current develop branch flags 18,253 code smells overall.
    After making some changes to fix easy ones, I was able to reduce them in the feature_3030_series_analysis_GRAD branch down to 18,173 overall.

  • Please complete this pull request review by [Friday 1/17/25].

Pull Request Checklist

See the METplus Workflow for details.

  • Review the source issue metadata (required labels, projects, and milestone).
  • Complete the PR definition above.
  • Ensure the PR title matches the feature or bugfix branch name.
  • Define the PR metadata, as permissions allow.
    Select: Reviewer(s) and Development issue
    Select: Milestone as the version that will include these changes
    Select: Coordinated METplus-X.Y Support project for bugfix releases or MET-X.Y.Z Development project for official releases
  • After submitting the PR, select the ⚙️ icon in the Development section of the right hand sidebar. Search for the issue that this PR will close and select it, if it is not already selected.
  • After the PR is approved, merge your changes. If permissions do not allow this, request that the reviewer do the merge.
  • Close the linked issue and delete your feature or bugfix branch from GitHub.

…new columns to the existing GRAD line type.
…Stat to the common area and then referencing it in both Grid-Stat and Series-Analysis.
…dictionary and an entry for output_stats.gradient. Update the conf_info source code to parse them. Still need to update OTHER Series-Analysis config files and also update the logic in series_analysis.cc to compute GRAD statistics.
…ong_name attribute of the Series-Analysis output files.
…crementally across multiple runs. However, this can only be done when requesting that 'ALL' GRAD columns be written.
@JohnHalleyGotway JohnHalleyGotway requested review from j-opatz and removed request for KathrynNewman January 13, 2025 16:05
@JohnHalleyGotway JohnHalleyGotway added the "pull request: MODIFIES CONFIG" (changes that add new or modify existing configuration options) and "pull request: MODIFIES OUTPUT" (changes that add new or modify existing output formats) labels Jan 14, 2025
@@ -130,6 +130,12 @@ References
| a review and proposed framework. *Meteorological Applications*, 15, 51-64.
|
.. _Ebert-Uphoff-2024:

| Ebert-Uphoff, I.,, 2024: An Investigation of Metrics to Evaluate the Sharpness
Contributor

I wonder if this fits under "unpublished material" type. If so, that would mean it can be mentioned in-line with I. Ebert-Uphoff (2024, unpublished article) and is not included in the reference page.

However, if this is expected to be accepted soon, then I'd rather we just stick to a normal journal article citation (similar to how it currently is written) rather than trying to remember the name of the article later on.

Contributor

@j-opatz j-opatz left a comment

Minor updates that are listed as comments should be completed; I'm going to run these equations by Barb today in our meeting (if it's not cancelled) and approve/request changes based on her feedback.

@bgbrowntollerud

bgbrowntollerud commented Jan 15, 2025 via email

j-opatz previously approved these changes Jan 15, 2025
Contributor

@j-opatz j-opatz left a comment

After some syntax corrections and discussion of the implementation with Barb, it seems like this feature branch is approved. I noted a different way the draft article could be cited, but as we are not beholden to any specific citation notation, what's there is good. The differences cited by GHA also are within expectations and are from the new column headers (and new configuration file options).

Ultimately I don't have the data or the time to create a proper test for this new logic; I sincerely hope we can obtain timely feedback from whoever helped create this issue. Testing with their datasets would be the fastest way to check for coding accuracy.

@JohnHalleyGotway
Collaborator Author

Proceeding with the squash and merge. I note that I did have to resolve a minor conflict in the Grid-Stat user's guide that appeared after I submitted this PR.

@JohnHalleyGotway JohnHalleyGotway merged commit a6ed575 into develop Jan 16, 2025
12 of 13 checks passed
metplus-bot added a commit to dtcenter/METplus that referenced this pull request Jan 16, 2025
georgemccabe pushed a commit to dtcenter/METplus that referenced this pull request Jan 16, 2025