Bugfix: METplus GridStat can't find OBS file using template and time range #2415
-
Describe the Problem
A METplus error occurs when this EVS/cam job is run on NCEP's WCOSS2 system in one environment (named "test_ecflow_env" for testing here). However, the same job succeeds when run in a different environment ("test_parallel_env"). In the first case, METplus is unable to find a file that exists when checked manually. Could you help us debug?
Attached are copies of (1) the stdout log for the EVS job that failed, (2) the METplus log file of the failed METplus run, and (3) two test scripts, described below and in the "To Reproduce" section. Here is an excerpt from the stdout log for the EVS job that included that METplus run:
The above excerpt is a section of our code that checks for the presence of an observation file that will be used in METplus, a check which was successful in this case. To confirm, a list command for the file showed that the file does exist on the system (WCOSS2/Dogwood). However, the METplus run that followed the excerpted code outputs an error. The associated METplus log is attached, but here is an excerpt:
The valid time in this case is 2023110715, so METplus should have been able to find the file. We also confirmed that (1) the observation file was already available before METplus was run, at about 11/08 18:15Z, and (2) the valid time stored in the observation file is 11/07/2023 15:00:38Z, which should have satisfied the matching condition in METplus. I've set up two test cases ("test_ecflow_env" and "test_parallel_env") with slightly different environments: one in which the error described here can be replicated, the other in which the same METplus run is successful. Does any part of our configuration jump out to you as a possible cause of the issue?
Expected Behavior
The observation file should satisfy the METplus check for files that match the template within the valid hour range (11/07/2023 15Z +/- 5 mins). That observation file should be used to complete the METplus run successfully (the forecast file having already been found). In other words, the "test_parallel_env" test results represent the expected behavior.
Environment
Describe your runtime environment:
To Reproduce
Describe the steps to reproduce the behavior:
If testing this on WCOSS2/Dogwood (currently the production machine) is not possible, or if you'd prefer that the test data be provided here, please let me know and I can ask someone to copy it over. Thanks
Relevant Deadlines
The relevant EVS code will be delivered at COB 11/17/2023.
Funding Source
NOAA/NCEP/EMC/VPPPG
Replies: 3 comments 6 replies
-
Hi, could you include a log file for a successful run? That will help track down the differences. Thanks,
-
After reviewing the files you provided and your notes, I don't see anything obviously incorrect with your configuration. I don't have access to Dogwood, and the only developer on our team who currently has access is not available to help with this until tomorrow afternoon. I have scheduled a meeting with them to take a closer look. If you are able to send over the log file for a successful run and the metplus_final.conf (defined by
There are a few things I noticed that I don't think are the cause of the failure but may point you in the right direction. The +/-5 minute window you have set only considers the filenames, not the actual time stored inside the files. In your example, the filename exactly matches the valid time, so a window is not necessary to find the
Your example has 0000 for the minutes/seconds, so this shouldn't be the reason that the file was not found.
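To illustrate the point about the window applying to filenames only, here is a minimal Python sketch. It is not METplus code: the function `files_in_window`, the `obs_%Y%m%d%H.nc` naming pattern, and the sample filenames are all hypothetical, and a plain `strptime` format stands in for a METplus filename template. It only shows that a file whose name carries the exact valid time is matched with a +/-300 second window and with no window at all, regardless of the timestamp stored inside the file.

```python
from datetime import datetime, timedelta


def files_in_window(filenames, name_format, valid_time, begin_s, end_s):
    """Return the filenames whose *filename* timestamp falls inside
    [valid_time + begin_s, valid_time + end_s], in seconds, inclusive.

    Only the name is inspected; the time stored inside a file
    (e.g. 15:00:38) plays no part in the match.
    """
    earliest = valid_time + timedelta(seconds=begin_s)
    latest = valid_time + timedelta(seconds=end_s)
    matches = []
    for name in filenames:
        try:
            stamp = datetime.strptime(name, name_format)
        except ValueError:
            continue  # name does not fit the (hypothetical) template
        if earliest <= stamp <= latest:
            matches.append(name)
    return matches


# Hypothetical directory listing and naming pattern, for illustration only.
names = ["obs_2023110714.nc", "obs_2023110715.nc", "notes.txt"]
fmt = "obs_%Y%m%d%H.nc"
valid = datetime(2023, 11, 7, 15)

windowed = files_in_window(names, fmt, valid, -300, 300)  # +/- 5 minutes
exact = files_in_window(names, fmt, valid, 0, 0)          # no window

print(windowed)
print(exact)
```

Both calls select the same file here, which is why dropping the window is a reasonable thing to try when the filename already matches the valid time exactly.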
-
Hi @MarcelCaron-NOAA,
I worked with a few other METplus developers for an hour to investigate this error. We were able to replicate the error using the scripts you provided, so thank you for making them easy to use. We had to process files from 20231105, as there were no files in the com directory past that date. In the time we had available, we were not able to determine why the use case fails to find the files. However, we were able to get it to work properly by removing the file window so that the exact file name is searched. To remove the file window, change these values in your METplus configuration file:
We may be able to continue debugging the cause of the issue on Tuesday of next week if you'd like, but I hope that changing the configuration settings will get you a successful run. Please let me know if that change fixes the issue and if you'd like us to continue investigating the failure next week.