Issues with run_neon with the --experiment flag starting in ctsm5.1.dev172
#2433
Thanks for pointing this out @wwieder. This must be in #2363, but because it's a refactoring step it's hard to point to the exact line. This is where adding more testing for all the important configurations run_neon is used in would be critical to prevent this type of thing. We test a few, but obviously not enough. So @TeaganKing and @wwieder, it would be good to sit down together and map out the important combinations of options you use for run_neon. Right now our testing is really minimal, and we also need to test interactions of command line options with each other (like in this case). Having this kind of comprehensive testing enables us to refactor and improve the design without losing important functionality. Rather than adding comprehensive testing up front, we've been incrementally adding testing as we go, which in this case led to this problem. In Agile SDM it's considered important to have the comprehensive testing up front so you know you can refactor without regressing functionality. So we might want to discuss what kind of balance we want between those two approaches as we move forward.
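As a starting point for mapping out those combinations, here is a minimal sketch of enumerating option pairs so each one can become a test case. Everything in it is an assumption for illustration: the run types are the ones mentioned in this thread, `my_experiment` is a placeholder value, and `option_matrix` is a hypothetical helper, not part of run_neon.

```python
from itertools import product

# Hypothetical option values for illustration only; the real list would
# come from the configurations people actually use with run_neon.
RUN_TYPES = ["ad", "postad", "transient"]
EXPERIMENTS = [None, "my_experiment"]


def option_matrix(run_types=RUN_TYPES, experiments=EXPERIMENTS):
    """Return every (run_type, experiment) combination as a dict of options."""
    return [
        {"run_type": run_type, "experiment": experiment}
        for run_type, experiment in product(run_types, experiments)
    ]


if __name__ == "__main__":
    # Each printed dict is one configuration a test could exercise.
    for options in option_matrix():
        print(options)
```

A test suite could loop over this list (or feed it to a parametrized test) so that interactions between options are covered, not just each option on its own.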
Thanks for catching this bug and making this issue. A few other notes I wanted to add are below.
A few more errors after trying to start postad runs
It seems like the run directory isn't being created in the new .postad case, so the initial conditions can't be copied and the jobs fail. Manually copying files into the run directory still doesn't pick up the right restart files (not sure what else isn't happening correctly...)
Thanks for these extra details, @wwieder!
I may have fixed the same failure in this test with a suggestion from @ekluzek: |
Hi @slevis-lmwg , Is there another stand-up other than the Tuesday 3pm one? @ekluzek , @wwieder , and I were planning to discuss testing strategies at 11am this morning. I think the PR mentioned above includes a fix that seems to be working, but if you have already implemented this and/or want to discuss alternative methods to fixing this issue, I'd be happy to chat! |
We have the CTSM software stand-up on Mondays at 10 am MT. The error message seems the same but our fixes are different. I think yours seems fine in the context of this code.
Okay, thanks. I have a different 10-11am meeting, but I might be able to join at the end of the meeting if we wrap up my other meeting early... Otherwise maybe Erik and Will can bring a summary of where we're at to the stand up, and then I can discuss with them a bit at 11am. Or, if you want to point me to your fix, it might be helpful to see what you did, too. Did you implement/merge this in already? |
I confirmed that @TeaganKing's fixes in #2435 address the --experiment flag issue. The issue with creating postad cases seems to be more related to the externals update in dev172; see #2437 (which should be addressed separately).
Thanks for testing this @wwieder ! I also did some tests on |
I am however running into some issues such as the following. @wwieder are you seeing similar results, as well? Command submitted: Error: |
Agreed, @TeaganKing, I'm getting this error too. A bit further up I'm also seeing 'File /glade/derecho/scratch/wwieder/neon_AK/RMNP.no_runtype_test.transient/LockedFiles/env_build.xml has been modified'. I'm not really sure where the calendar is getting set or changed, but maybe this is a clue?
In our meeting this morning we decided that, to make this problem less likely, @TeaganKing will do the following:
This would exercise the most important options to run_neon and give much better test coverage for them. Also, putting #2438 in place would help us with externals issues.
It looks like the calendar differences error is only occurring with the newest CIME checkout. I'm still a bit stumped as to why it's occurring. With the same command as above, I'm getting a different error with the previous CIME checkout: |
I don't seem to be able to check off items in your comment, but these are addressed in #2406 -- with the caveat that I think we discussed not actually testing prism and specifying 'ad' run in the second test. |
With ctsm5.2 work I ended up doing the python testing for recent versions of ctsm5.1. And I noticed that ctsm5.1.dev175 fails as follows (previous versions pass from dev171 to dev174). I think this might be helpful to @TeaganKing @wwieder and @slevis-lmwg. @slevis-lmwg this covers what we were discussing with the b4b-dev tag testing (that we added to a future CTSM SE meeting to discuss with the group). In this case for ctsm5.1.dev175, doing the python testing would have been helpful (and I'm also just making sure you didn't run the python testing). But, I think the error below might help us in tracking down where at least the fail below happened. And that might fix some issues for us...
experiment bug fix Address #2433 with changes to the arguments in run_case
Brief summary of bug
It seems like in the NEON refactor for PLUMBER2 we lost some capabilities using the --experiment flag? Specifically, creating an AD case that includes the experiment flag in the case name.
General bug information
CTSM version you are using: dev175
Does this bug cause significantly incorrect results in the model's science? No
Configurations affected: running NEON cases with the --experiment flag.
Details of bug
I created two cases, one with dev171 & another with dev175, that were intended to evaluate impacts of dead arctic veg on NEON simulations in Alaska. The following command with dev175 creates an AD case, as expected, but without the experiment flag in the case name.
The same command using dev171 produces case names like this:
BARR.dev171.ad
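To make the expected naming concrete, here is a small sketch of the pattern the case names in this thread suggest (site, then experiment, then run type, joined by dots, as in BARR.dev171.ad and RMNP.no_runtype_test.transient). This is an assumption based on those names only; `expected_case_name` is a hypothetical helper, not the code in run_neon, and treating the middle element as the --experiment value is my reading of the example.

```python
def expected_case_name(site, run_type, experiment=None):
    """Build a case name as site[.experiment].run_type.

    Hypothetical helper for illustration; the pattern is inferred from
    the case names quoted in this issue, not taken from run_neon itself.
    """
    parts = [site]
    if experiment:
        # Only include the experiment element when one was given.
        parts.append(experiment)
    parts.append(run_type)
    return ".".join(parts)


if __name__ == "__main__":
    print(expected_case_name("BARR", "ad", experiment="dev171"))  # BARR.dev171.ad
    print(expected_case_name("BARR", "ad"))  # BARR.ad
```

A check like this could serve as a regression test: given an experiment value, the resulting case name should contain it.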
cases are all in my scratch directory:
/glade/u/home/wwieder/scratch/neon_AK