Belos: Fix Belos solver behavior to handle NaNs #12792
The ThyraTpetraAdapters are an optional dependency, so they may not be available. Code in Belos and Ifpack2 that depends on these adapters cannot build without them, so it should be compiled only when the adapters are enabled.
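As a rough illustration of guarding adapter-dependent code, the sketch below selects between two definitions at compile time. The macro `HAVE_THYRA_TPETRA_ADAPTERS_EXAMPLE` and the function `adapterStatus` are hypothetical names made up for this example; the macro the Trilinos build system actually defines is not stated here.

```cpp
#include <string>

// HAVE_THYRA_TPETRA_ADAPTERS_EXAMPLE is a hypothetical macro standing in for
// whatever the build system defines when the optional adapters are enabled.
#ifdef HAVE_THYRA_TPETRA_ADAPTERS_EXAMPLE
// Adapter headers would be included here, and only here.
std::string adapterStatus() { return "adapters enabled"; }
#else
// Fallback that compiles without any adapter headers.
std::string adapterStatus() { return "adapters disabled"; }
#endif
```

Because the adapter-specific branch is never seen by the compiler when the macro is undefined, a build without the optional dependency still succeeds.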
When a NaN is detected in the StatusTest residual norm classes, an exception is thrown. Historically, the Belos solver managers have not handled this exception, so if the caller does not catch it, any code using these solvers exits. In comparison, AztecOO handles a detected NaN by issuing a warning, setting the solution vector to zero, and returning a code indicating that it ran into issues and is unconverged. In practice, that approach is more robust because it allows the outer numerical code to attempt to recover from NaNs introduced in the linear solver. This commit adds a special exception for NaNs, StatusTestNaNError, which derives from StatusTestError and is thrown by any of the StatusTestResNorm classes when a NaN is encountered. Each of the solver managers handles this exception by outputting a warning, setting the solution vector to zero, and returning Unconverged.
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection is Not Necessary for this Pull Request.
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Build Information, Test Names: Trilinos_PR_gcc-8.3.0, Trilinos_PR_gcc-8.3.0-serial, Trilinos_PR_gcc-8.3.0-debug, Trilinos_PR_clang-11.0.1, Trilinos_PR_python3, Trilinos_PR_cuda-11.4.2-uvm-off, Trilinos_PR_intel-2021.3
Pull Request Author: hkthorn
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED. Pull Request Auto Testing has PASSED
Build Information, Test Names: Trilinos_PR_gcc-8.3.0, Trilinos_PR_gcc-8.3.0-serial, Trilinos_PR_gcc-8.3.0-debug, Trilinos_PR_clang-11.0.1, Trilinos_PR_python3, Trilinos_PR_cuda-11.4.2-uvm-off, Trilinos_PR_intel-2021.3
Status Flag 'Pre-Merge Inspection' - This Pull Request Requires Inspection. The code must be inspected by a member of the Team before Testing/Merging.
All Jobs Finished; status = PASSED. However, Inspection must be performed before merge can occur.
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ csiefer2 ]!
Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR.
@trilinos/belos @trilinos/stratimikos
Motivation
When a NaN is detected in the StatusTest residual norm classes, an exception
is thrown. Historically, the Belos solver managers have not handled this
exception, so if the caller does not catch it, any code using these solvers
exits. In comparison, AztecOO handles a detected NaN by outputting a warning,
setting the solution vector to zero, and returning a code indicating that it
ran into issues and is unconverged. In practice, that approach is more robust
because it allows the outer numerical code to attempt to recover from NaNs
introduced in the linear solver.
This commit adds a special exception for NaNs, StatusTestNaNError, which
derives from StatusTestError and is thrown by any of the StatusTestResNorm
classes when a NaN is encountered. Each of the solver managers handles
this exception by outputting a warning, setting the solution vector
to zero, and returning Unconverged.
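The control flow described above can be sketched in a standalone form. This is a minimal illustration, not the Belos implementation: the exception names come from this description, but the base class `std::logic_error`, the `ReturnType` enum, and the functions `residualPassed` and `solve` are assumptions introduced for the example, and the "iteration" is a placeholder update.

```cpp
#include <algorithm>
#include <cmath>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-ins for the exception types named in the description.
// Deriving from std::logic_error is an assumption of this sketch.
class StatusTestError : public std::logic_error {
public:
  explicit StatusTestError(const std::string& msg) : std::logic_error(msg) {}
};

class StatusTestNaNError : public StatusTestError {
public:
  explicit StatusTestNaNError(const std::string& msg) : StatusTestError(msg) {}
};

enum class ReturnType { Converged, Unconverged };

// Hypothetical residual-norm test: throws the NaN-specific exception,
// otherwise reports whether the norm meets the tolerance.
bool residualPassed(double resNorm, double tol) {
  if (std::isnan(resNorm)) {
    throw StatusTestNaNError("residual norm is NaN");
  }
  return resNorm <= tol;
}

// Hypothetical solver-manager loop. resNorms stands in for the residual
// norm that a real solver would compute at each iteration.
ReturnType solve(const std::vector<double>& resNorms, double tol,
                 std::vector<double>& x) {
  try {
    for (double r : resNorms) {
      for (double& xi : x) xi += 1.0;  // placeholder for a solution update
      if (residualPassed(r, tol)) return ReturnType::Converged;
    }
  } catch (const StatusTestNaNError& e) {
    // Warn, zero the solution, and report Unconverged instead of propagating.
    std::cerr << "Warning: " << e.what()
              << "; setting solution to zero and returning Unconverged\n";
    std::fill(x.begin(), x.end(), 0.0);
    return ReturnType::Unconverged;
  }
  return ReturnType::Unconverged;  // ran out of iterations
}
```

Catching only the NaN-specific type means other StatusTestError failures still propagate to the caller, while a NaN leaves the caller with a zeroed solution vector and an Unconverged result it can react to.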
Stakeholder Feedback
This change will improve the robustness of the Trilinos solver stack to numerical challenges within the application codes it serves. In particular, this work has been motivated by current efforts to improve the performance of Charon.
Testing
macOS, GCC 13