How to set up a case and customize the PE layout
After creating a case using create_newcase, you need to call the case.setup command from $CASEROOT. To see the options to case.setup, use the --help option. Calling case.setup creates the following additional files and directories in $CASEROOT: (**TODO: which files are modifiable below?)
case.setup -clean removes $CASEROOT/$CASE.run and must be run if env_mach_pes.xml is modified after case.setup has been called. In that case, run case.setup -clean first, then rerun case.setup before you build and run the model.
(Also see the Section called *BASICS: What are the directories and files in my case directory?* in Chapter 6.)
The file env_mach_pes.xml determines the number of processors and OpenMP threads for each component, the number of instances of each component, and the layout of the components across the hardware processors. Optimizing the throughput and efficiency of a CIME experiment often involves customizing the processor (PE) layout for load balancing. CIME provides significant flexibility with respect to the layout of components across different hardware processors. In general, the CIME components (atm, lnd, ocn, ice, glc, rof, wav, and cpl) can run on overlapping or mutually unique processors. Each component is associated with a unique MPI communicator, whereas the CIME driver runs on the union of all processors and controls the sequencing and hardware partitioning. Each component's processor layout is determined by three settings: the number of MPI tasks, the number of OpenMP threads per task, and the root MPI processor number from the global set.
The entries in env_mach_pes.xml have the following meanings:
XML entry | Description |
---|---|
NTASKS | The total number of MPI tasks. A negative value indicates nodes rather than tasks. |
NTHRDS | The number of OpenMP threads per MPI task. |
ROOTPE | The global MPI task of the component root task. A negative value indicates nodes rather than tasks. |
PSTRID | The stride of MPI tasks across the global set of PEs (for now, set to 1). |
NINST | The number of component instances (spread evenly across NTASKS). |
For example, if a component has NTASKS=16, NTHRDS=4 and ROOTPE=32, then it will run on 64 hardware processors using 16 MPI tasks and 4 threads per task, starting at global MPI task 32. Each CIME component has corresponding entries for NTASKS, NTHRDS, ROOTPE and NINST in env_mach_pes.xml. There are some important things to note:
- NTASKS must be greater than or equal to 1 (one), even for inactive (stub) components.
- NTHRDS must be greater than or equal to 1 (one). Setting NTHRDS to 1 generally means threading parallelization is off for that component. NTHRDS should never be set to zero.
- The total number of hardware processors allocated to a component is NTASKS * NTHRDS.
- The coupler processor inputs specify the PEs used by coupler computation such as mapping, merging, diagnostics, and flux calculation. This is distinct from the driver, which always runs automatically on the union of all processors to manage model concurrency and sequencing.
- The root processor is set relative to the MPI global communicator, not the hardware processor counts. An example of this is below.
- The layout of components on processors has no impact on the science. The scientific sequencing is hardwired into the driver, so changing processor layouts does not change intrinsic coupling lags or coupling sequencing. One important point: in a fully active configuration, the atmosphere component is hardwired in the driver to never run concurrently with the land or ice component. Performance improvements from processor layout concurrency are therefore constrained in this case, and there is never a performance reason not to overlap the atmosphere component with the land and ice components. Beyond that constraint, the land, ice, coupler and ocean models can run concurrently, and the ocean model can also run concurrently with the atmosphere model.
- If all components have identical NTASKS, NTHRDS, and ROOTPE set, all components will run sequentially on the same hardware processors.
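The arithmetic in these notes can be sketched in a few lines of Python. This is an illustrative helper, not part of CIME: the names `Footprint` and `component_footprint` are hypothetical, and the sketch assumes PSTRID=1 and non-negative NTASKS and ROOTPE (values given as tasks, not nodes).

```python
from typing import NamedTuple


class Footprint(NamedTuple):
    """Illustrative summary of one component's layout (not a CIME type)."""
    hardware_processors: int  # NTASKS * NTHRDS
    first_mpi_task: int       # ROOTPE
    last_mpi_task: int        # ROOTPE + NTASKS - 1, assuming PSTRID = 1


def component_footprint(ntasks: int, nthrds: int, rootpe: int) -> Footprint:
    """Hypothetical helper: apply the rules listed above to one component."""
    if ntasks < 1:
        raise ValueError("NTASKS must be >= 1 (node counts are not handled here)")
    if nthrds < 1:
        raise ValueError("NTHRDS must be >= 1")
    if rootpe < 0:
        raise ValueError("negative ROOTPE (node counts) is not handled here")
    return Footprint(ntasks * nthrds, rootpe, rootpe + ntasks - 1)


# The example from the text: NTASKS=16, NTHRDS=4, ROOTPE=32.
fp = component_footprint(ntasks=16, nthrds=4, rootpe=32)
print(fp.hardware_processors, fp.first_mpi_task, fp.last_mpi_task)  # 64 32 47
```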
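An important but often misunderstood point is that the root processor for any given component is set relative to the MPI global communicator, not the hardware processor counts. For instance, consider the following example:
NTASKS(ATM)=6 NTHRDS(ATM)=4 ROOTPE(ATM)=0
NTASKS(OCN)=64 NTHRDS(OCN)=1 ROOTPE(OCN)=16
Here ROOTPE(OCN)=16 means the ocean root is global MPI task 16. It is not an offset in hardware processors, even though the atmosphere's 6 tasks with 4 threads each occupy 24 hardware processors.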
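To make the distinction concrete, the sketch below lists the global MPI tasks each component in this example occupies and checks whether the two sets overlap. It is illustrative only: `mpi_tasks` is a hypothetical helper, not a CIME function, and it assumes PSTRID=1.

```python
def mpi_tasks(ntasks: int, rootpe: int) -> range:
    """Hypothetical helper: global MPI tasks used by a component (PSTRID = 1 assumed)."""
    return range(rootpe, rootpe + ntasks)


# Values taken from the example in the text.
atm = {"ntasks": 6, "nthrds": 4, "rootpe": 0}
ocn = {"ntasks": 64, "nthrds": 1, "rootpe": 16}

atm_tasks = mpi_tasks(atm["ntasks"], atm["rootpe"])
ocn_tasks = mpi_tasks(ocn["ntasks"], ocn["rootpe"])

# Hardware processors per component are NTASKS * NTHRDS.
atm_pes = atm["ntasks"] * atm["nthrds"]
ocn_pes = ocn["ntasks"] * ocn["nthrds"]

# The two components share MPI tasks only if their task sets intersect.
shared = set(atm_tasks) & set(ocn_tasks)

print(f"ATM: MPI tasks {atm_tasks.start}-{atm_tasks.stop - 1}, {atm_pes} hardware processors")
print(f"OCN: MPI tasks {ocn_tasks.start}-{ocn_tasks.stop - 1}, {ocn_pes} hardware processors")
print("overlap" if shared else "no overlap")
```

Running it reports MPI tasks 0-5 for the atmosphere and 16-79 for the ocean, with no overlap between the two task sets.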
Note: after "./case.setup" has been invoked, env_mach_pes.xml cannot be modified without first invoking "case.setup -clean". For an example of changing PEs, see the Section called *BASICS: How do I change processor counts and component layouts on processors?* in Chapter 6.