diff --git a/_toc.yml b/_toc.yml index 4408f5159..4ab5c8427 100644 --- a/_toc.yml +++ b/_toc.yml @@ -168,6 +168,7 @@ chapters: sections: - file: notebooks/diagnostics/cice/basics_cice - file: notebooks/diagnostics/cice/advanced_cice + - file: notebooks/diagnostics/cupid - file: notebooks/diagnostics/additional/additional sections: - file: notebooks/diagnostics/additional/postprocessing diff --git a/notebooks/diagnostics/cupid.ipynb b/notebooks/diagnostics/cupid.ipynb new file mode 100644 index 000000000..2b8a8e8a4 --- /dev/null +++ b/notebooks/diagnostics/cupid.ipynb @@ -0,0 +1,411 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "6cbe06f3-0299-4db3-a02e-d4b27ffbd27e", + "metadata": {}, + "source": [ + "# CUPiD" + ] + }, + { + "cell_type": "markdown", + "id": "3c0aaa4a-ebf0-4c3c-a045-ee58d183ce45", + "metadata": {}, + "source": [ + "The CESM Unified Postprocessing and Diagnostics (CUPiD) package is a new python-based system for running post-processing routines and diagnostics across all CESM components with a common user and developer interface.\n", + "This notebook is a chance to try out CUPiD and run its' time series generation tool on your own model simulation. \n", + "\n", + "Note that the underlying python code is very similar to the routines shown in the component-specific diagnostics, which is why we recommend trying those notebooks first. Additional info, including the source code, can be found on [Github here](https://github.com/NCAR/CUPiD)." + ] + }, + { + "cell_type": "markdown", + "id": "d5f923bd-6d73-4e60-ac49-8295c96e585c", + "metadata": {}, + "source": [ + "**BEFORE BEGINNING THIS EXERCISE** - Check that your kernel (upper right corner, above) is `Bash`. This should be the default kernel, but if it is not, click on that button and select `Bash`." + ] + }, + { + "cell_type": "markdown", + "id": "4ac4bcb0-3b7c-488b-85d5-9c3c3802ab4e", + "metadata": {}, + "source": [ + "CUPiD is currently a command line tool. This means that instead of running python code directly, this notebook will run unix commands CUPiD provides in order to generate the relevant diagnostics. To start, we need to clone CUPiD from Github:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9bdbabd9-4479-440f-9f82-c97f57da3e90", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "#Delete old CUPiD directory if one exists:\n", + "if [ -d \"CUPiD\" ]; then\n", + " rm -rf CUPiD\n", + "fi\n", + "\n", + "#Clone CUPiD source code from Github repo:\n", + "git clone --recurse-submodules https://github.com/NCAR/CUPiD.git\n", + "cd CUPiD #Need to enter CUPiD directory for remaining commands" + ] + }, + { + "cell_type": "markdown", + "id": "efc83e39-b35d-4dd5-9411-7737ac648375", + "metadata": {}, + "source": [ + "We'll also need to grab some external libraries, which is done the same way as CESM via `checkout_externals`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "13f43abf-a7cd-40fb-a55b-e68781282162", + "metadata": {}, + "outputs": [], + "source": [ + "./manage_externals/checkout_externals" + ] + }, + { + "cell_type": "markdown", + "id": "e30ceb39-d434-4520-a29b-e23343f633e0", + "metadata": {}, + "source": [ + "This downloads two additional diagnostics packages that CUPiD will use. One is the [AMWG Diagnostics Framework (ADF)](https://github.com/NCAR/ADF), which is a command-line tool that can be used to generate CAM diagnostics, and [mom6-tools](https://github.com/NCAR/mom6-tools.git), which is a python package that can be used to analyze MOM6, which is the ocean model that will be used in CESM3 (but for this tutorial we'll ignore)." + ] + }, + { + "cell_type": "markdown", + "id": "5b3c7314-4a6d-40a8-92b0-d3953335a0a6", + "metadata": {}, + "source": [ + "Next we need to setup the proper python environment using conda/mamba, and activate the `cupid-dev` environment:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d105bb3b-84bf-4e87-a6ab-8c2e8b40b57a", + "metadata": {}, + "outputs": [], + "source": [ + "#Load conda to your environment:\n", + "module load conda\n", + "\n", + "#Install 'cupid-dev' environment if it doesn't already exist:\n", + "if ! { conda env list | grep 'cupid-dev'; } >/dev/null 2>&1; then\n", + " mamba env create -f environments/dev-environment.yml\n", + "fi\n", + "\n", + "#Install 'cupid-analysis' environment if ti doesn't already exist:\n", + "if ! { conda env list | grep 'cupid-analysis'; } >/dev/null 2>&1; then\n", + " mamba env create -f environments/cupid-analysis.yml\n", + "fi\n", + "\n", + "#Activate CUPiD conda environemnt:\n", + "conda activate cupid-dev\n", + "#NOTE: You may see a red \": 1\" message below, but it can be ignored.\n", + "\n", + "#Check that cupid-run can be accessed appropriately:\n", + "which cupid-run\n", + "if [ $? -ne 0 ]; then\n", + " #If not then use pip to install:\n", + " pip install -e .\n", + "fi" + ] + }, + { + "cell_type": "markdown", + "id": "aa3bd9bc-fd0c-44d5-b80a-14a509d7ffb4", + "metadata": {}, + "source": [ + "CUPiD is controlled via a config YAML file. Here we create a new directory and write the relevant config file for our tutorial simulation. Please note that if your tutorial simulations didn't finish then you can use the provided simulations instead: " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f515b255-d9a0-4dc6-87d3-95c34a2e2f0a", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "cd examples #Go to the examples directory\n", + "if ! [ -d \"cesm_tutorial\" ]; then #Check if CESM tutorial directory already exists.\n", + " mkdir cesm_tutorial #If not, then make a new directory to hold our config file\n", + "fi \n", + "cd cesm_tutorial #Go to newly made CESM tutorial example directory\n", + "cat << EOF > config.yml\n", + "################## SETUP ##################\n", + "\n", + "#NOTE: CUPiD ocean diagnostics are currently only designed for upcoming MOM6\n", + "# ocean model, so for this tutorial we will only do example atmosphere,\n", + "# land, and sea ice diagnostics.\n", + "\n", + "################\n", + "# Data Sources #\n", + "################\n", + "data_sources:\n", + " # sname is any string used as a nickname for this configuration. It will be\n", + " ### used as the name of the folder your computed notebooks are put in\n", + " sname: cesm_tutorial_quick_run\n", + "\n", + " # run_dir is the path to the folder you want\n", + " ### all the files associated with this configuration\n", + " ### to be created in\n", + " run_dir: .\n", + "\n", + " # nb_path_root is the path to the folder that cupid will\n", + " ### look for your template notebooks in. It doesn't have to\n", + " ### be inside run_dir, or be specific to this project, as\n", + " ### long as the notebooks are there\n", + " nb_path_root: ../nblibrary\n", + "\n", + "######################\n", + "# Computation Config #\n", + "######################\n", + "\n", + "computation_config:\n", + "\n", + " # default_kernel_name is the name of the environment that\n", + " ### the notebooks in this configuration will be run in by default.\n", + " ### It must already be installed on your machine. You can also\n", + " ### specify a different environment than the default for any\n", + " ### notebook in NOTEBOOK CONFIG\n", + "\n", + " default_kernel_name: cupid-analysis\n", + "\n", + "############# NOTEBOOK CONFIG #############\n", + "\n", + "############################\n", + "# Notebooks and Parameters #\n", + "############################\n", + "\n", + "# All parameters under global_params get passed to all the notebooks\n", + "\n", + "global_params:\n", + " CESM_output_dir: /glade/derecho/scratch/USERNAME/archive #<-Replace \"USERNAME\" with your own username\n", + " #Uncomment code here if you need a complete CESM tutorial simulation:\n", + " #CESM_output_dir: /glade/campaign/cesm/tutorial/tutorial_2023_archive\n", + " lc_kwargs:\n", + " threads_per_worker: 1\n", + "\n", + "timeseries:\n", + " # This section of the config file controls the time series generator, which\n", + " # takes standard CESM history (time-slice) files and converts them into single\n", + " # variable time series files.\n", + " \n", + " num_procs: 8\n", + " ts_done: [False]\n", + " overwrite_ts: [False]\n", + " ts_output_dir: /glade/derecho/scratch/USERNAME/archive #<-Replace \"USERNAME\" with your own username\n", + " case_name: 'b.day2.1'\n", + "\n", + " #Variables can either be provided as a list (e.g. ['X', 'Y', 'Z']) or,\n", + " #if you want to convert everything on the file, by using the ['process_all']\n", + " #keyword. For the example below we'll only convert a single variable\n", + " #from each component.\n", + "\n", + " atm:\n", + " vars: ['TREFHT']\n", + " derive_vars: []\n", + " hist_str: 'h0'\n", + " start_years: [1]\n", + " end_years: [3]\n", + " level: 'lev'\n", + "\n", + " lnd:\n", + " vars: ['ALTMAX']\n", + " derive_vars: []\n", + " hist_str: 'h0'\n", + " start_years: [1]\n", + " end_years: [3]\n", + " level: 'lev'\n", + "\n", + " ocn:\n", + " vars: [] # Not doing ocean analyses\n", + " derive_vars: []\n", + " hist_str: 'h'\n", + " start_years: [1]\n", + " end_years: [3]\n", + " level: 'lev'\n", + "\n", + " ice:\n", + " vars: ['hi']\n", + " derive_vars: []\n", + " hist_str: 'h'\n", + " start_years: [1]\n", + " end_years: [3]\n", + " level: 'lev'\n", + "\n", + " glc:\n", + " vars: ['usurf']\n", + " derive_vars: []\n", + " hist_str: 'initial_hist'\n", + " start_years: [1]\n", + " end_years: [3]\n", + " level: 'lev'\n", + "\n", + "#IGNORE EVERYTHING BELOW THIS LINE!\n", + "\n", + "########### JUPYTER BOOK CONFIG ###########\n", + "\n", + "##################################\n", + "# Jupyter Book Table of Contents #\n", + "##################################\n", + "book_toc:\n", + "\n", + " # See https://jupyterbook.org/en/stable/structure/configure.html for\n", + " # complete documentation of Jupyter book construction options\n", + "\n", + " format: jb-book\n", + "\n", + " # All filenames are notebook filename without the .ipynb, similar to above\n", + "\n", + " root: infrastructure/index # root is the notebook that will be the homepage for the book\n", + " parts:\n", + "\n", + " # Parts group notebooks into different sections in the Jupyter book\n", + " # table of contents, so you can organize different parts of your project.\n", + "\n", + "# - caption: Atmosphere\n", + "\n", + " # Each chapter is the name of one of the notebooks that you executed\n", + " # in compute_notebooks above, also without .ipynb\n", + "# chapters:\n", + "# - file: atm/adf_quick_run\n", + "\n", + "# - caption: Land\n", + "# chapters:\n", + "# - file: lnd/land_comparison\n", + "\n", + "# - caption: Sea Ice\n", + "# chapters:\n", + "# - file: ice/seaice\n", + "\n", + "#####################################\n", + "# Keys for Jupyter Book _config.yml #\n", + "#####################################\n", + "book_config_keys:\n", + "\n", + " title: CESM Tutorial - CUPiD # Title of your jupyter book\n", + "\n", + " # Other keys can be added here, see https://jupyterbook.org/en/stable/customize/config.html\n", + " ### for many more options \n", + "EOF" + ] + }, + { + "cell_type": "markdown", + "id": "97f47be7-8f20-4b98-844c-5d067056df42", + "metadata": {}, + "source": [ + "Now we are ready to run CUPiD!" + ] + }, + { + "cell_type": "markdown", + "id": "27ed4c0c-f6ee-48a2-b264-2a2e58744f82", + "metadata": {}, + "source": [ + "## Generating time series" + ] + }, + { + "cell_type": "markdown", + "id": "22ab4b4f-e69c-40ea-a9f4-1455218a858c", + "metadata": {}, + "source": [ + "One of CUPiD's currently-working functions is to help convert CESM history files into single-variable time series files, which are required for various different diagnostic systems, as well for submitting to CMIP. Here we can create some time series files from the tutorial simulation using the config file we just created above and the `-ts` flag." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c593d967-45e6-4e64-b3cc-32c8c50ed139", + "metadata": {}, + "outputs": [], + "source": [ + "module load nco #Currently the NetCDF Operators are needed by CUPiD in order to run the time series generator\n", + "cupid-run -ts" + ] + }, + { + "cell_type": "markdown", + "id": "63d02026-9dc3-492d-a14d-5b05d854833f", + "metadata": {}, + "source": [ + "Now let's check if the time series files were generated successfully, by examining the directory we are writing the time series files to. In the below command replace \"USERNAME\" with your username, and \"COMP\" with your components of choice (atm, lnd, ocn, ice, or glc):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ad3447d0-ca93-415f-82a1-eb1ce615ecfc", + "metadata": {}, + "outputs": [], + "source": [ + "ts_data_path=\"/glade/derecho/scratch/USERNAME/archive/COMP/proc/tseries\"\n", + "ls $ts_data_path" + ] + }, + { + "cell_type": "markdown", + "id": "fc3f7715-965b-4750-9352-594f3beadf32", + "metadata": {}, + "source": [ + "Do you see a NetCDF file there? How does the naming convention compare to\n", + "standard CESM history output? Now let's check inside the file:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e0663047-18bf-4380-8739-ef31810b53ca", + "metadata": {}, + "outputs": [], + "source": [ + "ncdump -h $ts_data_path/*.nc" + ] + }, + { + "cell_type": "markdown", + "id": "76706875-8830-4043-863c-bf952d7735d7", + "metadata": {}, + "source": [ + "How do these results compare to a standard CESM history file? What is the same?\n", + "What is different?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e1b5db8c-4890-46b3-8b5c-c095e75bcd18", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Bash", + "language": "bash", + "name": "bash" + }, + "language_info": { + "codemirror_mode": "shell", + "file_extension": ".sh", + "mimetype": "text/x-sh", + "name": "bash" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}