Making this project Cloud compatible and EOSC infrastructure runs #45

Open · wants to merge 4 commits into base: master
35 changes: 17 additions & 18 deletions README.md
@@ -3,15 +3,16 @@
Benchmarking & Scaling Studies of the Pangeo Platform

- [Benchmarking](#benchmarking)
- [Creating an Environment](#creating-an-environment)
- [Creating an Environment on an HPC Center](#creating-an-environment-on-an-hpc-center)
- [Environment on a Kubernetes-based system](#environment-on-a-kubernetes-based-system)
- [Benchmark Configuration](#benchmark-configuration)
- [Running the Benchmarks](#running-the-benchmarks)
- [Benchmark Results](#benchmark-results)
- [Visualization](#visualization)

## Creating an Environment
## Creating an Environment on an HPC Center

To run the benchmarks, it's recommended to create a dedicated conda environment by running:
To run the benchmarks on an HPC platform, it's recommended to create a dedicated conda environment by running:

```bash
conda env create -f ./binder/environment.yml
@@ -31,33 +32,31 @@ and then run the post build script:
./binder/postBuild
```

## Benchmark Configuration
## Environment on a Kubernetes-based system

The `benchmark-configs` directory contains YAML files that are used to run benchmarks on different machines. So far, the following HPC systems' configs are provided:
To run the benchmarks on any Cloud platform using Kubernetes, it is recommended to use the [pangeo/pangeo-notebook Docker image](https://github.com/pangeo-data/pangeo-docker-images/tree/master/pangeo-notebook).

```bash
$ tree ./benchmark-configs/
benchmark-configs/
├── cheyenne.yaml
└── hal.yaml
└── wrangler.yaml
This package currently assumes that a Dask Gateway cluster is available from within the Kubernetes environment.

```
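As an illustration only (the benchmarks create their own cluster from the YAML configuration; the default Gateway address, credentials, and worker count below are assumptions), a minimal check that Dask Gateway is reachable from a notebook in such a deployment might look like:

```python
from dask_gateway import Gateway

# Connect to the Dask Gateway server configured for this deployment
# (relies on the platform's default Gateway address and authentication).
gateway = Gateway()

# Start a cluster and request a few workers; the worker count here is arbitrary.
cluster = gateway.new_cluster()
cluster.scale(4)

# Attach a Dask client and print the dashboard URL to confirm the cluster is up.
client = cluster.get_client()
print(client.dashboard_link)
```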
## Benchmark Configuration

The `benchmark-configs` directory contains YAML files that are used to run benchmarks on different machines. So far, HPC system configs have been provided for several clusters: Cheyenne from NCAR, HAL from CNES, and Wrangler from TACC. It also contains configurations for the CESNET center, based on a Kubernetes deployment over OpenStack. There may be several configurations for each center.

If you are interested in running the benchmarks on another system, you will need to create a new YAML file with the right configuration for that system. See the existing config files for reference.
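As a rough, abridged sketch of what such a file contains (the values below are placeholders and the indentation is indicative; copy one of the full files in `benchmark-configs/` as a starting point):

```yaml
# Abridged, hypothetical benchmark config; see the full files in
# benchmark-configs/ for the complete set of keys for your cluster type.
operation_choice: readwrite      # benchmark operation, e.g. write or readwrite
machine: my-cluster              # label identifying your system
cluster_manager: pbs             # e.g. pbs for an HPC scheduler, gateway for Kubernetes
cluster_kwargs:                  # keyword arguments passed to the Dask cluster
  queue: regular
  walltime: 1:00:00
  memory: 109gb
  cores: 36
chunk_per_worker: 10
freq: 1D
parameters:
  fixed_totalsize: False
  number_of_workers_per_nodes:
    - 1
  number_of_nodes:
    - 1
    - 4
  chunk_size:
    - 64MB
  chunking_scheme:
    - auto
  io_format:
    - zarr
  filesystem:
    - posix
```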

## Running the Benchmarks

### from command line

To run the benchmarks, a command-line utility, `pangeobench`, is provided in this repository.
To use it to benchmark Pangeo computation, you need to specify subcommand `run` and the location of the benchmark configuration
To use it to benchmark Pangeo computation, you need to specify the `run` subcommand and the location of the benchmark configuration file.

```bash
./pangebench run benchmark-configs/cheyenne.computation.yaml
./pangeobench run benchmark-configs/cheyenne.pri2.yaml
```


To use it to benchmark Pangeo IO with weak scaling analysis, you need to specify subcommand `run` and the location of the benchmark configuration
To use it to benchmark Pangeo IO with weak scaling analysis, you need to specify the `run` subcommand and the location of the benchmark configuration file.


```bash
@@ -72,7 +71,7 @@ First, create data files:
```
Second, upload the data files to the S3 object store if you need to benchmark it:
```bash
./pangebench upload --config_file benchmark-configs/cheyenne.write.yaml
./pangeobench upload --config_file benchmark-configs/cheyenne.write.yaml
```

Last, read data files:
@@ -91,8 +90,8 @@ Commands:
run Run benchmarking
upload Upload benchmarking files from local directory to S3 object store
```
## Running the Benchmarks
### from jupyter notebook.

### from Jupyter notebook.

To run the benchmarks from a Jupyter notebook, install the 'pangeo-bench' kernel into your Jupyter environment, then start the run.ipynb notebook. You will need to specify the configuration file, as described above, in your notebook.
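A common way to register the conda environment created above as the 'pangeo-bench' kernel is sketched here; it assumes that environment is currently active, and the exact kernel name your notebook expects may differ:

```bash
# Register the active conda environment as a Jupyter kernel named "pangeo-bench".
# The kernel name is an assumption; adjust it to your setup.
python -m ipykernel install --user --name pangeo-bench --display-name "pangeo-bench"
```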

27 changes: 27 additions & 0 deletions benchmark-configs/EOSC-CESNET-small.readwrite.yaml
@@ -0,0 +1,27 @@
operation_choice: readwrite
machine: EOSC-CESNET-small
cluster_manager: gateway
cluster_kwargs:
worker_memory: 4
chunk_per_worker: 2
spil: false
freq: 1D
parameters:
fixed_totalsize: False
number_of_workers_per_nodes:
- 1
number_of_nodes:
- 1
- 4
chunk_size:
- 32MB
- 64MB
chunking_scheme:
- temporal
io_format:
- zarr
filesystem:
- s3
profile: default
bucket: pangeo-benchmarking
endpoint_url: https://object-store.cloud.muni.cz
35 changes: 35 additions & 0 deletions benchmark-configs/EOSC-CESNET.readwrite.yaml
@@ -0,0 +1,35 @@
operation_choice: readwrite
machine: EOSC-CESNET
cluster_manager: gateway
cluster_kwargs:
worker_memory: 4
chunk_per_worker: 10
spil: false
freq: 1D
parameters:
fixed_totalsize: False
number_of_workers_per_nodes:
- 1
number_of_nodes:
- 1
- 4
- 8
- 16
- 32
chunk_size:
- 32MB
- 64MB
- 128MB
- 256MB
- 512MB
- 1024MB
chunking_scheme:
- temporal
- auto
io_format:
- zarr
filesystem:
- s3
profile: default
bucket: pangeo-benchmarking
endpoint_url: https://object-store.cloud.muni.cz
22 changes: 17 additions & 5 deletions benchmark-configs/cheyenne.pri1-a.yaml
@@ -1,12 +1,18 @@
operation_choice: readwrite
machine: cheyenne
job_scheduler: pbs
queue: regular
walltime: 1:00:00
maxmemory_per_node: 109gb
maxcore_per_node: 36
cluster_manager: pbs
cluster_kwargs:
queue: regular
walltime: 1:00:00
memory: 109gb
cores: 36
local_directory: "$TMPDIR"
interface: "ib0"
chunk_per_worker: 10
spil: false
freq: 1D
parameters:
fixed_totalsize: False
number_of_workers_per_nodes:
- 1
number_of_threads_per_workers: 1
@@ -18,3 +24,9 @@ parameters:
- spatial
- temporal
- auto
io_format:
- zarr
- netcdf
filesystem:
- posix
local_dir: test_pri1-a
22 changes: 17 additions & 5 deletions benchmark-configs/cheyenne.pri1-b.yaml
@@ -1,12 +1,18 @@
operation_choice: readwrite
machine: cheyenne
job_scheduler: pbs
queue: regular
walltime: 1:00:00
maxmemory_per_node: 109gb
maxcore_per_node: 36
cluster_manager: pbs
cluster_kwargs:
queue: regular
walltime: 1:00:00
memory: 109gb
cores: 36
local_directory: "$TMPDIR"
interface: "ib0"
chunk_per_worker: 10
spil: false
freq: 1D
parameters:
fixed_totalsize: False
number_of_workers_per_nodes:
- 1
number_of_threads_per_workers: 1
@@ -18,3 +24,9 @@ parameters:
- spatial
- temporal
- auto
io_format:
- zarr
- netcdf
filesystem:
- posix
local_dir: test_pri1-b
22 changes: 17 additions & 5 deletions benchmark-configs/cheyenne.pri2.yaml
@@ -1,12 +1,18 @@
operation_choice: readwrite
machine: cheyenne
job_scheduler: pbs
queue: regular
walltime: 1:00:00
maxmemory_per_node: 109gb
maxcore_per_node: 36
cluster_manager: pbs
cluster_kwargs:
queue: regular
walltime: 1:00:00
memory: 109gb
cores: 36
local_directory: "$TMPDIR"
interface: "ib0"
chunk_per_worker: 10
spil: false
freq: 1D
parameters:
fixed_totalsize: False
number_of_workers_per_nodes:
- 1
number_of_threads_per_workers: 1
@@ -25,3 +31,9 @@ parameters:
- spatial
- temporal
- auto
io_format:
- zarr
- netcdf
filesystem:
- posix
local_dir: test_pri2
13 changes: 8 additions & 5 deletions benchmark-configs/cheyenne.readwrite.yaml
@@ -1,10 +1,13 @@
operation_choice: readwrite
machine: cheyenne
job_scheduler: pbs
queue: regular
walltime: 1:00:00
maxmemory_per_node: 109gb
maxcore_per_node: 36
cluster_manager: pbs
cluster_kwargs:
queue: regular
walltime: 1:00:00
memory: 109gb
cores: 36
local_directory: "$TMPDIR"
interface: "ib0"
chunk_per_worker: 10
spil: false
freq: 1D
13 changes: 8 additions & 5 deletions benchmark-configs/cheyenne.write.yaml
@@ -1,10 +1,13 @@
operation_choice: write
machine: cheyenne
job_scheduler: pbs
queue: regular
walltime: 1:00:00
maxmemory_per_node: 109gb
maxcore_per_node: 36
cluster_manager: pbs
cluster_kwargs:
queue: regular
walltime: 1:00:00
memory: 109gb
cores: 36
local_directory: "$TMPDIR"
interface: "ib0"
chunk_per_worker: 10
spil: false
freq: 1D
32 changes: 0 additions & 32 deletions benchmark-configs/cheyenne.yaml

This file was deleted.

26 changes: 0 additions & 26 deletions benchmark-configs/hal.yaml

This file was deleted.

23 changes: 18 additions & 5 deletions benchmark-configs/hal1D.yaml
@@ -1,11 +1,18 @@
operation_choice: readwrite
machine: hal1D
job_scheduler: pbs
queue: batch
walltime: 1:00:00
maxmemory_per_node: 128gb
maxcore_per_node: 24
cluster_manager: pbs
cluster_kwargs:
queue: batch
walltime: 1:00:00
memory: 128gb
cores: 24
local_directory: "$TMPDIR"
interface: "ib0"
chunk_per_worker: 10
spil: false
freq: 1D
parameters:
fixed_totalsize: False
number_of_workers_per_nodes:
- 1
number_of_threads_per_workers: 1
@@ -24,3 +31,9 @@ parameters:
- spatial
- temporal
- auto
io_format:
- zarr
- netcdf
filesystem:
- posix
local_dir: test_1D