Merge pull request #44 from Infectious-Disease-Modeling-Hubs/data-storage

Update Hub config page
annakrystalli authored Apr 25, 2023
2 parents ee6845b + f12ca12 commit cbf4ff8
Showing 5 changed files with 64 additions and 67 deletions.
59 changes: 59 additions & 0 deletions docs/source/format/hub-config.md
@@ -0,0 +1,59 @@
(hub-config)=
# Hub configuration files

## Directory Structure
The `hub-config` directory in a modeling hub is required to contain three files:
1. `admin.json` - JSON file containing generic information about the hub as well as static configuration settings for downstream tools such as validations, visualizations, etc.
2. `tasks.json` - JSON file specifying modeling tasks and model output formats, which may be round-specific.
3. `model-metadata-schema.json` - JSON or YAML file defining the format of model metadata files.

```{caution}
Due to technical issues, we do not currently support JSON references or YAML metadata files.
```
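
As a minimal sketch of the layout requirement above, the following Python function reports which of the three required files are missing from a hub's `hub-config` directory. The function name `missing_config_files` and its `hub_root` argument are illustrative assumptions and are not part of any hub tooling.

```python
from pathlib import Path

# File names taken from the list of required files above.
REQUIRED_CONFIG_FILES = ("admin.json", "tasks.json", "model-metadata-schema.json")


def missing_config_files(hub_root):
    """Return the required hub-config file names that are absent under hub_root."""
    config_dir = Path(hub_root) / "hub-config"
    # A missing hub-config directory simply means every file is reported as missing.
    return [name for name in REQUIRED_CONFIG_FILES if not (config_dir / name).is_file()]
```

An empty list means all three files are present; the check looks only at file existence, not at file contents.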

## Purpose
The files within the `hub-config` directory specify general configurations for a hub as well as (possibly round-specific) details of what model outputs are requested or required. Hub configuration files are used for:
* Validating model output submissions
  * The `tasks.json` file specifies the file format and the task id, output type, and value combinations (both required and optional) that submitted model output data must adhere to.
  * The `tasks.json` file also specifies the submission window for each round (with the time zone information in the `admin.json` file).
* Scoring model outputs
  * The hub configuration files specify the scores that are used.
  * The task id variables specified in `tasks.json` can be used to join model output data with truth data for the purpose of scoring forecasts.
* Configuring model output visualizations
  * Visualization tools may benefit from the ability to programmatically identify task id variables so that a separate visualization of model outputs can be generated for each combination of those variables (e.g. via faceting or menu selections). For example, it may be beneficial to produce separate visualizations for different locations or scenario ids.
  * Visualization tools may give special treatment to the hub’s ensemble and baseline models, which are identified in the hub configuration files.
  * The `tasks.json` file contains metadata regarding the targets, including a human-readable description and units, which can be used for visualization.
* Report generation
  * `admin.json` allows configuration of ensemble and baseline models to be treated specially in reports.
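
The join between model output and truth data mentioned under scoring can be sketched as follows. This is only an illustration under stated assumptions: the function `join_on_task_ids`, the column names, and the example rows are hypothetical, and the task id variables are passed in directly instead of being read from `tasks.json`.

```python
def join_on_task_ids(model_output, truth, task_id_vars):
    """Pair each model output row with the truth row sharing its task id values."""
    # Index truth rows by their tuple of task id values.
    truth_by_key = {tuple(row[v] for v in task_id_vars): row for row in truth}
    joined = []
    for row in model_output:
        key = tuple(row[v] for v in task_id_vars)
        match = truth_by_key.get(key)
        if match is not None:
            # Keep the model output row and attach the observed value.
            joined.append({**row, "observed": match["value"]})
    return joined


# Hypothetical rows; column names and values are made up for illustration.
model_output = [
    {"location": "US", "target_date": "2023-05-06", "value": 120.0},
    {"location": "CA", "target_date": "2023-05-06", "value": 15.0},
]
truth = [{"location": "US", "target_date": "2023-05-06", "value": 110.0}]

scored_rows = join_on_task_ids(model_output, truth, ["location", "target_date"])
```

Rows without matching truth data are dropped here; a real scoring tool might instead keep them and flag the missing observation.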


## Hub administrative configuration (`admin.json` file)

The administrative hub configuration file contains global administrative settings that are expected to remain fixed throughout a hub’s existence and apply to all rounds in a hub.

### Hub administrative configuration (`admin.json`) Interactive Schema

#### Schema Version: {{schema_version}}
{{'[See raw schema](https://raw.githubusercontent.com/Infectious-Disease-Modeling-Hubs/schemas/BRANCH/SCHEMA_VERSION/admin-schema.json)'.replace('SCHEMA_VERSION', schema_version).replace('BRANCH', schema_branch)}}

{{'<script src="../_static/docson/widget.js" data-schema="https://raw.githubusercontent.com/Infectious-Disease-Modeling-Hubs/schemas/BRANCH/SCHEMA_VERSION/admin-schema.json"></script>'.replace('SCHEMA_VERSION', schema_version).replace('BRANCH', schema_branch)}}
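
The raw schema link above is built by substituting a branch and a schema version into a URL template. The same substitution can be sketched in plain Python; the helper name `raw_schema_url` and the example values `"main"` and `"v0.0.1"` are assumptions for illustration and are not necessarily the values this documentation is built with.

```python
# Template copied from the link above; BRANCH and SCHEMA_VERSION are placeholders.
RAW_SCHEMA_TEMPLATE = (
    "https://raw.githubusercontent.com/Infectious-Disease-Modeling-Hubs/"
    "schemas/BRANCH/SCHEMA_VERSION/admin-schema.json"
)


def raw_schema_url(schema_branch, schema_version):
    """Fill the branch and version placeholders in the raw schema URL template."""
    return (
        RAW_SCHEMA_TEMPLATE
        .replace("SCHEMA_VERSION", schema_version)
        .replace("BRANCH", schema_branch)
    )


# Assumed example values, chosen only to show the shape of the result.
url = raw_schema_url("main", "v0.0.1")
```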

```{note}
Other things we may want to consider adding here:
* Something about truth data?
* Something about scoring?
* Something about report generation?
```

(tasks_metadata)=
## Hub model task configuration (`tasks.json` file)
The hub model task configuration file specifies the model tasks (task ids and targets) as well as model output types. The `tasks.json` file is flexible enough to accommodate different styles of hubs. Hubs can vary from a simple forecast hub (see [US Forecast Hub example](/format/intro-data-formats.md)) to a more complex, round-based scenario hub (see [US Scenario Modeling Hub example](/format/intro-data-formats.md)).
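
To illustrate how required and optional task id values might be checked, here is a minimal sketch. It assumes each task id maps to an object with `required` and `optional` value lists. That simplified shape, the function name `check_task_id_value`, and the example values are assumptions for illustration, so consult the schema below for the actual structure.

```python
def check_task_id_value(task_id_spec, value):
    """Classify a value as 'required', 'optional', or 'invalid' for one task id."""
    # Treat a missing or null list as empty.
    required = task_id_spec.get("required") or []
    optional = task_id_spec.get("optional") or []
    if value in required:
        return "required"
    if value in optional:
        return "optional"
    return "invalid"


# Hypothetical specification for a `location` task id.
location_spec = {"required": ["US"], "optional": ["01", "02"]}

status = check_task_id_value(location_spec, "01")
```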


### Model Tasks (`tasks.json`) Interactive Schema

#### Schema Version: {{schema_version}}
{{'[See raw schema](https://raw.githubusercontent.com/Infectious-Disease-Modeling-Hubs/schemas/BRANCH/SCHEMA_VERSION/tasks-schema.json)'.replace('SCHEMA_VERSION', schema_version).replace('BRANCH', schema_branch)}}

{{'<script src="../_static/docson/widget.js" data-schema="https://raw.githubusercontent.com/Infectious-Disease-Modeling-Hubs/schemas/BRANCH/SCHEMA_VERSION/tasks-schema.json"></script>'.replace('SCHEMA_VERSION', schema_version).replace('BRANCH', schema_branch)}}

62 changes: 0 additions & 62 deletions docs/source/format/hub-metadata.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/source/format/hub-structure.md
@@ -10,7 +10,7 @@ The directory and file structure of a modeling hub should contain only the follo
* Documentation files
* Hubs should provide a documentation file (e.g., `README.md`) at the top level that describes the overall structure of the hub, as well as a documentation file within each folder that provides more detail.

-* `hub-metadata` directory (see {doc}`/format/hub-metadata`)
+* `hub-config` directory (see {doc}`/format/hub-config`)

* `model-output` directory (see {doc}`/format/model-output`)

4 changes: 2 additions & 2 deletions docs/source/format/intro-data-formats.md
@@ -7,7 +7,7 @@ On this page we provide an [outline on the contents of this data formats section
This section of the documentation provides standards for:

* [Structure of hub repositories](hub-structure): standards for file and directory structures for Hubs
-* [Hub configuration files](hub-metadata): the files needed to set up and run a modeling Hub
+* [Hub configuration files](hub-config): the files needed to set up and run a modeling Hub
* [Model metadata](model-metadata): metadata describing models
* [Model output](model-output): standard formats for model output such as forecasts and projections that are saved in Hubs
* [Target data](target-data): standard formats for target data, the eventually observable quantities of interest to a hub
@@ -118,4 +118,4 @@ As Hubs define new modeling tasks, they may need to introduce new task ID variab

(submission-rounds)=
## Submission rounds
-Many Hubs will accept model output submissions over multiple rounds. In the case of the forecast hubs there has typically been one submission round per week, while the scenario hubs have had submission rounds less frequently, typically about once per month. As part of the [Hub metadata](hub-metadata), Hubs should specify a set of `round_id` values that uniquely identify the submission round. For instance, for weekly submissions the round id might be the date that submissions are due to the Hub or a specification of an epidemic week. In instances where the rounds do not follow a predetermined schedule, more generic identifiers such as “round1” may be preferred. The round id will be used as the file names of model output submissions and round-specific model abstract submissions, as well as in the Hub metadata to specify model tasks that may vary across rounds.
+Many Hubs will accept model output submissions over multiple rounds. In the case of the forecast hubs there has typically been one submission round per week, while the scenario hubs have had submission rounds less frequently, typically about once per month. As part of the [Hub configuration files](hub-config), Hubs should specify a set of `round_id` values that uniquely identify the submission round. For instance, for weekly submissions the round id might be the date that submissions are due to the Hub or a specification of an epidemic week. In instances where the rounds do not follow a predetermined schedule, more generic identifiers such as “round1” may be preferred. The round id will be used as the file names of model output submissions and round-specific model abstract submissions, as well as in the Hub metadata to specify model tasks that may vary across rounds.
4 changes: 2 additions & 2 deletions docs/source/index.md
@@ -26,7 +26,7 @@ We have created some [example Hub repositories](https://github.com/Infectious-Di

### Schema files for hub configuration

-To take advantage of the infrastructure designed by the Consortium, a hub must contain JSON configuration files in a [specific location and format](hub-metadata). The schemas that define the structure and formats of the configuration files live in their own [schemas repository](https://github.com/Infectious-Disease-Modeling-Hubs/schemas). The schemas are versioned, and every hub must point to a specific version of the schemas that they are using.
+To take advantage of the infrastructure designed by the Consortium, a hub must contain JSON configuration files in a [specific location and format](hub-config). The schemas that define the structure and formats of the configuration files live in their own [schemas repository](https://github.com/Infectious-Disease-Modeling-Hubs/schemas). The schemas are versioned, and every hub must point to a specific version of the schemas that they are using.

## Software for modeling hubs

@@ -56,7 +56,7 @@ overview/definitions.md
:hidden:
format/intro-data-formats.md
format/hub-structure.md
-format/hub-metadata.md
+format/hub-config.md
format/model-metadata.md
format/model-output.md
format/target-data.md
