This repository facilitates Gen3 data modeling using Google Sheets. It includes tools to convert Google Sheets into YAML files and then into a bundled JSON format. Additionally, it offers tools for schema validation and local data model visualization.
git clone --recurse-submodules "https://github.com/AustralianBioCommons/gen3schemadev.git"
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
To install Docker Desktop, download it from the Docker website and follow the installation instructions for your operating system. After installation, verify it by running docker --version in the terminal.
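The same check can be scripted before starting the containers. This is a minimal sketch, not part of the repository: require_cmd is a hypothetical helper name, and it only confirms that each command is on the PATH, not that the Docker daemon is actually running.

```shell
# Hypothetical helper (not part of this repository): succeed only if every
# named command is found on the PATH.
require_cmd() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing required command: $cmd" >&2
            return 1
        fi
    done
}

# Example: check the tools used in this README before running make.
# require_cmd docker make python3 && docker --version
```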
cd umccr-dictionary
make pull
make up
make ps
cd ..
You can run the workflow from the Schema Development Framework Notebook or by following the step-by-step usage below.
Alternatively, you can run the script:
bash scripts/generate_schema.sh --help
bash scripts/generate_schema.sh
This step pulls the schema design from a Google Sheet template. The template can be accessed here. Feel free to duplicate this spreadsheet and supply your own Google Sheet ID along with the tab IDs (gid values) for objects, links, properties, and enums.
[ -d "schema_out" ] && rm -rf "schema_out"
python3 sheet2yaml-CLI.py --google-id '1zjDBDvXgb0ydswFBwy47r2c8V1TFnpUj1jcG0xsY7ZI' --objects-gid 0 --links-gid 270346573 --properties-gid 613332252 --enums-gid 1807456496
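The --google-id value is the long identifier in the spreadsheet's URL, and each --*-gid value is the gid number that appears in the URL when that tab is selected. As a rough illustration of how these two identifiers address a single tab, the sketch below builds Google Sheets' CSV export URL for one tab. The helper name sheet_csv_url is hypothetical, and this is not necessarily how sheet2yaml-CLI.py retrieves the data. The commented curl example also assumes the sheet is shared so that anyone with the link can view it.

```shell
# Hypothetical helper: print the CSV export URL for one tab of a Google Sheet.
#   $1 = spreadsheet ID (the --google-id value)
#   $2 = tab ID (one of the --*-gid values)
sheet_csv_url() {
    printf 'https://docs.google.com/spreadsheets/d/%s/export?format=csv&gid=%s\n' "$1" "$2"
}

# Example: preview the first lines of the links tab used above.
# curl -sL "$(sheet_csv_url '1zjDBDvXgb0ydswFBwy47r2c8V1TFnpUj1jcG0xsY7ZI' 270346573)" | head
```

Previewing a tab this way is a quick check that a duplicated sheet and its tab IDs are correct before running the conversion.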
Copy the generated schema files into the umccr-dictionary directory:
mkdir -p umccr-dictionary/dictionary/schema_dev/gdcdictionary/schemas
cp schema_out/* umccr-dictionary/dictionary/schema_dev/gdcdictionary/schemas/
ls -lsha umccr-dictionary/dictionary/schema_dev/gdcdictionary/schemas/
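Before copying, it can help to confirm that the conversion step actually produced files. A minimal sketch, assuming the generated files use the .yaml extension (the text above only says they are YAML files); count_yaml is a hypothetical helper name.

```shell
# Hypothetical helper: count the .yaml files directly inside a directory.
# The .yaml extension is an assumption about the converter's output names.
count_yaml() {
    find "$1" -maxdepth 1 -type f -name '*.yaml' 2>/dev/null | wc -l | tr -d ' '
}

# Example: warn if the conversion step left schema_out empty.
# [ "$(count_yaml schema_out)" -gt 0 ] || echo "no YAML files in schema_out" >&2
```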
Compile and bundle the schema into a JSON format:
# Run in a subshell so the working directory stays at the repository root
(cd umccr-dictionary && make compile program=schema_dev)
Validate the compiled schema:
(cd umccr-dictionary && make validate program=schema_dev)
Open the data dictionary visualization in your web browser (the open command is macOS-specific; on Linux, use xdg-open or paste the URL into the browser):
open http://localhost:8080/#schema/schema_dev.json
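In this example the file name in the URL matches the program=schema_dev value passed to make. Assuming that pattern holds for other program names (an inference from this example, not something confirmed here), the URL could be derived as in the sketch below; dictionary_url is a hypothetical helper name.

```shell
# Hypothetical helper: build the local visualization URL for a program name.
# Assumes the bundle is served as <program>.json on port 8080, as above.
dictionary_url() {
    printf 'http://localhost:8080/#schema/%s.json\n' "$1"
}

# Example:
# open "$(dictionary_url schema_dev)"
```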