Tools for Gen3 Data Dictionary Development

This repository facilitates Gen3 data modeling using Google Sheets. It includes tools to convert Google Sheets into YAML files and then into a bundled JSON format. Additionally, it offers tools for schema validation and local data model visualization.

Setup

1. Set up environment

git clone --recurse-submodules "https://github.com/AustralianBioCommons/gen3schemadev.git"
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

2. Install Docker

To install Docker Desktop, download it from the Docker website and follow the installation instructions for your operating system. After installation, verify by running docker --version in the terminal.
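A quick pre-flight check can confirm that the tools used in the setup steps are on your PATH before you continue. This is a minimal sketch, not part of this repository: `require_cmds` is a hypothetical helper name, and the list of tools in the usage comment is inferred from the commands shown in this README.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): report every named
# command that is missing from PATH and return non-zero if any is missing.
require_cmds() {
  local missing=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage (tool list assumed from the commands in this README):
# require_cmds git python3 docker make || echo "install the missing tools first"
```

Reporting all missing tools in one pass, rather than stopping at the first, avoids repeated install-and-retry cycles.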

3. Spin up containers

cd umccr-dictionary
make pull
make up
make ps
cd ..

Usage

You can run the workflow in the Schema Development Framework Notebook or by following the numbered steps below.

Alternatively, you can run the script:

bash scripts/generate_schema.sh --help
bash scripts/generate_schema.sh

1. Pull Data Schema from Google Sheets

This step pulls the schema design from a Google Sheet template. The template can be accessed here. To use your own design, duplicate the spreadsheet and supply your own Google Sheet ID along with the tab IDs (gids) for the objects, links, properties, and enums tabs.

[ -d "schema_out" ] && rm -rf "schema_out"
python3 sheet2yaml-CLI.py --google-id '1zjDBDvXgb0ydswFBwy47r2c8V1TFnpUj1jcG0xsY7ZI' --objects-gid 0 --links-gid 270346573 --properties-gid 613332252 --enums-gid 1807456496
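If you are unsure where the sheet ID and tab gids come from, both normally appear in the spreadsheet's browser URL, which generally has the form `https://docs.google.com/spreadsheets/d/<ID>/edit#gid=<GID>`. The sketch below extracts them with shell parameter expansion. The function names `sheet_id_from_url` and `gid_from_url` are hypothetical, and the example URL is assembled from the ID and gid used in the command above purely for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of this repository).

# Print the spreadsheet ID: the path segment that follows "/spreadsheets/d/".
sheet_id_from_url() {
  local rest
  case "$1" in
    */spreadsheets/d/*)
      rest="${1#*/spreadsheets/d/}"   # drop everything up to and including "/d/"
      printf '%s\n' "${rest%%/*}"     # keep text before the next "/"
      ;;
    *) return 1 ;;
  esac
}

# Print the tab gid: the digits that follow the last "gid=" in the URL.
gid_from_url() {
  local rest
  case "$1" in
    *gid=*)
      rest="${1##*gid=}"                # drop everything up to the last "gid="
      printf '%s\n' "${rest%%[!0-9]*}"  # cut at the first non-digit, if any
      ;;
    *) return 1 ;;
  esac
}

# Example usage (URL constructed for illustration):
# url='https://docs.google.com/spreadsheets/d/1zjDBDvXgb0ydswFBwy47r2c8V1TFnpUj1jcG0xsY7ZI/edit#gid=270346573'
# sheet_id_from_url "$url"
# gid_from_url "$url"
```

Open each tab in the browser in turn and read its gid from the URL to fill in `--objects-gid`, `--links-gid`, `--properties-gid`, and `--enums-gid`.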

2. Move Schema Output

Move the generated schema files to the umccr-dictionary directory:

mkdir -p umccr-dictionary/dictionary/schema_dev/gdcdictionary/schemas
cp schema_out/* umccr-dictionary/dictionary/schema_dev/gdcdictionary/schemas/
ls -lsha umccr-dictionary/dictionary/schema_dev/gdcdictionary/schemas/
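If the previous step failed silently, `schema_out` may be missing or empty, and the copy above would then fail or leave stale files in place. The following is a minimal sketch of a guarded copy. `copy_schemas` is a hypothetical helper, not part of this repository, and it assumes only that the generated files sit directly inside the source directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): copy every file from a
# source directory into a destination directory, failing early if the source
# is missing or empty.
copy_schemas() {
  local src="$1" dest="$2"
  if [ ! -d "$src" ]; then
    echo "no such directory: $src" >&2
    return 1
  fi
  # Expand the glob into the positional parameters; if nothing matches,
  # the pattern stays literal and the -e test below fails.
  set -- "$src"/*
  if [ ! -e "$1" ]; then
    echo "no files found in: $src" >&2
    return 1
  fi
  mkdir -p "$dest" && cp "$@" "$dest"/
}

# Example usage with the paths from this step:
# copy_schemas schema_out umccr-dictionary/dictionary/schema_dev/gdcdictionary/schemas
```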

3. Compile and Bundle into JSON

Compile and bundle the schema into a JSON format:

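# Run in a subshell so the working directory stays at the repository root
(cd umccr-dictionary && make compile program=schema_dev)

4. Run Validation

Validate the compiled schema:

# Run in a subshell so the working directory stays at the repository root
(cd umccr-dictionary && make validate program=schema_dev)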

5. Visualize Data Dictionary

Open the data dictionary visualization in your web browser (the open command below is macOS-specific; on other systems, open the URL directly in a browser):

open http://localhost:8080/#schema/schema_dev.json
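For a command that works on more than one platform, the launcher can be chosen from the kernel name. This is a sketch under stated assumptions: `pick_opener` is a hypothetical helper, and it covers only macOS (`open`) and Linux desktops where `xdg-open` is installed. On any other system it returns non-zero so that you can fall back to opening the URL by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): print the name of the
# URL-opening command for a given kernel name.
# $1: kernel name as reported by `uname -s` (defaults to the current system)
pick_opener() {
  case "${1:-$(uname -s)}" in
    Darwin) echo "open" ;;
    Linux)  echo "xdg-open" ;;
    *)      return 1 ;;
  esac
}

# Example usage:
# url="http://localhost:8080/#schema/schema_dev.json"
# if opener="$(pick_opener)"; then "$opener" "$url"; else echo "Open $url manually"; fi
```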