There are three options for providing a dataset to fine-tune or evaluate the Text to SQL (QueryCraft) pipeline:
- Bring your own dataset of golden queries with the following fields: question, query, and db_id. Instructions for ingesting the dataset are provided in Step 1.
- Curate a golden query dataset using our annotation tool: https://annotator.superknowa.tsglwatson.buildlab.cloud/
- Use the example datasets provided below for testing: Spider and KaggleDBQA.
Unzip the example datasets using the following commands:

    unzip spider.zip
    unzip KaggleDBQA.zip
    cd ..

To curate golden queries with the annotation tool:
- Go to our annotation tool: https://annotator.superknowa.tsglwatson.buildlab.cloud/
- Click Instruction Manual and follow the instructions for curating the golden query dataset: https://annotator.superknowa.tsglwatson.buildlab.cloud/documentation
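
If you bring your own dataset, it can help to check that every record carries the three fields named above (question, query, and db_id) before ingesting it. The sketch below is only an illustration: it assumes the dataset is a CSV file with those column names, which this section does not confirm, and the helper `find_invalid_rows` and the sample rows are hypothetical rather than part of QueryCraft.

```python
import csv
import io

# The three fields this section names for a golden-query dataset.
REQUIRED_FIELDS = ("question", "query", "db_id")


def find_invalid_rows(rows):
    """Return (row_number, missing_fields) for each row lacking a required field.

    Hypothetical helper, not part of QueryCraft. Row numbers start at 1
    and count data rows only (the header is not counted).
    """
    problems = []
    for row_number, row in enumerate(rows, start=1):
        # Treat absent, None, and whitespace-only values as missing.
        missing = [f for f in REQUIRED_FIELDS if not (row.get(f) or "").strip()]
        if missing:
            problems.append((row_number, missing))
    return problems


# Made-up sample data; the CSV layout is an assumption.
SAMPLE_CSV = """question,query,db_id
How many singers are there?,SELECT count(*) FROM singer,concert_singer
List all stadium names.,,concert_singer
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
problems = find_invalid_rows(rows)
print(problems)  # [(2, ['query'])]
```

To run the same check on a file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle, for example `open(path, newline="", encoding="utf-8")`.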