Skip to content

Files

Latest commit

 

History

History
85 lines (65 loc) · 3.11 KB

deepvariant-docker.md

File metadata and controls

85 lines (65 loc) · 3.11 KB

DeepVariant via Docker

You may run DeepVariant binaries via Docker. The image has all DeepVariant binaries and their dependencies preinstalled, which makes it easier to run DeepVariant locally, on a VM, or at scale using the Genomics Pipelines API.

Setup

If you have not already done so, please follow the instructions on the quick start page.

Run DeepVariant at scale using the Genomics Pipelines API

Please follow the instructions on Google Cloud Genomics for running DeepVariant at scale using the Genomics Pipelines API.

Run the DeepVariant docker image locally or on a dedicated VM

Prerequisites

You need to install docker to run the image locally or on a dedicated VM. General installation instructions are on the Docker site. Ubuntu installation instructions are here. To avoid having to run the docker commands using sudo, follow the instructions here.

Run the pipeline

Copy all required input files to a directory that can be accessed within the docker image. In this example, we are going to use the quickstart test data:

mkdir -p input
gsutil -m cp gs://deepvariant/quickstart-testdata/* input/

Copy the model:

mkdir -p models
gsutil -m cp gs://deepvariant/models/DeepVariant/0.4.0/DeepVariant-inception_v3-0.4.0+cl-174375304.data-wgs_standard/* models/

Run the docker image mounting the input and models folders. You may need to run with sudo if you get permission denied errors or follow the instructions here to manage docker as a non-root user.

# You may use the 'latest' label to get the latest version.
IMAGE_VERSION=0.4.1
gcloud docker -- pull gcr.io/deepvariant-docker/deepvariant:$IMAGE_VERSION
docker run -it -v $PWD/input:/dv2/input -v $PWD/models:/dv2/models \
    gcr.io/deepvariant-docker/deepvariant:$IMAGE_VERSION

Finally, within the docker image, run:

./opt/deepvariant/bin/make_examples \
    --mode calling \
    --ref /dv2/input/ucsc.hg19.chr20.unittest.fasta.gz \
    --reads /dv2/input/NA12878_S1.chr20.10_10p1mb.bam \
    --examples output.examples.tfrecord \
    --regions "chr20:10,000,000-10,010,000"

./opt/deepvariant/bin/call_variants \
    --outfile call_variants_output.tfrecord \
    --examples output.examples.tfrecord \
    --checkpoint /dv2/models/model.ckpt

./opt/deepvariant/bin/postprocess_variants \
    --ref /dv2/input/ucsc.hg19.chr20.unittest.fasta.gz \
    --infile call_variants_output.tfrecord \
    --outfile output.vcf

You may mount an additional output folder (using -v) when running the docker image to store the resulting VCF file directly on your local machine.