diff --git a/docs/deeptrio-pacbio-case-study.md b/docs/deeptrio-pacbio-case-study.md
index 5040fb68..428bb8fa 100644
--- a/docs/deeptrio-pacbio-case-study.md
+++ b/docs/deeptrio-pacbio-case-study.md
@@ -85,7 +85,7 @@ is run as a separate command.
 mkdir -p output
 mkdir -p output/intermediate_results_dir
-BIN_VERSION="1.7.0"
+BIN_VERSION="1.8.0"
 sudo apt -y update
 sudo apt-get -y install docker.io
@@ -221,13 +221,13 @@ As a result we should get the following output:
 ```bash
 Checking: /output/HG002_trio_merged.vcf.gz
 Family: [HG003 + HG004] -> [HG002]
-222 non-pass records were skipped
-Concordance HG002: F:166005/169476 (97.95%) M:166074/168579 (98.51%) F+M:159317/164363 (96.93%)
+188 non-pass records were skipped
+Concordance HG002: F:166225/169750 (97.92%) M:166415/168977 (98.48%) F+M:159575/164659 (96.91%)
 Sample HG002 has less than 99.0 concordance with both parents. Check for incorrect pedigree or sample mislabelling.
-0/188247 (0.00%) records did not conform to expected call ploidy
-176481/188247 (93.75%) records were variant in at least 1 family member and checked for Mendelian constraints
-10169/176481 (5.76%) records had indeterminate consistency status due to incomplete calls
-6610/176481 (3.75%) records contained a violation of Mendelian constraints
+0/188437 (0.00%) records did not conform to expected call ploidy
+176829/188437 (93.84%) records were variant in at least 1 family member and checked for Mendelian constraints
+10143/176829 (5.74%) records had indeterminate consistency status due to incomplete calls
+6722/176829 (3.80%) records contained a violation of Mendelian constraints
 ```
 ### Benchmark variant calls against 4.2.1 truth set with hap.py
@@ -289,22 +289,22 @@ sudo docker run \
 ```
 Benchmarking Summary for HG002:
 Type Filter TRUTH.TOTAL TRUTH.TP TRUTH.FN QUERY.TOTAL QUERY.FP QUERY.UNK FP.gt FP.al METRIC.Recall METRIC.Precision METRIC.Frac_NA METRIC.F1_Score TRUTH.TOTAL.TiTv_ratio QUERY.TOTAL.TiTv_ratio TRUTH.TOTAL.het_hom_ratio QUERY.TOTAL.het_hom_ratio
-INDEL ALL 11256 11215 41 23348 85 11580 30 50 0.996357 0.992777 0.495974 0.994564 NaN NaN 1.561710 2.133416
-INDEL PASS 11256 11215 41 23348 85 11580 30 50 0.996357 0.992777 0.495974 0.994564 NaN NaN 1.561710 2.133416
- SNP ALL 71333 71303 30 108157 20 36757 16 4 0.999579 0.999720 0.339849 0.999650 2.314904 1.745105 1.715978 1.773270
- SNP PASS 71333 71303 30 108157 20 36757 16 4 0.999579 0.999720 0.339849 0.999650 2.314904 1.745105 1.715978 1.773270
+INDEL ALL 11256 11213 43 23405 84 11635 32 45 0.996180 0.992863 0.497116 0.994519 NaN NaN 1.561710 2.151675
+INDEL PASS 11256 11213 43 23405 84 11635 32 45 0.996180 0.992863 0.497116 0.994519 NaN NaN 1.561710 2.151675
+ SNP ALL 71333 71305 28 108561 21 37160 14 7 0.999607 0.999706 0.342296 0.999657 2.314904 1.742256 1.715978 1.772847
+ SNP PASS 71333 71305 28 108561 21 37160 14 7 0.999607 0.999706 0.342296 0.999657 2.314904 1.742256 1.715978 1.772847
 Benchmarking Summary for HG003:
 Type Filter TRUTH.TOTAL TRUTH.TP TRUTH.FN QUERY.TOTAL QUERY.FP QUERY.UNK FP.gt FP.al METRIC.Recall METRIC.Precision METRIC.Frac_NA METRIC.F1_Score TRUTH.TOTAL.TiTv_ratio QUERY.TOTAL.TiTv_ratio TRUTH.TOTAL.het_hom_ratio QUERY.TOTAL.het_hom_ratio
-INDEL ALL 10628 10575 53 23766 78 12623 33 44 0.995013 0.993000 0.531137 0.994006 NaN NaN 1.748961 2.326587
-INDEL PASS 10628 10575 53 23766 78 12623 33 44 0.995013 0.993000 0.531137 0.994006 NaN NaN 1.748961 2.326587
- SNP ALL 70166 70145 21 117124 35 46895 11 10 0.999701 0.999502 0.400388 0.999601 2.296566 1.579731 1.883951 1.689079
- SNP PASS 70166 70145 21 117124 35 46895 11 10 0.999701 0.999502 0.400388 0.999601 2.296566 1.579731 1.883951 1.689079
+INDEL ALL 10628 10577 51 23776 77 12634 33 43 0.995201 0.993089 0.531376 0.994144 NaN NaN 1.748961 2.332224
+INDEL PASS 10628 10577 51 23776 77 12634 33 43 0.995201 0.993089 0.531376 0.994144 NaN NaN 1.748961 2.332224
+ SNP ALL 70166 70143 23 117125 35 46898 13 9 0.999672 0.999502 0.400410 0.999587 2.296566 1.57963 1.883951 1.685873
+ SNP PASS 70166 70143 23 117125 35 46898 13 9 0.999672 0.999502 0.400410 0.999587 2.296566 1.57963 1.883951 1.685873
 Benchmarking Summary for HG004:
 Type Filter TRUTH.TOTAL TRUTH.TP TRUTH.FN QUERY.TOTAL QUERY.FP QUERY.UNK FP.gt FP.al METRIC.Recall METRIC.Precision METRIC.Frac_NA METRIC.F1_Score TRUTH.TOTAL.TiTv_ratio QUERY.TOTAL.TiTv_ratio TRUTH.TOTAL.het_hom_ratio QUERY.TOTAL.het_hom_ratio
-INDEL ALL 11000 10957 43 24219 60 12690 25 30 0.996091 0.994796 0.523969 0.995443 NaN NaN 1.792709 2.345610
-INDEL PASS 11000 10957 43 24219 60 12690 25 30 0.996091 0.994796 0.523969 0.995443 NaN NaN 1.792709 2.345610
- SNP ALL 71659 71621 38 116803 28 45069 10 10 0.999470 0.999610 0.385855 0.999540 2.310073 1.63293 1.878340 1.630435
- SNP PASS 71659 71621 38 116803 28 45069 10 10 0.999470 0.999610 0.385855 0.999540 2.310073 1.63293 1.878340 1.630435
+INDEL ALL 11000 10954 46 24235 70 12701 29 36 0.995818 0.993931 0.524077 0.994874 NaN NaN 1.792709 2.351344
+INDEL PASS 11000 10954 46 24235 70 12701 29 36 0.995818 0.993931 0.524077 0.994874 NaN NaN 1.792709 2.351344
+ SNP ALL 71659 71617 42 116988 22 45260 11 7 0.999414 0.999693 0.386877 0.999554 2.310073 1.633809 1.878340 1.626369
+ SNP PASS 71659 71617 42 116988 22 45260 11 7 0.999414 0.999693 0.386877 0.999554 2.310073 1.633809 1.878340 1.626369
 ```
diff --git a/docs/deeptrio-quick-start.md b/docs/deeptrio-quick-start.md
index 43e42b62..6463e234 100644
--- a/docs/deeptrio-quick-start.md
+++ b/docs/deeptrio-quick-start.md
@@ -32,7 +32,7 @@ documentation on how to build.
 ### Get Docker image
 ```bash
-BIN_VERSION="1.7.0"
+BIN_VERSION="1.8.0"
 sudo apt -y update
 sudo apt-get -y install docker.io
@@ -174,17 +174,14 @@ HG002.g.vcf.gz
 HG002.g.vcf.gz.tbi
 HG002.output.vcf.gz
 HG002.output.vcf.gz.tbi
-HG002.output.visual_report.html
 HG003.g.vcf.gz
 HG003.g.vcf.gz.tbi
 HG003.output.vcf.gz
 HG003.output.vcf.gz.tbi
-HG003.output.visual_report.html
 HG004.g.vcf.gz
 HG004.g.vcf.gz.tbi
 HG004.output.vcf.gz
 HG004.output.vcf.gz.tbi
-HG004.output.visual_report.html
 intermediate_results_dir
 ```
@@ -341,7 +338,7 @@ INDEL PASS 2 2 0 2 0 0
 [BAM]: http://genome.sph.umich.edu/wiki/BAM
 [BWA]: https://academic.oup.com/bioinformatics/article/25/14/1754/225615/Fast-and-accurate-short-read-alignment-with
 [docker build]: https://docs.docker.com/engine/reference/commandline/build/
-[Dockerfile]: https://github.com/google/deepvariant/blob/r1.7/Dockerfile.deeptrio
+[Dockerfile]: https://github.com/google/deepvariant/blob/r1.8/Dockerfile.deeptrio
 [FASTA]: https://en.wikipedia.org/wiki/FASTA_format
 [VCF]: https://samtools.github.io/hts-specs/VCFv4.3.pdf
 [run_deeptrio.py]: ../scripts/run_deeptrio.py
diff --git a/docs/deeptrio-wgs-case-study.md b/docs/deeptrio-wgs-case-study.md
index d001e412..41d07611 100644
--- a/docs/deeptrio-wgs-case-study.md
+++ b/docs/deeptrio-wgs-case-study.md
@@ -82,7 +82,7 @@ command.
 mkdir -p output
 mkdir -p output/intermediate_results_dir
-BIN_VERSION="1.7.0"
+BIN_VERSION="1.8.0"
 sudo docker pull google/deepvariant:deeptrio-"${BIN_VERSION}"
@@ -211,13 +211,13 @@ As a result we should get the following output:
 ```bash
 Checking: /output/HG002_trio_merged.vcf.gz
 Family: [HG003 + HG004] -> [HG002]
-95 non-pass records were skipped
-Concordance HG002: F:137908/139703 (98.72%) M:137988/139909 (98.63%) F+M:134596/137968 (97.56%)
+86 non-pass records were skipped
+Concordance HG002: F:138004/139790 (98.72%) M:138049/139959 (98.64%) F+M:134711/138044 (97.59%)
 Sample HG002 has less than 99.0 concordance with both parents. Check for incorrect pedigree or sample mislabelling.
-0/146013 (0.00%) records did not conform to expected call ploidy
-143704/146013 (98.42%) records were variant in at least 1 family member and checked for Mendelian constraints
-5066/143704 (3.53%) records had indeterminate consistency status due to incomplete calls
-3886/143704 (2.70%) records contained a violation of Mendelian constraints
+0/146134 (0.00%) records did not conform to expected call ploidy
+143783/146134 (98.39%) records were variant in at least 1 family member and checked for Mendelian constraints
+5082/143783 (3.53%) records had indeterminate consistency status due to incomplete calls
+3842/143783 (2.67%) records contained a violation of Mendelian constraints
 ```
 ### Perform analysis with hap.py against 4.2.1 truth set
@@ -279,22 +279,22 @@ sudo docker run \
 ```
 Benchmarking Summary for HG002:
 Type Filter TRUTH.TOTAL TRUTH.TP TRUTH.FN QUERY.TOTAL QUERY.FP QUERY.UNK FP.gt FP.al METRIC.Recall METRIC.Precision METRIC.Frac_NA METRIC.F1_Score TRUTH.TOTAL.TiTv_ratio QUERY.TOTAL.TiTv_ratio TRUTH.TOTAL.het_hom_ratio QUERY.TOTAL.het_hom_ratio
-INDEL ALL 11256 11208 48 21239 13 9586 7 4 0.995736 0.998884 0.451340 0.997308 NaN NaN 1.561710 2.047281
-INDEL PASS 11256 11208 48 21239 13 9586 7 4 0.995736 0.998884 0.451340 0.997308 NaN NaN 1.561710 2.047281
- SNP ALL 71333 71087 246 88976 42 17795 5 4 0.996551 0.999410 0.199998 0.997979 2.314904 2.029984 1.715978 1.716560
- SNP PASS 71333 71087 246 88976 42 17795 5 4 0.996551 0.999410 0.199998 0.997979 2.314904 2.029984 1.715978 1.716560
+INDEL ALL 11256 11208 48 21232 13 9579 7 4 0.995736 0.998884 0.451159 0.997308 NaN NaN 1.561710 2.044750
+INDEL PASS 11256 11208 48 21232 13 9579 7 4 0.995736 0.998884 0.451159 0.997308 NaN NaN 1.561710 2.044750
+ SNP ALL 71333 71088 245 89034 41 17853 4 3 0.996565 0.999424 0.200519 0.997993 2.314904 2.026055 1.715978 1.717178
+ SNP PASS 71333 71088 245 89034 41 17853 4 3 0.996565 0.999424 0.200519 0.997993 2.314904 2.026055 1.715978 1.717178
 Benchmarking Summary for HG003:
 Type Filter TRUTH.TOTAL TRUTH.TP TRUTH.FN QUERY.TOTAL QUERY.FP QUERY.UNK FP.gt FP.al METRIC.Recall METRIC.Precision METRIC.Frac_NA METRIC.F1_Score TRUTH.TOTAL.TiTv_ratio QUERY.TOTAL.TiTv_ratio TRUTH.TOTAL.het_hom_ratio QUERY.TOTAL.het_hom_ratio
-INDEL ALL 10628 10584 44 21028 20 9969 13 6 0.995860 0.998192 0.474082 0.997024 NaN NaN 1.748961 2.197401
-INDEL PASS 10628 10584 44 21028 20 9969 13 6 0.995860 0.998192 0.474082 0.997024 NaN NaN 1.748961 2.197401
- SNP ALL 70166 69975 191 85299 55 15231 15 4 0.997278 0.999215 0.178560 0.998246 2.296566 2.064978 1.883951 1.845348
- SNP PASS 70166 69975 191 85299 55 15231 15 4 0.997278 0.999215 0.178560 0.998246 2.296566 2.064978 1.883951 1.845348
+INDEL ALL 10628 10578 50 21055 24 9997 17 6 0.995295 0.997830 0.474804 0.996561 NaN NaN 1.748961 2.209131
+INDEL PASS 10628 10578 50 21055 24 9997 17 6 0.995295 0.997830 0.474804 0.996561 NaN NaN 1.748961 2.209131
+ SNP ALL 70166 69977 189 85399 64 15325 17 8 0.997306 0.999087 0.179452 0.998196 2.296566 2.061752 1.883951 1.846595
+ SNP PASS 70166 69977 189 85399 64 15325 17 8 0.997306 0.999087 0.179452 0.998196 2.296566 2.061752 1.883951 1.846595
 Benchmarking Summary for HG004:
 Type Filter TRUTH.TOTAL TRUTH.TP TRUTH.FN QUERY.TOTAL QUERY.FP QUERY.UNK FP.gt FP.al METRIC.Recall METRIC.Precision METRIC.Frac_NA METRIC.F1_Score TRUTH.TOTAL.TiTv_ratio QUERY.TOTAL.TiTv_ratio TRUTH.TOTAL.het_hom_ratio QUERY.TOTAL.het_hom_ratio
-INDEL ALL 11000 10945 55 21426 27 9969 22 4 0.995000 0.997643 0.465276 0.996320 NaN NaN 1.792709 2.279678
-INDEL PASS 11000 10945 55 21426 27 9969 22 4 0.995000 0.997643 0.465276 0.996320 NaN NaN 1.792709 2.279678
- SNP ALL 71659 71446 213 86406 52 14858 9 4 0.997028 0.999273 0.171956 0.998149 2.310073 2.064306 1.878340 1.735500
- SNP PASS 71659 71446 213 86406 52 14858 9 4 0.997028 0.999273 0.171956 0.998149 2.310073 2.064306 1.878340 1.735500
+INDEL ALL 11000 10949 51 21433 23 9975 16 5 0.995364 0.997993 0.465404 0.996676 NaN NaN 1.792709 2.280107
+INDEL PASS 11000 10949 51 21433 23 9975 16 5 0.995364 0.997993 0.465404 0.996676 NaN NaN 1.792709 2.280107
+ SNP ALL 71659 71445 214 86523 48 14980 8 3 0.997014 0.999329 0.173133 0.998170 2.310073 2.064759 1.878340 1.737322
+ SNP PASS 71659 71445 214 86523 48 14980 8 3 0.997014 0.999329 0.173133 0.998170 2.310073 2.064759 1.878340 1.737322
 ```