Illumina ReadMe
MikeWLloyd edited this page May 28, 2024
4 revisions
For input sample:
• Fastp read quality and adapter trimming
• Get Read Group Information
• BWA-MEM Alignment
• Samtools SortSam and GATK Mark Duplicates
• Collect Alignment Summary Metrics
• Smoove SV calling and Bcftools reheader variant sorting
• Manta SV calling and Bcftools reheader variant sorting
• Delly SV calling and Bcftools reheader variant sorting
• Delly CNV calling and Bcftools reheader variant sorting
• GATK Haplotype Calling
• VEP annotation of Haplotype Called GVCF
• VEP annotation of sorted CNVs
• Duphold annotation with Bam, SVs and SNPs/INDELs
• Survivor Merge of Duphold annotated VCFs
• Collect Survivor merged VCFs Summary Metrics
• Survivor merged VCFs to Table and Survivor to BEDs
• Bedtools intersect of Survivor BEDs
• Survivor annotation VCF with Exons
flowchart TD
p00 --> p01
p01 --> p02
p02 --> p03
p03 --> p04
m01 -..-> |If Pre-Aligned Bam Provided| p04
o1 --> p05
o1 --> p06
o1 --> p07
o1 --> p08
o1 --> p09
p09 --> p12
o1 --> p10
o2 --> p11
p11 --> p13
o9 --> p14
o2 --> p14
p11 --> p14
o10 --> p15
p10 --> o2
o2 --> p15
p11 --> p15
p13 --> p16
p14 --> p16
p15 --> p16
p16 --> o4
o4 --> p17
o4 --> p18
p17 --> p19
p18 --> p19
p19 --> p20
p20 --> o11
p19 --> p21
o11 --> p21
p17 --> p21
p18 --> p21
o4 --> p22
o11 --> p22
o1([Genomic BAM]):::output
o2([Raw Variant Calls]):::output
o3([Alignment Stats]):::output
o4([Merged VCF]):::output
o5([Annotated SV Calls]):::output
o6([SV Joined Results]):::output
o7([DELLY SV Calls]):::output
o8([Annotated CNVs]):::output
o9([MANTA SV Calls]):::output
o10([SMOOVE SV Calls]):::output
o11([Intersect BEDS]):::output
p04 --> o1
p05 --> o3
p21 --> o6
p22 --> o5
p08 --> o7
p12 --> o8
p07 --> o9
p06 --> o10
classDef output fill:#90aaff,stroke:#6c8eff,stroke-width:2px,color:#000000
- Default:
- Comment: The sample ID for the input data (required).
- Default:
- Default:
- Comment: The directory that the saved outputs will be stored.
- Default:
- Default:
- Comment: How to organize the output folder structure. Options: sample or analysis.
- Default:
- Default:
- Comment: The directory that contains cached Singularity containers. JAX users should not change this parameter.
- Default:
- Default:
- Comment: The directory used for all intermediary files and Nextflow processes. This directory can become quite large, so it should be a location on /fastscratch or another directory with ample storage.
- Default:
- Default: null
- Comment: Options:
, oront
- Default:
- Comment: Type of reads: paired end (PE) or single end (SE). Default: PE.
- Default:
- Default: null
- Comment: Provide a CSV manifest file with the header: "sampleID,lane,fastq_1,fastq_2". See below for an example file. Fastq_2 is optional and used only in PE data. Fastq files can either be absolute paths to local files, or URLs to remote files. If remote URLs are provided, *
can be specified.
- Default: null
- Comment: The path to a single FASTQ file, or one of a pair of FASTQs for paired-end data.
- Default: null
- Comment: The path to the second of a pair of FASTQs for paired-end data.
- Default: null
- Comment: The path to a BAM input file if alignment has already been performed outside this pipeline.
- Default:
- Comment: The path to the reference genome in FASTA format.
- Default:
- Default:
- Comment: Optional parameter to specify BWA indices for alignment. If not provided, the pipeline will generate these indices.
- Default:
- Default:
- Comment: Mouse specific. Options: GRCm38 or GRCm39. Parameter that controls reference data used for alignment and annotation.
- Default:
- Default:
- Comment: BED file that lists the coordinates of centromeres and telomeres to exclude as alignment targets. Note: the default path refers to a location within the containers quay.io/jaxcompsci/lumpy-ref_data:0.3.1--refv0.2.0 and quay.io/jaxcompsci/delly-ref_data:1.1.6--refv0.2.0, which require this file.
- Default:
- Default:
- Comment: BED file that lists previously identified insertion SVs. Note: the default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
- Default:
- Default:
- BED file that lists previously identified deletion SVs. Note: the default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
- Default:
- Default:
- BED file that lists previously identified inversion SVs. Note: the default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
- Default:
- Default:
- BED file that lists regulatory features. Note: default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
- Default:
- Default:
- BED file that lists gene symbol IDs and coordinates. Note: default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
- Default:
- Default:
- BED file that lists exons and coordinates. Note: default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
- Default:
- Default: 30
- Quality score threshold.
- Default: 30
- Percent threshold of unqualified bases allowed for a read to pass.
- Default: 1000
- Maximum distance between breakpoints for merging SVs.
- Default: 1
- The number of callers (out of 4) required to support an SV.
- Default: 1
- Boolean (0/1) that requires SVs to be the same type for merging.
- Default: 1
- Boolean (0/1) that requires SVs to be on the same strand for merging.
- Default: 30
- Minimum length (bp) to output SVs.
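The last five parameters above (breakpoint distance, caller support, type agreement, strand agreement, minimum length) jointly decide which SV calls are merged and reported. The following is a minimal Python sketch of how such thresholds could interact, assuming a simplified greedy clustering. It is not SURVIVOR's implementation, and `SVCall`, `compatible`, `merge_calls`, and all argument names are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SVCall:
    """Hypothetical, simplified SV record (not a real VCF parser)."""
    caller: str
    chrom: str
    start: int
    end: int
    svtype: str
    strand: str

    @property
    def length(self) -> int:
        return abs(self.end - self.start)


def compatible(a, b, max_dist=1000, same_type=1, same_strand=1):
    """Return True if two calls could be merged under the given thresholds."""
    if a.chrom != b.chrom:
        return False
    if same_type and a.svtype != b.svtype:
        return False
    if same_strand and a.strand != b.strand:
        return False
    # Both breakpoints must lie within max_dist of each other.
    return abs(a.start - b.start) <= max_dist and abs(a.end - b.end) <= max_dist


def merge_calls(calls, max_dist=1000, min_callers=1,
                same_type=1, same_strand=1, min_len=30):
    """Greedy clustering sketch; defaults mirror the values listed above."""
    clusters = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.start, c.end)):
        if call.length < min_len:
            continue  # assumption: short calls are dropped before clustering
        for cluster in clusters:
            # Compare against the first call placed in the cluster.
            if compatible(cluster[0], call, max_dist, same_type, same_strand):
                cluster.append(call)
                break
        else:
            clusters.append([call])
    # Keep clusters supported by at least min_callers distinct callers.
    return [c for c in clusters if len({x.caller for x in c}) >= min_callers]


calls = [
    SVCall("delly", "chr1", 1000, 2000, "DEL", "+-"),
    SVCall("manta", "chr1", 1200, 2100, "DEL", "+-"),
    SVCall("lumpy", "chr1", 50000, 50010, "DEL", "+-"),  # shorter than 30 bp
    SVCall("delly", "chr2", 100, 900, "INV", "++"),
]
merged_default = merge_calls(calls)                 # every cluster is kept
merged_strict = merge_calls(calls, min_callers=2)   # needs two callers
```

With the default of one supporting caller, every SV that passes the length filter is retained. Raising the caller requirement keeps only calls that at least two tools agree on, which trades sensitivity for precision.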
| Naming Convention | Description |
| --- | --- |
| germline_sv_report.html | Nextflow autogenerated report |
| trace/trace.txt | Nextflow trace of processes |
| ${sampleID}/${sampleID}_ILLUMINA_DLM_struct_var.vcf | VCF output combining merged Delly, Lumpy, and Manta calls annotated for overlap with exonic regions |
| ${sampleID}/${sampleID}_survivor_joined_results.csv | Table of SVs annotated with overlaps of previously identified SVs (beck), genes, exons, regulatory regions |
| ${sampleID}/stats/${sampleID}_fastp_report.html | Filtering and trimming report from fastp |
| ${sampleID}/alignments/${sampleID}.md.bam | Analysis-ready alignment of reads |
| ${sampleID}/alignments/${sampleID}.md.bai | Index for analysis-ready alignment of reads |
| ${sampleID}/alignments/${sampleID}.md.metrics | GATK MarkDuplicates log |
| ${sampleID}/alignments/${sampleID}.insert_size.txt | Inferred read insert size |
| ${sampleID}/unmerged_calls/${sampleID}_dellySort.vcf | SV calls from Delly |
| ${sampleID}/unmerged_calls/${sampleID}_lumpySort.vcf | SV calls from Lumpy |
| ${sampleID}/unmerged_calls/${sampleID}_mantaSort.vcf | SV calls from Manta |
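
The naming conventions above use `${sampleID}` placeholders, which can be expanded to check that a finished run produced every expected file. The following Python sketch assumes the paths are relative to the output directory and that the layout matches the listing above (the folder organization option may change it). The function name `expected_outputs` and the example directory `/path/to/pubdir` are hypothetical.

```python
from pathlib import PurePosixPath
from string import Template

# Templates copied from the naming conventions listed above.
OUTPUT_TEMPLATES = [
    "germline_sv_report.html",
    "trace/trace.txt",
    "${sampleID}/${sampleID}_ILLUMINA_DLM_struct_var.vcf",
    "${sampleID}/${sampleID}_survivor_joined_results.csv",
    "${sampleID}/stats/${sampleID}_fastp_report.html",
    "${sampleID}/alignments/${sampleID}.md.bam",
    "${sampleID}/alignments/${sampleID}.md.bai",
    "${sampleID}/alignments/${sampleID}.md.metrics",
    "${sampleID}/alignments/${sampleID}.insert_size.txt",
    "${sampleID}/unmerged_calls/${sampleID}_dellySort.vcf",
    "${sampleID}/unmerged_calls/${sampleID}_lumpySort.vcf",
    "${sampleID}/unmerged_calls/${sampleID}_mantaSort.vcf",
]


def expected_outputs(sample_id, pubdir):
    """Expand each template for one sample under the output directory."""
    root = PurePosixPath(pubdir)
    return [root / Template(t).substitute(sampleID=sample_id)
            for t in OUTPUT_TEMPLATES]


# Example: list the files expected for a sample named "S1".
paths = expected_outputs("S1", "/path/to/pubdir")
for p in paths:
    print(p)
```

To verify a completed run, each returned path can be converted to a `pathlib.Path` and tested with `.exists()`.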