PanPhlAn mapping
Leonard Dubois edited this page May 18, 2020
panphlan_map.py maps metagenomic samples against a pangenome. It requires bowtie2 and samtools, and it uses the data generated by panphlan_new_pangenome_generation.py (bowtie2 indexes and the concatenated .fna file). The script must be called once for each sample file. The output can then be analyzed with panphlan_profile.py.
Example:
panphlan/panphlan_map.py -c erectale -i sample01.tar.gz -o map_results/sample01_erectale.csv
- -c CLADE_NAME to specify the species database.
- -i INPUT_FILE input path to a metagenomic sample. The following file formats are accepted: .fastq, .fastq.gz, .fastq.bz2, .tar.gz, .tar.bz2, and .sra.
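Because panphlan_map.py processes one sample per call, a small wrapper can build one command per sample file. The following is a minimal sketch and not part of PanPhlAn: the helper names sample_id and build_map_commands and the sample file names are hypothetical, and the output naming simply follows the example above. Only the -c, -i, and -o options and the list of accepted extensions come from this page.

```python
from pathlib import Path

# Input extensions listed on this page as accepted by panphlan_map.py.
ACCEPTED_SUFFIXES = (".fastq", ".fastq.gz", ".fastq.bz2", ".tar.gz", ".tar.bz2", ".sra")


def sample_id(sample):
    """Return the file name without its accepted extension (hypothetical helper).

    Raises ValueError if the extension is not in ACCEPTED_SUFFIXES.
    """
    name = Path(sample).name
    for suffix in ACCEPTED_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    raise ValueError(f"unsupported input format: {name}")


def build_map_commands(samples, clade, out_dir="map_results",
                       script="panphlan/panphlan_map.py"):
    """Build one panphlan_map.py argument list per sample; nothing is executed."""
    commands = []
    for sample in samples:
        # Output name follows the pattern used in the example: sampleID_clade.csv
        output = (Path(out_dir) / f"{sample_id(sample)}_{clade}.csv").as_posix()
        commands.append([script, "-c", clade, "-i", str(sample), "-o", output])
    return commands


if __name__ == "__main__":
    # Hypothetical sample files; replace with real paths.
    for cmd in build_map_commands(["sample01.tar.gz", "sample02.fastq.gz"], "erectale"):
        print(" ".join(cmd))
        # To actually run the mapping, pass cmd to subprocess.run(cmd, check=True).
```

For sample01.tar.gz and the clade erectale, the first printed command is the same as the example line above. Building argument lists rather than shell strings avoids quoting problems when a sample path contains spaces.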
If no --output argument is provided, the default value map_results is used and the map_results/ folder is created. This folder contains:
- a mapping result file named INPUT_FILE_CLADE_NAME.csv
./panphlan/panphlan_map.py -h
-h, --help show this help message and exit
-i INPUT_FILE, --input INPUT_FILE
File(s) containing the unpaired reads to be aligned
using Bowtie2. If not specified, Bowtie2 gets the read
from the stdin filehandle.
--i_bowtie2_indexes INPUT_BOWTIE2_INDEXES
Input directory of bowtie2 indexes and pangenome
--fastx FASTX_FORMAT Read input format (fasta or fastq), default: fastq, if
not fasta recognized by file ending.
-c CLADE_NAME, --clade CLADE_NAME
Name of the specie to consider, i.e. the basename of
the index for the reference genome used by Bowtie2 to
align reads.
-o OUTPUT_FILE, --output OUTPUT_FILE
Mapping result output-file: path/sampleID_clade.csv
--th_mismatches NUMOF_MISMATCHES
Number of mismatches to filter.
-p NUMOF_PROCESSORS, --nproc NUMOF_PROCESSORS
Maximum number of processors to use. Default value is
the minimum between 12 and the number of available
processors.
-b OUTPUT_BAM_FILE, --out_bam OUTPUT_BAM_FILE
Forces the name of the BAM file generated by the
Samtools pipeline.
-m MEMORY_GIGABTES_FOR_SAMTOOLS, --mGB MEMORY_GIGABTES_FOR_SAMTOOLS
Maximum amount of memory we get available for
Samtools.
--readLength READS_LENGTH
Minimum read length.
--tmp TEMP_FOLDER Alternative folder for temporary files.
--verbose Defines if the standard output must be verbose or not.
-v, --version Prints the current PanPhlAn version and exits.
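The --nproc default described in the help text ("the minimum between 12 and the number of available processors") can be illustrated as follows. This is a minimal sketch of that rule, not PanPhlAn's actual code: the function name default_nproc is hypothetical, and using os.cpu_count() to count processors is an assumption.

```python
import os


def default_nproc(cap=12):
    """Return min(cap, available processors), with at least 1 (hypothetical helper)."""
    # os.cpu_count() may return None when the count cannot be determined.
    available = os.cpu_count() or 1
    return min(cap, available)


if __name__ == "__main__":
    print(default_nproc())
```

On a machine with more than 12 processors this yields 12, so a larger value has to be requested explicitly with -p.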
PanPhlAn is a project of the Computational Metagenomics Lab at CIBIO, University of Trento, Italy.