Software for handling metagenomic whole-genome shotgun (WGS) data.
The scripts/ folder contains scripts (mostly Python and R) for running popular microbiome programs/pipelines and analysing their results:
- Kraken, a k-mer-based tool for fast taxonomic classification of metagenomic reads (see Kraken paper). The parseKrona Python module and associated scripts notably allow one to filter reads based on their taxonomic classification by Kraken and efficiently extract them from FASTQ/FASTA sequence files.
- Phylosift, a pipeline for core taxon abundance estimation in microbiome data using phylogenetic placement of reads matching conserved marker genes (see Phylosift paper)
- Interproscan (notably as part of the EBI Metagenomics pipeline), a tool for functional annotation of metagenomic reads (see Interproscan paper)
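
The read-filtering step described for Kraken above can be sketched as follows. This is a minimal illustration only and does not use or reproduce the parseKrona module: the function names `read_kraken_ids` and `filter_fastq` are hypothetical, and the example data are made up. It assumes Kraken's standard tab-separated per-read output (classification flag, read ID, taxon ID, read length, k-mer mapping), unwrapped four-line FASTQ records, and exact taxon ID matching (reads assigned to descendant taxa are not included).

```python
import io


def read_kraken_ids(kraken_lines, wanted_taxids):
    """Collect IDs of reads that Kraken classified to one of wanted_taxids.

    Assumes one tab-separated line per read: flag (C/U), read ID, taxon ID, ...
    """
    keep = set()
    for line in kraken_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip malformed lines and unclassified ("U") reads.
        if len(fields) < 3 or fields[0] != "C":
            continue
        if fields[2] in wanted_taxids:
            keep.add(fields[1])
    return keep


def filter_fastq(fastq_lines, keep_ids):
    """Yield four-line FASTQ records whose read ID is in keep_ids."""
    record = []
    for line in fastq_lines:
        record.append(line)
        if len(record) == 4:
            # Read ID is the first token after the leading "@".
            header = record[0][1:].split()
            if header and header[0] in keep_ids:
                yield "".join(record)
            record = []


# Made-up example data, for illustration only.
kraken_output = (
    "C\tread1\t562\t100\t562:70\n"
    "U\tread2\t0\t100\t0:70\n"
    "C\tread3\t1280\t100\t1280:70\n"
)
fastq_input = (
    "@read1 desc\nACGT\n+\nIIII\n"
    "@read2\nGGGG\n+\nIIII\n"
    "@read3\nTTTT\n+\nIIII\n"
)

keep = read_kraken_ids(io.StringIO(kraken_output), {"562"})
selected = list(filter_fastq(io.StringIO(fastq_input), keep))
```

Collecting the read IDs in a set first keeps the FASTQ pass to a single streaming scan, so the sequence file never needs to be loaded into memory. With real files, the `io.StringIO` objects would be replaced by open file handles.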
This software suite was originally developed for the study published by Lassalle et al. in Molecular Ecology (2017), which compared human oral microbiomes from hunter-gatherers and traditional farmers from the Philippines. The metadata required to reproduce the study can be found in the data/ folder.
Related links:
- published Molecular Ecology paper: http://dx.doi.org/10.1111/mec.14435
- underlying data and supplementary result files: Figshare project