SCCion

Overview

SCCion is a genotyping toolkit for Staphylococcus aureus sequence data. It provides three analysis methods:

  1. fast whole-genome typing from assemblies, similar to Kleborate
  2. parallelized read-to-assembly and read-genotyping pipelines in Nextflow
  3. real-time MinHash typing of uncorrected nanopore reads with Sketchy

The last component is experimental and should be considered a pre-release until Sketchy is more mature and has a test suite. In the few bootstrap analyses we have run on S. aureus data it performed reasonably well, largely because we generated an extensive index of S. aureus genomes from the European Nucleotide Archive for Sketchy. Do not rely on it for serious applications such as clinical diagnostics. SCCion combines a variety of databases sourced from many different open-source projects. Please have a look at the Citations section to see whose work went into the databases used by SCCion.

Pre-print available on BioRxiv soon.

Install


conda install -c conda-forge -c bioconda -c esteinig sccion

Usage


From assembly:

sccion type reference.fasta

From assemblies:

sccion type path/to/assemblies/*.fasta

From uncorrected nanopore reads:

sccion type reads.fq.gz --limit 1000

From uncorrected nanopore reads, live run, watching directory:

sccion type path/to/basecalled/fastq

Nextflow pipeline on a set of paired-end reads, using the default PBS cluster configuration:

nextflow run pf-core/pf-sccion -profile cluster --fastq "path/to/fastq/*.fq.gz"

Modules


  • Genome assembly typing
  • Real-time nanopore typing with Sketchy
  • Illumina and ONT read-to-assembly pipelines in Nextflow

Limitations


Most importantly, SCCion expects input that is S. aureus, or at least a staphylococcal species (in which case SCCmec typing and other genotypes may be unreliable). The same holds for the real-time nanopore typing component, which will fail or produce meaningless results if the input comes from a species other than S. aureus. A prefiltering step on the reads can be used to make sure this is the case, as outlined in the Sketchy repository.

SCCion also uses simple MinHash matching with MASH against the small database of whole SCCmec cassette types collected by the authors of SCCmecFinder. It does not have the same rigorous error checking as the original implementation of SCCmecFinder, which should be preferred for subtyping for now.
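To give a feel for what MinHash matching against a small cassette database involves, here is a minimal Python sketch of the general bottom-s MinHash idea. It is not SCCion's or MASH's code: the function names (`kmers`, `sketch`, `jaccard_estimate`, `best_match`), the BLAKE2 hash, and the default `k` and `size` values are all assumptions made for illustration, and it skips canonical (strand-independent) k-mers, which a real tool would handle.

```python
import hashlib


def kmers(seq, k=21):
    """Return the set of all k-length substrings of the sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sketch(seq, k=21, size=1000):
    """Bottom-s sketch: the `size` smallest 64-bit hashes of the k-mers."""
    hashes = {
        int.from_bytes(hashlib.blake2b(km.encode(), digest_size=8).digest(), "big")
        for km in kmers(seq, k)
    }
    return sorted(hashes)[:size]


def jaccard_estimate(a, b, size=1000):
    """Estimate Jaccard similarity from two bottom-s sketches.

    Take the `size` smallest hashes of the merged sketches and count
    how many of them occur in both inputs.
    """
    set_a, set_b = set(a), set(b)
    merged = sorted(set_a | set_b)[:size]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    return shared / len(merged)


def best_match(query, references, k=21, size=1000):
    """Return the name of the best-scoring reference and all scores.

    `references` is a hypothetical dict of name -> sequence, standing in
    for a database of cassette types.
    """
    q = sketch(query, k, size)
    scores = {
        name: jaccard_estimate(q, sketch(seq, k, size), size)
        for name, seq in references.items()
    }
    return max(scores, key=scores.get), scores
```

A query is assigned to whichever reference shares the largest fraction of sketch hashes. Nothing in this scheme checks whether the best hit is actually a good one, which is one reason a dedicated tool with stricter validation is preferable for subtyping.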

Citations


We rely on a host of excellent software, and all too often it goes unnoticed when it is wrapped into a program like SCCion. When using SCCion, please also cite MASH, SCCmecFinder, Mykrobe, Sketchy and the Abricate databases, and refer to unpublished programs by URL. For specific assembly and typing pipelines, please refer to the tables below.

You can output all citations in RIS format by using:

sccion cite -o sccion_citations
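RIS is a plain-text format in which each line carries a two-letter tag, two spaces, a hyphen, a space, and a value, with `ER` closing a record. If you want to inspect the exported citations programmatically, a small reader along these lines may help. This is a sketch only: `parse_ris` is a hypothetical helper, the sample record is a placeholder, and which tags SCCion actually writes is not specified here.

```python
def parse_ris(text):
    """Parse RIS text into a list of records (dict of tag -> list of values)."""
    records, current = [], {}
    for line in text.splitlines():
        # A tagged RIS line looks like "AU  - Surname, Initials".
        if len(line) < 5 or line[2:5] != "  -":
            continue  # skip blank or untagged lines
        tag, value = line[:2], line[5:].strip()
        if tag == "ER":
            # End of record: store it and start a new one.
            if current:
                records.append(current)
            current = {}
        else:
            # Tags such as AU can repeat, so keep every value in a list.
            current.setdefault(tag, []).append(value)
    return records


# Placeholder record for illustration, not real SCCion output.
sample = (
    "TY  - JOUR\n"
    "AU  - Example, A.\n"
    "AU  - Example, B.\n"
    "TI  - Placeholder title\n"
    "ER  - \n"
)

for record in parse_ris(sample):
    print(record["TI"][0], "by", "; ".join(record["AU"]))
```

To use it on real output, read the file written by `sccion cite` and pass its contents to `parse_ris`.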


sccion type assembly:

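| Program | Author(s) | Publication | Code |
|---|---|---|---|
| MASH | Ondov et al. | | |
| SCCmecFinder | | | |
| Mykrobe | | | |
| Ridom Spa | Ondov et al. | | |
| mlst | | | |
| Abricate | | | |
| Resfinder | Ondov et al. | | |
| Plasmidfinder | | | |
| VFDB | | | |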

sccion type nanopore:

| Program | Author(s) | Publication | Code |
|---|---|---|---|
| MASH | Ondov et al. | | |
| Sketchy | Steinig et al. | | |
| Nanopath | Steinig | | |

sccion illumina:

| Program | Author(s) | Publication | Code |
|---|---|---|---|
| Trimmomatic | | | |
| Shovill | Seemann et al. | | |
| Prokka | Seemann | | |
| Snippy | Seemann et al. | | |
| SPANDx | Sarovich et al. | | |

sccion ont:

| Program | Author(s) | Publication | Code |
|---|---|---|---|
| FLYE | | | |
| wtdbg2 | | | |
| Racon | | | |
| Medaka | | | |
| Nanopolish | | | |