This is a tool for using the Clermont 2013 PCR typing method for in silico analysis of E. coli whole genomes or assembled contigs.
- bump to version 0.7 in Nov 2021; add option for logfile instead of stderr messages for workflow compatibility
- bump to version 0.4 in May 2018; improved handling of partial matches
- made a webapp on April 19th, 2018 after several requests to make the tool more user-friendly.
- updated on August 2, 2017 to add reactions that differentiate A/C, D/E/cryptic, and to add more robust tests.
- released Dec. 2016
EzClermont can either read in a file or read from stdin.
Try:
ezclermont tests/refs/CP004009.1.fasta
or
cat tests/refs/CP004009.1.fasta | ezclermont - -e "APEC_O78"
usage: ezclermont [-m MIN_LENGTH] [-e EXPERIMENT_NAME] [-n]
                  [--logfile LOGFILE] [-h] [--version]
                  contigs

run a 'PCR' to get Clermont 2013 phylotypes; version 0.7.0

positional arguments:
  contigs               FASTA formatted genome or set of contigs. If reading
                        from stdin, use '-'

optional arguments:
  -m MIN_LENGTH, --min_length MIN_LENGTH
                        minimum contig length to consider.default: 500
  -e EXPERIMENT_NAME, --experiment_name EXPERIMENT_NAME
                        name of experiment; defaults to file name without
                        extension. If reading from stdin, uses the first
                        contig's ID
  -n, --no_partial      If scanning contigs, breaks between contigs could
                        potentially contain your sequence of interest. if
                        --no_partial, these plausible partial matches will NOT
                        be reported; default behaviour is to consider partial
                        hits if the assembly has more than 4 sequnces(ie, no
                        partial matches for complete genomes, allowing for 1
                        chromasome and several plasmids)
  --logfile LOGFILE     send log messages to logfile instead stderr
  -h, --help            Displays this help message
  --version             show program's version number and exit
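When ezclermont is called from a Python workflow, the options above map directly onto an argument list. The following is a minimal sketch and not part of EzClermont itself: the helper names `build_ezclermont_cmd` and `run_ezclermont` are hypothetical, and only flags shown in the usage message are used.

```python
import subprocess
from typing import List, Optional


def build_ezclermont_cmd(
    contigs: str,
    min_length: Optional[int] = None,
    experiment_name: Optional[str] = None,
    no_partial: bool = False,
    logfile: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list using only flags listed in the usage message."""
    cmd = ["ezclermont"]
    if min_length is not None:
        cmd += ["--min_length", str(min_length)]
    if experiment_name is not None:
        cmd += ["--experiment_name", experiment_name]
    if no_partial:
        cmd.append("--no_partial")
    if logfile is not None:
        cmd += ["--logfile", logfile]
    cmd.append(contigs)  # pass "-" to read the FASTA from stdin
    return cmd


def run_ezclermont(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run the command (hypothetical wrapper; requires ezclermont on PATH)."""
    # Output is captured as text so the caller can inspect stdout and stderr.
    return subprocess.run(cmd, capture_output=True, text=True)
```

For example, `build_ezclermont_cmd("tests/refs/CP004009.1.fasta", min_length=500)` yields a list that can be passed to `run_ezclermont`.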
It prints the presence or absence of each PCR product to stderr, and the resulting phylotype and experiment name to stdout. It checks the length, accepting fragments that are within 20 bp of the expected size. When partial matches are considered (the default for assemblies with more than 4 sequences, unless --no_partial is given), if a single primer has a hit but the contig starts/ends within the length of the expected product size, we call it a hit.
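The length check and the partial-hit rule described above can be sketched as follows. This is an illustration under stated assumptions, not EzClermont's actual code: the function names are hypothetical, the 20 bp tolerance is treated as inclusive, and coordinates are taken to be 0-based and end-exclusive.

```python
def within_tolerance(observed_len: int, expected_len: int, tolerance: int = 20) -> bool:
    """True when the fragment is within `tolerance` bp of the expected size.

    Treating the bound as inclusive is an assumption of this sketch.
    """
    return abs(observed_len - expected_len) <= tolerance


def partials_considered(n_sequences: int, no_partial: bool = False) -> bool:
    """Partial hits are considered for assemblies with more than 4 sequences,
    unless --no_partial was given."""
    return (not no_partial) and n_sequences > 4


def partial_hit_plausible(hit_start: int, hit_end: int, contig_len: int,
                          expected_len: int, forward: bool = True) -> bool:
    """Single-primer hit whose expected product would run off the contig."""
    if forward:
        # Product would span hit_start .. hit_start + expected_len,
        # so the contig ends inside the expected product.
        return hit_start + expected_len > contig_len
    # Reverse primer: product would span hit_end - expected_len .. hit_end,
    # so the contig starts inside the expected product.
    return hit_end - expected_len < 0
```

The sizes used to exercise these helpers are arbitrary; the real expected product sizes come from the Clermont 2013 primer set.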
A minimal output table (filename.fasta, ClermontType) can be generated by redirecting stdout to a results file in a bash loop:
for i in strain1 strain2 strain3;
do
    ezclermont "${i}" >> results.txt
done
or, using GNU parallel, and saving a log file:
ls ./folder/with/assemblies/*.fa | parallel "ezclermont {} 1>> results.txt 2>> results.log"
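Once results.txt has been collected, it can be summarised with a few lines of Python. This is a minimal sketch that assumes each line holds the experiment name followed by the phylotype as the last whitespace-separated field; the exact delimiter is not stated here, and `parse_results` and `tally_phylotypes` are hypothetical helper names.

```python
from collections import Counter
from typing import Dict, Iterable


def parse_results(lines: Iterable[str]) -> Dict[str, str]:
    """Map experiment name -> phylotype, one record per non-empty line.

    Assumes the phylotype is the last whitespace-separated field.
    """
    results: Dict[str, str] = {}
    for line in lines:
        parts = line.strip().rsplit(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, phylotype = parts
        results[name] = phylotype
    return results


def tally_phylotypes(results: Dict[str, str]) -> Counter:
    """Count how many experiments were assigned to each phylotype."""
    return Counter(results.values())
```

An open file handle can be passed directly, for example `parse_results(open("results.txt"))`, since the function only iterates over lines.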
docker run -p 5000:5000 nickp60/ezclermont
Have fun!
Install with conda:
conda create -n ezclermont_env ezclermont
conda activate ezclermont_env
or install from source in an environment that provides Biopython:
conda create -n ez biopython
conda activate ez
git clone https://github.com/nickp60/ezclermont && cd ezclermont
pip install .
The tests can be run with either unittest or nosetests.
Biopython
flask biopython
Thanks to Dave Gamache for Skeleton, the webapp CSS theme.
The name of this repo (and PyPI package) was changed on April 21 from ClermontPCR to EzClermont.