Bad Clade Deletion (BCD) Supertrees is a Java library and command line tool providing the Bad Clade Deletion supertree algorithm for rooted input trees. It provides several strategies to weight the clades that have to be removed during the mincut phase. BCD Supertrees can use the Greedy Strict Consensus Merger (GSCM) algorithm to preprocess the input trees. For the GSCM algorithm it provides several scoring functions that determine the order in which the input trees are merged. Combinations of different scorings are implemented, as is a randomized version of the algorithm. For more detailed information about the algorithm, see the Literature.
[1] Markus Fleischauer and Sebastian Böcker.
Bad Clade Deletion Supertrees: A Fast and Accurate Supertree Algorithm.
Mol Biol Evol, 34:2408-2421, 2017
[2] Markus Fleischauer and Sebastian Böcker.
BCD Beam Search: Considering suboptimal partial solutions in Bad Clade Deletion supertrees
in review
[3] Markus Fleischauer and Sebastian Böcker.
Collecting reliable clades using the Greedy Strict Consensus Merger
PeerJ (2016) 4:e2172
BCD Supertrees commandline tool v1.1.3
- for Windows
- for Linux/Unix/Mac
- as jar file
The Source Code can be found on GitHub
bcdSupertrees.exe should work out of the box. To run BCD from any location, add the directory containing bcdSupertrees.exe to your PATH environment variable.
To start BCD Supertrees, run the bcd start script from the command line:
/path/to/bcd/bcd LIST_OF_BCD_ARGS
The BCD Supertrees directory contains another start script named bcdVM. This script allows you to run BCD Supertrees with specific JVM (Java Virtual Machine) arguments. To run the bcdVM start script, type:
/path/to/bcd/bcdVM "LIST_OF_JVM_ARGS" "LIST_OF_BCD_ARGS"
To run BCD Supertrees from any location, add the directory containing
the BCD start script to your PATH variable. Open the file ~/.profile
in an editor and add the following line (replacing the placeholder
path):
export PATH=$PATH:/path/to/bcd
Alternatively, you can run the jar file using java with the command:
java -jar /path/to/bcdSupertrees/bcdSupertrees.jar
You can always use the --help option to get documentation on
the available commands and options.
Generally, you only need to specify the input trees.
If your input data contains bootstrap values, we recommend the BOOTSTRAP_VALUES weighting.
Other options are listed below or can be displayed via the --help option.
The BCD Supertrees command line tool handles trees in NEWICK and NEXUS format.
For automatic file format detection, use one of the common file extensions
for NEWICK (tree|TREE|tre|TRE|phy|PHY|nwk|NWK) or NEXUS (nex|NEX|ne|NE|nexus|NEXUS).
By default, the output tree format equals the input format. To specify a different
output format, use the option --outFileType or its short form -d.
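The extension-based detection described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class TreeFormatGuess and its guess method are hypothetical names, not part of the BCD library, and the sketch lower-cases the extension, so it also accepts mixed-case spellings that the list above does not mention.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper (not BCD code): guesses the tree format from a file extension. */
final class TreeFormatGuess {
    enum Format { NEWICK, NEXUS }

    // Lower-case forms of the extensions listed in the text above.
    private static final Set<String> NEWICK_EXT = Set.of("tree", "tre", "phy", "nwk");
    private static final Set<String> NEXUS_EXT = Set.of("nex", "ne", "nexus");

    /** Returns the guessed format, or empty if the extension is missing or unknown. */
    static Optional<Format> guess(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty(); // no extension at all
        }
        // Simplification: case-insensitive match instead of all-lower/all-upper only.
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        if (NEWICK_EXT.contains(ext)) return Optional.of(Format.NEWICK);
        if (NEXUS_EXT.contains(ext)) return Optional.of(Format.NEXUS);
        return Optional.empty();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"input.nwk", "INPUT.NEX", "input.txt"}) {
            System.out.println(name + " -> " + guess(name));
        }
    }
}
```

If the guess comes back empty, the file type can be stated explicitly with --fileType (input) and --outFileType (output) instead of relying on detection.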
bcd [options...] INPUT_TREE_FILE
The only required argument is the input tree file
bcd [options...] INPUT_TREE_FILE GUIDE_TREE_FILE
Additionally, a guide tree can be specified. Otherwise the GSCM tree will be calculated as default guide tree
PATH : Path of the file containing the input trees
PATH : Path of the file containing the guide tree
-H (--HELP) : Full usage message including nonofficial Options (default: false)
-O (--fullOutput) PATH : Output file containing full output
-V (--VERBOSE) : many console output
-b (--bootstrapThreshold) N : Minimal bootstrap value of a tree-node to be considered during the supertree calculation (default: 0)
-d (--outFileType) [NEXUS | NEWICK | AUTO] : Output file type (default: AUTO)
-f (--fileType) [NEXUS | NEWICK | AUTO] : Type of input files and if not specified otherwise also of the output file (default: AUTO)
-h (--help) : usage message (default: true)
-j (--supportValues) : Calculate Split Fit for every clade of the supertree(s) (default: false)
-o (--outputPath) PATH : Output file
-p (--workingDir) PATH : Path of the working directory. All relative paths will be rooted here. Absolute paths are not effected
-s (--scm) VALUE : Use SCM-tree as guide tree (default: true)
-v (--verbose) : some more console output
-w (--weighting) [UNIT_WEIGHT | TREE_WEIGHT | BRANCH_LENGTH | BOOTSTRAP_WEIGHT | LEVEL | BRANCH_AND_LEVEL | BOOTSTRAP_AND_LEVEL] : Weighting strategy
-t (--threads) N : Set a positive number of Threads that should be used
-T (--singleThreaded) : starts in single threaded mode, equal to "-t 1"
-B (--disableProgressbar) : Disables progress bar (cluster/background mode)
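When BCD is called from another Java program rather than from a shell, the options above can be assembled into an argument list for java.lang.ProcessBuilder. The following is a minimal sketch, not an official wrapper: the class BcdCommand, the executable path, and the file names are placeholders chosen for illustration, and only flags from the option list above are used.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not BCD code): assembles a bcd command line from documented options. */
final class BcdCommand {

    /** Builds "bcd [options...] INPUT_TREE_FILE"; all paths are supplied by the caller. */
    static List<String> build(String executable, String inputTrees, String outputFile,
                              String weighting, int threads) {
        if (threads < 1) {
            throw new IllegalArgumentException("threads must be positive");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add(executable);
        cmd.add("-w"); cmd.add(weighting);                 // weighting strategy
        cmd.add("-t"); cmd.add(Integer.toString(threads)); // number of threads
        cmd.add("-o"); cmd.add(outputFile);                // output file
        cmd.add("-B");                                     // no progress bar in background mode
        cmd.add(inputTrees);                               // required input tree file comes last
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build("/path/to/bcd/bcd", "input.tre", "supertree.tre", "UNIT_WEIGHT", 2);
        // Prints: /path/to/bcd/bcd -w UNIT_WEIGHT -t 2 -o supertree.tre -B input.tre
        System.out.println(String.join(" ", cmd));
        // To actually launch the tool (requires a real installation at that path):
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Passing each flag and value as a separate list element avoids shell quoting problems with paths that contain spaces.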
You can integrate the BCD library into your Java project, either by using Maven or by including the jar file directly. The latter is not recommended, as the BCD jar also has dependencies on other external libraries.
Add the following repository to your pom file:
<repositories>
<repository>
<id>bioinf-jena</id>
<name>bioinf-jena-releases</name>
<url>https://bio.informatik.uni-jena.de/repository/libs-releases-local</url>
</repository>
</repositories>
Now you can integrate BCD in your project by adding the following dependency:
Library containing all algorithms
<dependency>
<groupId>de.unijena.bioinf.phylo</groupId>
<artifactId>flipcut-lib</artifactId>
<version>1.1.1</version>
</dependency>
Whole project containing the algorithm (bcd-lib) and the command line interface (bcd-cli)
<dependency>
<groupId>de.unijena.bioinf.phylo</groupId>
<artifactId>flipcut</artifactId>
<version>1.1.1</version>
</dependency>
The main class in the BCD library is phylo.tree.algorithm.flipcut.AbstractFlipCut.
It specifies the main API of all provided algorithm implementations. To run the algorithm you
just have to specify the input trees.
There is currently one implementation of phylo.tree.algorithm.flipcut.AbstractFlipCut:
phylo.tree.algorithm.flipcut.FlipCutSingleCutSimpleWeight
This class provides the basic Bad Clade Deletion algorithm. Parameters:
- input -- List of rooted input trees.
- weight -- clade weighting to use
Returns: the BCD supertree
The interface phylo.tree.algorithm.flipcut.costComputer.FlipCutWeights provides
different weightings. The package phylo.tree.algorithm.flipcut.costComputer
contains implementations of these weightings.
UNIT_WEIGHT
TREE_WEIGHT
BRANCH_LENGTH
BOOTSTRAP_VALUES
LEVEL
BRANCH_AND_LEVEL
BOOTSTRAP_AND_LEVEL
The weightings presented in Fleischauer et al. [1] are:
UNIT_WEIGHT
BRANCH_LENGTH
BOOTSTRAP_VALUES
- better cut sampling
- upper bound for cut sampling (optimal cut)
- much less memory consumption with cut sampling (as good as Vazirani)
- improved performance for cut sampling (character merging)
- fixed a bug that caused too few iterations in recursive cut sampling
- recursive cut sampling algorithm is now the default cut sampling algorithm
- Completely new and memory efficient data structure for BCD Graph
- The Beam Search algorithm can now run on a standard notebook (even for several thousand taxa).
- Multiple bugfixes in the Beam Search
- Beam Search algorithm to consider suboptimal partial solutions in BCD algorithm
- Cut Enumeration (Vazirani's Algorithm)
- Cut Sampling
- Low Overlap: BCD now returns a warning for input tree sets with low overlap instead of not calculating them.
- bcd now supports both bootstrap notations of the newick file format.
- some minor fixes
- release version
- We thank Stefano Scerra for providing us with his implementation of the Ahuja-Orlin max flow algorithm