Why is ggtree special?
The innovations of ggtree include:
- parsing data from several evolutionary analysis programs
  - not only for visualization in ggtree, but also to bring these data to R users for further analysis (e.g. summarization, visualization)
- viewing and annotating phylogenetic trees programmatically in R
- supporting the grammar of graphics implemented in ggplot2
see user comments.
ggtree differs from other tree viewers, which offer only pre-defined, specific kinds of tree view. ggtree does not dictate how annotations should be presented: users are free to present data in whatever way they prefer, and complex tree views can be built up from multiple layers of annotation.
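As a minimal sketch of what "multiple layers of annotation" can look like in practice, the snippet below stacks a few ggtree layers on a tree. The random tree from ape::rtree() and the chosen colors are assumptions made only for illustration.

```r
library(ape)      # rtree(): random tree, used here only as toy input
library(ggplot2)
library(ggtree)

set.seed(1)
tree <- rtree(8)  # hypothetical 8-tip tree

# Each `+` adds one annotation layer on top of the base tree.
p <- ggtree(tree) +
  geom_tiplab(size = 3) +                           # tip labels
  geom_tippoint(color = "steelblue") +              # points on tips
  geom_nodepoint(color = "firebrick", alpha = 0.6)  # points on internal nodes

# print(p) draws the tree; any layer can be dropped or swapped independently.
```

Because each annotation is its own layer, none of them has to be anticipated by a parameter of the tree-drawing function.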
The grammar is extended from ggplot2, which is widely used in biomedicine and ecology, so many researchers in these fields are already familiar with it.
Several other packages implement tree viewers using ggplot2, including ggphylo, OutbreakTools and phyloseq.
Using ggplot2 does not guarantee that the grammar of graphics is supported. Among these packages, only ggtree supports the grammar of graphics; the others only implement a tree viewer for a specific need.
ggphylo is designed for viewing phylogenetic trees with alignments. It has not been updated since 2012, and the alignment part is not yet implemented.
PS: viewing a phylogenetic tree with an alignment is supported in ggtree.
Its tree viewer mimics the function call of plot.phylo defined in ape. The ggphylo function is complex; how a tree is viewed is pre-defined, with parameters to control its behavior.
As shown in the screenshot, it creates several data.frame objects and draws the tree with q <- ggplot(lines.df). ggphylo thus treats a tree as a collection of lines, which carry no meaning of their own, because the information in a tree relates to taxa.
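For contrast, here is a small sketch of a taxa-centered representation. It inspects the data that ggtree attaches to the plot object, which holds one row per node rather than one row per line segment. The Newick string is a made-up toy tree, and the column names used (node, label, isTip, x, y) are those I would expect in current ggtree versions rather than a guaranteed interface.

```r
library(ape)
library(ggtree)

# Hypothetical toy tree: 4 tips, 3 internal nodes
tree <- read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")
p <- ggtree(tree)

d <- p$data
nrow(d)                                   # 7: one row per node (tips + internal)
d[d$isTip, c("node", "label", "x", "y")]  # the rows describing the tips

# The taxa-level data travel with the plot, so a label layer
# can still be added after the tree has been drawn.
p2 <- p + geom_tiplab()
```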
OutbreakTools is designed for disease outbreak analysis; viewing phylogenetic trees is not its main focus.
The tree view function plotggphy is only applicable to the obkData class defined within this package.
As shown in the screenshot, its design is similar to that of ggphylo: it creates several data.frame objects and draws the tree via p <- ggplot(data=df.edge). It also treats a tree as a collection of lines.
phyloseq is designed for viewing microbiome census data.
The tree viewer defined in phyloseq only applies to the phyloseq class. It can be used to view a tree parsed from a newick file.
Internally, it calls ape to calculate edge positions.
It draws horizontal lines followed by vertical lines.
- designed for a specific need
  - ggphylo for alignments (not implemented yet)
  - OutbreakTools for outbreak data
  - phyloseq for microbiome census data
- not applicable to widely used tree file formats
  - plotggphy in OutbreakTools assumes its input is an instance of obkData
  - plot_tree in phyloseq assumes its input is an instance of phyloseq
- not extensible
  - the tree is drawn as lines, but the information relates to taxa (nodes and tips)
  - the data are split across different data.frame/data.table objects, which makes it impossible for users to further modify the tree
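To illustrate the extensibility point from the ggtree side, the sketch below reads a tree from a plain Newick string and attaches a user-supplied table to it with ggtree's %<+% operator. The tree, the info table and its host column are all hypothetical and exist only for this example.

```r
library(ape)
library(ggplot2)
library(ggtree)

# Hypothetical toy tree in Newick format
tree <- read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")

# Hypothetical per-taxon table; its first column must match the tip labels
info <- data.frame(label = c("A", "B", "C", "D"),
                   host  = c("human", "human", "swine", "swine"))

p <- ggtree(tree) %<+% info            # attach the table to the tree data
p <- p +
  geom_tippoint(aes(color = host)) +   # map the new column to an aesthetic
  geom_tiplab(offset = 0.05)
```

Since the attached columns live in the same taxa-level data as the tree layout, any later layer can map them to aesthetics without the plotting function knowing about them in advance.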
Using ggplot2 does not guarantee that the grammar of graphics is supported. As mentioned at the beginning, among these packages only ggtree supports the grammar of graphics; the others only implement a tree viewer for a specific need.
In ggphylo:
lines.df <- subset(layout.df, type=='line')
nodes.df <- subset(layout.df, type=='node')
labels.df <- subset(layout.df, type=='label')
internal.labels.df <- subset(layout.df, type=='internal.label')
q <- ggplot(lines.df)
geom.fn <- switch(aes.type,
  line='geom_joinedsegment',
  node='geom_point',
  label='geom_text',
  internal.label='geom_text'
)
q <- q + do.call(geom.fn, geom.list)
In OutbreakTools:
ggphy <- phylo2ggphy(phylo, tip.dates = tip.dates, branch.unit = branch.unit)
##TODO: allow edge and node attributes and merge with df.edge and df.node
df.tip <- ggphy[[1]]
df.node <- ggphy[[2]]
df.edge <- ggphy[[3]]
p <- ggplot(data = df.edge)
p <- p + geom_segment(data = df.edge, aes(x = x.beg, xend = x.end, y = y.beg, yend = y.end), lineend = "round")
p <- p + scale_y_continuous("", breaks = NULL)
if (show.tip.label) {
  p <- p + geom_text(data = df.tip, aes(x = x, y = y, label = label), hjust = 0, size = tip.label.size)
}
In phyloseq:
treeSegs <- tree_layout(phy_tree(physeq), ladderize=ladderize)
edgeMap = aes(x=xleft, xend=xright, y=y, yend=y)
vertMap = aes(x=x, xend=x, y=vmin, yend=vmax)
# Initialize phylogenetic tree.
# Naked, lines-only, unannotated tree as first layers. Edge (horiz) first, then vertical.
p = ggplot(data=treeSegs$edgeDT) + geom_segment(edgeMap) +
  geom_segment(vertMap, data=treeSegs$vertDT)
if(!is.null(label.tips)){
  # `tiplabDT` has only one row per tip, the farthest horizontal
  # adjusted position (one for each taxa)
  tiplabDT = dodgeDT
  tiplabDT[, xfartiplab:=max(xdodge), by=OTU]
  tiplabDT <- tiplabDT[h.adj.index==1, .SD, by=OTU]
  if(!is.null(color)){
    if(color %in% sample_variables(physeq, errorIfNULL=FALSE)){
      color <- NULL
    }
  }
  labelMap <- NULL
  if(justify=="jagged"){
    labelMap <- aes_string(x="xfartiplab", y="y", label=label.tips, color=color)
  } else {
    labelMap <- aes_string(x="max(xfartiplab, na.rm=TRUE)", y="y", label=label.tips, color=color)
  }
  # Add labels layer to plotting object.
  p <- p + geom_text(labelMap, tiplabDT, size=I(text.size), hjust=-0.1, na.rm=TRUE)
}
These tree view functions are just plot functions. They do use ggplot2, so we can, for example, use theme to change the background or the scale_X functions to change the axis break points. We can even add a nonsensical layer on top of the tree, just as we can write a grammatically correct sentence that is nonsense. But this is not the philosophy of the grammar of graphics.
The tree view can only be controlled via pre-defined parameters. As the code above shows, if we create a tree without labels, we cannot add a layer of tip labels afterwards: that information is created inside the function and is not returned to us (we only have the positions of the lines after the tree is drawn).
For example, in OutbreakTools:
if (show.tip.label) {
  p <- p + geom_text(data = df.tip, aes(x = x, y = y, label = label), hjust = 0, size = tip.label.size)
}
If show.tip.label = FALSE, df.tip is thrown away when p is returned.
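The effect can be reproduced with a toy function that uses only ggplot2. This is a sketch of the same pattern, not OutbreakTools code: plot_star_tree() and its star-shaped layout are hypothetical and exist only to show where the tip data are lost.

```r
library(ggplot2)

# Hypothetical stand-in for a pre-defined tree plotting function:
# it builds its data frames internally and returns only the plot.
plot_star_tree <- function(labels, show.tip.label = TRUE) {
  n <- length(labels)
  df.edge <- data.frame(x.beg = 0, x.end = 1,
                        y.beg = (n + 1) / 2, y.end = seq_len(n))
  df.tip <- data.frame(x = 1, y = seq_len(n), label = labels)

  p <- ggplot(data = df.edge) +
    geom_segment(aes(x = x.beg, xend = x.end, y = y.beg, yend = y.end))
  if (show.tip.label) {
    p <- p + geom_text(data = df.tip, aes(x = x, y = y, label = label), hjust = 0)
  }
  p  # df.tip is not part of the result unless it was added as a layer
}

p <- plot_star_tree(c("A", "B", "C"), show.tip.label = FALSE)
names(p$data)               # "x.beg" "x.end" "y.beg" "y.end"
"label" %in% names(p$data)  # FALSE: the labels cannot be recovered from p
```

The returned object only knows where the line segments are, so a later geom_text() layer has no label column to map.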