v3.1.2
martin-raden authored Oct 30, 2019
2 parents 0477649 + e735251 commit 8285d48
Showing 39 changed files with 1,303 additions and 273 deletions.
114 changes: 113 additions & 1 deletion ChangeLog
@@ -10,12 +10,124 @@
# changes in development version since last release
################################################################################

################################################################################
################################################################################

################################################################################
### version 3.1.2
################################################################################

# IntaRNA
- `--outSep` = user-defined column separator for tabular CSV output
- bugfix non-overlapping suboptimal enumeration
- bugfix noLP optimization (missing case of direct left-stack extension)
- CSV output
- new ensemble energy and partition function output for intra-molecular
structures formed by seq1 and seq2 (`Eall1, Eall2, Zall1, Zall2`)
- new total energy output `Etotal` = (E+Eall1+Eall2) and
`EallTotal` = (Eall+Eall1+Eall2)
- new `RT` output

# auxiliary R scripts
- plotRegions.R - visualization of sequence regions covered by IntaRNA
predictions, similar to the IntaRNA webserver output (thanks to @dgelsin)

################################################################################

191030 Martin Raden
* IntaRNA/OutputHandlerCsv :
+ RT
* string2list() :
+ support of '*' encoding to generate full list
* bin/CommandLineParsing :
+ docu and implementation of '*' outCsvCol behaviour
* IntaRNA/InteractionEnergyBasePair :
+ computeIntraEall() : computes Eall1|2 via NussinovHandler
* getEall1|2() : call computeIntraEall if needed
* README.md
+ RT CSV col

191029 Martin Raden
* IntaRNA/PredictorMfe :
* getNextBest() :
* bugfix: energy check was applying duplicated ED values
(thanks to Jens Georg)
* IntaRNA/PredictorMfe2dSeedExtension :
* IntaRNA/PredictorMfe2dHeuristicSeedExtension :
* IntaRNA/PredictorMfeEns2dSeedExtension :
* fillHybrid*_left() :
* traceBack() :
* bugfix: missing case in noLP mode (direct left-stack extension)
* IntaRNA/AccessibilityVrna :
+ addConstraints() : dedicated function to add constraint to VrnaFoldCompound
* fillByRNAplfold() : using addConstraints()
* IntaRNA/InteractionEnergy :
+ getBoltzmannWeight( Z_type ) : conversion from kcal/mol-based energies
+ getEall1|2() : ensemble energy for seq1|2
* IntaRNA/InteractionEnergyBasePair :
+ getEall1|2() : NOT IMPLEMENTED YET
* IntaRNA/InteractionEnergyIdxOffset :
+ getEall1|2() : forward to wrapped energy handler
* IntaRNA/InteractionEnergyVrna :
+ Eall1|2 : ensemble energies for seq1|2
+ getEall1|2() : lazy computation of Eall1|2 using computeIntraEall()
+ computeIntraEall() : Eall* computation via vrna_pf() using the respective
accessibility constraints
* IntaRNA/OutputHandlerEnsemble :
- no output of Zall
+ output of RT, Eall1, Eall2, EallTotal
* IntaRNA/OutputHandlerCsv :
+ Eall1, Eall2, EallTotal, Etotal, Zall1, Zall2
* README.md :
+ docu of new CSV columns (Eall1, Eall2, Zall1, Zall2, Etotal, EallTotal)
* ensemble output docu updated

191021 Martin Raden
+ R/plotRegions.R : visualization of sequence regions covered by RRI
+ R/README.md : plotRegions docu
* README.md : link to R/README.md

191008 Martin Raden
* IntaRNA/OutputHandlerCsv :
+ bpList output : list of base pairs
* README.md :
+ bpList
+ outSep

191008 Martin Raden
* bin/CommandLineParsing :
+ 'outSep' argument to set column separator for tabular CSV output
* outSep applied to PredictionTracker*
* IntaRNA/PredictionTracker* :
+ explicit output column separator
* IntaRNA/OutputHandlerCsv :
* needsZall() :
* needBPs() :
- colSep argument : obsolete

191007 Martin Raden
* bin/CommandLineParsing :
* bugfix accNoLP and accNoGUend checks for energy!=V
* IntaRNA/NussinovHandler :
* getQ() :
* fix: return 1 if (i==j)
* IntaRNA/AccessibilityVrna :
* construction() :
+ bugfix: missing ED initialization for short sequences
* tests/AccessibilityBasePair :
+ short sequence test
+ tests/AccessibilityVrna :
+ short sequence test to validate ED initialization

################################################################################
### version 3.1.1
################################################################################

# IntaRNA
- base pairs details only computed if needed for output (speedup for large -n)
- predefined parameter sets for loading (Turner04, Turner99, Andronescu07)
- 'tRegion' and 'qRegion' now available for and applied to multi-sequence input

################################################################################
################################################################################

190924 Martin Raden
35 changes: 35 additions & 0 deletions R/README.md
@@ -0,0 +1,35 @@

# Auxiliary R scripts of the IntaRNA package



# `plotRegions.R` - Visualization of RRI-covered regions

To visualize the sequence regions covered by RNA-RNA interactions predicted by
IntaRNA, call `plotRegions.R` with the following arguments (in the given
order):

1. IntaRNA CSV output file (semicolon-separated) containing the columns `id`,
`start`, and `end`, each with the suffix `1` (target) or `2` (query), e.g. `start1`
2. `1` or `2` to select whether to plot target or query regions
3. output file name with a file-format-specific suffix from `.pdf`, `.png`,
`.svg`, `.eps`, `.ps`, `.jpeg`, `.tiff`

For example, calling
```bash
Rscript --vanilla plotRegions.R pred.csv 1 regions.png
```

with `pred.csv` containing
```
id1;start1;end1;id2;start2;end2
b0001;266;273;query;116;123
b0002;204;231;query;85;111
b0003;229;262;query;96;125
b0004;265;300;query;10;38
b0005;281;295;query;5;22
```

will produce the output
![plotRegions.R example](plotRegions.example.png)
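
Before plotting, it can help to confirm that the CSV header contains the three
columns the script expects. The following is a minimal sketch and not part of
IntaRNA: the function name `has_region_cols` is hypothetical, and it assumes a
semicolon-separated file whose header line is unquoted.

```bash
# Hypothetical helper (not part of IntaRNA): succeeds if the header of the
# semicolon-separated CSV file $1 holds id, start, and end with suffix $2.
has_region_cols() {
  local file="$1"
  local suffix="$2"
  local col
  local header
  # first line only, without a possible carriage return, wrapped in ';'
  # so that every column name is delimited on both sides
  header=";$(head -n 1 "$file" | tr -d '\r');"
  for col in id start end; do
    case "$header" in
      *";${col}${suffix};"*) ;;   # column present
      *) echo "missing column ${col}${suffix}" >&2; return 1 ;;
    esac
  done
}

# usage: only plot if all target columns (suffix 1) are available
# has_region_cols pred.csv 1 && Rscript --vanilla plotRegions.R pred.csv 1 regions.png
```

Wrapping the header in semicolons keeps the match exact, so a column such as
`start1` is not confused with a longer name that merely ends in `start1`.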

151 changes: 151 additions & 0 deletions R/plotRegions.R
@@ -0,0 +1,151 @@
#!/usr/bin/env Rscript

####################################################################
# Visualization of sequence regions covered by RNA-RNA interactions
# predicted by IntaRNA.
#
# arguments: <IntaRNA-output-CSV> <1|2> <output-plot-file>
# 1 <IntaRNA-output-CSV> = ";"-separated CSV output of IntaRNA
# 2 <1|2> = suffix of "start,end,id" CSV cols to plot
# 3 <output-plot-file> = file name of the output figure suffixed
# by one of ".pdf",".png",".svg",".eps",".ps",".jpeg",".tiff"
#
# example call:
#
# Rscript --slave -f plotRegions.R --args pred.csv 1 regions.png
#
# This script is part of the IntaRNA source code package. See
# respective licence and documentation for further information.
#
# https://github.com/BackofenLab/IntaRNA
#
####################################################################
# check and load dependencies
####################################################################

options(warn=-1)
# library() aborts with an error if a package is missing (require() would only return FALSE)
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(ggalt))
suppressPackageStartupMessages(library(cowplot)) # cowplot starts with a note
options(warn=0)

theme_set(theme_cowplot())

####################################################################
# get command line arguments
####################################################################

args = commandArgs(trailingOnly=TRUE)
# check and parse
if (length(args)!=3) { stop("call with <intarna-csv-output> <1|2> <out-file-of-plot>", call.=FALSE) }

intarnaOutputFile = args[1];
if (!file.exists(intarnaOutputFile )) { stop("intarna-csv-output file '", intarnaOutputFile, "' does not exist!", call.=FALSE) }

seqNr = args[2];
if (seqNr != "1" && seqNr != "2") { stop("second call argument has to be '1' or '2' to specify which regions to plot", call.=FALSE); }
# compile column names
id = paste("id",seqNr,sep="");
start = paste("start",seqNr,sep="");
end = paste("end",seqNr,sep="");

outFile = args[3];
fileExtensions = c(".pdf",".png",".svg",".eps",".ps",".jpeg",".tiff");
outFileExtOk = FALSE;
for( ext in fileExtensions ) { outFileExtOk = outFileExtOk || endsWith(outFile,ext); }
if ( !outFileExtOk ) {stop("<out-file-of-plot> has to have one of the following file extensions: ",paste(fileExtensions,collapse=" "), call.=FALSE);}

# if set to some x-position, this will trigger the plotting of a vertical line at that location
xVline = NA;
#xVline = 250; # DEBUG TEST VALUE


####################################################################
# parse IntaRNA output
####################################################################

d = read.csv2( intarnaOutputFile )
# check if all columns present
for( x in c(id,start,end)) {
	if (!is.element(x, colnames(d))) {
		# report the column that is actually missing
		stop("'",x,"' is not among the column names of '",intarnaOutputFile,"'", call.=FALSE);
	}
}

####################################################################
# create count plot
####################################################################

allPos = c();
for( i in seq_len(nrow(d)) ) { # seq_len() skips the loop for an empty table
	allPos = c( allPos, d[i,start]:d[i,end] );
}
allPos = data.frame( allPos=allPos ) # single column named "allPos"
#allPos # DEBUG OUT

coveragePlot =
ggplot( allPos, aes(x=allPos, stat(count))) +
geom_density() +
ylab("coverage") +
xlab("position") +
scale_y_continuous(position = "right", expand=expand_scale(mult = c(0, .02))) +
scale_x_continuous(expand = c(0, 0)) +
theme( axis.title.x=element_blank() )

if ( ! is.na(xVline)) {
coveragePlot = coveragePlot +
geom_vline(aes(xintercept=xVline));
}

####################################################################
# create region plot
####################################################################

dRegion = data.frame()
dRegion[1:nrow(d),1:3] = d[,c(start,end,id)]
dRegion[1:nrow(d),4] = factor(sprintf("%08d",nrow(d):1))
colnames(dRegion) = c("start","end","id","idx");
#dRegion # DEBUG OUT

yLabelScale = 0.6 # if you have to alter for more/less sequence IDs per inch; see <plotHeight> below

regionPlot =
ggplot(dRegion, aes(x=start,xend=end,y=idx)) +
geom_dumbbell(color="dodgerblue", size=2) +
xlab("position") +
scale_x_continuous( expand = c(0, 0) ) +
ylab("") +
scale_y_discrete(position = "right", breaks=dRegion$idx, labels=dRegion$id) +
geom_vline(aes(xintercept=min(allPos))) +
theme(panel.grid.major.y=element_line(size=0.7,color="lightgray")
, axis.text.y=element_text(size=rel(yLabelScale))
)

if ( ! is.na(xVline)) {
regionPlot = regionPlot +
geom_vline(aes(xintercept=xVline));
}

####################################################################
# plot to file
####################################################################

plotWidth = 6
plotHeightDensity = 2
plotHeight = plotHeightDensity + max(2,nrow(d)/9)
plotHeightDensityRel = plotHeightDensity / plotHeight


plot_grid( coveragePlot, regionPlot
, nrow=2, ncol=1
, align = "hv"
, axis = "r"
, rel_heights= c( plotHeightDensityRel, 1.0-plotHeightDensityRel)
)

ggsave( outFile
, width= plotWidth
, height= plotHeight
);

#############################################################EOF
Binary file added R/plotRegions.example.png