refactor: peaksData,MsBackendMemory can return data.frame
- `peaksData,MsBackendMemory` returns a `list` of `matrix` by default, or a
  `list` of `data.frame`s if peak variables other than `"mz"` and
  `"intensity"` are requested. Issue #289.
jorainer committed May 25, 2023
1 parent 3203594 commit 6556a26
Showing 7 changed files with 133 additions and 97 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
 Package: Spectra
 Title: Spectra Infrastructure for Mass Spectrometry Data
-Version: 1.11.2
+Version: 1.11.3
 Description: The Spectra package defines an efficient infrastructure
     for storing and handling mass spectrometry spectra and functionality to
     subset, process, visualize and compare spectra data. It provides different
7 changes: 7 additions & 0 deletions NEWS.md
@@ -1,5 +1,12 @@
 # Spectra 1.11

+## Changes in 1.11.3
+
+- `peaksData,MsBackendMemory` returns a `data.frame` if additional peak
+  variables (in addition to `"mz"` and `"intensity"`) are requested. For
+  `columns = c("mz", "intensity")` (the default) a `list` of `matrix` is
+  returned.
+
 ## Changes in 1.11.2

 - Add `deisotopeSpectra` and `reduceSpectra` functions.
42 changes: 23 additions & 19 deletions R/MsBackend.R
@@ -422,28 +422,30 @@
 #' the number of spectra in `object`. `NA` are reported for MS1
 #' spectra of if no precursor information is available.
 #'
-#' - `peaksData` returns a `list` with the spectras' peak data, i.e. numeric
-#' `matrix` with peak values. The length of the list is equal to the number
-#' of spectra in `object`. Each element of the list is a `numeric` `matrix`
+#' - `peaksData` returns a `list` with the spectras' peak data, i.e. m/z and
+#' intensity values or other *peak variables*. The length of the list is
+#' equal to the number of spectra in `object`. Each element of the list has
+#' to be a two-dimensional array (`matrix` or `data.frame`)
 #' with columns depending on the provided `columns` parameter (by default
 #' `"mz"` and `"intensity"`, but depends on the backend's available
-#' `peaksVariables`). For an empty spectrum, a `matrix` with 0 rows and
-#' columns according to `columns` is returned. The optional parameter
-#' `columns`, if supported by the backend, allows to define which peak
-#' variables should be returned in the `numeric` peak `matrix`. As a default
-#' `c("mz", "intensity")` should be used.
+#' `peaksVariables`). For an empty spectrum, a `matrix` (`data.frame`) with
+#' 0 rows and columns according to `columns` is returned. The optional
+#' parameter `columns`, if supported by the backend, allows to define which
+#' peak variables should be returned in the `numeric` peak `matrix`. As a
+#' default `c("mz", "intensity")` should be used.
 #'
 #' - `peaksData<-` replaces the peak data (m/z and intensity values) of the
-#' backend. This method expects a `list` of `matrix` objects with columns
-#' `"mz"` and `"intensity"` that has the same length as the number of
-#' spectra in the backend. Note that just writeable backends support this
-#' method.
+#' backend. This method expects a `list` of two dimensional arrays (`matrix`
+#' or `data.frame`) with columns representing the peak variables. All
+#' existing peaks data is expected to be replaced with these new values. The
+#' length of the `list` has to match the number of spectra of `object`.
+#' Note that just writeable backends support this method.
 #'
 #' - `peaksVariables`: lists the available variables for mass peaks. Default
 #' peak variables are `"mz"` and `"intensity"` (which all backends need to
 #' support and provide), but some backends might provide additional variables.
-#' These variables correspond to the column names of the `numeric` `matrix`
-#' representing the peak data (returned by `peaksData`).
+#' All these variables are expected to be returned (if requested) by the
+#' `peaksData` function.
 #'
 #' - `reset` a backend (if supported). This method will be called on the backend
 #' by the `reset,Spectra` method that is supposed to restore the data to its
@@ -569,7 +571,8 @@
 #' to be spectra variables! Also, while it is possible to change the values of
 #' existing peaks variables using the `$<-` method, this method does **not**
 #' allow to add new peaks variables to an existing `MsBackendMemory`. New
-#' peaks variables should be added using the `backendInitialize` method.
+#' peaks variables can at present only be added using the `backendInitialize`
+#' method.
 #'
 #' Suggested columns of this `DataFrame` are:
 #'
@@ -598,10 +601,11 @@
 #'
 #' Additional columns are allowed too.
 #'
-#' For the `MsBackendMemory`, any column in the provided `data.frame` which
-#' contains a `list` of vectors each with length equal to the number of peaks
-#' for a spectrum will be used as additional *peak variable* (see examples
-#' below for details).
+#' The `peaksData` function for `MsBackendMemory` returns a `list` of
+#' `numeric` `matrix` by default (with parameter
+#' `columns = c("mz", "intensity")`). If other peak variables are requested,
+#' a `list` of `data.frame` is returned (to ensure m/z and intensity values
+#' are always `numeric`).
 #'
 #' The `MsBackendDataFrame` ignores parameter `columns` of the `peaksData`
 #' function and returns **always** m/z and intensity values.
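
The rationale stated in the documentation above (keeping m/z and intensity values `numeric`) can be illustrated with a minimal base R sketch. The peak values and the `pk_ann` annotation are made-up illustration data, not taken from the package. The sketch only shows that a `matrix` holds a single type, so combining numeric and character peak variables coerces everything to character, whereas a `data.frame` keeps the type of each column.

```r
## Made-up peak data for one spectrum (illustrative values only).
mz <- c(100.1, 200.2, 300.3)
intensity <- c(10, 20, 30)
pk_ann <- c("a", "b", "c")   # a character peak annotation

## A matrix can hold only one type: adding a character column
## coerces the m/z values to character as well.
as_matrix <- cbind(mz = mz, pk_ann = pk_ann)
is.character(as_matrix)

## A data.frame keeps the type of each column.
as_df <- data.frame(mz = mz, pk_ann = pk_ann, stringsAsFactors = FALSE)
is.numeric(as_df$mz)
is.character(as_df$pk_ann)

## With only numeric peak variables a matrix loses nothing.
as_num <- cbind(mz = mz, intensity = intensity)
is.numeric(as_num)
```

This is presumably why the default request for `"mz"` and `"intensity"` can stay a `matrix` while mixed requests switch to `data.frame`.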
62 changes: 32 additions & 30 deletions R/MsBackendMemory.R
@@ -352,32 +352,33 @@ setReplaceMethod("mz", "MsBackendMemory", function(object, value) {
 })

 #' @rdname hidden_aliases
-setMethod("peaksData", "MsBackendMemory", function(object,
-                                                   columns = c("mz", "intensity")) {
-    if (length(object)) {
-        cns <- colnames(object@peaksData[[1L]])
-        if (length(columns) == length(cns) && all(cns == columns))
-            return(object@peaksData)
-        if (!all(columns %in% peaksVariables(object)))
-            stop("Some of the requested peaks variables are not ",
-                 "available", call. = FALSE)
-        pcol <- intersect(columns, c("mz", "intensity"))
-        pdcol <- setdiff(columns, c("mz", "intensity"))
-        ## request columns only from peaksData
-        if (length(pcol) & !length(pdcol))
-            return(lapply(object@peaksData,
-                          function(z) z[, pcol, drop = FALSE]))
-        ## request columns only from peaksDataFrame
-        if (length(pdcol) & !length(pcol))
-            return(lapply(object@peaksDataFrame,
-                          function(z) as.matrix(z[, pdcol, drop = FALSE])))
-        ## request columns from both
-        mapply(object@peaksData, object@peaksDataFrame, FUN = function(a, b) {
-            as.matrix(cbind(a[, pcol, drop = FALSE],
-                            b[, pdcol, drop = FALSE])[, columns])
-        }, SIMPLIFY = FALSE)
-    } else list()
-})
+setMethod(
+    "peaksData", "MsBackendMemory", function(object,
+                                             columns = c("mz", "intensity")) {
+        if (length(object)) {
+            cns <- colnames(object@peaksData[[1L]])
+            if (length(columns) == length(cns) && all(cns == columns))
+                return(object@peaksData)
+            if (!all(columns %in% peaksVariables(object)))
+                stop("Some of the requested peaks variables are not ",
+                     "available", call. = FALSE)
+            pcol <- intersect(columns, c("mz", "intensity"))
+            pdcol <- setdiff(columns, c("mz", "intensity"))
+            ## request columns only from peaksData
+            if (length(pcol) & !length(pdcol))
+                return(lapply(object@peaksData,
+                              function(z) z[, pcol, drop = FALSE]))
+            ## request columns only from peaksDataFrame
+            if (length(pdcol) & !length(pcol))
+                return(lapply(object@peaksDataFrame,
+                              function(z) z[, pdcol, drop = FALSE]))
+            ## request columns from both
+            mapply(object@peaksData, object@peaksDataFrame, FUN = function(a, b) {
+                data.frame(a[, pcol, drop = FALSE],
+                           b[, pdcol, drop = FALSE])[, columns]
+            }, SIMPLIFY = FALSE, USE.NAMES = FALSE)
+        } else list()
+    })

 #' @rdname hidden_aliases
 setReplaceMethod("peaksData", "MsBackendMemory", function(object, value) {
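
The column selection in the updated `peaksData` method can be tried outside the package with a simplified base R sketch. `select_peak_columns`, `pd` and `pdf` are hypothetical names introduced here for illustration. They stand in for the method and for the `@peaksData` and `@peaksDataFrame` slots. The early return and the check against `peaksVariables` are left out, and `&&` is used in place of `&`.

```r
## Hypothetical stand-in for the method above, working on plain lists
## instead of an MsBackendMemory object (not part of Spectra).
select_peak_columns <- function(pd, pdf, columns = c("mz", "intensity")) {
    pcol <- intersect(columns, c("mz", "intensity"))
    pdcol <- setdiff(columns, c("mz", "intensity"))
    ## core variables only: keep the numeric matrix
    if (length(pcol) && !length(pdcol))
        return(lapply(pd, function(z) z[, pcol, drop = FALSE]))
    ## additional variables only: keep the data.frame
    if (length(pdcol) && !length(pcol))
        return(lapply(pdf, function(z) z[, pdcol, drop = FALSE]))
    ## both: combine into a data.frame and restore the requested order
    mapply(pd, pdf, FUN = function(a, b) {
        data.frame(a[, pcol, drop = FALSE],
                   b[, pdcol, drop = FALSE])[, columns]
    }, SIMPLIFY = FALSE, USE.NAMES = FALSE)
}

## Made-up data for a single spectrum.
pd <- list(cbind(mz = c(100.1, 200.2), intensity = c(10, 20)))
pdf <- list(data.frame(pk_ann = c("a", "b"), stringsAsFactors = FALSE))

res_core <- select_peak_columns(pd, pdf)                       # list of matrix
res_mixed <- select_peak_columns(pd, pdf, c("pk_ann", "mz"))   # list of data.frame
res_extra <- select_peak_columns(pd, pdf, "pk_ann")            # list of data.frame
```

Only the request limited to `"mz"` and `"intensity"` yields a `matrix`. Any request that includes an additional peak variable yields a `data.frame` with columns in the requested order.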
@@ -386,8 +387,9 @@ setReplaceMethod("peaksData", "MsBackendMemory", function(object, value) {
         stop("'value' has to be a list-like object")
     if (length(value) != length(object))
         stop("Length of 'value' has to match length of 'object'")
-    if (!is.matrix(value[[1L]]))
-        stop("'value' is expected to be a 'list' of 'matrix'")
+    if (!(is.matrix(value[[1L]]) | is.data.frame(value[[1L]])))
+        stop("'value' is expected to be a 'list' of 'matrix' ",
+             "or 'data.frame'")
     cn <- colnames(value[[1L]])
     lcn <- length(cn)
     lapply(value, function(z) {
@@ -397,12 +399,12 @@ setReplaceMethod("peaksData", "MsBackendMemory", function(object, value) {
     })
     ## columns mz and intensity go into peaksData.
     if (lcn == 2 && all(cn == c("mz", "intensity")))
-        object@peaksData <- value
+        object@peaksData <- lapply(value, base::as.matrix)
     else {
         pcn <- intersect(c("mz", "intensity"), cn)
         if (length(pcn))
             object@peaksData <- lapply(
-                value, function(z) z[, pcn, drop = FALSE])
+                value, function(z) base::as.matrix(z[, pcn, drop = FALSE]))
         pcn <- setdiff(cn, c("mz", "intensity"))
         if (length(pcn))
             object@peaksDataFrame <- lapply(
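
The split performed by the replacement method can be sketched in base R as follows. `split_peak_columns` is a hypothetical helper written for illustration and is not part of Spectra. The hunk above is truncated, so how the additional columns are stored in `@peaksDataFrame` is an assumption here (`as.data.frame`).

```r
## Hypothetical helper mirroring the column split shown in the diff;
## the handling of the additional columns is assumed, not confirmed.
split_peak_columns <- function(value) {
    cn <- colnames(value[[1L]])
    core <- intersect(c("mz", "intensity"), cn)   # always mz before intensity
    extra <- setdiff(cn, c("mz", "intensity"))
    list(
        ## m/z and intensity are stored as a numeric matrix
        peaksData = lapply(
            value, function(z) base::as.matrix(z[, core, drop = FALSE])),
        ## any other peak variable stays in a data.frame
        peaksDataFrame = lapply(
            value, function(z) as.data.frame(z[, extra, drop = FALSE]))
    )
}

## Input as data.frames with "intensity" before "mz", similar to the
## test case added in this commit.
value <- list(
    data.frame(intensity = 10.1, mz = 1, add_col = 15),
    data.frame(intensity = c(12.1, 12.4, 12.4), mz = 1:3, add_col = 5:7))
res <- split_peak_columns(value)
```

Because `intersect(c("mz", "intensity"), cn)` takes its order from the first argument, the stored matrix always has `"mz"` before `"intensity"`, whatever the column order of the input.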
42 changes: 23 additions & 19 deletions man/MsBackend.Rd

Some generated files are not rendered by default.

4 changes: 3 additions & 1 deletion man/Spectra.Rd

Some generated files are not rendered by default.

71 changes: 44 additions & 27 deletions tests/testthat/test_MsBackendMemory.R
@@ -294,6 +294,7 @@ test_that("peaksData,MsBackendMemory works", {
                                   intensity = test_df$intensity[[3L]]))
     expect_error(peaksData(be, "other"), "not available")
     res <- peaksData(be, c("intensity", "mz"))
+    expect_true(is.matrix(res[[1L]]))
     expect_equal(res[[1L]], cbind(intensity = test_df$intensity[[1L]],
                                   mz = test_df$mz[[1L]]))
     expect_equal(res[[2L]], cbind(intensity = test_df$intensity[[2L]],
@@ -317,41 +318,46 @@ test_that("peaksData,MsBackendMemory works", {
     expect_equal(res[[3L]], cbind(mz = test_df$mz[[3L]],
                                   intensity = test_df$intensity[[3L]]))
     res <- peaksData(be, c("mz", "pk_ann"))
-    expect_equal(res[[1L]], cbind(mz = tmp$mz[[1L]],
-                                  pk_ann = tmp$pk_ann[[1L]]))
-    expect_equal(res[[2L]], cbind(mz = tmp$mz[[2L]],
-                                  pk_ann = tmp$pk_ann[[2L]]))
-    expect_equal(res[[3L]], cbind(mz = tmp$mz[[3L]],
-                                  pk_ann = tmp$pk_ann[[3L]]))
+    expect_true(is.data.frame(res[[1L]]))
+    expect_equal(res[[1L]], data.frame(mz = tmp$mz[[1L]],
+                                       pk_ann = tmp$pk_ann[[1L]]))
+    expect_equal(res[[2L]], data.frame(mz = tmp$mz[[2L]],
+                                       pk_ann = tmp$pk_ann[[2L]]))
+    expect_equal(res[[3L]], data.frame(mz = tmp$mz[[3L]],
+                                       pk_ann = tmp$pk_ann[[3L]]))
     res <- peaksData(be, c("pk_ann", "mz"))
-    expect_equal(res[[1L]], cbind(pk_ann = tmp$pk_ann[[1L]],
-                                  mz = tmp$mz[[1L]]))
-    expect_equal(res[[2L]], cbind(pk_ann = tmp$pk_ann[[2L]],
-                                  mz = tmp$mz[[2L]]))
-    expect_equal(res[[3L]], cbind(pk_ann = tmp$pk_ann[[3L]],
-                                  mz = tmp$mz[[3L]]))
+    expect_true(is.data.frame(res[[1L]]))
+    expect_equal(res[[1L]], data.frame(pk_ann = tmp$pk_ann[[1L]],
+                                       mz = tmp$mz[[1L]]))
+    expect_equal(res[[2L]], data.frame(pk_ann = tmp$pk_ann[[2L]],
+                                       mz = tmp$mz[[2L]]))
+    expect_equal(res[[3L]], data.frame(pk_ann = tmp$pk_ann[[3L]],
+                                       mz = tmp$mz[[3L]]))
     tmp$add_ann <- list(1:3, 1:2, 1:4)
     be <- backendInitialize(
         be, tmp, peaksVariables = c("mz", "intensity", "pk_ann", "add_ann"))
     res <- peaksData(be, c("mz", "pk_ann"))
-    expect_equal(res[[1L]], cbind(mz = tmp$mz[[1L]],
-                                  pk_ann = tmp$pk_ann[[1L]]))
-    expect_equal(res[[2L]], cbind(mz = tmp$mz[[2L]],
-                                  pk_ann = tmp$pk_ann[[2L]]))
-    expect_equal(res[[3L]], cbind(mz = tmp$mz[[3L]],
-                                  pk_ann = tmp$pk_ann[[3L]]))
+    expect_true(is.data.frame(res[[1L]]))
+    expect_equal(res[[1L]], data.frame(mz = tmp$mz[[1L]],
+                                       pk_ann = tmp$pk_ann[[1L]]))
+    expect_equal(res[[2L]], data.frame(mz = tmp$mz[[2L]],
+                                       pk_ann = tmp$pk_ann[[2L]]))
+    expect_equal(res[[3L]], data.frame(mz = tmp$mz[[3L]],
+                                       pk_ann = tmp$pk_ann[[3L]]))
     res <- peaksData(be, c("add_ann", "pk_ann"))
-    expect_equal(res[[1L]], cbind(add_ann = tmp$add_ann[[1L]],
-                                  pk_ann = tmp$pk_ann[[1L]]))
-    expect_equal(res[[2L]], cbind(add_ann = tmp$add_ann[[2L]],
-                                  pk_ann = tmp$pk_ann[[2L]]))
-    expect_equal(res[[3L]], cbind(add_ann = tmp$add_ann[[3L]],
-                                  pk_ann = tmp$pk_ann[[3L]]))
+    expect_true(is.data.frame(res[[1L]]))
+    expect_equal(res[[1L]], data.frame(add_ann = tmp$add_ann[[1L]],
+                                       pk_ann = tmp$pk_ann[[1L]]))
+    expect_equal(res[[2L]], data.frame(add_ann = tmp$add_ann[[2L]],
+                                       pk_ann = tmp$pk_ann[[2L]]))
+    expect_equal(res[[3L]], data.frame(add_ann = tmp$add_ann[[3L]],
+                                       pk_ann = tmp$pk_ann[[3L]]))

     res <- peaksData(be, "pk_ann")
-    expect_equal(res[[1L]], cbind(pk_ann = tmp$pk_ann[[1L]]))
-    expect_equal(res[[2L]], cbind(pk_ann = tmp$pk_ann[[2L]]))
-    expect_equal(res[[3L]], cbind(pk_ann = tmp$pk_ann[[3L]]))
+    expect_true(is.data.frame(res[[1L]]))
+    expect_equal(res[[1L]], data.frame(pk_ann = tmp$pk_ann[[1L]]))
+    expect_equal(res[[2L]], data.frame(pk_ann = tmp$pk_ann[[2L]]))
+    expect_equal(res[[3L]], data.frame(pk_ann = tmp$pk_ann[[3L]]))
 })

 test_that("peaksData<-,MsBackendMemory works", {
@@ -386,6 +392,17 @@ test_that("peaksData<-,MsBackendMemory works", {
     expect_equal(be@peaksDataFrame, list(data.frame(add_col = 15),
                                          data.frame(add_col = 5:7),
                                          data.frame(add_col = 100)))
+
+    lst2 <- list(
+        data.frame(intensity = 10.1, mz = 1, add_col = 15),
+        data.frame(intensity = c(12.1, 12.4, 12.4), mz = 1:3, add_col = 5:7),
+        data.frame(intensity = 100, mz = 3.1, add_col = 100))
+    peaksData(be) <- lst2
+    expect_equal(peaksData(be), lst)
+    expect_equal(be@peaksDataFrame, list(data.frame(add_col = 15),
+                                         data.frame(add_col = 5:7),
+                                         data.frame(add_col = 100)))
+
 })

 test_that("$,MsBackendMemory works", {