Add chromPeakSummary function and a fix #772

Open: wants to merge 24 commits into base: devel

Commits (24):
2c57df3
feat: report m/z values in chromPeaks matrix for XChromatograms
jorainer Sep 16, 2024
dacd665
fix: peak shape quality calculation for gap filling
jorainer Sep 18, 2024
188541c
feat: add chromPeakSummary generic (issue #705)
jorainer Sep 18, 2024
e5aa3ff
docs: add documentation for chromPeakSummary generic
jorainer Sep 18, 2024
b59b15e
Add internal function to calculate beta metrics
pablovgd Sep 18, 2024
c017cba
Fixed requested changes for PR #767
pablovgd Sep 19, 2024
fe7509c
Fix for PR #767
pablovgd Sep 19, 2024
670de04
Merge pull request #767 from pablovgd/issue705
jorainer Sep 19, 2024
c581cde
added chromPeakSummary method
pablovgd Sep 20, 2024
f9d4b0d
requested changes for PR #768
pablovgd Sep 20, 2024
e7ee120
Merge pull request #768 from pablovgd/issue705
jorainer Sep 23, 2024
dd68e2f
Added section on peak quality to vignette.
pablovgd Sep 23, 2024
a49539e
Fixed typos in peak quality vignette.
pablovgd Sep 23, 2024
13cfaf3
Merge pull request #770 from pablovgd/issue705
jorainer Sep 24, 2024
156788a
refactor: little fixes
jorainer Sep 24, 2024
fd9cf65
Address William's comments
jorainer Oct 1, 2024
39cffa7
Merge branch 'devel' into jomain
jorainer Oct 8, 2024
a3bc33b
bump version
jorainer Oct 8, 2024
9f1b7b6
Merge pull request #776 from sneumann/phili
jorainer Oct 29, 2024
93d8b2d
Merge branch 'devel' into jomain
jorainer Oct 30, 2024
97b781e
Merge remote-tracking branch 'origin/jomain' into jomain
jorainer Oct 30, 2024
46ca5ae
Update NEWS.md
jorainer Oct 30, 2024
2d658b6
Merge branch 'devel' into jomain
jorainer Dec 16, 2024
6a67604
feat: add c and coerce methods for XcmsExperiment
jorainer Dec 17, 2024
3 changes: 1 addition & 2 deletions DESCRIPTION
@@ -1,5 +1,5 @@
Package: xcms
Version: 4.5.2
Version: 4.5.3
Title: LC-MS and GC-MS Data Analysis
Description: Framework for processing and visualization of chromatographically
separated and single-spectra mass spectral data. Imports from AIA/ANDI NetCDF,
@@ -159,4 +159,3 @@ Collate:
'writemztab.R'
'xcmsSource.R'
'zzz.R'

8 changes: 5 additions & 3 deletions NAMESPACE
@@ -95,7 +95,7 @@ S3method(plot, xcmsEIC)
S3method(split, xcmsSet)
S3method(c, xcmsSet)
S3method(c, XCMSnExp)

S3method(c, XcmsExperiment)
S3method(split, xcmsRaw)

exportClasses(
@@ -461,7 +461,8 @@ export("CentWaveParam",
"CleanPeaksParam",
"MergeNeighboringPeaksParam",
"FilterIntensityParam",
"ChromPeakAreaParam")
"ChromPeakAreaParam",
"BetaDistributionParam")
## Param class methods.

## New Classes
@@ -530,7 +531,8 @@ exportMethods("hasChromPeaks",
"featureSpectra",
"chromPeakSpectra",
"chromPeakChromatograms",
"featureChromatograms"
"featureChromatograms",
"chromPeakSummary"
)

## feature grouping functions and methods.
27 changes: 20 additions & 7 deletions NEWS.md
@@ -1,13 +1,26 @@
# xcms 4.5.2
# xcms 4.5

## Changes in version 4.5.3

- Address issue #765: peak detection on chromatographic data: report a
chromatogram's `"mz"`, `"mzmin"` and `"mzmax"` as the mean m/z and lower and
upper m/z in the `chromPeaks()` matrix.
- Fix calculation of the correlation coefficient for peak shape similarity with
an idealized bell shape (*beta*) during gap filling for centWave-based
chromatographic peak detection with parameter `verboseBetaColumns = TRUE`.
- Add `chromPeakSummary` generic (issue #705).
- Add `chromPeakSummary()` method to calculate the *beta* quality metrics.
- Add `c()` method to combine multiple `XcmsExperiment` objects into one.
- Add a method to coerce from `XCMSnExp` to `XcmsExperiment` objects.

## Changes in version 4.5.2

- Small update to `featureSpectra()` and `chromPeakSpectra()` to allow
`chromPeaks()` and `featureDefinitions()` columns to be added to the
`Spectra` output.
- Tidied the `xcms` vignette, to order the filtering of features and remove
the outdated normalisation paragraph. An in-depth discussion of this subject
can be found in `metabonaut`.
`Spectra` output.
- Tidied the `xcms` vignette, to order the filtering of features and remove
the outdated normalisation paragraph. An in-depth discussion of this subject
can be found in `metabonaut`.

## Changes in version 4.5.1

@@ -18,8 +31,8 @@
## Changes in version 4.3.4

- Small update to the `matchLamaChromPeaks()` function to get the chromPeaksId
of the chromPeaks matched with Lamas.
- Small fix to the .yml file for the github actions, so they do not crash on
of the chromPeaks matched with Lamas.
- Small fix to the .yml file for the github actions, so they do not crash on
warnings.

## Changes in version 4.3.3
122 changes: 95 additions & 27 deletions R/AllGenerics.R
@@ -98,7 +98,7 @@ setGeneric("addProcessHistory", function(object, ...)
#' parameter in \code{\link{profile-matrix}} documentation for more details.
#'
#' @param BPPARAM parallel processing setup. Defaults to `BPPARAM = bpparam()`.
#' See [bpparam()] for details.
#' See [BiocParallel::bpparam()] for details.
#'
#' @param centerSample \code{integer(1)} defining the index of the center sample
#' in the experiment. It defaults to
@@ -143,7 +143,7 @@ setGeneric("addProcessHistory", function(object, ...)
#'
#' @param family For `PeakGroupsParam`: `character(1)` defining the method for
#' loess smoothing. Allowed values are `"gaussian"` and `"symmetric"`. See
#' [loess()] for more information.
#' [stats::loess()] for more information.
#'
#' @param gapExtend For `ObiwarpParam`: `numeric(1)` defining the penalty for
#' gap enlargement. The default value for `gapExtend` depends on the value
@@ -177,8 +177,8 @@ setGeneric("addProcessHistory", function(object, ...)
#' @param msLevel For `adjustRtime`: `integer(1)` defining the MS level on
#' which the alignment should be performed.
#'
#' @param object For `adjustRtime`: an [OnDiskMSnExp()], [XCMSnExp()],
#' [MsExperiment()] or [XcmsExperiment()] object.
#' @param object For `adjustRtime`: an [MSnbase::OnDiskMSnExp()], [XCMSnExp()],
#' [MsExperiment::MsExperiment()] or [XcmsExperiment()] object.
#'
#' @param param The parameter object defining the alignment method (and its
#' setting).
@@ -212,7 +212,7 @@ setGeneric("addProcessHistory", function(object, ...)
#'
#' @param span For `PeakGroupsParam`: `numeric(1)` defining
#' the degree of smoothing (if `smooth = "loess"`). This parameter is
#' passed to the internal call to [loess()].
#' passed to the internal call to [stats::loess()].
#'
#' @param subset For `ObiwarpParam` and `PeakGroupsParam`: `integer` with the
#' indices of samples within the experiment on which the alignment models
@@ -463,7 +463,8 @@ setGeneric("chromPeakData<-", function(object, value)
#' The columns will be named as they are written in the `chromPeaks` object
#' with a prefix `"chrom_peak_"`. Defaults to `c("mz", "rt")`.
#'
#' @param BPPARAM parallel processing setup. Defaults to [bpparam()].
#' @param BPPARAM parallel processing setup. Defaults to
#' [BiocParallel::bpparam()].
#'
#' @param ... ignored.
#'
@@ -545,6 +546,66 @@ setGeneric("chromPeakData<-", function(object, value)
setGeneric("chromPeakSpectra", function(object, ...)
standardGeneric("chromPeakSpectra"))

#' @title Chromatographic peak summaries
#'
#' @name chromPeakSummary
#'
#' @description
#'
#' The `chromPeakSummary()` method calculates summary statistics or other
#' metrics for each of the identified chromatographic peaks in an *xcms* result
#' object, such as an [XcmsExperiment()]. The metrics to calculate are selected
#' and configured through dedicated *parameter* classes. As a
#' result, the method returns a `matrix` or `data.frame` with one row per
#' chromatographic peak. Each column contains calculated values, depending on
#' the used method/parameter class.
#'
#' Currently implemented methods/parameter classes are:
#'
#' - `BetaDistributionParam`: calculates the *beta_cor* and *beta_snr* quality
#' metrics described in Kumler 2023. *beta_cor* is Pearson's correlation
#' coefficient between the peak and an idealized bell curve, and *beta_snr*
#' is the signal-to-noise ratio calculated on the residuals of this
#' comparison.
#'
#' @param BPPARAM Parallel processing setup. See
#' [BiocParallel::bpparam()] for details.
#'
#' @param chunkSize `integer(1)` defining the number of samples from which data
#' should be loaded and processed at a time.
#'
#' @param msLevel `integer(1)` with the MS level of the chromatographic peaks
#' on which the metric should be calculated.
#'
#' @param object an *xcms* result object containing information on
#' identified chromatographic peaks.
#'
#' @param param a parameter object defining the method/summaries that should
#' be calculated (see description above for supported parameter classes).
#'
#' @param ... additional arguments passed to the method implementation.
#'
#' @return
#'
#' A `matrix` or `data.frame` with the same number of rows as there are
#' chromatographic peaks. Columns contain the calculated values. The number of
#' columns, their names and content depend on the used parameter object. See
#' the respective documentation above for more details.
#'
#' @author Pablo Vangeenderhuysen, Johannes Rainer, William Kumler
#'
#' @md
#'
#' @references
#'
#' Kumler W, Hazelton B J and Ingalls A E (2023) "Picky with peakpicking:
#' assessing chromatographic peak quality with simple metrics in metabolomics"
#' *BMC Bioinformatics* 24(1):404. doi: 10.1186/s12859-023-05533-4
#'
#' @export
setGeneric("chromPeakSummary", function(object, param, ...)
standardGeneric("chromPeakSummary"))

setGeneric("collect", function(object, ...) standardGeneric("collect"))
setGeneric("consecMissedLimit", function(object, ...)
standardGeneric("consecMissedLimit"))
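
The two *beta* metrics named in the `chromPeakSummary` documentation above can be illustrated with a few lines of base R. This is only a sketch of the idea (correlation with a bell curve, then a signal-to-noise ratio on the residuals) and is not the implementation added in this pull request: the simulated peak, the `dbeta()` shape parameters (5, 5) and the exact signal-to-noise formula are assumptions made for illustration.

```r
## Illustrative only: NOT the xcms implementation of beta_cor / beta_snr.
set.seed(42)

## Hypothetical chromatographic peak: 21 retention times and noisy intensities.
rt <- seq(100, 120, length.out = 21)
rt_scaled <- (rt - min(rt)) / diff(range(rt))          # scale rt to [0, 1]
bell <- dbeta(rt_scaled, shape1 = 5, shape2 = 5)       # assumed idealized bell
int <- 1e4 * bell / max(bell) + rnorm(length(rt), sd = 200)

## Similarity to the bell curve: Pearson's correlation coefficient.
beta_cor <- cor(int, bell, method = "pearson")

## Residuals after scaling both curves to a maximum of 1, then an assumed
## signal-to-noise definition: intensity range over the residuals' spread.
res <- int / max(int) - bell / max(bell)
beta_snr <- (max(int) - min(int)) / sd(res * max(int))

c(beta_cor = beta_cor, beta_snr = beta_snr)
```

A well-shaped peak gives a correlation close to 1 and a large signal-to-noise value; a noisy or misshapen peak lowers both.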
@@ -642,8 +703,8 @@ setGeneric("family<-", function(object, value) standardGeneric("family<-"))
#' chromatogram.
#'
#' @param BPPARAM For `object` being an `XcmsExperiment`: parallel processing
#' setup. Defaults to `BPPARAM = bpparam()`. See [bpparam()] for more
#' information.
#' setup. Defaults to `BPPARAM = bpparam()`. See [BiocParallel::bpparam()]
#' for more information.
#'
#' @param chunkSize For `object` being an `XcmsExperiment`: `integer(1)`
#' defining the number of files from which the data should be loaded at
@@ -810,7 +871,8 @@ setGeneric("featureDefinitions<-", function(object, value)
#' spectra per feature).
#'
#' The information from `featureDefinitions` for each feature can be included
#' in the returned [Spectra()] object using the `featureColumns` parameter.
#' in the returned [Spectra::Spectra()] object using the `featureColumns`
#' parameter.
#' This is useful for keeping details such as the median retention time (`rtmed`)
#' or median m/z (`mzmed`). The columns will retain their names as specified
#' in the `featureDefinitions` object, prefixed by `"feature_"`
@@ -819,9 +881,11 @@ setGeneric("featureDefinitions<-", function(object, value)
#' as a metadata column named `"feature_id"`.
#'
#' See also [chromPeakSpectra()], as it supports a similar parameter for
#' including columns from the chromatographic peaks in the returned spectra object.
#' including columns from the chromatographic peaks in the returned spectra
#' object.
#' These parameters can be used in combination to include information from both
#' the chromatographic peaks and the features in the returned [Spectra()].
#' the chromatographic peaks and the features in the returned
#' [Spectra::Spectra()].
#' The *peak ID* (i.e., the row name of the peak in the `chromPeaks` matrix)
#' is added as a metadata column named `"chrom_peak_id"`.
#'
@@ -847,7 +911,8 @@ setGeneric("featureDefinitions<-", function(object, value)
#'
#' @return
#'
#' The function returns either a [Spectra()] (for `return.type = "Spectra"`)
#' The function returns either a [Spectra::Spectra()] (for
#' `return.type = "Spectra"`)
#' or a `List` of `Spectra` (for `return.type = "List"`). For the latter,
#' the order and the length matches parameter `features` (or if no `features`
#' is defined the order of the features in `featureDefinitions(object)`).
@@ -1146,7 +1211,7 @@ setGeneric("filterFeatureDefinitions", function(object, ...)
#' object will remove previous results.
#'
#' @param BPPARAM Parallel processing setup. Uses by default the system-wide
#' default setup. See [bpparam()] for more details.
#' default setup. See [BiocParallel::bpparam()] for more details.
#'
#' @param chunkSize `integer(1)` for `object` being an `MsExperiment` or
#' [XcmsExperiment()]: defines the number of files (samples) for which the
@@ -1165,14 +1230,15 @@ setGeneric("filterFeatureDefinitions", function(object, ...)
#' will thus in most settings cause an out-of-memory error.
#' By setting `chunkSize = -1` the peak detection will be performed
#' separately, and in parallel, for each sample. This will however not work
#' for all `Spectra` *backends* (see [Spectra()] for details).
#' for all `Spectra` *backends* (see [Spectra::Spectra()] for
#' details).
#'
#' @param msLevel `integer(1)` defining the MS level on which the
#' chromatographic peak detection should be performed.
#'
#' @param object The data object on which to perform the peak detection. Can be
#' an [OnDiskMSnExp()], [XCMSnExp()], [MChromatograms()] or [MsExperiment()]
#' object.
#' an [MSnbase::OnDiskMSnExp()], [XCMSnExp()], [MSnbase::MChromatograms()]
#' or [MsExperiment::MsExperiment()] object.
#'
#' @param param The parameter object selecting and configuring the algorithm.
#'
@@ -1242,7 +1308,8 @@ setGeneric("findChromPeaks", function(object, param, ...)
#' more information.
#'
#' @param BPPARAM if `object` is an `MsExperiment` or `XcmsExperiment`:
#' parallel processing setup. See [bpparam()] for more information.
#' parallel processing setup. See [BiocParallel::bpparam()] for more
#' information.
#'
#' @param ... currently not used.
#'
@@ -1537,7 +1604,8 @@ setGeneric("loadRaw", function(object, ...) standardGeneric("loadRaw"))
#' chromatographic peaks into features by providing their index in the
#' object's `chromPeaks` matrix.
#'
#' @param BPPARAM parallel processing settings (see [bpparam()] for details).
#' @param BPPARAM parallel processing settings (see [BiocParallel::bpparam()]
#' for details).
#'
#' @param chromPeaks For `manualChromPeaks`: `matrix` defining the boundaries
#' of the chromatographic peaks with one row per chromatographic peak and
@@ -1745,9 +1813,9 @@ setGeneric("rawMZ", function(object, ...) standardGeneric("rawMZ"))
#' Each MS2 chromatographic peak selected for an MS1 peak will thus represent
#' one **mass peak** in the reconstructed spectrum.
#'
#' The resulting [Spectra()] object provides also the peak IDs of the MS2
#' chromatographic peaks for each spectrum as well as their correlation value
#' with spectra variables *ms2_peak_id* and *ms2_peak_cor*.
#' The resulting [Spectra::Spectra()] object provides also the peak IDs of
#' the MS2 chromatographic peaks for each spectrum as well as their
#' correlation value with spectra variables *ms2_peak_id* and *ms2_peak_cor*.
#'
#' @param object `XCMSnExp` with identified chromatographic peaks.
#'
@@ -1774,8 +1842,8 @@ setGeneric("rawMZ", function(object, ...) standardGeneric("rawMZ"))
#' `chromPeaks`) of MS1 peaks for which MS2 spectra should be reconstructed.
#' By default they are reconstructed for all MS1 chromatographic peaks.
#'
#' @param BPPARAM parallel processing setup. See [bpparam()] for more
#' information.
#' @param BPPARAM parallel processing setup. See [BiocParallel::bpparam()]
#' for more information.
#'
#' @param return.type `character(1)` defining the type of the returned object.
#' Only `return.type = "Spectra"` is supported, `return.type = "MSpectra"`
@@ -1785,14 +1853,14 @@ setGeneric("rawMZ", function(object, ...) standardGeneric("rawMZ"))
#'
#' @return
#'
#' - [Spectra()] object (defined in the `Spectra` package) with the
#' - [Spectra::Spectra()] object (defined in the `Spectra` package) with the
#' reconstructed MS2 spectra for all MS1 peaks in `object`. Contains
#' empty spectra (i.e. without m/z and intensity values) for MS1 peaks for
#' which reconstruction was not possible (either no MS2 signal was recorded
#' or the correlation of the MS2 chromatographic peaks with the MS1
#' chromatographic peak was below threshold `minCor`). Spectra variables
#' `"ms2_peak_id"` and `"ms2_peak_cor"` (of type [CharacterList()]
#' and [NumericList()] with length equal to the number of peaks per
#' `"ms2_peak_id"` and `"ms2_peak_cor"` (of type [IRanges::CharacterList()]
#' and [IRanges::NumericList()] with length equal to the number of peaks per
#' reconstructed MS2 spectrum) providing the IDs and the correlation of the
#' MS2 chromatographic peaks from which the MS2 spectrum was reconstructed.
#' As retention time the median retention times of all MS2 chromatographic
@@ -1888,7 +1956,7 @@ setGeneric("reconstructChromPeakSpectra", function(object, ...)
#'
#' @param BPPARAM parameter object to set up parallel processing. Uses the
#' default parallel processing setup returned by `bpparam()`. See
#' [bpparam()] for details and examples.
#' [BiocParallel::bpparam()] for details and examples.
#'
#' @param chunkSize For `refineChromPeaks` if `object` is either an
#' `XcmsExperiment`: `integer(1)` defining the number of files (samples)
5 changes: 5 additions & 0 deletions R/DataClasses.R
@@ -2182,3 +2182,8 @@ setClass("FilterIntensityParam",
msg
else TRUE
})

setClass("BetaDistributionParam",
contains = "Param"
)

31 changes: 31 additions & 0 deletions R/MsExperiment-functions.R
@@ -546,3 +546,34 @@
x@sampleDataLinks[["spectra"]] <- sdl
x
}

#' WARNING: this only joins @sampleData, @spectra and
#' `@sampleDataLinks[["spectra"]]`! All other slots are ignored.
#'
#' @noRd
.mse_combine <- function(x) {
    if (!all(vapply(x, inherits, NA, "MsExperiment")))
        stop("Only objects extending 'MsExperiment' accepted as input.")
    ## check other slots
    lapply(x, function(z) {
        if (length(z@experimentFiles) || length(z@qdata) || length(z@otherData))
            stop("Slots 'experimentFiles', 'qdata' or 'otherData' are not ",
                 "empty! Can only combine objects for which these data slots ",
                 "are empty.", call. = FALSE)
    })
    res <- x[[1L]]
    res@sampleData <- do.call(MsCoreUtils::rbindFill, lapply(x, sampleData))
    res@spectra <- do.call(c, lapply(x, spectra))
    sl <- lapply(x, function(z) z@sampleDataLinks[["spectra"]])
    nsamp <- lengths(x)
    nsamp <- c(0, cumsum(nsamp)[-length(nsamp)])
    nspec <- vapply(sl, nrow, NA_integer_)
    nspec <- c(0, cumsum(nspec)[-length(nspec)])
    res@sampleDataLinks[["spectra"]] <- do.call(
        rbind, mapply(function(z, i, j) {
            z[, 1L] <- z[, 1L] + i
            z[, 2L] <- z[, 2L] + j
            z
        }, sl, nsamp, nspec, SIMPLIFY = FALSE, USE.NAMES = FALSE))
    res
}
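
The index shifting at the end of `.mse_combine` may be easier to follow on toy data. The sketch below uses plain integer matrices as hypothetical stand-ins for the `@sampleDataLinks[["spectra"]]` entries of two objects (column 1 assumed to be the sample index, column 2 the spectrum index) and applies the same cumulative-offset logic. The variable names and the toy values are illustrative assumptions, not part of the pull request.

```r
## Hypothetical link matrices: column 1 = sample index, column 2 = spectrum
## index, both 1-based and local to each object.
links <- list(
    cbind(c(1L, 1L, 2L), 1:3),        # object 1: 2 samples, 3 spectra
    cbind(c(1L, 2L, 2L, 2L), 1:4)     # object 2: 2 samples, 4 spectra
)
n_samples <- c(2L, 2L)                # assumed number of samples per object

## Offsets: total number of samples/spectra in all preceding objects.
sample_offset <- c(0L, cumsum(n_samples)[-length(n_samples)])
n_spectra <- vapply(links, nrow, NA_integer_)
spectra_offset <- c(0L, cumsum(n_spectra)[-length(n_spectra)])

## Shift each object's local indices by its offsets and stack the results.
combined <- do.call(rbind, mapply(function(z, i, j) {
    z[, 1L] <- z[, 1L] + i
    z[, 2L] <- z[, 2L] + j
    z
}, links, sample_offset, spectra_offset, SIMPLIFY = FALSE, USE.NAMES = FALSE))
combined
```

After the shift, the second object's samples are numbered 3 and 4 and its spectra 4 to 7, so every row of the combined matrix points into the concatenated sample table and spectra.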