merge function doubt #369

jamorillo · 2024-05-24T15:53:29Z

Dear Chi,
I have a question about the merge function of microeco (merge_taxa). If it collapses or combines all ASVs/OTUs into genera (for example), I would expect the row.names of the "OTU table" to be genus names, but instead they remain OTU IDs. How can I perform this calculation within microeco? I need all the data "collapsed" into genera.
For example:

dataset

microtable-class object:
sample_table have 90 rows and 4 columns
otu_table have 404 rows and 90 columns
tax_table have 404 rows and 7 columns
phylo_tree have 404 tips
rep_fasta have 404 sequences

dataset_Genus <- dataset$merge_taxa(taxa = "Genus")
dataset_Genus

microtable-class object:
sample_table have 90 rows and 4 columns
otu_table have 180 rows and 90 columns
tax_table have 180 rows and 6 columns

-> OK, 180 genera. BUT:

row.names(dataset_Genus$otu_table)

[1] "OTU_16" "OTU_172" "OTU_150" "OTU_12" "OTU_357" "OTU_1" "OTU_5" "OTU_34" "OTU_82" "OTU_102" "OTU_36" "OTU_2" ...

-> this is where I would expect genus names like "Bacillus" or "Pseudomonas", collapsing all ASVs that belong to the same genus

ChiLiubio · 2024-05-25T02:42:50Z

Hi @jamorillo ,
The main reason is that the collapsed genera include many unclassified entries that belong to different families, e.g. multiple "g__" values in the Genus column of tax_table. If we used genus names directly as rownames, these g__ entries would be merged into one, which would discard some of the unclassified information. If we instead combined the names of all taxonomic levels into the rownames, they would not be readable for users. So the best option is to select one representative OTU/ASV to temporarily represent its genus. This does not affect any of the subsequent analyses, because the collapsed data (microtable object) has exactly the same format as the original one. This is a very important principle of the pipeline. If you want to use genus names instead of OTU/ASV IDs, you can replace them directly. Here is an example.

library(microeco)
library(magrittr)
data(dataset)
test <- dataset$merge_taxa("Genus")
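# keep only the first row for each duplicated Genus name, e.g. g__ or other repeated names
test$tax_table %<>% .[! duplicated(.$Genus), ]
# drop the remaining g__ row if necessary
test$tax_table %<>% .[.$Genus != "g__", ]
# make otu_table consistent with the filtered tax_table
test$tidy_dataset()
# replace the representative OTU/ASV IDs with genus names (without the g__ prefix)
rownames(test$otu_table) <- rownames(test$tax_table) <- gsub("g__", "", test$tax_table$Genus)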
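
To illustrate why genus names alone cannot be used as rownames, here is a minimal base R sketch. The toy tables and the "unclassified_" labelling scheme are made up for illustration (they are not part of microeco); the idea is only to show that "g__" repeats across families, and one possible way to keep those rows apart instead of dropping them.

```r
# Toy taxonomy and abundance tables after a genus-level merge
# (hypothetical values, not taken from the microeco example dataset).
tax <- data.frame(
  Family = c("f__Bacillaceae", "f__Pseudomonadaceae", "f__Rhizobiaceae", "f__Bacillaceae"),
  Genus  = c("g__Bacillus", "g__Pseudomonas", "g__", "g__"),
  row.names = c("OTU_16", "OTU_172", "OTU_150", "OTU_12"),
  stringsAsFactors = FALSE
)
otu <- data.frame(
  S1 = c(10, 5, 3, 7),
  S2 = c(0, 8, 2, 1),
  row.names = rownames(tax)
)

# The unclassified label "g__" appears under two different families,
# so the Genus column by itself is not a valid set of unique rownames.
dup <- duplicated(tax$Genus)
any(dup)

# Assumed labelling scheme: name unclassified genera after their family,
# and strip the g__ prefix from classified ones.
labels <- ifelse(
  tax$Genus == "g__",
  paste0("unclassified_", sub("f__", "", tax$Family)),
  sub("g__", "", tax$Genus)
)
# Guard against any remaining duplicates before assigning rownames.
labels <- make.unique(labels)
rownames(otu) <- rownames(tax) <- labels
```

With this approach no rows are removed, so unclassified genera from different families stay separate. Whether that is preferable to dropping them depends on the downstream analysis.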

jamorillo · 2024-05-27T15:38:58Z

Aha, I understand the reason. I tested your pipeline and it works perfectly. My idea for this "collapsed" table is to use it for specific heatmaps with selected genera, so it is useful to have the genus names already in the row.names.
Thanks a lot once more,
jose

ChiLiubio added the documentation Improvements or additions to documentation label May 25, 2024
