Merge pull request #2157 from merenlab/reaction-network-updates
Reaction network updates
Showing 7 changed files with 779 additions and 301 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
This artifact represents **a JSON-formatted file derived from a %(reaction-network)s**. | ||
|
||
The program, %(anvi-get-metabolic-model-file)s, produces this file from the %(reaction-network)s stored in a %(contigs-db)s. The genes, reactions, and metabolites predicted to be involved in metabolism can be inspected in this file, which is formatted for compatibility with software used for flux balance analysis, such as [COBRApy](https://opencobra.github.io/cobrapy/). | ||
The program, %(anvi-get-metabolic-model-file)s, produces this file from the %(reaction-network)s stored in a %(contigs-db)s or %(pan-db)s. The genes, reactions, and metabolites predicted to be involved in metabolism can be inspected in this file, which is formatted for compatibility with software used for flux balance analysis, such as [COBRApy](https://opencobra.github.io/cobrapy/). | ||
|
||
%(anvi-get-metabolic-model-file)s includes an "objective function" as the first entry of the "reactions" section of the file, a prerequisite for flux balance analysis. The objective function represents the biomass composition of metabolites in the ["core metabolism" of *E. coli*](http://bigg.ucsd.edu/models/e_coli_core). |
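A quick way to inspect such a file without any flux balance software is to read it with Python's standard library. The sketch below is a minimal, hedged example: the key names (`metabolites`, `reactions`, `genes`, and the per-reaction `id` and `metabolites` fields) follow the COBRApy JSON model layout and are assumptions here, and the toy identifiers are invented purely for illustration. Check them against an actual output file before relying on them.

```python
import json


def summarize_model(model: dict) -> dict:
    """Count the entries in each section and describe the first reaction."""
    reactions = model.get("reactions", [])
    first = reactions[0] if reactions else None
    return {
        "n_metabolites": len(model.get("metabolites", [])),
        "n_reactions": len(reactions),
        "n_genes": len(model.get("genes", [])),
        # The text above says the objective function is the first reaction entry.
        "first_reaction_id": first["id"] if first else None,
        "first_reaction_metabolites": sorted(first["metabolites"]) if first else [],
    }


# Toy stand-in for a real file; all identifiers below are hypothetical.
toy_json = json.dumps({
    "metabolites": [{"id": "met_a"}, {"id": "met_b"}],
    "reactions": [
        {"id": "toy_objective", "metabolites": {"met_a": -1.0, "met_b": -0.5}},
        {"id": "toy_reaction", "metabolites": {"met_a": -1.0, "met_b": 1.0}},
    ],
    "genes": [{"id": "gene_1"}],
})

summary = summarize_model(json.loads(toy_json))
print(summary)
```

For a real file, replace the toy string with the parsed contents of the output path, for example `json.load(open(path))`.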
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
This artifact represents **the metabolic reaction network stored in a %(contigs-db)s by %(anvi-reaction-network)s.** | ||
This artifact represents **the metabolic reaction network stored in a %(contigs-db)s or a %(pan-db)s by %(anvi-reaction-network)s.** | ||
|
||
The program, %(anvi-reaction-network)s, generates a reaction network from genes encoding enzymes in the %(contigs-db)s. The reaction network represents biochemical reactions and the constituent metabolites predicted from the genome. The program relies upon [KEGG Orthology (KO)](https://www.genome.jp/kegg/ko.html) annotations of protein-coding genes and reference data in the [ModelSEED Biochemistry database](https://github.com/ModelSEED/ModelSEEDDatabase), and is therefore subject to all the limitations thereof, including incomplete annotation of genes with protein orthologs and imprecise knowledge of the reactions catalyzed by enzymes. | ||
The program, %(anvi-reaction-network)s, generates a reaction network from genes encoding enzymes in the %(contigs-db)s or from gene clusters with consensus enzyme annotations in the %(pan-db)s. The reaction network represents biochemical reactions and the constituent metabolites predicted from the genome or pangenome. The program relies upon [KEGG Orthology (KO)](https://www.genome.jp/kegg/ko.html) annotations of protein-coding genes and reference data in the [ModelSEED Biochemistry database](https://github.com/ModelSEED/ModelSEEDDatabase), and is therefore subject to all the limitations thereof, including incomplete annotation of genes with protein orthologs and imprecise knowledge of the reactions catalyzed by enzymes. | ||
|
||
The representation of the reaction network in two tables of the %(contigs-db)s, `gene_function_reactions` and `gene_function_metabolites`, is generalizable to other sources of metabolic data, linking genes to predicted functional orthologs and the associated reactions and metabolites. This data can be exported to a JSON-formatted file by %(anvi-get-metabolic-model-file)s for inspection and metabolic model analyses. | ||
The representation of the reaction network in two tables of the %(contigs-db)s, `gene_function_reactions` and `gene_function_metabolites`, is generalizable to other sources of metabolic data, linking genes to predicted functional orthologs and the associated reactions and metabolites. Reaction and metabolite data are likewise stored in the identically formatted tables, `gene_cluster_function_reactions` and `gene_cluster_function_metabolites`, in the %(pan-db)s. This data can be exported to a JSON-formatted file by %(anvi-get-metabolic-model-file)s for inspection and metabolic model analyses. |
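The table names above can be checked directly, since anvi'o databases are SQLite files. The following is a minimal sketch under that assumption; it reports the row count of each reaction network table, or `None` if the table is absent. The column layout of the real tables is not described here, so the demonstration uses an in-memory database with a placeholder column rather than a real %(contigs-db)s.

```python
import sqlite3

# Table names are taken from the paragraph above.
NETWORK_TABLES = {
    "contigs": ("gene_function_reactions", "gene_function_metabolites"),
    "pan": ("gene_cluster_function_reactions", "gene_cluster_function_metabolites"),
}


def network_row_counts(conn: sqlite3.Connection, db_type: str) -> dict:
    """Map each expected network table to its row count, or None if absent."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    present = {name for (name,) in rows}
    counts = {}
    for table in NETWORK_TABLES[db_type]:
        if table in present:
            counts[table] = conn.execute(
                f'SELECT COUNT(*) FROM "{table}"'
            ).fetchone()[0]
        else:
            counts[table] = None
    return counts


# Demonstration on an in-memory database; the single column is a placeholder,
# not the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene_function_reactions (placeholder TEXT)")
conn.executemany(
    "INSERT INTO gene_function_reactions VALUES (?)", [("r1",), ("r2",)]
)
counts = network_row_counts(conn, "contigs")
print(counts)
```

To try this on a real database, pass `sqlite3.connect("/path/to/contigs-db")` and `"contigs"`, or a pan database path and `"pan"`. Tables that are empty or absent suggest that no network has been stored yet.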
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,30 +1,60 @@ | ||
This program **exports a metabolic %(reaction-network)s from a %(contigs-db)s to a %(reaction-network-json)s file** suitable for inspection and flux balance analysis. | ||
This program **exports a metabolic %(reaction-network)s from a %(contigs-db)s OR a %(pan-db)s and %(genomes-storage-db)s to a %(reaction-network-json)s file** formatted for flux balance analysis. | ||
|
||
The required input to this program is a %(contigs-db)s in which a %(reaction-network)s has been stored by %(anvi-reaction-network)s. | ||
The required input to this program is a %(contigs-db)s OR a %(pan-db)s in which a %(reaction-network)s has been stored by %(anvi-reaction-network)s. The %(pan-db)s must be accompanied by a %(genomes-storage-db)s input. | ||
|
||
The %(reaction-network-json)s file output contains sections on the metabolites, reactions, and genes constituting the %(reaction-network)s that had been predicted from the genome. An "objective function" representing the biomass composition of metabolites in the ["core metabolism" of *E. coli*](http://bigg.ucsd.edu/models/e_coli_core) is automatically added as the first entry in the "reactions" section of the file and can be deleted as needed. An objective function is needed for flux balance analysis. | ||
The %(reaction-network-json)s file output contains sections on the metabolites, reactions, and genes (or gene clusters) constituting the %(reaction-network)s that had been predicted from the genome (or pangenome). An "objective function" representing the biomass composition of metabolites in the ["core metabolism" of *E. coli*](http://bigg.ucsd.edu/models/e_coli_core) is automatically added as the first entry in the "reactions" section of the file and can be deleted as needed. An objective function is needed for flux balance analysis. | ||
|
||
## Usage | ||
|
||
%(anvi-get-metabolic-model-file)s requires a %(contigs-db)s as input and the path to an output %(reaction-network-json)s file. | ||
%(anvi-get-metabolic-model-file)s requires a %(contigs-db)s OR a %(pan-db)s and %(genomes-storage-db)s as input, plus the path to an output %(reaction-network-json)s file. | ||
|
||
{{ codestart }} | ||
anvi-get-metabolic-model-file -c %(contigs-db)s \ | ||
anvi-get-metabolic-model-file -c /path/to/contigs-db \ | ||
-o /path/to/output.json | ||
{{ codestop }} | ||
|
||
An existing file at the target output location must be explicitly overwritten with the `-W` flag. | ||
{{ codestart }} | ||
anvi-get-metabolic-model-file -p /path/to/pan-db \ | ||
-g /path/to/genomes-storage-db \ | ||
-o /path/to/output.json | ||
{{ codestop }} | ||
|
||
An existing file at the target output location must be explicitly overwritten with the flag, `--overwrite-output-destinations`. | ||
|
||
{{ codestart }} | ||
anvi-get-metabolic-model-file -c %(contigs-db)s \ | ||
anvi-get-metabolic-model-file -c /path/to/contigs-db \ | ||
-o /path/to/output.json \ | ||
-W | ||
--overwrite-output-destinations | ||
{{ codestop }} | ||
|
||
The flag, `--remove-missing-objective-metabolites`, must be used to remove metabolites in the *E. coli* core biomass objective function from the output file if the metabolites are not produced or consumed by the predicted %(reaction-network)s. [COBRApy](https://opencobra.github.io/cobrapy/), for instance, cannot load the JSON file if metabolites in the objective function are missing from the genomic model. | ||
The flag, `--remove-missing-objective-metabolites`, must be used to remove metabolites in the *E. coli* core biomass objective function from the %(reaction-network-json)s file if the metabolites are not produced or consumed by the predicted %(reaction-network)s. [COBRApy](https://opencobra.github.io/cobrapy/), for instance, cannot load the JSON file if metabolites in the objective function are missing from the model. | ||
|
||
{{ codestart }} | ||
anvi-get-metabolic-model-file -c %(contigs-db)s \ | ||
anvi-get-metabolic-model-file -c /path/to/contigs-db \ | ||
-o /path/to/output.json \ | ||
--remove-missing-objective-metabolites | ||
{{ codestop }} | ||
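To see whether a file written without `--remove-missing-objective-metabolites` would run into this problem, one can compare the metabolites referenced by the objective function with those defined in the file. The sketch below is illustrative only: it assumes the COBRApy JSON layout (a `metabolites` list of entries with an `id`, and a `reactions` list whose entries map metabolite IDs to coefficients under `metabolites`), relies on the objective being the first reaction as described earlier, and uses invented identifiers.

```python
def missing_objective_metabolites(model: dict, objective_index: int = 0) -> list:
    """List objective-function metabolites not defined in the model."""
    defined = {metabolite["id"] for metabolite in model["metabolites"]}
    objective = model["reactions"][objective_index]
    return sorted(m for m in objective["metabolites"] if m not in defined)


# Toy model with hypothetical identifiers: the objective refers to "met_x",
# which has no entry in the metabolites section.
toy_model = {
    "metabolites": [{"id": "met_a"}, {"id": "met_b"}],
    "reactions": [
        {"id": "toy_objective", "metabolites": {"met_a": -1.0, "met_x": -2.0}},
        {"id": "toy_reaction", "metabolites": {"met_a": -1.0, "met_b": 1.0}},
    ],
}

missing = missing_objective_metabolites(toy_model)
print(missing)
```

A non-empty result is a hint that the file should be regenerated with the flag before it is loaded into flux balance analysis software.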
|
||
The gene KO annotations used to construct the stored reaction network may have since changed in the %(contigs-db)s or the %(genomes-storage-db)s. By default, this program checks that the set of gene KO annotations currently stored is the same set that was used to construct the %(reaction-network)s, and raises an error if it is not. The flag, `--ignore-changed-gene-annotations`, skips that check, permitting the set of gene annotations to have changed since creation of the network. | ||
|
||
{{ codestart }} | ||
anvi-get-metabolic-model-file -c /path/to/contigs-db \ | ||
-o /path/to/output.json \ | ||
--ignore-changed-gene-annotations | ||
{{ codestop }} | ||
|
||
For a pangenomic network, the option `--record-genomes` determines which additional information is added to the output %(reaction-network-json)s file regarding genome membership. By default, genome names are recorded for gene clusters and reactions, which is equivalent to `--record-genomes cluster reaction`. 'cluster' records in the 'notes' section of each 'gene' (cluster) entry in the JSON file which genomes are part of the cluster. 'reaction' and 'metabolite', respectively, record the genomes predicted to encode enzymes associated with reaction and metabolite entries. The valid arguments are 'cluster', 'reaction', and 'metabolite'; all three are used in the following example. | ||
|
||
{{ codestart }} | ||
anvi-get-metabolic-model-file -p /path/to/pan-db \ | ||
-g /path/to/genomes-storage-db \ | ||
-o /path/to/output.json \ | ||
--record-genomes cluster reaction metabolite | ||
{{ codestop }} | ||
|
||
The use of `--record-genomes` as a flag without any arguments prevents genome membership from being recorded at all in the %(reaction-network-json)s file. | ||
|
||
{{ codestart }} | ||
anvi-get-metabolic-model-file -p /path/to/pan-db \ | ||
-g /path/to/genomes-storage-db \ | ||
-o /path/to/output.json \ | ||
--record-genomes | ||
{{ codestop }} |