I just spotted this while rewriting the code for diamond. If the subunits of an enzyme are named with letters, subunits I, V and X will get the wrong numbers in complex_detection.R, because they are treated as Roman numerals.
So I would expect
A -> 1
B -> 2
C -> 3
...
I -> 9
but instead it does
I -> 1
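To make the expected mapping concrete, here is a minimal sketch in Python (not the R code from complex_detection.R). The function name `subunit_letter_to_number` and the small `ROMAN` table are hypothetical and only there for illustration; the table shows the Roman-numeral reading that I assume causes the wrong numbers.

```python
import string


def subunit_letter_to_number(label: str) -> int:
    """Map a single-letter subunit label to its alphabet position (A=1 ... Z=26)."""
    label = label.strip().upper()
    if len(label) != 1 or label not in string.ascii_uppercase:
        raise ValueError(f"not a single-letter subunit label: {label!r}")
    return string.ascii_uppercase.index(label) + 1


# Roman-numeral reading of the same letters, for comparison only.
ROMAN = {"I": 1, "V": 5, "X": 10}

for letter in ["A", "B", "C", "I", "V", "X"]:
    # Columns: letter, alphabet position, Roman-numeral value (None if not a numeral)
    print(letter, subunit_letter_to_number(letter), ROMAN.get(letter))
```

With the alphabetical reading, I, V and X come out as 9, 22 and 24, whereas the Roman-numeral reading gives 1, 5 and 10. So a fix probably needs to decide per enzyme whether the nomenclature is letter-based before converting anything.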
You can test this, for example, with the single UniProt entry A0A7V5FFT7.
Or you can just use seq/Bacteria/unrev/1.6.5.3.fasta from the repository.
Furthermore, in the very same test case the extraction of the subunits can fail if one of the keywords for detecting subunits is preceded by a single capital letter. For example, if the FASTA header looks like this:
"UniRef50_U3TYP0 NADH dehydrogenase I chain F n=1 Tax=Plautia stali symbiont TaxID=891974 RepID=U3TYP0_9ENTR"
the script will extract "I chain" as the subunit instead of the expected "chain F".
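One possible way around this, sketched in Python rather than R: try the pattern "keyword followed by identifier" first and only fall back to "identifier followed by keyword" if that fails. The keyword list (`subunit`, `chain`), the function name `extract_subunit` and the idea of cutting the header at ` n=` are my assumptions for this sketch; the actual keywords and logic in complex_detection.R may differ.

```python
import re

# Assumed keyword list; the real list in complex_detection.R may differ.
KEYWORDS = r"(?:subunit|chain)"

# Preferred form: "<keyword> <id>", e.g. "chain F" or "subunit 4".
KEYWORD_FIRST = re.compile(rf"\b({KEYWORDS})\s+([A-Z]|\d+)\b")
# Fallback form: "<id> <keyword>", e.g. "F chain".
ID_FIRST = re.compile(rf"\b([A-Z]|\d+)\s+({KEYWORDS})\b")


def extract_subunit(header: str):
    """Return the matched subunit phrase, or None if nothing matches."""
    # Drop the UniRef metadata (n=..., Tax=..., ...) so it cannot match.
    description = header.split(" n=", 1)[0]
    match = KEYWORD_FIRST.search(description) or ID_FIRST.search(description)
    return match.group(0) if match else None


header = (
    "UniRef50_U3TYP0 NADH dehydrogenase I chain F n=1 "
    "Tax=Plautia stali symbiont TaxID=891974 RepID=U3TYP0_9ENTR"
)
print(extract_subunit(header))  # prints: chain F
```

Checking the keyword-first pattern before the fallback is what keeps the "I" of "NADH dehydrogenase I" from being picked up here. It would still return "I chain" for a header that ends in "... I chain" with nothing after the keyword, so this is only a partial fix.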
My current plan is to implement diamond only for the -p all option, where I will try to correct these errors. However, I thought I would report it here, as I am not sure when or whether I will find the time to finish it.