Identical read_len1 and read_len2 reported with different real read lengths in R1&2 pairs #262
Comments
It looks like your use case implies that read lengths for left and right alignments in the pair should have their lengths reported separately. Do you think that would be more expected behavior if there were
There are already read_len1 and read_len2 for each pair, and they are reported separately, and correctly in the case of R1-2 pairs.
Yeah, I think it's because when the pair is present on both R1 and R2, we report all the properties of this pair on read1. It's not only about the read length, but all the properties of the alignment. Reporting all the info about the alignments originating from both read pairs is a logistical nightmare, and I do not see the reason why
pairtools/pairtools/lib/parse.py Lines 1093 to 1105 in 9a0b894
If this is very painful to implement, I understand, no problem. But still I think in principle it would be better/cleaner and less confusing... Off the top of my head I can't imagine a scenario where it actually makes a practical difference, but possibly it exists.
Yes, for now we have to make a choice to report info for only one read if the same pair is present on both. You can always report the readIDs of R1&2 pairs and make a more thorough analysis of these reads.
I have some data with different lengths of read1 (81 nt) and read2 (221 nt) and I want to report the length of the read in the pairs. I add
to parse2, but in the pairs, I think depending on walk pair type, the read length is sometimes reported incorrectly.
In the case of R1-2 pairs both values are reported; in the case of R1 or R2 pairs only the appropriate value is reported. I think this is correct behaviour, but perhaps it should be clarified in the docs?
However, in R1&2 pairs both read lengths are reported as 81 nt, while I would expect both 81 and 221 nt to be reported. Is this intended behaviour?
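For anyone wanting to check this on their own output, here is a minimal sketch that tabulates the (read_len1, read_len2) combinations per walk pair type from a .pairs file. It assumes the file has a `#columns:` header line and that the pair type is in a column named `walk_pair_type`; that column name, the column order and the example records below are assumptions made up for illustration, not real pairtools output.

```python
from collections import Counter, defaultdict


def read_len_by_pair_type(lines, type_col="walk_pair_type"):
    """Count (read_len1, read_len2) combinations for each walk pair type.

    `type_col` is an assumed column name; adjust it to match your header.
    """
    columns = None
    counts = defaultdict(Counter)
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#columns:"):
            # Column names are whitespace-separated after the "#columns:" tag.
            columns = line[len("#columns:"):].split()
            continue
        if not line or line.startswith("#"):
            continue  # skip other header lines and blank lines
        if columns is None:
            raise ValueError("no '#columns:' header line found")
        row = dict(zip(columns, line.split("\t")))
        lengths = (int(row["read_len1"]), int(row["read_len2"]))
        counts[row[type_col]][lengths] += 1
    return counts


# Made-up records mimicking the situation described above (not real data).
example = [
    "## pairs format v1.0\n",
    "#columns: readID chrom1 pos1 chrom2 pos2 strand1 strand2 pair_type "
    "walk_pair_index walk_pair_type read_len1 read_len2\n",
    "r1\tchr1\t100\tchr1\t500\t+\t-\tUU\t1\tR1-2\t81\t221\n",
    "r2\tchr1\t200\tchr2\t900\t+\t+\tUU\t1\tR1&2\t81\t81\n",
]

result = read_len_by_pair_type(example)
for pair_type, lens in result.items():
    print(pair_type, dict(lens))
```

To run it on a real file, pass an open file handle instead of `example` (use `gzip.open(path, "rt")` for compressed output).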