single-end mode is broken in v1.1.0 #247
Comments
I just encountered the same problem. Furthermore, the |
hi folks! |
I am having the same issue! Did you ever figure out what the problem was? If not, I am happy to provide a sample with a subset of the alignment I am using. |
@Salvacasani That would be excellent! Please provide the command that you run on that sample file and a description of what you expect vs. what you observe. |
Unfortunately I'm working with private data, so I'm unable to share anything. |
Here it is, I don't know how many of these should have ligated junctions, but there should be a good bunch. If there are no examples with separate mapping sites in this subset, let me know the best way to get them. I am processing the data in the previous version of pairtools. I will let you know if that works. |
Hi! Have you been able to look into it? I tried to run it with the previous version of pairtools, but I get the same result. |
Hi @Salvacasani, you can check out the latest pull request here: #251. It seems that some API changes didn’t fully align with the old functionality, which caused an issue with read side detection in certain datasets. By the way, in your sample data, most of the reads are NU (unmapped-mapped) pairs, so I wasn’t able to fully test the pair expansion. However, pair expansion should be working now as well.
@Phlya, since you had questions about the read sides in your recent dataset, could you test if any part of your data produces the same results after this fix?
In theory, this issue should have affected both paired-end and single-end data, but it’s only appearing with single-end data, which is a bit puzzling. There might be another underlying issue that remains unresolved. |
Thank you for looking into it. I am still getting very strange results. This is the code that I am using:
To use one specific example, I looked into the results of read 150627-BC36-0251488282, which seems to have two mapping sites in the bam file:
but in the pairs file it shows as unmapped:
150627-BC36-0251488282 ! 0 ! 0 - - NN 0 0 0 0 0 0
Do you know why this may be? Thank you |
Hi @Salvacasani, Hmmm, can you try with
With that version in progress, I receive "correct" results with your command:
Output two pairs, both have some kind of issues (multi-mapper and unmapped segment):
Is that what you would expect for that read? |
Thanks, I can reproduce this result. Since there is a multi-mapper (not sure why, since I don't find it in the alignment), maybe this is not the best example. Let's look at this other example instead: 180786-UGAv3-23-4079533886
This is the bam file:
180786-UGAv3-23-4079533886 0 chr8 310577 42 22S162M57S * 0 0 CTACACGACGCTCTTCCGATCTACTATACCTTCATTCATTAATGTGTCATTTCTTTCAGGAACTATTTTCTGAGTCTCAAACATATTTCATAGCACTCTCAAGCTTGAGGTTCTGCCTGAACATGCTCCTCCTCCTTTTCTTTTGTGACTCTTCATTTCGTATGGAGTTAAACCTGAGCTCCTAAGCTTTGTCTGTTTCCACTTTTGTCAGATTCGATGGCAACAAGATGAATGCGCCCTG 88IIIIIIIIIIIIIIIIIIII;II7IIIIIIIIIIIIIIIIIII:IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIICCIIIIII@DD@I9==9IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII7I7I"III?I?==II'88'&-IIII11II,I++1//I++86,$99I6#I&?&I4 NM:i:0 MD:Z:162 AS:i:162 XS:i:142 SA:Z:chr8,312049,+,191S37M1D13M,48,2; XA:Z:chr1,-848200,57S162M22S,4; MQ:i:48 ip:i:310738 mp:i:312049 ep:i:310555 rt:A:0 cb:Z:chr8_chr8_0_1_0_16_000310738_000312049 RG:Z:HIC13904
This is the pairtools result:
180786-UGAv3-23-4079533886 chr8 310577 ! 0 + - UN 180786-UGAv3-23-40795338860chr83105774222S162M57S00CTACACGACGCTCTTCCGATCTACTATACCTTCATTCATTAATGTGTCATTTCTTTCAGGAACTATTTTCTGAGTCTCAAACATATTTCATAGCACTCTCAAGCTTGAGGTTCTGCCTGAACATGCTCCTCCTCCTTTTCTTTTGTGACTCTTCATTTCGTATGGAGTTAAACCTGAGCTCCTAAGCTTTGTCTGTTTCCACTTTTGTCAGATTCGATGGCAACAAGATGAATGCGCCCTG88IIIIIIIIIIIIIIIIIIII;II7IIIIIIIIIIIIIIIIIII:IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIICCIIIIII@DD@I9==9IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII7I7I"III?I?==II'88'&-IIII11II,I++1//I++86,$99I6#I&?&I4NM:i:0MD:Z:162AS:i:162XS:i:142SA:Z:chr8,312049,+,191S37M1D13M,48,2;XA:Z:chr1,-848200,57S162M22S,4;MQ:i:48ip:i:310738mp:i:312049ep:i:310555rt:A:0cb:Z:chr8_chr8_0_1_0_16_000310738_000312049RG:Z:HIC13904Yt:Z:UN 42 0 310577 0 310738 0
Maybe I misunderstand the alignment, but isn't it indicating that these two are chimeric alignments from the same read?
If that is the case, shouldn't these be pairs and be in the same line of the pairs file as linked fragments? Thank you! Salva |
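The `SA:Z:` field in the record above is what marks the read as chimeric: in the SAM specification, each semicolon-terminated entry of the SA tag describes another alignment of the same read as `rname,pos,strand,CIGAR,mapQ,NM`. Below is a minimal sketch for pulling those entries out of a SAM optional field. The names `SuppAlignment` and `parse_sa_tag` are made up for this illustration and are not part of pairtools.

```python
from typing import List, NamedTuple


class SuppAlignment(NamedTuple):
    """One entry of a SAM 'SA' tag (hypothetical helper type)."""
    rname: str
    pos: int
    strand: str
    cigar: str
    mapq: int
    nm: int


def parse_sa_tag(field: str) -> List[SuppAlignment]:
    """Parse an 'SA:Z:...' optional field into its alignment entries."""
    prefix = "SA:Z:"
    if not field.startswith(prefix):
        raise ValueError(f"not an SA tag: {field!r}")
    alignments = []
    # Entries are ';'-terminated, so the final split element is empty.
    for entry in field[len(prefix):].split(";"):
        if not entry:
            continue
        rname, pos, strand, cigar, mapq, nm = entry.split(",")
        alignments.append(
            SuppAlignment(rname, int(pos), strand, cigar, int(mapq), int(nm))
        )
    return alignments


# The SA field of read 180786-UGAv3-23-4079533886 quoted in the thread.
for hit in parse_sa_tag("SA:Z:chr8,312049,+,191S37M1D13M,48,2;"):
    print(hit.rname, hit.pos, hit.strand, hit.cigar)  # chr8 312049 + 191S37M1D13M
```

For this read, the primary record aligns at chr8:310577 and the SA entry points to a second segment at chr8:312049, which matches the expectation of two linked fragments from one read.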
@Salvacasani, that happens because your bam file is not sorted by read ID, and pairtools parse relies on parts of the same read being arranged next to each other. Simple reordering of your .bam file with:
produces these three nice pairs for the read
I thought that we mentioned in the manual that the input alignments should be sorted by read ID, but I cannot find any mention of that. Attention @golobor, it might be worth adding it. |
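Whether an alignment file satisfies this requirement can be checked before parsing. The sketch below is an assumption-laden illustration, not pairtools code: the hypothetical function `reads_are_grouped` scans SAM text lines and reports whether every read ID occupies a single contiguous block. If it does not, name-sorting the file (for example with `samtools sort -n`, which may or may not be the exact command used in this thread) is one common way to group the records.

```python
from typing import Iterable


def reads_are_grouped(sam_lines: Iterable[str]) -> bool:
    """Return True if all records sharing a read ID are adjacent.

    Header lines (starting with '@') and blank lines are skipped.
    Keeps every read ID seen so far in memory, so this is meant for
    spot checks on subsets rather than very large files.
    """
    seen = set()
    previous = None
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        read_id = line.split("\t", 1)[0]
        if read_id != previous:
            if read_id in seen:
                # This read ID already had a block earlier in the file.
                return False
            seen.add(read_id)
            previous = read_id
    return True


# Toy records: only the first columns matter for this check.
grouped = ["@HD\tVN:1.6\n", "r1\t0\tchr8\n", "r1\t2048\tchr8\n", "r2\t0\tchr1\n"]
interleaved = ["r1\t0\tchr8\n", "r2\t0\tchr1\n", "r1\t2048\tchr8\n"]
print(reads_are_grouped(grouped), reads_are_grouped(interleaved))
```

The function accepts any iterable of lines, so it can also be pointed at `sys.stdin` with the text output of `samtools view` piped in.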
Gotcha, Thank you so much for the help!! |
I tried it with the new version. This is the command I used:
Output example for
Output example for
I can as well provide a small subset bam file. Just tell me where I should send it. |
@Henrikkoe , thanks! That might be due to readID transformation. I believe I have brought it now to the working state in the same branch: Do you mind pulling the most recent version of the branch and checking the output? If the problem persists, you can run the following:
and post the file either as a zip attachment to a GitHub message or through OSF for further debugging. |
Thanks, this seems to be working now. I have a follow up question to the Could I use the
|
@Henrikkoe Good catch! This issue was inherited from earlier versions of |
I recently updated my conda installation of pairtools to v1.1.0 and it appears that the --single-end mode no longer works for single-end reads in pairtools parse2:
Comparing two versions:
pairtools 1.0.3 py310hb45ccb3_0 bioconda
pairtools 1.1.0 py312hac03d35_1 bioconda
Same command:
pairtools parse2 --single-end --nproc-in 12 --nproc-out 32 --min-mapq 1 --flip --add-columns mapq --assembly B73 -c ../ref/B73.chrom_sizes.tsv --output-stats test.tsv -o test.pairs test.input.bam
v1.0.3 gives the expected result, but v1.1.0 reports everything as unmapped:
v1.1.0 results are entirely NN:
read1 ! 0 ! 0 - - NN 0 0
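A quick way to compare the two versions is to tally the pair types in each output file. The following is a minimal sketch, assuming the usual pairtools column order (readID, chrom1, pos1, chrom2, pos2, strand1, strand2, pair_type), so that the pair type is the eighth tab-separated field. `count_pair_types` is a hypothetical helper, not a pairtools function.

```python
from collections import Counter
from typing import Iterable


def count_pair_types(pairs_lines: Iterable[str], pair_type_col: int = 7) -> Counter:
    """Tally the pair_type column of .pairs text lines.

    Assumes tab-separated fields with pair_type at zero-based index 7;
    header lines (starting with '#') and blank lines are skipped.
    """
    counts = Counter()
    for line in pairs_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        counts[fields[pair_type_col]] += 1
    return counts


# The first record mirrors the all-unmapped line reported above;
# the second is an invented mapped pair for contrast.
example = [
    "## pairs format v1.0\n",
    "\t".join(["read1", "!", "0", "!", "0", "-", "-", "NN", "0", "0"]) + "\n",
    "\t".join(["read2", "chr1", "100", "chr1", "500", "+", "-", "UU", "30", "30"]) + "\n",
]
print(count_pair_types(example))
```

An output where `NN` accounts for every record, as reported for v1.1.0, would show up as a counter with a single key.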