[Q] Setting --mindist parameter as 0, but not contain all number of loop(input loop.bedpe) #145

Open
Miracle-Yao opened this issue May 3, 2024 · 7 comments

Comments

@Miracle-Yao

What have you tried?

  • [√ ] Have you read the documentation and tutorials about your question?

Hi, a small question.

The loops in my input bedpe file range in size from 20 kb to 2 Mb. I have set --mindist to 0, so why is the total number of piled-up windows still less than the number of loops? How can I plot a pileup using all the loops?

Thanks.

@Phlya
Member

Phlya commented May 3, 2024

If some of your loops are too close to ends of the chromosomes (so that the snippet with the loop would extend beyond the start/end of the chromosome), they will also be ignored. Maybe that's the reason?
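To check whether this explains the missing windows, one could count how many loops have an anchor closer than the flank to a chromosome end. The following is a minimal sketch, not coolpuppy's own code: the function names, the example loops, and the chromosome size are all hypothetical, and the real tool works on bins, so its exact boundary handling may differ.

```python
def snippet_fits(chrom_size, start, end, flank):
    """True if a window of +/- flank bp around the anchor midpoint
    stays within [0, chrom_size]. Assumed rule, for illustration only."""
    mid = (start + end) // 2
    return mid - flank >= 0 and mid + flank <= chrom_size


def count_edge_loops(loops, chromsizes, flank):
    """Count loops where at least one anchor window would run off the chromosome."""
    dropped = 0
    for chrom1, s1, e1, chrom2, s2, e2 in loops:
        fits = (snippet_fits(chromsizes[chrom1], s1, e1, flank)
                and snippet_fits(chromsizes[chrom2], s2, e2, flank))
        if not fits:
            dropped += 1
    return dropped


# Hypothetical example data (bedpe-like tuples) on a 1 Mb chromosome.
chromsizes = {"chr1": 1_000_000}
loops = [
    ("chr1", 50_000, 60_000, "chr1", 500_000, 510_000),    # left anchor near start
    ("chr1", 400_000, 410_000, "chr1", 980_000, 990_000),  # right anchor near end
    ("chr1", 300_000, 310_000, "chr1", 600_000, 610_000),  # well inside
]

print(count_edge_loops(loops, chromsizes, flank=100_000))
print(count_edge_loops(loops, chromsizes, flank=10_000))
```

If the count is much smaller than the number of missing windows, chromosome ends are probably not the main cause.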

@Miracle-Yao
Author

Wow, thanks for your quick reply.

If I want to compare the loop strengths of the two groups, would choosing the default --mindist (2*pad+2) and one of the BALANCE methods (GW_SCALE, KR, SCALE, VC, VC_SQRT, weight) be the best match?

@Phlya
Member

Phlya commented May 3, 2024

Generally, --mindist can be set to 0 in practice; in most datasets, --ignore-diags 2 alone is good enough to remove very short-range artefacts. If your data can't support that, you'll see some noise in the bottom left corner of the pileup.

Assuming by "weight" you mean the output of cooler balance, it's the safest option. Default filters in cooler remove more artefacts, while juicer is very lenient and a lot of bins with extreme coverage variation are retained.
Also, fyi, when looking into loops, using mapq>30 filtering is quite important.

@jiangshan529

Hi, Phlya. I am using this command: coolpup.py all_50bp.mcool::resolutions/1000 1_0.88.bedpe --outname aa.txt --flank 100000 --n_proc 16 --clr_weight_name "". The number of windows calculated is far smaller than the number of rows in the bedpe file (781 vs 10419), and I don't think more than 9000 loops are near the edges of chromosomes. If I change to --flank 10000, 8000 windows are plotted, but that is still not all the loops in my bedpe file. How should I deal with this? Thanks!

@Phlya
Member

Phlya commented Dec 13, 2024

Have you tried --mindist 0? That should have the biggest impact.
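As a rough illustration of why the default can drop so many loops: assuming the default minimum distance is (2*pad+2) bins, as quoted earlier in this thread, a larger flank raises the threshold considerably. The sketch below is only arithmetic under that assumption. The helper names and loop distances are hypothetical, and whether the tool compares with >= or > is not something this sketch claims to know.

```python
def default_mindist_bp(flank_bp, resolution_bp):
    """Default minimum loop distance in bp, ASSUMING it is (2*pad + 2) bins,
    where pad is the flank expressed in bins."""
    pad_bins = flank_bp // resolution_bp
    return (2 * pad_bins + 2) * resolution_bp


def count_kept(loop_distances_bp, mindist_bp):
    """Count loops whose anchor-to-anchor distance is at least mindist_bp."""
    return sum(1 for d in loop_distances_bp if d >= mindist_bp)


# Hypothetical loop sizes in bp, spanning the 20 kb to 2 Mb range.
distances = [20_000, 50_000, 150_000, 300_000, 2_000_000]

for flank in (100_000, 10_000):
    mindist = default_mindist_bp(flank, resolution_bp=1_000)
    print(flank, mindist, count_kept(distances, mindist))

# With --mindist 0 no loop is removed by the distance filter.
print(count_kept(distances, 0))
```

Under this assumption, a 100 kb flank at 1 kb resolution gives a threshold of 202 kb, while a 10 kb flank gives 22 kb, which would be consistent with far more windows surviving at the smaller flank.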

@jiangshan529

> Have you tried --mindist 0? That should have the biggest impact.

Thanks! Hi, Phlya, do you have any idea why I see blue stripes in the diagonal direction in some APA analyses? This is at 1 kb resolution with a 50 kb window.

image

@Phlya
Member

Phlya commented Jan 10, 2025

This is what can sometimes happen with --mindist 0 and --ignore-diags <2, since the first two diagonals contain artefactual contacts and more noise.
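To make the geometry concrete, here is a small sketch (not coolpuppy's implementation; the function and parameter names are made up for illustration) of which pixels in a single snippet lie on the lowest diagonals of the contact matrix. For a loop whose anchors are d bins apart, the snippet pixel at row r and column c corresponds to a genomic separation of d + c - r bins, so for short loops the main diagonal passes through the window at a position that depends on d.

```python
def low_diagonal_mask(pad, loop_dist_bins, ignore_diags):
    """Boolean grid of size (2*pad+1) x (2*pad+1): True where the pixel's
    genomic separation |d + c - r| is below ignore_diags, i.e. where it
    falls on one of the first `ignore_diags` diagonals."""
    size = 2 * pad + 1
    return [
        [abs(loop_dist_bins + c - r) < ignore_diags for c in range(size)]
        for r in range(size)
    ]


def count_masked(mask):
    """Number of True pixels in the grid."""
    return sum(sum(row) for row in mask)


# A very short loop (1 bin apart): the low diagonals cut through the snippet.
for row in low_diagonal_mask(pad=2, loop_dist_bins=1, ignore_diags=2):
    print("".join("#" if flag else "." for flag in row))

# A long loop (100 bins apart): no pixel is near the main diagonal.
print(count_masked(low_diagonal_mask(pad=2, loop_dist_bins=100, ignore_diags=2)))
```

When many short loops with different distances are averaged without masking these pixels, the noisy low diagonals land at different offsets in each snippet, which is one plausible way to end up with diagonal stripes in the pileup.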
