bcftools norm shifts symbolic <DEL> to position 1 without warning if the END tag is missing from VCF #2216

Open
davmlaw opened this issue Jul 1, 2024 · 2 comments

davmlaw commented Jul 1, 2024

Found while testing the changes in #1919

Leaving off the "END" tag causes <DEL> symbolic alts to shift to position 1 with no warning (DUP are fine).

Sample output line:

NC_000003.11	1	.	N	<DEL>	.	PASS	SVTYPE=DEL;SVLEN=-2666;BCFTOOLS_OLD_VARIANT=NC_000003.11|128204048|G|<DEL>
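A minimal sketch for spotting affected records in the output, assuming the old-record tag value is laid out as chrom|pos|ref|alt, as in the sample line above. The function name `shifted_symbolic_alt` is hypothetical and not part of bcftools:

```python
def shifted_symbolic_alt(vcf_line, tag="BCFTOOLS_OLD_VARIANT"):
    """Return (old_pos, new_pos) if a symbolic ALT was moved, else None."""
    fields = vcf_line.rstrip("\n").split("\t")
    new_pos, alt, info = int(fields[1]), fields[4], fields[7]
    if not (alt.startswith("<") and alt.endswith(">")):
        return None  # only symbolic alleles are of interest here
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == tag:
            # Assumed layout: chrom|pos|ref|alt (any extra fields are ignored).
            old_pos = int(value.split("|")[1])
            return (old_pos, new_pos) if old_pos != new_pos else None
    return None  # tag absent: record unchanged, or tag not requested


line = ("NC_000003.11\t1\t.\tN\t<DEL>\t.\tPASS\t"
        "SVTYPE=DEL;SVLEN=-2666;"
        "BCFTOOLS_OLD_VARIANT=NC_000003.11|128204048|G|<DEL>")
print(shifted_symbolic_alt(line))
```

Running this over every output line and collecting the non-None results would list the records that moved.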

Command:

bcftools norm --fasta-ref=/data/annotation/fasta/GCF_000001405.25_GRCh37.p13_genomic.fna.gz --old-rec-tag=BCFTOOLS_OLD_VARIANT del_normalize_test_no_end.GRCh37.vcf

File: del_normalize_test_no_end.GRCh37.vcf.txt

It is not clear to me from the VCF spec whether the END tag is required for symbolic variants.

"an explicit END INFO field provides variant span information that is otherwise unknown. ... This field is used to compute BCF’s rlen field"

Ideally, it should be possible to derive rlen from SVLEN when END is absent. If the END tag is required, though, it would be better to either:

  • Throw an error, or
  • Give a warning about the missing END tag on the symbolic alt, and skip the record

If it is an error or warning, it would be nice for it to be noted in bcftools view as well. Thanks!
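To make the suggestion concrete, here is a rough sketch of the fallback I have in mind. It is not how bcftools or htslib compute rlen; the function name `reference_length` is made up, and it assumes that for <DEL> the deleted bases start right after the anchor base at POS, so that END = POS + |SVLEN|:

```python
import warnings


def reference_length(pos, ref, alt, info):
    """Return the reference span (rlen) of one record, or None if unknown.

    pos is the 1-based POS; info maps INFO keys to their string values.
    """
    if not (alt.startswith("<") and alt.endswith(">")):
        return len(ref)                    # plain allele: span is the REF string
    if "END" in info:
        return int(info["END"]) - pos + 1  # END is inclusive
    if alt == "<DEL>" and "SVLEN" in info:
        # Assumption: END = POS + |SVLEN|; abs() accepts either sign of SVLEN.
        return abs(int(info["SVLEN"])) + 1
    # Neither END nor a usable SVLEN: warn instead of silently guessing.
    warnings.warn(f"symbolic ALT {alt} at {pos} has neither END nor usable SVLEN")
    return None
```

For the record in this report (POS 128204048, SVLEN=-2666) this gives a span of 2667 bases under the stated assumption. A caller could skip the record when None is returned.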

davmlaw commented Jul 24, 2024

FYI, the END INFO field has been deprecated in VCF 4.5.

davmlaw commented Aug 12, 2024

I think bcftools does the right thing here by using rlen, and that this is instead an htslib issue.
