Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better Handling of Bismark2Report Uninitialized Error due to missing nucleotide key in nucleotide.stats report #711

Open
rhondene opened this issue Nov 4, 2024 · 1 comment

Comments

@rhondene
Copy link

rhondene commented Nov 4, 2024

Hi @FelixKrueger,

There are cases where the nucleotide_stats does not generate the full list of nucleotide headers as expected by the bismark2report script. Below I post the source ode causing a unitialized error and potentially a division by error if nuc_exp=0.

  • I also include the traceback of the error and a screenshot of the input that triggered the error:

For now I can implement my own error handling in the pipeline I am running bismark in.

** relevant bismark2report source code**

foreach my $key (sort {$a cmp $b} keys %nucs){
	foreach my $key ('A','T','C','G','AC','CA','TC','CT','CC','CG','GC','GG','AG','GA','TG','GT','TT','TA','AT','AA'){
	    my $nuc_obs = $nucs{$key}->{obs}->{percent};
	    my $nuc_exp = $nucs{$key}->{exp}->{percent};
	    my $counts_obs = $nucs{$key}->{obs}->{counts};
	    my $counts_exp = $nucs{$key}->{exp}->{counts};
	    my $cov = $nucs{$key}->{obs}->{coverage};

	    # calculating log2 observed/expected
	    my $ratio = $nuc_obs/$nuc_exp;
	    #  my $logratio = sprintf ("%.2f",log($ratio)/log(2));
	    # if (abs($logratio) > $minmax){
	    # $minmax = abs($logratio);
	    # }

ERROR

Traceback of the Error (abbreviated)


`Using the following nucleotide coverage report: > LAMdirect-364-NTG-200-5_S202_R1_trimmed_bismark_bt2_pe.nucleotide_stats.txt <
Processing nucleotide coverage report './LAMdirect-364-NTG-200-5_S202_R1_trimmed_bismark_bt2_pe.nucleotide_stats.txt' ...
Use of uninitialized value $nuc_exp in division (/) at /home/ec2-user/anaconda3/envs/bismark_env/bin/bismark2report line 640, <NUC> line 9.
Use of uninitialized value $nuc_obs in division (/) at /home/ec2-user/anaconda3/envs/bismark_env/bin/bismark2report line 640, <NUC> line 9.
Illegal division by zero at /home/ec2-user/anaconda3/envs/bismark_env/bin/bismark2report line 640, <NUC> line 9.

Traceback (most recent call last):
 ...
  File "/home/ec2-user/SageMaker/NGS-Analysis/amp_biseq/amp_biseq/report_utils.py", line 112, in run_bismark2report
  
subprocess.CalledProcessError: Command 'bismark2report --alignment_report ./LAMdirect-364-NTG-200-5_S202_R1_trimmed_bismark_bt2_PE_report.txt --splitting_report ./methylation_extraction_results/LAMdirect-364-NTG-200-5_S202_R1_trimmed_bismark_bt2_pe_splitting_report.txt --mbias_report ./methylation_extraction_results/LAMdirect-364-NTG-200-5_S202_R1_trimmed_bismark_bt2_pe.M-bias.txt --nucleotide_report ./LAMdirect-364-NTG-200-5_S202_R1_trimmed_bismark_bt2_pe.nucleotide_stats.txt --dir ./MY _Amplicon_1' returned non-zero exit status 255.

Data nucleotide_stats table that triggered the error:

image
@FelixKrueger
Copy link
Owner

FelixKrueger commented Nov 12, 2024

I have just tried to handle division by 0 better, can you please clone the Bismark dev branch and test it? Alternatively, would you mind sending the files you used in the example above so I can test it over here? Many thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants