enhance -i performance #32

Merged
merged 1 commit into from
May 18, 2024


JMencius
Contributor

@JMencius JMencius commented May 18, 2024

Hi @wdecoster, in the last version I submitted, I observed a significant performance drop when using -i or --input with .gz files. I made some modifications to the code to improve --input performance. Briefly:

  1. Use a different flate2 backend to achieve the best performance, as described in https://github.com/rust-lang/flate2-rs#Backends
  2. Add a 512 KB buffer.

The performance is shown below:

| Data | File size | Command | Version | Run time |
| --- | --- | --- | --- | --- |
| DM.fastq.gz | 21G | `gunzip -c DM.fastq.gz \| ./chopper -q 10 -l 500 > test.fastq` | Old version (0.8.0) | 658 s |
| DM.fastq.gz | 21G | `./chopper -i DM.fastq.gz -q 10 -l 500 > test.fastq` | Old version (0.8.0) | 3060 s |
| DM.fastq.gz | 21G | `./chopper -i DM.fastq.gz -q 10 -l 500 > test.fastq` | Current pull request version | 759 s |

This is still slower than piping through system-level gunzip (759 s vs. 658 s), but close.
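
For readers who want to see what the buffering change might look like, here is a minimal sketch using only the Rust standard library. It is not the code from this pull request: the names `INPUT_BUFFER_SIZE`, `buffered_input`, and `count_fastq_records` are hypothetical, "512 k" is assumed to mean 512 KiB, and wrapping the decoded stream is only one place the buffer could go. The flate2 backend is chosen through Cargo features rather than in code, so it does not appear here.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Buffer size mentioned in the pull request ("512 k"), assumed here to be 512 KiB.
const INPUT_BUFFER_SIZE: usize = 512 * 1024;

/// Wrap any byte source in a large read buffer.
///
/// In chopper the source would presumably be a gzip decoder from the flate2
/// crate (for example `flate2::read::MultiGzDecoder::new(file)`); that part
/// is left out so this sketch depends only on the standard library.
pub fn buffered_input<R: Read>(source: R) -> BufReader<R> {
    BufReader::with_capacity(INPUT_BUFFER_SIZE, source)
}

/// Count FASTQ records (four lines each) read through the buffered reader.
pub fn count_fastq_records<R: Read>(source: R) -> io::Result<usize> {
    let reader = buffered_input(source);
    let mut lines = 0usize;
    for line in reader.lines() {
        line?; // propagate any I/O or UTF-8 error
        lines += 1;
    }
    Ok(lines / 4)
}
```

A larger buffer reduces the number of small reads issued against the underlying source, which is the usual motivation for a change like item 2 above.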

@wdecoster wdecoster merged commit 98e0dc9 into wdecoster:master May 18, 2024
1 check passed
@wdecoster
Owner

Awesome!
