V0.3.3 #37

Merged
merged 5 commits into main from v0.3.3
Sep 17, 2024


JakeWnuk
Owner

  • Implemented a 4 GB read buffer for -f file reading. When -f is used, a single 4 GB buffer is allocated and reused while loading large files.
    • This is overkill, but it yields large time savings when processing files. If the buffer is underused, it is reallocated. This may affect users with little RAM installed, but only -f allocates this buffer, and I would expect users to have at least ~8 GB.
  • The max rule size must now be less than 93 rather than less than or equal to 93.
  • Added the new transformation mode regram. This mode takes in sentences, similar to -u, and "regrams" them into strings with a given number of words.
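
To illustrate the regram idea, here is a minimal Python sketch. It assumes "regramming" means emitting every run of n consecutive words from a sentence; the PR does not spell out the exact behavior, so the function name `regram`, its parameters, and the sliding-window interpretation are all hypothetical and not taken from the tool's source.

```python
def regram(sentence: str, n: int) -> list[str]:
    """Return every run of n consecutive words in sentence (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    words = sentence.split()
    # Assumption: a sentence with fewer than n words produces no output.
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


print(regram("the quick brown fox", 2))
# → ['the quick', 'quick brown', 'brown fox']
```

The real mode may differ in how it handles punctuation, casing, or sentences shorter than the requested word count.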

- change the 93 rule limit to be exclusive rather than inclusive
- revert prior to making fs loading changes to instead investigate extending the bypass flag for stdin
testing a read buffer implementation to increase the reading speed of large files and to see how memory use could be optimized
after testing buffer sizes between 1 GB and 5 GB, there doesn't seem to be much difference past an estimated 2 GB
It seems the unused buffer is freed fairly quickly, so a larger buffer only helps with large files. This implementation is faster than the original in all cases; what remains is fine-tuning the default buffer size to either 2 GB or 4 GB.

Leaning towards 4 GB because some runs were almost 30 seconds faster than with the 2 GB buffer, and I would expect users to run -f on a system with at least 8 GB of RAM.
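
The reusable read buffer described above can be sketched roughly as follows. This is a Python illustration of the general technique, not the code in this PR: `iter_lines` and `buffer_size` are hypothetical names, and capping the buffer at the file size is an assumption standing in for the "reallocated if underused" behavior.

```python
import os


def iter_lines(path, buffer_size=4 * 1024**3):
    """Yield lines from path, reusing one preallocated buffer for every read."""
    # Assumption: never reserve more than the file needs, so small files stay cheap.
    size = max(1, min(buffer_size, os.path.getsize(path)))
    buf = bytearray(size)
    view = memoryview(buf)
    leftover = b""
    # buffering=0 gives a raw file object, so readinto fills our buffer directly.
    with open(path, "rb", buffering=0) as f:
        while True:
            n = f.readinto(buf)
            if not n:
                break
            # Prepend the partial line carried over from the previous chunk.
            lines = (leftover + bytes(view[:n])).split(b"\n")
            leftover = lines.pop()
            for line in lines:
                yield line.decode("utf-8", errors="replace")
    # The final line may lack a trailing newline.
    if leftover:
        yield leftover.decode("utf-8", errors="replace")
```

Calling `iter_lines("wordlist.txt")` on a hypothetical file would stream its lines while issuing far fewer read calls than a small default buffer would, which is the trade-off between memory and speed that the 2 GB versus 4 GB comparison is tuning.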
added a new mode called regram
@JakeWnuk JakeWnuk merged commit 02c0eba into main Sep 17, 2024
1 check failed
@JakeWnuk JakeWnuk deleted the v0.3.3 branch September 17, 2024 13:23