Multithread option is much slower than simply using GNU parallel to spawn multiple jpegoptim processes #122
Replies: 5 comments
-
Testing on a 12-core / 24-thread AMD64 desktop system:
Single-thread test:
12-thread test (69% time reduction):
24-thread test (67% time reduction):
24-thread GNU parallel test (93% time reduction):
-
Summary:
On a 4-core ARM VPS, using GNU parallel to spawn multiple jpegoptim processes reduces execution time by 70-73%, whereas jpegoptim's new built-in multithread support only reduces execution time by 55-60%.
On a 12-core / 24-thread AMD desktop, using GNU parallel to spawn multiple jpegoptim processes reduces execution time by 93%, whereas jpegoptim's new built-in multithread support only reduces execution time by 67-69%.
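For reference, the "expected" figures in this thread follow from perfect scaling: with N workers the best possible time reduction is 1 - 1/N. A minimal sketch of that arithmetic is below; `ideal_reduction` and `reduction` are hypothetical helper names, and the timings passed to `reduction` are placeholders, not measurements from these tests.

```shell
#!/bin/sh
# Best-case time reduction (percent) with N workers, assuming perfect scaling.
ideal_reduction() {
    awk -v n="$1" 'BEGIN { printf "%.1f\n", 100 * (1 - 1 / n) }'
}

# Observed time reduction (percent): $1 = baseline seconds, $2 = parallel seconds.
reduction() {
    awk -v base="$1" -v par="$2" 'BEGIN { printf "%.1f\n", 100 * (1 - par / base) }'
}

# Worker counts used in this thread: 4 (ARM VPS), 12 and 24 (AMD desktop).
for n in 4 12 24; do
    echo "$n workers: at most $(ideal_reduction "$n")% reduction"
done

# Placeholder timings (not from the thread): 100 s baseline, 40 s parallel.
echo "example: $(reduction 100 40)% reduction"
```

By this measure the GNU parallel results reported here sit close to the ceiling for each machine, while the built-in multithreading leaves a visible gap.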
-
@catharsis71, interesting findings, thanks! Roughly how many images did you use to run these tests? These are probably rather small JPEGs (do you know the average size?). I would guess this is due to the fact that with the -w option, jpegoptim will fork a new process for each image, so processing a large number of small images likely yields pretty high "overhead". It would be interesting to see results when parallel was run with option
-
The files are various sizes but capped at 2MB post-optimization. I'll do some more testing on the 4-core system using a directory with 1861 JPGs totaling 329MB, so somewhere around 180KB per file.
So the
But parallel pulls ahead using even a small -n value of 5, and continues getting faster up to around -n 100, then slows down a bit if you go beyond that. If all files required an equal time to process, parallel would probably be fastest using n = 466 (total number of files divided by 4, rounding up), so that only 4 total processes would need to be spawned. In reality you end up with unequal workloads, and one process will take longer to finish than the others.
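The batch arithmetic above can be sketched in shell. This is only an illustration: `ceil_div` is a hypothetical helper, and the counts (1861 files, 4 cores) are the ones quoted in this comment.

```shell
#!/bin/sh
# Ceiling division: how many groups of size $2 are needed to cover $1 items.
ceil_div() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

files=1861   # number of JPGs in the test directory
cores=4      # cores on the ARM VPS

# Batch size that would give exactly one jpegoptim process per core.
echo "one process per core needs -n $(ceil_div "$files" "$cores")"

# Number of jpegoptim processes spawned in total for a few -n values.
for n in 1 5 100 466; do
    echo "-n $n spawns $(ceil_div "$files" "$n") processes"
done
```

A smaller -n means more process startups but finer-grained work units, which is why a middle value can beat both extremes when file sizes vary.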
-
I guess could always implement something like the
-
My normal use cases for jpegoptim involve using
find | parallel jpegoptim
utilizing GNU parallel to spawn multiple jpegoptim processes. I tried out the new multithreading in 1.5.0; however, it seems much slower than what you get with parallel.
Testing on a 4-core ARM VPS:
Single-core performance test:
4-core performance test (only a 60% time reduction when I'd expect a ~75% time reduction):
4-core test using GNU parallel to spawn multiple jpegoptim processes (72% time reduction versus single core test):
Using a larger set of files:
Single-core:
4-core (only a 58% time reduction, expected ~75%):
4-core using GNU parallel (70% time reduction versus single core test):
Using an even larger set of files:
Single-core:
4-core (only a 55% time reduction, expected ~75%):
4-core using GNU parallel (73% time reduction versus single core test):
Note: I don't explicitly tell parallel how many threads to use because it automatically matches the number of system cores (4). I do use "-n 100" because spawning a jpegoptim process for every file is wasteful, and 1 process per 100 files is much better. So parallel spawns 4 jpegoptim processes each processing 100 files, and when each batch finishes it starts another until all are done.
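A minimal sketch of the batching approach described in the note, assuming GNU parallel and jpegoptim are installed. The function name `optimize_jpegs`, the file patterns, and the null-delimited `find` output are illustrative choices, not the exact command used for the tests above.

```shell
#!/bin/sh
# Optimize every JPEG under a directory, handing files to jpegoptim in batches.
#   $1 = directory to scan (default: current directory)
#   $2 = files per jpegoptim process, passed to parallel's -n (default: 100)
optimize_jpegs() {
    dir=${1:-.}
    batch=${2:-100}
    # -print0 / -0 keep filenames with spaces or newlines intact.
    # No -j is given, so parallel picks its default job count.
    find "$dir" -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) -print0 |
        parallel -0 -n "$batch" jpegoptim
}
```

Calling `optimize_jpegs ~/photos 100` would then run one jpegoptim process per 100 files, with a new batch starting as each one finishes. The directory path here is just an example.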