ERROR Detected
See https://farm.openzim.org/pipeline/e7ddd2d3-eae7-43fc-94d7-0aa4b1e77d04

Source of the problem seems to be:

```
[mindtouch2zim::Thread-10 (worker)::2025-01-09 18:15:57,141] WARNING:Exception while processing asset from https://bio.libretexts.org/@api/deki/files/82600/Emissions-by-sector-%2525E2%252580%252593-pie-charts.svg?revision=1 used by page ID 110266 (https://bio.libretexts.org/Bookshelves/Biochemistry/Fundamentals_of_Biochemistry_(Jakubowski_and_Flatt)/Unit_IV_-_Special_Topics/32%3A_Biochemistry_and_Climate_Change/32.18%3A__Part_4_-_Turning_Trees_into_Plexiglass%3A_Synthetic_Biology_For_Production_of_Green_Foods_and_Products): Asynchronous error: N3zim29IncoherentImplementationErrorE
Declared provider's size (3735174) is not equal to total size returned by feed() calls (0).
```
I will restart the recipe on the same worker and see if the issue happens again.

Note that the issue looks transient: all assets fail to be added, then it works again, then it fails again, ...

We should probably catch "Creator is in error state." exceptions and fail the scrape on them, since continuing will only produce a broken ZIM.
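A minimal sketch of that proposal, under assumptions: the scraper adds assets through a zimscraperlib-style Creator.add_item_for() helper, and the per-asset error handling currently keeps going after a failure; process_asset and CreatorBrokenError are hypothetical names, not mindtouch2zim's actual API.

```python
# Hedged sketch, not mindtouch2zim's actual code: once libzim reports that the
# Creator is in error state, stop the scrape instead of failing asset after asset.

class CreatorBrokenError(RuntimeError):
    """Hypothetical fatal error: the Creator can no longer accept content."""


def process_asset(creator, path: str, content: bytes) -> None:
    try:
        # Assumption: a zimscraperlib-style helper that adds one entry to the ZIM.
        creator.add_item_for(path=path, content=content)
    except RuntimeError as exc:
        if "Creator is in error state" in str(exc):
            # Every further add_item_for() call would fail too and the ZIM would
            # only come out broken, so escalate to a scrape-level failure.
            raise CreatorBrokenError(str(exc)) from exc
        raise  # other RuntimeErrors are unexpected and must not be swallowed either
```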
> Declared provider's size (3735174) is not equal to total size returned by feed() calls (0).
Those are nasty ones. Let's get in touch, because this may or may not be a regression in scraperlib.
> We should probably catch "Creator is in error state." exceptions
Do you mean that it currently ignores all exceptions? That sounds like a bad idea. C-originated exceptions are all RuntimeError with different text. RuntimeError should not be ignored at all IMO.
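To illustrate the concern (a hypothetical per-asset handler, not the scraper's actual code): catching what you expect is fine, but any blanket handler around the add call would also swallow libzim's C-originated RuntimeErrors, whose only distinguishing feature is their message text.

```python
import logging

import requests  # assumption: assets are fetched over HTTP

logger = logging.getLogger(__name__)


def process_asset(creator, url: str, path: str) -> None:
    try:
        content = requests.get(url, timeout=30).content
    except requests.RequestException:
        # A failed download of one asset is recoverable: log it and move on.
        logger.warning("Could not fetch %s, skipping", url)
        return
    # Deliberately no handler here: libzim's C-originated errors all surface as
    # RuntimeError (only the text differs), so they should propagate and fail the
    # scrape rather than be silently ignored.
    creator.add_item_for(path=path, content=content)
```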
Given the libzim message, I suspect the race condition might be that Python is freeing the bytes faster than libzim is consuming them (see the sketch after this comment). But I have no clue how that can happen or what we should do about it.

@rgaudin does this remind you of something? Did we have a recent change around this in python-libzim or python-scraperlib? Regarding bytes and path manipulation, I recall changes around stream_file in scraperlib, but this is something totally different.
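A minimal sketch of the suspected lifetime problem, assuming the python-libzim writer interface (a ContentProvider subclass implementing get_size() and a gen_blob() generator yielding Blob objects); both provider classes here are hypothetical illustrations, not the providers scraperlib actually ships, and whether this is really the cause is unconfirmed.

```python
from io import BytesIO

from libzim.writer import Blob, ContentProvider


class RiskyProvider(ContentProvider):
    """Hypothetical provider: the Blob wraps a temporary bytes object."""

    def __init__(self, fileobj: BytesIO, size: int):
        super().__init__()
        self.fileobj = fileobj
        self.size = size

    def get_size(self) -> int:
        return self.size

    def gen_blob(self):
        # If nothing on the Python side keeps these bytes alive, they could in
        # principle be collected before libzim is done reading the Blob, which
        # would match feed() delivering 0 bytes against a declared non-zero size.
        yield Blob(self.fileobj.getvalue())


class PinnedProvider(ContentProvider):
    """Same idea, but the buffer is held on self for the provider's lifetime."""

    def __init__(self, fileobj: BytesIO, size: int):
        super().__init__()
        self.data = fileobj.getvalue()  # strong reference until the provider is dropped
        self.size = size

    def get_size(self) -> int:
        return self.size

    def gen_blob(self):
        yield Blob(self.data)
```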