Recently there was issue #121, for which a batch read workaround was implemented. I am now experiencing what I believe to be the same or a similar issue, but this time while using JSON instead of msgpack. Basically, when I do `for item in job.items.iter(..., count=X, ...):` and there are long intervals during iteration, the count can end up being ignored. I was able to reproduce it with the following snippet:
With the sleep part removed, the WTF section does not fire and the iterator stops on the 168012/276/1/9999th item.
This seems to be more of a ScrapyCloud API platform problem, but I am reporting it here to track it nonetheless.
For now I am assuming resource/collections iteration is not robust if any client-side delays are possible during retrieval (I haven't tested any other potential issues). As a habit, I will try either preloading everything at once (`.list()`) or using `.list_iter()` when it makes sense.
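Separately from the `.list()` / `.list_iter()` workarounds, the count can also be enforced on the client side, so that a slow consumer never sees more items than it asked for. The sketch below is only an illustration of that idea using `itertools.islice`; it is not part of the library. `capped_iter` and `fake_items` are hypothetical names, and `fake_items` merely stands in for an iterator such as `job.items.iter(...)` that keeps yielding past the requested count.

```python
import itertools
import time


def capped_iter(source, count):
    """Yield at most `count` items from `source`, regardless of how many it produces.

    Hypothetical helper: a client-side guard, not a fix for the server behaviour.
    """
    return itertools.islice(source, count)


def fake_items(total):
    """Stand-in for an items iterator that ignores the requested count (assumption)."""
    for i in range(total):
        yield {"index": i}


if __name__ == "__main__":
    seen = 0
    # The source would yield 10 items; the guard stops after 3.
    for item in capped_iter(fake_items(10), count=3):
        time.sleep(0.01)  # simulate slow client-side processing between items
        seen += 1
    print(seen)
```

With a real job, the same wrapper would be applied around the iterator returned by `job.items.iter(...)`, passing the same count to both. This only hides the symptom on the client; the extra items may still be fetched over the network.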