[FR] Download Rate Limiting #401

Closed
fritolays opened this issue Oct 1, 2024 · 12 comments · Fixed by #559

@fritolays

How can I set download sleep intervals to avoid rate limits?

yt-dlp has:

--sleep-requests SECONDS        Number of seconds to sleep between requests
                                during data extraction
--sleep-interval SECONDS        Number of seconds to sleep before each
                                download. This is the minimum time to sleep
                                when used along with --max-sleep-interval
                                (Alias: --min-sleep-interval)
--max-sleep-interval SECONDS    Maximum number of seconds to sleep. Can only
                                be used along with --min-sleep-interval
--sleep-subtitles SECONDS       Number of seconds to sleep before each
                                subtitle download

Could I set these in yt-dlp-configs/base-config.txt?
Or would this not work as expected across multiple sources in Pinchflat?

Otherwise, could global sleep-requests and min/max sleep-interval settings go in Config -> Settings?
That seems better than per Media Profile, but your call.

@kieraneglin
Owner

Hey there! Thanks for the report (:

Those flags won't do anything since Pinchflat (intentionally) handles the download queue itself. A global rate limit isn't something that's easy to implement, so before I look into that can you provide me with some more info? Are you actively being rate limited or is this just a precaution? Have you tried setting the YT_DLP_WORKER_CONCURRENCY to 1?

@fritolays
Author

Back from work.

A download got stuck with the "prove you're not a bot" message from YT. I did not notice for a while and my VPN IP got banned. No biggie, I just cycle to a new one.

However, Pinchflat seems to just keep hammering regardless of the error yt-dlp gives. A max download retries setting would be handy here. Also, maybe display the actual error encountered somewhere in the GUI? Like a history of commands for each source.

As for rate limiting, I have compiled a few playlists that have 500+ items and suspect that downloading them in one go will likely burn another IP. I would like to add some element of randomness to the download pattern. So even if it's not global but source-specific, that would be nice.

As a sidenote, will you add a poToken option?
https://github.com/yt-dlp/yt-dlp/wiki/Extractors#po-token-guide

@kieraneglin
Owner

> However, Pinchflat seems to just keep hammering regardless of the error yt-dlp gives. A max download retries setting would be handy here. Also, maybe display the actual error encountered somewhere in the GUI? Like a history of commands for each source.

Fair point! There is already a list of errors from yt-dlp which the app considers non-recoverable and won't try again - I'll look into updating that list to handle the "prove you're not a robot" message

> As for rate limiting, I have compiled a few playlists that have 500+ items and suspect that downloading them in one go will likely burn another IP. I would like to add some element of randomness to the download pattern. So even if it's not global but source-specific, that would be nice.

Another fair point! This is a really hard thing to architect since each download is (intentionally) independent and I don't know ahead-of-time how long they'll take so I can't preemptively schedule them in a staggered fashion. There are other ways to achieve this that I'll look into

(note to self: look into an Elixir Registry or Agent initialized in the main app's supervision tree)
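To make the shared-state idea concrete, here is a minimal sketch in Python rather than Elixir. It is only an illustration of the concept, not Pinchflat code: `StartGate` and its methods are hypothetical names, and it assumes the workers run as threads in one process. Each worker asks a single shared object for permission before starting, so downloads stay independent but their start times get spaced out.

```python
import threading
import time


class StartGate:
    """Shared gate that enforces a minimum gap between download starts.

    Hypothetical helper for illustration only; not part of Pinchflat or yt-dlp.
    """

    def __init__(self, min_gap_seconds, clock=time.monotonic, sleep=time.sleep):
        self._min_gap = min_gap_seconds
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = None  # earliest time the next download may start

    def wait_for_turn(self):
        """Reserve the next start slot, wait for it, and return the seconds waited."""
        # Reserve the slot while holding the lock so two workers never share one.
        with self._lock:
            now = self._clock()
            if self._next_allowed is None:
                start_at = now
            else:
                start_at = max(now, self._next_allowed)
            self._next_allowed = start_at + self._min_gap
        # Sleep outside the lock so other workers can reserve their own slots.
        delay = start_at - now
        if delay > 0:
            self._sleep(delay)
        return delay
```

A worker would call `gate.wait_for_turn()` right before launching its download. No worker needs to know how long any other download takes, only when the last start slot was handed out.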

> As a sidenote, will you add a poToken option?

I'll check with the yt-dlp team to see if this is the official solution yet. I know there's been a LOT of discussion around it but I'm not aware if it's finally crossed over to being their recommendation. I'll dig in!

@fritolays
Author

> Another fair point! This is a really hard thing to architect since each download is (intentionally) independent and I don't know ahead-of-time how long they'll take so I can't preemptively schedule them in a staggered fashion. There are other ways to achieve this that I'll look into

So you don't know when they will end, but could you add a random sleep before each download starts?
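As a rough sketch of what that could look like (in Python, with hypothetical function names; this is not how Pinchflat or yt-dlp is implemented), a worker could pick a random delay between a minimum and a maximum and sleep before starting. The min/max pairing mirrors the `--sleep-interval` and `--max-sleep-interval` descriptions quoted earlier in the thread.

```python
import random
import time


def pick_sleep_seconds(min_seconds, max_seconds=None, rng=random):
    """Return a delay: fixed if only a minimum is given, random if both bounds are."""
    if max_seconds is None:
        return float(min_seconds)
    if max_seconds < min_seconds:
        raise ValueError("max_seconds must be >= min_seconds")
    return rng.uniform(min_seconds, max_seconds)


def sleep_then_download(download, min_seconds, max_seconds=None, sleep=time.sleep):
    """Sleep for the chosen delay, then run the download callable.

    `download` is a placeholder for whatever actually starts the download.
    """
    delay = pick_sleep_seconds(min_seconds, max_seconds)
    sleep(delay)
    return download()
```

Because the delay is chosen at start time, nothing needs to be known about how long other downloads will run.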

> I'll check with the yt-dlp team to see if this is the official solution yet. I know there's been a LOT of discussion around it but I'm not aware if it's finally crossed over to being their recommendation. I'll dig in!

Ah sorry, no rush or need to pressure on this one. I understand it is an evolving situation.

@kieraneglin
Owner

Thank you for the update! I just wanted to let you know that I'll be (mostly) out of cell service for just over a week starting tomorrow. But I'll keep thinking about this and see what I can come up with once I'm back

@stratus-ss

I just wanted to touch base on this. Given the challenges with YouTube, I had to drop back to using the cookie method mentioned in the thread. I thought it might be useful to have a rate limit. I am using a residential IP but still running into problems.

Maybe download rate limits would help.

@kieraneglin
Owner

kieraneglin commented Jan 11, 2025

I have an attempted fix for this in #559. I really recommend reading through that but here's the TL;DR:

  • Adds rate limiting through yt-dlp's built in mechanisms
  • Prevents the "Sign in to confirm..." error from retrying the download. That download will be dropped until the next indexing pass
  • Stops indexing from creating/updating media items if the app detects it's being throttled
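For readers who want a picture of the "don't retry" rule in the second bullet, here is a small Python sketch. It is an assumption-laden illustration, not the code from #559: the names are hypothetical, and the only marker used is the "Sign in to confirm" prefix quoted above, since the full message text is not given in this thread.

```python
# Hypothetical list of substrings that mark an error as not worth retrying.
# Only the message prefix quoted in this thread is included.
NON_RETRYABLE_MARKERS = (
    "sign in to confirm",
)


def should_retry(error_output):
    """Return False when the error output matches a known non-retryable marker."""
    text = error_output.lower()
    return not any(marker in text for marker in NON_RETRYABLE_MARKERS)
```

A download whose error fails this check would be dropped rather than re-queued, and picked up again on a later indexing pass.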

These changes haven't been released yet since I need to do some testing. If you want to help me test, here's how:

  • Switch your Docker tag from latest to issue-549
  • In the app, go to the settings page and set your Sleep Interval to some non-zero value. I don't know the optimal value here so I'd start with ~5s and go up or down from there. If you find a value that works for you, please share it!
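When picking a value, it may help to estimate how much waiting a given interval adds. The sketch below is back-of-the-envelope arithmetic only, with a hypothetical function name. It assumes the sleep is applied once before every download and that downloads run one at a time, which may not match how the app behaves with higher concurrency.

```python
def added_wait_minutes(item_count, sleep_seconds):
    """Total minutes spent sleeping if each item waits sleep_seconds once."""
    return item_count * sleep_seconds / 60


# A 500-item playlist (the size mentioned earlier in this thread) at 5s per item:
print(round(added_wait_minutes(500, 5), 1))  # 41.7
```

Under those assumptions, a 5s interval adds roughly 42 minutes of waiting to a 500-item playlist, on top of the download time itself.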

Thank you!

@stratus-ss

I pulled down issue-549 and I set it to 5s.

Thanks for looking into this. I haven't really put it through significant testing yet, however.

I am currently passing in my cookies file due to the rampant "you need to log in" thing that Google has been tossing around. I just wanted to note that in case it has an impact.

Is there anything specific I should keep an eye out for?

@kieraneglin
Owner

kieraneglin commented Jan 11, 2025

Thank you! I'm mostly looking for two things:

  • Can you download media without getting throttled? These changes won't actually change anything if YouTube is throttling your IP from the outset, so what I am interested in is the transition from "can download" to "can't download"
  • (less important) If you are currently throttled, confirmation that new media items aren't created or updated would be nice! It won't do anything for past media items until it can start scanning without error again, but preventing creation of new media items is a win

@stratus-ss

For reference, I have added 2 new sources: one with cookies, one without. I set each of them to the last 60 days. I will monitor and report back.

@stratus-ss

So after more than 24 hours, the source that used cookies completed its task. The source that didn't use cookies has not downloaded anything. In fact, it hasn't even put any of the episodes into pending.

Are you interested in the logs from the running container? Further to this, aside from dumping the Docker logs to a file, is there a way to export logs from the UI?

@kieraneglin
Owner

@stratus-ss Thanks for the help testing! That's all the info I need for now, but I'll let you know if anything changes 🤙
