[Enhancement] Add rate limiting to yt-dlp requests; prevent saving Media Items when throttled by YouTube #559

Merged: 7 commits, Jan 14, 2025

@kieraneglin (Owner) commented Jan 11, 2025

What's new?

  • Adds rate limiting to all yt-dlp operations (except during Source creation). See "On rate limiting" below
  • Adds the "Sign in to confirm you're not a bot" error to the list of errors that shouldn't result in a job retry. This means the app will drop these download jobs until the next indexing pass
  • Ensures Media Items won't save or update if it appears that our indexing attempt is being throttled
    • This is annoying since yt-dlp doesn't raise an outright error when I'm just indexing, so I have to check for throttling manually rather than rely on error handling
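
To make the second and third bullets concrete, here is a minimal sketch of how such checks could look. It is not the PR's actual code: the function names are hypothetical, and the only string taken from the description above is the bot-check message. How the app really detects throttling during indexing isn't stated here, so reusing the same marker for that check is an assumption.

```python
# Hypothetical sketch; only the bot-check message comes from the PR description.
NON_RETRYABLE_MARKERS = (
    "sign in to confirm you're not a bot",
)


def _normalize(text):
    # Treat the typographic apostrophe like a plain one and ignore case
    return text.replace("\u2019", "'").lower()


def is_non_retryable(error_output):
    """Return True when the output contains a marker that a retry will not fix."""
    normalized = _normalize(error_output)
    return any(marker in normalized for marker in NON_RETRYABLE_MARKERS)


def should_retry(error_output):
    """Download jobs with a non-retryable error wait for the next indexing pass."""
    return not is_non_retryable(error_output)


def should_save_media_item(indexing_output):
    # Assumption: throttling during indexing is visible in the command output
    # even though the command itself did not fail.
    return not is_non_retryable(indexing_output)
```

The point of the explicit `should_save_media_item` check is that it runs on output from a command that reported success, which is the case error handling alone would miss.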

On rate limiting

This isn't the approach I wanted to use, but it seems to work well enough. I tried several other things first with no luck:

  • Creating a plugin for our job runner to limit attempts per interval, per queue
    • Hard. Also, so many yt-dlp operations happen within a single job that it actually didn't help as much as I had hoped
  • Adding a sleep to the end of each yt-dlp-related job
    • Easy, but it has the same pitfalls as above and performs even worse since it doesn't account for concurrency
  • Keeping a global tracker of yt-dlp operations that I can use to delay subsequent commands
    • Hard, but more effective than the other approaches listed so far. Abandoned since this fix is time-sensitive and I was having trouble getting it just right
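
For reference, the third idea (a global tracker that delays subsequent commands) could be sketched roughly as below. This is an illustration only, not the abandoned implementation: the class name, the `min_interval` parameter, and the injectable `clock` and `sleep` callables are all hypothetical.

```python
import threading
import time


class GlobalOperationTracker:
    """Spaces out operations so consecutive ones start at least min_interval seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def wait_for_turn(self):
        # Reserve the next start slot under the lock, then sleep outside it
        # so other callers can queue up behind this one.
        with self._lock:
            now = self._clock()
            start_at = max(now, self._next_allowed)
            self._next_allowed = start_at + self.min_interval
        delay = start_at - now
        if delay > 0:
            self._sleep(delay)
        return delay
```

A caller would invoke `wait_for_turn()` right before launching each yt-dlp command. Unlike a sleep at the end of each job, this does account for concurrent callers, but it only spaces out whole commands, not the individual HTTP requests inside one command.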

What I ended up doing is integrating yt-dlp's sleep-related arguments into every command (provided the user sets a sleep interval). This still doesn't account for job runner concurrency, but once I started profiling requests, it became clear that FAR more HTTP requests happen in a single yt-dlp command than I had expected. If the metric that matters is sheer volume of requests, then this appears to be the most impactful approach by far.
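
As a rough illustration of that approach, the sketch below builds a yt-dlp argument list that only includes sleep options when an interval is set. The flags `--sleep-requests`, `--sleep-interval`, and `--sleep-subtitles` are real yt-dlp options, but which of them this PR passes, and with what values, is an assumption; the function names are hypothetical.

```python
def sleep_args(sleep_interval_seconds):
    """Return yt-dlp sleep flags, or an empty list when no interval is configured."""
    if not sleep_interval_seconds or sleep_interval_seconds <= 0:
        return []
    value = str(sleep_interval_seconds)
    return [
        "--sleep-requests", value,   # pause between requests during data extraction
        "--sleep-interval", value,   # pause before each download
        "--sleep-subtitles", value,  # pause before each subtitle download
    ]


def build_command(url, sleep_interval_seconds=None):
    """Assemble a yt-dlp command line as a list suitable for subprocess.run."""
    return ["yt-dlp", *sleep_args(sleep_interval_seconds), url]
```

For example, `build_command("https://example.com/video", 5)` yields a command with all three sleep flags set to `5`, while omitting the interval leaves the command unchanged.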

It's not clear to me whether this is the only change I'll need to make or if I'll have to also integrate some other rate limiting approach, but it's a start!

⚠️ README ⚠️

This approach sleeps between every request, which will massively slow down your yt-dlp operations, especially if you're downloading dozens of subtitle languages per video. Start conservatively by setting your Sleep Interval to around 5 seconds, then ramp up or down from there depending on how that goes.
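
To get a feel for the slowdown, here is a back-of-the-envelope estimate. The request count is a made-up number for illustration, and the formula is a rough upper bound that assumes one sleep per request.

```python
def added_sleep_seconds(request_count, sleep_interval_seconds):
    """Rough extra wall-clock time spent sleeping, assuming one sleep per request."""
    return request_count * sleep_interval_seconds


# Hypothetical video with 40 subtitle requests at a 5-second Sleep Interval
extra = added_sleep_seconds(40, 5)
print(f"{extra} seconds (~{extra / 60:.1f} minutes) of added sleep")
```

Under those assumptions a single video picks up a little over three minutes of waiting, which is why trimming the number of subtitle languages can matter as much as the interval itself.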

What's changed?

N/A

What's fixed?

N/A

Any other comments?

N/A

@kieraneglin added the "enhancement" (New feature or request) label Jan 11, 2025
@kieraneglin self-assigned this Jan 11, 2025
@kieraneglin merged commit e9f6b45 into master Jan 14, 2025
1 check passed
@kieraneglin deleted the ke/issue-549 branch January 14, 2025 19:38
Labels: enhancement (New feature or request)
Linked issue (may be closed by merging this pull request): [FR] Download Rate Limiting