Add retries to worker health check #1009

Open
SantiagoPittella opened this issue Dec 9, 2024 · 4 comments

@SantiagoPittella
Collaborator

SantiagoPittella commented Dec 9, 2024

What should be done?

In the proxy, we should be able to define how many times a worker's health check is retried before the worker is removed from the list of available workers.

Also, health checks are currently performed sequentially; we need to change the implementation so that they run in parallel.
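
To illustrate the parallel part, here is a minimal sketch using only the standard library. The `Worker` struct and the `check_all_parallel` function are hypothetical stand-ins, not the proxy's actual types, and the real implementation would more likely use its async runtime than OS threads. The health check itself is passed in as a closure so the sketch stays independent of how a worker is actually probed.

```rust
use std::thread;

/// Hypothetical stand-in for a proxy worker; only the address matters here.
#[derive(Debug, Clone, PartialEq)]
pub struct Worker {
    pub address: String,
}

/// Runs `check` against every worker concurrently and returns one result per
/// worker, in the same order as the input slice.
pub fn check_all_parallel<F>(workers: &[Worker], check: F) -> Vec<bool>
where
    F: Fn(&Worker) -> bool + Sync,
{
    let check = &check;
    thread::scope(|scope| {
        // Spawn one scoped thread per worker so a slow check does not delay
        // the others. Collecting first ensures all threads start before any join.
        let handles: Vec<_> = workers
            .iter()
            .map(|worker| scope.spawn(move || check(worker)))
            .collect();

        // Join in spawn order; a panicking check is treated as a failed check.
        handles
            .into_iter()
            .map(|handle| handle.join().unwrap_or(false))
            .collect()
    })
}
```

With this shape, the total time for one round of checks is bounded by the slowest worker rather than by the sum of all checks.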

How should it be done?

  • Add a parameter, configurable from the CLI, that sets the number of retries.
  • Implement the retry logic.
  • Refactor health check calls to run in parallel.
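
The first two steps could look roughly like the sketch below. Everything named here is an assumption made for illustration: the `--health-check-retries` flag, the `WorkerHealth` and `Verdict` types, and the interpretation that a worker is removed once more than `max_retries` consecutive checks have failed (one initial check plus the retries). The flag is parsed by hand only to keep the example self-contained; the proxy's real CLI parsing may differ.

```rust
/// Hypothetical per-worker health state; field names are illustrative only.
#[derive(Debug, Default)]
pub struct WorkerHealth {
    consecutive_failures: u32,
}

/// Outcome of recording one health check result.
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    /// The worker stays in the list of available workers.
    Keep,
    /// The worker exhausted its retries and should be removed.
    Remove,
}

impl WorkerHealth {
    /// Records one health check result. A success resets the failure count;
    /// a failure is tolerated until more than `max_retries` checks fail in a row.
    pub fn record(&mut self, healthy: bool, max_retries: u32) -> Verdict {
        if healthy {
            self.consecutive_failures = 0;
            return Verdict::Keep;
        }
        self.consecutive_failures += 1;
        if self.consecutive_failures > max_retries {
            Verdict::Remove
        } else {
            Verdict::Keep
        }
    }
}

/// Reads a hypothetical `--health-check-retries <N>` flag from raw arguments,
/// falling back to `default` when the flag is absent or not a valid number.
pub fn retries_from_args(args: &[String], default: u32) -> u32 {
    args.iter()
        .position(|arg| arg == "--health-check-retries")
        .and_then(|index| args.get(index + 1))
        .and_then(|value| value.parse().ok())
        .unwrap_or(default)
}
```

Resetting the counter on success means only consecutive failures count toward removal, so a single transient failure does not accumulate against a worker that is otherwise healthy.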

When is this task done?

The task is done when a retry policy is in place and health check calls are no longer sequential.

Additional context

No response

@SantiagoPittella
Collaborator Author

As part of this issue, we can also move the Backend instantiation into the Worker::new method, as commented here

@SantiagoPittella
Collaborator Author

As part of this issue, we can slightly change the strategy for selecting available workers to get a more even distribution of requests: #1017 (comment)

@bobbinth
Contributor

bobbinth commented Jan 3, 2025

As part of this issue, we can change a bit the strategy for selecting available workers in order to get a more even distribution of requests: #1017 (comment)

In a couple of sentences, could you describe how we could change the strategy?

@SantiagoPittella
Collaborator Author

Currently we select workers in last-in, first-out order, so the most recently released worker is reused first. Switching to first-in, first-out should spread requests more evenly across workers. We don't need to change the structure that holds the workers, only how we take workers from it.
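
A minimal sketch of the difference, assuming the workers are kept in a plain `Vec` (the `WorkerPool` type and its method names are hypothetical, and workers are represented by their address only):

```rust
/// Hypothetical pool of available workers, identified here by address only.
#[derive(Debug, Default)]
pub struct WorkerPool {
    available: Vec<String>,
}

impl WorkerPool {
    /// Returns a worker to the pool by appending it at the back.
    pub fn release(&mut self, worker: String) {
        self.available.push(worker);
    }

    /// Last in, first out: hands out the most recently released worker,
    /// so the same worker tends to be reused while the others sit idle.
    pub fn take_lifo(&mut self) -> Option<String> {
        self.available.pop()
    }

    /// First in, first out: hands out the worker that has waited longest.
    /// `Vec::remove(0)` is O(n), which is acceptable for a short worker list.
    pub fn take_fifo(&mut self) -> Option<String> {
        if self.available.is_empty() {
            None
        } else {
            Some(self.available.remove(0))
        }
    }
}
```

The container stays the same in both cases; only the end from which a worker is taken changes.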

@bobbinth bobbinth added this to the v0.8 milestone Jan 14, 2025