[16.0][IMP] queue_job: run specific hook method after max_retries #674
When a job has been tried max_retries times and still fails, a final FailedJobError is raised and the job is set to Failed. This PR lets developers run a specific hook method when this happens.
Hi @guewen,
@@ -527,6 +527,21 @@ def perform(self):
            elif not self.max_retries:  # infinite retries
                raise
            elif self.retry >= self.max_retries:
                hook = f"{self.method_name}_on_max_retries_reached"
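The excerpt above stops at the hook name. As a rough illustration of the idea only, and not the PR's actual code, here is a self-contained sketch of how a name-interpolated hook could be looked up and called once retries are exhausted. `DemoRecord`, `export_record`, and the free function `perform` are hypothetical stand-ins, and the local `FailedJobError` class merely mimics the queue_job exception of the same name.

```python
class FailedJobError(Exception):
    """Local stand-in for queue_job's FailedJobError (assumption)."""


class DemoRecord:
    """Hypothetical model with a job method and its matching hook."""

    def __init__(self):
        self.notified = []

    def export_record(self):
        # The job body always fails in this sketch.
        raise ValueError("remote endpoint unavailable")

    def export_record_on_max_retries_reached(self):
        # Hook defined right next to the job method it belongs to.
        self.notified.append("export_record")


def perform(record, method_name, retry, max_retries):
    """Run one attempt; call the hook once retries are exhausted."""
    try:
        return getattr(record, method_name)()
    except Exception as err:
        if not max_retries:  # infinite retries
            raise
        if retry >= max_retries:
            hook = f"{method_name}_on_max_retries_reached"
            # Only call the hook if the model actually defines it.
            if hasattr(record, hook):
                getattr(record, hook)()
            raise FailedJobError(
                f"Max. retries ({max_retries}) reached"
            ) from err
        raise
```

In this sketch the hook runs before the final error is raised, so a failing hook would mask the original failure; a real implementation would need to decide how to handle that.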
I am generally not a fan of interpolating method names. Pass on_exception as an additional argument to delayable/with_delay instead?
Perhaps the scope could be slightly broader as well? Give the developer a chance to handle all types of exception, not just FailedJobError?
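To make the suggestion concrete, here is a minimal sketch of an explicit callback passed alongside the call, instead of a method name derived by interpolation. `run_with_handler` is a hypothetical helper written for illustration; it is not queue_job's `with_delay` or `delayable`, and this sketch makes no claim that those accept an `on_exception` argument today.

```python
def run_with_handler(func, *args, on_exception=None, **kwargs):
    """Call func; on any exception, notify the callback, then re-raise.

    Hypothetical helper: it only illustrates the shape of the proposal.
    """
    try:
        return func(*args, **kwargs)
    except Exception as err:
        if on_exception is not None:
            # The callback sees every exception type, not only
            # a FailedJobError raised after max retries.
            on_exception(err)
        raise


seen = []


def failing_job(value):
    raise KeyError(value)


try:
    run_with_handler(failing_job, "missing", on_exception=seen.append)
except KeyError:
    pass
```

The callback is explicit at the call site, which avoids relying on a naming convention, at the cost of having to serialize a reference to it if the job is stored and run later.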
- interpolating method names is quite a common pattern in odoo code: see lots of getattr in the codebase :)
- quite elegant imho to be able to define method_name and method_name_on_max_retries_reached nearby, but of course it's a bit subjective
- regarding your last point, that's an interesting idea but it feels quite natural to handle exceptions in the job code itself, e.g. in the EDI framework here
A more declarative approach could be to use a decorator but it will likely add complexity.
@QuocDuong1306 could you please update the docs?
Hi @simahawk, I updated the docs.
I would say that whenever a job reaches the failed state, it would be useful to have a hook to do something: not just when it failed after max retries, but when it failed for any reason?
For example, issue described here: #618
That's a good point. Yet, I think you can subscribe to that particular event easily (job switching to failed).
In fact we could subscribe even in this case and check the max retry counter.
@guewen did you have something in mind regarding handling failures?
Previously this is the kind of thing we would add to the @job decorator; things that were configured on this decorator are now on queue.job.function. This is akin to the "related actions", where we store the method to execute there. Different jobs can be pointed to the same error handler, and we would be able to use a handler on "no-code jobs" easily (e.g. I call an existing method with with_delay in a script, and I want to notify Slack when the max failure is reached using a handler that already exists in the code: I can create a queue job function and set this handler from the UI).
I agree with your points on triggering when switching to failed, regardless of retries; it would then be worth providing the max retry and current retry count to the handler as well.
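A minimal sketch of that configuration-driven approach, in plain Python with no Odoo dependency. `JobFunctionConfig` is a hypothetical stand-in for a queue.job.function record, the `failure_handler` field is an assumption (not an existing queue_job field), and `notify_slack` only appends to a list rather than calling any real service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class JobFunctionConfig:
    """Hypothetical stand-in for a queue.job.function record."""

    job_method: str
    failure_handler: str = ""  # assumed field: name of the handler


HANDLERS: Dict[str, Callable[..., None]] = {}
SENT: List[str] = []


def notify_slack(job_method, error, retry, max_retries):
    """Illustrative handler; records a message instead of sending it."""
    SENT.append(f"{job_method} failed ({retry}/{max_retries}): {error}")


HANDLERS["notify_slack"] = notify_slack

# Two different jobs point to the same handler; a third has none.
CONFIGS = {
    "export_record": JobFunctionConfig("export_record", "notify_slack"),
    "import_record": JobFunctionConfig("import_record", "notify_slack"),
    "cleanup": JobFunctionConfig("cleanup"),
}


def on_job_failed(job_method, error, retry, max_retries):
    """Dispatch to the configured handler when a job switches to failed.

    Returns True if a handler was configured and called.
    """
    config = CONFIGS.get(job_method)
    if config is None or not config.failure_handler:
        return False
    handler = HANDLERS[config.failure_handler]
    # The retry counters are passed along so the handler can decide
    # whether this is a "max retries reached" failure or another kind.
    handler(job_method, error, retry, max_retries)
    return True
```

Because the handler is looked up by configuration rather than by method name, it can be attached to an existing method without touching its code.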
Something to pay close attention to in the implementation is the transaction handling: I think in the current form, if the job failed with any error that causes a rollback (such as a serialization error for example), the transaction is unusable and the handler will probably fail as well! We should probably execute it in a new transaction, but then be aware that it will not be up-to-date with whatever happened in the current transaction, and could be subject to deadlocks depending on what the failed job did and what the failure handler does...
Considering that, I'd also be more comfortable if the handling happens somewhere in

    def _try_perform_job(self, env, job):
        """Try to perform the job."""
        job.set_started()
        job.store()
        env.cr.commit()
        _logger.debug("%s started", job)
        job.perform()
        # Triggers any stored computed fields before calling 'set_done'
        # so that will be part of the 'exec_time'
        env.flush_all()
        job.set_done()
        job.store()
        env.flush_all()
        env.cr.commit()
        _logger.debug("%s done", job)

So the transactional flow is more straightforward.
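The "new transaction" point can be illustrated with a small, self-contained sketch. It uses SQLite from the standard library purely as a stand-in for PostgreSQL and a second connection as a stand-in for a new transaction; `run_job`, `demo`, and the table names are hypothetical, and none of this is queue_job's implementation.

```python
import os
import sqlite3
import tempfile


def run_job(db_path, job, on_failure):
    """Run job in one transaction; on error, run the handler in another."""
    conn = sqlite3.connect(db_path)
    try:
        job(conn)
        conn.commit()
    except Exception as err:
        # Discard whatever the failed job wrote.
        conn.rollback()
        # Fresh connection: the handler cannot see the rolled-back work.
        handler_conn = sqlite3.connect(db_path)
        try:
            on_failure(handler_conn, err)
            handler_conn.commit()
        finally:
            handler_conn.close()
        raise
    finally:
        conn.close()


def demo():
    """Return (rows written by the job, rows written by the handler)."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "jobs.db")
        setup = sqlite3.connect(db_path)
        setup.execute("CREATE TABLE exported (val TEXT)")
        setup.execute("CREATE TABLE failure_log (msg TEXT)")
        setup.commit()
        setup.close()

        def job(conn):
            conn.execute("INSERT INTO exported VALUES ('partial')")
            raise RuntimeError("boom")

        def on_failure(conn, err):
            conn.execute("INSERT INTO failure_log VALUES (?)", (str(err),))

        try:
            run_job(db_path, job, on_failure)
        except RuntimeError:
            pass

        check = sqlite3.connect(db_path)
        exported = check.execute("SELECT COUNT(*) FROM exported").fetchone()[0]
        logged = check.execute("SELECT COUNT(*) FROM failure_log").fetchone()[0]
        check.close()
        return exported, logged
```

The job's partial write is rolled back while the handler's write is committed separately. The sketch does not reproduce the deadlock risk mentioned above, which depends on what locks the failed job and the handler take in a real database.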
Hello @guewen, I started in #734, would you mind taking a look please?
893489c to e96970f