Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[16.0][IMP] queue_job: run specific hook method after max_retries #674

Open
wants to merge 2 commits into
base: 16.0
Choose a base branch
from

Conversation

QuocDuong1306
Copy link

When a job has been tried for max_retries but still fails, final FailedJobError is raised and job is set to Failed.

This PR enables developers to run a specific hook method when this happens.

When a job has been tried for max_retries but still fails, final FailedJobError is raised and job is set to Failed.
This PR enables developers to run a specific hook method when this happens.
@OCA-git-bot
Copy link
Contributor

Hi @guewen,
some modules you are maintaining are being modified, check this out!

@@ -527,6 +527,21 @@ def perform(self):
elif not self.max_retries: # infinite retries
raise
elif self.retry >= self.max_retries:
hook = f"{self.method_name}_on_max_retries_reached"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am generally not a fan of interpolating method names. Pass on_exception as an additional argument to delayable/with_delay instead?

Perhaps the scope could be slightly broader as well? Give the developer a chance to handle all types of exception, not just FailedJobError?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • interpolating method names is quite a common pattern in odoo code: see lots of getattr in the codebase :)
  • quite elegant imho to be able to define method_name and method_name_on_max_retries_reached nearby, but of course it's a bit subjective
  • regarding your last point, that's an interesting idea but it feels quite natural to handle exceptions in the job code itself, e.g. in the EDI framework here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more declarative approach could be to use a decorator but it will likely add complexity.
@QuocDuong1306 could you please update the docs?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @simahawk , I updated the docs

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say whenever job reaches failed state, it would be useful to have a hook, to do something, not when it just failed after max retries, but failed for any reason?

For example, issue described here: #618

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. Yet, I think you can subscribe to that particular event easily (job switching to failed).
In fact we could subscribe even in this case and check the max retry counter.
@guewen did you have something in mind regarding handling failures?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this is the kind of thing we would add to the @job decorator, things that were configured on this decorator are now on queue.job.function. This is akin to the "related actions" where we store the method to execute there. Different jobs can be pointed to the same error handler, and we would be able to use an handler on "no-code jobs" easily (e.g. I call an existing method with with_delay in a script, and I want to notify slack when the max failure is reached using a handler that already exists in the code, I can create a queue job function and set this handler from the UI).

I agree with your points on triggering when switching to failed, not considering retries, then it would be worth to provide the max retry and current retry count to the handler as well.

Something to pay really attention to in the implementation is the transaction handling: I think in the current form, if the job failed with any error that causes a rollback (such as a serialization error for example), the transaction is unusable and the handler will probably fail as well! We should probably execute it in a new transaction, but then be aware that it will not be up-to-date with whatever happened in the current transaction, and could be subject to deadlocks depending of what the failed job did and the failure handler does...

Considering that, I'd also be more confortable if the handling happens somewhere in

    def _try_perform_job(self, env, job):
        """Try to perform the job."""
        job.set_started()
        job.store()
        env.cr.commit()
        _logger.debug("%s started", job)

        job.perform()
        # Triggers any stored computed fields before calling 'set_done'
        # so that will be part of the 'exec_time'
        env.flush_all()
        job.set_done()
        job.store()
        env.flush_all()
        env.cr.commit()
        _logger.debug("%s done", job)

So the transactional flow is more straightforward

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this is the kind of thing we would add to the @job decorator, things that were configured on this decorator are now on queue.job.function. This is akin to the "related actions" where we store the method to execute there. Different jobs can be pointed to the same error handler, and we would be able to use an handler on "no-code jobs" easily (e.g. I call an existing method with with_delay in a script, and I want to notify slack when the max failure is reached using a handler that already exists in the code, I can create a queue job function and set this handler from the UI).

I agree with your points on triggering when switching to failed, not considering retries, then it would be worth to provide the max retry and current retry count to the handler as well.

Something to pay really attention to in the implementation is the transaction handling: I think in the current form, if the job failed with any error that causes a rollback (such as a serialization error for example), the transaction is unusable and the handler will probably fail as well! We should probably execute it in a new transaction, but then be aware that it will not be up-to-date with whatever happened in the current transaction, and could be subject to deadlocks depending of what the failed job did and the failure handler does...

Considering that, I'd also be more confortable if the handling happens somewhere in

    def _try_perform_job(self, env, job):
        """Try to perform the job."""
        job.set_started()
        job.store()
        env.cr.commit()
        _logger.debug("%s started", job)

        job.perform()
        # Triggers any stored computed fields before calling 'set_done'
        # so that will be part of the 'exec_time'
        env.flush_all()
        job.set_done()
        job.store()
        env.flush_all()
        env.cr.commit()
        _logger.debug("%s done", job)

So the transactional flow is more straightforward

Hello @guewen, I started in #734, would you mind taking a look please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants