Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry partially applied no-txn migrations from last failed statement #187

Merged
merged 28 commits into from
Apr 2, 2024

Conversation

mzabani
Copy link
Owner

@mzabani mzabani commented Mar 31, 2024

Fixes #154
This involved a relatively large refactor, but it was really good as it actually uncovered some internal issues with our type-level transaction sandbox and other defects. Some nice consequences include:

  • We weren't preparing for COPY errors very well, as those are exceptions of a different type.
  • We weren't registering applied migrations at the earliest or not even the right time in some cases.

Also, new tests have been added which solidify some retry scenarios even for in-txn migrations and improve codd's output quite a bit.

mzabani added 27 commits March 14, 2024 16:53
We still don't store the count or do anything with it other than printing, but this is a start.
This is necessary to allow resuming application of failed no-txn migrations
from their point of failure onwards.

While this should be extremely rare, if we are in the business of retrying
`BEGIN ... COMMIT` blocks of no-txn migrations, we're already in
extremely rare territory.

It is also behaviour that might be useful in the darkest of hours.

We will still need to change how we collect pending migrations, because now
a migration's applied state is no longer binary (i.e. applied or not),
implement seeking and a bunch of other things.
… errors

Right now this amounts to throwing exceptions in different places, and using
a new custom exception type to carry information of where the last `BEGIN`
was higher up in the call stack (this does not feel great), if there was one at all.
But so far we don't transmit that yet.

Also, this removes calls to retry statements in no-txn migrations
from functions deeper in the call stack, so that retrying is done higher up where
it's possible to re-read files and seek the stream to the right statement. And this
actually helped clean up some of the code.

Still much to do, though.
single migration.

This invariant is not obvious and so fragile to rely on.

This commit also includes a bit more of the plumbing that is necessary
to retry no-txn migs appropriately, but there's still much to do.
Also fine-tune error messages, and some other messages, too
This will catch invalid assumptions in the codebase
This is a super specific error case of a failed bootstrapping no-txn migration
that runs enough statements to make the default connection string accessible.

With this commit codd will create codd_schema after this failure and
register this migration was partially applied.

This is the most unlikely and useless error case I can think of, and
the implementation messes with retry timeouts, so it's not clear this
is a good idea.

Plus, `codd add` might run into it. Since we have to address other issues
with `codd add` given the new code anyway, it might not be that big of a problem.

At least we can say codd will be helpful even in the most extreme cases.
Even if I can't think of a case where this particular feature will do that.
More type applications to the rescue.. but it did remove one code path
that was printing "This is an internal error.."
Now we detect if still in a transaction before sending `COMMIT`.
This is helpful now that we no longer throw exceptions when an error
happens, and instead manually send ROLLBACK.
We had to reimplement `putCopyEnd` from postgresql-simple, and some
things still aren't clear. More investigation is necessary.
This is a nicer way of avoiding libpq internals
@mzabani mzabani force-pushed the issue-154-retry-last-txn branch from dba5f35 to 6ba3513 Compare April 1, 2024 20:12
@mzabani mzabani marked this pull request as ready for review April 2, 2024 20:51
@mzabani mzabani merged commit a9c803d into master Apr 2, 2024
2 checks passed
@mzabani mzabani deleted the issue-154-retry-last-txn branch April 2, 2024 20:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

In no-txn migrations, it's useless to retry statements inside explicitly started transactions
1 participant