Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add a Rel8 PostgreSQL adapter. #240

Merged
merged 2 commits into from
Feb 6, 2022
Merged

feat: Add a Rel8 PostgreSQL adapter. #240

merged 2 commits into from
Feb 6, 2022

Conversation

dhess
Copy link
Member

@dhess dhess commented Feb 2, 2022

Notes:

  • Rel8 communicates with the PostgreSQL database via Hasql, and Hasql
    bubbles up database exceptions to the caller in-band via Either. We
    want to handle these out-of-band, so we throw them to the creator of
    the database thread, who presumably knows more about Hasql than core
    Primer.

  • Unlike Selda, Rel8 does not include support for creating tables.
    This is possible in Hasql, but most best SQL practice seems to
    recommend that schemas are created, migrated, etc. with a separate
    tool. Therefore, I have not added an initialize action here to
    mirror the one in our Selda adapter. Instead, we'll add support for
    schema management in separate work.

@dhess dhess mentioned this pull request Feb 2, 2022
@dhess
Copy link
Member Author

dhess commented Feb 2, 2022

In the interest of keeping the PRs small and more understandable, I think I'll continue my work in new PRs. Therefore, this one is ready for review.

@dhess dhess requested a review from a team February 2, 2022 12:10
Copy link
Contributor

@georgefst georgefst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of comments, but most of them are very surface-level. The actual core logic looks good (although my experience with databases is very limited). I would like to understand:

  • Why we don't use App directly in SessionRow.
  • The so-called orphan instance.

values,
(==.),
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for the blank line here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, fixed.

@@ -1,6 +1,6 @@
let
-- TODO remove these once this code (recently ported from old frontend) is exercised
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should update this comment, as it no longer applies to everything in the list.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inclusion of the Selda bindings there is only temporary, so it didn't seem worth the trouble.

{ Rel8.name = "sessions"
, Rel8.schema = Nothing
, {- HLINT ignore "Redundant bracket" -}
Rel8.columns = namesFromLabels @ (SessionRow Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In GHC 9, this will parse differently. Best to be future proof and remove the space after @ now.

I suspect we can then also remove the HLINT ignore pragma. HLint has presumably been built with ghc-lib >= 9, so sees @ as an operator, hence the "Redundant bracket" warning.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Odd, I thought that formatting was introduced by Fourmolu, but it appears not. I've fixed it.

{-# LANGUAGE BlockArguments #-}
{-# LANGUAGE DuplicateRecordFields #-}
-- Note: this is on purpose. See 'MonadDb' instance below.
{-# OPTIONS_GHC -fno-warn-orphans #-}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I tend to put orphans in their own small modules to make it clear where they've come from, and reduce the scope of disabling this warning. But I don't feel too strongly about it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more concise form {-# OPTIONS_GHC -Wno-orphans #-} is more conventional, but I think behaviour is exactly the same (incidentally, this is also the form that gets inserted if you use the HLS code action on a warning).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hang on, this instance isn't an orphan anyway? Rel8DbT is defined in this module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you're right! In the Selda bindings, we wrapped the SeldaM type from Selda, but here we're doing our own thing. Good catch — I'll remove this language and the warning disable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Fixed.)

data SessionRow f = SessionRow
{ uuid :: Column f UUID
-- ^ The session's UUID.
, gitversion :: Column f Version
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason this isn't camel-cased (i.e. upper-case "V")?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mainly because it matches the database column name, which I believe is case-insensitive (though I'm not certain).


import Foreword

import Data.ByteString.Lazy as BL hiding (take)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to hide take if we use the form import qualified Data.ByteString.Lazy as BL. Is there a reason we're doing it this way?

In fact, if our style guide disallows open imports, we should be disallowing import X as Y as well as it has most of the same problems. A quick search shows nine occurrences throughout the codebase, three of them in primer-rel8.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the current diff, I wonder if you accidentally "fixed" the existing code in Selda.hs, rather than this code in Rel8/Schema.hs? (Or maybe you have local changes that have not been pushed)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, yes, you're right.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed for real now (and Selda change undone, as it's out of scope).

-- date.
, app :: Column f BL.ByteString
-- ^ The session's 'App'. Note that the 'App' is serialized to
-- JSON before being stored as a bytestring in the database.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Isn't this the same as defining a DBType instance for App in terms of Aeson.encode and Aeson.decode, and having app :: Column f App?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably. This is just a direct port from the Selda adapter, and there was some good reason for doing things like this in Selda, though I don't recall what it was. Wrapping Haskell types for the DB was a bit painful, if I recall.

I will check whether this is just a few lines and some deriving magic, and if so, make the change you suggest.

Medium term, the proper fix is to figure out how PostgreSQL deals with JSON itself. I did some brief research into this before and concluded that it was easier for the time being to serialize/deserialize ourselves. This probably has something to do with needing to install extensions, or it not being portable, or something like that.

Longer term, we need to stop serializing the whole program on every write to the database, but that's way out of scope for this PR.

Copy link
Member Author

@dhess dhess Feb 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I am going to punt on this because Rel8 has built-in support for a native PostgreSQL binary JSON type:

https://rel8.readthedocs.io/en/latest/concepts/dbtype.html#storing-structured-data-with-jsonencoded

but this will require a DB migration. Rather than fiddling with the existing code, I'm just going to keep it as-is and spend my time working on the proper version in a subsequent PR.

(edit Ahh, it also supports a JSONEncoded type which is not jsonb. If that's de/serialized to the database as a string then this will be trivial. I'll spend a few minutes investigating.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return of the orphan instance: unless we want to pull Rel8 into core primer, the DBType instance for App is an orphan. It's worth it, though, IMO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented the JSONEncoded deriving instance in a 2nd commit. It's all pretty straightforward and doesn't require another review IMO, but feel free if you want to.

import qualified Data.Aeson as Aeson (
decode,
encode,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any good reason to use an import list on a qualified import? We don't tend to elsewhere. And even HLS's Make all imports explicit action doesn't go this far.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I just find those names to be incredibly generic, especially in a database setting where you might be dealing with all sorts of encodings (e.g., UTF-8), so this is nicer, IMO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow.

I guess I just find those names to be incredibly generic, especially in a database setting where you might be dealing with all sorts of encodings (e.g., UTF-8)

That indeed seems like a good reason for qualifying the import, rather than bringing encode and decode in to scope directly. All I'm suggesting is simplifying this to import qualified Data.Aeson as Aeson, since explicit lists seem pointless when the import is qualified (their purpose otherwise being to make it clear where identifiers have come from, and avoid bringing anything extra ones in to scope with a minor version bump of a dependency).

Anyway, I just wanted to check I wasn't missing anything. It doesn't particularly matter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, we appear to do this in 12 places in Primer, all in code written by @dhess. So I guess you like this style? To be fair, a few of these involve multiple modules being imported with the same qualified name, in which case there is a clear benefit to using an explicit list.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to legislate style to this degree if it doesn't break anything or enforce bad habits. It's basically just bikeshedding at this point and not a good use of code review time, IMO.

, MonadWriter w
, MonadZip
, MonadCont
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a lot of instances! I don't even understand how some of these are valid (MonadState? where's the state?).

Copy link
Member Author

@dhess dhess Feb 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just a ReaderT under the covers, so it has a superset of what ReaderT provides, including a MonadState instance. This means you can use Rel8DbT anywhere you can use a ReaderT as far as lifting operations through a monad transformer stack. This is the whole benefit of a transformer stack, after all.

The others like MonadCatch are to accomodate our own application's stack.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, I was misreading this. I see now that the instance being derived is MonadState s (Rel8DbT (State s)), not MonadState s Rel8Db.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Which, actually, is quite subtle. I think Haskell's syntax is perhaps too magic here)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is one of my complaints about type classes in Haskell: they're so magic that it's easy to just ignore these things because they mostly just work, until they don't, and then you realize that you don't really understand them very well at all, or that GHC is making assumptions that don't apply in your particular case and now it's really hard to reason about how the whole thing works under the hood.

@dhess dhess force-pushed the dhess/rel8 branch 2 times, most recently from f8f17e1 to c24ffdb Compare February 2, 2022 15:39
Comment on lines +155 to +152
-- always return a number equal to or greater than '0' (per
-- the Rel8 documentation). However, we have no good way to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps worth a link to where in the documentation this is stated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, if we did that, then to be truly useful we'd have to maintain it to keep it in sync with the version we're currently using. It's not worth it.

Copy link
Contributor

@brprice brprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two very minor comments left above. Otherwise it looks good to me (with a similar caveat about inexperience with DBs)

dhess added 2 commits February 2, 2022 21:24
This will replace the Selda adapter in a subsequent commit. Tests are
forthcoming.

Notes:

- Rel8 communicates with the PostgreSQL database via Hasql, and Hasql
bubbles up database exceptions to the caller in-band via `Either`. We
want to handle these out-of-band, so we throw them to the creator of
the database thread, who presumably knows more about Hasql than core
Primer.

- Unlike Selda, Rel8 does not include support for creating tables.
This is possible in Hasql, but most best SQL practice seems to
recommend that schemas are created, migrated, etc. with a separate
tool. Therefore, I have not added an `initialize` action here to
mirror the one in our Selda adapter. Instead, we'll add support for
schema management in separate work.
This makes our code a bit simpler. However, one major ramification of
this change is that if an `App` cannot be successfully decoded when
fetched from the database, this error condition will be dealt with by
throwing an out-of-band exception, which is how we handle errors
detected in the Rel8 or Hasql layers. Prior to this change, because
our code in `querySessionId` was doing the decoding, we could handle
decoding errors ourselves, and we would report these as a cryptic and
somewhat unhelpful error to the student in the UI, in-band.

In any case, this new way of handling errors as proper exceptions is
more consistent with how we handle other catastrophic errors. After
all, if a student's program can't be decoded when fetched from the
database, this probably means that something has gone horribly
wrong (a failed migration, data corruption, etc.), and should be
handled appropriately.
@dhess dhess merged commit b20fcef into main Feb 6, 2022
@dhess dhess deleted the dhess/rel8 branch February 6, 2022 21:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants