-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replacing selda
#201
Comments
Having written all that, at the moment I'm leaning towards Opaleye, which seems to tick most of the boxes except for SQLite support. Rel8 and Hasql are intriguing, but I'm concerned about the newness of the former and, ironically, the flexibility of the latter. (Upon further reading, it sounds like maybe Rel8 adds the "opinionated" to the flexibility and performance of Hasql: circuithub/rel8#82) |
I'm finally getting around to this now. Upon review, Rel8 and Opaleye seem like the obvious choices to replace Selda. Neither supports SQLite, which means that, for testing, even locally, we'll always need access to a PostgreSQL server. Given the changes I am planning to make to how we deploy & test Primer (a containerized solution, probably running on Kubernetes, or possibly Fly.io if that proves too complicated), I don't think the lack of SQLite support will be an issue for us, because it isn't really suited for a container environment, anyway. But before I proceed, I want to get some input from @brprice and @georgefst on how put out you would be if we dropped SQLite support. (I am currently imagining that our local testing setup would require a Docker instance running on our local machines, where the local PostgreSQL and Primer containers would run.) |
I would be ok with this. It would be nice to not require spinning up a Postgresql server for "small" tests. I.e. it would be nice to not take the performance hit of containers just to run a |
Agreed, and if our in-memory DB isn't sufficient already for this use case (I can't imagine why it wouldn't be), we can make sure that it is. |
Between our null DB implementation and #230, I’m no longer concerned about losing SQLite support in the backend, so I believe this removes any obstacles to switching to Opaleye or Rel8. |
Ironically, I discovered this evening while doing a bit more research that neither Opaleye nor Rel8 supports database migrations, which was one of the reasons my eye starting wandering from Selda. Oh well — I am still confident that either of them will be a better choice in the long run than Selda. On the subject of migrations, Dennis Gosnell created this interesting Haskell DB comparison and addressed the migration issue by pointing towards a few services/utilities that help with this process: https://gist.github.com/cdepillabout/a570d673b2341f109d30a91d243125dd None of them are Haskell-specific, but perhaps we'd be better off using a well-tested migration service than trying to roll our own in Haskell code, anyway. Sqitch looks particularly interesting as it makes it possible to sync code changes in Git with migrations: https://github.com/sqitchers/sqitch The author is also heavily involved in PostgreSQL extension development, so we'd probably be in good hands. |
There is a Haskell PostgreSQL migration library for Hasql, which Rel8 uses under the hood. It doesn't appear to be maintained, however: https://github.com/tvh/hasql-migration |
|
Some recent events have changed my mind on this. See #499. Also, I said in the OP of this issue that I didn't want to maintain 2 different database backends, but the potential benefits of deploying containers with local (replicated) SQLite databases may be sufficient to change my mind on this. Anyway, I'll continue this discussion in #499, but I wanted to make a note of this here for posterity. |
I'm having second thoughts about my choice of Selda. (This is squarely on me as I'm the one who wrote the database code in Primer):
If we're going to switch, and I am very much leaning in that direction, now's the time to do it. Database schema changes are imminent, and we don't need to worry about preserving compatibility with our existing DBs, so we could effectively just toss what we have now and rewrite the (currently) simple, small DB engine from scratch with a new library.
So what are the viable alternatives? I think we should be pragmatic and choose something that, ideally, has the following qualities, in rough order of importance:
(Note that 3. and 5. might be difficult to verify. Performance, especially, will not be something we can really measure until we have a realistic workload. The good news is that our asynchronous design shields us a bit from performance concerns, or at least makes us more burst-tolerant.)
I purposely didn't mention "nice Haskell-y API" in that list because, as I said earlier, I think we should choose pragmatism here over elegance. However, it would be nice to have. Perhaps we could use it as a tie-breaker.
Many (most?) Haskell SQL libraries use TH. I've previously expressed my preference to avoid TH where possible, but I think that it should be a low-priority selection criterion. I don't think the rest of the team share my concerns about TH, in any case.
I am no expert in this area, and there are probably at least a few Haskell SQL bindings I've not heard of, but these seem to me to be the leading contenders:
Opaleye
In the past, I've tried Opaleye and found it pretty difficult to use, and this was the main reason I reached for Selda when I wrote our initial DB engine. However, I last tried Opaleye quite a few years ago, so maybe things have changed. I have seen the maintainer responding to questions of ergonomics in various Haskell sub-Reddit threads over the years, though I can't cite any specific examples offhand. In any case, the author does seem quite keen to make users of the package happy.
Speaking of the Haskell sub-Reddit, Opaleye is often cited there by different people as the best-in-class SQL library for Haskell. It is actively maintained, the issue tracker is pretty lively, the maintainer appears to be responsive, it is claimed to have commercial support, and, unlike any other GitHub project I've encountered, it even has a documented handoff plan in case the maintainer goes missing: https://github.com/tomjaguarpaw/haskell-opaleye#backup-maintainers
Opaleye definitely has a high-level API and is not just a wrapper around a low-level connector, although this API was my source of frustration with it previously. (I believe, at the time, that you had to write quite a lot of boilerplate and similar-but-not-quite-the-same types for your SQL and Haskell data representations.)
One drawback to switching to Opaleye is the lack of support for SQLite: tomjaguarpaw/haskell-opaleye#528. This would probably mean we'd need to drop SQLite ourselves, as I don't want to maintain 2 engines with different APIs. This, in turn, would have the following impacts on the project:
Re: point 1, I've been thinking about this a lot recently, anyway, because I think we eventually need to be testing, even "locally," in an environment that's as close to production as is practical. So I think we can address this point reasonably well, and in fact will probably need to do so regardless of whether we keep SQLite support.
Re: point 2, this is unfortunate, because it's likely that whatever system we put together for our own use to solve point 1 will not be available to contributors, or will at least be impractical for them to implement on their own. One solution to this problem might be to encourage early contributors who want to help to write a SQLite adapter for
primer-db
, or whatever name we end up choosing for the post-Selda Primer database package.Hasql
Hasql is more of an ecosystem than a package. The base package is pretty close to the PostgreSQL metal, but there are many other packages that build upon it that provide higher-level APIs in what appears to be a kind of pick 'n mix approach to writing a database connector.
I have no experience using Hasql directly, but I've heard generally good things about it, and it seems to be well-regarded. It's also supposedly pretty fast, for a Haskell database driver, anyway. The author is highly regarded in the Haskell community. I believe we use some of his packages already.
In an ideal world, Hasql's flexible approach might be the most appealing of all our options, but I do worry that it'll take us too much time to figure out how to glue the pieces together into the right abstraction for our needs. As we're not database experts, we'd probably be better off with a more opinionated library like Opaleye, where the authors have made these choices for us based on their expertise. On the other hand, if we started off conservatively and tried to stay relatively low-level and close to native SQL, it would probably be easier to switch from Hasql to something higher level once we're in a better position to measure performance, we understand how to write efficient queries, etc.
Hasql appears to be reasonably active, though not as much as Opaleye. It's not clear whether it has any commercial users, at least not directly. (Hasql is a dependency of Rel8; see below.)
Hasql does not support SQLite: nikita-volkov/hasql#29
Persistent & Esqueletto
These are typically used together, I believe. I get the impression they're used pretty widely, perhaps more than any other Haskell SQL solution. They support both PostgreSQL and SQLite, which is a plus. They both appear to be actively developed, especially Persistent. They're maintained by FP Complete, so presumably are used in their own offerings and production-proven.
It's said that Esqueletto is pretty close to "SQL in Haskell" and that it's relatively easy to translate back-and-forth. That sounds like it could be helpful if we want to understand exactly what a query is doing and how to make it efficient.
On the other hand, apparently it's possible to write queries in it that crash at runtime, or at least, it used to be: https://www.reddit.com/r/haskell/comments/8qxvir/a_comparison_among_various_database_edsls_selda/e0o2wad/. And we don't currently have the expertise to make SQL go fast, anyway, so it might make more sense for us to choose something where the authors have tried to optimize common use cases for us by presenting us with higher-level abstractions.
I don't know much about it other than that, partly because I've previously been a bit spooked by the complaints and have never given it a try.
Beam
I don't know much about it, but have previously read mostly positive things about it. It supports both PostgreSQL and SQLite, seems pretty active, and the maintainer responds to most GitHub issues.
There do appear to be a number of fairly long-standing bugs with no apparent progress, however:
finalSolution
hangs haskell-beam/beam#578(This is a bit unfair, as I haven't done a similar analysis on the other projects I've mentioned.)
It's not clear who is using Beam, nor whether it's being used in production.
Others
I don't really know anything about these options, but I'll try to state the obvious about each, at least.
Note that I have not considered here any very low-level libraries that simply wrap a C library in Haskell. I don't want to use something like that as we don't have the skillset to pull it off, IMO, and need something more opinionated.
Squeal
Rel8
Here is a rather gushing thread about Rel8 on the Haskell sub-Reddit, though this is only in response to the ZuriHac 2021 presentation, not a reaction from users of the package: https://www.reddit.com/r/haskell/comments/o3cep5/video_ollie_charles_rel8_a_new_database_access/
Ollie Charles, the maintainer, says that Rel8 and PostgreSQL generate faster queries than CircuitHub can write by hand, which is encouraging.
References
The text was updated successfully, but these errors were encountered: