Replies: 3 comments
-
Hi @bluppfisk and welcome! 👋 This is exactly the type of application that Icechunk was designed for. Icechunk operates at the Zarr store level, keeping track of "chunk manifests" which record metadata about all chunks in the store (including timestamp). Would be happy to help you get started with Icechunk. Feel free to ask questions here or over on the Icechunk repo.
-
Somehow Icechunk never showed up during my searches. It looks promising. I'll have a look, thank you. I guess this can be closed.
-
Icechunk is very new! We'd love your feedback.
-
I'm using Zarr in an application that is constantly written to and read from by multiple workers.
I need to know exactly where data exists in the Zarr array, which is several billion indices long.
I have two requirements:
I already use the option not to write empty chunk files.
If I can guarantee that each write fills exactly one chunk, I can easily fulfil the first requirement (at least in v2, without sharding). I simply use the BasicIndexer to translate the indices to chunk keys, and I do a quick "file exists" probe on the disk. This may get harder to do in v3 if we want to make use of sharding.
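For readers following along, here is a minimal sketch of that probe. It does not use the BasicIndexer; it recomputes the chunk key by integer division, assuming a v2 array stored as plain files in a directory with the default "." dimension separator and no sharding. The names `chunk_key` and `chunk_exists` are made up for this illustration and are not part of zarr.

```python
import os


def chunk_key(index, chunks, separator="."):
    """Return the v2-style key of the chunk that holds `index`.

    `index` and `chunks` are tuples with one entry per dimension.
    Assumes the v2 layout, where a chunk key is the chunk grid
    coordinates joined by the dimension separator (default ".").
    """
    if len(index) != len(chunks):
        raise ValueError("index and chunks must have the same length")
    if any(i < 0 for i in index):
        raise ValueError("indices must be non-negative")
    return separator.join(str(i // c) for i, c in zip(index, chunks))


def chunk_exists(array_dir, index, chunks, separator="."):
    """Probe the filesystem for the chunk file that would hold `index`.

    Only meaningful if empty chunks are not written and every write
    fills exactly one chunk, as described above.
    """
    path = os.path.join(array_dir, chunk_key(index, chunks, separator))
    return os.path.exists(path)
```

For example, with a chunk length of 1,000,000, index 2,500,000,000 maps to the key `2500`, so the probe is a single `os.path.exists` call rather than a directory scan.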
The second is a lot harder to do in all circumstances, because it requires scanning an entire directory containing potentially hundreds of millions of files. Keeping track of the most recent complete timestamp is a bit expensive to do.
It might be easier to keep a database of chunk states in memory or on disk, since it should be a relatively small array of small integers with three states (empty/full/partial). However, this should be done at the Zarr level, since Zarr controls when files are written to disk, and thus when the database should be updated.
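To make the idea concrete, here is a rough sketch of such a state table for a 1-D array, using one byte per chunk. It is purely illustrative: `ChunkState` and `ChunkStateTable` are hypothetical names, nothing here hooks into zarr, and the caller is assumed to call `record_write` whenever a write actually reaches the store.

```python
from enum import IntEnum


class ChunkState(IntEnum):
    EMPTY = 0
    PARTIAL = 1
    FULL = 2


class ChunkStateTable:
    """One small integer per chunk for a 1-D array (hypothetical helper)."""

    def __init__(self, n_items, chunk_len):
        if n_items <= 0 or chunk_len <= 0:
            raise ValueError("n_items and chunk_len must be positive")
        self.n_items = n_items
        self.chunk_len = chunk_len
        n_chunks = -(-n_items // chunk_len)  # ceiling division
        self._states = bytearray(n_chunks)   # all zeros, i.e. all EMPTY

    def record_write(self, start, stop):
        """Record a write covering the half-open index range [start, stop)."""
        if not 0 <= start < stop <= self.n_items:
            raise ValueError("write range out of bounds")
        first = start // self.chunk_len
        last = (stop - 1) // self.chunk_len
        for c in range(first, last + 1):
            chunk_start = c * self.chunk_len
            chunk_stop = min(chunk_start + self.chunk_len, self.n_items)
            if start <= chunk_start and stop >= chunk_stop:
                self._states[c] = ChunkState.FULL
            elif self._states[c] != ChunkState.FULL:
                # Limitation: several partial writes that together cover
                # a chunk still leave it marked PARTIAL.
                self._states[c] = ChunkState.PARTIAL

    def state_at(self, index):
        """Return the state of the chunk that holds `index`."""
        if not 0 <= index < self.n_items:
            raise IndexError("index out of bounds")
        return ChunkState(self._states[index // self.chunk_len])
```

At one byte per chunk, even a hundred million chunks fit in roughly 100 MB, and the `bytearray` could be dumped to disk as-is. Sharing it safely between multiple workers is a separate problem that this sketch does not address.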
Is this an issue that's already tackled in some other way, or would the zarr developers advise against/for extending the existing Stores to implement this functionality?