non-string examples #39

Open
bryanlarsen opened this issue Jan 30, 2023 · 7 comments
Labels
documentation Related to the library docs. good first issue Good for newcomers help wanted Extra attention is needed question Further information is requested

Comments

@bryanlarsen

I'm looking at using cacache to store Rust structs. But cacache takes AsRef<[u8]> for data and AsRef<str> for keys.

There's a lot of ways to turn a struct into a [u8] and back, and a lot of ways of turning a struct key into a str. It's easy to spend a lot of time evaluating the different ways of doing so when in many cases it'd probably be best just to choose one and move on.

An example in your docs might do wonders here. Presumably you have more insight into the better ways of doing this, so the mechanism used in the example could be presumed to be a "good" way, even if it isn't the best for every situation.

For example, it seems tempting to use the rust hash mechanism for the key, but that's double hashing and could cause collisions so I imagine that's not recommended.

It's also tempting to use Debug or Display formatting for the key since most structs already have it and it'd probably work well in some situations. Probably not something you should use in an example though because those are sometimes lossy.

Which means likely a serde format for both key & data. But which one, there are so many...

I found this overview. 2 years old so things may have changed, but likely still mostly correct: https://blog.logrocket.com/rust-serialization-whats-ready-for-production-today/

conclusion: json for key and bincode for data?

This was less of an issue report and more of a "thinking aloud" situation. But it may be helpful to others in the same situation, so I'm going to post it anyways. Feel free to close.

@bryanlarsen
Author

The example should include a transformation of a HashMap key into a sorted vec since that's a common stumbling block.
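A minimal sketch of that transformation, using only the standard library. The function name `key_from_map`, the `String` entry types, and the `::` / `=` separators are assumptions made for illustration, not anything cacache prescribes. `HashMap` iteration order is unspecified, so the entries are collected into a vec and sorted before the key is built:

```rust
use std::collections::HashMap;

/// Build a deterministic key string from a map (hypothetical helper).
/// Sorting first means two equal maps always produce the same key,
/// regardless of the order in which their entries were inserted.
fn key_from_map(prefix: &str, map: &HashMap<String, String>) -> String {
    // Borrow the entries and sort them by (key, value).
    let mut entries: Vec<(&String, &String)> = map.iter().collect();
    entries.sort();

    let mut key = String::from(prefix);
    for (k, v) in entries {
        key.push_str("::");
        key.push_str(k);
        key.push('=');
        key.push_str(v);
    }
    key
}
```

Note that this sketch does not escape the separators, so it is only unambiguous when keys and values cannot themselves contain `::` or `=`. A serde format applied to the sorted vec would avoid that problem.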

@zkat
Owner

zkat commented Jan 30, 2023

The general expectation right now is that you'll have strictly human-readable string keys that are custom-generated (for example, "stuff::ip::127.0.0.1::name::foo", which you generate with a plain format! call). The values, otoh, are more general. cacache is partly designed so that the values you store are, in fact, plain old byte streams of some blob storage of some sort (in the original use case, NPM package data and arbitrary files). But if you want to, say, store structured data, you can always just use serde and serde_json. I'll leave this issue open because documenting these intentions is a perfectly valid ask, but I hope this at least answers your question for now.
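For readers landing here, a rough sketch of what such a custom-generated key could look like. The `Stuff` struct, its fields, and the `cache_key` method are hypothetical and only mirror the example key above:

```rust
use std::net::IpAddr;

/// Hypothetical record type; the fields mirror the example key above.
struct Stuff {
    ip: IpAddr,
    name: String,
}

impl Stuff {
    /// Hand-written, human-readable cache key built with a plain `format!` call.
    fn cache_key(&self) -> String {
        format!("stuff::ip::{}::name::{}", self.ip, self.name)
    }
}
```

The resulting `String` implements `AsRef<str>`, so it can be passed as the key argument to `cacache::write` and `cacache::read`.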

@zkat zkat added help wanted Extra attention is needed good first issue Good for newcomers question Further information is requested documentation Related to the library docs. labels Jan 30, 2023
@bryanlarsen
Author

thanks. One of my reasons for using cacache is because if it was easy to use format! I'd probably just use it to create a filename and dump files onto the file system. :)

@zkat
Owner

zkat commented Jan 30, 2023

Can you expand on that? There's no unique identifier (or collection of identifiers) in your source objects that you can use?

@bryanlarsen
Author

Yes there are, but they're a lot easier to serialize than format. IOW, they're a vec of structures.

@zkat
Owner

zkat commented Jan 30, 2023

I mean, sure, you could just use Debug or Display for keys if you think that'll be reasonably-sized and unique enough. I just think it's generally nicer to have more concise, bespoke keys.
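As a sketch of the `Debug` route for a "vec of structures", assuming a hypothetical `PartId` type and helper name `debug_key`:

```rust
/// Hypothetical identifier type standing in for the structures mentioned above.
#[derive(Debug)]
struct PartId {
    kind: &'static str,
    index: u32,
}

/// Use the derived `Debug` output of the whole slice as the cache key.
/// This works as a key only if `Debug` prints every field that
/// distinguishes one value from another.
fn debug_key(ids: &[PartId]) -> String {
    format!("{:?}", ids)
}
```

One caveat: the standard library does not guarantee that derived `Debug` output stays the same across Rust versions, so keys built this way may not be suitable for a cache that has to survive a toolchain upgrade.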

@RustyNova016
Contributor

> just use serde and serde_json

A lower-level solution would be to use rmp_serde to serialize directly to bytes.
There's also the matter that cacache stores and returns raw bytes, so the data isn't typed on the way in or out, and you need to keep track of the type yourself. A generic wrapper struct can be useful for that.

Here's an example I'm currently using (replace color_eyre with miette if needed):

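use std::marker::PhantomData;
use std::path::PathBuf;

use cacache::Integrity;
use serde::de::DeserializeOwned;
use serde::Serialize;

/// A typed view over a cacache directory: values of type `D` are stored as MessagePack bytes.
pub struct SerdeCacache<D, K>
where
    D: Serialize + DeserializeOwned,
    K: AsRef<str>,
{
    /// Path of the cache directory.
    name: PathBuf,
    _phantom_data: PhantomData<D>,
    _phantom_key: PhantomData<K>,
}

impl<D, K> SerdeCacache<D, K>
where
    D: Serialize + DeserializeOwned,
    K: AsRef<str>,
{
    /// Set an item in the cache: serialize `data` with rmp_serde and write the bytes under `key`.
    pub async fn set(&self, key: K, data: &D) -> color_eyre::Result<Integrity> {
        let serialized = rmp_serde::to_vec(data)?;
        Ok(cacache::write(&self.name, key, serialized).await?)
    }

    /// Get an item from the cache: read the bytes stored under `key` and deserialize them into `D`.
    pub async fn get(&self, key: K) -> color_eyre::Result<D> {
        let read = cacache::read(&self.name, key).await?;
        Ok(rmp_serde::from_slice(&read)?)
    }
}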

Might turn it into a crate if it's actually getting bigger
