file corruption on Windows #83
Comments
This can definitely happen in any kind of environment with heavy writes. I'm assuming you're properly awaiting/calling your writes. It might also have to do with mmap, so you could try building without the mmap feature. As for what to do: I don't really expect to be able to fix all cases of cache corruption. cacache isn't designed to completely prevent corruption (I don't consider that possible); it's designed to prevent you from reading bad data. My recommendation, and what I do in my own applications: if a read returns an IntegrityError, delete the bad content and redownload the data. This should only happen occasionally, of course.
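For what it's worth, that recovery path might look roughly like the sketch below. It uses cacache's synchronous API (read_sync, remove_sync, write_sync); since the exact error variants differ between versions, the sketch just treats any read failure as "evict and refetch", and refetch_from_origin is a hypothetical stand-in for whatever actually re-downloads the data.

```rust
use std::path::Path;

// Hypothetical stand-in for whatever re-downloads the canonical data.
fn refetch_from_origin(key: &str) -> Vec<u8> {
    format!("fresh bytes for {}", key).into_bytes()
}

/// Read a cached entry; if the cached copy can't be read back cleanly
/// (e.g. the IntegrityError described in this issue), evict it and
/// re-populate the cache from the origin.
fn read_or_refetch(cache: &Path, key: &str) -> Result<Vec<u8>, cacache::Error> {
    match cacache::read_sync(cache, key) {
        Ok(data) => Ok(data),
        Err(_) => {
            // A real implementation would match the integrity-failure variant
            // specifically rather than treating every error this way.
            let _ = cacache::remove_sync(cache, key); // best-effort eviction
            let fresh = refetch_from_origin(key);
            cacache::write_sync(cache, key, &fresh)?;
            Ok(fresh)
        }
    }
}
```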
The big surprise was the corruption of existing data. I did not anticipate that; I expected that writing identical content under a different key would not re-write and corrupt the content stored for the old key.
I would not expect that either; that's very strange, considering there are separate files and filenames for everything. The only way I can see this happening is if you're writing a bunch of empty files (which would explain the NULs) and they all end up writing to the same empty content file, which will share the same filename.
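To make the content-addressing point concrete: as I understand it, the value returned by write is the content's integrity hash, so identical bytes written under different keys resolve to the same hash and therefore the same content file on disk. A rough sketch (sync API; it assumes the returned Integrity values can be compared for equality):

```rust
use std::path::Path;

fn main() -> Result<(), cacache::Error> {
    let cache = Path::new("./my-cache");

    // Identical bytes under two different keys hash to the same integrity
    // value, so both index entries point at one shared content file.
    let sri_a = cacache::write_sync(cache, "key-a", b"same bytes")?;
    let sri_b = cacache::write_sync(cache, "key-b", b"same bytes")?;
    assert_eq!(sri_a, sri_b);

    // Empty payloads behave the same way: every empty write maps to the one
    // content file for the empty hash, which is the collision speculated
    // about above.
    let empty_a = cacache::write_sync(cache, "empty-1", b"")?;
    let empty_b = cacache::write_sync(cache, "empty-2", b"")?;
    assert_eq!(empty_a, empty_b);

    Ok(())
}
```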
I wrote a wrapper for my usage of cacache::write that calls cacache::exists to ensure the entry doesn't already exist before calling cacache::write. I assumed it would significantly cut down on the amount of file corruption we're seeing in the wild. If your code already does something like that, my wrapper will be ineffective and I'll need to do something more drastic.
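I can only guess at the wrapper's exact shape, but presumably it looks something like the sketch below (sync API; metadata_sync is used as the existence check here since I'm not sure of exists's exact signature). As the next reply points out, the gap between the check and the write is exactly where two concurrent writers can interleave.

```rust
use std::path::Path;

/// Guess at the check-then-write wrapper described above. This is a
/// time-of-check/time-of-use pattern: another thread or process can write
/// the same key between the lookup and the write, so at best it narrows
/// the window rather than eliminating it.
fn write_if_absent(cache: &Path, key: &str, data: &[u8]) -> Result<(), cacache::Error> {
    // `None` means there is no index entry for this key yet.
    if cacache::metadata_sync(cache, key)?.is_none() {
        cacache::write_sync(cache, key, data)?;
    }
    Ok(())
}
```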
@bryanlarsen I wouldn't do that: cacache takes care of doing things as atomically as possible, which is how it can operate completely locklessly. Adding this kind of two-step operation can introduce a race condition that will almost certainly be hit in high-concurrency environments where you're touching the same data. That said! I have run into some weird cache corruption when using cacache in orogene (which is VERY high throughput), but it only happened occasionally and I couldn't track down why.
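For context on "as atomically as possible": the usual way a lockless cache gets there is to stage each write in a private temp file and then rename it into place, so concurrent writers never touch the same visible file and readers never observe a half-written one. Below is a generic sketch of that pattern; it is not cacache's actual code, and details like temp-file naming, fsync policy, and Windows rename semantics are exactly where the hard edges in this issue tend to live.

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

/// Generic write-to-temp-then-rename sketch (not cacache's actual code).
fn atomic_write(final_path: &Path, data: &[u8]) -> std::io::Result<()> {
    // Stage the write under a name other writers won't use (here just the
    // process id; real implementations use properly unique temp names).
    // Unique staging files are what let concurrent writers skip locking.
    let tmp_path = final_path.with_extension(format!("partial.{}", std::process::id()));

    let mut tmp = fs::File::create(&tmp_path)?;
    tmp.write_all(data)?;
    // Force the bytes to disk before publishing the file. Skipping this is
    // one classic way a crash or power loss leaves a file full of NULs on
    // some filesystems, which matches the symptom reported in this issue.
    tmp.sync_all()?;
    drop(tmp);

    // On POSIX, renaming over an existing path is atomic. On Windows the
    // rename can fail if another process has the destination open, which is
    // part of why write-heavy Windows workloads are trickier to get right.
    fs::rename(&tmp_path, final_path)
}
```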
Why? If I run two threads of …
because if something happens after the …
Could you briefly elaborate on what specifically in the design of cacache makes corruption inevitable? I ask because a) one of the feature bullet points in the README advertises: …
A significant number of users are reporting file corruption on Windows. It's a write-heavy workload, and investigation reveals that the content file is filled with NULs and the index file's last line has a bunch of NULs appended to it. On read, an IntegrityError is unsurprisingly returned.
This occurs even when writing content that already exists in the cache.