Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issues with cache invalidation related to parallel queries. #6292

Merged
merged 5 commits into from
May 13, 2024

Conversation

jleibs
Copy link
Member

@jleibs jleibs commented May 11, 2024

What

Issue 1: data invalidation before data use

This failure mode happens where:

  • Thread 1: produces results and returns them
  • Thread 2: invalidates results
  • Thread 1: uses the new results which point to a non-overlapping range and finds nothing of interest.

I worked around this by cloning the data and orphaning the Inner when we execute an invalidation. This feels a bit heavy-handed for the cases where we don't need it. Maybe there's an alternative way to keep the structure locked for read until the results have been consumed?

Issue 2: not considering pending data

The time_range logic wasn't factoring in the possibility of pending_data. This meant we could have pending gaps in queries that had not yet been resolved, but wouldn't count them as gaps. Subsequently as the promises are resolved we would still end up with gaps in the data.

time_range() has now been replaced with pending_time_range() to do the proper book-keeping.

Checklist

  • I have read and agree to Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable):
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!

To run all checks from main, comment on the PR with @rerun-bot full-check.

@jleibs jleibs added ⛃ re_datastore affects the datastore itself do-not-merge Do not merge this PR labels May 11, 2024
@jleibs jleibs requested a review from teh-cmc May 11, 2024 00:09
Copy link

Deployed docs

Commit Link
b5807be https://landing-78djifik8-rerun.vercel.app/docs

@jleibs jleibs force-pushed the jleibs/cache_issues branch from b5807be to 44c686b Compare May 11, 2024 00:10
(promises_back.last().map(|(index, _)| index), indices.back())
{
if let (Some(p_index), Some(i_index)) = (
promises_back.first().map(|(index, _)| index),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing this this didn't catch any new sanity issues, but I think this was just a typo unless I'm misunderstanding the intent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, nice catch

@jleibs jleibs force-pushed the jleibs/cache_issues branch from 44c686b to ee90c86 Compare May 13, 2024 13:50
@jleibs jleibs changed the title Attempt at fixing the cache issues demonstrated in #6279 Fix issues with cache invalidation related to parallel queries. May 13, 2024
@jleibs jleibs force-pushed the jleibs/cache_issues branch from 0cb35f5 to a8d374f Compare May 13, 2024 14:14
@jleibs jleibs removed the do-not-merge Do not merge this PR label May 13, 2024
@jleibs jleibs marked this pull request as ready for review May 13, 2024 14:19
@jleibs jleibs added 🪳 bug Something isn't working exclude from changelog PRs with this won't show up in CHANGELOG.md labels May 13, 2024
Copy link
Member

@teh-cmc teh-cmc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very nice

@jleibs jleibs merged commit 95b7910 into main May 13, 2024
35 of 43 checks passed
@jleibs jleibs deleted the jleibs/cache_issues branch May 13, 2024 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🪳 bug Something isn't working exclude from changelog PRs with this won't show up in CHANGELOG.md ⛃ re_datastore affects the datastore itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Some range caching is broken when using the same visible time range across different views
2 participants