Delta based quicksync - client side #335

Open
pigmej opened this issue Oct 1, 2024 · 1 comment

Comments

@pigmej
Member

pigmej commented Oct 1, 2024

After a few experiments, we can introduce delta-based quicksync.

From the client side, the process of quicksync would look like:

  • If the local DB exists, download the partial batches and restore them one batch after another.
  • If the local DB does not exist, download the normal quicksync (full DB state).
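
The two cases above can be sketched as a small decision function. This is only an illustration, not the quicksync-rs code: `SyncMode` and `choose_sync_mode` are hypothetical names, and treating "the DB exists" as "a regular file exists at the given path" is an assumption.

```rust
use std::path::Path;

/// Which sync path the client takes (hypothetical names, for illustration).
#[derive(Debug, PartialEq, Eq)]
pub enum SyncMode {
    /// A local DB is present: download and restore partial batches.
    Delta,
    /// No local DB: download the full quicksync (full DB state).
    Full,
}

/// Picks the sync mode from the presence of the local DB file.
/// Assumption: "DB exists" means a regular file exists at `db_path`.
pub fn choose_sync_mode(db_path: &Path) -> SyncMode {
    if db_path.is_file() {
        SyncMode::Delta
    } else {
        SyncMode::Full
    }
}
```

A real client would probably also want to check that the file is a usable DB before picking the delta path.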

Process of batch downloading:

  • Each batch is published with start_layer, end_layer, and start_layer_minus_one_db_hash. Before restoring, this hash MUST be compared with the hash the local DB has for layer start_layer - 1.
  • If the hashes differ, step back to the PREVIOUS batch and repeat the check until the hash matches.
  • Once a partial has been restored, the following partials can be applied without further layer-hash checks.
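
A minimal sketch of the hash check and the walk back to previous batches. The `Batch` fields follow the names used above; everything else is assumed for illustration: `find_restore_start`, the `local_hash_at` callback (a stand-in for looking up the local DB hash of a layer), hashes as strings, and falling back to some other strategy when nothing matches.

```rust
/// One batch entry; field names follow the issue text.
#[derive(Debug, Clone)]
pub struct Batch {
    pub start_layer: u32,
    pub end_layer: u32,
    pub start_layer_minus_one_db_hash: String,
}

/// Returns the index of the batch to start restoring from, or `None` when no
/// batch matches the local state (what to do then is up to the caller).
/// `batches` must be sorted by `start_layer`.
pub fn find_restore_start<F>(
    batches: &[Batch],
    latest_local_layer: u32,
    local_hash_at: F,
) -> Option<usize>
where
    F: Fn(u32) -> Option<String>,
{
    let next_layer = latest_local_layer.saturating_add(1);
    // Candidate: the last batch starting at or before the first missing layer.
    let candidate = batches.iter().rposition(|b| b.start_layer <= next_layer)?;
    // Walk back through PREVIOUS batches until the hash matches.
    for idx in (0..=candidate).rev() {
        let batch = &batches[idx];
        // A batch starting at layer 0 has no previous layer to compare with.
        let prev_layer = match batch.start_layer.checked_sub(1) {
            Some(layer) => layer,
            None => continue,
        };
        let local = local_hash_at(prev_layer);
        if local.as_deref() == Some(batch.start_layer_minus_one_db_hash.as_str()) {
            return Some(idx);
        }
    }
    None
}
```

From the returned index onward, the remaining batches would be restored in order without repeating the check.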

The server publishes partials in predefined sizes OR at predefined time intervals, so do not hardcode the start-end span. All entries in metadata.csv are sorted by start_layer.
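
Because the entries are sorted by start_layer, a client can look up the batch that covers a layer with a binary search instead of assuming a fixed span. A sketch, assuming the metadata has already been parsed into `Span` values (a hypothetical type) and that end_layer is inclusive:

```rust
/// Layer range of one published partial (hypothetical type).
/// Assumption: `end_layer` is inclusive.
#[derive(Debug, Clone)]
pub struct Span {
    pub start_layer: u32,
    pub end_layer: u32,
}

/// Finds the index of the span containing `layer`, relying only on the
/// entries being sorted by `start_layer`, never on a fixed span size.
pub fn batch_containing(spans: &[Span], layer: u32) -> Option<usize> {
    // Number of spans whose start_layer is <= layer (binary search).
    let count = spans.partition_point(|s| s.start_layer <= layer);
    let idx = count.checked_sub(1)?;
    if layer <= spans[idx].end_layer {
        Some(idx)
    } else {
        None
    }
}
```

This keeps working when spans have different lengths, which is what happens if the server switches between size-based and time-based publishing.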

  1. Restoring the same partial twice should be a no-op and is considered safe. BE WARNED that the restore process performs NO deletes, which is why you must make absolutely sure that you compare the layer_hash first.
  2. It is safe to stop, restart, or retry at any stage, after restoring from any number of partials.
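
To make the warning concrete, here is a toy in-memory model of a restore that only adds rows. It is not the real DB restore: `Table`, `restore_partial`, and the insert-if-absent behaviour are assumptions chosen to match "restoring twice is a no-op" and "no deletes".

```rust
use std::collections::BTreeMap;

/// Toy stand-in for one DB table: row id -> row payload (hypothetical).
pub type Table = BTreeMap<u64, String>;

/// Applies a partial by inserting rows that are not present yet.
/// Nothing is ever deleted or overwritten (assumed semantics).
/// Returns the number of rows actually inserted.
pub fn restore_partial(db: &mut Table, partial: &[(u64, String)]) -> usize {
    let mut inserted = 0;
    for (id, value) in partial {
        if !db.contains_key(id) {
            db.insert(*id, value.clone());
            inserted += 1;
        }
    }
    inserted
}
```

In this model a second restore of the same partial inserts nothing, and a row that is already in the local table but not in the partial stays there. That second property is why restoring on top of a diverged local state is unsafe without the hash check.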

Objects that are not tied to a time or layer are tracked on the server side and included in the first batch published after they appear in the DB.

The process should be integrated into quicksync-rs; an initial integration was provided in spacemeshos/quicksync-rs#49.

@pigmej
Member Author

pigmej commented Dec 19, 2024

Currently, the open items include:

  • How to properly handle reorgs: when syncing partial data, we need to detect a reorg early and act accordingly. We have the layer hashes, so we could detect it based on those, too. Deletes / updates are probably needed.

Besides that open topic, the logic works in theory and delta-based quicksync is possible; there is:
