-
Notifications
You must be signed in to change notification settings - Fork 223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Recovery data usage improvements #6635
base: master
Are you sure you want to change the base?
Conversation
goal: reduce data usage for recovery - changed from hex to base64: 33% saving - client can only download from single peer and verify using signature: num peers times saving
df42b92
to
51fd30e
Compare
/// access to a decoder registry also makes decoding structs impossible that | ||
/// themselves contain module dyn-types (e.g. a module output containing a | ||
/// fedimint transaction). | ||
pub fn try_into_inner_known_module_kind(&self, decoder: &Decoder) -> Result<T, DecodeError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sus. Why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just copied everything from original
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: verify that the original wasn't written by me and silly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so it's used to decode session/output outcomes in client modules sometimes, doesn't panic. We always take it from current consensus of peers (except now, where we check the signature so effectively the same thing). Looks OK.
async fn get_session_status_raw( | ||
&self, | ||
block_index: u64, | ||
decoders: &ModuleDecoderRegistry, | ||
) -> anyhow::Result<SessionStatus> { | ||
debug!(target: LOG_CLIENT_NET_API, block_index, "Fetching block's outcome from Federation"); | ||
let verify = |_: &SignedSessionOutcome| -> bool { todo!() }; | ||
// TODO: check api version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably just make it a new function, and let the caller verify. I think all over the code we do the api versioning checks in the higher layer. Though passing api version here wouldn't be bad either.
@@ -129,12 +132,56 @@ where | |||
.map_err(|e| anyhow!(e.to_string())) | |||
} | |||
|
|||
fn select_peers_for_status(&self) -> impl Iterator<Item = PeerId> + '_ + Clone { | |||
// TODO: better selection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shuffle should do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was thinking about avoiding the down guardian but random is still pretty good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's still a massive improvement and whatever is simple should work.
c0b7822
to
ef7a1f6
Compare
simplified the control flow |
9bf8539
to
9912576
Compare
@maan2003 I might be off and not have time to fully bring it all in, but there were two caches because session status can have partial results, or something. I would avoid adding too much changes to one PR for now unless absolutely necessary. |
9912576
to
ac9db1a
Compare
Await blocks, while holding the lock there, right. So there is an issue, or at least at some point I believe there was (could have been wrong). Later this/next week I could investigate to fully understand the issue, but for now not worth blocking on. |
ac9db1a
to
e8b4413
Compare
Right. So the goal of "await" is blocking until the result is there, while a concurrent "check status" for the same session wants to get a partial result, and not block, so would get stuck if there was a blocked "await", leading to random and unwanted hanging in certain cases (which we might add later). Even if there's a way to somehow unify these caches, the complexity of it is probably not worth the tiny memory savings. |
241d5c0
to
8a99bb9
Compare
just following the Keychain impl on server side
8a99bb9
to
3c30dc9
Compare
👀 looks like recovery tests is failing, will debug |
goal: reduce data usage for recovery
for 4 guardian federation: overall 6x less data download
Happy new year!