-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Perf] Skip costly validation for stored data #2574
base: staging
Are you sure you want to change the base?
[Perf] Skip costly validation for stored data #2574
Conversation
@@ -146,6 +146,11 @@ impl<N: Network> Subdag<N> { | |||
// Ensure the leader certificate is an even round. | |||
Ok(Self { subdag }) | |||
} | |||
|
|||
/// Initializes a new subdag. | |||
pub fn from_unchecked(subdag: BTreeMap<u64, IndexSet<BatchCertificate<N>>>) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: a more std
-like name for such a function would be from_unchecked_parts
or from_raw_parts
; I'd personally prefer it, as it clearly indicates that the underlying objects are of interest, as opposed to an alternative representation of the same object
the same suggestion applies to the other instances of this name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am very much in favor of this direction; once something is in our own local storage, it should be considered trustworthy/valid.
Check that Deserialize::deserialize is only used when getting data from disk, consider enforcing this somehow.
If we feel that we need to be more careful about the handling of objects that may not be verified, we could introduce a wrapper struct to indicate with certainty that they either are or are not (determined by how they were deserialized), and use it as a parameter/generic in the relevant functions/methods. This, however, could have a large impact on the diff (but not performance, as it would be zero-cost).
Motivation
Currently, every time an object is deserialized, input validation is performed. Ideally, we only perform input validation on data received from the network, not from disk.
This PR takes an initial stab at skipping the most expensive input validation with the smallest possible code diff. Deserialization of a
Subdag
and its contents is now about 10x faster, see the newdag.rs
benchmark (half of which is due to skippingis_in_correct_subgroup_assuming_on_curve
).Whenever we (de)serialize from disk, we use
bincode
(which in its current use just prepends some bytes toto_bytes_le
). Bincode calls intoDeserialize::deserialize
, which now calls intoFromBytesUncheckedDeserializer
, which now calls intofrom_bytes_le_unchecked
.Test Plan
Open TODOs
Deserialize::deserialize
is only used when getting data from disk, consider enforcing types areVerified
in theDataMap
interface.from_x_coordinate_unchecked