-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix note tag #642
Fix note tag #642
Conversation
- Remove network execution, since currently this concept is not really used and resulted in a mismatch among the Rust and MASM code - Changed the conversions to the NoteTag to include validation - Fixed the NoteTag validation - Fixed the tag validation in the `create_note` kernel code - Fixed the tests to create valid note tags
@@ -111,7 +111,7 @@ export.create_note | |||
assertz.err=ERR_NOTE_INVALID_TAG_HIGH_BIT_SET | |||
# => [tag_low, note_type, ASSET, tag, note_type, RECIPIENT] | |||
|
|||
u32shr.30 u32and assertz.err=ERR_NOTE_INVALID_TAG_PREFIX_FOR_TYPE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bug, the u32shr.30
gets the two highest bits of the tag. That should be asserted to be equal to the note's type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if this was actually an error. I don't think we can enforce that the first two bits of the tag are equal to the note type (if we could, we could just use these two bits to encode note type). The rules are as follows:
- For public notes: second tag big must be
0
. - For off-chain notes: first tag big must be
0
. - For encrypted notes: first two tag bits must be
00
.
I think doing a bitwise AND between the first two bits of the note tag and the note type achieves this:
- For public notes (
type = 0b01
), we get(note_tag >> 30) & note_type == 0
, only if the second bit of the tag is0
. - For off-chain notes (
type = 0b10
), we get(note_tag >> 30) & note_type == 0
, only if the first bit of the tag is0
. - For encrypted notes (
type = 0b11
), we get(note_tag >> 30) & note_type == 0
, only if the first two bits the tag are00
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can enforce that the first two bits of the tag are equal to the note type
Why is that?
The rules are as follows
Seems the rule is that the note type is a bitmaks, and the corresponding bits must be zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can enforce that the first two bits of the tag are equal to the note type
Why is that?
Note type by itself is not enough to capture all the data we want to convey (specifically, it is missing the part about who is the intended recipient of the note - i.e. a single account or any account).
Seems the rule is that the note type is a bitmaks, and the corresponding bits must be zero?
Given the note tag formats described in #402 (comment), that's how it works out to be (though, I didn't plan for it to be like that originally).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note type by itself is not enough to capture all the data we want to convey (specifically, it is missing the part about who is the intended recipient of the note - i.e. a single account or any account).
Yes, I wasn't proposing removing the address from the tag. The question was why the note type can't be encoded directly to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the note tag formats described in #402 (comment), that's how it works out to be (though, I didn't plan for it to be like that originally).
So that was mostly by accident?
pub fn validate(&self, note_type: NoteType) -> Result<Self, NoteError> { | ||
let tag_mask = note_type as u32; | ||
if (self.0 >> 30) & tag_mask != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bug, I suppose the incorrect behavior was copied from the MASM code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the previous comment - I don't think this was a bug but rather intended behavior.
/// Returns true if the note is intended for execution by a specific account. | ||
/// | ||
/// A note is intended for execution by a single account if either the first two bits are zeros | ||
/// or the first 3 bits are 0b100. | ||
pub fn is_single_target(&self) -> bool { | ||
let first_2_bit = self.0 >> 30; | ||
let first_3_bits = self.0 >> 29; | ||
first_2_bit == 0b00 || first_3_bits == 0b100 | ||
} | ||
|
||
/// Returns note execution mode defined by this tag. | ||
/// | ||
/// If the most significant bit of the tag is 0 or the 3 most significant bits are equal to | ||
/// 0b101, the note is intended for local execution; otherwise, the note is intended for | ||
/// network execution. | ||
pub fn execution_mode(&self) -> NoteExecutionMode { | ||
let first_bit = self.0 >> 31; | ||
let first_3_bits = self.0 >> 29; | ||
|
||
if first_bit == 0 || first_3_bits == 0b101 { | ||
NoteExecutionMode::Local | ||
} else { | ||
NoteExecutionMode::Network | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason for removing these methods? I think they provide useful info about the tag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this correct? We are only setting the highbits to the note type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean note tag? The first 3 bits of the note tag define these properties. But maybe I didn't understand the question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is to say, here we are testing against 3highbits while the kernel only checks 2highbits.
I think my confusion is coming from all these magic values, to me they are opaque and have no meaning. It doesn't make sense to me how the network execution can be asserted in the Rust code while the Masm code has no mention to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean note tag?
Yes, sorry. I meant note tag.
The first 3 bits of the note tag define these properties. But maybe I didn't understand the question?
I think I'm misunderstanding the objectives here. This table has lots of magic values, for example 101: local execution for a use case, public note.
, the third bit in there, is that bit the first bit of the OFF_CHAIN storage type? And that is intentional, not an accident?
So the validation we have here is not to guarantee that we have a valid tag, but to forbid a completely invalid one? For example, there is the rule For encrypted notes: first two tag bits must be 00.
, and at the same time we will allow an OffChain
to also start with 00
even though it is not encrypted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I guess the rules should be:
- For private notes, the first bit must be
0
. - For public notes, the first bit must be
1
. - For encrypted notes, the first two bits must be
00
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we cannot guarantee that the user constructs a tag correctly.
Question: Why don't you want to guarantee the tag construction at creation time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To spell the rules out a bit differently for various tag prefixes:
Where did these values come from? For example, given the description above:
For private notes, the first bit must be 0.
For encrypted notes, the first two bits must be 00.
It seems odd to me that 00
would pass both the validation of encrypted and private notes. And from the discussion so far it seems this is just an accident? (Edit: I now understand the validation wasn't intended to guarantee the tag's validity, but what I haven't understood why that is the case, why don't we enforce correct tags prefixes?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can: beyond the first 3 bits, everything is basically arbitrary data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems odd to me that
00
would pass both the validation of encrypted and private notes. And from the discussion so far it seems this is just an accident?
No, this part is intended. Basically, 00
means that the note is intended to be executed locally by a specific recipient. This covers all note types (i.e., public, private, and encrypted notes can be executed locally).
/// Returns a new [NoteTag] instantiated from the specified account ID. | ||
/// | ||
/// The tag is constructed as follows: | ||
/// - For local execution, the two most significant bits are set to 0b00, the following 16 bits | ||
/// are set to the 16 most significant bits of the account ID, and the remaining 14 bits are | ||
/// set to 0. | ||
/// - For network execution, the two most significant bits are set to 0b10 and the remaining | ||
/// bits are set to the 30 most significant bits of the account ID. | ||
pub fn from_account_id( | ||
account_id: AccountId, | ||
execution: NoteExecutionMode, | ||
) -> Result<Self, NoteError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this method worked as intended - i.e., it constructed tags for notes indented to be executed by specific accounts (e.g., tags with prefixes 00
and 101
).
What we probably should do, is add another method - e.g., something like from_use_case()
. This method could look like this:
pub fn for_use_case(
use_case_type: u16,
payload: u16,
execution:NoteExecutionMode
) -> Result<Self, NoteError>
This constructor would create notes intended for consumption by any account (i.e., tag prefixes 01
, 101
, and 11
).
The validation that we'd need to do is that use_case_type
is actually just 13 bits (we could also create a new type for this). Alternatively, we could try to create a use case enum:
pub enum NoteUseCase {
Swap(u8, u8), // internal values are the first 8 bits of the from/to faucet IDs
}
But the question then is how to make NoteUseCase
enum extensible.
I think we can modify the tagging scheme as follows:
Some properties of the above scheme:
Main downside is that checking whether a given note type is consistent with the tag in MASM is going to be a bit more difficult - but that's probably OK. Also, maybe there is some clever way to do bitwise operations to make it simpler (I haven't spent too much time on this). |
What about when the target is a recipient? I think this scheme breaks down since the value is a hash. From what I understand, this scheme is not a guarantee but a best effort (since it is not enforced in the kernel), and the only goal is to reduce the amount of unnecessary data download by a client in a best effort manner. If that is the goal for the distinction of |
I'm not sure I fully understood the question. Recipient (in the sense defined by the linked struct) would probably not used as a tag as its value is unique for every note and the tag values should be somewhat stable (otherwise, we might as well use note IDs to query note data). So, some prefix of an account ID could be a good tag. A tag value could also be chosen completely randomly - but even here, I'm not sure i see the issue: we can always take some prefix of a random value.
I considered this initially but it complicates things in other places quite a bit. Specifically:
|
From our past conversations I was under the impression that you wanted the tag's target to be either an account id or a recipient. Isn't that why it is referred to as
What is exactly the pattern you're talking about? I meant for the situation where a tag start with Are you saying the |
It's not "either or" case - the tag could be based on account ID, recipient, or any other number of schemes.
The tag would not be "set" but rather constructed. So, If I wanted to base a tag on a public key, I could construct it as But thinking through this more, I came to the conclusion that we can simplify the tagging scheme a bit more. So, here is a new proposal:
Let's consider how these tags would be used by various parties - specifically: network operators and users for state sync requests. Network operatorsWhen an operator receives a note, it needs to quickly filter out notes which are not intended for network execution. Here is the rough process the operator would follow:
By the end of this process we'll end up with a set of notes which the operator may be able to execute and we still need to figure out if it can actually execute them, but hopefully, we would have filtered out 99% (if not more) of the irrelevant notes. So, if the user wants their note to be processed by the operator, they will need to construct the tag State sync requestsWhen making a state sync request, I want to receive a set of notes I'm interested in. There could be 2 general categories of notes: (1) notes intended for me specifically and (2) notes intended for anyone. For notes intended for anyone, I may also want to specify that I'm interested only in public notes (otherwise, I may not be able to execute them). Notes targeted to specific accountsIn this scenario we have 2 parties: the sender and the receiver. The receiver will need to communicate some information to the sender so that the sender can construct the note intended for the tem. This can be done in a number of ways - a couple example:
In the 1st and 3rd case, we could have conventions for how to construct tags from account IDs and public keys. For example:
For the second case, the receiver would also need to communicate the tag separately to the sender (i.e., the tag would be sent together with Notes targeted to any accountAn example of this could be a
So, if I, as a user, want to get some SWAP notes during state sync request, I'd set one of the tags to To summarize:At the protocol level the only thing we should enforce is: if the tag starts with In addition to this, we can have conventions defined outside of the protocol for how tags should be formed for different purposes:
It is up to the user to follow these conventions, but they don't have to. For example, I could use
|
Here is how pub fn from_account_id(
account_id: AccountId,
execution: NoteExecutionMode,
) -> Result<Self, NoteError> {
match execution {
NoteExecutionMode::Local => {
let id: u64 = account_id.into();
// select 14 most significant bits of the account ID and shift them right by 2 bits
let high_bits = (id >> 34) as u32 & 0xFFFF0000;
// set the 2 most significant bits to 11 to get the form 11 + 14 bits of account ID
Ok(Self(high_bits | 0xC0000000))
},
NoteExecutionMode::Network => {
if !account_id.is_on_chain() {
Err(NoteError::NetworkExecutionRequiresOnChainAccount)
} else {
let id: u64 = account_id.into();
// select 31 most significant bits of account ID and shift them right by 1 bit
let high_bits = (id >> 33) as u32;
// the tag will have the form 0 + 31 high bits of account ID; note that the
// second bit of the tag is guaranteed to be 0 because public account IDs start
// with 0
Ok(Self(high_bits))
}
},
}
} |
We could also add pub fn for_use_case(
use_case_id: u16,
payload: u16,
execution:NoteExecutionMode,
) -> Result<Self, NoteError> {
// TODO: verify that use_case_id is smaller than 2^14
// combine use case ID with payload
let use_case_descriptor = ((use_case_id as u32) << 16) | (payload as u32);
match execution {
NoteExecutionMode::Local => {
// set the 2 most significant bits to 10 to get the form 10 + 30 bit use case descriptor
Ok(Self(use_case_descriptor | 0x80000000))
},
NoteExecutionMode::Network => {
// set the 2 most significant bits to 01 to get the form 01 + 30 bit use case descriptor
Ok(Self(use_case_descriptor | 0x40000000))
},
}
} |
@bobbinth thanks for the extensive write up! It really helps clarify some things. After reading the above. I wonder if it wouldn't be better to have the execution mode encoded into the IMO this is great for user experience, since it would completely prevent the horror scenario of people sending money to an address that doesn't exist. It would guarantee the operators can rely on the account bit to identify network execution, instead of having it just as a hint. And all the other scenarios can still be supported, via the I also think we should reserve one bit somewhere, it could also be in the tag, for future protocol extension. Edit: This is ofcourse, a partial proposal, it doesn't make sense to flesh it out if there is no consensus this is useful. There are things like adding a hint for network execution for use_cases that needs to be flesh out. But I'd say it is an orthogonal concern. Edit2: I should add even though the above prevents sending a note for an invalid address, it doesn't fix all the issues. Specifically the note code may have logic errors, I just assume this will be a lesser problem, since I assume most people won't be writing custom code. |
I have a feeling of déjà vu, but why are we introducing these tags for network executed notes? The tag is nothing but a hint to speed up search, but I don't think we need any of that to speed up public notes (offchain/encrypted notes are something else). Given that network executed notes must be public, the complete code and note inputs must be available to be inspected. Also, since we must validate the public notes, we have to parse its complete contents, so all the data should be available to be introspected without extra cost. Any note that is intended for a specific address must have validation like what is done for The usecase scenario needs a bit more of thought, since the check above will be missing, and maybe it still needs to employ the tag for it. There is also a question of how reliably we can inspect the code which I don't know fully how it would work. So far this is more a simplified idea than anything concrete. With the above in mind, I'm also wondering why are we using a single |
We can't really do that because we may want to have accounts for which sometimes the network executes transactions, and other times, users execute local transactions.
We can try to do this only in the network execution case. When a note (whether private, public, or encrypted) is created for local execution, it may be recorded on chain even before the target account is (this allows sending notes to people who don't have accounts yet).
Agreed, and that's one of the main reasons for
Yes, the data would be available - but without any hints, the check could be quite expensive. So, one of the main reasons for these hints is to help the operator to filter out most of irrelevant notes via inexpensive checks, and perform the expensive checks (i.e., actually looking at the code) for very few notes. We don't have any concrete numbers on this, but I'm imagining a system where we have over 1K public notes created per second and over 90% of these notes are not for network execution.
Similar to above - I think the problem "given a public note figure out if you are supposed to execute it an how" is a very difficult problem to solve without some extra hints. The tag scheme described above provide some of these hints which should help narrow down the expensive checks (I'm thinking more about notes with custom code here - so, more complicated than
I haven't given it too much though, but we probably still want to keep some uniformity as a lot of things are the same for public, encrypted, and private notes. For example, we still need to prevent double-spends, and if, for example, we don't use nullifiers for public notes, we need to have another mechanism of doing this (and this also means performing these checks in the kernels). My intuition is that making them more heterogeneous at the protocol level will complicate things more than simplify. One the Rust side, we could consider refactoring note objects to make make a note something like an enum, but it is also not clear to me whether it will actually simplify things. |
I think this would be better served by an account guardian instead of the network operators. If we get the functionality of moving control of an account around like this, then we might as well just delegate control of an account to a third party service, which doesn't have to be a network operator. It allows to spread the load better, allows a user to delegate control to another entity for when they are offline, creates a market which could result in better prices for the users, removes work from the network so it can scale better, and if it is financially viable, the operators can participate as well by running a different service. However, this also needs a lot of thought: 1. the public storage is no longer full, but partial 2. this needs a way of synchronizing and transferring control 3. this need authentication and authorization to prevent unauthorized changes to the account. Feels like a much much bigger problem. With the above said. I think the idea is nice, but it doesn't invalidate the original proposal for network only accounts, each has different trade offs.
Yes, in the above proposal these notes would not be considered as targeting an account, since the account doesn't exist yet. I'm not sure if it is a good idea to have a note that is said to have an account as a target and the account doesn't exit. That is not only confusing, but it ignores the advantages of having an account_id to begin with, which is basically to figure out that an account exist by looking at the rollup state. If you have to look somewhere else for some data which is not validated, it might as well just be a random tag and the user would get the same behavior.
Yeah, I know. But the solutions are different, and I would argue complementary, one doesn't necessarily replace the other. Just like
Your call. I'm not convinced by the claim, it is also hard to predict how much of a cost the alternative would be if there is no conversation about possible implementations. If I'm underestimating the impact of the hint, it can be defined to something like
I wasn't thinking about changing how notes are consumed, but how a public note data is defined. Basically remove all the indirection layers and just layout the public data, so instead of the nullifier, just have the serial number directly. |
There are other use cases when this could be useful. For example, the account may be controlled by an exchange, and most of the time, the exchange, would execute local transactions against this account. But if the exchange stops cooperating, users would be able to submit transactions to the network, and the network would execute them against the exchange's account.
This is actually an important thing to have as it addresses a "chicken-and-egg" problem: to create an account I need to execute a transaction, and for this I need a fee. But I can't pay a fee unless someone sends me funds. But I can't receive the funds until I have an account.
Yes, we definitely will need to solve the problem of figuring out if a note can be executed by the operator - that's something we'll tackle as a part of working on network execution (probably in a couple of months). But w/e solution we come up with will almost certainly be much more complicated than checking a few bits in an integer. Now, maybe these solutions will be "cheap enough" where the operator could examine code of all public notes, and in that case, we could ignore the tag prefixes completely. But even then, I'm not seeing much downside of having them. The complexity of the current scheme is minimal (i.e., a few instructions to make sure that if tag starts with anything but |
This is a partial fix for #637 . It fixes some issues I found with the code but it doesn't do:
This PR does the following:
create_note
kernel code