AccountCode refactor from MerkleTree to Sequential hash for offset based storage access #763

Merged
bobbinth merged 35 commits into next from phklive-account-code-refactor
Jul 30, 2024

Conversation

phklive
Contributor

@phklive phklive commented Jun 19, 2024

In this PR I propose to refactor storage to add offset-based storage access.

I am still working through the full chain of dependencies for this change. For now, I am following this plan:

  1. Refactor the AccountCode struct to replace the Smt with a sequential hash commitment
  2. Fix all dependent functions that supported the Smt logic, replacing this logic with Vec<(Digest, Felt)> handling (which, as pointed out in the issue, should be simpler and more efficient)
  3. Refactor tests
  4. Refactor insertion and verification of data in the AdviceProvider
  5. Refactor MASM code, adding storage access authentication, and refactor MerkleTree-dependent code
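
As a rough illustration of step 1, a sequential hash commitment hashes every (procedure root, offset) pair in order instead of inserting them into a tree. The sketch below is not the PR's implementation: `Digest` and `Felt` are plain stand-in types, and `DefaultHasher` is only a placeholder for the real hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hypothetical stand-ins for the real `Digest` and `Felt` types.
type Digest = [u64; 4];
type Felt = u64;

/// Commits to the procedure list by hashing every `(root, offset)` pair in order.
/// `DefaultHasher` is only a placeholder for the real hash function.
fn sequential_commitment(procedures: &[(Digest, Felt)]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for (root, offset) in procedures {
        // Absorb the four elements of the procedure root, then its offset.
        for element in root {
            hasher.write_u64(*element);
        }
        hasher.write_u64(*offset);
    }
    hasher.finish()
}
```

Because the elements are absorbed in sequence, the commitment depends on the order of the list, which is why the ordering question raised later in the review matters.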

Would be glad to have your input on this plan @bobbinth if you think I could go forward in a more efficient way.

Closes: #667

@phklive phklive requested a review from bobbinth June 19, 2024 22:12
Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you! I left some preliminary comments inline (mostly focusing on changes in account_code module). Also, the overall plan makes sense.

objects/src/accounts/code.rs (outdated, resolved)
objects/src/accounts/code.rs (outdated, resolved)
Comment on lines 56 to 61
let procedures: Vec<(Digest, Felt)> = procedures
    .into_iter()
    .enumerate()
    .map(|(i, proc)| (proc, Felt::new(i as u64)))
    .collect();
Contributor

This would assign a unique index to each procedure, which is probably not what we want to do. Ideally, the offset would be specified within MASM (and in the future MAST) itself. For example, for the basic wallet contract, it could look something like:

use.miden::contracts::wallets::basic->basic_wallet
use.miden::contracts::auth::basic->basic_auth

@miden-storage-offset(0)
export.basic_wallet::receive_asset

@miden-storage-offset(0)
export.basic_wallet::send_asset

@miden-storage-offset(1)
export.basic_auth::auth_tx_rpo_falcon512

And the result of this would be:

[(receive_asset_hash, 0), (send_asset_hash, 0), (auth_tx_rpo_falcon512, 1)]

But we don't have support for attributes in MASM yet, and I'm not sure what would be a good interim solution (e.g., maybe we provide offsets as a construction parameter; it would be brittle but maybe OK for now).
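
One way to picture the interim option of passing offsets as a construction parameter is sketched below. This is only an assumption-laden illustration: `ProcedureInfo`, `with_offsets`, and the `u16` offset type are hypothetical names and choices, not part of the codebase.

```rust
/// Hypothetical stand-in for the real `Digest` type.
type Digest = [u64; 4];

/// Hypothetical pairing of a procedure root with its storage offset.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ProcedureInfo {
    root: Digest,
    storage_offset: u16,
}

/// Pairs each procedure root with a caller-supplied storage offset.
/// Returns `None` if the two slices differ in length.
fn with_offsets(roots: &[Digest], offsets: &[u16]) -> Option<Vec<ProcedureInfo>> {
    if roots.len() != offsets.len() {
        return None;
    }
    Some(
        roots
            .iter()
            .zip(offsets)
            .map(|(root, offset)| ProcedureInfo { root: *root, storage_offset: *offset })
            .collect(),
    )
}
```

Called with three roots and the offsets `[0, 0, 1]`, this yields the same shape as the wallet example above. The brittleness is visible in the signature: nothing ties an offset to a procedure except its position in the slice.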

cc @bitwalker and @plafer for another example of how we'd use annotations in MASM.

Two other things to consider:

  • Should we sort the procedures before computing commitment? I think probably yes.
  • Should we somehow indicate that some procedures don't need storage access? (For example, in the above, the receive_asset and send_asset procedures never need to touch storage.) I also think yes, but not sure how to do it yet.
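
Both points can be sketched together under stated assumptions: sorting by procedure root gives a canonical order, and an `Option` is one possible (hypothetical) way to mark procedures that never touch storage. Neither choice is confirmed by this thread.

```rust
/// Hypothetical stand-in for the real `Digest` type.
type Digest = [u64; 4];

/// `None` marks a procedure that never touches storage (hypothetical encoding).
type StorageOffset = Option<u16>;

/// Returns the procedures sorted by root so that any commitment computed over
/// the result does not depend on the order in which they were supplied.
fn canonical_order(mut procedures: Vec<(Digest, StorageOffset)>) -> Vec<(Digest, StorageOffset)> {
    procedures.sort_by_key(|(root, _)| *root);
    procedures
}
```

With this in place, two accounts that export the same procedures in a different order would feed identical input to the commitment function.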

Contributor Author

I am still thinking through how we could handle the input of the offsets. I will add this toward the end of the refactor, once everything works correctly with dummy values.

Contributor

We still need to address some of the above questions.

objects/src/accounts/code.rs (outdated, resolved)
@phklive phklive force-pushed the phklive-account-code-refactor branch from daf6e06 to cd0ef51 Compare July 18, 2024 09:12
@phklive phklive marked this pull request as ready for review July 19, 2024 09:19
@phklive phklive requested a review from bobbinth July 19, 2024 09:19
@phklive phklive changed the title WIP: AccountCode refactor for offset-based storage access AccountCode refactor from MerkleTree to Sequential hash for offset based storage access Jul 19, 2024
Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you! Not a full review yet, but I did review pretty much everything except for tests. Left some comments inline.

objects/src/accounts/code.rs (outdated, resolved)
Comment on lines 56 to 61
let procedures: Vec<(Digest, Felt)> = procedures
    .into_iter()
    .enumerate()
    .map(|(i, proc)| (proc, Felt::new(i as u64)))
    .collect();
Contributor

We still need to address some of the above questions.

objects/src/accounts/code.rs (outdated, resolved)
objects/src/accounts/code.rs (outdated, resolved)
objects/src/accounts/code.rs (outdated, resolved)
miden-tx/src/compiler/mod.rs (outdated, resolved)
miden-tx/src/host/account_procs.rs (outdated, resolved)
miden-tx/src/host/account_procs.rs (outdated, resolved)
miden-lib/src/transaction/inputs.rs (outdated, resolved)
miden-lib/asm/kernels/transaction/api.masm (resolved)
@phklive
Contributor Author

phklive commented Jul 25, 2024

Error: caller == [0,0,0,0]

Error context

This bug was discovered after adding the following check to the get_procedure_info procedure introduced in this PR:

  # check that index < number of procedures contained in the account code
  dup exec.memory::get_num_account_procedures lt assert.err=ERR_PROC_INDEX_OUT_OF_BOUNDS
  # => [index]

Requested here: #763 (comment)

This check asserts that the procedure index being accessed in memory is within bounds, i.e., less than the length of the procedure vector (the number of procedures).
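
The same bounds check can be expressed in Rust as a rough analogue. This is only a sketch: the function mirrors the name of the MASM procedure, but the types and the `ProcError` enum are hypothetical.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum ProcError {
    IndexOutOfBounds { index: usize, num_procedures: usize },
}

/// Returns the `(root, storage_offset)` entry at `index`, rejecting any index
/// that is not strictly less than the number of stored procedures.
fn get_procedure_info(
    procedures: &[([u64; 4], u64)],
    index: usize,
) -> Result<([u64; 4], u64), ProcError> {
    procedures.get(index).copied().ok_or(ProcError::IndexOutOfBounds {
        index,
        num_procedures: procedures.len(),
    })
}
```

With two procedures stored, index 1 succeeds while index 255 is rejected instead of silently reading whatever lies beyond the procedure section.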

This new assert makes the following tests fail:
[Screenshot SCR-20240725-kobg: list of failing tests]

All of these tests initiate a syscall at some point. For example, test_build_recipient_hash creates a note at the end, which initiates the following syscall:

#! Creates a new note and returns the index of the note.
#!
#! Inputs: [tag, aux, note_type, RECIPIENT]
#! Outputs: [note_idx]
#!
#! tag is the tag to be included in the note.
#! aux is the auxiliary metadata to be included in the note.
#! note_type is the storage type of the note
#! RECIPIENT is the recipient of the note.
#! note_idx is the index of the created note.
export.create_note
    syscall.create_note
    # => [note_idx, EMPTY_WORD, 0]

    # clear the padding from the kernel response
    movdn.4 dropw swap drop
    # => [note_idx]
end

This seems to be the common factor in all of the failures.

Error track down

The get_procedure_info procedure is called by authenticate_procedure of the tx kernel:

#! Verifies that the procedure root is part of the account code
#!
#! Stack: [PROC_ROOT]
#! Output: [storage_offset]
#!
#! - PROC_ROOT is the hash of the procedure to authenticate.
#!
#! Panics if
#! - procedure root is not part of the account code.
export.authenticate_procedure
    # load procedure index
    emit.ACCOUNT_PUSH_PROCEDURE_INDEX_EVENT adv_push.1
    # => [index, PROC_ROOT]

    # get procedure info (PROC_ELEMENTS, storage_offset) from memory stored at index
    exec.get_procedure_info
    # => [PROC_ELEMENTS, storage_offset, PROC_ROOT]

    # verify that PROC_ROOT exists in memory at index
    movup.4 movdn.8 assert_eqw.err=ERR_PROC_NOT_PART_OF_ACCOUNT_CODE
    # => [storage_offset]
end

Which is itself called by authenticate_account_origin of the kernel API:

#! Authenticates that the invocation of a kernel procedure originates from the account context.
#!
#! Panics:
#!   - if the invocation of the kernel procedure does not originate from the account context.
#!
#! Stack: [...]
#! Output: [...]
proc.authenticate_account_origin
    # get the hash of the caller
    padw caller
    # => [CALLER, ...]

    # assert that the caller is from the user context
    exec.account::authenticate_procedure
    # => [storage_offset, ...]

    # TODO: use the storage_offset for storage access
    # drop the storage_offset
    drop
    # => [...]
end

As shown above, authenticate_procedure takes a PROC_ROOT as input. authenticate_account_origin supplies it via the caller instruction, which overwrites the top four stack items with the hash of the function that initiated the current SYSCALL.

It seems that all of the above tests execute the caller instruction from the root context, so the returned caller is [0,0,0,0]. Hence the following chain of events:

  1. authenticate_account_origin calls authenticate_procedure passing in as input caller which is [0,0,0,0]
  2. authenticate_procedure emits the ACCOUNT_PUSH_PROCEDURE_INDEX event, which is caught by the TransactionHost in the on_event function:
    TransactionEvent::AccountPushProcedureIndex => {
    self.on_account_push_procedure_index(process)
    },
  3. on_event calls on_account_push_procedure_index, which queries the account_procedure_index_map via get_proc_index using the current process state (whose top stack word is [0,0,0,0] here).
  4. get_proc_index queries the first word of the operand stack and searches for a value in the AccountProcedureIndexMap matching that key.
  5. The returned value is sent back to the AdviceProvider to be added to the operand stack of the VM.

The problem is that we are querying the AccountProcedureIndexMap with [0,0,0,0] (Digest::default()), which does not correspond to a valid procedure index in the map.

There are two implementations of the AccountProcedureIndexMap: a production one and a mock one. The one called during the tests is the mock, which has a few additional lines compared to the classical one:

pub fn get_proc_index<S: ProcessState>(
    &self,
    process: &S,
) -> Result<u8, TransactionKernelError> {
    let proc_root = process.get_stack_word(0).into();
    // mock account method for testing from root context
    // TODO: figure out if we can get rid of this
    if proc_root == Digest::default() {
        return Ok(255);
    }
    self.0
        .get(&proc_root)
        .cloned()
        .ok_or(TransactionKernelError::UnknownAccountProcedure(proc_root))
}

This if clause will hence push 255 on the operand stack:

         if proc_root == Digest::default() { 
             return Ok(255); 
         } 
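
To make the difference between the two lookups concrete, here is a simplified, self-contained sketch. It assumes a plain `HashMap` keyed by a stand-in `Digest` type and a `String` error; the real implementations use `ProcessState` and `TransactionKernelError`, as in the snippet above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real `Digest` type.
type Digest = [u64; 4];

/// Production-style lookup: an unknown root is an error.
fn get_proc_index(map: &HashMap<Digest, u8>, proc_root: Digest) -> Result<u8, String> {
    map.get(&proc_root)
        .copied()
        .ok_or_else(|| format!("unknown account procedure: {proc_root:?}"))
}

/// Mock-style lookup: the all-zero root (the `caller` value seen from the
/// root context) is mapped to index 255 instead of failing.
fn get_proc_index_mock(map: &HashMap<Digest, u8>, proc_root: Digest) -> Result<u8, String> {
    if proc_root == Digest::default() {
        return Ok(255);
    }
    get_proc_index(map, proc_root)
}
```

For the all-zero root, the production-style lookup returns an error while the mock returns index 255, which is the value that later trips the new bounds check.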

Without the index-out-of-bounds check added in get_procedure_info (mentioned at the top of this comment), the procedure continues execution as follows:

# => [255]

push.2 mul exec.memory::get_account_procedures_section_offset add dup push.1 add
# => [1211, 1210]

# Here there are 2 possibilities:
# - There are elements at locations 1210 and 1211 in memory and they are returned
# - There are no elements at locations 1210 and 1211 in memory and 0's are returned
mem_load swap padw movup.4 mem_loadw
# => [0,0,0,0,0]

# Next we check whether the data returned from memory matches the PROC_ROOT (which in this case is [0,0,0,0])
movup.4 movdn.8 assert_eqw.err=ERR_PROC_NOT_PART_OF_ACCOUNT_CODE
# => [0]

Hence the tests would pass, not because the logic is valid but because the caller in the root context is [0,0,0,0] and the values loaded from memory for that procedure are also [0,0,0,0].
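
The accidental pass can be reproduced with a small memory model. This is a sketch under assumptions: memory is modelled as a sparse map in which unwritten addresses read as zeros, and the base address of 700 is inferred from the trace above (2 * 255 + 700 = 1210) rather than taken from the kernel source.

```rust
use std::collections::HashMap;

/// Assumed base address of the procedures section, inferred from the trace
/// above (2 * 255 + 700 = 1210); not taken from the kernel source.
const PROCEDURES_SECTION_OFFSET: u32 = 700;

/// Reads a word from a sparse memory model; unwritten addresses read as zeros.
fn load_word(memory: &HashMap<u32, [u64; 4]>, addr: u32) -> [u64; 4] {
    memory.get(&addr).copied().unwrap_or([0; 4])
}

/// Mirrors the unchecked comparison: load the stored root for `index` and
/// compare it with the root being authenticated.
fn authenticate_unchecked(
    memory: &HashMap<u32, [u64; 4]>,
    index: u32,
    proc_root: [u64; 4],
) -> bool {
    let addr = 2 * index + PROCEDURES_SECTION_OFFSET;
    load_word(memory, addr) == proc_root
}
```

With only one real procedure stored at the base address, authenticating the all-zero root at index 255 still succeeds, because the empty slot at address 1210 reads back as zeros.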

@bobbinth

A few questions:

  • Why do we have two versions of AccountProcedureIndexMap (classical and mock)?
  • Is it normal that the caller here is [0,0,0,0]?
  • From your answer yesterday I understand that it is because it comes from the root context, but shouldn't it error out?

@bobbinth
Contributor

Why do we have two versions of AccountProcedureIndexMap (classical and mock)?

I think the reason was exactly to make these tests work (i.e., to skip the real procedure authentication check).

  • Is it normal that the caller here is [0,0,0,0]?
  • From your answer yesterday I understand that it is because it comes from the root context, but shouldn't it error out?

It should not be possible to execute the caller instruction if we are not in a syscall, but apparently the VM doesn't check for this. Once this check is added to the VM, all instances where we execute caller like in the test in question will start failing.

But as long as there is no check, caller returning [0, 0, 0, 0] when invoked in the root context is expected (i.e., the root context has no caller and therefore the returned values are all zeros).

Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you! I left some more comments inline - but they are either pretty minor or can be addressed in the next PR (when we actually integrate offsets into storage access procedures).

miden-tx/src/host/account_procs.rs (outdated, resolved)
miden-tx/src/host/account_procs.rs (outdated, resolved)
miden-tx/src/testing/account_procs.rs (outdated, resolved)
Comment on lines 396 to 399
# move procedure data from the advice map to the advice stack and then push the number of
# procedures onto the operand stack before storing it in memory
adv.push_mapval adv_push.1 dup exec.memory::set_num_account_procedures
# => [num_procs, CODE_COMMITMENT]

This comment was marked as resolved.

miden-lib/asm/miden/kernels/tx/account.masm (resolved)
miden-lib/asm/miden/kernels/tx/account.masm (resolved)
miden-lib/asm/miden/kernels/tx/account.masm (resolved)
objects/src/accounts/code/mod.rs (resolved)
@phklive phklive requested a review from bobbinth July 29, 2024 13:56
Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you! I left a few more nits inline. Once these are addressed, let's merge.

miden-tx/src/host/mod.rs (outdated, resolved)
miden-tx/src/host/account_procs.rs (outdated, resolved)
miden-tx/src/host/account_procs.rs (outdated, resolved)
miden-tx/src/host/account_procs.rs (outdated, resolved)
miden-tx/src/host/account_procs.rs (outdated, resolved)
miden-lib/asm/miden/kernels/tx/prologue.masm (outdated, resolved)
Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you. I left one more comment inline. Also, there seems to be a merge conflict that needs to be resolved.

miden-tx/src/host/account_procs.rs (outdated, resolved)
@bobbinth bobbinth merged commit b2b6621 into next Jul 30, 2024
13 checks passed
@bobbinth bobbinth deleted the phklive-account-code-refactor branch July 30, 2024 21:41