From 303b23af62902cc7ba94e2af0726007e409139a5 Mon Sep 17 00:00:00 2001 From: Paul Schoenfelder Date: Fri, 6 Sep 2024 02:21:07 -0400 Subject: [PATCH 1/5] fix(driver): incorrect extension for masl output type --- midenc-session/src/outputs.rs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/midenc-session/src/outputs.rs b/midenc-session/src/outputs.rs index 077f48bad..b76e6e578 100644 --- a/midenc-session/src/outputs.rs +++ b/midenc-session/src/outputs.rs @@ -45,7 +45,7 @@ impl OutputType { Self::Hir => "hir", Self::Masm => "masm", Self::Mast => "mast", - Self::Masl => "mast", + Self::Masl => "masl", Self::Masp => "masp", } } From 4ad29566b3c00709c510b3cab0d1874549df13b8 Mon Sep 17 00:00:00 2001 From: Paul Schoenfelder Date: Fri, 6 Sep 2024 02:21:41 -0400 Subject: [PATCH 2/5] fix(driver): ensure mast/masl outputs are emitted on request --- midenc-compile/src/stages/assemble.rs | 2 ++ 1 file changed, 2 insertions(+) diff --git a/midenc-compile/src/stages/assemble.rs b/midenc-compile/src/stages/assemble.rs index 44ea3d192..82d7306ac 100644 --- a/midenc-compile/src/stages/assemble.rs +++ b/midenc-compile/src/stages/assemble.rs @@ -44,6 +44,8 @@ impl Stage for AssembleStage { "successfully assembled mast artifact with digest {}", DisplayHex::new(&mast.digest().as_bytes()) ); + session.emit(OutputMode::Text, &mast).into_diagnostic()?; + session.emit(OutputMode::Binary, &mast).into_diagnostic()?; Ok(Artifact::Assembled(masm::Package::new(mast, &masm_artifact, session))) } Left(masm_artifact) => { From d7ca6b2dc148ec450b373fa5e3d31d3af63c4352 Mon Sep 17 00:00:00 2001 From: Paul Schoenfelder Date: Fri, 6 Sep 2024 02:22:42 -0400 Subject: [PATCH 3/5] docs: revisit/update documentation and guides --- docs/appendix/calling_conventions.md | 54 ++--- docs/appendix/canonabi-adhocabi-mismatch.md | 152 ++++++++---- docs/appendix/known-limitations.md | 255 ++++++++++++++++++++ docs/design/frontends.md | 7 +- docs/design/overview.md | 16 +- 
docs/guides/develop_miden_in_rust.md | 19 +- docs/guides/rust_to_wasm.md | 62 +++-- docs/guides/wasm_to_masm.md | 119 +++++---- docs/index.md | 109 +++++++-- docs/usage/cargo-miden.md | 64 +++-- docs/usage/debugger.md | 223 +++++++++++++++++ docs/usage/midenc.md | 98 ++++++-- midenc-debug/README.md | 218 +---------------- mkdocs.yml | 5 +- 14 files changed, 957 insertions(+), 444 deletions(-) create mode 100644 docs/appendix/known-limitations.md create mode 100644 docs/usage/debugger.md diff --git a/docs/appendix/calling_conventions.md b/docs/appendix/calling_conventions.md index e89d48228..6acc64806 100644 --- a/docs/appendix/calling_conventions.md +++ b/docs/appendix/calling_conventions.md @@ -15,17 +15,17 @@ There are four calling conventions represented in the compiler: - `Kernel`, this is a special calling convention that is used when defining kernel modules in the IR. Functions which are part of the kernel's public API are required to use this convention, and it is not possible to call a function via `syscall` if the callee is not defined with this convention. Because of - the semantics of `syscall`, this convention is highly restrictive. In particular, it is not permitted to + the semantics of `syscall`, this convention is highly restrictive. In particular, it is not permitted to pass pointer arguments, or aggregates containing pointers, as `syscall` involves a context switch, and thus memory in the caller is not accessible to the callee, and vice versa. - `Contract`, this is a special calling convention that is used when defining smart contract functions, i.e. functions that can be `call`'d. The compiler will not permit you to `call` a function if the callee is not - defined with this convention, and functions with this convention cannot be called via `exec`. Like `syscall`, + defined with this convention, and functions with this convention cannot be called via `exec`. 
Like `syscall`, the `call` instruction involves a context switch, however, unlike the `Kernel` convention, the `Contract` convention is allowed to have types in its signature that are/contain pointers, with certain caveats around those pointers. - - + + All four conventions above are based on the System V C ABI, tailored to the Miden VM. The only exception is `Fast`, which may modify the ABI arbitrarily as it sees fit, and makes no guarantees about what modifications, if any, it will make. @@ -77,19 +77,20 @@ the section on the memory model below for more details. [^8]: An `enum` is `i32` if all members of the enumeration can be represented by an `int`/`unsigned int`, otherwise it uses i64. -> [!NOTE] -> The compiler does not support scalars larger than one word (128 bits) at this time. As a result, anything that is -> larger than that must be allocated in linear memory, or in an automatic allocation (function-local memory), and passed -> around by reference. +!!! note -The native scalar type for the Miden VM is a "field element", specifically a 64-bit value representing an integer + The compiler does not support scalars larger than one word (128 bits) at this time. As a result, anything that is + larger than that must be allocated in linear memory, or in an automatic allocation (function-local memory), and passed + around by reference. + +The native scalar type for the Miden VM is a "field element", specifically a 64-bit value representing an integer in the "Goldilocks" field, i.e. `0..(2^64-2^32+1)`. A number of instructions in the VM operate on field elements directly. -However, the native integral/pointer type, i.e. a "machine word", is actually `u32`. This is because a field element +However, the native integral/pointer type, i.e. a "machine word", is actually `u32`. This is because a field element can fully represent 32-bit integers, but not the full 64-bit integer range. 
Values of `u32` type are valid field element -values, and can be used anywhere that a field element is expected (barring other constraints). +values, and can be used anywhere that a field element is expected (barring other constraints). Miden also has the notion of a "word", not to be confused with a "machine word" (by which we mean the native integral -type used to represent pointers), which corresponds to a set of 4 field elements. Words are commonly used in Miden, +type used to represent pointers), which corresponds to a set of 4 field elements. Words are commonly used in Miden, particularly to represent hashes, and a number of VM instructions operate on word-sized operands. As an aside, 128-bit integer values are represented using a word, or two 64-bit limbs (each limb consisting of two 32-bit limbs). @@ -177,7 +178,7 @@ emulation will come from values which cross an element or word boundary. # Function Calls -This section describes the conventions followed when executing a function call via `exec`, including how arguments are passed on the +This section describes the conventions followed when executing a function call via `exec`, including how arguments are passed on the operand stack, stack frames, etc. Later, we'll cover the differences when executing calls via `call` or `syscall`. ## Locals and the stack frame @@ -205,11 +206,11 @@ those are described below in the section covering the operand stack. Miden is a [Harvard](https://en.wikipedia.org/wiki/Harvard_architecture) architecture; as such, code and data are not in the same memory space. More precisely, in Miden, code is only addressable via the hash of the MAST root of that code, which must correspond to code that has been loaded into the VM. The hash of the MAST root of a function can be used to call that function both directly and indirectly, but -that is the only action you can take with it. 
Code can not be generated and called on the fly, and it is not stored anywhere that is -accessible to code that is currently executing. +that is the only action you can take with it. Code can not be generated and called on the fly, and it is not stored anywhere that is +accessible to code that is currently executing. One consequence of this is that there are no return addresses or instruction pointers visible to executing code. The runtime call stack is -managed by the VM itself, and is not exposed to executing code in any way. This means that address-taken local C variables need to be on a +managed by the VM itself, and is not exposed to executing code in any way. This means that address-taken local C variables need to be on a separate stack in linear memory (which we refer to as a "shadow stack"). Not all functions necessarily require a frame in the shadow stack, as it cannot be used to perform unwinding, so only functions which have locals require a frame. @@ -218,7 +219,7 @@ number of locals, will be automatically allocated sufficient space for those loc you use the `locaddr` instruction to get the actual address of a local, that address can be passed as an argument to callees (within the constraints of the callee's calling convention). -Languages with more elaborate requirements with regard to the stack will need to implement their own shadow stack, and emit code in function +Languages with more elaborate requirements with regard to the stack will need to implement their own shadow stack, and emit code in function prologues/epilogues to manage it. ### The operand stack @@ -226,7 +227,7 @@ prologues/epilogues to manage it. The Miden virtual machine is a stack machine, not a register machine. 
Rather than having a fixed set of registers that are used to store and manipulate scalar values, the Miden VM has the operand stack, which can hold an arbitrary number of operands (where each operand is a single field element), of which the first 16 can be directly manipulated using special stack instructions. The operand -stack is, as the name implies, a last-in/first-out data structure. +stack is, as the name implies, a last-in/first-out data structure. The following are basic rules all conventions are expected to follow with regard to the operand stack: @@ -249,7 +250,7 @@ then one of the following must happen: Miden Abstract Syntax Trees (MASTs) do not have any notion of functions, and as such are not aware of parameters, return values, etc. For this document, that's not a useful level of abstraction to examine. Even a step higher, Miden Assembly (MASM) has functions (procedures -in MASM parlance), but no function signature, i.e. given a MASM procedure, there is no way to know how many arguments it expects, how +in MASM parlance), but no function signature, i.e. given a MASM procedure, there is no way to know how many arguments it expects, how many values it returns, let alone the types of arguments/return values. Instead, we're going to specify calling conventions in terms of Miden IR, which has a fairly expressive type system more or less equivalent to that of LLVM, and how that translates to Miden primitives. @@ -276,18 +277,18 @@ unions, and arrays) contains just a single scalar value and is not specified to have greater than natural alignment. The compiler will automatically generate code that follows these rules, but if emitting MASM from your own backend, it is necessary to do so manually. 
-For example, a function whose signature specifies that it returns a non-scalar struct by value, must actually be written such that it expects to receive -a pointer to memory allocated by the caller sufficient to hold the return value, as the first parameter of the function (i.e. the parameter is prepended +For example, a function whose signature specifies that it returns a non-scalar struct by value, must actually be written such that it expects to receive +a pointer to memory allocated by the caller sufficient to hold the return value, as the first parameter of the function (i.e. the parameter is prepended to the parameter list). When returning, the function must write the return value to that pointer, rather than returning it on the operand stack. In this example, the return value is returned indirectly (by reference). -A universal rule is that the arguments are passed in reverse order, i.e. the first argument in the parameter list of a function will be on top of the -operand stack. This is different than many Miden instructions which seemingly use the opposite convention, e.g. `add`, which expects the right-hand -operand on top of the stack, so `a + b` is represented like `push a, push b, add`. If we were to implement `add` as a function, it would instead be -`push b, push a, exec.add`. The rationale behind this is that, in general, the more frequently used arguments appear earlier in the parameter list, +A universal rule is that the arguments are passed in reverse order, i.e. the first argument in the parameter list of a function will be on top of the +operand stack. This is different than many Miden instructions which seemingly use the opposite convention, e.g. `add`, which expects the right-hand +operand on top of the stack, so `a + b` is represented like `push a, push b, add`. If we were to implement `add` as a function, it would instead be +`push b, push a, exec.add`. 
The rationale behind this is that, in general, the more frequently used arguments appear earlier in the parameter list, and thus we want those closer to the top of the operand stack to reduce the amount of stack manipulation we need to do. -Arguments/return values are laid out on the operand stack just like they would be as if you had just loaded it from memory, so all arguments are aligned, +Arguments/return values are laid out on the operand stack just like they would be as if you had just loaded it from memory, so all arguments are aligned, but may span multiple operands on the operand stack as necessary based on the size of the type (i.e. a struct type that contains a `u32` and a `i1` field would require two operands to represent). If the maximum number of operands allowed for the call is reached, any remaining arguments must be spilled to the caller's stack frame, or to the advice provider. The former is used in the case of `exec`/`dynexec`, while the latter is used for `call` @@ -295,4 +296,3 @@ and `syscall`, as caller memory is not accessible to the callee with those instr While ostensibly 16 elements is the maximum number of operands on the operand stack that can represent function arguments, due to the way `dynexec`/`dyncall` work, it is actually limited to 12 elements, because at least 4 must be free to hold the hash of the function being indirectly called. - diff --git a/docs/appendix/canonabi-adhocabi-mismatch.md b/docs/appendix/canonabi-adhocabi-mismatch.md index b097b6a98..efd207e9f 100644 --- a/docs/appendix/canonabi-adhocabi-mismatch.md +++ b/docs/appendix/canonabi-adhocabi-mismatch.md @@ -1,31 +1,74 @@ -TL;DR: The compiler will recognize the functions with a mismatch between the canonical ABI and the tx kernel ad-hoc ABI and generate an adapter function that will call the tx kernel function and convert function arguments and result. For most TX kernel functions, the adapter function can be generated automatically. 
See below for the functions that require manual adapter functions. +# Canonical ABI vs Miden ABI Incompatibility -# Canonical ABI vs Miden (tx kernel) ABI mismatch and how to resolve it. +This document describes an issue that arises when trying to map between the ad-hoc calling convention/ABI +used by various Miden Assembly procedures, such as those comprising the transaction kernel, and +the "canonical" ABI(s) representable in Rust. It proposes a solution to this problem in the form +of _adapter functions_, where the details of a given adapter are one of a closed set of known +ABI _transformation strategies_. -From the analisys of all the functions in the tx kernel API the Canonical ABI rule that mostly causes the mismatch between the Canonical ABI and the Miden ABI is that anything larger than 8 bytes (i64) is returned via a pointer passed as an argument. +## Summary -We want to recognize the functions with a mismatch between the Canonical ABI and the Miden ABI and make the compiler generate an adapter function that will call the tx kernel function and convert function arguments and result. +The gist of the problem is that in Miden, the size and number of procedure results are only constrained +by the maximum addressable operand stack depth. In most programming languages, particularly those in +which interop is typically performed using some variant of the C ABI (commonly the one described +in the System V specification), the number of results is almost always limited to a single result, +and the size of the result type is almost always limited to the size of a single machine word, in +some cases two. On these platforms, procedure results of greater arity or size are typically handled +by reserving space in the caller's stack frame, and implicitly prepending the parameter list of the +callee with an extra parameter: a pointer to the memory allocated for the return value. The callee
The callee +will directly write the return value via this pointer, instead of returning a value in a register. -For the complete list of the tx kernel functions in WIT format, see the [miden.wit](https://github.com/0xPolygonMiden/compiler/blob/18ead77410b27d97e96c96d36b573e289323f737/tests/rust-apps-wasm/sdk/sdk/wit/miden.wit) -For most TX kernel functions, the adapter function can be generated automatically using the pattern recognition and adapter functions below. +In the case of Rust, this means that attempting to represent a procedure that returns multiple values, +or returns a larger-than-machine-word type, such as `Word`, will trigger the implicit transformation +described above, as this is allowed by the standard Rust calling conventions. Since various Miden +procedures that are part of the standard library and the transaction kernel are affected by this, +the question becomes "how do we define bindings for these procedures in Rust?". -## Required changes in other parts of the compiler +The solution is to have the compiler emit glue code that closes the gap between the two ABIs. It +does so by generating adapter functions, which wrap functions that have an ABI unrepresentable in +Rust, and orchestrate lifting/lowering arguments and results between the adapter and the "real" +function. -To make compiler aware of tx kernel function signatures they will be passed along the MAST hash root for every import in the Wasm component. +When type signatures are available for all Miden Assembly procedures, we can completely automate +this process. For now, we will require a manually curated list of known procedures, their signatures, +and the strategy used to "adapt" those procedures for binding in Rust. -## Adapters generation +## Background -The compiler will analyze every component import to recognize the Miden ABI pattern and generate an adapter function if needed. This can be done in a transformation pass or as part of the MASM code generation. 
+After analyzing all of the functions in the transaction kernel API, the most common cause of a mismatch +between Miden and Rust ABIs is implicit "sret" parameters, i.e. the transformation mentioned +above which inserts an implicit pointer to the caller's stack frame for the callee to write the return +value to, rather than doing so in a register (or in our case, on the operand stack). This seems to +happen for any type that is larger than 8 bytes (i64). -## Miden ABI pattern recognition +!!! tip -The following pseudo-code can be used to recognize the Miden ABI pattern: + For a complete list of the transaction kernel functions, in WIT format, see + [miden.wit](https://github.com/0xPolygonMiden/compiler/blob/main/tests/rust-apps-wasm/wit-sdk/sdk/wit/miden.wit). + +For most transaction kernel functions, the adapter function can be generated automatically using the +pattern recognition and adapter functions described below. + +### Prerequisites + +* The compiler must know the type signature for any function we wish to apply the adapter strategy to + +### Implementation + +The compiler will analyze every component import to determine if that import requires an adapter, +as determined by matching against a predefined set of patterns. The adapter generation will take +place in the frontend, as it has access to all of the needed information, and ensures that we do +not have any transformations or analyses that make decisions on the un-adapted procedure. + +The following pseudo-code can be used to recognize the various Miden ABI patterns: ```rust pub enum MidenAbiPattern { + /// Calling this procedure will require an sret parameter on the Rust side, so + /// we need to emit an adapter that will lift/lower calls according to that + /// strategy. ReturnViaPointer, - /// The Wasm core function type is the same as the tx kernel ad-hoc signature - /// The tx kernel function can be called directly without any modifications.
+ /// The underlying procedure is fully representable in Rust, and requires no adaptation. NoAdapterNeeded, } @@ -65,15 +108,13 @@ pub fn recognize_miden_abi_pattern( } ``` -## Adapter function code generation - -The following pseudo-code can be used to generate the adapter function: +The following pseudo-code can then be used to generate the adapter function: ```rust pub fn generate_adapter(recognition: MidenAbiPatternRecognition) { match recognition.pattern { Some(pattern) => generate_adapter( - pattern, + pattern, recognition.component_function, recognition.wasm_core_function, recognition.tx_kernel_function @@ -93,28 +134,37 @@ pub fn use_manual_adapter(...) { } ``` -The manual adapter library is a collection of adapter functions that are used when the compiler can't generate an adapter function automatically so its expected to be provided. The manual adapter library is a part of the Miden compiler. - +The manual adapter library is a collection of adapter functions that are used when the compiler +can't generate an adapter function automatically, so it's expected to be provided. The manual adapter +library is a part of the Miden compiler. It is not anticipated that we will have many, or any, of +these; however, in the near term we are going to manually map procedures to their adapter strategies, +as we have not yet automated the pattern recognition step. ### Return-via-pointer Adapter -The return value is expected to be returned by storing its flattened representation in a pointer passed as an argument. +The return value is expected to be returned by storing its flattened representation in a pointer +passed as an argument. -Recognize this Miden ABI pattern by looking at the Wasm component function type. If the return value is bigger than 64 bits, expect the last argument in the Wasm core(HIR) signature to be `i32` (a pointer). +Recognize this Miden ABI pattern by looking at the Wasm component function type.
If the return value +is bigger than 64 bits, expect the last argument in the Wasm core(HIR) signature to be `i32` (a pointer). -The adapter function calls the tx kernel function and stores the result in the provided pointer(the last argument of the wasm core function). +The adapter function calls the tx kernel function and stores the result in the provided pointer (the +last argument of the Wasm core function). + +Here is the pseudo-code for generating the adapter function for the return-via-pointer Miden ABI +pattern: -Here is the pseudo-code for generating the adapter function for the Return-via-pointer Miden ABI pattern: ```rust - let ptr = wasm_core_function.params.last(); - let adapter_function = FunctionBuilder::new(wasm_core_function.clone()); - let tx_kernel_function_params = wasm_core_function.params.drop_last(); - let tx_kernel_func_val = adapter_function.call(tx_kernel_function, tx_kernel_function_params); - adapter_function.store(tx_kernel_func_val, ptr); - adapter_function.build(); +let ptr = wasm_core_function.params.last(); +let adapter_function = FunctionBuilder::new(wasm_core_function.clone()); +let tx_kernel_function_params = wasm_core_function.params.drop_last(); +let tx_kernel_func_val = adapter_function.call(tx_kernel_function, tx_kernel_function_params); +adapter_function.store(tx_kernel_func_val, ptr); +adapter_function.build(); ``` Here is how the adapter might look like in a pseudo-code for the `add_asset` function: + ``` /// Takes an Asset as an argument and returns a new Asset func wasm_core_add_asset(v0: f64, v1: f64, v2: f64, v3: f64, ptr: i32) { @@ -124,7 +174,7 @@ func wasm_core_add_asset(v0: f64, v1: f64, v2: f64, v3: f64, ptr: i32) { } ``` -### No-adapter-needed +### No-op Adapter No adapter is needed. The Wasm core function type is the same as the tx kernel ad-hoc signature. 
@@ -135,12 +185,17 @@ For example, the `get_id` function falls under this Miden ABI pattern and its ca ## Transaction kernel functions that require manual adapter functions: -### `get_assets` +### `get_assets` -`get_assets:func() -> list` in the `note` interface is the only function that requires attention. In Canonical ABI, any function that returns a dynamic list of items needs to allocate memory in the caller's module due to the shared-nothing nature of the Wasm component model. For this case, a `realloc` function is passed as a part of lift/lower Canonical ABI options for the caller to allocate memory in the caller's module. +`get_assets:func() -> list` in the `note` interface is the only function that requires attention. +In Canonical ABI, any function that returns a dynamic list of items needs to allocate memory in the caller's +module due to the shared-nothing nature of the Wasm component model. For this case, a `realloc` function +is passed as a part of lift/lower Canonical ABI options for the caller to allocate memory in the caller's +module. Here are the signatures of the `get_assets` function in the WIT, core Wasm, and the tx kernel ad-hoc ABI: Comment from the `miden-base` + ``` #! Writes the assets of the currently executing note into memory starting at the specified address. #! @@ -157,16 +212,24 @@ Wasm component function type: Wasm core signature: `wasm_core_get_assets(i32) -> ()` -If we add a new `get_assets_count: func() -> u32;` function to the tx kernel and add the assets count parameter to the `get_assets` function (`get_assets: func(assets_count: u32) -> list;`) we should have everything we need to manually write the adapter function for the `get_assets` function. 
+If we add a new `get_assets_count: func() -> u32;` function to the tx kernel and add the assets count +parameter to the `get_assets` function (`get_assets: func(assets_count: u32) -> list;`) +we should have everything we need to manually write the adapter function for the `get_assets` +function. -The list is expected to be returned by storing the pointer to its first item in a `ptr` pointer passed as an argument and item count at `ptr + 4 bytes` address (`ptr` points to two pointers). +The list is expected to be returned by storing the pointer to its first item in a `ptr` pointer +passed as an argument and the item count at the `ptr + 4 bytes` address (`ptr` points to two pointers). -We could try to recognize this Miden ABI pattern by looking at the Wasm component function type. If the return value is a list, expect the last argument in the Wasm core(HIR) signature to be `i32` (a pointer). The problem is recognizing the list count parameter in the Wasm core(HIR) signature. +We could try to recognize this Miden ABI pattern by looking at the Wasm component function type. If +the return value is a list, expect the last argument in the Wasm core(HIR) signature to be `i32` +(a pointer). The problem is recognizing the list count parameter in the Wasm core(HIR) signature. -The adapter function calls allocates `asset_count * item_size` memory via the `realloc` call and passes the pointer to the newly allocated memory to the tx kernel function. +The adapter function allocates `asset_count * item_size` memory via the `realloc` call and +passes the pointer to the newly allocated memory to the tx kernel function.
Here is how the adapter function might look like in a pseudo-code for the `get_assets` function: -``` + +```rust func wasm_core_get_assets(asset_count: u32, ptr_ptr: i32) { mem_size = asset_count * item_size; ptr = realloc(mem_size); @@ -174,17 +237,23 @@ func wasm_core_get_assets(asset_count: u32, ptr_ptr: i32) { assert(actual_asset_count == asset_count); store ptr in ptr_ptr; store account_count in ptr_ptr + 4; - } ``` -**Since the `get_assets` tx kernel function in the current form can trash the provided memory if the actual assets count differs from the returned by `get_assets_count`, we can introduce the asset count parameter to the `get_assets` tx kernel function and check that it the same as the actual assets count written to memory.** +!!! note + Since the `get_assets` tx kernel function in the current form can trash the provided memory if + the actual assets count differs from the one returned by `get_assets_count`, we can introduce the + asset count parameter to the `get_assets` tx kernel function and check that it is the same as the + actual assets count written to memory. -## The example of some functions signatures + +## The example of some functions signatures ### `add_asset` (return-via-pointer Miden ABI pattern) + Comment from the `miden-base` + ``` #! Add the specified asset to the vault. #! @@ -214,6 +283,7 @@ Tx kernel ad-hoc signature: ### `get_id` (no-adapter-needed Miden ABI pattern) + Comment from the `miden-base` ``` #! Returns the account id. diff --git a/docs/appendix/known-limitations.md b/docs/appendix/known-limitations.md new file mode 100644 index 000000000..b21ae34db --- /dev/null +++ b/docs/appendix/known-limitations.md @@ -0,0 +1,255 @@ +# Known Limitations + +!!! tip + + See the [issue tracker](https://github.com/0xpolygonmiden/compiler/issues) for information + on known bugs. This document focuses on missing/incomplete features, rather than bugs.
+ +The compiler is still in its early stages of development, so there are various features that are +unimplemented, or only partially implemented, and the test suite is still limited in scope, so +we are still finding bugs on a regular basis. We are rapidly improving this situation, but it is +important to be aware of this when using the compiler. + +The features discussed below are broken up into sections, to make them easier to navigate and +reference. + +## Rust Language Support + +### Floating Point Types + +- Status: **Unsupported** +- Tracking Issue: N/A +- Release Milestone: N/A + +In order to represent `Felt` "natively" in Rust, we were forced to piggy-back on the `f32` type, +which is propagated through to WebAssembly, and allows us to handle those values specially. + +As a result, floating-point types in Rust are not supported at all. Any attempt to use them will +result in a compilation error. We considered this a fair design tradeoff, as floating point math +is unused/rare in the context in which Miden is used, in comparison to fixed-point or field +arithmetic. In addition, implementing floating-point operations in software on the Miden VM would +be extraordinarily expensive, which generally works against the purpose for using floats in the +first place. + +At this point in time, we have no plans to support floats, but this may change if we are able to +find a better/more natural representation for `Felt` in WebAssembly. + + +### Function Call Indirection + +- Status: **Unimplemented** +- Tracking Issue: [#32](https://github.com/0xPolygonMiden/compiler/issues/32) +- Release Milestone: [Beta 1](https://github.com/0xPolygonMiden/compiler/milestone/4) + +This feature corresponds to `call_indirect` in WebAssembly, and is associated with Rust features +such as trait objects (which use indirection to call trait methods), and closures. 
Note that the +Rust compiler is able to erase the indirection associated with certain abstractions statically +in some cases, as shown below. If Rust is unable to statically resolve all call targets, then `midenc` +will raise an error when it encounters any use of `call_indirect`. + +!!! warning + + The following examples rely on `rustc`/LLVM inlining enough code to be able to convert indirect + calls to direct calls. This may require you to enable link-time optimization with `lto = "fat"` + and compile all of the code in the crate together with `codegen-units = 1`, in order to maximize + the amount of inlining that can occur. Even then, it may not be possible to remove some forms of + indirection, in which case you will need to find another workaround. + +#### Iterator Lowered to Loop + +```rust +pub fn is_zeroed(bytes: &[u8; 32]) -> bool { + // Rust is able to convert this to a loop, erasing the closure completely + bytes.iter().copied().all(|b| b == 0) +} +``` + +#### Monomorphization + Inlining + +```rust +pub fn call<F, T>(fun: F) -> T +where + F: Fn() -> T, +{ + fun() +} + +#[inline(never)] +pub fn foo() -> bool { true } + +fn main() { + // Rust is able to inline the body of `call` after monomorphization, which results in + // the call to `foo` being resolved statically. + call(foo); +} +``` + +#### Inlined Trait Impl + +```rust +pub trait Foo { + fn is_foo(&self) -> bool; +} + +impl Foo for u32 { + #[inline(never)] + fn is_foo(&self) -> bool { true } +} + +fn has_foo(items: &[&dyn Foo]) -> bool { + items.iter().any(|item| item.is_foo()) +} + +fn main() -> u32 { + // Rust inlines `has_foo`, converts the iterator chain to a loop, and is able to realize + // that the `dyn Foo` items are actually `u32`, and resolves the call to `is_foo` to + // `<u32 as Foo>::is_foo`.
+ let foo: &dyn Foo = &u32::MAX as &dyn Foo; + has_foo(&[foo]) as u32 +} +``` + +### Miden SDK + +- Status: **Incomplete** +- Tracking Issue: [#159](https://github.com/0xPolygonMiden/compiler/issues/159) and [#158](https://github.com/0xPolygonMiden/compiler/issues/158) +- Release Milestone: [Beta 1](https://github.com/0xPolygonMiden/compiler/milestone/4) + +The Miden SDK for Rust is a Rust crate that provides the implementation of native Miden types, as +well as bindings to the Miden standard library and transaction kernel APIs. + +Currently, only a very limited subset of the API surface has had bindings implemented. This means +that there is a fair amount of native Miden functionality that is not yet available from Rust. We +will be expanding the SDK rapidly over the next few weeks and months, but for the time being, if +you encounter a missing API that you need, let us know, so we can ensure it is prioritized above +APIs that are used less often. + +### Rust/Miden FFI (Foreign Function Interface) and Interop + +- Status: **Internal Use Only** +- Tracking Issue: [#304](https://github.com/0xPolygonMiden/compiler/issues/304) +- Release Milestone: TBD + +While the compiler has functionality to link against native Miden Assembly libraries, binding +against procedures exported from those libraries from Rust can require glue code to be emitted +by the compiler in some cases, and the set of procedures for which this is done is currently +restricted to a hardcoded whitelist of known Miden procedures. + +This affects any procedure which returns a type larger than `u32` (excluding `Felt`, which for +this purpose has the same size). For example, returning a Miden `Word` from a procedure, a common +return type, is not compatible with Rust's ABI - Rust will attempt to generate code which allocates +stack space in the caller, which it expects the callee to write to, inserting a new parameter at +the start of the parameter list, and expecting nothing to be returned by value.
The compiler handles +situations like these using a set of ABI "transformation strategies", which lift/lower differences +between the Rust and Miden ABIs at call boundaries. + +To expose the FFI machinery for use with any Miden procedure, we need type signatures for those +procedures at a minimum, and in some cases we may require details of the calling convention/ABI. +This metadata does not currently exist, but is on the roadmap for inclusion into Miden Assembly +and Miden packaging. Once present, we can open up the FFI for general use. + +## Core Miden Functionality + +### Dynamic Procedure Invocation + +- Status: **Unimplemented** +- Tracking Issue: [#32](https://github.com/0xPolygonMiden/compiler/issues/32) +- Release Milestone: [Beta 1](https://github.com/0xPolygonMiden/compiler/milestone/4) + +This is a dependency of [Function Call Indirection](#function-call-indirection) described above, +and is the mechanism by which we can perform indirect calls in Miden. In order to implement support +for indirect calls in the Wasm frontend, we need underlying support for `dynexec`, which is not yet +implemented. + +This feature adds support for lowering indirect calls to `dynexec` or `dyncall` instructions, +depending on the ABI of the callee. `dyncall` has an additional dependency on support for +[Cross-Context Procedure Invocation](#cross-context-procedure-invocation). + +A known issue with this feature is that `dyn(exec|call)` consumes a word on the operand stack +for the hash of the callee being invoked, but this word _remains_ on the stack when entering the +callee, which has the effect of requiring procedures to have a different ABI depending on whether +they expect to be dynamically-invoked or not. + +Our solution to that issue is to generate stubs which are used as the target of `dyn(exec|call)`, +the bodies of which drop the callee hash, fix up the operand stack as necessary, and then use a +simple `exec` or `call` to invoke the "real" callee.
We will emit a single stub for every function +which has its "address" taken, and use the hash of the stub in place of the actual callee hash. + +### Cross-Context Procedure Invocation + +- Status: **Unimplemented** +- Tracking Issue: [#303](https://github.com/0xPolygonMiden/compiler/issues/303) +- Release Milestone: [Beta 2](https://github.com/0xPolygonMiden/compiler/milestone/5) + +This is required in order to support representing Miden accounts and note scripts in Rust, and +compiling them to Miden Assembly. + +Currently, you can write code in Rust that is very close to how accounts and note scripts will +look in the language, but it is not possible to actually implement either of those in Rust +today. The reasons for this are covered in depth in the tracking issue linked above, but to +briefly summarize, the primary issue has to do with the fact that Rust programs are compiled +for a "shared-everything" environment, i.e. you can pass references to memory from caller to +callee, write to caller memory from the callee, etc. In Miden, however, contexts are "shared-nothing" +units of isolation, and thus cross-context operations, such as performing a `call` from a note script +to a method on an account, are not compatible with the usual calling conventions used by Rust and +LLVM. + +The solution to this relies on compiling the Rust code for the `wasm32-wasip2` target, which emits +a new kind of WebAssembly module, known as a _component_. These components adhere to the rules of +the [WebAssembly Component Model](https://component-model.bytecodealliance.org/). Of primary +interest to us is the fact that components in this model are "shared-nothing", and the ABI used to +communicate across component boundaries is specially designed to enforce shared-nothing semantics +on caller and callee.
In addition to compiling for a specific Wasm target, we also rely on some +additional tooling for describing component interfaces, types, and to generate Rust bindings for +those descriptions, to ensure that calls across the boundary remain opaque, even to the linker, +which ensures that the assumptions of the caller and callee with regard to what address space they +operate in are preserved (i.e. a callee can never be inlined into the caller, and thus end up +executing in the caller's context rather than the expected callee context). + +This is one of our top priorities, as it is critical to being able to use Rust to compile code for +the Miden rollup, but it is also the most complex feature on our roadmap, hence why it is scheduled +for our Beta 2 milestone, rather than Beta 1 (the next release), as it depends on multiple other +subfeatures being implemented first. + +## Packaging + +### Package Format + +- Status: **Experimental** +- Tracking Issue: [#121](https://github.com/0xPolygonMiden/compiler/issues/121) +- Release Milestone: [Beta 1](https://github.com/0xPolygonMiden/compiler/milestone/4) + +This feature represents the ability to compile and distribute a single artifact that contains +the compiled MAST, and all required and optional metadata to make linking against, and executing +packages as convenient as a dynamic library or executable. 
+ +The compiler currently produces, by default, an experimental implementation of a package format +that meets the minimum requirements to support libraries and programs compiled from Rust: + +- Name and semantic version information +- Content digest +- The compiled MAST and metadata about the procedures exported from it +- Read-only data segments and their hashes (if needed by the program, used to load data into the +advice provider when a program is loaded, and to write those segments into linear memory when the +program starts) +- Dependency information (optional, specifies what libraries were linked against during compilation) +- Debug information (optional) + +However, this package format is not yet understood by the Miden VM itself. This means you cannot, +currently, compile a package and then run it using `miden run` directly. Instead, you can use +`midenc debug` to load and run code from a package, as the interactive debugger has native support +for it. See [Debugging Programs](../usage/debugger.md) for more information on how to use the +debugger. + +!!! note + + In the next patch release, we expect to implement a `midenc run` command that simply executes + a program without attaching the debugger, which will largely resemble the eventual `miden run` + functionality. Once the package format is stabilized, using `midenc run` will no longer be + necessary. + +While it is possible to emit raw MAST from `midenc`, rather than the experimental package format, +the resulting artifact cannot be run without some fragile and error-prone manual setup, in order +to ensure that the advice provider is correctly initialized with any read-only data segments. For +now, it is recommended that you use the `midenc` tooling for testing programs, until the format +is stabilized. 
diff --git a/docs/design/frontends.md b/docs/design/frontends.md index ce1a29882..30d624bbd 100644 --- a/docs/design/frontends.md +++ b/docs/design/frontends.md @@ -1,7 +1,8 @@ -# Compiler frontends +# Supported Frontends -## Wasm frontend +## WebAssembly (Wasm) TODO -For the list of the unsupported Wasm core types, instructions and features, see [Wasm frontend](https://github.com/0xPolygonMiden/compiler/frontend-wasm/README.md). \ No newline at end of file +For the list of the unsupported Wasm core types, instructions and features, see the +[README](https://github.com/0xPolygonMiden/compiler/frontend-wasm/README.md). diff --git a/docs/design/overview.md b/docs/design/overview.md index b7bc73b4f..95fb3cbbf 100644 --- a/docs/design/overview.md +++ b/docs/design/overview.md @@ -1,4 +1,18 @@ # Compiler Architecture -TODO +This is an index of various design documents for the compiler and its components. Some of these +are planned topics, and some have documentation that hasn't been polished up yet. We'll slowly +start to flesh out the documentation in this section as the compiler matures. +* Driver +* [Frontends](frontends.md) +* Intermediate Representation (HIR) +* Data Layout +* Inline Assembly +* Analysis +* Rewrite Passes +* Code Generation + * Instruction Scheduling + * Instruction Selection + * Operand Stack Management +* Packaging diff --git a/docs/guides/develop_miden_in_rust.md b/docs/guides/develop_miden_in_rust.md index abdf3a088..2c2524453 100644 --- a/docs/guides/develop_miden_in_rust.md +++ b/docs/guides/develop_miden_in_rust.md @@ -1,6 +1,8 @@ # Developing Miden Programs In Rust -This chapter will walk through how to develop Miden programs in Rust using the standard library provided by the `miden-stdlib-sys` crate (see the [README](https://github.com/0xPolygonMiden/compiler/sdk/stdlib-sys/README.md)). 
+This chapter will walk through how to develop Miden programs in Rust using the standard library +provided by the `miden-stdlib-sys` crate (see the +[README](https://github.com/0xPolygonMiden/compiler/sdk/stdlib-sys/README.md)). ## Getting Started @@ -12,9 +14,11 @@ use miden_stdlib_sys::*; ## Using `Felt` (field element) type -The `Felt` type is a field element type that is used to represent the field element values of the Miden VM. +The `Felt` type is a field element type that is used to represent the field element values of the +Miden VM. -To initialize a `Felt` value from an integer constant checking the range at compile time, use the `felt!` macro: +To initialize a `Felt` value from an integer constant, checking the range at compile time, use the +`felt!` macro: ```rust let a = felt!(42); @@ -26,8 +30,11 @@ Otherwise, use the `Felt::new` constructor: let a = Felt::new(some_integer_var).unwrap(); ``` -The constructor returns an error if the value is not a valid field element, e.g. if it is not in the range `0..=M` where `M` is the modulus of the field (2^64 - 2^32 + 1). +The constructor returns an error if the value is not a valid field element, i.e. if it is not in the +range `0..M` where `M` is the modulus of the field (2^64 - 2^32 + 1). -The `Felt` type implements the standard arithmetic operations, e.g. addition, subtraction, multiplication, division, etc. which are accessible through the standard Rust operators `+`, `-`, `*`, `/`, etc. All arithmetic operations are wrapping, i.e. performed modulo `M`. +The `Felt` type implements the standard arithmetic operations, e.g. addition, subtraction, +multiplication, division, etc. which are accessible through the standard Rust operators `+`, `-`, +`*`, `/`, etc. All arithmetic operations are wrapping, i.e. performed modulo `M`. -TODO: Add examples of using operations on `Felt` type and available functions (`assert*`, etc.).
\ No newline at end of file +TODO: Add examples of using operations on `Felt` type and available functions (`assert*`, etc.). diff --git a/docs/guides/rust_to_wasm.md b/docs/guides/rust_to_wasm.md index c1309ce75..bafa054dc 100644 --- a/docs/guides/rust_to_wasm.md +++ b/docs/guides/rust_to_wasm.md @@ -14,15 +14,14 @@ Start by creating a new library crate: cargo new --lib wasm-fib && cd wasm-fib -To compile to WebAssembly, you must have the appropriate Rust toolchain installed, and we -will also need additional Cargo nightly features to build for Miden, so let's add a toolchain -file to our project root so that `rustup` and `cargo` will know what we need, and use them by -default: +To compile to WebAssembly, you must have the appropriate Rust toolchain installed, so let's add +a toolchain file to our project root so that `rustup` and `cargo` will know what we need, and use +them by default: cat < rust-toolchain.toml [toolchain] - channel = "nightly" - targets = ["wasm32-unknown-unknown"] + channel = "stable" + targets = ["wasm32-wasip1"] EOF Next, edit the `Cargo.toml` file as follows: @@ -42,17 +41,17 @@ crate-type = ["cdylib"] # Use a tiny allocator in place of the default one, if we want # to make use of types in the `alloc` crate, e.g. String. We # don't need that now, but it's good information to have in hand. -#wee_alloc = "0.4" +#miden-sdk-alloc = "0.0.5" # When we build for Wasm, we'll use the release profile [profile.release] - # Explicitly disable panic infrastructure on Wasm, as # there is no proper support for them anyway, and it # ensures that panics do not pull in a bunch of standard # library code unintentionally panic = "abort" - +# Enable debug information so that we get useful debugging output +debug = true # Optimize the output for size opt-level = "z" ``` @@ -63,22 +62,18 @@ going to benefit from less code, even if conventionally that code would be less to the difference in proving time accumulated due to extra instructions. 
That said, there are no hard and fast rules, but these defaults are good ones to start with. -> [!TIP] -> We recommended `wee_alloc` here, but any simple allocator will do, including a hand-written -> bump allocator. The trade offs made by these small allocators are not generally suitable for long- -> running, or allocation-heavy applications, as they "leak" memory (generally because they make little -> to no attempt to recover freed allocations), however they are very useful for one-shot programs that -> do minimal allocation, which is going to be the typical case for Miden programs. +!!! tip -Next, edit `src/lib.rs` as shown below: + We reference a simple bump allocator provided by `miden-sdk-alloc` above, but any simple + allocator will do. The trade offs made by these small allocators are not generally suitable for + long-running, or allocation-heavy applications, as they "leak" memory (generally because they + make little to no attempt to recover freed allocations), however they are very useful for + one-shot programs that do minimal allocation, which is going to be the typical case for Miden + programs. -```rust,noplayground -// This allows us to abort if the panic handler is invoked, but -// it is gated behind a perma-unstable nightly feature -#![feature(core_intrinsics)] -// Disable the warning triggered by the use of the `core_intrinsics` feature -#![allow(internal_features)] +Next, edit `src/lib.rs` as shown below: +```rust // Do not link against libstd (i.e. anything defined in `std::`) #![no_std] @@ -92,12 +87,13 @@ Next, edit `src/lib.rs` as shown below: // a good idea to use the allocator we pulled in as a dependency // in Cargo.toml, like so: //#[global_allocator] -//static ALLOC: wee_alloc::WeeAlloc = wee_alloc::WeeAlloc::INIT; +//static ALLOC: miden_sdk_alloc::BumpAlloc = miden_sdk_alloc::BumpAlloc::new(); // Required for no-std crates #[panic_handler] fn panic(_info: &core::panic::PanicInfo) -> ! 
{ - core::intrinsics::abort() + // Compiles to a trap instruction in WebAssembly + core::arch::wasm32::unreachable() } // Marking the function no_mangle ensures that it is exported @@ -125,17 +121,17 @@ This exports our `fib` function from the library, making it callable from within All that remains is to compile to WebAssembly: - cargo build --release --target=wasm32-unknown-unknown + cargo build --release --target=wasm32-wasip1 -This places a `wasm_fib.wasm` file under the `target/wasm32-unknown-unknown/release/` directory, which +This places a `wasm_fib.wasm` file under the `target/wasm32-wasip1/release/` directory, which we can then examine with [wasm2wat](https://github.com/WebAssembly/wabt) to set the code we generated: - wasm2wat target/wasm32-unknown-unknown/release/wasm_fib.wasm + wasm2wat target/wasm32-wasip1/release/wasm_fib.wasm Which dumps the following output (may differ slightly on your machine, depending on the specific compiler version): ```wat -(module +(module $wasm_fib.wasm (type (;0;) (func (param i32) (result i32))) (func $fib (type 0) (param i32) (result i32) (local i32 i32 i32) @@ -166,17 +162,13 @@ Which dumps the following output (may differ slightly on your machine, depending end) (memory (;0;) 16) (global $__stack_pointer (mut i32) (i32.const 1048576)) - (global (;1;) i32 (i32.const 1048576)) - (global (;2;) i32 (i32.const 1048576)) (export "memory" (memory 0)) - (export "fib" (func $fib)) - (export "__data_end" (global 1)) - (export "__heap_base" (global 2))) + (export "fib" (func $fib))) ``` Success! ## Next Steps -In the next chapter, we will walk through how to take the WebAssembly module we just compiled, and lower -it to Miden Assembly using `midenc`! +In [Compiling WebAssembly to Miden Assembly](wasm_to_masm.md), we walk through how to take the +WebAssembly module we just compiled, and lower it to Miden Assembly using `midenc`! 
diff --git a/docs/guides/wasm_to_masm.md b/docs/guides/wasm_to_masm.md index c07be88ef..d8a20c5c2 100644 --- a/docs/guides/wasm_to_masm.md +++ b/docs/guides/wasm_to_masm.md @@ -1,37 +1,47 @@ # Compiling WebAssembly to Miden Assembly -This chapter will walk you through compiling a WebAssembly (Wasm) module, in binary form -(i.e. a `.wasm` file), to a corresponding Miden Assembly (Masm) module (i.e. a `.masm` file). +This guide will walk you through compiling a WebAssembly (Wasm) module, in binary form +(i.e. a `.wasm` file), to Miden Assembly (Masm), both in its binary package form (a `.masp` file), +and in textual Miden Assembly syntax form (i.e. a `.masm` file). ## Setup We will be making use of the example crate we created in [Compiling Rust to WebAssembly](rust_to_wasm.md), -which produces a small, lightweight Wasm module that is easy to examine in Wasm -text format, and demonstrates a good set of default choices for a project compiling -to Miden Assembly via WebAssembly. +which produces a small Wasm module that is easy to examine in Wasm text format, and demonstrates a +good set of default choices for a project compiling to Miden Assembly from Rust. -In this chapter, we will be compiling Wasm to MASM using the `midenc` executable, so ensure that -you have followed the instructions in the [Getting Started (midenc)](../usage/midenc.md) guide +In this chapter, we will be compiling Wasm to Masm using the `midenc` executable, so ensure that +you have followed the instructions in the [Getting Started with `midenc`](../usage/midenc.md) guide and then return here. +!!! note + + While we are using `midenc` for this guide, the more common use case will be to use the + `cargo-miden` Cargo extension to handle the gritty details of compiling from Rust to Wasm + for you. However, the purpose of this guide is to show you what `cargo-miden` is handling + for you, and to give you a foundation for using `midenc` yourself if needed. 
+ ## Compiling to Miden Assembly In the last chapter, we compiled a Rust crate to WebAssembly that contains an implementation -of the Fibonacci function called `fib`, that was emitted to `target/wasm32-unknown-unknown/release/wasm_fib.wasm`. -All that remains is to tell `midenc` to compile this module to WebAssembly, as shown below: +of the Fibonacci function called `fib`, that was emitted to +`target/wasm32-wasip1/release/wasm_fib.wasm`. All that remains is to tell `midenc` to compile this +module to Miden Assembly. + +Currently, by default, the compiler will emit an experimental package format that the Miden VM does +not yet support. To demonstrate what using compiled code with the VM will look like, we're going to +tell the compiler to emit a Miden Assembly library (a `.masl` file), as well as Miden Assembly text +format, so that we can take a look at what the actual Masm looks like: -> [!NOTE] -> The compiler is still under heavy development, so there are some known bugs that -> may interfere with compilation depending on the flags you use - for the moment, the compiler -> invocation we have to use is quite verbose, but this is a short term situation while we -> address various other higher-priority tasks. Ultimately, using `midenc` directly will be -> less common than other use cases (such as using `cargo miden`, or using the compiler as a -> library for your own language frontend). +```bash +midenc compile --emit masm=wasm_fib.masm,masl target/wasm32-wasip1/release/wasm_fib.wasm +``` - midenc compile -o wasm_fib.masm --emit=masm target/wasm32-unknown-unknown/release/wasm_fib.wasm +This will compile our Wasm module to a Miden Assembly library with the `.masl` extension, and also +emit the textual Masm to `wasm_fib.masm` so we can review it. The `wasm_fib.masl` file will be +emitted in the current directory by default. -This will place the generated Miden Assembly code for our `wasm_fib` crate in the current directory. 
-If we dump the contents of this file, we'll see the following generated code: +If we dump the contents of `wasm_fib.masm`, we'll see the following generated code: ``` export.fib @@ -70,44 +80,51 @@ end If you compare this to the WebAssembly text format, you can see that this is a fairly faithful translation, but there may be areas where we generate sub-optimal Miden Assembly. -At the moment the compiler does only minimal optimization, late in the pipeline during codegen, -and only in regards to operand stack management. In other words, if you see an instruction -sequence you think is bad, certainly bring it to our attention, but we can't guarantee that -the code we generate will match what you would write by hand. +!!! note + + At the moment the compiler does only minimal optimization, late in the pipeline during codegen, + and only in an effort to minimize operand stack management code. So if you see an instruction + sequence you think is bad, bring it to our attention, and if it is something that we can solve + as part of our overall optimization efforts, we will be sure to do so. There _are_ limits to + what we can generate compared to what one can write by hand, particularly because Rust's + memory model requires us to emulate byte-addressable memory on top of Miden's word-addressable + memory, however our goal is to keep this overhead within an acceptable bound in the general case, + and easily-recognized patterns that can be simplified using peephole optimization are precisely + the kind of thing we'd like to know about, as those kinds of optimizations are likely to produce + the most significant wins. ## Testing with the Miden VM -> [!NOTE] -> This example is more complicated than it needs to be at the moment, bear with us! +!!! 
note -Assuming you have followed the instructions for installing the Miden VM locally, -we can test this program out as follows: + For the moment, the `miden run` command does not support running a compiled MAST program + directly, so we are compiling to a library, and then providing a thin executable module + which will execute the `fib` function. This is expected to change in an upcoming release. -First, we need to define a program to link our `wasm_fib.masm` module into, since -it is not a program, but a library module: +Assuming you have followed the instructions for installing the Miden VM locally, we can test our +compiled program out as follows: - cat <<EOF > main.masm - use.wasm_fib::wasm_fib +First, we need to define an executable module which will invoke the `fib` procedure from our +compiled `wasm_fib.masl` library: - begin - exec.wasm_fib::fib - end - EOF +```bash +cat <<EOF > main.masm +begin + exec.::wasm_fib::fib +end +EOF +``` We will also need a `.inputs` file to pass arguments to the program: - cat <<EOF > wasm_fib.inputs - { - "operand_stack": ["10"], - "advice_stack": ["0"] - } - EOF - -Next, we need to build a MASL library (normally `midenc` would do this, but there is a bug -blocking it at the moment, this example will be updated accordingly soon): - - mkdir -p wasm_fib && mv wasm_fib.masm wasm_fib/ - miden bundle -n wasm_fib wasm_fib +```bash +cat <<EOF > wasm_fib.inputs +{ + "operand_stack": ["10"], + "advice_stack": [] +} +EOF +``` With these in place, we can put it all together and run it: @@ -115,7 +132,7 @@ With these in place, we can put it all together and run it: ============================================================ Run program ============================================================ - Reading library file `wasm_fib/wasm_fib.masl` + Reading library file `wasm_fib.masl` Reading program file `main.masm` Parsing program... done (0 ms) Compiling program... done (2 ms) @@ -136,6 +153,6 @@ Success! We got the expected result of `55`.
## Next Steps This guide is not comprehensive, as we have not yet examined in detail the differences between -compiling libraries vs programs, linking together multiple libraries, emitting a `.masl` library, -or discussed some of the compiler options. We will be updating this documentation with those -details and more in the coming days, so bear with us while we flesh out our guides! +compiling libraries vs programs, linking together multiple libraries, packages, or discussed some of +the more esoteric compiler options. We will be updating this documentation with those details and +more in the coming weeks and months, so bear with us while we flesh out our guides! diff --git a/docs/index.md b/docs/index.md index aee331a58..37b52ddce 100644 --- a/docs/index.md +++ b/docs/index.md @@ -1,22 +1,103 @@ # Getting Started -This page attempts to provide a thorough reference for compiling to Miden Assembly -using one or more components of the Miden compiler suite. Which components you use, -and how you use them is likely to differ depending on the project, but we've tried -to provide good coverage regardless. +Welcome to the documentation for the Miden compiler toolchain! -There are a set of guides which are focused on documenting the workflows for specific -use cases that we wish to ensure are well supported, or have encountered so far, but -if you feel there is anything missing, feel free to open an issue and we will try to -address the missing docs as soon as possible! +!!! warning -## Installation + The compiler is currently in an experimental state, and has known bugs and limitations, it is + not yet ready for production usage. However, we'd encourage you to start experimenting with it + yourself, and give us feedback on any issues or sharp edges you encounter. 
+ +The documentation found here should provide a good starting point for the current capabilities of +the toolchain. However, if you find something that is not covered, but is not listed as +unimplemented or a known limitation, please let us know by reporting an issue on the compiler +[issue tracker](https://github.com/0xpolygonmiden/compiler/issues). + +## What is provided? + +The compiler toolchain consists of the following primary components: + +- An intermediate representation (IR), which can be lowered to by compiler backends wishing to +support Miden as a target. The Miden IR is an SSA IR, much like Cranelift or LLVM, providing a +much simpler path from any given source language (e.g. Rust), to Miden Assembly. It is used +internally by the rest of the Miden compiler suite. +- A WebAssembly (Wasm) frontend for Miden IR. It can handle lowering both core Wasm modules, as +well as basic components using the experimental WebAssembly Component Model. Currently, the Wasm +frontend is known to work with Wasm modules produced by `rustc`, which is largely just what LLVM +produces, but with the shadow stack placed at the start of linear memory rather than after +read-only data. In the future we intend to support more variety in the structure of Wasm modules +we accept, but for the time being we're primarily focused on using this as the path for lowering +Rust to Miden. +- The compiler driver, in the form of the `midenc` executable, and a Rust crate, `midenc-compile`, +to allow integrating the compiler into other tools. This plays the same role as `rustc` does in +the Rust ecosystem. +- A Cargo extension, `cargo-miden`, that provides a convenient developer experience for creating +and compiling Rust projects targeting Miden. It contains a project template for a basic Rust crate, +and handles orchestrating `rustc` and `midenc` to compile the crate to WebAssembly, and then to +Miden Assembly.
+- A terminal-based interactive debugger, available via `midenc debug`, which provides a UI very +similar to `lldb` or `gdb` when using the TUI mode. You can use this to run a program, or step +through it cycle-by-cycle. You can set various types of breakpoints; see the source code, call +stack, and contents of the operand stack at the current program point; as well as interactively +read memory and format it in various ways for display. +- A Miden SDK for Rust, which provides types and bindings to functionality exported from the Miden +standard library, as well as the Miden transaction kernel API. You can use this to access native +Miden features which are not provided by Rust out-of-the-box. The project template generated by +`cargo miden new` automatically adds this as a dependency. + +## What can I do with it? + +That all sounds great, but what can you do with the compiler today? The answer depends a bit on what +aspect of the compiler you are interested in: + +### Rust + +The most practically useful, and interesting capability provided by the compiler currently, is the +ability to compile arbitrary Rust programs to Miden Assembly. See the guides for more information +on setting up and compiling a Rust crate for execution via Miden. + +### WebAssembly + +More generally, the compiler frontend is capable of compiling WebAssembly modules, with some +constraints, to Miden Assembly. As a result, it is possible to compile a wider variety of languages +to Miden Assembly than just Rust, so long as the language can compile to WebAssembly. However, we +do not currently provide any of the language-level support for languages other than Rust, and +have limited ability to provide engineering support for languages other than Rust at this time. -There are three ways you might use the Miden compiler: +Our Wasm frontend does not support all of the extensions to the WebAssembly MVP, most notably the +reference types and GC proposals.
+### Miden IR + +If you are interested in compiling to Miden from your own compiler, you can target Miden IR, and +invoke the driver from your compiler to emit Miden artifacts. At this point in time, we don't have +the resources to provide much in the way of engineering support for this use case, but if you find +issues in your efforts to use the IR in your compiler, we would certainly like to know about them! + +We do not currently perform any optimizations on the IR, since we are primarily working with the +output of compiler backends which have already applied optimizations, at this time. This may change +in the future, but for now it is expected that you implement your own optimization passes as needed. + +## Known Bugs and Limitations + +For the latest information on known bugs, see the [issue tracker](https://github.com/0xpolygonmiden/compiler/issues). + +See [Known Limitations](appendix/known-limitations.md) for details on what functionality is +missing or only partially implemented. + + +## Where to start? + +Provided here are a set of guides which are focused on documenting a couple of supported workflows +we expect will meet the needs of most users, within the constraints of the current feature set of +the compiler. If you find that there is something you wish to do that is not covered, and is not +one of our known limitations, please open an issue, and we will try to address the missing docs as +soon as possible. + +## Installation -1. As an executable (via `midenc`) -2. As a library (most likely via the `midenc-compile` and `midenc-hir` crates) -3. As a Cargo extension (via `cargo miden`) +To get started, there are a few ways you might use the Miden compiler. Select the one that applies +to you, and the corresponding guide will walk you through getting up and running: -Each of these is described in the following chapters, we hope you find this book useful! +1. [Using the Cargo extension](usage/cargo-miden.md) +2. 
[Using the `midenc` executable](usage/midenc.md) diff --git a/docs/usage/cargo-miden.md b/docs/usage/cargo-miden.md index ba0ce6223..18db2361e 100644 --- a/docs/usage/cargo-miden.md +++ b/docs/usage/cargo-miden.md @@ -1,48 +1,64 @@ -# Miden Cargo Extension +# Getting Started with Cargo -`cargo-miden` crate provides a cargo extension that allows you to compile Rust code to Miden VM MASM. +As part of the Miden compiler toolchain, we provide a Cargo extension, `cargo-miden`, which provides +a template to spin up a new Miden project in Rust, and takes care of orchestrating `rustc` and +`midenc` to compile the Rust crate to a Miden package. ## Installation -In order to install(build) the extension, you need to have the nightly Rust toolchain installed: +!!! warning -```bash -rustup toolchain install nightly-2024-05-07 -``` + Currently, `midenc` (and as a result, `cargo-miden`), requires the nightly Rust toolchain, so + make sure you have it installed first: + + ```bash + rustup toolchain install nightly-2024-05-07 + ``` -To install the extension, run: + NOTE: You can also use the latest nightly, but the specific nightly shown here is known to + work. + +To install the extension, simply run the following in your shell: ```bash cargo +nightly-2024-05-07 install cargo-miden ``` -## Usage +This will take a minute to compile, but once complete, you can run `cargo help miden` or just +`cargo miden` to see the set of available commands and options. -### Getting help -To get help with the extension, run: +To get help for a specific command, use `cargo miden help ` or `cargo miden --help`. 
-```bash -cargo miden -``` +## Creating a new project -Or for help with a specific command: +Your first step will be to create a new Rust project set up for compiling to Miden: ```bash -cargo miden --help +cargo miden new foo ``` -### Creating a new project -To create a new Miden VM project, run: +In the above example, this will create a new directory `foo`, containing a Cargo project for a +crate named `foo`, generated from our Miden project template. -```bash -cargo miden new -``` +The template we use sets things up so that you can pretty much just build and run. Since the +toolchain depends on Rust's native WebAssembly target, it is set up just like a minimal WebAssembly +crate, with some additional tweaks for Miden specifically. + +Out of the box, you will get a Rust crate that depends on the Miden SDK, and sets the global +allocator to a simple bump allocator we provide as part of the SDK, which is well suited for most +Miden use cases, avoiding the overhead of more complex allocators. + +As there is no panic infrastructure, `panic = "abort"` is set, and the panic handler is configured +to use the native WebAssembly `unreachable` intrinsic, so the compiler will strip out all of the +usual panic formatting code. + +### Compiling to Miden Assembly -### Compiling a project -To compile a Rust crate to Miden VM MASM, run: +Now that you've created your project, compiling it to Miden Assembly is as easy as running the +following command from the root of the project directory: ```bash -cargo miden build +cargo miden build ``` -Without any additional arguments, this will compile the library target in the target directory in the `miden` folder. +This will emit the compiled artifacts to `target/miden`.
diff --git a/docs/usage/debugger.md b/docs/usage/debugger.md new file mode 100644 index 000000000..14ebf0810 --- /dev/null +++ b/docs/usage/debugger.md @@ -0,0 +1,223 @@ +# Debugging Programs + +A very useful tool in the Miden compiler suite, is its TUI-based interactive debugger, accessible +via the `midenc debug` command. + +!!! warning + + The debugger is still quite new, and while very useful already, still has a fair number of + UX annoyances. Please report any bugs you encounter, and we'll try to get them patched ASAP! + +## Getting Started + +The debugger is launched by executing `midenc debug`, and giving it a path to a program compiled +by `midenc compile`. See [Program Inputs](#program-inputs) for information on how to provide inputs +to the program you wish to debug. Run `midenc help debug` for more detailed usage documentation. + +The debugger may also be used as a library, but that is left as an exercise for the reader for now. + +## Example + +```shell +# Compile a program to MAST from a rustc-generated Wasm module +midenc compile foo.wasm -o foo.masl + +# Load that program into the debugger and start executing it +midenc debug foo.masl +``` + +## Program Inputs + +To pass arguments to the program on the operand stack, or via the advice provider, you have two +options, depending on the needs of the program: + +1. Pass arguments to `midenc debug` in the same order you wish them to appear on the stack. That + is, the first argument you specify will be on top of the stack, and so on. +2. Specify a configuration file from which to load inputs for the program, via the `--inputs` option. + +### Via Command Line + +To specify the contents of the operand stack, you can do so following the raw arguments separator `--`. +Each operand must be a valid field element value, in either decimal or hexadecimal format. 
For example: + +```shell +midenc debug foo.masl -- 1 2 0xdeadbeef +``` + +If you pass arguments via the command line in conjunction with `--inputs`, then the command line arguments +will be used instead of the contents of the `inputs.stack` option (if set). This lets you specify a baseline +set of inputs, and then try out different arguments using the command line. + +### Via Inputs Config + +While simply passing operands to the `midenc debug` command is useful, it only allows you to specify +inputs to be passed via operand stack. To provide inputs via the advice provider, you will need to use +the `--inputs` option. The configuration file expected by `--inputs` also lets you tweak the execution +options for the VM, such as the maximum and expected cycle counts. + +An example configuration file looks like so: + +```toml +# This section is used for execution options +[options] +max_cycles = 5000 +expected_cycles = 4000 + +# This section is the root table for all inputs +[inputs] +# Specify elements to place on the operand stack, leftmost element will be on top of the stack +stack = [1, 2, 0xdeadbeef] + +# This section contains input options for the advice provider +[inputs.advice] +# Specify elements to place on the advice stack, leftmost element will be on top +stack = [1, 2, 3, 4] + +# The `inputs.advice.map` section is a list of advice map entries that should be +# placed in the advice map before the program is executed. Entries with duplicate +# keys are handled on a last-write-wins basis. +[[inputs.advice.map]] +# The key for this entry in the advice map +digest = '0x3cff5b58a573dc9d25fd3c57130cc57e5b1b381dc58b5ae3594b390c59835e63' +# The values to be stored under this key +values = [1, 2, 3, 4] + +[[inputs.advice.map]] +digest = '0x20234ee941e53a15886e733cc8e041198c6e90d2a16ea18ce1030e8c3596dd38' +values = [5, 6, 7, 8] +``` + +## Usage + +Once started, you will be dropped into the main debugger UI, stopped at the first cycle of +the program.
The UI is organized into pages and panes, with the main/home page being the +one you get dropped into when the debugger starts. The home page contains the following panes: + +* Source Code - displays source code for the current instruction, if available, with + the relevant line and span highlighted, with syntax highlighting (when available) +* Disassembly - displays the 5 most recently executed VM instructions, and the current + cycle count +* Stack Trace - displays a stack trace for the current instruction, if the program was + compiled with tracing enabled. If frames are unavailable, this pane may be empty. +* Operand Stack - displays the contents of the operand stack and its current depth +* Breakpoints - displays the set of current breakpoints, along with how many were hit + at the current instruction, when relevant + +### Keyboard Shortcuts + +On the home page, the following keyboard shortcuts are available: + +Shortcut | Mnemonic | Description | +---------|----------------|---------------| +`q` | quit | exit the debugger | +`h` | next pane | cycle focus to the next pane | +`l` | prev pane | cycle focus to the previous pane | +`s` | step | advance the VM one cycle | +`n` | step next | advance the VM to the next instruction | +`c` | continue | advance the VM to the next breakpoint, else to completion | +`e` | exit frame | advance the VM until we exit the current call frame, a breakpoint is triggered, or execution terminates | +`d` | delete | delete an item (where applicable, e.g. the breakpoints pane) | +`:` | command prompt | bring up the command prompt (see below for details) | + +When various panes have focus, additional keyboard shortcuts are available. In any pane +with a list of items, or multiple lines (e.g. source code), `j` and `k` (or the up and +down arrows) will select the next item up and down, respectively. As more features are +added, I will document their keyboard shortcuts below.
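
The home-page shortcut table above can be read as a plain dispatch from a key to an action. The following is only a minimal sketch of that reading: the `Action` enum and the `action_for_key` function are hypothetical names made up for illustration, and are not taken from the `midenc-debug` crate.

```rust
/// One variant per row of the home-page shortcut table.
/// Hypothetical type for illustration; not the debugger's real one.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    Quit,
    NextPane,
    PrevPane,
    Step,
    StepNext,
    Continue,
    ExitFrame,
    Delete,
    CommandPrompt,
}

/// Map a key press to the action listed for it in the table.
/// Keys that do not appear in the table map to `None`.
fn action_for_key(key: char) -> Option<Action> {
    match key {
        'q' => Some(Action::Quit),
        'h' => Some(Action::NextPane),
        'l' => Some(Action::PrevPane),
        's' => Some(Action::Step),
        'n' => Some(Action::StepNext),
        'c' => Some(Action::Continue),
        'e' => Some(Action::ExitFrame),
        'd' => Some(Action::Delete),
        ':' => Some(Action::CommandPrompt),
        _ => None,
    }
}
```

Pane-specific keys such as `j` and `k` are deliberately left out of this sketch, since they only apply when a particular pane has focus.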
+ +### Commands + +From the home page, typing `:` will bring up the command prompt in the footer pane. + +You will know the prompt is active because the keyboard shortcuts normally shown there will +no longer appear, and instead you will see the prompt, starting with `:`. It supports any +of the following commands: + +Command | Aliases | Action | Description | +-------------|--------------|-------------------|---------------| +`quit` | `q` | quit | exit the debugger | +`debug` | | show debug log | display the internal debug log for the debugger itself | +`reload` | | reload program | reloads the program from disk, and resets the UI (except breakpoints) | +`breakpoint` | `break`, `b` | create breakpoint | see [Breakpoints](#breakpoints) | +`read` | `r` | read memory | inspect linear memory (see [Reading Memory](#reading-memory)) | + +## Breakpoints + +One of the most common things you will want to do with the debugger is set and manage breakpoints. +Using the command prompt, you can create breakpoints by typing `b` (or `break` or `breakpoint`), +followed by a space, and then the desired breakpoint expression to do any of the following: + +* Break at an instruction which corresponds to a source file (or file and line) whose name/path + matches a pattern +* Break at the first instruction which causes a call frame to be pushed for a procedure whose name + matches a pattern +* Break any time a specific opcode is executed +* Break at the next instruction +* Break after N cycles +* Break at CYCLE + +The syntax for each of these can be found below, in the same order (shown using `b` as the command): + +Expression | Description | +--------------------|---------------| +`b FILE[:LINE]` | Break when an instruction with a source location in `FILE` (a glob pattern)
  _and_ that occurs on `LINE` (literal, if provided) is hit. | +`b in NAME` | Break when the glob pattern `NAME` matches the fully-qualified procedure name
  containing the current instruction | +`b for OPCODE` | Break when an instruction with opcode `OPCODE` is exactly matched
(including immediate values) | +`b next` | Break on the next instruction | +`b after N` | Break after `N` cycles | +`b at CYCLE` | Break when the cycle count reaches `CYCLE`.
If `CYCLE` has already occurred, this has no effect | + +When a breakpoint is hit, it will be highlighted, and the breakpoint window will display the number +of hit breakpoints in the lower right. + +After a breakpoint is hit, it expires if it is one of the following types: + +* Break after N +* Break at CYCLE +* Break next + +When a breakpoint expires, it is removed from the breakpoint list on the next cycle. + +## Reading Memory + +Another useful diagnostic task is examining the contents of linear memory, to verify that expected +data has been written. You can do this via the command prompt, using `r` (or `read`), followed by +a space, and then the desired memory address and options: + +The format for read expressions is `:r ADDR [OPTIONS..]`, where `ADDR` is a memory address in +decimal or hexadecimal format (the latter requires the `0x` prefix). The `read` command supports +the following for `OPTIONS`: + +Option | Alias | Values | Default | Description | +----------------|-------|-----------------|---------|--------------| +`-mode MODE` | `-m` |
  • `words` (`word` ,`w`)
  • `bytes` (`byte`, `b`)
| `words` | Specify a memory addressing mode | +`-format FORMAT`| `-f` |
  • `decimal` (`d`)
  • `hex` (`x`)
  • `binary` (`bin`, `b`)
| `decimal` | Specify the format used to print integral values | +`-count N` | `-c` | | `1` | Specify the number of units to read | +`-type TYPE` | `-t` | See [Types](#types) | `word` | Specify the type of value to read
  This also has the effect of modifying the default `-format` and unit size for `-count` | + +Any invalid combination of options, or invalid syntax, will display an error in the status bar. + +### Types + +Type | Description | +--------|--------------| +`iN` | A signed integer of `N` bits | +`uN` | An unsigned integer of `N` bits | +`felt` | A field element | +`word` | A Miden word, i.e. an array of four field elements | +`ptr` or `pointer` | A 32-bit memory address (implies `-format hex`) | + +## Roadmap + +The following are some features planned for the near future: + +* **Watchpoints**, i.e. cause execution to break when a memory store touches a specific address +* **Conditional breakpoints**, i.e. only trigger a breakpoint when an expression attached to it + evaluates to true +* More DWIM-style breakpoints, i.e. when breaking on first hitting a match for a file or + procedure, we probably shouldn't continue to break for every instruction to which that + breakpoint technically applies. Instead, it would make sense to break and then temporarily + disable that breakpoint until something changes that would make breaking again useful. + This will rely on the ability to disable breakpoints, not delete them, which we don't yet + support. +* More robust type support in the `read` command +* Display procedure locals and their contents in a dedicated pane diff --git a/docs/usage/midenc.md b/docs/usage/midenc.md index ec53aa6a4..0a2ed6beb 100644 --- a/docs/usage/midenc.md +++ b/docs/usage/midenc.md @@ -1,31 +1,40 @@ -# As an Executable +# Getting Started with `midenc` -At the present time, we do not yet have prebuilt packages of the compiler toolchain -available, so it must be built from source, but the requirements for this are minimal, -as shown below: +The `midenc` executable is the command-line interface for the compiler driver, as well as other +helpful tools, such as the interactive debugger.
+ +While it is a lower-level tool compared to `cargo-miden`, just like the difference between `rustc` +and `cargo`, it provides a lot of functionality for emitting diagnostic information, controlling +the output of the compiler, and configuring the compilation pipeline. Most users will want to use +`cargo-miden`, but understanding `midenc` is helpful for those times when you need to get your +hands dirty. ## Installation -First, you'll need to have Rust installed (at time of writing, we're doing development -against Rust 1.74). +First, you'll need to have Rust installed, with the nightly toolchain (currently we're building +against the `nightly-2024-05-07` toolchain, but we regularly update this). + +Then, simply install `midenc` using Cargo in one of the following ways: -Then, simply install `midenc` using Cargo, like so: + # From crates.io: + cargo +nightly install midenc # If you have cloned the git repo, and are in the project root: - cargo install --path midenc midenc - + cargo make install + # If you have Rust installed, but have not cloned the git repo: - cargo install --git https://github.com/0xpolygonmiden/compiler --branch develop midenc + cargo install --git https://github.com/0xpolygonmiden/compiler midenc + +!!! advice -> [!NOTE] -> This installation method relies on Cargo-managed binaries being in your shell `PATH`, -> which is almost always the case, but if you have disabled this functionality, you'll need -> to add `midenc` to your `PATH` manually. + This installation method relies on Cargo-managed binaries being in your shell `PATH`, + which is almost always the case, but if you have disabled this functionality, you'll need + to add `midenc` to your `PATH` manually. ## Usage -Once built, you should be able to invoke the compiler now, for example: +Once installed, you should be able to invoke the compiler, and you should see output similar to this: midenc help compile Usage: midenc compile [OPTIONS] [-- ...]
@@ -57,16 +66,57 @@ Once built, you should be able to invoke the compiler now, for example: Print help (see a summary with '-h') +The actual help output covers quite a bit more than shown here; this is just for illustrative +purposes. + +The `midenc` executable supports two primary functions at this time: + +* `midenc compile` to compile one of our supported input formats to Miden Assembly +* `midenc debug` to run a Miden program attached to an interactive debugger + +## Compilation + +See the help output for `midenc compile` for detailed information on its options and their +behavior. However, the following is an example of how one might use `midenc compile` in practice: + +```bash +midenc compile --target rollup \ + --entrypoint 'foo::main' \ + -lextra \ + -L ./masm \ + --emit=hir=-,masp \ + -o out.masp \ + target/wasm32-wasip1/release/foo.wasm +``` + +In this scenario, we are in the root of a Rust crate, named `foo`, which we have compiled for the +`wasm32-wasip1` target, which placed the resulting WebAssembly module in the +`target/wasm32-wasip1/release` directory. This crate exports a function named `main`, which we want +to use as the entrypoint of the program. + +Additionally, our Rust code links against some hand-written Miden Assembly code, namespaced under +`extra`, which can be found in `./masm/extra`. We are telling `midenc` to link the `extra` library, +and to add the `./masm` directory to the library search path. + +Lastly, we're configuring the output: + +* We're using `--emit` to ask `midenc` to dump Miden IR (`hir`) to stdout (specified via the `-` +shorthand), in addition to the Miden package artifact (`masp`). +* We're telling `midenc` to write the compiled output to `out.masp` in the current directory, rather +than the default path that would have been used (`target/miden/foo.masp`). + +## Debugging + +See [Debugging Programs](debugger.md) for details on using `midenc debug` to debug Miden programs.
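
To make the `--emit=hir=-,masp` value from the Compilation example easier to read, here is a minimal sketch of how such a spec could be split into output types and destinations. It is an illustration only: `Destination` and `parse_emit_spec` are hypothetical names, this is not the parser `midenc` actually uses, and the `TYPE=PATH` form is an assumption generalized from the `hir=-` shorthand shown above.

```rust
/// Where a requested output should be written.
/// Hypothetical type for illustration; not part of midenc.
#[derive(Debug, PartialEq)]
enum Destination {
    /// No destination was given for this output type.
    Default,
    /// The `-` shorthand: write to stdout.
    Stdout,
    /// An explicit path (assumed form `TYPE=PATH`).
    Path(String),
}

/// Split a comma-separated spec such as `hir=-,masp` into
/// (output type, destination) pairs, preserving their order.
fn parse_emit_spec(spec: &str) -> Vec<(String, Destination)> {
    spec.split(',')
        // Ignore empty items, e.g. from a trailing comma.
        .filter(|item| !item.is_empty())
        .map(|item| match item.split_once('=') {
            Some((ty, "-")) => (ty.to_string(), Destination::Stdout),
            Some((ty, path)) => (ty.to_string(), Destination::Path(path.to_string())),
            None => (item.to_string(), Destination::Default),
        })
        .collect()
}
```

Under these assumptions, `parse_emit_spec("hir=-,masp")` yields `hir` routed to stdout and `masp` with no explicit destination, which matches the description in the Compilation section: the IR is dumped to stdout while the package is written wherever `-o` (or the default path) says.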
+ ## Next Steps -We currently have two frontends to the compiler, one that accepts the compiler's IR in textual -form (as a `.hir` file), primarily used for testing; and one that accepts a WebAssembly module -in binary form (i.e. as a `.wasm` file). +We have put together two useful guides to walk through more detail on compiling Rust to WebAssembly: -For the vast majority of people, if not everyone, the `.wasm` form will be the one you are interested -in, so we have put together a [helpful guide](../guides/wasm_to_masm.md) that walks through how to -compile a WebAssembly module (in this case, produced by `rustc`) to Miden Assembly using `midenc`. +1. To learn how to compile Rust to WebAssembly so that you can invoke `midenc compile` on the +resulting Wasm module, see [this guide](../guides/rust_to_wasm.md). +2. If you already have a WebAssembly module, or know how to produce one, and want to learn how to +compile it to Miden Assembly, see [this guide](../guides/wasm_to_masm.md). -If you aren't sure how to produce a WebAssembly module, you may be interested in -[another guide](../guides/rust_to_wasm.md) that demonstrates how to emit a WebAssembly module from -a Rust crate. +You may also be interested in our [basic account project template](https://github.com/0xpolygonmiden/rust-templates/tree/main/account/template), +as a starting point for your own Rust project. diff --git a/midenc-debug/README.md b/midenc-debug/README.md index df91dc912..6c0bbef1c 100644 --- a/midenc-debug/README.md +++ b/midenc-debug/README.md @@ -1,218 +1,4 @@ # Miden Debugger -This crate implements a TUI-based interactive debugger for the Miden VM, designed to -interoperate with `midenc`. - -# Usage - -The easiest way to use the debugger, is via `midenc debug`, and giving it a path to a -program compiled by `midenc compile`. See [Program Inputs](#program-inputs) for information -on how to provide inputs to the program you wish to debug. 
Run `midenc help debug` for more -detailed usage documentation. - -The debugger may also be used as a library, but that is left as an exercise for the reader for now. - -## Example - -```shell -# Compile a program to MAST from a rustc-generated Wasm module -midenc compile foo.wasm -o foo.masl - -# Load that program into the debugger and start executing it -midenc debug foo.masl -``` - -## Program Inputs - -To pass arguments to the program on the operand stack, or via the advice provider, you have two -options, depending on the needs of the program: - -1. Pass arguments to `midenc debug` in the same order you wish them to appear on the stack. That - is, the first argument you specify will be on top of the stack, and so on. -2. Specify a configuration file from which to load inputs for the program, via the `--inputs` option. - -### Via Command Line - -To specify the contents of the operand stack, you can do so following the raw arguments separator `--`. -Each operand must be a valid field element value, in either decimal or hexadecimal format. For example: - -```shell -midenc debug foo.masl -- 1 2 0xdeadbeef -``` - -If you pass arguments via the command line in conjunction with `--inputs`, then the command line arguments -will be used instead of the contents of the `inputs.stack` option (if set). This lets you specify a baseline -set of inputs, and then try out different arguments using the command line. - -### Via Inputs Config - -While simply passing operands to the `midenc debug` command is useful, it only allows you to specify -inputs to be passed via operand stack. To provide inputs via the advice provider, you will need to use -the `--inputs` option. The configuration file expected by `--inputs` also lets you tweak the execution -options for the VM, such as the maximum and expected cycle counts. 
- -An example configuration file looks like so: - -```toml -# This section is used for execution options -[options] -max_cycles = 5000 -expected_cycles = 4000 - -# This section is the root table for all inputs -[inputs] -# Specify elements to place on the operand stack, leftmost element will be on top of the stack -stack = [1, 2, 0xdeadbeef] - -# This section contains input options for the advice provider -[inputs.advice] -# Specify elements to place on the advice stack, leftmost element will be on top -stack = [1, 2, 3, 4] - -# The `inputs.advice.map` section is a list of advice map entries that should be -# placed in the advice map before the program is executed. Entries with duplicate -# keys are handled on a last-write-wins basis. -[[inputs.advice.map]] -# The key for this entry in the advice map -digest = '0x3cff5b58a573dc9d25fd3c57130cc57e5b1b381dc58b5ae3594b390c59835e63' -# The values to be stored under this key -values = [1, 2, 3, 4] - -[[inputs.advice.map]] -digest = '0x20234ee941e53a15886e733cc8e041198c6e90d2a16ea18ce1030e8c3596dd38'' -values = [5, 6, 7, 8] -``` - -# Debugger Usage - -Once started, you will be dropped into the main debugger UI, stopped at the first cycle of -the program. The UI is organized into pages and panes, with the main/home page being the -one you get dropped into when the debugger starts. The home page contains the following panes: - -* Source Code - displays source code for the current instruction, if available, with - the relevant line and span highlighted, with syntax highlighting (when available) -* Disassembly - displays the 5 most recently executed VM instructions, and the current - cycle count -* Stack Trace - displays a stack trace for the current instruction, if the program was - compiled with tracing enabled. If frames are unavailable, this pane may be empty. 
-* Operand Stack - displays the contents of the operand stack and its current depth -* Breakpoints - displays the set of current breakpoints, along with how many were hit - at the current instruction, when relevant - -On the home page, the following keyboard shortcuts are available: - -* `q` (quit) - exit the debugger -* `h`,`l` (pane movement) - cycle focus to the next pane (`h`) or previous pane (`l`) -* `s` (step) - advance the VM one cycle -* `n` (step next) - advance the VM to the next instruction (i.e. skip over all the cycles - of a multi-cycle instructions) -* `c` (continue) - advance the VM to the next breakpoint, or until execution terminates -* `e` (exit current frame) - advance the VM until we exit the current call frame, or until - another breakpoint is triggered, or execution terminates, whichever happens first -* `d` (delete) - delete an item (where applicable, for example, the breakpoints pane) -* `:` (command prompt) - bring up the command prompt (described further below) - -When various panes have focus, additional keyboard shortcuts are available, in any pane -with a list of items, or multiple lines (e.g. source code), `j` and `k` (or the up and -down arrows) will select the next item up and down, respectively. As more features are -added, I will document their keyboard shortcuts below. - -## Commands - -From the home page, typing `:` will bring up the command prompt in the footer pane. - -You will know the prompt is active because the keyboard shortcuts normally shown there will -no longer appear, and instead you will see the prompt, starting with `:`. 
It supports any -of the following commands: - -* `q` or `quit` (quit) - exit the debugger -* `debug` (debug log) - display internal debug log for the debugger itself -* `reload` (reload current program) - reloads the program from disk, and resets the UI, with the - exception of breakpoints, which are retained across reloads -* `b` or `break` or `breakpoint` (breakpoints) - manage breakpoints (see [Breakpoints](#breakpoints)) -* `r` or `read` (read memory) - read values from linear memory (see [Reading Memory](#read-memory)) - -## Breakpoints - -One of the most common things you will want to do with the debugger is set and manage breakpoints. -Using the command prompt, you can create breakpoints by typing `b` (or `break` or `breakpoint`), -followed by a space, and then the desired breakpoint expression to do any of the following: - -* Break at an instruction which corresponds to a source file (or file and line) whose name/path - matches a pattern -* Break at the first instruction which causes a call frame to be pushed for a procedure whose name - matches a pattern -* Break any time a specific opcode is executed -* Break at the next instruction -* Break after N cycles -* Break at CYCLE - -The syntax for each of these can be found below, in the same order (shown using `b` as the command): - -* `b FILE[:LINE]` - where `FILE` is a glob pattern matched against the source file path. The `:LINE` - part is optional, as indicated by the brackets. If specified, only instructions with source - locations in `FILE` _and_ that occur on `LINE`, will cause a hit. 
-* `b in NAME` - where `NAME` is a glob pattern matched against the fully-qualified procedure name -* `b for OPCODE` - where `OPCODE` is the exact opcode you want to break on (including immediates) -* `b next` -* `b after N` -* `b at CYCLE` - if `CYCLE` is in the past, this breakpoint will have no effect - -When a breakpoint is hit, it will be highlighted, and the breakpoint window will display the number -of hit breakpoints in the lower right. - -After a breakpoint is hit, it expires if it is one of the following types: - -* Break after N -* Break at CYCLE -* Break next - -When a breakpoint expires, it is removed from the breakpoint list on the next cycle. - -## Read Memory - -Another useful diagnostic task is examining the contents of linear memory, to verify that expected -data has been written. You can do this via the command prompt, using `r` (or `read`), followed by -a space, and then the desired memory address and options: - -The format for read expressions is `:r ADDR [OPTIONS..]`, where `ADDR` is a memory address in -decimal or hexadecimal format (the latter requires the `0x` prefix). The `read` command supports -the following for `OPTIONS`: - -* `-m MODE` or `-mode MODE`, specify a memory addressing mode, either `words` or `bytes` (aliases - `w`/`b`, `word`/`byte`, or `miden`/`rust` are permitted). This determines whether `ADDR` is an - address in units of words or bytes. (default `words`) -* `-f FORMAT` or `-format FORMAT`, specify the format used to print integral values - (default `decimal`): - - `d`, `decimal`: print as decimal/base-10 - - `x`, `hex`, `hexadecimal`: print as hexadecimal/base-16 - - `b`, `bin`, `binary`, `bits`: print as binary/base-2 -* `-c N` or `-count N`, specify the number of units to read (default `1`) -* `-t TYPE` or `-type TYPE`, specify the type of value to read. 
In addition to modifying the default - for `-format`, and the unit size for `-count`, this will also attempt to interpret the memory as - a value of the specified type, and notify you if the value is invalid. The default type is `word`. - Available types are listed below: - - `iN` and `uN`: integer of `N` bits, with the `i` or `u` prefix determining its signedness. - `N` must be a power of two. - - `felt`: a field element - - `word`: a word, i.e. an array of four `felt` - - `ptr` or `pointer`: a 32-bit memory address (defaults `-format hex`) - - In the future, more types will be supported, namely structs/arrays - -Any invalid combination of options, or invalid syntax, will display an error in the status bar. - -# Roadmap - -The following are some features planned for the near future: - -* Watchpoints, i.e. cause execution to break when a memory store touches a specific address -* Conditional breakpoints, i.e. only trigger a breakpoint when an expression attached to it - evaluates to true -* More DYIM-style breakpoints, i.e. when breaking on first hitting a match for a file or - procedure, we probably shouldn't continue to break for every instruction to which that - breakpoint technically applies. Instead, it would make sense to break and then temporarily - disable that breakpoint until something changes that would make breaking again useful. - This will rely on the ability to disable breakpoints, not delete them, which we don't yet - support. -* More robust type support in the `read` command -* Display procedure locals and their contents in a dedicated pane +See the [documentation](../docs/usage/debugger.md) for more details on what is provided by this +crate, and how to use it. 
diff --git a/mkdocs.yml b/mkdocs.yml index 8d9b32aa2..e6dc1b227 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -30,15 +30,16 @@ nav: - WebAssembly To Miden Assembly: guides/wasm_to_masm.md - Developing Miden Programs In Rust: guides/develop_miden_in_rust.md - Developing Miden Rollup Accounts And Note Scripts In Rust: guides/develop_miden_rollup_accounts_and_note_scripts_in_rust.md + - Debugging Programs: usage/debugger.md - Compiler Architecture: - Overview: design/overview.md - - Frontends: design/frontends.md + - Supported Frontends: design/frontends.md #- HIR: #- Code Generation: #- Testing: - Appendices: - Calling Conventions: appendix/calling_conventions.md - - Canonical ABI vs Miden ABI: appendix/canonabi-adhocabi-mismatch.md + - Canonical ABI vs Miden ABI: appendix/canonabi-adhocabi-mismatch.md markdown_extensions: - toc: From be839371f6ce171e7add5266c7f749f28e7ded1f Mon Sep 17 00:00:00 2001 From: Paul Schoenfelder Date: Fri, 6 Sep 2024 15:42:14 -0400 Subject: [PATCH 4/5] feat: implement 'midenc run' command --- midenc-debug/src/exec/state.rs | 7 ++- midenc-debug/src/exec/trace.rs | 22 ++++++-- midenc-debug/src/lib.rs | 58 ++++++++++++++++++++- midenc-debug/src/ui/mod.rs | 2 +- midenc-debug/src/ui/state.rs | 4 ++ midenc-driver/src/midenc.rs | 95 +++++++++++++++------------------- 6 files changed, 125 insertions(+), 63 deletions(-) diff --git a/midenc-debug/src/exec/state.rs b/midenc-debug/src/exec/state.rs index 5b667a3f5..5576194fc 100644 --- a/midenc-debug/src/exec/state.rs +++ b/midenc-debug/src/exec/state.rs @@ -83,16 +83,15 @@ impl DebugExecutor { /// Consume the [DebugExecutor], converting it into an [ExecutionTrace] at the current cycle. 
pub fn into_execution_trace(self) -> ExecutionTrace { let last_cycle = self.cycle; + let trace_len_summary = *self.iter.trace_len_summary(); let (_, _, _, chiplets, _) = self.iter.into_parts(); - let outputs = self - .result - .map(|res| res.stack().iter().copied().map(TestFelt).collect::<VecDeque<_>>()) - .unwrap_or_default(); + let outputs = self.result.unwrap_or_default(); ExecutionTrace { root_context: self.root_context, last_cycle: RowIndex::from(last_cycle), chiplets: Chiplets::new(move |context, clk| chiplets.get_mem_state_at(context, clk)), outputs, + trace_len_summary, } } } diff --git a/midenc-debug/src/exec/trace.rs b/midenc-debug/src/exec/trace.rs index 3bce2b111..4e3040b80 100644 --- a/midenc-debug/src/exec/trace.rs +++ b/midenc-debug/src/exec/trace.rs @@ -8,7 +8,7 @@ use miden_assembly::Library as CompiledLibrary; use miden_core::{Program, StackInputs, Word}; use miden_processor::{ AdviceInputs, ContextId, ExecutionError, Felt, MastForest, MemAdviceProvider, Process, - ProcessState, RowIndex, StackOutputs, VmState, VmStateIterator, + ProcessState, RowIndex, StackOutputs, TraceLenSummary, VmState, VmStateIterator, }; use midenc_codegen_masm::NativePtr; pub use midenc_hir::TraceEvent; @@ -39,7 +39,8 @@ pub struct ExecutionTrace { pub(super) root_context: ContextId, pub(super) last_cycle: RowIndex, pub(super) chiplets: Chiplets, - pub(super) outputs: VecDeque<TestFelt>, + pub(super) outputs: StackOutputs, + pub(super) trace_len_summary: TraceLenSummary, } impl ExecutionTrace { @@ -48,16 +49,29 @@ impl ExecutionTrace { where T: PopFromStack, { - let mut stack = self.outputs.clone(); + let mut stack = + VecDeque::from_iter(self.outputs.clone().stack().iter().copied().map(TestFelt)); T::try_pop(&mut stack) } /// Consume the [ExecutionTrace], extracting just the outputs on the operand stack #[inline] - pub fn into_outputs(self) -> VecDeque<TestFelt> { + pub fn into_outputs(self) -> StackOutputs { self.outputs } + /// Return a reference to the operand stack outputs + #[inline] + pub fn 
outputs(&self) -> &StackOutputs { + &self.outputs + } + + /// Return a reference to the trace length summary + #[inline] + pub fn trace_len_summary(&self) -> &TraceLenSummary { + &self.trace_len_summary + } + /// Read the word at the given Miden memory address pub fn read_memory_word(&self, addr: u32) -> Option { self.read_memory_word_in_context(addr, self.root_context, self.last_cycle) diff --git a/midenc-debug/src/lib.rs b/midenc-debug/src/lib.rs index ac50cffd2..2e29ad95d 100644 --- a/midenc-debug/src/lib.rs +++ b/midenc-debug/src/lib.rs @@ -14,7 +14,7 @@ use std::rc::Rc; use midenc_session::{ diagnostics::{IntoDiagnostic, Report}, - Session, + HumanDuration, Session, }; pub use self::{ @@ -38,6 +38,62 @@ pub fn run( rt.block_on(async move { start_ui(inputs, args, session, logger).await }) } +pub fn run_noninteractively( + inputs: Option, + args: Vec, + num_outputs: usize, + session: Rc, +) -> ExecutionResult<()> { + use std::time::Instant; + + use midenc_hir::formatter::ToHex; + + println!("==============================================================================="); + println!("Run program: {}", session.inputs[0].file_name()); + println!("-------------------------------------------------------------------------------"); + + let state = ui::State::from_inputs(inputs, args, session)?; + + println!( + "Executed program with hash {} in {}", + state.package.digest.to_hex(), + HumanDuration::from(state.execution_duration), + ); + + // write the stack outputs to the screen. + println!("Output: {:?}", state.execution_trace.outputs().stack_truncated(num_outputs)); + + // calculate the percentage of padded rows + let trace_len_summary = state.execution_trace.trace_len_summary(); + let padding_percentage = (trace_len_summary.padded_trace_len() - trace_len_summary.trace_len()) + * 100 + / trace_len_summary.padded_trace_len(); + + // print the required cycles for each component + println!( + "VM cycles: {} extended to {} steps ({}% padding). 
+├── Stack rows: {} +├── Range checker rows: {} +└── Chiplets rows: {} +├── Hash chiplet rows: {} +├── Bitwise chiplet rows: {} +├── Memory chiplet rows: {} +└── Kernel ROM rows: {}", + trace_len_summary.trace_len(), + trace_len_summary.padded_trace_len(), + padding_percentage, + trace_len_summary.main_trace_len(), + trace_len_summary.range_trace_len(), + trace_len_summary.chiplets_trace_len().trace_len(), + trace_len_summary.chiplets_trace_len().hash_chiplet_len(), + trace_len_summary.chiplets_trace_len().bitwise_chiplet_len(), + trace_len_summary.chiplets_trace_len().memory_chiplet_len(), + trace_len_summary.chiplets_trace_len().kernel_rom_len(), + ); + + Ok(()) +} + pub fn trace( _options: Option, _args: Vec, diff --git a/midenc-debug/src/ui/mod.rs b/midenc-debug/src/ui/mod.rs index aa266bcd5..aad5746fa 100644 --- a/midenc-debug/src/ui/mod.rs +++ b/midenc-debug/src/ui/mod.rs @@ -6,4 +6,4 @@ mod state; mod syntax_highlighting; mod tui; -pub use self::{action::Action, app::App}; +pub use self::{action::Action, app::App, state::State}; diff --git a/midenc-debug/src/ui/state.rs b/midenc-debug/src/ui/state.rs index 06e5c4c85..17aff4d39 100644 --- a/midenc-debug/src/ui/state.rs +++ b/midenc-debug/src/ui/state.rs @@ -25,6 +25,7 @@ pub struct State { pub breakpoints_hit: Vec, pub next_breakpoint_id: u8, pub stopped: bool, + pub execution_duration: std::time::Duration, } #[derive(Default, Debug, Copy, Clone, PartialEq, Eq)] @@ -67,7 +68,9 @@ impl State { trace_executor.with_library(&lib); } + let now = std::time::Instant::now(); let execution_trace = trace_executor.capture_trace(&program, &session); + let execution_duration = now.elapsed(); Ok(Self { package, @@ -81,6 +84,7 @@ impl State { breakpoints_hit: vec![], next_breakpoint_id: 0, stopped: true, + execution_duration, }) } diff --git a/midenc-driver/src/midenc.rs b/midenc-driver/src/midenc.rs index 9cf226a36..10ffadf53 100644 --- a/midenc-driver/src/midenc.rs +++ b/midenc-driver/src/midenc.rs @@ -1,14 +1,13 @@ use 
std::{ffi::OsString, path::PathBuf, rc::Rc, sync::Arc}; -use clap::{ColorChoice, Parser, Subcommand}; +use clap::{Parser, Subcommand}; use log::Log; use midenc_compile as compile; #[cfg(feature = "debug")] use midenc_debug as debugger; -use midenc_hir::FunctionIdent; use midenc_session::{ diagnostics::{Emitter, Report}, - InputFile, Verbosity, Warnings, + InputFile, }; use crate::ClapDiagnostic; @@ -33,17 +32,26 @@ enum Commands { #[command(flatten)] options: compile::Compiler, }, - /// Execute a compiled function using the Miden VM emulator. - /// - /// The emulator is more restrictive, but is faster than the Miden VM, and - /// provides a wider array of debugging and introspection features when troubleshooting - /// programs compiled by `midenc`. - Exec { - /// Specify one or more input files to compile as part of the program to execute + /// Execute a compiled program or library, using the Miden VM. + #[cfg(feature = "debug")] + Run { + /// Specify the path to a Miden program file to execute. + /// + /// Miden Assembly programs are emitted by the compiler with a `.masl` extension. /// /// You may use `-` as a file name to read a file from stdin. #[arg(required(true), value_name = "FILE")] input: InputFile, + /// Specify the path to a file containing program inputs. + /// + /// Program inputs are stack and advice provider values which the program can + /// access during execution. The inputs file is a TOML file which describes + /// what the inputs are, or where to source them from. + #[arg(long, value_name = "FILE")] + inputs: Option, + /// Number of outputs on the operand stack to print + #[arg(long, short = 'n', default_value_t = 16)] + num_outputs: usize, /// Arguments to place on the operand stack before calling the program entrypoint. /// /// Arguments will be pushed on the operand stack in the order of appearance, @@ -51,48 +59,12 @@ enum Commands { /// Example: `-- a b` will push `a` on the stack, then `b`. 
/// /// These arguments must be valid field element values expressed in decimal format. - #[arg(last(true), value_name = "ARGV")] - args: Vec, - /// Specify what type and level of informational output to emit - #[arg( - long = "verbose", - short = 'v', - value_name = "LEVEL", - value_enum, - default_value_t = Verbosity::Info, - default_missing_value = "debug", - help_heading = "Diagnostics", - )] - verbosity: Verbosity, - /// Specify how warnings should be treated by the compiler. - #[arg( - long, - short = 'W', - value_name = "LEVEL", - value_enum, - default_value_t = Warnings::All, - help_heading = "Diagnostics", - )] - warn: Warnings, - /// Whether, and how, to color terminal output - #[arg(long, value_enum, default_value_t = ColorChoice::Auto, default_missing_value = "auto", help_heading = "Diagnostics")] - color: ColorChoice, - /// Write all intermediate compiler artifacts to `` - /// - /// Defaults to a directory named `target` in the current working directory - #[arg( - long, - value_name = "DIR", - hide(true), - env = "MIDENC_TARGET_DIR", - help_heading = "Output" - )] - target_dir: Option, - /// Specify the fully-qualified name of the function to invoke as the program entrypoint /// - /// For example, `foo::bar` - #[arg(long, short = 'e', value_name = "NAME")] - entrypoint: Option, + /// NOTE: These arguments will override any stack values provided via --inputs + #[arg(last(true), value_name = "ARGV")] + args: Vec, + #[command(flatten)] + options: debugger::Debugger, }, /// Run a program under the interactive Miden VM debugger /// @@ -109,7 +81,7 @@ enum Commands { /// Specify the path to a file containing program inputs. /// /// Program inputs are stack and advice provider values which the program can - /// access during execution. The inputs file is a JSON file which describes + /// access during execution. The inputs file is a TOML file which describes /// what the inputs are, or where to source them from. 
#[arg(long, value_name = "FILE")] inputs: Option, @@ -187,6 +159,24 @@ impl Midenc { compile::compile(Rc::new(session)) } #[cfg(feature = "debug")] + Commands::Run { + input, + inputs, + args, + num_outputs, + mut options, + } => { + log::set_boxed_logger(logger) + .unwrap_or_else(|err| panic!("failed to install logger: {err}")); + log::set_max_level(filter); + if options.working_dir.is_none() { + options.working_dir = Some(cwd); + } + let session = options.into_session(vec![input], emitter); + let args = args.into_iter().map(|felt| felt.0).collect(); + debugger::run_noninteractively(inputs, args, num_outputs, Rc::new(session)) + } + #[cfg(feature = "debug")] Commands::Debug { input, inputs, @@ -200,7 +190,6 @@ impl Midenc { let args = args.into_iter().map(|felt| felt.0).collect(); debugger::run(inputs, args, Rc::new(session), logger) } - _ => unimplemented!(), } } } From 4c43d38f67e9a92ede62ac3fe0701e3a64641d44 Mon Sep 17 00:00:00 2001 From: Paul Schoenfelder Date: Fri, 6 Sep 2024 15:42:41 -0400 Subject: [PATCH 5/5] docs: use 'midenc run' in guides for now --- docs/appendix/known-limitations.md | 16 +++---- docs/guides/wasm_to_masm.md | 74 ++++++++++-------------------- docs/usage/midenc.md | 1 + 3 files changed, 30 insertions(+), 61 deletions(-) diff --git a/docs/appendix/known-limitations.md b/docs/appendix/known-limitations.md index b21ae34db..a9e18104e 100644 --- a/docs/appendix/known-limitations.md +++ b/docs/appendix/known-limitations.md @@ -237,16 +237,12 @@ program starts) However, this package format is not yet understood by the Miden VM itself. This means you cannot, currently, compile a package and then run it using `miden run` directly. Instead, you can use -`midenc debug` to load and run code from a package, as the interactive debugger has native support -for it. See [Debugging Programs](../usage/debugger.md) for more information on how to use the -debugger. - -!!! 
note - - In the next patch release, we expect to implement a `midenc run` command that simply executes - a program without attaching the debugger, which will largely resemble the eventual `miden run` - functionality. Once the package format is stabilized, using `midenc run` will no longer be - necessary. +`midenc run` to load and run code from a package, as the compiler ships with the VM embedded for +use with the interactive debugger, and provides native support for packaging on top of it. You can +also use `midenc debug` to execute your program interactively in the debugger, depending on your +needs. See [Debugging Programs](../usage/debugger.md) for more information on how to use the +debugger, and `midenc help run` for more information on executing programs with the `midenc run` +command. While it is possible to emit raw MAST from `midenc`, rather than the experimental package format, the resulting artifact cannot be run without some fragile and error-prone manual setup, in order diff --git a/docs/guides/wasm_to_masm.md b/docs/guides/wasm_to_masm.md index d8a20c5c2..a7c3e998c 100644 --- a/docs/guides/wasm_to_masm.md +++ b/docs/guides/wasm_to_masm.md @@ -29,17 +29,19 @@ of the Fibonacci function called `fib`, that was emitted to module to Miden Assembly. Currently, by default, the compiler will emit an experimental package format that the Miden VM does -not yet support. To demonstrate what using compiled code with the VM will look like, we're going to -tell the compiler to emit a Miden Assembly library (a `.masl` file), as well as Miden Assembly text -format, so that we can take a look at what the actual Masm looks like: +not yet support. We will instead use `midenc run` to execute the package using the VM for us, but +once the package format is stabilized, this same approach will work with `miden run` as well. 
+ +We also want to examine the Miden Assembly generated by the compiler, so we're going to ask the +compiler to emit both types of artifacts: ```bash -midenc compile --emit masm=wasm_fib.masm,masl target/wasm32-wasip1/release/wasm_fib.wasm +midenc compile --emit masm=wasm_fib.masm,masp target/wasm32-wasip1/release/wasm_fib.wasm ``` -This will compile our Wasm module to a Miden Assembly library with the `.masl` extension, and also -emit the textual Masm to `wasm_fib.masm` so we can review it. The `wasm_fib.masl` file will be -emitted in the current directory by default. +This will compile our Wasm module to a Miden package with the `.masp` extension, and also emit the +textual Masm to `wasm_fib.masm` so we can review it. The `wasm_fib.masp` file will be emitted in the +default output directory, which is the current working directory by default. If we dump the contents of `wasm_fib.masm`, we'll see the following generated code: @@ -97,56 +99,26 @@ faithful translation, but there may be areas where we generate sub-optimal Miden !!! note - For the moment, the `miden run` command does not support running a compiled MAST program - directly, so we are compiling to a library, and then providing a thin executable module - which will execute the `fib` function. This is expected to change in an upcoming release. 
- -Assuming you have followed the instructions for installing the Miden VM locally, we can test our -compiled program out as follows: - -First, we need to define an executable module which will invoke the `fib` procedure from our -compiled `wasm_fib.masl` library: - -```bash -cat < main.masm -begin - exec.::wasm_fib::fib -end -EOF -``` - -We will also need a `.inputs` file to pass arguments to the program: - -```bash -cat < wasm_fib.inputs -{ - "operand_stack": ["10"], - "advice_stack": [] -} -EOF -``` + Because the compiler ships with the VM embedded for `midenc debug`, you can run your program + without having to install the VM separately, though you should do that as well, as `midenc` only + exposes a limited set of commands for executing programs, intended for debugging. -With these in place, we can put it all together and run it: +We can test our compiled program like so: - miden run -a main.masm -n 1 -i wasm_fib.inputs -l wasm_fib/wasm_fib.masl + $ midenc run --num-outputs 1 wasm_fib.masp -- 10 ============================================================ - Run program + Run program: wasm_fib.masp ============================================================ - Reading library file `wasm_fib.masl` - Reading program file `main.masm` - Parsing program... done (0 ms) - Compiling program... done (2 ms) - Reading input file `wasm_fib.inputs` - Executing program with hash 3d965e7c6cfbcfe9d9db67262cbbc31517931a0169257f385d447d497cf55778... done (1 ms) + Executed program with hash 0xe5ba88695040ec2477821b26190e9addbb1c9571ae30c564f5bbfd6cabf6c535 in 19 milliseconds Output: [55] - VM cycles: 263 extended to 512 steps (48% padding). - ├── Stack rows: 263 + VM cycles: 295 extended to 512 steps (42% padding). 
+ ├── Stack rows: 295 ├── Range checker rows: 67 - └── Chiplets rows: 201 - ├── Hash chiplet rows: 200 - ├── Bitwise chiplet rows: 0 - ├── Memory chiplet rows: 0 - └── Kernel ROM rows: 0 + └── Chiplets rows: 250 + ├── Hash chiplet rows: 248 + ├── Bitwise chiplet rows: 0 + ├── Memory chiplet rows: 1 + └── Kernel ROM rows: 0 Success! We got the expected result of `55`. diff --git a/docs/usage/midenc.md b/docs/usage/midenc.md index 0a2ed6beb..a3530e43f 100644 --- a/docs/usage/midenc.md +++ b/docs/usage/midenc.md @@ -73,6 +73,7 @@ The `midenc` executable supports two primary functions at this time: * `midenc compile` to compile one of our supported input formats to Miden Assembly * `midenc debug` to run a Miden program attached to an interactive debugger +* `midenc run` to run a Miden program non-interactively, equivalent to `miden run` ## Compilation
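Reviewer note on the padding figures: the sketch below replays the integer arithmetic that `run_noninteractively` uses for the "% padding" value, to check it against the two sample outputs in `docs/guides/wasm_to_masm.md`. The free function `padding_percentage` is a hypothetical helper introduced only for illustration; in the patch the same expression is computed inline from `TraceLenSummary::trace_len()` and `padded_trace_len()`.

```rust
/// Hypothetical helper (not part of the patch): integer percentage of rows
/// in the padded trace that are padding. Multiplying before dividing keeps
/// precision, and integer division truncates the result toward zero.
fn padding_percentage(trace_len: usize, padded_trace_len: usize) -> usize {
    (padded_trace_len - trace_len) * 100 / padded_trace_len
}

fn main() {
    // New sample output in the guide: 295 cycles extended to 512 steps.
    println!("{}% padding", padding_percentage(295, 512));
    // Old sample output removed by the patch: 263 cycles extended to 512 steps.
    println!("{}% padding", padding_percentage(263, 512));
}
```

Under these assumptions, both the old and the new figures quoted in the guide (48% and 42%) are consistent with the formula in the patch.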