Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DWARF] Incorrect location descriptor #9900

Open
SingleAccretion opened this issue Dec 24, 2024 · 3 comments
Open

[DWARF] Incorrect location descriptor #9900

SingleAccretion opened this issue Dec 24, 2024 · 3 comments
Labels
wasmtime:debugging Issues related to debugging of JIT'ed code

Comments

@SingleAccretion
Copy link
Contributor

Reproduction steps:

  1. Check out [DWARF] Transform instance methods into static methods with a __this parameter #9898.
  2. Modify the added test to stop passing -Oopt-level=0 to wasmtime.
  3. Run the test.

Expected result: the test passes or fails with 'expression unavailable'.

Actual result:

error: Execution was interrupted, reason: Exception 0xc0000005 encountered at address 0x7ff6fcf15f90: Access violation reading location 0x1ae198763ec.
The process has been returned to the state before expression evaluation.

* thread #1, name = 'main', stop reason = breakpoint 1.1
    frame #0: 0x0000028d720c15b8 JIT(0x28d72210080)`DerivedType::InstanceMethod(__this=(__ptr = <read memory from 0x28de65f5f0c failed (0 of 4 bytes read)>)) at generic.cpp:26:5
   23     long long DerivedValue = 2;
   24
   25     int InstanceMethod() {
-> 26       debug_break();
   27       return BaseValue + DerivedValue;
   28     }
   29
(lldb) v
(WasmtimeVMContext *) __vmctx = 0x0000028d72165f00
(WebAssemblyPtrWrapper<DerivedType>) __this = (__ptr = <read memory from 0x28de65f5f0c failed (0 of 4 bytes read)>)

It means the location descriptor for __this is incorrect.

@SingleAccretion
Copy link
Contributor Author

SingleAccretion commented Dec 24, 2024

This is an interesting one. Here's how to reproduce the failure:

> lldb.exe -o "b InstanceMethod" -o "r" -- wasmtime.exe -Ccache=n -Ddebug-info generic.wasm

   25     int InstanceMethod() {
-> 26       debug_break();
   27       return BaseValue + DerivedValue;
   28     }
   29
(lldb) v
(WasmtimeVMContext *) __vmctx = 0x000001dc1b2b2e10
(WebAssemblyPtrWrapper<DerivedType>) this = (__ptr = <read memory from 0x1dc38812e1c failed (0 of 4 bytes read)>)

Here's the expression for this/__this:

Variable: id = {0x0000048d}, name = "this", type = "WebAssemblyPtrWrapper<DerivedType>", valid ranges = <block>,
          location = DW_OP_breg4 RSI+0, DW_OP_plus_uconst 0xc, DW_OP_breg5 RDI+96, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, decl =
          {RSI + 0xc, [RDI + 96]} ==> {[RDI + 96], RSI + 0xc} ==> [RDI + 96] + (RSI + 0xc) & 0xffffffff
; [RDI + 96] is __vmctx->memory
; RSI is the linear memory frame pointer (WASM level frame base, sans offset)

Here's annotated disassembly:

    0x262b49d1520 <+0>:   push   rbp
    0x262b49d1521 <+1>:   mov    rbp, rsp
    0x262b49d1524 <+4>:   mov    r10, qword ptr [rdi + 0x8]
    0x262b49d1528 <+8>:   mov    r10, qword ptr [r10 + 0x10]
    0x262b49d152c <+12>:  add    r10, 0x30
    0x262b49d1530 <+16>:  cmp    r10, rsp
    0x262b49d1533 <+19>:  ja     0x262b49d15ae  ; <+142> at generic.cpp:27:5
    0x262b49d1539 <+25>:  sub    rsp, 0x20
    0x262b49d153d <+29>:  mov    qword ptr [rsp], rbx
    0x262b49d1541 <+33>:  mov    qword ptr [rsp + 0x8], r12
    0x262b49d1546 <+38>:  mov    qword ptr [rsp + 0x10], r13
    0x262b49d154b <+43>:  mov    qword ptr [rsp + 0x18], r15
    0x262b49d1550 <+48>:  mov    r15d, dword ptr [rdi + 0xa0] ; r15 = __vmctx->__stack_pointer
    0x262b49d1557 <+55>:  mov    rsi, r15                     ; rsi = r15 (__stack_pointer)
    0x262b49d155a <+58>:  sub    esi, 0x10                    ; rsi -= 16
    0x262b49d155d <+61>:  mov    dword ptr [rdi + 0xa0], esi  ; __vmctx->__stack_pointer = rsi
    0x262b49d1563 <+67>:  mov    r12, qword ptr [rdi + 0x60]
    0x262b49d1567 <+71>:  mov    dword ptr [r12 + rsi + 0xc], edx
    0x262b49d156c <+76>:  mov    rbx, rdi
    0x262b49d156f <+79>:  mov    r13, rdx
    0x262b49d1572 <+82>:  mov    rsi, rbx ; rsi = __vmctx
->  0x262b49d1575 <+85>:  call   0x262b49d1180  ; debug_break at generic.h:9:28
    0x262b49d157a <+90>:  mov    rdx, r13
    0x262b49d157d <+93>:  mov    edx, edx
    0x262b49d157f <+95>:  mov    ecx, dword ptr [r12 + rdx]
    0x262b49d1583 <+99>:  mov    rdx, qword ptr [r12 + rdx + 0x8]
    0x262b49d1588 <+104>: mov    dword ptr [rbx + 0xa0], r15d ; __vmctx->__stack_pointer = r15
    0x262b49d158f <+111>: lea    eax, [rcx + rdx]
    0x262b49d1592 <+114>: mov    rbx, qword ptr [rsp]
    0x262b49d1596 <+118>: mov    r12, qword ptr [rsp + 0x8]
    0x262b49d159b <+123>: mov    r13, qword ptr [rsp + 0x10]
    0x262b49d15a0 <+128>: mov    r15, qword ptr [rsp + 0x18]
    0x262b49d15a5 <+133>: add    rsp, 0x20
    0x262b49d15a9 <+137>: mov    rsp, rbp
    0x262b49d15ac <+140>: pop    rbp
    0x262b49d15ad <+141>: ret
    0x262b49d15ae <+142>: ud2

So the culprit here is the fact rsi, which initially does indeed contain the original WASM frame base (local 0), from which everything else gets evaluated, gets overwritten just before the call we've stopped on, and this is either not reflected in the frame info, or is not accounted for correctly.

BTW, here we can also see why the WASM frame base is so prone to being "optimized out" even as it is always used at the end of the method. Local zero gets split into two distinct values: one referenced in code and not referenced in DI (r15), the other - not referenced in code, but referenced in DI (rsi).

@alexcrichton alexcrichton added the wasmtime:debugging Issues related to debugging of JIT'ed code label Dec 27, 2024
@SingleAccretion
Copy link
Contributor Author

Upgrading the logging for location descriptor building, here's what we see:

Building ranges for values:    
L#0  : %rdx@[49..86) %r13@[86..94)
L#1  : %r15@[49..105)
L#3  : %rsi@[59..72)
L#4  : %rdx@[94..100)
L#5  : %rcx@[96..112)
L#6  : %rcx@[96..112)
L#8  : %rdx@[100..112)
VMCTX: %rdi@[49..86) %rbx@[86..115)

L3 here is our frame base (DW_OP_WASM_location 0x0 0x3), and it is correctly reported to live for only these three instructions:

    0x262b49d155d <+61>:  mov    dword ptr [rdi + 0xa0], esi  ; __vmctx->__stack_pointer = rsi
    0x262b49d1563 <+67>:  mov    r12, qword ptr [rdi + 0x60]
    0x262b49d1567 <+71>:  mov    dword ptr [r12 + rsi + 0xc], edx

This implies a bug in range intersection.

@SingleAccretion
Copy link
Contributor Author

The complete range info looks like this:

Building ranges for values in scope: [0..144)
L#0   : %rdx@[49..86) %r13@[86..94)
L#1   : %r15@[49..105)
L#3   : %rsi@[59..72)
L#4   : %rdx@[94..100)
L#5   : %rcx@[96..112)
L#6   : %rcx@[96..112)
L#8   : %rdx@[100..112)
VMCTX : %rdi@[49..86) %rbx@[86..115)    
Intersecting with L#3    
Intersecting with VMCTX    
Built ranges:    
[VMCTX:%rdi, L#3:%rsi]@[59..72)    

So we end up with a correct (single) range. That leads me to believe the bug is here:

let found_single_expr = {
// Micro-optimization all expressions alike, use one exprloc.
let mut found_expr: Option<write::Expression> = None;
for (_, _, expr) in &exprs {
if let Some(ref prev_expr) = found_expr {
if expr == prev_expr {
continue; // the same expression
}
found_expr = None;
break;
}
found_expr = Some(expr.clone())
}
found_expr
};

We can only optimize to a single expression when the range covers the entire scope, which I suspect it almost never will.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasmtime:debugging Issues related to debugging of JIT'ed code
Projects
None yet
Development

No branches or pull requests

2 participants