-
Notifications
You must be signed in to change notification settings - Fork 881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: mcontrol/mcontrol6 on fetch #1459
base: master
Are you sure you want to change the base?
fix: mcontrol/mcontrol6 on fetch #1459
Conversation
a8ffad0
to
1779c35
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I'm not really familiar with this part of the code.
@YenHaoChen can you explain the failing scenario that 2128daa fixes? I can see one of my testcases was previously taking a fetch breakpoint, and no longer is after that change. I will study it to see which behavior is correct. |
The case where I see behavior change: tdata1[2]=0x200000000010010c (mcontrol match=2(ge) u=1 execute=1) After this PR, this PC executes without taking the breakpoint trap. I don't think the spec says one way or the other about this case, but it is a functional change that needs to be documented in the PR. I guess it's only comparing the first byte of the fetch, instead of each byte? |
The fetching instruction data are in refill_icache() instead of fetch_slow_path(). This commit moves the mcontrol/mcontrol6 checking from fetch_slow_path() to refill_icache() for the correct data value. For example, consider an instruction 0xe9830313 (addi t1, t1, -360). The previous implementation fires mcontrol data triggers with tdata2=0xe983 and tdata2=0x0313 but does not fire one with tdata2=0xe9830313, which is incorrect. This commit fixes the issue.
The fetching PC is in refill_icache() instead of fetch_slow_path(). This commit moves the address mcontrol/mcontrol6 checking from fetch_slow_path() to refill_icache() for the correct fetching PC. For example, consider a 4-byte instruction at PC=0x4000. The Debug specification recommends to compare PC={0x4000,0x4001,0x4002,0x4003}; otherwise, the Debug specification requires to compare PC=0x4000. The previous implementation compares PC={0x4000,0x4002}, which is undesired.
1779c35
to
e8543e3
Compare
@scottj97 The first commit 0578364 fixes the instruction data comparison. For instance, an instruction data 0xe9830313 cannot match a data trigger with tdata2=0xe9830313 but instead ones with tdata2={0xe983,0x0313} without this commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have good test coverage of data triggers, so I'm not surprised that I didn't see that difference.
The last commit I am unsure about and @aswaterman should review.
e8543e3
to
661c6a2
Compare
auto access_info = generate_access_info(addr, FETCH, {false, false, false}); | ||
check_triggers(triggers::OPERATION_EXECUTE, addr, access_info.effective_virt); | ||
if (unlikely(tlb_insn_tag[vpn % TLB_ENTRIES] != vpn)) | ||
check_triggers(triggers::OPERATION_EXECUTE, addr, access_info.effective_virt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this will spuriously invoke check_triggers
on ITLB misses. Note that fetches from MMIO devices accesses always result in ITLB misses, so this means check_triggers
will be invoked on every MMIO fetch. Maybe this is functionally correct and is only a perf issue, but it seems undesirable.
What's wrong with my original suggestion of moving this check back into fetch_slow_path
? By construction, we should not need to check the triggers on the fetch fast path...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation splits instruction fetching into multiple 2-byte operations, which call fetch_slow_path
separately. This results in that a 4-byte instruction 0xe9830313 matches data triggers with tdata2={0xe983, 0x0313} instead of one with tdata2=0xe9830313 incorrectly. Additionally, a 4-byte instruction with PC at 0x4000 matches address triggers with tdata2={0x4000, 0x4002}, which is unexpected from a user perspective.
The behavior of the data trigger in the current implementation does not conform to the specification. It is weird to me that the address trigger matches PCs of 2-byte granularity. Clarify whether the specification allows the address trigger to match an arbitrary subset of the accessing addresses: riscv/riscv-debug-spec#877.
Update 2023.09.20:
The riscv/riscv-debug-spec#877 clarifies that the specification allows the address trigger to match an arbitrary subset of the accessing addresses. Thus, the specification allows the original behavior of the address trigger. Possibly, we can improve performance by keeping address trigger checking in fetch_slow_path
. However, the data trigger truly does not work in fetch_slow_path
. Moving address triggers out from fetch_slow_path
also provides a more reasonable behavior from a user perspective.
@aswaterman How do you feel about this?
The fetch address and data are in refill_icache() instead of fetch_slow_path(). This commit moves the mcontrol/mcontrol6 checkings from fetch_slow_path() to refill_icache() for correct fetch values (2128daa).
The last commit (1779c35) is for performance improvement. I need help verifying the effectiveness.