Debuglink response can get cut off if a workflow ends #4401

Open
matejcik opened this issue Nov 28, 2024 · 1 comment
Labels
bug Something isn't working as expected

Comments

@matejcik
Contributor

Describe the bug
A DebugLink response can be cut off when a workflow ends while the response is still being sent. This does not usually happen on the emulator, because sending data there is very fast.

However, if sending data is not fast, the following race condition may occur:

  1. the host asks for DebugLinkGetState, and the response needs multiple USB packets (this is the case for most screens, because the layout JSON is long)
  2. the firmware starts streaming the DebugLinkState response, pausing on every write until the USB link is ready
  3. during one of these pauses, the wirelink workflow handler is resumed
  4. the wirelink workflow handler exits and starts sending a Success message, which fits in a single packet
  5. the Success packet write loop ends (the packet is sitting in an outgoing buffer or similar)
  6. wirelink calls loop.clear() to clean out the memory and "reboot" the micropython env
  7. this also clears out the pending debuglink write task
  8. the host gets stuck waiting for the remaining packets of the DebugLinkState response, which will never be sent
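
To make the interleaving above concrete, here is a toy simulation. It is only a sketch: the round-robin generator scheduler, `write_packets` and `simulate` are hypothetical stand-ins and not the firmware's actual loop or wire code. `queue.clear()` plays the role of loop.clear().

```python
from collections import deque


def write_packets(sent, name, count):
    """Toy writer: emit one packet, then pause until the (simulated) link is ready."""
    for i in range(count):
        sent.append((name, i))
        yield  # stand-in for "wait until the USB link is ready"


def simulate(debug_packets=5):
    """Run a multi-packet debuglink write next to a one-packet workflow reply."""
    sent = []
    debug_write = write_packets(sent, "DebugLinkState", debug_packets)
    workflow = write_packets(sent, "Success", 1)
    queue = deque([debug_write, workflow])

    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            if task is workflow:
                # Stand-in for loop.clear(): every pending task is dropped,
                # including the unfinished debuglink write.
                queue.clear()
            continue
        queue.append(task)
    return sent


if __name__ == "__main__":
    for packet in simulate():
        print(packet)
```

In this toy run only the first two of five DebugLinkState packets go out before the queue is cleared, which is the state the host then waits on forever.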

A hotfix on the host side is to add a timeout to the DebugLinkState wait and to ignore failures there.
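
A rough sketch of what such a host-side timeout could look like. Everything here is an assumption for illustration: `read_packet`, `is_complete` and `read_state_with_timeout` are hypothetical names, not existing host library functions, and `read_packet` is assumed to block only briefly and return `None` when nothing has arrived.

```python
import time


def read_state_with_timeout(read_packet, is_complete, timeout_s=3.0, clock=time.monotonic):
    """Collect packets until the response is complete or the deadline passes.

    Returns the joined payload, or None if the response was cut off.
    """
    deadline = clock() + timeout_s
    chunks = []
    while clock() < deadline:
        packet = read_packet()  # hypothetical: bytes, or None if nothing arrived yet
        if packet is None:
            continue
        chunks.append(packet)
        if is_complete(chunks):
            return b"".join(chunks)
    # Deadline passed: treat the truncated response as "no state available".
    return None
```

The caller would then treat `None` as "state unknown" and carry on, instead of blocking on packets that will never come.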

A proper fix on the firmware side would involve something like registering "units of work" with the workflow manager (and/or the loop scheduler), and running loop.clear() not after a workflow ends but "when no units of work are pending".
(This approach would also solve the problem THP is facing: not knowing reliably when it is safe to call loop.clear().)
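
One possible shape for the "units of work" idea, as a minimal sketch only. `WorkTracker` and its methods are hypothetical and do not exist in the firmware; the `clear` callback stands in for loop.clear().

```python
class WorkTracker:
    """Defer a requested clear until no registered unit of work is pending."""

    def __init__(self, clear):
        self._clear = clear  # callback standing in for loop.clear()
        self._pending = 0
        self._clear_requested = False

    def begin(self):
        """Register one unit of work, e.g. a multi-packet write."""
        self._pending += 1

    def end(self):
        """Mark one unit of work as finished."""
        self._pending -= 1
        self._maybe_clear()

    def request_clear(self):
        """Called when a workflow ends; the clear runs only once nothing is pending."""
        self._clear_requested = True
        self._maybe_clear()

    def _maybe_clear(self):
        if self._clear_requested and self._pending == 0:
            self._clear_requested = False
            self._clear()
```

With this, the debuglink writer would call `begin()` before its first packet and `end()` after its last one, and the workflow handler would call `request_clear()` instead of clearing directly.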

@matejcik matejcik added the bug Something isn't working as expected label Nov 28, 2024
@mmilata
Member

mmilata commented Dec 2, 2024

On current main + #4375 I seem to be able to reproduce it often with PYTEST_TIMEOUT=24 pytest -v ../tests/device_tests/ -k "test_load_device_utf".
