Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

waitUntilQueued times out due to the pin internal lock #57

Open
arodchen opened this issue Jul 10, 2015 · 11 comments
Open

waitUntilQueued times out due to the pin internal lock #57

arodchen opened this issue Jul 10, 2015 · 11 comments

Comments

@arodchen
Copy link
Contributor

When syscallLeave is called from SyscallEnter waitUntilQueued can time out if the woken thread is about to call SyscallEnter before reaching sleeping point on schedLock futex. The reason for that is that before calling the function registered in PIN_AddSyscallEntryFunction the pin client lock is obtained as it is written in the Pin documentation.

It is possible to move some functionality from SyscallEnter to another function without a lock by adding instrumentation on INS_IsSyscall condition. I tried to move call to syscallLeave to the unlocked function which is called before SyscallEnter but ZSim failed with panic("Wakeup race in barrier?").

Do you have any ideas how to solve that problem?

@s5z
Copy link
Owner

s5z commented Jul 10, 2015

There might be conventional instrumentation being added between the INS_IsSyscall call and the syscall actually being taken. Have you tried inserting the SyscallEnter call with CALL_ORDER_LAST?

@arodchen
Copy link
Contributor Author

Thanks for the hint! I haven't tried it yet. Will check if that works.

arodchen added a commit to arodchen/zsim that referenced this issue Jul 12, 2015
…d SyscallEnterLocked to avoid waitUntilQueued time-outs due to the pin internal lock

The functionality of SyscallEnter is moved to SyscallEnterUnlocked called without acquiring the Pin internal lock. The actual modification of system call arguments is done in SyscallEnterLocked with acquiring the Pin internal lock. This commit fixes the following issue: s5z#57
@arodchen
Copy link
Contributor Author

Independently of CALL_ORDER_LAST parameter INS_IsSyscall call (with IPOINT_BEFORE param) is performed before PIN_AddSyscallEntryFunction call. So I chose to split SyscallEnter into 2 methods SyscallEnterUnlocked (INS_IsSyscall call) and SyscallEnterLocked (AddSyscallEntryFunction call). So all the initial functionality of SyscallEnter is moved to SyscallEnterUnlocked and modifications to syscall args are saved. In SyscallEnterLocked saved modifications are applied.

@gaomy3832
Copy link
Contributor

It has been a while since this issue opened. I have also seen the related pull request #60.

Recently I start to see a lot of waitUntilQueued() failures which force me to look closer into this. I found that the problem is not just due to syscall. In fact, waitUntilQueued() assumes the wakee thread will wait on schedLock after being woken up. However, in waitForContext(), which is the only place that threads can wait on futexWord, i.e., the futex wakeup() uses, the threads will only grab schedLock if needsJoin is true. However, in all the call sites of waitForContext(), none of them set needsJoin to true. (this is because the core is handed over from one thread to another, so the core never leaves and thus never needs to join the barrier). Therefore it never grabs the lock after join.

The syscall issue is a consequence of this bug, because if the thread does not grad schedLock after woken up, it may proceed to app code and enter syscall, which instead wait on the pin lock.

I tried a simple fix in another pull request of mine (which also relate to scheduler.cpp/h), i.e., grab schedLock regardless. See #114 and commit 51c4672. And it works fine for my test cases and use cases. If that is the correct way to fix this, I think the split of syscallEnter is not necessary any more. But I may miss something here, so please let me know.

@arodchen
Copy link
Contributor Author

arodchen commented Nov 4, 2016

When I analyzed this problem I came to conclusion that it is related to the the PIN lock hierarchy violation:
https://software.intel.com/sites/landingpage/pintool/docs/49306/Pin/html/#Deadlock

The following sequence of events happened:
Thread 1: [internal PIN lock is acquired]SyscallEnter()->leave()->wakeup()->waitUntilQueued()
Thread 2: The woken thread is running but at some point it tries to acquire the internal PIN lock and deadlock happens.

With the fix proposed in the pull request #60 it works the following way:
Thread 1: SyscallEnterUnlocked()->leave()->wakeup()->waitUntilQueued() ... [wait until the other thread acquires schedLock] -> [internal PIN lock is acquired]SyscallEnterLocked() ...
Thread 2: The woken thread is running, at some point acquires and releases the internal PIN lock, reaches the point where it acquires schedLock, so Thread 1 exits waitUntilQueued()
So the condition that the woken thread will acquire schedLock after being woken up is met at some point.

I don't quite get the fix 51c4672. It is mentioned that needsJoin is never set to true at waitForContext(). In this case schedLock does not guard anything, and it can be acquired and released before being observed at waitUntilQueued(). Perhaps, it can happen that with fix in 51c4672 it will not reach waitForContext() for the same reason (that dead lock on internal PIN lock will happen). Please, correct me if I am wrong.

@gaomy3832
Copy link
Contributor

I am not 100% on the scheduler logic, either. Let me try to explain my fix:

In your example without your fix, thread 2 will be woken up, and right after this wakeup, it is now inside the function waitForContext(). With my fix, the first thing this thread tries to do is to grab the schedLock regardless of anything. This will be detected by waitUntilQueued() of thread 1, and thus thread 1 won't be blocked by it, and can return the call chain that you listed, all the way back to SyscallEnter(), and also release the pin lock.

The key here is that waitUntilQueued() will only exit successfully if the wakee thread (thread 2) is waiting on the schedLock, which is used as the hint that thread 2 has been made running by the host OS, and joining the simulation scheduler (see the comments in #15). Without my fix, thread 2 may not wait on the schedLock, therefore even after it successfully joins the simulation, thread 1 won't detect that, and thus unnecessarily blocked in waitUntilQueued(). At some time, when thread 2 tries to acquire the pin lock, deadlock happens as you described.

To summary, I believe that as long as we enforce thread 2 to wait on the schedLock as the first thing after it is up, then thread 1 can quickly exit, and thus the deadlock won't happen.

Let me know if anything is unclear. I love to discuss this so we all get better understanding of the code :)

@gaomy3832
Copy link
Contributor

Another thing to mention: waitUntilQueued() is always called with schedLock hold (because wakeup() is always called with schedLock hold). So it guarantees that as long as thread 2 tries to grab schedLock after wakeup, this will be detected by thread 1, because the lock is held by 1, and 2 must wait on the lock.

@arodchen
Copy link
Contributor Author

arodchen commented Nov 4, 2016

I was assuming the following logic of this code. The thread is woken and it needs to reach the scheduler. If th->needsJoin is true it reaches it straight away as you say, but if not then it reaches it later. On the way it may or may not have a system call in the code where it can deadlock. This is how I understand this code.

I agree waitUntilQueued() is always called with schedLock, so there will be no race condition. So the question is whether the woken thread should indicate straight away as with your fix or later on as with my fix (and as it happens now when no syscall is on the way).

@gaomy3832
Copy link
Contributor

I agree that your understanding is correct based on the current code. I think the woken thread should reach the scheduler ASAP, to ensure that it joins the simulation.

I think my fix is more general (in original code, even without syscalls, there can still be deadlocks if thread 2 does not reach scheduler soon enough), but there are extra overheads (lock acquire, though I think it should be minor). Your fix is focusing on syscall, and the overhead should be smaller.

I think we are on the same page now. Let's leave it there for the authors to decide which fix to accept. Either should fix the problems we have.

@arodchen
Copy link
Contributor Author

arodchen commented Nov 4, 2016

My goal was to preserve the current behaviour of ZSim when waitUntilQueued does not time out and let it overcome the problem with internal PIN lock deadlock and related timeout. I agree, let's wait for the authors to decide.

@hlitz
Copy link

hlitz commented Jun 29, 2017

I experienced this issue myself and believe that @arodchen patch is correct. It fixes a deadlock that @gaomy3832 patch cannot handle:

  1. Thread A calls nanosleep which is turned into a FUTEX_WAIT by zsim.
  2. Thread B executes a systemcall, obtaining PIN's global lock
  3. Thread C obtains schedlock, does a FUTEX_WAKE to wake thread A and then does waitUntilQueued(A)
  4. Thread B tries to obtain schedlock as part of processing the system call ->blocked
  5. Thread A wakes up and tries to execute the nanosleep postPatch function, therefore it needs to obtain PIN's global lock which is already hold by Thread B.
    --> Deadlock

I suggest anybody merging @arodchen patch for fixing this issue.

caizixian pushed a commit to caizixian/zsim that referenced this issue Jun 6, 2023
…d SyscallEnterLocked to avoid waitUntilQueued time-outs due to the pin internal lock

The functionality of SyscallEnter is moved to SyscallEnterUnlocked called without acquiring the Pin internal lock. The actual modification of system call arguments is done in SyscallEnterLocked with acquiring the Pin internal lock. This commit fixes the following issue: s5z#57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants