-
Notifications
You must be signed in to change notification settings - Fork 477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fallback to Parking in Bounded and Unbounded Channels #1105
base: master
Are you sure you want to change the base?
Conversation
d1602c0
to
e2f54ea
Compare
e2f54ea
to
61013ca
Compare
We probably also want to introduce a non-linearizable version of |
Here are the benchmark results before and after. Note that all of the fallback cases are very unlikely to be hit on most setups, so this is just measuring the performance difference of new wakeup algorithm, which seems to be pretty mixed. - test bounded_1::create ... bench: 91 ns/iter (+/- 0)
+ test bounded_1::create ... bench: 82 ns/iter (+/- 3)
- test bounded_1::mpmc ... bench: 10,576,639 ns/iter (+/- 1,012,158)
+ test bounded_1::mpmc ... bench: 11,206,415 ns/iter (+/- 2,787,440)
- test bounded_1::mpsc ... bench: 21,551,625 ns/iter (+/- 975,344)
+ test bounded_1::mpsc ... bench: 21,975,220 ns/iter (+/- 390,863)
- test bounded_1::oneshot ... bench: 116 ns/iter (+/- 1)
+ test bounded_1::oneshot ... bench: 104 ns/iter (+/- 3)
- test bounded_1::spmc ... bench: 22,192,680 ns/iter (+/- 760,426)
+ test bounded_1::spmc ... bench: 22,268,657 ns/iter (+/- 559,440)
- test bounded_1::spsc ... bench: 23,097,047 ns/iter (+/- 76,889)
+ test bounded_1::spsc ... bench: 22,753,449 ns/iter (+/- 143,048)
- test bounded_n::mpmc ... bench: 2,387,457 ns/iter (+/- 313,821)
+ test bounded_n::mpmc ... bench: 2,734,135 ns/iter (+/- 315,206)
- test bounded_n::mpsc ... bench: 5,909,059 ns/iter (+/- 81,411)
+ test bounded_n::mpsc ... bench: 6,635,209 ns/iter (+/- 134,351)
- test bounded_n::par_inout ... bench: 6,835,077 ns/iter (+/- 234,552)
+ test bounded_n::par_inout ... bench: 6,583,975 ns/iter (+/- 210,953)
- test bounded_n::spmc ... bench: 5,649,130 ns/iter (+/- 127,809)
+ test bounded_n::spmc ... bench: 5,748,161 ns/iter (+/- 130,954)
- test bounded_n::spsc ... bench: 1,126,902 ns/iter (+/- 32,449)
+ test bounded_n::spsc ... bench: 1,194,757 ns/iter (+/- 15,458)
- test unbounded::create ... bench: 64 ns/iter (+/- 1)
+ test unbounded::create ... bench: 57 ns/iter (+/- 1)
- test unbounded::inout ... bench: 36 ns/iter (+/- 0)
+ test unbounded::inout ... bench: 44 ns/iter (+/- 0)
- test unbounded::mpmc ... bench: 1,181,564 ns/iter (+/- 204,692)
+ test unbounded::mpmc ... bench: 1,200,760 ns/iter (+/- 174,606)
- test unbounded::mpsc ... bench: 2,900,684 ns/iter (+/- 196,933)
+ test unbounded::mpsc ... bench: 2,144,163 ns/iter (+/- 199,247)
- test unbounded::oneshot ... bench: 103 ns/iter (+/- 0)
+ test unbounded::oneshot ... bench: 113 ns/iter (+/- 0)
- test unbounded::par_inout ... bench: 4,799,151 ns/iter (+/- 487,975)
+ test unbounded::par_inout ... bench: 6,485,203 ns/iter (+/- 407,363)
- test unbounded::spmc ... bench: 2,366,594 ns/iter (+/- 46,477)
+ test unbounded::spmc ... bench: 2,014,760 ns/iter (+/- 28,476)
- test unbounded::spsc ... bench: 1,061,444 ns/iter (+/- 14,589)
+ test unbounded::spsc ... bench: 1,392,352 ns/iter (+/- 11,792) |
9d6381c
to
cff8e5e
Compare
cff8e5e
to
b7277c9
Compare
There are some changes I made on the rust-lang PR that have to be applied to this one as well. I believe there's a large optimization here than can be made which is only notifying a waiter if the current thread sees remaining data on the channel. |
Since some of the issues indicate priority inversions I wonder if parking is the right choice or using inheritance-aware synchronization would be better on platforms that support it. |
Currently, the queues used by the bounded and unbounded channels (also
ArrayQueue
andSegQueue
) are not lock-free. The linearization point of a send or receive is moving the tail or head index respectively, meaning that if a sender or receiver sees the head/tail has moved but the value has not been written yet, they must spin until the corresponding receive or send completes that frees up space in the channel. Blocking in such a case is not possible, as sends might notify a receiver even though they are not visible due to a lagging sender holding up the channel, or vice versa, leading to missed wakeups.The current solution of unbounded spinning can lead to issues, especially on platforms with a single core or custom thread priorities. One way to allow falling back to blocking is if waiters that see more values in the channel after being woken up notify any other waiters, making up for any missed notifications. However, this ends up being quite expensive and can lead to many unnecessary wakeups. A solution to this is wakeup throttling as implemented by tokio and other userspace schedulers, which avoids notifications if there is already an active thread, the "waker thread". That thread will then notify another waiter if it finds a message, creating a new waker thread. This PR changes the bounded and unbounded channels to fallback to blocking after observing a linearizability condition, using wakeup throttling to makeup for missed notifications without creating too many unnecessary wakeups.
Thanks to @kprotty for pointing me in the right direction for the wakeup throttling algorithm. Hopefully paired with #1038 this should resolve all of the spinning issues in crossbeam/std.
Should resolve #366 and #997 (as well as rust-lang/rust#114851 and rust-lang/rust#112723 when upstreamed to std).