-
Notifications
You must be signed in to change notification settings - Fork 145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Profiler] Implement WaitHandle wait duration profiling #6426
base: master
Are you sure you want to change the base?
Conversation
Datadog ReportBranch report: ✅ 0 Failed, 121740 Passed, 537 Skipped, 6h 7m 2.08s Total Time |
Execution-Time Benchmarks Report ⏱️Execution-time results for samples comparing the following branches/commits: Execution-time benchmarks measure the whole time it takes to execute a program. And are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph). gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6426) - mean (69ms) : 67, 71
. : milestone, 69,
master - mean (69ms) : 65, 73
. : milestone, 69,
section CallTarget+Inlining+NGEN
This PR (6426) - mean (978ms) : 960, 996
. : milestone, 978,
master - mean (979ms) : 952, 1006
. : milestone, 979,
gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6426) - mean (107ms) : 105, 110
. : milestone, 107,
master - mean (107ms) : 104, 110
. : milestone, 107,
section CallTarget+Inlining+NGEN
This PR (6426) - mean (678ms) : 662, 695
. : milestone, 678,
master - mean (678ms) : 663, 693
. : milestone, 678,
gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6426) - mean (91ms) : 89, 92
. : milestone, 91,
master - mean (91ms) : 89, 93
. : milestone, 91,
section CallTarget+Inlining+NGEN
This PR (6426) - mean (634ms) : 616, 651
. : milestone, 634,
master - mean (633ms) : 616, 650
. : milestone, 633,
gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6426) - mean (191ms) : 187, 195
. : milestone, 191,
master - mean (189ms) : 185, 193
. : milestone, 189,
section CallTarget+Inlining+NGEN
This PR (6426) - mean (1,091ms) : 1063, 1119
. : milestone, 1091,
master - mean (1,091ms) : 1058, 1124
. : milestone, 1091,
gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6426) - mean (276ms) : 271, 280
. : milestone, 276,
master - mean (275ms) : 270, 279
. : milestone, 275,
section CallTarget+Inlining+NGEN
This PR (6426) - mean (866ms) : 839, 892
. : milestone, 866,
master - mean (868ms) : 838, 898
. : milestone, 868,
gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6426) - mean (264ms) : 259, 269
. : milestone, 264,
master - mean (264ms) : 259, 270
. : milestone, 264,
section CallTarget+Inlining+NGEN
This PR (6426) - mean (848ms) : 810, 885
. : milestone, 848,
master - mean (846ms) : 817, 874
. : milestone, 846,
|
currentThreadInfo->SetWaitStart(0ns); | ||
|
||
// We are not interested in waits that are too short | ||
if (waitDuration < 1ms) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this. Assuming a memcache request takes 300 us to complete, we would miss all sync-over-async on them, even though it could be a very serious issue.
What would be the issue with not filtering any waits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this. Assuming a memcache request takes 300 us to complete, we would miss all sync-over-async on them, even though it could be a very serious issue.
What would be the issue with not filtering any waits?
We could have an impact on performances (both with the number of samples to serialize and the collection).
This sampling duration will be settable via environment variable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my experience, this event is very cheap in term of number but also in size, especially compared to GCAllocationTick so I wouldn't be too concerned.
return; | ||
} | ||
|
||
AddContentionSample(0ns, -1, currentThreadInfo->GetWaitType(), waitDuration, 0, WStr(""), _emptyStack); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean the wait events be branded as contention? I'm not sure sync-over-async can be considered as contention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean the wait events be branded as contention? I'm not sure sync-over-async can be considered as contention.
This is an easy way to make the wait information available immediately.
I'm not sure it is worth creating a new profile type but we are discussing it.
I was hoping to be able to make the difference between the different synchronization mechanisms based on the AssociatedObjectID
but it is not always available so it won't be possible: the user will have to look at the call stacks in the flamegraph
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a risk that contention events will show up twice?
Summary of changes
Listen to .NET 9+ events related to WaitHandle-derivated instances to profile wait durations
Reason for change
R&D week based on dotnet/runtime#94737 and #5814
Implementation details
Listen to WaitHandleWaitStart/WaitHandleWaitStop events to measure wait duration and, if possible, get the type of wait object like Mutex, Semaphore, ManualResetEvent or AutoResetEvent
Test coverage
Other details