
[Profiler] Implement WaitHandle wait duration profiling #6426

Open
chrisnas wants to merge 5 commits into master from chrisnas/r+d_week_wait_events
Conversation

chrisnas (Contributor)

Summary of changes

Listen to .NET 9+ events related to WaitHandle-derived instances to profile wait durations.

Reason for change

R&D week based on dotnet/runtime#94737 and #5814

Implementation details

Listen to the WaitHandleWaitStart/WaitHandleWaitStop events to measure wait durations and, when possible, identify the type of the wait object (e.g. Mutex, Semaphore, ManualResetEvent, or AutoResetEvent).
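For illustration, a minimal sketch of how the start/stop pair could be turned into a wait duration, assuming the profiler already parses CLR events and keeps per-thread state. The type and handler names here are illustrative, not the actual implementation (the real code uses methods such as SetWaitStart/GetWaitType, visible in the review comments below):

```cpp
#include <chrono>

using namespace std::chrono_literals;

// Illustrative per-thread state; the real profiler stores similar data on its
// managed-thread objects (SetWaitStart / GetWaitType appear in the diff below).
struct ThreadWaitState
{
    std::chrono::nanoseconds waitStart{0ns};
    int waitType{-1}; // kind of handle waited on (Mutex, Semaphore, ...), when known
};

// Called when a WaitHandleWaitStart event is received.
void OnWaitHandleWaitStart(ThreadWaitState* thread, std::chrono::nanoseconds timestamp, int waitType)
{
    thread->waitStart = timestamp; // remember when the wait began
    thread->waitType = waitType;   // remember what is being waited on, if the event tells us
}

// Called when the matching WaitHandleWaitStop event is received on the same thread.
std::chrono::nanoseconds OnWaitHandleWaitStop(ThreadWaitState* thread, std::chrono::nanoseconds timestamp)
{
    auto waitDuration = timestamp - thread->waitStart;
    thread->waitStart = 0ns; // reset until the next wait
    return waitDuration;     // the duration is then turned into a sample
}
```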

Test coverage

Other details

@chrisnas chrisnas requested review from a team as code owners December 12, 2024 08:15
@github-actions github-actions bot added the area:profiler Issues related to the continuous-profiler label Dec 12, 2024
datadog-ddstaging bot commented Dec 12, 2024

Datadog Report

Branch report: chrisnas/r+d_week_wait_events
Commit report: 797a57f
Test service: dd-trace-dotnet

✅ 0 Failed, 121740 Passed, 537 Skipped, 6h 7m 2.08s Total Time

andrewlock (Member) commented Dec 12, 2024

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program and are intended to capture one-off costs. Cases where the execution-time results for the PR are worse than the latest master results are shown in red. The following thresholds were used when comparing execution times:

  • Welch's t-test with a 5% significance level
  • Only differences greater than 5% and 5 ms are considered

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

The table below shows, for each scenario and configuration, the mean execution time with its p99 interval (derived from the mean and StdDev of the test run) for this PR (6426) and for master.

| Scenario | Configuration | This PR (6426) | master |
| --- | --- | --- | --- |
| FakeDbCommand (.NET Framework 4.6.2) | Baseline | 69 ms (67–71) | 69 ms (65–73) |
| FakeDbCommand (.NET Framework 4.6.2) | CallTarget+Inlining+NGEN | 978 ms (960–996) | 979 ms (952–1006) |
| FakeDbCommand (.NET Core 3.1) | Baseline | 107 ms (105–110) | 107 ms (104–110) |
| FakeDbCommand (.NET Core 3.1) | CallTarget+Inlining+NGEN | 678 ms (662–695) | 678 ms (663–693) |
| FakeDbCommand (.NET 6) | Baseline | 91 ms (89–92) | 91 ms (89–93) |
| FakeDbCommand (.NET 6) | CallTarget+Inlining+NGEN | 634 ms (616–651) | 633 ms (616–650) |
| HttpMessageHandler (.NET Framework 4.6.2) | Baseline | 191 ms (187–195) | 189 ms (185–193) |
| HttpMessageHandler (.NET Framework 4.6.2) | CallTarget+Inlining+NGEN | 1,091 ms (1063–1119) | 1,091 ms (1058–1124) |
| HttpMessageHandler (.NET Core 3.1) | Baseline | 276 ms (271–280) | 275 ms (270–279) |
| HttpMessageHandler (.NET Core 3.1) | CallTarget+Inlining+NGEN | 866 ms (839–892) | 868 ms (838–898) |
| HttpMessageHandler (.NET 6) | Baseline | 264 ms (259–269) | 264 ms (259–270) |
| HttpMessageHandler (.NET 6) | CallTarget+Inlining+NGEN | 848 ms (810–885) | 846 ms (817–874) |

```cpp
currentThreadInfo->SetWaitStart(0ns);

// We are not interested in waits that are too short
if (waitDuration < 1ms)
```
Contributor

I'm not sure about this. Assuming a memcache request takes 300 µs to complete, we would miss all sync-over-async waits on it, even though that could be a very serious issue.

What would be the issue with not filtering any waits?

Contributor Author

> I'm not sure about this. Assuming a memcache request takes 300 µs to complete, we would miss all sync-over-async waits on it, even though that could be a very serious issue.
>
> What would be the issue with not filtering any waits?

There could be a performance impact (both from the number of samples to serialize and from their collection).
This minimum wait duration will be configurable via an environment variable.
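A minimal sketch of how such a threshold could be read once at startup; the environment variable name DD_INTERNAL_WAIT_DURATION_THRESHOLD_MS is hypothetical, not the actual setting:

```cpp
#include <chrono>
#include <cstdlib>
#include <exception>
#include <string>

using namespace std::chrono_literals;

// Hypothetical helper: read the minimum wait duration to report, in milliseconds,
// falling back to 1 ms (the current hard-coded value) when the variable is absent or invalid.
std::chrono::milliseconds GetMinimumWaitDuration()
{
    const char* value = std::getenv("DD_INTERNAL_WAIT_DURATION_THRESHOLD_MS"); // hypothetical name
    if (value == nullptr)
    {
        return 1ms;
    }

    try
    {
        return std::chrono::milliseconds(std::stoll(value));
    }
    catch (const std::exception&)
    {
        return 1ms;
    }
}
```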

Contributor

In my experience, this event is very cheap both in count and in payload size, especially compared to GCAllocationTick, so I wouldn't be too concerned.

```cpp
    return;
}

AddContentionSample(0ns, -1, currentThreadInfo->GetWaitType(), waitDuration, 0, WStr(""), _emptyStack);
```
Contributor

Does it mean the wait events will be reported as contention? I'm not sure sync-over-async can be considered contention.

@chrisnas (Contributor Author) commented on Jan 16, 2025

> Does it mean the wait events will be reported as contention? I'm not sure sync-over-async can be considered contention.

This is an easy way to make the wait information available immediately.
I'm not sure it is worth creating a new profile type, but we are discussing it.
I was hoping to be able to distinguish between the different synchronization mechanisms based on the AssociatedObjectID, but it is not always available, so it won't be possible: the user will have to look at the call stacks in the flame graph.
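For illustration only, a sketch of how the type name behind AssociatedObjectID could be resolved through the profiling API when the object ID is provided. ResolveWaitObjectTypeName is a hypothetical helper, error handling is trimmed, and this assumes the ObjectID is still valid when the event is processed:

```cpp
#include <cor.h>
#include <corprof.h>
#include <string>

// Hypothetical helper: given the ObjectID carried by WaitHandleWaitStart, resolve the
// managed type name (e.g. System.Threading.Mutex) if possible. Returns an empty string
// when the object ID is missing or resolution fails. std::wstring is used for brevity.
std::wstring ResolveWaitObjectTypeName(ICorProfilerInfo4* info, ObjectID associatedObject)
{
    if (associatedObject == 0)
    {
        return L""; // the runtime does not always provide the object
    }

    ClassID classId = 0;
    if (FAILED(info->GetClassFromObject(associatedObject, &classId)))
    {
        return L"";
    }

    ModuleID moduleId = 0;
    mdTypeDef typeDef = mdTypeDefNil;
    if (FAILED(info->GetClassIDInfo2(classId, &moduleId, &typeDef, nullptr, 0, nullptr, nullptr)))
    {
        return L"";
    }

    IMetaDataImport* metadata = nullptr;
    if (FAILED(info->GetModuleMetaData(moduleId, ofRead, IID_IMetaDataImport,
                                       reinterpret_cast<IUnknown**>(&metadata))))
    {
        return L"";
    }

    WCHAR typeName[512];
    ULONG nameLength = 0;
    HRESULT hr = metadata->GetTypeDefProps(typeDef, typeName, 512, &nameLength, nullptr, nullptr);
    metadata->Release();

    return SUCCEEDED(hr) ? std::wstring(typeName) : std::wstring(L"");
}
```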

Contributor

Is there a risk that contention events will show up twice?
