Randomize the order of tests #5831
Execution-Time Benchmarks Report ⏱️
Execution-time results for samples comparing the following branches/commits:
Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).
gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (5831) - mean (70ms) : 64, 76
. : milestone, 70,
master - mean (69ms) : 66, 72
. : milestone, 69,
section CallTarget+Inlining+NGEN
This PR (5831) - mean (981ms) : 961, 1001
. : milestone, 981,
master - mean (975ms) : 950, 1001
. : milestone, 975,
gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (5831) - mean (108ms) : 106, 109
. : milestone, 108,
master - mean (108ms) : 105, 110
. : milestone, 108,
section CallTarget+Inlining+NGEN
This PR (5831) - mean (679ms) : 663, 696
. : milestone, 679,
master - mean (679ms) : 664, 694
. : milestone, 679,
gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (5831) - mean (91ms) : 89, 93
. : milestone, 91,
master - mean (91ms) : 90, 93
. : milestone, 91,
section CallTarget+Inlining+NGEN
This PR (5831) - mean (633ms) : 614, 651
. : milestone, 633,
master - mean (635ms) : 619, 651
. : milestone, 635,
gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (5831) - mean (190ms) : 186, 195
. : milestone, 190,
master - mean (194ms) : 189, 198
. : milestone, 194,
section CallTarget+Inlining+NGEN
This PR (5831) - mean (1,092ms) : 1056, 1128
. : milestone, 1092,
master - mean (1,094ms) : 1067, 1122
. : milestone, 1094,
gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (5831) - mean (274ms) : 270, 278
. : milestone, 274,
master - mean (277ms) : 273, 281
. : milestone, 277,
section CallTarget+Inlining+NGEN
This PR (5831) - mean (868ms) : 842, 894
. : milestone, 868,
master - mean (871ms) : 849, 892
. : milestone, 871,
gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (5831) - mean (264ms) : 259, 269
. : milestone, 264,
master - mean (267ms) : 263, 271
. : milestone, 267,
section CallTarget+Inlining+NGEN
This PR (5831) - mean (848ms) : 814, 883
. : milestone, 848,
master - mean (855ms) : 820, 889
. : milestone, 855,
Datadog Report
Branch report: ✅ 0 Failed, 248661 Passed, 1997 Skipped, 19h 22m 23.71s Total Time
New Flaky Tests (1)
Benchmarks Report for tracer 🐌
Benchmarks for #5831 compared to master:
The following thresholds were used for comparing the benchmark speeds:
Allocation changes below 0.5% are ignored.
Benchmark details
Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.SpanBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.TraceAnnotationsBenchmark - Slower
Benchmark | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin‑net6.0 | 1.140 | 598.41 | 682.00 | |
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | RunOnMethodBegin | net6.0 | 598ns | 0.295ns | 1.1ns | 0.00961 | 0 | 0 | 696 B |
master | RunOnMethodBegin | netcoreapp3.1 | 944ns | 0.533ns | 2.06ns | 0.00949 | 0 | 0 | 696 B |
master | RunOnMethodBegin | net472 | 1.14μs | 0.791ns | 3.06ns | 0.104 | 0 | 0 | 658 B |
#5831 | RunOnMethodBegin | net6.0 | 682ns | 0.399ns | 1.54ns | 0.00953 | 0 | 0 | 696 B |
#5831 | RunOnMethodBegin | netcoreapp3.1 | 965ns | 1.45ns | 5.63ns | 0.00959 | 0 | 0 | 696 B |
#5831 | RunOnMethodBegin | net472 | 1.12μs | 0.802ns | 3.1ns | 0.104 | 0 | 0 | 658 B |
Throughput/Crank Report ⚡
Throughput results for AspNetCoreSimpleController comparing the following branches/commits:
Cases where throughput results for the PR are worse than latest master (5% drop or greater) are shown in red. Note that these results are based on a single point-in-time result for each branch. For full results, see one of the many, many dashboards!
gantt
title Throughput Linux x64 (Total requests)
dateFormat X
axisFormat %s
section Baseline
This PR (5831) (11.110M) : 0, 11109933
master (11.077M) : 0, 11076833
benchmarks/2.9.0 (11.081M) : 0, 11080577
section Automatic
This PR (5831) (7.335M) : 0, 7335171
master (7.345M) : 0, 7344633
benchmarks/2.9.0 (7.732M) : 0, 7732233
section Trace stats
master (7.679M) : 0, 7679279
section Manual
master (10.938M) : 0, 10938170
section Manual + Automatic
This PR (5831) (6.847M) : 0, 6847016
master (6.834M) : 0, 6834026
section DD_TRACE_ENABLED=0
master (10.142M) : 0, 10142100
gantt
title Throughput Linux arm64 (Total requests)
dateFormat X
axisFormat %s
section Baseline
This PR (5831) (9.431M) : 0, 9430915
master (9.565M) : 0, 9564825
benchmarks/2.9.0 (9.798M) : 0, 9798067
section Automatic
This PR (5831) (6.546M) : 0, 6546446
master (6.519M) : 0, 6519301
section Trace stats
master (6.608M) : 0, 6607677
section Manual
master (9.501M) : 0, 9500792
section Manual + Automatic
This PR (5831) (6.150M) : 0, 6149537
master (6.130M) : 0, 6130416
section DD_TRACE_ENABLED=0
master (8.723M) : 0, 8723414
gantt
title Throughput Windows x64 (Total requests)
dateFormat X
axisFormat %s
section Baseline
This PR (5831) (10.023M) : 0, 10023196
master (10.195M) : 0, 10195122
benchmarks/2.9.0 (10.067M) : 0, 10067315
section Automatic
This PR (5831) (6.887M) : 0, 6887310
master (6.952M) : 0, 6952184
benchmarks/2.9.0 (7.552M) : 0, 7552193
section Trace stats
master (7.465M) : 0, 7465157
section Manual
master (10.356M) : 0, 10356333
section Manual + Automatic
This PR (5831) (6.276M) : 0, 6276346
master (6.232M) : 0, 6232377
section DD_TRACE_ENABLED=0
master (9.469M) : 0, 9469059
🤞 Thanks
Summary of changes
Randomize the order of the tests.
Reason for change
Flaky tests are much harder to fix when we discover them long after they have been written. By randomizing the order of the tests, I'm hoping to make them fail earlier.
In practice, this could temporarily increase the overall flakiness, but I expect this will reduce the overall effort spent on fixing tests.
Implementation details
In `CustomTestFramework`, randomize the order of the tests within each collection, as well as the order of the collections themselves. The seed is displayed in the output. When a given test order causes tests to fail, the seed makes it possible to reproduce that order deterministically.
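The idea can be illustrated with a small language-neutral sketch. This is not the C# code in this PR: the function name `shuffle_test_order`, the dictionary input shape, and the log message are all assumptions made for illustration. It shows one seed driving both shuffles, so logging that seed is enough to replay a failing order.

```python
import random
from typing import Dict, List, Optional, Tuple


def shuffle_test_order(
    collections: Dict[str, List[str]],
    seed: Optional[int] = None,
) -> Tuple[int, List[Tuple[str, List[str]]]]:
    """Shuffle the collections and the tests inside each one, using one seed.

    Hypothetical helper for illustration only; the input maps a collection
    name to the names of its tests.
    """
    if seed is None:
        # No seed supplied: pick one, so that it can be reported and reused.
        seed = random.SystemRandom().randrange(2**31)
    print(f"Test order seed: {seed}")

    # Private generator: the order depends only on the seed, not global state.
    rng = random.Random(seed)

    # Sort before shuffling so discovery order cannot influence the result.
    names = sorted(collections)
    rng.shuffle(names)

    ordered = []
    for name in names:
        tests = sorted(collections[name])
        rng.shuffle(tests)
        ordered.append((name, tests))
    return seed, ordered
```

Calling `shuffle_test_order(collections)` picks and prints a fresh seed; passing that printed value back as `seed=` yields the same order again, which is the property that makes an order-dependent failure reproducible.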
Other details
Four other issues were found thanks to this: #6535, #6532, #6511, #6509