fix(profiling): remove slow getpid call from memalloc path #11848
Datadog Report
Branch report: ✅ 0 Failed, 289 Passed, 1179 Skipped, 8m 2.93s Total duration (28m 21.92s time saved)
Benchmarks
Benchmark execution time: 2025-01-02 19:16:11
Comparing candidate commit a69241a in PR branch
Found 1 performance improvements and 0 performance regressions! Performance is the same for 393 metrics, 2 unstable metrics.
scenario:flasksimple-profiler
memalloc uses getpid to detect whether the process has forked, so that we can unlock the memalloc lock in the child process (if it isn't already locked). Unfortunately the getpid call is quite slow. From the man page: "calls to getpid() always invoke the actual system call, rather than returning a cached value." Furthermore, we _always_ attempt to take the lock for allocations, even if we aren't going to sample them. So this is basically adding a syscall to every allocation.

Move this logic out of the allocation path. Switch to using pthread_atfork handlers to ensure that the lock is held prior to forking, and unlock it in the parent and child after forking.

This (maybe) has the added benefit of making sure the data structures are in a consistent state in the child process after forking. Unclear if that's an issue prior to this change, though. I may be missing some code that resets the profiler on fork anyway?
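For readers who want to see the shape of the fork-handler approach, here is a minimal sketch in Python rather than the C used by memalloc. It is an analogue, not the code from this PR: `os.register_at_fork` (Python 3.7+, POSIX only) plays the role of pthread_atfork, and `install_fork_handlers`, `record_allocation`, and `fork_and_check` are hypothetical names invented for the illustration.

```python
import os
import threading


def install_fork_handlers(lock):
    """Hold `lock` across fork() and release it on both sides afterwards.

    Illustrative analogue of pthread_atfork(prepare, parent, child);
    not the actual memalloc implementation.
    """

    def before():
        # Runs in the forking thread just before fork(): once we hold the
        # lock, no other thread can be halfway through an update.
        lock.acquire()

    def after():
        # Runs in the forking thread after fork(), in parent and in child.
        # Both copies of the lock are held at this point, so release each.
        lock.release()

    os.register_at_fork(before=before, after_in_parent=after, after_in_child=after)


def record_allocation(lock, samples, size):
    # Hot path: take the lock and record. No getpid() check is needed here,
    # because the fork handlers already leave the lock usable in the child.
    with lock:
        samples.append(size)


def fork_and_check(lock):
    """Fork once and report whether the child was able to take the lock."""
    pid = os.fork()
    if pid == 0:
        # Child: the after-fork handler should have released the lock.
        ok = lock.acquire(timeout=1)
        os._exit(0 if ok else 1)
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

The point of the design is that the fork bookkeeping runs once per fork instead of once per allocation, which is where the per-allocation syscall described above goes away.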
3277223 to a69241a
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.17 2.17
# Navigate to the new working tree
cd .worktrees/backport-2.17
# Create a new branch
git switch --create backport-11848-to-2.17
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6bfe77ede64278fadbd64131fa14ad123417c7ec
# Push it to GitHub
git push --set-upstream origin backport-11848-to-2.17
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.17
Then, create a pull request where the
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.19 2.19
# Navigate to the new working tree
cd .worktrees/backport-2.19
# Create a new branch
git switch --create backport-11848-to-2.19
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6bfe77ede64278fadbd64131fa14ad123417c7ec
# Push it to GitHub
git push --set-upstream origin backport-11848-to-2.19
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.19
Then, create a pull request where the
…2.18] (#11849) Backport 6bfe77e from #11848 to 2.18.
…2.19] (#11964) Backport 6bfe77e from #11848 to 2.19.

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

Co-authored-by: Nick Ripley <[email protected]>