-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 5954 - Disable Transparent Huge Pages #5955
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. minor questions
Makefile.am
Outdated
@@ -667,7 +667,7 @@ schema_DATA = $(srcdir)/ldap/schema/99user.ldif | |||
libexec_SCRIPTS = | |||
|
|||
if SYSTEMD | |||
libexec_SCRIPTS += wrappers/ds_systemd_ask_password_acl wrappers/ds_selinux_restorecon.sh | |||
libexec_SCRIPTS += wrappers/ds_systemd_ask_password_acl wrappers/ds_systemd_disable_thp wrappers/ds_selinux_restorecon.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patch looks good to me. The script ds_systemd_disable_thp is a shell script. Any reason to not name it ds_systemd_disable_thp.sh ?
I have no strong opinion about that just for curiosity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used the same naming as for ds_systemd_ask_password_acl
. I have no strong opinion as well, we need someone else to chime in :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer without the .sh, it's just cleaner.
wrappers/systemd.template.service.in
Outdated
@@ -16,6 +16,7 @@ EnvironmentFile=-@initconfigdir@/@package_name@-%i | |||
PIDFile=/run/@package_name@/slapd-%i.pid | |||
ExecStartPre=@libexecdir@/ds_systemd_ask_password_acl @instconfigdir@/slapd-%i/dse.ldif | |||
ExecStartPre=@libexecdir@/ds_selinux_restorecon.sh @instconfigdir@/slapd-%i/dse.ldif | |||
ExecStartPre=@libexecdir@/ds_systemd_disable_thp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in dirsrv systemd file but the name '/sys/kernel/mm/transparent_hugepage' may suggest a system wide tuning. Is it disabling THP only for dirsrv or for the all system ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It disables THP for the whole system, I'm not aware of a way to disable it just for one process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anton said there was an API (I think) that we can use that can do it per process. Or maybe just making some system call? He definitely said there was a way to do it - just not 100% sure what it was.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://www.digitalocean.com/blog/transparent-huge-pages-and-alternative-memory-allocators
This has a good writeup on the topic. It seems to disable them via madvise you need to know the ranges in use. There seems to be some questions about tcmalloc + thp google/tcmalloc#190
But tcmalloc also has: https://github.com/google/tcmalloc/blob/master/docs/tuning.md
So it seems that disabling THP may cause other issues here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mreynolds389, thanks for the pointer. I found that PR_SET_THP_DISABLE
can be set on a process:
https://github.com/discourse/discourse_docker/blob/master/image/base/thpoff.c
@Firstyear, first link is exactly the problem we're having, since we use jemalloc.
Looking at https://jemalloc.net/jemalloc.3.html, there is an option to turn off THP as well:
opt.thp (const char *) r-
Transparent hugepage (THP) mode. Settings "always", "never" and "default" are available if THP is supported by the operating system. The "always" setting enables transparent hugepage for all user memory mappings with MADV_HUGEPAGE; "never" ensures no transparent hugepage with MADV_NOHUGEPAGE; the default setting "default" makes no changes. Note that: this option does not affect THP for jemalloc internal metadata (see [opt.metadata_thp](https://jemalloc.net/jemalloc.3.html#opt.metadata_thp)); in addition, for arenas with customized [extent_hooks](https://jemalloc.net/jemalloc.3.html#arena.i.extent_hooks), this option is bypassed as it is implemented as part of the default extent hooks.
To summarize, we have 3 ways to turn off THP:
- System-wide
- Per process via
prctl
- Per process via jemalloc options
I was leaning towards the system-wide option, because we plan in the future to utilize TuneD profiles for RHDS workloads, such as throughput-performance or network-latency and move our custom sysctl.conf settings there as well. Profile setting transparent_hugepages=never
disables THP system-wide via sysfs knob, same as in the ExecStartPre script. But we're not switching to TuneD until we complete performance testing across different profiles and workloads. And this PR was supposed to be a stopgap solution.
Now I'm leaning towards option 3, because 389DS can be used not only as a single workload on a server (RHDS), but also as part of FreeIPA/IdM, where some components such as Dogtag PKI can be affected by turning off this optimization (THP).
Finally, we can leave only the healthcheck part, and leave the decision to the admin based on their unique workload. Also, I will change it to trigger only on large caches, not sure what's the minimum size that can be considered "large" though.
Please let me know what do you prefer.
Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I my mind, an ideal option would be a per process tuning set by an environment variable in the systemd config file. I did not find such environment variable. Between option 2 and 3, I tend to prefer '2' because it does not rely on an allocator (jemalloc) that can change in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened another PR #5964 with option 2.
Bug Description: THP can have negative effects on DS performance when large caches are used. Fix Description: * Add a new variable for `ns-slapd` THP_DISABLE. When THP_DISABLE is set to 1, THP is disabled for `ns-slapd` process via `prctl(2)`. With any other value, THP settings are untouched. Before: ``` $ grep THP /proc/$(pidof ns-slapd)/status THP_enabled: 1 ``` After ``` $ grep THP /proc/$(pidof ns-slapd)/status THP_enabled: 0 ``` * Add a new healthcheck linter, that checks if THP is disabled system-wide or per instance. In case THP is enabled for both the system and the process, it prints recommendations how to disable THP. Fixes: #5954 Reviewed-by: ???
Bug Description:
THP can have negative effects on DS performance when large caches are used.
Fix Description:
Fixes: #5954
Reviewed-by: ???