Per Script Invocation Lua Memory Limits #903
…ust to prove it's possible; squashing to clean up _a lot_ of experimentation commits
… for LuaScripts, fixes that
It will probably be unusual to use this allocator, but it shouldn't be _bad_ either.
//
// Since we're managing the start/end pointers outside of the buffer
// we need to signal that the buffer has data to copy
bufferManager.GrowBuffer(alwayCopy: true);
This seems to copy the entire old buffer to the new buffer, regardless of where the manually managed start and end pointers are. It might be more efficient to tell GrowBuffer which segment it needs to copy over?
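To make the suggestion concrete, here is a minimal sketch in C of a grow operation that copies only the live segment. It is not Garnet's GrowBuffer: the seg_buffer type, its fields, and seg_buffer_grow are hypothetical names invented for illustration, and the sketch assumes the live data is the half-open range [start, end) and that offsets should stay valid after growing.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer: live data occupies offsets [start, end) of base. */
typedef struct {
    unsigned char *base;
    size_t cap;    /* total bytes allocated */
    size_t start;  /* offset of the first live byte */
    size_t end;    /* offset one past the last live byte */
} seg_buffer;

/* Grow to new_cap bytes, copying only the live segment.
 * Offsets are preserved, so externally managed start/end stay valid.
 * Returns 0 on success, -1 on failure (buffer left unchanged). */
int seg_buffer_grow(seg_buffer *b, size_t new_cap)
{
    size_t live = b->end - b->start;
    if (new_cap <= b->cap)
        return -1;                      /* not actually growing */

    unsigned char *fresh = malloc(new_cap);
    if (fresh == NULL)
        return -1;

    if (live > 0)
        memcpy(fresh + b->start, b->base + b->start, live);

    free(b->base);
    b->base = fresh;
    b->cap = new_cap;
    return 0;
}
```

Whether this wins anything in practice depends on how much of the old buffer is typically dead at grow time; if the live segment is usually most of the buffer, the full copy costs about the same.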
This would be the most concerning for the PR. What is causing this drop, and if it is the trampoline, then is there a way to enable an unsafe mode that avoids this overhead?
/// </summary>
internal unsafe class LuaTrackedAllocator : ILuaAllocator
{
    // TODO: XXXMemoryPressure here might be unnecessary
nit: this comment still relevant?
public SessionScriptCache(StoreWrapper storeWrapper, IGarnetAuthenticator authenticator, ILogger logger = null)
{
    this.storeWrapper = storeWrapper;
    this.logger = logger;

    scratchBufferNetworkSender = new ScratchBufferNetworkSender();
    processor = new RespServerSession(0, scratchBufferNetworkSender, storeWrapper, null, authenticator, false);

    // There's some parsing involved in these, so save them off per-session
    memoryManagementMode = storeWrapper.serverOptions.LuaOptions.MemoryManagementMode;
It seems these lines are causing the BDN benchmarks for BasicOperations, ObjectOperations, and HashObjectOperations to fail, as something (perhaps storeWrapper) is null here:
System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
---> System.NullReferenceException: Object reference not set to an instance of an object.
at Garnet.server.SessionScriptCache..ctor(StoreWrapper storeWrapper, IGarnetAuthenticator authenticator, ILogger logger) in /_/libs/server/Lua/SessionScriptCache.cs:line 41
at Garnet.server.RespServerSession..ctor(Int64 id, INetworkSender networkSender, StoreWrapper storeWrapper, SubscribeBroker`3 subscribeBroker, IGarnetAuthenticator authenticator, Boolean enableScripts) in /_/libs/server/Resp/RespServerSession.cs:line 221
at Embedded.server.EmbeddedRespServer.GetRespSession() in /_/benchmark/BDN.benchmark/Embedded/EmbeddedRespServer.cs:line 41
at BDN.benchmark.Operations.OperationsBase.GlobalSetup() in /_/benchmark/BDN.benchmark/Operations/OperationsBase.cs:line 80
at BDN.benchmark.Operations.BasicOperations.GlobalSetup() in /_/benchmark/BDN.benchmark/Operations/BasicOperations.cs:line 20
at BenchmarkDotNet.Engines.EngineFactory.CreateReadyToRun(EngineParameters engineParameters)
at BenchmarkDotNet.Autogenerated.Runnable_0.Run(IHost host, String benchmarkName) in /_/benchmark/BDN.benchmark/bin/Release/net8.0/cb61c2e4-da46-43ab-8a17-882e6ff8a654/cb61c2e4-da46-43ab-8a17-882e6ff8a654.notcs:line 177
at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
at System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(Object obj, Span`1 copyOfArgs, BindingFlags invokeAttr)
--- End of inner exception stack trace ---
at System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(Object obj, Span`1 copyOfArgs, BindingFlags invokeAttr)
at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
at BenchmarkDotNet.Autogenerated.UniqueProgramName.AfterAssemblyLoadingAttached(String[] args) in /_/benchmark/BDN.benchmark/bin/Release/net8.0/cb61c2e4-da46-43ab-8a17-882e6ff8a654/cb61c2e4-da46-43ab-8a17-882e6ff8a654.notcs:line 57
Example action run: https://github.com/microsoft/garnet/actions/runs/12681127499/job/35344227191
Another decent sized one, though hopefully this is the last "big" Lua PR - the rest I can foresee should be smaller.
TODOs:
Are memory pressure updates necessary? Have a thread with .NET GC folks for this. Got our answer, they are correct to have here.
Behavior when scripts aborted? Redis is weird here. It's reasonable for writes that happened pre-abort to still happen. We can explore rollback if there's a pressing need, but it's non-trivial.
This introduces the ability to specify maximum memory limits for Lua scripts; currently this is a single config (--lua-script-memory-limit). To enable this we also have to introduce custom allocators (--lua-memory-management-mode) for Lua. There are 3 in this PR: Native (the current behavior, where Lua provides the allocator), Tracked (where memory is acquired with NativeMemory and GC pressure is updated), and Managed (where a POH array is pre-allocated and memory is obtained from a freelist punned over that allocation).
In order to gracefully handle Lua OOMs, more of the operation of LuaRunner (things like compilation and the preamble) is hidden behind Lua PCalls. This is a necessary change, as the default behavior of Lua is to abort the process in the face of OOMs - PCalls prevent that.
To make the PCall changes less expensive (and just generally less awful), I introduced some (Strong, not Pinned) GCHandles, function pointers, and trampolines. At the end of this, we're basically just using KeraLua to package Lua and define some constants - none of the .NET code is really running anymore. If we really wanted, we could build Lua ourselves (maybe even drop down to 5.1 to match Redis) and exploit that tight coupling - but I have no intention of doing so at this time.
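For readers unfamiliar with why a protected call turns a fatal OOM into a recoverable error, the following is a rough sketch in C of the mechanism. It mirrors the setjmp/longjmp approach that the reference Lua implementation uses when built as C, but it is not Lua's code and not this PR's code: protected_call, raise_error, the status constants, and the two sample bodies are all hypothetical names for illustration.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

enum { CALL_OK = 0, CALL_ERR_MEM = 1 };

/* One frame per active protected call; status is volatile because it is
 * written between setjmp and longjmp and read afterwards. */
typedef struct {
    jmp_buf jb;
    volatile int status;
} protected_frame;

static protected_frame *current_frame = NULL;

/* Raise an error. Outside any protected call there is nowhere to unwind
 * to, so the process aborts (analogous to an unprotected Lua error). */
void raise_error(int status)
{
    if (current_frame == NULL)
        abort();
    current_frame->status = status;
    longjmp(current_frame->jb, 1);
}

/* Run fn(arg); if it raises, unwind back here and return the status. */
int protected_call(void (*fn)(void *), void *arg)
{
    protected_frame frame;
    protected_frame *prev = current_frame;

    frame.status = CALL_OK;
    current_frame = &frame;
    if (setjmp(frame.jb) == 0)
        fn(arg);
    current_frame = prev;   /* restore the enclosing frame, if any */
    return frame.status;
}

/* Sample bodies: one succeeds, one simulates an allocator failure. */
void ok_body(void *arg)  { *(int *)arg = 1; }
void oom_body(void *arg) { (void)arg; raise_error(CALL_ERR_MEM); }
```

With this shape, calling protected_call(oom_body, NULL) returns CALL_ERR_MEM to the caller instead of terminating the process, which is the property the PR relies on when it moves compilation and the preamble behind PCalls.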
When improving the Lua OOM RESP error, I also found a bug in a previous PR around buffer management - it is fixed in this commit.
The Allocators
Native
This is the default.
This just uses the built-in Lua allocator, which is a thin shim over malloc. It should perform a bit better than Tracked simply because there isn't any .NET code in the way.
Native does not support memory limits.
Tracked
A thin wrapper over NativeMemory. It supports memory limits, and will fail once total requested bytes exceed the configured limit. Since it cannot see the overhead of NativeMemory, the limit is only softly enforced.
This currently calls GC.(Add|Remove)MemoryPressure, but see Open Questions.
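As an illustration of the Tracked idea, here is a small sketch in C of an allocator with the shape of Lua's lua_Alloc callback (ud, ptr, osize, nsize) that counts requested bytes against a limit. It is an assumption-laden stand-in, not the PR's LuaTrackedAllocator: tracked_state and tracked_alloc are hypothetical names, malloc/realloc/free stand in for NativeMemory, a limit of 0 is taken to mean "no limit", and the GC pressure calls have no C equivalent and are omitted.

```c
#include <stdlib.h>

/* Hypothetical per-script accounting state, passed to Lua as `ud`. */
typedef struct {
    size_t limit;   /* 0 means "no limit" in this sketch */
    size_t in_use;  /* bytes Lua has requested and not yet released */
} tracked_state;

/* Same shape as lua_Alloc. Only requested sizes are counted, so the
 * underlying allocator's own overhead is invisible: a soft limit. */
void *tracked_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
{
    tracked_state *st = ud;
    /* When ptr is NULL, Lua passes an object-type tag in osize, not a size. */
    size_t old = (ptr != NULL) ? osize : 0;

    if (nsize == 0) {               /* free request */
        free(ptr);
        st->in_use -= old;
        return NULL;
    }

    /* Refuse growth that would push the running total past the limit. */
    if (nsize > old && st->limit != 0 &&
        nsize - old > st->limit - st->in_use)
        return NULL;                /* Lua turns this into a memory error */

    void *fresh = realloc(ptr, nsize);
    if (fresh == NULL)
        return NULL;                /* original block is still valid */

    st->in_use = st->in_use - old + nsize;
    return fresh;
}
```

Returning NULL for a growth request is what surfaces as an out-of-memory error inside the script, which is why this only works safely once the surrounding work runs under a protected call.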
Managed (w/ and w/o Limits)
A really basic free-list based allocator over a POH array. If a limit is configured, it pre-allocates the total limit up front and strictly limits allocations, since the allocator's own overhead can be seen.
If a limit is not configured, 2MB (or larger, if the requested size exceeds 2MB) arrays are allocated as needed.
We could certainly do a lot better here (I imagine there's something existing in Garnet I could steal or repurpose), but this is mostly a proof we could get Lua 100% onto the managed heap. That said, I couldn't help but profile a little bit, so it shouldn't be awful given Lua's allocation patterns.
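To show why a fixed pre-allocated region gives a strict limit, here is a minimal first-fit sketch in C with block headers stored inside the region itself. The PR's actual layout is not shown in this text, so everything here is an assumption: arena, block, arena_init, arena_alloc, and arena_free are hypothetical names, the region is assumed to be 16-byte aligned and at least 32 bytes, and a plain byte region stands in for the POH array.

```c
#include <stddef.h>

#define ALIGN 16u

/* Header stored in front of every block; padded to 16 bytes so payloads
 * stay 16-byte aligned when the region itself is. */
typedef struct block {
    _Alignas(16) size_t size;  /* payload bytes, a multiple of ALIGN */
    size_t is_free;
} block;

typedef struct {
    unsigned char *base;       /* start of the pre-allocated region */
    size_t cap;                /* usable bytes: the hard limit */
} arena;

void arena_init(arena *a, void *mem, size_t cap)
{
    a->base = mem;
    a->cap = cap - cap % ALIGN;
    block *first = (block *)a->base;
    first->size = a->cap - sizeof(block);
    first->is_free = 1;
}

/* Blocks are laid out back to back; NULL once we walk off the end. */
static block *next_block(const arena *a, block *b)
{
    unsigned char *p = (unsigned char *)(b + 1) + b->size;
    return p < a->base + a->cap ? (block *)p : NULL;
}

void *arena_alloc(arena *a, size_t n)
{
    if (n == 0 || n > a->cap)
        return NULL;
    n = (n + ALIGN - 1) & ~(size_t)(ALIGN - 1);   /* round up */

    for (block *b = (block *)a->base; b != NULL; b = next_block(a, b)) {
        if (!b->is_free || b->size < n)
            continue;
        /* Split off the tail if it can hold a header plus a payload. */
        if (b->size - n >= sizeof(block) + ALIGN) {
            block *rest = (block *)((unsigned char *)(b + 1) + n);
            rest->size = b->size - n - sizeof(block);
            rest->is_free = 1;
            b->size = n;
        }
        b->is_free = 0;
        return b + 1;
    }
    return NULL;   /* region exhausted: headers count against the limit */
}

void arena_free(arena *a, void *p)
{
    if (p == NULL)
        return;
    ((block *)p - 1)->is_free = 1;

    /* Merge runs of adjacent free blocks in a single pass. */
    for (block *b = (block *)a->base; b != NULL; ) {
        block *nx = next_block(a, b);
        if (b->is_free && nx != NULL && nx->is_free)
            b->size += sizeof(block) + nx->size;  /* stay on b, may merge again */
        else
            b = nx;
    }
}
```

Because headers live inside the same region as payloads, bookkeeping overhead is charged against the limit automatically, which is the difference from the soft limit of the Tracked mode. A linear walk like this is the simplest possible policy; a real allocator would likely keep an explicit list of free blocks.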
Open Questions
Is GC pressure actually needed in the Tracked case?
Docs say:
Which makes very little sense to me, as in a container (like a job) with memory limits the presence or lack of a finalizer seems irrelevant to whether the GC needs to be informed of native allocations?
Ultimately the .NET GC folks will just have to answer this one, I've opened a thread with them.
Docs are (somewhat) incorrect here, and will be updated. It is correct, but not strictly necessary, to have these calls in the Tracked case. I'm leaving them in so the GC can respond more promptly to memory pressure.
What is expected behavior when a script is aborted?
This change introduces a case where a script might be aborted, and I expect future changes (timeouts, and potentially SCRIPT|KILL) to add more.
Redis doesn't allow this - you are expected to let Redis crash, or force a shutdown, if a script goes out of control. That's kind of nuts, IMO, especially for any HA service.
However, by deviating from Redis (with this opt-in switch), we do need to define expected behavior.
Right now, the behavior is "any commands that executed in the script, executed". Commands cannot half-execute, but scripts can, basically.
Is this acceptable, or do we need some (presumably configurable) rollback behavior?
With transactions enabled we already know the scope of "needs to be rolled back", but the implementation would be non-trivial.
Decision: No rollbacks
Summarizing some discussion:
Benchmarks
I changed ScriptOperations to use LuaParams instead of OperationParams as we were already ignoring most of the operation variants there. Now all Lua-related benchmarks run with different allocators enabled: Native (the old behavior, and current default), Tracked w/ 2M limit, Tracked w/o a limit, Managed w/ 2M limit, and Managed w/o limit.
main results are as of ce21c248f084744e45bbff08d0ecce0a51326cca. luaMemoryLimits are as of a2996e9ae5f7e9c44a8848e44cc91417ddf418c4.
Broadly speaking, we're giving up a bit of perf for the ability to recover from OOMs (and other runtime errors, technically). There's some work that could be done to claw bits of this back, in theory, but we are actually doing more with this change.
LuaRunnerOperations
Comparing the baseline and the Native,None case, we're giving up a small amount across the board. Worst case ~9%, though these are very fast (ns) already.
main
luaMemoryLimits
LuaScriptCacheOperations
Cases where we construct a new LuaRunner are a bit slower, though most of these are in the error bounds.
main
luaMemoryLimits
LuaScripts
Giving up ~32% in the worst case (comparing baseline to Native,None, Script4).
main
luaMemoryLimits
ScriptOperations
This is more of a mixed bag: LargeScript is improved somewhat (~6%), while very basic evaluations like Eval and EvalSha are a bit slower. The loss is probably due to the pcall, and the gains are probably peanut butter improvements in calls to and from Lua from .NET.
main (eliding Params != None)
luaMemoryLimits