Per Script Invocation Lua Memory Limits #903

kevin-montrose · 2025-01-07T20:16:35Z

Another decent sized one, though hopefully this is the last "big" Lua PR - the rest I can foresee should be smaller.

TODOs:

- Custom allocators working
- New config options for allocators and memory limits
- Update benchmarks
- Get final benchmark numbers
- Answer open questions
  - ~~Are memory pressure updates necessary? Have a thread with .NET GC folks for this.~~ Got our answer, they are correct to have here.
  - ~~Behavior when scripts aborted? Redis is weird here.~~ It's reasonable for writes that happened pre-abort to still happen. We can explore rollback if there's a pressing need, but it's non-trivial.

This introduces the ability to specify maximum memory limits for Lua scripts; currently this is a single config (--lua-script-memory-limit). To enable this we also have to introduce custom allocators (--lua-memory-management-mode) for Lua. There are 3 in this PR: Native (the current behavior, where Lua provides the allocator), Tracked (where memory is acquired with NativeMemory and GC pressure is updated), and Managed (where a POH array is pre-allocated and memory is obtained from a freelist punned over that allocation).

To handle Lua OOMs gracefully, more of the operation of LuaRunner (things like compilation and the preamble) is now hidden behind Lua PCalls. This is a necessary change, as the default behavior of Lua is to abort the process in the face of OOMs - PCalls prevent that.

To make the PCall changes less expensive (and just generally less awful), I introduced some (Strong, not Pinned) GCHandles, function pointers, and trampolines. At the end of this, we're basically just using KeraLua to package Lua and define some constants - none of the .NET code is really running anymore. If we really wanted, we could build Lua ourselves (maybe even drop down to 5.1 to match Redis) and exploit that tight coupling - but I have no intention of doing so at this time.
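To illustrate the GCHandle / function pointer / trampoline combination described above, here is a minimal sketch that runs without Lua. It is not the code in this PR: `RunnerSketch`, `Trampoline`, and `InvokeThroughPointer` are hypothetical names, and the call through the function pointer stands in for native code calling back into .NET. It needs `AllowUnsafeBlocks` enabled.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical illustration only; not the LuaRunner from this PR.
public sealed unsafe class RunnerSketch
{
    public int Calls;

    // Static entry point that native code could call. The managed object is
    // recovered from the opaque context value instead of a captured delegate.
    [UnmanagedCallersOnly]
    private static int Trampoline(nint context)
    {
        var self = (RunnerSketch)GCHandle.FromIntPtr(context).Target;
        return ++self.Calls;
    }

    public static int InvokeThroughPointer(RunnerSketch runner)
    {
        // GCHandle.Alloc(object) creates a Normal (strong) handle: it keeps the
        // object alive without pinning it in place.
        GCHandle handle = GCHandle.Alloc(runner);
        try
        {
            delegate* unmanaged<nint, int> fn = &Trampoline;
            return fn(GCHandle.ToIntPtr(handle));
        }
        finally
        {
            handle.Free();
        }
    }
}
```

In a real integration the function pointer and the handle's IntPtr would be handed to native code once and reused across calls, rather than allocated per invocation as in this sketch.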

When improving the Lua OOM RESP error, I also found a bug in a previous PR around buffer management - it is fixed in this commit.

The Allocators

Native

This is the default.

This just uses the built-in Lua allocator, which is a thin shim over malloc. It should perform a bit better than Tracked simply because there isn't any .NET code in the way.

Native does not support memory limits.

Tracked

A thin wrapper over NativeMemory. It supports memory limits, and will fail once total requested bytes exceed the configured limit. Since it cannot see the overhead of NativeMemory, the limit is only softly enforced.

This currently calls GC.(Add|Remove)MemoryPressure, but see Open Questions.
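A rough sketch of the Tracked idea follows: count requested bytes, refuse requests past the limit, and mirror each allocation into GC memory pressure. This is an assumption-laden illustration, not the PR's LuaTrackedAllocator; the type name and the `Alloc`/`Free` shape are hypothetical, and only `NativeMemory.Alloc`, `NativeMemory.Free`, `GC.AddMemoryPressure`, and `GC.RemoveMemoryPressure` are real APIs. It needs `AllowUnsafeBlocks` enabled.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical illustration only; not the allocator from this PR.
public sealed unsafe class TrackedAllocatorSketch
{
    private readonly long? limitBytes;
    private long requestedBytes;

    public TrackedAllocatorSketch(long? limitBytes) => this.limitBytes = limitBytes;

    public long RequestedBytes => requestedBytes;

    // Returns null when the request would push the running total past the limit.
    // Only requested bytes are counted; NativeMemory's own bookkeeping overhead
    // is invisible here, which is why the limit is soft.
    public void* Alloc(nuint size)
    {
        if (size == 0) return null;

        long newTotal = requestedBytes + (long)size;
        if (limitBytes is long limit && newTotal > limit) return null;

        void* ptr = NativeMemory.Alloc(size);
        requestedBytes = newTotal;
        GC.AddMemoryPressure((long)size);
        return ptr;
    }

    // The caller supplies the original size, as Lua's allocator callback does.
    public void Free(void* ptr, nuint size)
    {
        if (ptr == null) return;

        NativeMemory.Free(ptr);
        requestedBytes -= (long)size;
        GC.RemoveMemoryPressure((long)size);
    }
}
```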

Managed (w/ and w/o Limits)

A really basic free-list based allocator over a POH array. If a limit is configured, it pre-allocates the full limit up front and enforces it strictly, since the allocator's own overhead is visible to it.

If a limit is not configured, 2MB (or larger, if the requested size exceeds 2MB) arrays are allocated as needed.

We could certainly do a lot better here (I imagine there's something existing in Garnet I could steal or repurpose), but this is mostly a proof we could get Lua 100% onto the managed heap. That said, I couldn't help but profile a little bit, so it shouldn't be awful given Lua's allocation patterns.
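For readers unfamiliar with the "free list punned over a POH array" idea, the sketch below shows one simple way it could look: a first-fit free list whose block headers live inside a single pinned array. It is not the allocator in this PR. The type name, the 8-byte header layout, and returning offsets instead of pointers are all assumptions made for illustration, and it does no coalescing of adjacent free blocks.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical illustration only; not the managed allocator from this PR.
public sealed class PohFreeListSketch
{
    private const int HeaderSize = 8; // [int payloadSize][int nextFreeBlock]
    private const int None = -1;

    private readonly byte[] buffer;
    private int freeHead;

    public PohFreeListSketch(int limitBytes)
    {
        // pinned: true allocates on the Pinned Object Heap, so the array never moves.
        buffer = GC.AllocateUninitializedArray<byte>(limitBytes, pinned: true);
        freeHead = 0;
        WriteHeader(0, limitBytes - HeaderSize, None);
    }

    private int SizeAt(int block) => BinaryPrimitives.ReadInt32LittleEndian(buffer.AsSpan(block));
    private int NextAt(int block) => BinaryPrimitives.ReadInt32LittleEndian(buffer.AsSpan(block + 4));

    private void WriteHeader(int block, int size, int next)
    {
        BinaryPrimitives.WriteInt32LittleEndian(buffer.AsSpan(block), size);
        BinaryPrimitives.WriteInt32LittleEndian(buffer.AsSpan(block + 4), next);
    }

    // Returns the payload offset into the array, or -1 when nothing fits.
    public int Allocate(int size)
    {
        size = (size + 7) & ~7; // round up so blocks stay 8-byte aligned
        int prev = None;
        for (int cur = freeHead; cur != None; prev = cur, cur = NextAt(cur))
        {
            int curSize = SizeAt(cur);
            if (curSize < size) continue;

            int next = NextAt(cur);
            int remainder = curSize - size;
            if (remainder > HeaderSize)
            {
                // Split: the tail of this block becomes a new free block.
                int tail = cur + HeaderSize + size;
                WriteHeader(tail, remainder - HeaderSize, next);
                WriteHeader(cur, size, None);
                next = tail;
            }
            else
            {
                WriteHeader(cur, curSize, None);
            }

            // Unlink the chosen block from the free list.
            if (prev == None) freeHead = next;
            else WriteHeader(prev, SizeAt(prev), next);
            return cur + HeaderSize;
        }
        return -1; // the pre-allocated limit is exhausted
    }

    // Pushes the block back onto the head of the free list (no coalescing).
    public void Free(int payloadOffset)
    {
        int block = payloadOffset - HeaderSize;
        WriteHeader(block, SizeAt(block), freeHead);
        freeHead = block;
    }
}
```

Because the headers are stored in the same array as the payloads, every byte of overhead counts against the pre-allocated size, which is what makes strict enforcement possible in this style of allocator.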

Open Questions

Is GC pressure actually needed in the `Tracked` case?

~~Docs say:~~

> The AddMemoryPressure and RemoveMemoryPressure methods improve performance only for types that exclusively depend on finalizers to release the unmanaged resources. It's not necessary to use these methods in types that follow the dispose pattern, where finalizers are used to clean up unmanaged resources only in the event that a consumer of the type forgets to call Dispose.

This makes very little sense to me: in a container (like a job) with memory limits, the presence or absence of a finalizer seems irrelevant to whether the GC needs to be informed of native allocations.

~~Ultimately the .NET GC folks will just have to answer this one, I've opened a thread with them.~~

Docs are (somewhat) incorrect here, and will be updated. It is correct, but not strictly necessary, to have these calls in the Tracked case. I'm leaving them in so the GC can respond more promptly to memory pressure.

What is expected behavior when a script is aborted?

This change introduces a case where a script might be aborted, and I expect future changes (timeouts, and potentially SCRIPT|KILL) to add more.

Redis doesn't allow this - you are expected to let Redis crash, or force a shutdown, if a script goes out of control. That's kind of nuts, IMO, especially for any HA service.

However, by deviating from Redis (with this opt-in switch), we do need to define expected behavior.

Right now, the behavior is "any commands that executed in the script, executed". Commands cannot half-execute, but scripts can, basically.

Is this acceptable, or do we need some (presumably configurable) rollback behavior?

With transactions enabled we already know the scope of "needs to be rolled back", but the implementation would be non-trivial.

Decision: No rollbacks

Summarizing some discussion:

If we are running in non-transaction mode, then we view Lua scripts as logically no different from a client issuing a sequence of calls, so persisting the commands that happened pre-abort is the only thing that makes sense.

If we are in transaction mode, it is possible some users expect atomicity - but this is going to be harder as we will not have the "before image" of keys stored anywhere. It is perfectly fine to document that there is no rollback in this situation. We will simply unlock the keys and "succeed" the partial transaction.

Benchmarks

I changed ScriptOperations to use LuaParams instead of OperationParams as we were already ignoring most of the operation variants there. Now all Lua-related benchmarks run with different allocators enabled: Native (the old behavior, and current default), Tracked w/ 2M limit, Tracked w/o a limit, Managed w/ 2M limit, and Managed w/o limit.

main results are as of ce21c248f084744e45bbff08d0ecce0a51326cca.
luaMemoryLimits are as of a2996e9ae5f7e9c44a8848e44cc91417ddf418c4.

Broadly speaking, we're giving up a bit of perf for the ability to recover from OOMs (and other runtime errors, technically). There's some work that could be done to claw bits of this back, in theory, but we are actually doing more with this change.

LuaRunnerOperations

Comparing the baseline and the Native,None case, we're giving up a small amount across the board. Worst case ~9%, though these are very fast (ns) already.

main

Method	Params	Mean	Error	StdDev	Median	Allocated
ResetParametersSmall	None	102.5 ns	0.51 ns	0.43 ns	102.6 ns	-
ResetParametersLarge	None	103.4 ns	0.50 ns	0.47 ns	103.4 ns	-
ConstructSmall	None	97,641.9 ns	609.86 ns	540.63 ns	97,877.1 ns	344 B
ConstructLarge	None	99,759.4 ns	1,113.26 ns	1,041.34 ns	99,650.9 ns	3408 B
CompileForSessionSmall	None	1,663.9 ns	32.35 ns	57.50 ns	1,689.1 ns	-
CompileForSessionLarge	None	34,445.3 ns	222.75 ns	208.36 ns	34,498.1 ns	-

luaMemoryLimits

Method	Params	Mean	Error	StdDev	Median	Gen0	Gen1	Gen2	Allocated
ResetParametersSmall	Managed,Limit	102.15 ns	0.366 ns	0.305 ns	102.16 ns	-	-	-	-
ResetParametersLarge	Managed,Limit	95.21 ns	0.692 ns	0.614 ns	95.12 ns	-	-	-	-
ConstructSmall	Managed,Limit	137,379.06 ns	2,712.401 ns	4,456.553 ns	137,299.33 ns	3.6621	3.6621	3.6621	2097606 B
ConstructLarge	Managed,Limit	138,607.89 ns	2,755.709 ns	4,753.463 ns	138,323.06 ns	3.6621	3.6621	3.6621	2100672 B
CompileForSessionSmall	Managed,Limit	6,339.17 ns	126.666 ns	160.192 ns	6,316.06 ns	-	-	-	99 B
CompileForSessionLarge	Managed,Limit	34,215.27 ns	137.625 ns	122.001 ns	34,210.77 ns	-	-	-	-
ResetParametersSmall	Managed,None	100.34 ns	0.361 ns	0.338 ns	100.28 ns	-	-	-	-
ResetParametersLarge	Managed,None	100.74 ns	0.475 ns	0.421 ns	100.58 ns	-	-	-	-
ConstructSmall	Managed,None	143,109.12 ns	2,836.777 ns	5,397.263 ns	142,720.70 ns	3.6621	3.6621	3.6621	2097678 B
ConstructLarge	Managed,None	157,441.66 ns	3,137.771 ns	7,697.007 ns	156,534.24 ns	3.6621	3.6621	3.6621	2100740 B
CompileForSessionSmall	Managed,None	253,209.01 ns	26,459.785 ns	78,017.278 ns	279,199.30 ns	-	-	-	512 B
CompileForSessionLarge	Managed,None	34,322.35 ns	155.958 ns	130.232 ns	34,329.75 ns	-	-	-	-
ResetParametersSmall	Native,None	99.18 ns	0.490 ns	0.459 ns	99.31 ns	-	-	-	-
ResetParametersLarge	Native,None	99.61 ns	0.462 ns	0.386 ns	99.53 ns	-	-	-	-
ConstructSmall	Native,None	106,559.60 ns	1,881.476 ns	1,759.934 ns	107,402.40 ns	-	-	-	328 B
ConstructLarge	Native,None	106,600.10 ns	1,103.980 ns	1,032.664 ns	106,645.17 ns	-	-	-	3392 B
CompileForSessionSmall	Native,None	2,067.66 ns	38.841 ns	71.995 ns	2,064.22 ns	-	-	-	-
CompileForSessionLarge	Native,None	34,540.81 ns	391.955 ns	347.458 ns	34,426.06 ns	-	-	-	-
ResetParametersSmall	Tracked,Limit	98.86 ns	0.596 ns	0.528 ns	98.69 ns	-	-	-	-
ResetParametersLarge	Tracked,Limit	100.35 ns	0.678 ns	0.601 ns	100.53 ns	-	-	-	-
ConstructSmall	Tracked,Limit	156,818.87 ns	940.860 ns	880.081 ns	157,031.77 ns	0.2441	0.2441	0.2441	401 B
ConstructLarge	Tracked,Limit	161,996.34 ns	928.417 ns	775.270 ns	162,092.98 ns	0.2441	0.2441	0.2441	3466 B
CompileForSessionSmall	Tracked,Limit	3,949.95 ns	65.316 ns	61.097 ns	3,948.64 ns	0.0076	0.0076	0.0076	-
CompileForSessionLarge	Tracked,Limit	41,873.48 ns	315.093 ns	279.322 ns	41,905.40 ns	0.1221	0.1221	0.1221	-
ResetParametersSmall	Tracked,None	105.51 ns	0.626 ns	0.555 ns	105.62 ns	-	-	-	-
ResetParametersLarge	Tracked,None	100.43 ns	0.649 ns	0.607 ns	100.48 ns	-	-	-	-
ConstructSmall	Tracked,None	160,488.50 ns	1,080.702 ns	1,010.889 ns	160,509.45 ns	0.2441	0.2441	0.2441	362 B
ConstructLarge	Tracked,None	159,200.11 ns	690.706 ns	612.293 ns	159,156.92 ns	0.2441	0.2441	0.2441	3426 B
CompileForSessionSmall	Tracked,None	4,056.16 ns	49.709 ns	41.510 ns	4,069.20 ns	0.0076	0.0076	0.0076	-
CompileForSessionLarge	Tracked,None	43,021.50 ns	527.114 ns	440.164 ns	42,957.61 ns	0.1221	0.1221	0.1221	-

LuaScriptCacheOperations

Cases where we construct a new LuaRunner are a bit slower, though most of these are within the error bounds.

main

Method	Params	Mean	Error	StdDev	Median	Allocated
LookupHit	None	2.855 μs	0.8448 μs	2.464 μs	1.450 μs	688 B
LookupMiss	None	2.504 μs	0.6450 μs	1.882 μs	3.150 μs	688 B
LoadOuterHit	None	3.472 μs	0.8717 μs	2.543 μs	3.200 μs	688 B
LoadInnerHit	None	220.146 μs	9.0449 μs	25.806 μs	213.550 μs	1056 B
LoadMiss	None	5.845 μs	0.7413 μs	2.139 μs	6.200 μs	688 B
Digest	None	14.450 μs	0.6912 μs	1.994 μs	13.800 μs	688 B

luaMemoryLimits

Method	Params	Mean	Error	StdDev	Median	Allocated
LookupHit	Managed,Limit	3.341 μs	0.6181 μs	1.803 μs	3.600 μs	64 B
LookupMiss	Managed,Limit	3.071 μs	0.5321 μs	1.560 μs	3.400 μs	688 B
LoadOuterHit	Managed,Limit	5.104 μs	0.6970 μs	2.022 μs	5.500 μs	688 B
LoadInnerHit	Managed,Limit	208.739 μs	9.3748 μs	27.642 μs	206.650 μs	2098560 B
LoadMiss	Managed,Limit	5.648 μs	0.9120 μs	2.631 μs	5.850 μs	688 B
Digest	Managed,Limit	14.363 μs	0.4871 μs	1.382 μs	14.300 μs	688 B
LookupHit	Managed,None	3.106 μs	0.6947 μs	2.027 μs	2.700 μs	688 B
LookupMiss	Managed,None	2.385 μs	0.6353 μs	1.863 μs	1.400 μs	688 B
LoadOuterHit	Managed,None	5.884 μs	0.5978 μs	1.734 μs	6.000 μs	688 B
LoadInnerHit	Managed,None	207.616 μs	13.4135 μs	38.915 μs	196.600 μs	2098384 B
LoadMiss	Managed,None	7.473 μs	0.5251 μs	1.498 μs	7.450 μs	688 B
Digest	Managed,None	13.751 μs	0.7736 μs	2.257 μs	14.250 μs	688 B
LookupHit	Native,None	3.546 μs	0.6263 μs	1.817 μs	4.150 μs	688 B
LookupMiss	Native,None	2.584 μs	0.6604 μs	1.937 μs	1.400 μs	688 B
LoadOuterHit	Native,None	5.173 μs	0.6795 μs	1.971 μs	5.600 μs	688 B
LoadInnerHit	Native,None	215.667 μs	4.3205 μs	9.927 μs	216.500 μs	1040 B
LoadMiss	Native,None	5.934 μs	0.7840 μs	2.262 μs	6.150 μs	688 B
Digest	Native,None	12.857 μs	1.0007 μs	2.951 μs	13.050 μs	688 B
LookupHit	Tracked,Limit	2.293 μs	0.6791 μs	1.970 μs	1.400 μs	688 B
LookupMiss	Tracked,Limit	2.584 μs	0.5875 μs	1.723 μs	1.900 μs	688 B
LoadOuterHit	Tracked,Limit	5.420 μs	0.7018 μs	2.036 μs	5.600 μs	688 B
LoadInnerHit	Tracked,Limit	247.422 μs	6.8352 μs	19.279 μs	242.900 μs	1072 B
LoadMiss	Tracked,Limit	5.975 μs	0.6976 μs	2.013 μs	6.200 μs	688 B
Digest	Tracked,Limit	13.461 μs	0.6482 μs	1.849 μs	13.550 μs	688 B
LookupHit	Tracked,None	3.379 μs	0.6682 μs	1.939 μs	4.100 μs	976 B
LookupMiss	Tracked,None	2.763 μs	0.6014 μs	1.754 μs	3.450 μs	688 B
LoadOuterHit	Tracked,None	4.794 μs	0.7989 μs	2.330 μs	5.500 μs	688 B
LoadInnerHit	Tracked,None	235.640 μs	5.0250 μs	13.840 μs	234.200 μs	1072 B
LoadMiss	Tracked,None	6.243 μs	0.6940 μs	1.980 μs	5.900 μs	688 B
Digest	Tracked,None	13.654 μs	0.6362 μs	1.856 μs	13.700 μs	64 B

LuaScripts

Giving up ~32% in the worst case (comparing baseline to Native,None, Script4).

main

Method	Params	Mean	Error	StdDev	Gen0	Allocated
Script1	None	109.3 ns	1.09 ns	1.02 ns	-	-
Script2	None	174.6 ns	1.55 ns	1.38 ns	0.0002	24 B
Script3	None	248.1 ns	1.52 ns	1.35 ns	0.0005	32 B
Script4	None	228.1 ns	2.97 ns	2.78 ns	-	-

luaMemoryLimits

Method	Params	Mean	Error	StdDev	Gen0	Allocated
Script1	Managed,Limit	173.4 ns	0.61 ns	0.51 ns	-	-
Script2	Managed,Limit	216.8 ns	0.94 ns	0.88 ns	0.0002	24 B
Script3	Managed,Limit	300.3 ns	1.38 ns	1.29 ns	0.0005	32 B
Script4	Managed,Limit	289.7 ns	5.54 ns	5.18 ns	-	-
Script1	Managed,None	149.1 ns	1.52 ns	1.42 ns	-	-
Script2	Managed,None	215.5 ns	1.13 ns	1.05 ns	0.0002	24 B
Script3	Managed,None	296.5 ns	1.45 ns	1.14 ns	0.0005	32 B
Script4	Managed,None	285.1 ns	5.48 ns	5.13 ns	-	-
Script1	Native,None	150.0 ns	0.88 ns	0.82 ns	-	-
Script2	Native,None	215.5 ns	1.00 ns	0.89 ns	0.0002	24 B
Script3	Native,None	298.7 ns	2.31 ns	2.05 ns	0.0005	32 B
Script4	Native,None	300.5 ns	3.51 ns	3.29 ns	-	-
Script1	Tracked,Limit	149.9 ns	2.69 ns	2.52 ns	-	-
Script2	Tracked,Limit	222.9 ns	4.27 ns	7.13 ns	0.0002	24 B
Script3	Tracked,Limit	303.6 ns	4.76 ns	4.45 ns	0.0005	32 B
Script4	Tracked,Limit	284.5 ns	1.49 ns	1.32 ns	-	-
Script1	Tracked,None	148.1 ns	1.96 ns	1.74 ns	-	-
Script2	Tracked,None	214.6 ns	0.86 ns	0.76 ns	0.0002	24 B
Script3	Tracked,None	301.3 ns	2.02 ns	1.79 ns	0.0005	32 B
Script4	Tracked,None	284.2 ns	2.11 ns	1.76 ns	-	-
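As a check on the ~32% figure, the Script4 Mean values from the two tables above (228.1 ns on main, 300.5 ns for Native,None) give:

```math
\frac{300.5 - 228.1}{228.1} = \frac{72.4}{228.1} \approx 0.317 \approx 32\%
```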

ScriptOperations

This is more of a mixed bag: LargeScript is improved somewhat (~6%), while very basic evaluations like Eval and EvalSha are a bit slower. The loss is probably due to the pcall, and the gains are probably peanut butter improvements in calls to and from Lua from .NET.

main (eliding Params != None)

Method	Params	Mean	Error	StdDev	Allocated
ScriptLoad	None	80.452 μs	0.4009 μs	0.3554 μs	9600 B
ScriptExistsTrue	None	18.095 μs	0.2135 μs	0.1893 μs	-
ScriptExistsFalse	None	17.289 μs	0.0655 μs	0.0547 μs	-
Eval	None	58.513 μs	0.2955 μs	0.2468 μs	-
EvalSha	None	24.331 μs	0.4261 μs	0.3986 μs	-
SmallScript	None	61.024 μs	0.2889 μs	0.2702 μs	-
LargeScript	None	4,297.098 μs	49.7821 μs	46.5662 μs	4 B
ArrayReturn	None	110.093 μs	0.7220 μs	0.6754 μs	-

luaMemoryLimits

Method	Params	Mean	Error	StdDev	Median	Gen0	Gen1	Gen2	Allocated
ScriptLoad	Managed,Limit	85.92 μs	0.912 μs	0.853 μs	85.77 μs	-	-	-	9600 B
ScriptExistsTrue	Managed,Limit	18.13 μs	0.290 μs	0.272 μs	18.28 μs	-	-	-	-
ScriptExistsFalse	Managed,Limit	16.87 μs	0.082 μs	0.073 μs	16.86 μs	-	-	-	-
Eval	Managed,Limit	71.51 μs	0.866 μs	0.810 μs	71.33 μs	-	-	-	-
EvalSha	Managed,Limit	31.83 μs	0.531 μs	0.497 μs	31.63 μs	-	-	-	-
SmallScript	Managed,Limit	56.39 μs	0.265 μs	0.248 μs	56.34 μs	-	-	-	-
LargeScript	Managed,Limit	4,964.14 μs	99.011 μs	125.218 μs	4,943.73 μs	-	-	-	8 B
ArrayReturn	Managed,Limit	155.47 μs	10.902 μs	32.144 μs	146.56 μs	-	-	-	-
ScriptLoad	Managed,None	87.75 μs	0.617 μs	0.547 μs	87.82 μs	-	-	-	9600 B
ScriptExistsTrue	Managed,None	18.34 μs	0.241 μs	0.226 μs	18.40 μs	-	-	-	-
ScriptExistsFalse	Managed,None	17.22 μs	0.114 μs	0.101 μs	17.19 μs	-	-	-	-
Eval	Managed,None	69.74 μs	1.018 μs	0.952 μs	69.44 μs	-	-	-	-
EvalSha	Managed,None	35.71 μs	0.702 μs	0.721 μs	35.69 μs	-	-	-	-
SmallScript	Managed,None	59.14 μs	0.189 μs	0.157 μs	59.14 μs	-	-	-	-
LargeScript	Managed,None	5,035.96 μs	78.300 μs	73.242 μs	5,022.85 μs	-	-	-	8 B
ArrayReturn	Managed,None	163.17 μs	10.815 μs	31.888 μs	155.71 μs	-	-	-	-
ScriptLoad	Native,None	83.49 μs	1.469 μs	1.374 μs	83.17 μs	-	-	-	9600 B
ScriptExistsTrue	Native,None	17.62 μs	0.085 μs	0.071 μs	17.61 μs	-	-	-	-
ScriptExistsFalse	Native,None	17.06 μs	0.080 μs	0.067 μs	17.04 μs	-	-	-	-
Eval	Native,None	69.78 μs	0.482 μs	0.427 μs	69.74 μs	-	-	-	-
EvalSha	Native,None	29.04 μs	0.242 μs	0.215 μs	29.09 μs	-	-	-	-
SmallScript	Native,None	57.33 μs	1.056 μs	0.987 μs	57.88 μs	-	-	-	-
LargeScript	Native,None	4,028.48 μs	21.690 μs	16.934 μs	4,032.39 μs	-	-	-	8 B
ArrayReturn	Native,None	122.22 μs	1.979 μs	1.851 μs	121.50 μs	-	-	-	-
ScriptLoad	Tracked,Limit	84.09 μs	0.621 μs	0.581 μs	84.02 μs	-	-	-	9600 B
ScriptExistsTrue	Tracked,Limit	18.25 μs	0.115 μs	0.102 μs	18.25 μs	-	-	-	-
ScriptExistsFalse	Tracked,Limit	18.01 μs	0.349 μs	0.327 μs	17.93 μs	-	-	-	-
Eval	Tracked,Limit	69.00 μs	0.484 μs	0.453 μs	69.08 μs	-	-	-	-
EvalSha	Tracked,Limit	28.56 μs	0.444 μs	0.416 μs	28.52 μs	-	-	-	-
SmallScript	Tracked,Limit	58.14 μs	0.663 μs	0.587 μs	58.26 μs	-	-	-	-
LargeScript	Tracked,Limit	5,133.20 μs	61.766 μs	54.754 μs	5,120.47 μs	15.6250	15.6250	15.6250	23 B
ArrayReturn	Tracked,Limit	164.22 μs	1.495 μs	1.398 μs	164.29 μs	-	-	-	-
ScriptLoad	Tracked,None	83.47 μs	0.620 μs	0.549 μs	83.54 μs	-	-	-	9600 B
ScriptExistsTrue	Tracked,None	18.28 μs	0.150 μs	0.133 μs	18.28 μs	-	-	-	-
ScriptExistsFalse	Tracked,None	17.92 μs	0.157 μs	0.147 μs	17.88 μs	-	-	-	-
Eval	Tracked,None	69.06 μs	0.665 μs	0.622 μs	69.05 μs	-	-	-	-
EvalSha	Tracked,None	32.11 μs	0.432 μs	0.383 μs	32.01 μs	-	-	-	-
SmallScript	Tracked,None	57.22 μs	0.935 μs	0.874 μs	57.11 μs	-	-	-	-
LargeScript	Tracked,None	4,984.96 μs	29.063 μs	24.269 μs	4,977.49 μs	15.6250	15.6250	15.6250	25 B
ArrayReturn	Tracked,None	146.78 μs	1.429 μs	1.267 μs	146.98 μs	-	-	-	-


badrishc · 2025-01-08T23:02:25Z

libs/server/Lua/LuaRunner.cs

+                //
+                // Since we're managing the start/end pointers outside of the buffer
+                // we need to signal that the buffer has data to copy
+                bufferManager.GrowBuffer(alwayCopy: true);


This seems to copy the entire old buffer to the new buffer, regardless of where the manually managed start and end pointers were. It might be more efficient to tell the GrowBuffer what segment it needs to copy over?

badrishc · 2025-01-09T00:08:12Z

LuaScripts BDN - Giving up ~32% in the worst case

This would be the most concerning for the PR. What is causing this drop, and if it is the trampoline, then is there a way to enable an unsafe mode that avoids this overhead?

badrishc · 2025-01-09T00:48:37Z

libs/server/Lua/LuaTrackedAllocator.cs

+    /// </summary>
+    internal unsafe class LuaTrackedAllocator : ILuaAllocator
+    {
+        // TODO: XXXMemoryPressure here might be unnecessary


nit: this comment still relevant?

badrishc · 2025-01-09T00:56:44Z

libs/server/Lua/SessionScriptCache.cs

        public SessionScriptCache(StoreWrapper storeWrapper, IGarnetAuthenticator authenticator, ILogger logger = null)
        {
            this.storeWrapper = storeWrapper;
            this.logger = logger;

            scratchBufferNetworkSender = new ScratchBufferNetworkSender();
            processor = new RespServerSession(0, scratchBufferNetworkSender, storeWrapper, null, authenticator, false);
+
+            // There's some parsing involved in these, so save them off per-session
+            memoryManagementMode = storeWrapper.serverOptions.LuaOptions.MemoryManagementMode;


it seems these lines are causing BDN for BasicOperations, ObjectOperations, HashObjectOperations to fail as something (perhaps storeWrapper) is null here:

System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
 ---> System.NullReferenceException: Object reference not set to an instance of an object.
   at Garnet.server.SessionScriptCache..ctor(StoreWrapper storeWrapper, IGarnetAuthenticator authenticator, ILogger logger) in /_/libs/server/Lua/SessionScriptCache.cs:line 41
   at Garnet.server.RespServerSession..ctor(Int64 id, INetworkSender networkSender, StoreWrapper storeWrapper, SubscribeBroker`3 subscribeBroker, IGarnetAuthenticator authenticator, Boolean enableScripts) in /_/libs/server/Resp/RespServerSession.cs:line 221
   at Embedded.server.EmbeddedRespServer.GetRespSession() in /_/benchmark/BDN.benchmark/Embedded/EmbeddedRespServer.cs:line 41
   at BDN.benchmark.Operations.OperationsBase.GlobalSetup() in /_/benchmark/BDN.benchmark/Operations/OperationsBase.cs:line 80
   at BDN.benchmark.Operations.BasicOperations.GlobalSetup() in /_/benchmark/BDN.benchmark/Operations/BasicOperations.cs:line 20
   at BenchmarkDotNet.Engines.EngineFactory.CreateReadyToRun(EngineParameters engineParameters)
   at BenchmarkDotNet.Autogenerated.Runnable_0.Run(IHost host, String benchmarkName) in /_/benchmark/BDN.benchmark/bin/Release/net8.0/cb61c2e4-da46-43ab-8a17-882e6ff8a654/cb61c2e4-da46-43ab-8a17-882e6ff8a654.notcs:line 177
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(Object obj, Span`1 copyOfArgs, BindingFlags invokeAttr)
   --- End of inner exception stack trace ---
   at System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(Object obj, Span`1 copyOfArgs, BindingFlags invokeAttr)
   at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at BenchmarkDotNet.Autogenerated.UniqueProgramName.AfterAssemblyLoadingAttached(String[] args) in /_/benchmark/BDN.benchmark/bin/Release/net8.0/cb61c2e4-da46-43ab-8a17-882e6ff8a654/cb61c2e4-da46-43ab-8a17-882e6ff8a654.notcs:line 57

Example action run: https://github.com/microsoft/garnet/actions/runs/12681127499/job/35344227191

kevin-montrose added 13 commits January 7, 2025 14:01

Lua allocations go through .NET; really crummy allocator on the POH j…

f894781

…ust to prove it's possible; squashing to clean up _a lot_ of experimentation commits

punch a LuaOptions into settings

d65adc6

wire up the Lua options

3163064

prep for other allocators

4b2c8d9

Implement more allocators and expand testing

e27a86a

Knock out a number of todos, consider allocator in benchmarks, additi…

4e23ecd

…onal validation

add a test for OOMs

a84bc58

formatting

83d2ae5

convert ScriptOperations to explore different allocators

beb035a

cleanup Lua error messgages; this revealed a bug in buffer management…

5f50429

… for LuaScripts, fixes that

Make managed allocator less naive, and benchmark on par.

a112acb

It will probably be unusual to use this allocator, but it shouldn't be _bad_ either.

formatting

a2996e9

Merge branch 'main' into luaMemoryLimits

c5fb7f0

kevin-montrose marked this pull request as ready for review January 8, 2025 15:12

kevin-montrose requested review from badrishc and vazois January 8, 2025 15:57
