Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Optimization: std.sort should only evaluate keyF once per array eleme…
…nt (#245) This PR improves the performance of `std.sort` and related functions when `keyF` is used. The existing implementation evaluates `keyF` multiple times per input element because: - Prior to sorting, it evaluates `keyF` on all elements to check whether all keys are of the same type. - During sorting, it re-evaluates `keyF` on every pair of compared elements. In the best case (an already-sorted array), this performs ~3x more evaluations than needed because each element participates in up to two extra unnecessary comparisons. In the worst case, we have to do additional comparisons during sorting and the unnecessary work will be even higher. ### The fix The fix: - Precompute all keys up front (which we already do for type-checking purposes). - Sort an array of indices using a comparator which fetches their corresponding precomputed keys - Use the sorted indices to project out the array values in the correct order I also made a few other small improvements: - Explicitly error out when trying to sort arrays of booleans: neither jsonnet nor go-jsonnet supports this. The existing sjsonnet code didn't either, but failed with a confusing `"Cannot sort with key values that are not all the same type"` error because `Val.True` and `Val.False` are different classes. The existing code which did class equality checks on `Val.Bool` would never match because `Val.Bool` is an abstract class. - Avoid allocating a `keyTypes` set: we can simply check that all other elements match the first type's element. This saves some garbage allocations in the common case. - Move `.force` calls earlier so that we don't have to call them in the sort comparator. ### Benchmarking results Consider the following toy benchmark case: ```jsonnet local largeArr = [ { complexKey: { a: i, b: i+1, c: i+2 }, value: "val" + i } for i in std.range(0, 9999) ]; local sortedArr = std.sort( largeArr, keyF=function(x) std.toString(x.complexKey) ); { sortedArrSample: sortedArr[0:5] } ``` With the `RunProfiler` we can see an enormous difference in the number of `std.toString` invocations via the key function: with a standard 5 benchmark runs, we expect to see only 50,000 hits but the old code ran it `241,210` times! I also measured performance on one of our real-world jsonnet bundles, where this PR's optimization cut one expensive target's runtime by 25%.
- Loading branch information