Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow objects with memoized values to be marshaled/unmarshaled across Ruby processes #51

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

obrie
Copy link

@obrie obrie commented Dec 16, 2024

Hi there! This pull request makes 2 main changes:

  1. It improves the performance by 50% of calls to methods that accept arguments, but didn't receive any
  2. It allows objects that were marshaled with memoized values to be unmarshaled in another Ruby process

I'm happy to break this apart into 2 separate PRs, but the performance improvement is mostly a side effect of other changes.

Performance improvements

The performance improvement here is mostly straightforward. Consider the current implementation for generating cache keys:

cache_key = original_arity.zero? ? method_key : [method_key, *args].hash

In this code, we will only use method_key if the arity is zero. However, if we have a method like so:

def value(*args)
  args
end

...then the arity is non-zero and we'll be required to generate a hash for the cache key. This isn't necessary because a call with no arguments should be treated identical to a call to a method that accepts no arguments. By making this change, we can improve performance by ~50%. Benchmarks below:

Benchmark before
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [x86_64-linux]
Warming up --------------------------------------
        test_no_args   302.263k i/100ms
Calculating -------------------------------------
        test_no_args      3.049M (± 0.3%) i/s  (328.01 ns/i) -     15.415M in   5.056509s
Calculating -------------------------------------
        test_no_args     4.000k memsize (     0.000  retained)
                       100.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [x86_64-linux]
Warming up --------------------------------------
     test_empty_args   212.775k i/100ms
Calculating -------------------------------------
     test_empty_args      2.132M (± 0.3%) i/s  (469.04 ns/i) -     10.852M in   5.089886s
Calculating -------------------------------------
     test_empty_args    12.000k memsize (    80.000  retained)
                       300.000  objects (     2.000  retained)
                         0.000  strings (     0.000  retained)
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [x86_64-linux]
Warming up --------------------------------------
      test_with_args   180.110k i/100ms
Calculating -------------------------------------
      test_with_args      1.800M (± 0.3%) i/s  (555.52 ns/i) -      9.006M in   5.002773s
Calculating -------------------------------------
      test_with_args    12.000k memsize (    80.000  retained)
                       300.000  objects (     2.000  retained)
                         0.000  strings (     0.000  retained)
Benchmark after
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [x86_64-linux]
Warming up --------------------------------------
        test_no_args   332.647k i/100ms
Calculating -------------------------------------
        test_no_args      3.296M (± 3.7%) i/s  (303.42 ns/i) -     16.632M in   5.054779s
Calculating -------------------------------------
        test_no_args     4.000k memsize (     0.000  retained)
                       100.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [x86_64-linux]
Warming up --------------------------------------
     test_empty_args   327.994k i/100ms
Calculating -------------------------------------
     test_empty_args      3.273M (± 0.7%) i/s  (305.54 ns/i) -     16.400M in   5.011001s
Calculating -------------------------------------
     test_empty_args     4.000k memsize (     0.000  retained)
                       100.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [x86_64-linux]
Warming up --------------------------------------
      test_with_args   185.707k i/100ms
Calculating -------------------------------------
      test_with_args      1.854M (± 0.5%) i/s  (539.31 ns/i) -      9.285M in   5.007838s
Calculating -------------------------------------
      test_with_args    12.000k memsize (    80.000  retained)
                       300.000  objects (     2.000  retained)
                         0.000  strings (     0.000  retained)

Marshal improvements

In our application, we store ActiveRecord objects in Memcached, including the state for memoized fields. When we moved to memery, we found that these memoized fields weren't being read from correctly when unmarshaling them from Memcached. There are currently 2 pieces of code that cause issues:

method_key = "#{method_name}_#{object_id}"

....and:

cache_key = original_arity.zero? ? method_key : [method_key, *args].hash

In these 2 pieces of code, the cache keys generated for a method call can be different across Ruby processes:

In this PR, I'm proposing that we solve this by:

  • Exclude the object_id of the prepended module from the cache key. I can't figure out what the use case is for this, so I'm just removing it altogether
  • Make the hashing of arguments optional

By making these changes, I can do the following:

class Foo
  include Memery

  self.memery_use_hashed_arguments = false

  def base_find(char)
    ("a".."k").find { |letter| letter == char }
  end

  memoize def find_new(char)
    puts "finding #{char}"
    base_find(char)
  end
end

irb> f = Foo.new
=> #<Foo:0x000056102107e360>
irb> f.find_new('a')
finding a
=> "a"
irb> f.find_new('a')
=> "a"
irb> Marshal.dump(f)
=> "\x04\bo:\bFoo\x06:\x1D@_memery_memoized_values{\x06[\a:\rfind_newI\"\x06a\x06:\x06ETS:3Memery::ClassMethods::MemoizationModule::Cache\a:\vresultI\"\x06a\x06;\bT:\ttimef\x1413717.190731299"

...then load that marshaled value into a separate Ruby process and retain the memoized values:

irb> f = Marshal.load("\x04\bo:\bFoo\x06:\x1D@_memery_memoized_values{\x06[\a:\rfind_newI\"\x06a\x06:\x06ETS:3Memery::ClassMethods::MemoizationModule::Cache\a:\vresultI\"\x06a\x06;\bT:\ttimef\x1413717.190731299")
=> #<Foo:0x0000563ebfb517c0 @_memery_memoized_values={[:find_new, "a"]=>#<struct Memery::ClassMethods::MemoizationModule::Cache result="a", time=13717.190731299>}>
irb> f.find_new('a')
=> "a"

Notes

  • The last commit fixes a Rubocop failures on number of lines allow for a method definition. Seems less readable to me, but feel free to make adjustments to the code style as needed!

Hope all of that makes sense! Let me know what you think and whether you'd like me to split this out into multiple PRs. Thanks!

This allows marshaled objects to retain their memoized values when being unmarshaled
in a separate Ruby process where the memery module object_id may not be the same
…ven if the method's arity is non-zero)

This has several positive impacts:
* You don't encounter any performance overhead associated with allocating arrays and generating hashes
* The cache keys are equal across multiple Ruby processes when not providing arguments
This allows marshaled objects to retain their memoized values when being unmarshaled
in a separate Ruby process where the object hashes may be different
@obrie obrie force-pushed the feature/marshal-compat branch from fd9d7ef to 88c879d Compare December 16, 2024 18:11
lib/memery.rb Outdated Show resolved Hide resolved
benchmark.rb Outdated Show resolved Hide resolved
lib/memery.rb Outdated Show resolved Hide resolved
@obrie
Copy link
Author

obrie commented Jan 10, 2025

@tycooon I've updated the PR based on the feedback. The main thing to address is the Rubocop failure -- let me know what you'd like me to do there.

@obrie
Copy link
Author

obrie commented Jan 13, 2025

I've addressed a few failures resulting from the changes in this PR:

  • Missing spec coverage when use_hashed_arguments is false
  • Overriding memoized methods in inheritance hierarchies (this is why object_id was originally included in the method_key -- I've since updated it to default to the class name and fall back to the object_id when needed)

The remaining CI failures are due to Coveralls builds (I think a CI config issue).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants