Extra memory consumed by embedded sds in new HashObject fields #1567

Open · ranshid opened this issue Jan 15, 2025 · 0 comments

ranshid (Member) commented Jan 15, 2025

Following #1502, which introduced hashtable instead of dict in hash objects, we started embedding the field sds in the hashTableEntry.
The problem is that the field sds might arrive from different sources:

  1. command arguments - in the usual case the sds is provided from the parsed command arguments. In such cases the sds comes from a string object, which means it will always have a header of at least 3 bytes (sds8; see the header layout sketch after this list).
  2. listpack conversion - when we convert from listpack to hashtable, the listpack is scanned and the field sds is created from the listpack string via sdsnewsize, which uses the minimal header size (i.e. small strings use sds5, whose header is 1 byte).
  3. Modules - for example, VM_HashSet will create a RAW string object whose sds is allocated with a minimal-size header (i.e. sds5, with its 1-byte header, for small strings).
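
For reference, a minimal sketch of the two sds header layouts involved (simplified from sds.h; only the header sizes matter for this issue):

```c
/* Simplified sketch of the sds5 and sds8 headers from sds.h. */
#include <stdint.h>
#include <stdio.h>

struct __attribute__((__packed__)) sdshdr5 {
    unsigned char flags; /* 3 lsb: type, 5 msb: string length */
    char buf[];
};

struct __attribute__((__packed__)) sdshdr8 {
    uint8_t len;         /* used length */
    uint8_t alloc;       /* allocated size, excluding header and null term */
    unsigned char flags; /* 3 lsb: type, 5 msb unused */
    char buf[];
};

int main(void) {
    /* A short field stored with an sds8 header pays 2 extra bytes
     * compared to the sds5 header it could have used. */
    printf("sds5 header: %zu byte(s)\n", sizeof(struct sdshdr5)); /* 1 */
    printf("sds8 header: %zu byte(s)\n", sizeof(struct sdshdr8)); /* 3 */
    return 0;
}
```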

When we create the hashtable entry in hashTypeCreateEntry, we embed the field sds according to the provided sds representation, so when the field originated as a parsed command argument it uses an extra 2 bytes for its header.
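
To illustrate the embedding behavior, here is a hypothetical sketch (entrySketch and entrySketchCreate are made-up names, not the actual hashTypeCreateEntry code; sdsAllocSize and sdsAllocPtr are the real sds.h helpers) of an entry that embeds the field sds verbatim, header included:

```c
/* Hypothetical sketch: the entry embeds the caller's field sds as-is,
 * so it inherits whatever header the sds happens to carry. */
#include <string.h>
#include "sds.h"
#include "zmalloc.h"

typedef struct entrySketch {
    sds value;                  /* value pointer (layout illustrative only) */
    unsigned char field_data[]; /* embedded field: sds header + string + '\0' */
} entrySketch;

entrySketch *entrySketchCreate(sds field, sds value) {
    /* sdsAllocSize() covers the header, the buffer and the null terminator,
     * so an sds8 field from a command argument keeps its 3-byte header even
     * when a 1-byte sds5 header would have been enough. */
    size_t field_size = sdsAllocSize(field);
    entrySketch *entry = zmalloc(sizeof(*entry) + field_size);
    entry->value = value;
    memcpy(entry->field_data, sdsAllocPtr(field), field_size);
    return entry;
}
```

One possible direction would be to re-encode the field with the minimal header (e.g. via sdsnewlen on the field's contents) before embedding, at the cost of an extra copy on the command path.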

While there would probably NOT be any degradation in overall memory utilization (since the new hashtable is more memory efficient), it might cause strange results following listpack conversions.
For example:
say hash1 is created while the hash_max_listpack_entries config is 0 and 10 small fields are added,
and hash2 is created while the hash_max_listpack_entries config is 9 and 10 small fields are added.

After all 10 elements were added, both hashes would be expected to show the same memory consumption, but hash1 would show as using an extra 18 bytes of memory: hash2's first 9 fields were re-created with 1-byte sds5 headers during the listpack conversion, while all of hash1's fields carry 3-byte sds8 headers (9 fields × 2 extra bytes = 18 bytes).
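
A quick way to observe this (a reproduction sketch; the exact MEMORY USAGE numbers depend on the allocator and build, only the difference between the two hashes matters):

```
valkey-cli CONFIG SET hash-max-listpack-entries 0
# hash1: every field arrives as a parsed command argument (sds8 header)
for i in $(seq 1 10); do valkey-cli HSET hash1 f$i v; done

valkey-cli CONFIG SET hash-max-listpack-entries 9
# hash2: the 10th HSET triggers the listpack conversion, re-creating the
# earlier fields with minimal (sds5) headers
for i in $(seq 1 10); do valkey-cli HSET hash2 f$i v; done

valkey-cli MEMORY USAGE hash1   # expected ~18 bytes more than hash2
valkey-cli MEMORY USAGE hash2
```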

NOTE - I do think the issue is minor and will probably be addressed during the work on #1551 and/or #640, so I mainly opened it in order to have better tracking of the issue.
