Metadata Manager
Metadata is stored in a distributed hash map: each Hermes Daemon initializes a hipc::unordered_map. The main metadata structures are the Tag Map (Buckets are represented as Tags), the Blob Map, and the Trait Map. These maps typically map an integer ID to an information structure. For example, the Blob Map maps a BlobId (a 96-bit integer) to a BlobInfo struct. Separate maps translate semantic strings to integer IDs. For example, one map goes from a hipc::string to a BlobId.
At this time, metadata is not replicated across nodes, and we assume that it never grows large enough to exceed main memory.
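The two kinds of maps described above can be sketched in a single process. This is a minimal illustration, not the Hermes implementation: std::unordered_map and std::string stand in for hipc::unordered_map and hipc::string, and the BlobMaps class, its Register and Find methods, the BlobIdHash functor, and the fields of BlobInfo are all hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical 96-bit ID: a 32-bit node ID plus a 64-bit unique number.
struct BlobId {
  std::uint32_t node_id;
  std::uint64_t unique;
  bool operator==(const BlobId &o) const {
    return node_id == o.node_id && unique == o.unique;
  }
};

// Hash functor so BlobId can be used as an unordered_map key.
struct BlobIdHash {
  std::size_t operator()(const BlobId &id) const {
    std::size_t h1 = std::hash<std::uint32_t>{}(id.node_id);
    std::size_t h2 = std::hash<std::uint64_t>{}(id.unique);
    return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
  }
};

// Assumed contents of the information structure; the real BlobInfo differs.
struct BlobInfo {
  std::string name;
  std::size_t size;
};

class BlobMaps {
 public:
  // Record the blob in both maps: name -> ID and ID -> info.
  void Register(const BlobId &id, const BlobInfo &info) {
    assert(!info.name.empty());
    name_to_id_[info.name] = id;
    id_to_info_[id] = info;
  }

  // Resolve a semantic name to its ID, then the ID to its info struct.
  std::optional<BlobInfo> Find(const std::string &name) const {
    auto id_it = name_to_id_.find(name);
    if (id_it == name_to_id_.end()) return std::nullopt;
    auto info_it = id_to_info_.find(id_it->second);
    if (info_it == id_to_info_.end()) return std::nullopt;
    return info_it->second;
  }

 private:
  std::unordered_map<std::string, BlobId> name_to_id_;
  std::unordered_map<BlobId, BlobInfo, BlobIdHash> id_to_info_;
};
```

A lookup by name therefore costs two map accesses: one to translate the string to an ID and one to fetch the information structure.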
Metadata (e.g., Blobs and Tags) can be given semantic names using hipc::strings or std::strings. hipc::string is what is eventually stored in Hermes, since it's compatible with shared memory.
User primitives are referred to by unsigned 96-bit integers (IDs).
Each ID encodes the data it needs to access its metadata.
TagIds, BlobIds, and TraitIds are all instances of a UniqueId. A UniqueId has the following fields:
- Node ID: The identifier of the node the metadata is on (32-bit)
- Unique: The unique number of the metadata object (64-bit)
The unique field is a 64-bit integer that is atomically incremented every time the program creates a new metadata object. 64 bits is large enough that the program should never exhaust all 2^64 values.
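The ID layout and the atomic counter can be sketched as follows. This is an assumption-laden illustration rather than the actual Hermes definition: the field names, the IdAllocator class, and its Next method are hypothetical, and the real UniqueId may be laid out differently.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <limits>

// Sketch of a 96-bit ID: 32-bit node ID plus 64-bit unique number.
struct UniqueId {
  std::uint32_t node_id;  // node the metadata lives on
  std::uint64_t unique;   // per-node unique number of the metadata object
  bool operator==(const UniqueId &o) const {
    return node_id == o.node_id && unique == o.unique;
  }
};

// Hypothetical allocator: hands out IDs for one node by atomically
// incrementing a 64-bit counter.
class IdAllocator {
 public:
  explicit IdAllocator(std::uint32_t node_id) : node_id_(node_id), next_(0) {}

  UniqueId Next() {
    // fetch_add returns the value held before the increment, so
    // concurrent callers never receive the same number.
    std::uint64_t value = next_.fetch_add(1, std::memory_order_relaxed);
    assert(value != std::numeric_limits<std::uint64_t>::max() &&
           "unique counter exhausted");
    return UniqueId{node_id_, value};
  }

 private:
  std::uint32_t node_id_;
  std::atomic<std::uint64_t> next_;
};
```

Because the node ID is part of the ID, two nodes can hand out the same unique number without the resulting IDs colliding.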
All metadata is distributed among nodes by first hashing the key to determine the node, then hashing again to determine the slot. This approach has the following tradeoffs:
- Better load balancing
- May require extra RPC calls. Initial tests show that this indirection should be avoided. TODO: We need to revisit this.
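The two-step placement (hash to pick the node, hash again to pick the slot) might look like the sketch below. The Placement struct, the Locate and Mix functions, and the num_nodes and slots_per_node parameters are hypothetical. std::hash is used only for illustration: its output is implementation-defined, so a real deployment would need a hash function that every node computes identically.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

// Where a key lands: which node, and which slot in that node's map.
struct Placement {
  std::uint32_t node;
  std::size_t slot;
};

// Second-level hash: the splitmix64 finalizer, used here so the slot
// is not simply a function of the same bits that chose the node.
inline std::uint64_t Mix(std::uint64_t x) {
  x ^= x >> 30;
  x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 27;
  x *= 0x94d049bb133111ebULL;
  x ^= x >> 31;
  return x;
}

inline Placement Locate(const std::string &key, std::uint32_t num_nodes,
                        std::size_t slots_per_node) {
  assert(num_nodes > 0 && slots_per_node > 0);
  // First hash: choose the node that owns this key.
  std::uint64_t h = std::hash<std::string>{}(key);
  Placement p;
  p.node = static_cast<std::uint32_t>(h % num_nodes);
  // Second hash: choose the slot on that node.
  p.slot = static_cast<std::size_t>(Mix(h) % slots_per_node);
  return p;
}
```

If the computed node is not the local one, the lookup has to go over the network, which is the extra RPC cost noted in the list above.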
To create a blob:
1. Create a new BlobID. The ID's node index (top 32 bits) is created by hashing the blob name, and the ID's offset to a list of BufferIDs (bottom 32 bits) is allocated from the MDM shared memory segment on the target node.
2. Add the new BlobID to the IdMap. This could be local, or an RPC.
3. Add the BlobID to the Bucket's list of blobs.

To read a blob:
1. Hash the blob name to get the BlobID.
2. Get the list of BufferIDs from the BlobID.
3. Read each BufferID's data into a user buffer.
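The two sequences of steps above can be walked through with a toy, single-process model. Everything here is an assumption made for illustration: the ToyMdm class, its PutBlob and GetBlob methods, the plain 64-bit integer IDs, and the use of std::unordered_map in place of shared memory and RPCs. It does not reproduce the bit layout described in step 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

using BlobID = std::uint64_t;    // stand-in for the real ID type
using BufferID = std::uint64_t;  // stand-in for the real ID type

class ToyMdm {
 public:
  // Create path: allocate a BlobID, split the data into buffers,
  // record the ID in the IdMap, and add it to the bucket's blob list.
  BlobID PutBlob(const std::string &bucket, const std::string &name,
                 const std::string &data, std::size_t buffer_size) {
    assert(buffer_size > 0);
    BlobID blob = next_blob_++;
    std::vector<BufferID> &ids = blob_buffers_[blob];
    for (std::size_t off = 0; off < data.size(); off += buffer_size) {
      BufferID buf = next_buffer_++;
      buffers_[buf] = data.substr(off, buffer_size);
      ids.push_back(buf);
    }
    id_map_[name] = blob;              // in Hermes: local or via RPC
    buckets_[bucket].push_back(blob);  // bucket's list of blobs
    return blob;
  }

  // Read path: name -> BlobID -> list of BufferIDs -> user buffer.
  bool GetBlob(const std::string &name, std::string *out) const {
    auto it = id_map_.find(name);  // the map hashes the name internally
    if (it == id_map_.end()) return false;
    out->clear();
    for (BufferID buf : blob_buffers_.at(it->second)) {
      out->append(buffers_.at(buf));
    }
    return true;
  }

 private:
  BlobID next_blob_ = 0;
  BufferID next_buffer_ = 0;
  std::unordered_map<std::string, BlobID> id_map_;
  std::unordered_map<BlobID, std::vector<BufferID>> blob_buffers_;
  std::unordered_map<BufferID, std::string> buffers_;
  std::unordered_map<std::string, std::vector<BlobID>> buckets_;
};
```

Calling PutBlob("bkt", "blob", "hello world", 4) stores the data in buffers of at most four bytes, and a later GetBlob("blob", &out) reassembles them in order into out.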
There can be a total of 2^64 unique metadata objects, i.e., a total of 2^64 Tags, Buckets, and Traits.