Generating unique IDs efficiently
The client creates IDs from a range of unique numbers that it gets from the server.
The HiLo algorithm is efficient because the client can automatically generate unique document IDs
without checking with the server or cluster each time a new document is created to ensure that the new ID is unique.
The client receives from the server a range of numbers reserved for its use.
Each time a session creates a new document, the client assigns it an ID based on the next number from that range.
For example, the first client or node to generate documents on a collection will reserve
the numbers 1-32. The next one will reserve numbers 33-64, and so on.
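
The range reservation described above can be sketched as follows. This is only an illustration: ReservedRange and its methods are hypothetical names, not part of any RavenDB client API, and the ranges 1-32 and 33-64 are hard-coded here instead of being fetched from a server.

```python
from dataclasses import dataclass


@dataclass
class ReservedRange:
    """A block of numbers reserved for one client (hypothetical type)."""
    low: int
    high: int  # inclusive upper bound

    def __post_init__(self) -> None:
        # The next number to hand out starts at the bottom of the range.
        self._next = self.low

    def exhausted(self) -> bool:
        return self._next > self.high

    def take(self) -> int:
        """Return the next unused number without contacting the server."""
        if self.exhausted():
            raise RuntimeError("range exhausted; a new range must be requested")
        value = self._next
        self._next += 1
        return value


# Assumed ranges, matching the example in the text.
first_client = ReservedRange(low=1, high=32)
second_client = ReservedRange(low=33, high=64)

first_ids = [first_client.take() for _ in range(3)]
second_ids = [second_client.take() for _ in range(3)]
```

Because the two ranges do not overlap, neither client needs to check with the server before using a number from its own range.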
The collection name and node-tag are added to the ID.
To further ensure that no two nodes generate a document with the same ID, the collection name and the node-tag are added to the ID.
This is an added measure so that if two nodes, B and C, are working with the same range of numbers,
the IDs generated will be orders/54-B and orders/54-C. This situation is rare because, as long as the nodes can communicate
when requesting a range of numbers, each will receive a different range.
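
As a rough illustration of how the collection name and node-tag keep IDs apart even when the number is the same, consider the sketch below. The helper format_document_id is a hypothetical name introduced for this example, and the exact formatting rules of a real client may differ.

```python
def format_document_id(collection_prefix: str, number: int, node_tag: str) -> str:
    """Combine collection prefix, HiLo number, and node-tag into one ID (illustrative only)."""
    return f"{collection_prefix}/{number}-{node_tag}"


# Two nodes that happen to use the same number still produce different IDs.
id_from_b = format_document_id("orders", 54, "B")
id_from_c = format_document_id("orders", 54, "C")
```

Here id_from_b and id_from_c differ only in the node-tag suffix, which is enough to avoid a collision.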
The node-tag is added to ensure unique IDs across the cluster.
Thus, with minimal trips to the server, the client is able to determine to which collection an entity belongs,
and automatically assign it a number with a node-tag to ensure that the ID is unique across the cluster.
HiLo documents are used by the server to provide the next range of numbers
For multiple clients to generate identifiers simultaneously, they need a mechanism to avoid duplicates.
This mechanism is provided by the Raven/HiLo/ documents, stored in the @hilo collection in the database.
These documents are modified by the server.
They have a very simple structure.
The Max property holds the highest number that the server has handed out to any client for creating identifiers in a given collection. It is used as follows:
- The client asks the server for a range of numbers that it can use to create documents.
  (32 is the initial capacity, but the actual range size is calculated based on how frequently the client requests HiLo ranges.)
- The server checks the HiLo document to find the last "Max" number it sent to any client for this collection, and reserves the next range starting after it.
- The client gets from the server the min and max values it can use (33-64 in our case).
- The client creates a range object from these values and uses it to generate identifiers.
- When the client reaches the max limit, it repeats the process.
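
The steps above can be sketched with an in-memory stand-in for the server. Everything here is an assumption made for illustration: HiLoServer, HiLoClient, and their methods are hypothetical names rather than RavenDB's API, the range size is fixed at 32 instead of being adjusted by request frequency, and a plain dictionary plays the role of the HiLo documents.

```python
RANGE_SIZE = 32  # initial capacity from the text; kept fixed in this sketch


class HiLoServer:
    """In-memory stand-in for the server side (hypothetical)."""

    def __init__(self) -> None:
        # One entry per collection, playing the role of that collection's HiLo document.
        self.hilo_docs = {}

    def next_range(self, collection: str) -> tuple:
        # Look up the last Max handed out, reserve the next block, and record the new Max.
        doc = self.hilo_docs.setdefault(collection, {"Max": 0})
        low = doc["Max"] + 1
        high = doc["Max"] + RANGE_SIZE
        doc["Max"] = high
        return low, high


class HiLoClient:
    """Client that requests a new range only when the current one runs out (hypothetical)."""

    def __init__(self, server: HiLoServer, collection: str) -> None:
        self._server = server
        self._collection = collection
        self._next = 1
        self._high = 0  # empty range, so the first call requests one

    def next_number(self) -> int:
        if self._next > self._high:
            self._next, self._high = self._server.next_range(self._collection)
        value = self._next
        self._next += 1
        return value


server = HiLoServer()
client_a = HiLoClient(server, "orders")
client_b = HiLoClient(server, "orders")

first_a = client_a.next_number()   # client A reserves 1-32
first_b = client_b.next_number()   # client B reserves 33-64
for _ in range(31):                # client A uses up the rest of 1-32
    client_a.next_number()
next_a = client_a.next_number()    # client A has to reserve a new range
```

In this sketch, each client contacts the server only once per 32 numbers, and the server-side Max value is the single piece of shared state that keeps the ranges from overlapping.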