HiLo Algorithm
- The HiLo algorithm is an efficient solution used by a session to generate the numeric parts of unique document identifiers. It is responsible for providing numeric values that are combined with collection names and node tags to create identifiers like orders/10-A or products/93-B.
- The HiLo algorithm is the default means of creating unique IDs in RavenDB. These IDs are generated when creating a new document without specifying an Id value. See a sample Autogenerated HiLo ID.
- For other means of generating unique IDs see Working with Document Identifiers.
How the HiLo Algorithm Works in RavenDB
Generating unique IDs efficiently
The client creates IDs from a range of unique numbers that it gets from the server.
The HiLo algorithm is efficient because the client can automatically generate unique document IDs
without checking with the server or cluster each time a new document is created to ensure that the new ID is unique.
The client receives from the server a range of numbers that are reserved for the client's usage.
Each time a session creates a new document the client assigns the new document an ID based on the next number from that range.
For example, the first client or node to generate documents on a collection will reserve
the numbers 1-32. The next one will reserve numbers 33-64, and so on.
The collection name and node-tag are added to the ID.
To further ensure that no two nodes generate a document with the same ID, the collection name and the node-tag are added to the ID.
This is an added measure so that if two nodes B and C are working with the same range of numbers,
the IDs generated will be orders/54-B and orders/54-C. This situation is rare because as long as the nodes can communicate
when requesting a range of numbers, they will receive a different range of numbers.
The node-tag is added to ensure unique IDs across the cluster.
Thus, with minimal trips to the server, the client is able to determine to which collection an entity belongs, and automatically assign it a number with a node-tag to ensure that the ID is unique across the cluster.
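As a rough illustration of how the three parts fit together, the sketch below joins a collection prefix, a HiLo number and a node tag. BuildId is a hypothetical helper written for this example, not part of the RavenDB client API, and the separators are assumed from the sample IDs shown above.

```csharp
using System;

public static class HiLoIdSketch
{
    // Hypothetical helper: joins a collection prefix, a HiLo number and a node tag.
    public static string BuildId(string collectionPrefix, long number, string nodeTag)
    {
        return $"{collectionPrefix}/{number}-{nodeTag}";
    }

    public static void Main()
    {
        // Two nodes that happen to hold the same number still produce distinct IDs.
        Console.WriteLine(BuildId("orders", 54, "B")); // orders/54-B
        Console.WriteLine(BuildId("orders", 54, "C")); // orders/54-C
    }
}
```

Even when the numeric part collides, the node tag keeps the two identifiers different.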
HiLo documents are used by the server to provide the next range of numbers
For multiple clients to generate identifiers simultaneously, a mechanism is needed to avoid duplicates.
This is ensured with HiLo documents, whose IDs begin with Raven/HiLo/.
These documents are modified by the server.
They have a very simple construction:
{
    "Max": 32,
    "@metadata": {
        "@collection": "@hilo"
    }
}
The Max property holds the largest number reserved so far by any client for creating identifiers in a given collection. It is used as follows:
- The client asks the server for a range of numbers that it can use to create a document. (32 is the initial range size, but the actual size is calculated based on how frequently the client requests HiLo ranges.)
- Then, the server checks the HiLo document to find the last Max value it sent to any client for this collection.
- The client will get the min and the max values it can use from the server (33 - 64 in our case).
- Then, the client generates a range object from the values it got from the server to generate identifiers.
- When the client reaches the max limit, it will repeat the process.
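The steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not RavenDB's implementation: FakeHiLoServer and FakeHiLoClient are hypothetical names, the range size is fixed at 32 (the real client adjusts it), and the Max value is kept in a dictionary rather than in a HiLo document.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the server side: one Max value per collection.
public class FakeHiLoServer
{
    private readonly Dictionary<string, long> _max = new Dictionary<string, long>();

    // Reserves the next block of numbers and returns its inclusive bounds.
    public (long Low, long High) NextRange(string collection, int size = 32)
    {
        _max.TryGetValue(collection, out long max); // 0 when nothing was reserved yet
        long low = max + 1;
        long high = max + size;
        _max[collection] = high;                    // remember the new Max
        return (low, high);
    }
}

// Hypothetical client-side generator that hands out numbers from its reserved range.
public class FakeHiLoClient
{
    private readonly FakeHiLoServer _server;
    private readonly string _collection;
    private long _current; // last number handed out
    private long _high;    // upper bound of the reserved range

    public int ServerCalls { get; private set; }

    public FakeHiLoClient(FakeHiLoServer server, string collection)
    {
        _server = server;
        _collection = collection;
    }

    public long NextNumber()
    {
        if (_current >= _high) // range exhausted, or never requested
        {
            var (low, high) = _server.NextRange(_collection);
            _current = low - 1;
            _high = high;
            ServerCalls++;
        }
        return ++_current;
    }
}

public static class HiLoRangeDemo
{
    public static void Main()
    {
        var server = new FakeHiLoServer();
        var first = new FakeHiLoClient(server, "orders");
        var second = new FakeHiLoClient(server, "orders");

        Console.WriteLine(first.NextNumber());  // 1  (first client reserved 1-32)
        Console.WriteLine(second.NextNumber()); // 33 (second client reserved 33-64)

        for (int i = 0; i < 31; i++) first.NextNumber(); // use up 2..32
        Console.WriteLine(first.NextNumber());  // 65 (first client reserved 65-96)
        Console.WriteLine(first.ServerCalls);   // 2
    }
}
```

In this sketch the first client produces 33 numbers with only two round trips to the simulated server, which is the saving the algorithm is designed for.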
Returning HiLo Ranges
When the document store is disposed, the client sends the server the last value it used to create an identifier and the max value that it previously received from the server.
If the max value on the server side is equal to the client's max value, and
the client's last used value is smaller than or equal to the server-side max,
the server will update the Max value to the client's last used value.
// Store setup (Urls, Database, Initialize()) is omitted for brevity
var store = new DocumentStore();
using (var session = store.OpenSession())
{
    // Storing an entity makes the client reserve a HiLo range (e.g. 1-32)
    session.Store(new Employee
    {
        FirstName = "John",
        LastName = "Doe"
    });
    session.SaveChanges();
}
// Release the range when it is no longer relevant
store.Dispose();
store.Dispose() is used in this example to demonstrate that the range is released.
In normal use, the store should only be disposed when the application is closed.
After execution of the code above, the Max value of the HiLo document for the Employees collection in the server will be 1.
That's because the client used only one identifier from the range it got before we disposed the store.
The next time that a client asks for a range of numbers from the server for this collection it will get (in our example) the range 2 - 33.
var newStore = new DocumentStore();
using (var session = newStore.OpenSession())
{
    // Storing an entity after the previous store was disposed reserves a new range (e.g. 2-33)
    session.Store(new Employee
    {
        FirstName = "John",
        LastName = "Doe"
    });
    session.SaveChanges();
}
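The rule the server applies when a range is returned can be written down as a short, self-contained sketch. ApplyReturn is a hypothetical function that only mirrors the condition described in this section; it is not taken from the RavenDB server code.

```csharp
using System;

public static class HiLoReturnSketch
{
    // Hypothetical server-side rule applied when a client returns its range.
    // Returns the Max value that the HiLo document should hold afterwards.
    public static long ApplyReturn(long serverMax, long clientLastUsed, long clientMax)
    {
        bool noLaterRangeIssued = serverMax == clientMax;
        bool lastUsedIsValid = clientLastUsed <= serverMax;
        return noLaterRangeIssued && lastUsedIsValid ? clientLastUsed : serverMax;
    }

    public static void Main()
    {
        // The client got 1-32, used only 1, then the store was disposed.
        long max = ApplyReturn(serverMax: 32, clientLastUsed: 1, clientMax: 32);
        Console.WriteLine(max);                       // 1
        Console.WriteLine($"{max + 1} - {max + 32}"); // 2 - 33

        // Another client already reserved 33-64, so the unused numbers are not reclaimed.
        Console.WriteLine(ApplyReturn(serverMax: 64, clientLastUsed: 1, clientMax: 32)); // 64
    }
}
```

The second call shows why the first condition matters: once a later range has been handed out, lowering Max would let the server issue numbers that another client already holds.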
Identity Parts Separator
By default, document IDs created by the server use the character / to separate their components.
This separator can be changed to any other character except | in the Document Store Conventions.
Manual HiLo ID Generation
To use RavenDB's automatic HiLo ID generation, a session needs to store a new document with id = null. RavenDB's automatic HiLo ID generator includes the collection name, unique number, and node tag to ensure that the ID is unique across the cluster.
However, you can use the HiLo algorithm and create documents with your own, manually created ID that is based on the next HiLo ID number provided by the client.
The manual generator does not guarantee unique IDs across a cluster because it does not attach a collection name and a node-tag to the ID.
It only provides the next number in the range that it got from the server.
RavenDB can only guarantee unique IDs when using RavenDB's ID generators.
When manually specifying your own IDs, you are responsible for ensuring that the IDs are unique.
Syntax
Either one of the following overload methods will return the next available ID from the HiLo numbers reserved for the client. The returned ID number can then be used when storing a new document.
Task<long> GenerateNextIdForAsync(string database, object entity);
Task<long> GenerateNextIdForAsync(string database, Type type);
Task<long> GenerateNextIdForAsync(string database, string collectionName);
Parameters | Type | Description |
---|---|---|
database | string | The database to write onto. null will create the Id for the default database set in the document store. |
collectionName | string | The collection that the document will be added to. |
entity | object | An instance of the specified collection. |
type | Type | The collection entity type. It is usually the singular of the collection name. For example, collection = "Orders", then type = "Order". |

Return Value | Type | Description |
---|---|---|
nextId | long | The next available number from the HiLo range reserved for the client. |
Examples: Manual HiLo Generators
The following example shows how to get the next HiLo ID number from the client.
The ID provided is the next unique number without the node tag and the collection.
This ID is then used to create and store a new document.
Calling GenerateNextIdForAsync ensures minimal calls to the server, as the ID is generated by the client from the reserved range of numbers.
using (var session = store.OpenSession())
{
    // Using overload - GenerateNextIdForAsync(string database, string collectionName);
    var nextId = await store.HiLoIdGenerator.GenerateNextIdForAsync(null, "Products");

    // Using overload - GenerateNextIdForAsync(string database, object entity);
    nextId = await store.HiLoIdGenerator.GenerateNextIdForAsync(null, new Product());

    // Using overload - GenerateNextIdForAsync(string database, Type type);
    nextId = await store.HiLoIdGenerator.GenerateNextIdForAsync(null, typeof(Product));

    // Now you can create a new document with the ID received.
    var product = new Product
    {
        Id = nextId.ToString()
    };

    // Store the new document.
    session.Store(product);
    session.SaveChanges();
}
Unique IDs across the cluster
This manual generator sample is sufficient if you are using only one server. If you want to ensure unique IDs across the cluster, we recommend using our default generator.
You may also consider using the cluster-wide Identities generator,
which guarantees a unique ID across the cluster.
It is more costly than our default HiLo generator because it requires a request to the server for each ID,
and the server needs to do a Raft consensus check
to ensure that the other nodes in the cluster agree that the ID is unique before it returns the ID to the client.