Working with Document Identifiers

Each document in a RavenDB database has a unique string, called an identifier, associated with it. Every entity that you store, whether through a session or with a put document command, is assigned such an identifier in the database. The server supports four options for storing a document and assigning an identifier to it, and the client can take advantage of all of them directly. You can also generate the identifier on the client from the type of the entity and a number provided by the HiLo algorithm. This is how the session generates identifiers, as described in detail below.

Session Usage

If you choose to use the session, you don't have to pay any special attention to the identifiers of the stored entities. The session will take care of it by generating the identifiers automatically.

It uses conventions and the HiLo algorithm to produce the identifiers. Everything is handled by the session and is transparent to the user. However, you can influence the identifier generation strategy by overriding the identifier generation conventions.

In this article we are going to consider the behavior in accordance with the default conventions.

Identifiers in RavenDB are strings

Document identifiers in a RavenDB database are always strings, so take this into consideration when you model your entities.

Autogenerated IDs

In order to figure out which property (or field) holds the entity's identifier, the convention Conventions.FindIdentityProperty is called. By default, it looks for a property or field named Id (case-sensitive). This property may hold a null value or be absent altogether, in which case the automatic identifier generation strategy is applied.

By default, entities get identifiers in the format collection/number-tag. The RavenDB client first determines the name of the collection that the entity belongs to, then contacts the server to retrieve a range of numeric values that can be used as the number part. The range of available numbers is calculated using the HiLo algorithm and is tracked per collection. The current maximum value of the ranges is stored in the documents Raven/Hilo/collection.

Let's look at an example.

var order = new Order
{
    Id = null // value not provided
};

session.Store(order);

What will be the identifier of this order? You can check it by calling:

var orderId = session.Advanced.GetDocumentId(order); // "orders/1-A"

If this is the first Order entity in your database, it will return orders/1-A. How does the identifier generation process proceed? The RavenDB client determines the collection name to be orders (by default, the plural form of the entity name). It then asks the server for a range of IDs it can use (the first available range is 1 - 32), which the server tracks in the Raven/Hilo/orders document. The next available value from that range (an always-incrementing number) is 1, so combining it with the collection name and the node tag gives orders/1-A.

Storing another Order object within the same session will produce the identifier orders/2-A. This time there is no need to ask the server for a range, because the range already held in memory (1 - 32) is sufficient, so the next number from it is simply used as the numeric part.
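The range caching described above can be sketched with a small in-memory model. This is not the RavenDB client's implementation: FakeHiLoServer and HiLoIdGenerator are hypothetical names introduced for illustration, the range size of 32 and the node tag A are taken from the example above, and the sketch is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the server: keeps one max value per collection,
// playing the role of the Raven/Hilo/<collection> documents.
public class FakeHiLoServer
{
    private readonly Dictionary<string, long> _max = new Dictionary<string, long>();

    // Counts how many times a client had to ask for a range.
    public int RangeRequests { get; private set; }

    // Hands out the next range of `size` numbers for the given collection.
    public (long Low, long High) NextRange(string collection, int size)
    {
        RangeRequests++;
        _max.TryGetValue(collection, out var max); // 0 when no range was handed out yet
        _max[collection] = max + size;
        return (max + 1, max + size);
    }
}

// Hypothetical client-side generator: caches one range per collection and
// contacts the server only when the cached range is used up.
public class HiLoIdGenerator
{
    private readonly FakeHiLoServer _server;
    private readonly string _nodeTag;
    private readonly int _rangeSize;
    private readonly Dictionary<string, (long Next, long High)> _ranges =
        new Dictionary<string, (long Next, long High)>();

    public HiLoIdGenerator(FakeHiLoServer server, string nodeTag, int rangeSize)
    {
        _server = server;
        _nodeTag = nodeTag;
        _rangeSize = rangeSize;
    }

    public string NextId(string collection)
    {
        // Ask the server only if there is no cached range or it is exhausted.
        if (!_ranges.TryGetValue(collection, out var range) || range.Next > range.High)
        {
            var (low, high) = _server.NextRange(collection, _rangeSize);
            range = (low, high);
        }

        _ranges[collection] = (range.Next + 1, range.High);
        return $"{collection}/{range.Next}-{_nodeTag}";
    }
}

public static class HiLoDemo
{
    public static void Main()
    {
        var server = new FakeHiLoServer();
        var generator = new HiLoIdGenerator(server, nodeTag: "A", rangeSize: 32);

        Console.WriteLine(generator.NextId("orders")); // orders/1-A
        Console.WriteLine(generator.NextId("orders")); // orders/2-A
        Console.WriteLine(server.RangeRequests);       // 1
    }
}
```

In this model the second identifier is produced without another range request, which mirrors the behavior described for the second Store call.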

Identifier value numeric range generation

Each document store instance (in code) handles the generation of the numeric range of identifier values. The database stores the last requested number, while the document store instances request ranges and cache the returned range of available identities.

The database has a single document (per collection) which stores the last identifier value requested by a document store instance.

For example, if the document Raven/Hilo/accounts has the following value:

{ 
    "Max": "4000",
    "@metadata": {
        "@collection": "@hilo"
    } 
}

then the next range will be 4001 - 4032, assuming a range size of 32 (the default).

The number of sessions per document store instance plays no part in identifier value generation. When the store is disposed, the client sends the server the last value it used and the max value it received from the server. The server then writes the last used value to the HiLo document, provided that the document's Max is equal to the max value reported by the client and greater than or equal to the client's last used value.

If you want to skip the identifier creation strategy that relies on the collection and HiLo value pair, you can let the RavenDB database assign a GUID identifier to the stored document. To do so, set the Id property to string.Empty:

var orderEmptyId = new Order
{
    Id = string.Empty // database will create a GUID value for it
};

session.Store(orderEmptyId);

session.SaveChanges();

var guidId = session.Advanced.GetDocumentId(orderEmptyId); // "bc151542-8fa7-45ac-bc04-509b343a8720"

This time the document ID is checked after SaveChanges, because the identifier is generated on the server and is only available once the request has been sent.

Custom / Semantic IDs

The session also supports the option to store the entity and explicitly tell under what identifier it should be stored in the database. To do this, you can either set the Id property of the object:

var product = new Product
{
    Id = "products/ravendb",
    Name = "RavenDB"
};

session.Store(product);

or use the following Store method overload:

session.Store(new Product
{
    Name = "RavenDB"
}, "products/ravendb");

Server-side generated IDs

RavenDB also supports server-generated identifiers that do not use HiLo. By creating a string ID property in your entity and setting it to a value ending with a slash (/), you can ask RavenDB to assign a document ID to a new document when it is saved.

session.Store(new Company
{
    Id = "companies/"
});

session.SaveChanges();

Using / at the end of the ID causes the server to create the ID by appending a numeric value and the node tag. After executing the code above, the server returns an ID that looks like companies/000000000000000027-A.

Information

Be aware that the only guarantee for the numeric part is that it always increases within a single node.

Identities

If you need IDs to increment across the cluster, you can use the Identity option.
To do so you need to use a pipe (|) as a suffix to the provided ID. This will instruct RavenDB to create the ID when the document is saved, using a special cluster-wide integer value that is continuously incremented.

Using an identity guarantees that IDs will be incremental, but does not guarantee that there won't be gaps in the sequence.
The ID sequence can therefore be, for example, companies/1, companies/2, companies/4.
This is because:

  • Documents could have been deleted.
  • A failed transaction still increments the identity value, thus causing a gap in the sequence.

session.Store(new Company
{
    Id = "companies|"
});

session.SaveChanges();

After the execution of the code above, the ID will be companies/1.
We do not add the node tag to the end of the ID, because the added number is unique in the cluster.
Identities continuously increase, so running the above code again will generate companies/2, and so on.

Note that we used companies as the prefix just to follow the RavenDB convention.
Nothing prevents you from providing a different prefix, unrelated to the collection name.

Be aware that using the pipe symbol (|) as a suffix to the ID generates a call to the cluster and might affect performance.

  • Identity Parts Separator
    By default, document IDs created by the server use / to separate their components.
    This separator can be changed to any other character except |, in the Global Identifier Generation Conventions.
    See Setting Identity IDs Using Commands and Operations for details.

  • Concurrent writes
    Identities are generated and updated atomically on the server side.
    This means you can safely use this approach in concurrent write scenarios.
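The gap behavior and the atomic increment described in this section can be illustrated with a small in-memory model. This is only a sketch: IdentityCounter and ReserveId are hypothetical names, and a single process-local counter stands in for the cluster-wide value that RavenDB maintains.

```csharp
using System;
using System.Threading;

// Hypothetical model of the identity value kept for one prefix.
public class IdentityCounter
{
    private long _value;

    // Atomically reserves the next value, so concurrent callers never get the same number.
    public long Next() => Interlocked.Increment(ref _value);
}

public static class IdentityDemo
{
    // Reserving a value is final: it is not returned if the write later fails.
    public static string ReserveId(IdentityCounter counter, string prefix)
        => $"{prefix}/{counter.Next()}";

    public static void Main()
    {
        var counter = new IdentityCounter();

        Console.WriteLine(ReserveId(counter, "companies")); // companies/1
        Console.WriteLine(ReserveId(counter, "companies")); // companies/2

        // Simulate a transaction that reserved an ID and then failed:
        // the value 3 is consumed although no document is stored under it.
        ReserveId(counter, "companies");

        Console.WriteLine(ReserveId(counter, "companies")); // companies/4
    }
}
```

The stored documents end up as companies/1, companies/2, companies/4, matching the example sequence given above.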

Commands Usage

The commands API gives you full freedom to select the identifier generation strategy. As with the session, you can either ask the server to provide the identifier or provide the identifier of the stored entity manually.

Identity IDs

As with the session, you can indicate that the server should append a generated suffix to the identifier you pass. To do so, end the ID with the / or | character:

var doc = new DynamicJsonValue
{
    ["Name"] = "My RavenDB"
};

var blittableDoc = session.Advanced.EntityToBlittable.ConvertEntityToBlittable(doc, null);

var command = new PutDocumentCommand("products/", null, blittableDoc);

session.Advanced.RequestExecutor.Execute(command, session.Advanced.Context);

var identityId = command.Result.Id; // "products/0000000000000000001-A" when the ID ends with '/'

var commandWithPipe = new PutDocumentCommand("products|", null, blittableDoc);
session.Advanced.RequestExecutor.Execute(commandWithPipe, session.Advanced.Context);

// Read the result of the pipe command, not of the first command
var identityPipeId = commandWithPipe.Result.Id; // "products/1"

With commands, you can also build identifiers on the client while still relying on the server-side identifier generator. Simply specify the prefix for which you want to fetch the next available identifier number, as in the following example:

var command = new NextIdentityForCommand("products");
session.Advanced.RequestExecutor.Execute(command, session.Advanced.Context);
var identity = command.Result;

var doc = new DynamicJsonValue
{
    ["Name"] = "My RavenDB"
};

var blittableDoc = session.Advanced.EntityToBlittable.ConvertEntityToBlittable(doc, null);

var putCommand = new PutDocumentCommand("products/" + identity, null, blittableDoc);

session.Advanced.RequestExecutor.Execute(putCommand, session.Advanced.Context);

Note that this construction requires two round trips to the server to add a single document. The call to session.Advanced.RequestExecutor.Execute(command, session.Advanced.Context) is necessary for every entity you want to store, and each request for the next identifier increments the value on the server side. You cannot fetch the next available identifier once and then derive identifiers for a whole batch of objects of the same type by incrementing the value locally: if someone else is storing documents through the identity mechanism, you can accidentally overwrite a document or get a conflict exception.
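The overwrite risk mentioned above can be shown with a small in-memory model. This is a sketch under assumptions: FakeIdentityServer and its methods are hypothetical stand-ins for the server, not RavenDB APIs, and the put operation here silently overwrites instead of raising a conflict.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the server: an identity value per prefix plus a document store.
public class FakeIdentityServer
{
    private readonly Dictionary<string, long> _identities = new Dictionary<string, long>();
    private readonly Dictionary<string, string> _documents = new Dictionary<string, string>();

    // Increments and returns the identity value for the prefix.
    public long NextIdentityFor(string prefix)
    {
        _identities.TryGetValue(prefix, out var current); // 0 when the prefix is new
        _identities[prefix] = current + 1;
        return current + 1;
    }

    // Stores the document and reports whether an existing one was overwritten.
    public bool Put(string id, string body)
    {
        var overwritten = _documents.ContainsKey(id);
        _documents[id] = body;
        return overwritten;
    }
}

public static class LocalIncrementDemo
{
    public static void Main()
    {
        var server = new FakeIdentityServer();

        // Writer A asks the server once, then increments locally (the unsafe shortcut).
        long a = server.NextIdentityFor("products");      // 1
        server.Put("products/" + a, "A-first");           // products/1
        server.Put("products/" + (a + 1), "A-second");    // products/2, never reserved on the server

        // Writer B asks the server for every document, as intended.
        long b = server.NextIdentityFor("products");      // 2
        bool overwritten = server.Put("products/" + b, "B-first");

        Console.WriteLine(overwritten); // True: B replaced A's second document
    }
}
```

Because writer A never reserved the value 2 on the server, writer B legitimately receives it and both writers end up targeting products/2.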

There are dedicated commands that allow you to set identifier values for a single given prefix:

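var seedIdentityCommand = new SeedIdentityForCommand("products", 1994);
session.Advanced.RequestExecutor.Execute(seedIdentityCommand, session.Advanced.Context); // send the command to the server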