Replication: How client integrates with replication bundle?

RavenDB's Client API is aware of the replication mechanism offered by the server instances and is ready to support failover scenarios.

Failover behavior

By default the client will detect and respond appropriately whenever a server has the replication bundle enabled. This includes:

  • Detecting that an instance is replicating to another set of instances.
  • When that instance is down, the client will be automatically shifted to other instances.

This is caused by a failover mechanism which is turned in a document stored by default. The client can load a replication document from /replication/topology to learn what replication instances to use if the failover occurred.


The client by default creates requests for the replication document even if the server does not have the replication bundle enabled. In this case, the request for /replication/topology results in 404 in server logs.

You can turn off the failover behavior by using the document store conventions. In order to do so, use fail_immediately option

document_store.conventions.failover_behavior = Failover.fail_immediately

When fail_immediately option is used then client will raise exception when primary server is down.

The remaining values of Failover enumeration are:

  • allow_reads_from_secondaries (default) - allow to read from secondary server(s), but immediately fail writes to the secondary server(s)
  • allow_reads_from_secondaries_and_writes_to_secondaries - allow reads from and writes to secondary server(s)
  • read_from_all_servers - spread read requests across all servers, instead of doing all the work against master. Write requests will always go to master

They determine the strategy of the failovers if the primary server is down and the environment is configured to replicate between sibling instances.

Discovering destinations

Once the document store is configured to support failovers, the replication configuration of the database is checked. A list of replicated nodes is then retrieved and saved in the local application storage. Even if it is impossible to reach the primary server in the future, the list will still exist locally, and the document store can try to work with secondary instances, according to the conventions.

Changes in the server's replication configuration are monitored by the Client API as well. It is done regularly, every 5 minutes, to check if the documents are directed to current instances that are slaves to the primary server, in case a failover occurs.

Failover servers

If the client cannot reach the primary server and does not have a list of servers, nor is such list available in the local cache, the client will attempt to load and use manually configured failover servers. List of those servers can be configured by a list of ReplicationDestination in ReplicationDocument and then store them in the server.


with document_store.open_session("PyRavenDB") as session:
    replication_document = database.ReplicationDocument([database.ReplicationDestination(
        url="http://localhost:8080", database="destination_database_name"),
            url="http://localhost:8081", database="destination_database_name")])

Setting up default client configuration on server

Default client configuration can be 'injected' into a client, by filling out the ClientConfiguration property in Raven/Replication/Destinations.

The available options are:

  • Failover - default failover behavior for all clients that are connecting to a database.

Default configuration can be altered by The Studio as well. Appropriate settings are available in Settings -> Replication.

Setting up default client configuration on server

Request redirection

The Raven Client API is quite intelligent in this regard, as upon failure it will:

  • Assume that the failure is transient, and retry the request,
  • If the second attempt fails as well, will record the failure and shift to a replicated node, if available,
  • After ten consecutive failures, Raven will start replicating to this node less often
    • Once every 10 requests, until failure count reaches 100
    • Once every 100 requests, until failure count reaches 1,000
    • Once every 1,000 requests, when failure count is above 1,000
  • On the first successful request, the failure count is reset.

If the second replicated node fails, the same logic applies to it as well, and we move to the third replicated node, and so on. If all nodes fail, an appropriate exception is thrown.

Back to primary

The client shifted to a replicated node will go back to its primary server as soon as it becomes reachable (irrespective of the failure count). In replication environment the nodes send heartbeat messages in order to notify destination instances that they are up again. Then the destination (which is the secondary server for our shifted client) will send a feedback message to the client and then try sending a request to the primary server again. If the operation is successful, the failure count will be reset and the communication will work normally.

Replicated operations

At a lower level, the following operations support replication:

  • get - single document and multi documents
  • put
  • delete
  • query

The following operations do not support replication in the Client API:

  • put_index
  • delete_index