see on GitHub

Replication Overview

Replication in RavenDB is the process of transferring data from one database instance to another.
Data is highly available as Reads & Writes can be done on any cluster node.
Writes are always accepted and are automatically replicated to the other nodes,
either within the database-group, or to the nodes defined by a replication task,
(see the different types below).
Conflicts, which may be involved when the data replicates,
are resolved according to the defined conflict resolution policy on the receiving end.
In this page:

Replication types

Internal replication

Replicate between: All database-group nodes.
Handled by: Automatically handled by the Database-Group nodes.
Direction: Master-to-master replication among all database-group nodes.
Filtering: No filtering is available. Data is replicated as exists on the source.
Conflicts: Handled by the database resolution policy, which is the same for all the database instances.
Delays: There are no delays, data is immediately replicated.
Usage: This replication keeps the database data in sync across the database-group nodes.
The data is highly available as reads & writes can be done on any of the nodes.
You can write to any node in the database group,
that write will be recorded and automatically replicated to all other nodes in the database-group.

External replication

Replicate between: Two databases that are typically set on different clusters.
Handled by: Handled by the ongoing External Replication Task defined by the user.
Direction: One-way replication.
Filtering: No filtering is available. Data is replicated as exists on the source.
Conflicts: Handled as defined by the destination database resolution policy.
Delays: Replication can be delayed as defined within the external-replication task.
Usage: This replication allows you to have a live database replica in another cluster,
which can be used as a failover target.
It is possible to define two such tasks on separate clusters that will replicate to one another.

Hub/Sink replication

Replicate between: Multiple Sinks that connect to a single Hub on different clusters.
Handled by: Handled by the ongoing Hub/Sink Replication Tasks defined by the user.
Direction: Hub to Sink only, Sink to Hub only, or both directions (as defined by the task).
Filtering: Documents filtering is available when working with secure servers.
Conflicts: Handled as defined by the destination database resolution policy.
Delays: Replication can be delayed as defined within the Hub/Sink tasks.
Usage: Data is replicated between the Hub and all Sinks connected to that Hub.
The connection is always triggered by the Sink.

What is replicated

The following database-items are replicated:

Documents
Revisions
Attachments
Conflicts
Tombstones
Counters
Time Series

Content replicated:

The content of the replicated items data is Not changed.
If content change is required then consider using ETL tasks that use transformation scripts.

The following cluster-level features are Not replicated:

Index definitions and index data
Ongoing tasks definitions
Compare-exchange items
Identities
Conflict resolution scripts

With internal replication:
When you define a cluster-level behavior, i.e. create an index,
then consistency between the database instances in the database-group is achieved by the Raft Protocol.
With a replication task:
Replication controls only the flow of the data without dictating how it's going to be processed on the receiving end, thus different configurations can be defined on the source cluster and on the destination cluster.

How replication works

Each database instance holds a TCP connection to each of the other database instance destinations.
With internal replication - the destinations are all other database-group nodes.
With a replication task - the destinations are the nodes defined in the task.
Whenever there is a 'write' on one instance,
it will be sent to all the other nodes in the database-group immediately over this connection.
If a replication task is also defined, then the data will also replicate to the destination database.
Sending the data is done in an async manner.
If the database instance is unable to replicate the data, it will still accept that 'write action' and send it later.
Each database instance has its own local database-ETag.
This Etag increases on every storage write.
The item triggered the write will get that next consecutive number.
The order by which the items are replicated is set by their item-ETag, from low(oldest) to high(newest).
The data is sent in batches from the source to the destination.
Once the batch is processed successfully on the destination side,
the destination records the ETag of the last item it had received in that batch ( last-accepted-ETag ).
The destination sends a response back to the source with that last-accepted-ETag
so that the source will know where to continue sending the next batch from.
In case of a replication failure, when sending the next batch, replication will start from the item
that has this last-accepted-ETag, which is known from the previous successful batch.

Replication & transaction boundary

Transactions atomicity

RavenDB guarantees that modifications made in the same transaction will always be replicated
to the destination in a single batch and won't be broken into separate batches.
This is true for both the internal-replication and the replication-tasks.

Replication & cluster-wide transactions

A cluster-wide transaction, which is implemented by the Raft Protocol,
is either persisted on all database group nodes or rolled back on all upon failure.
After a cluster consensus is reached, the Raft command to be executed is propagated to all nodes.
When the command is executed locally on a node, if the items that are persisted are of the type that replicates, then the node will replicate them to the other nodes via the replication process.
A node that receives such replication will accept this write,
unless it has already committed it through a raft command it received before.

Transaction boundaries in single-node transactions

If there are several document modifications in the same transaction they will be sent in the same replication batch, keeping the transaction boundary on the destination as well.
However, when a document is modified in two separate transactions,
and if replication of the 1st transaction has not yet occurred,
then that document will Not be sent when the 1st transaction is replicated,
it will be sent with the 2nd transaction.
If you care about all the modifications that were done then enable revisions:
When a revision is created for a document it is written as part of the same transaction as the document.
The revision is then replicated along with the document in the same indivisible batch.

How revisions replication help data consistency

Consider a scenario in which two documents, Users/1 and Users/2, are created in the same transaction,
and then Users/2 is modified in a different transaction.

How will Users/1 and Users/2 be replicated?
When RavenDB creates replication batches, it keeps the transaction boundary by always sending documents that were modified in the same transaction, in the same batch.
In our scenario, however, Users/2 was modified after its creation, it is now recognized by its Etag as a part of a different transaction than that of Users/1, and the two documents may be replicated in two different batches, Users/1 first and Users/2 later.
If this happens, Users/1 will be replicated to the destination without Users/2 though they were created in the same transaction, causing a data inconsistency that will persist until the arrival of Users/2.
The scenario will be different if revisions are enabled.
In this case the creation of Users/1 and Users/2 will also create revisions for them both. These revisions will continue to carry the Etag given to them at their creation, and will be replicated in the same batch.
When the batch arrives at the destination, data consistency will be kept:
Users/1 will be stored, and so will the Users/2 revision, that will become a live Users/2 document.

see on GitHub

RavenDB

RavenDB Cloud

Try

Experience interactive demos and playground server

RavenDB Docs

RavenDB Cloud Docs

Documentation Guide

Download

Features

Performance

Comparison

What’s New

Demo

Bootcamp

Webinars

Workshops

Inside RavenDB Book

GitHub

StackOverflow

Articles

Whitepapers

Events

Promotional Materials

Unlock your business potential

Use Cases

Articles

Whitepapers

Press Releases

Industry Reports

Performance

Comparison

Proof of Concept Program

Academic Program

Events

What’s New

Roadmap

On-premise Pricing

Cloud Pricing

Support

Proof of Concept Program

Academic Program

Replication Overview

Replication types

Internal replication

External replication

Hub/Sink replication

What is replicated

How replication works

Replication & transaction boundary

How revisions replication help data consistency

Related Articles

Replication Within the Cluster

Replication Between Clusters

Inside RavenDB

Ayende @ Rahien Blog

RavenDB

RavenDB Cloud

Try

RavenDB Docs

RavenDB Cloud Docs

Documentation Guide

Download

Features

Performance

Comparison

What’s New

Demo

Bootcamp

Webinars

Workshops

Inside RavenDB Book

GitHub

StackOverflow

Articles

Whitepapers

Events

Promotional Materials

Use Cases

Articles

Whitepapers