External Replication Task
- Schedule an External Replication Task in order to have a live replica of your data in another database:
  - In the same cluster, if you want a live copy that won't be a client failover target.
  - In a separate RavenDB cluster on local machines or a cloud instance, which can be used as a failover target if the source cluster is down.
- "Live" means that the replica is up to date at all times.
  Any change in the source database is reflected in the replica as soon as it occurs.
- This ongoing task replicates one-way, from the source to the destination. For additional functionality,
  such as filtering and two-way (master-master) replication, consider Hub/Sink Replication.
- To replicate between two separate, secure RavenDB servers,
  you need to pass a client certificate from the source server to the destination.
- External replication can be a convenient means of migrating data into a sharded database.
  You can read more about this option in the external replication and migration sections of the sharding documentation.
- The External Replication task does not create a backup of your data and indexes.
  See more in Backup -vs- Replication.
General Information about External Replication Task
What is being replicated:
- All database documents and related data:
What is not being replicated:
- Server and cluster level features:
  - Indexes
  - Conflict resolver definitions
  - Compare-Exchange
  - Subscriptions
  - Identities
  - Ongoing tasks
Why are cluster-level features not replicated?
RavenDB is designed with a cluster-level data ownership model to prevent conflicts between clusters,
especially in scenarios where ACID transactions are critical.
This approach ensures that certain features, such as policies, configurations, and ongoing tasks,
remain specific to each cluster, avoiding potential inconsistencies.
To explore this concept further, refer to the Data Ownership in a Distributed System blog post.
Conflicts:
- Two databases that have an External Replication task defined between them will each detect and resolve document conflicts according to that database's own conflict resolution policy.
- It is recommended to have the same conflict resolution policy configuration on both the source and the target databases.
Step-by-Step Guide
- Pass Certificate from Source Server to Destination Server
  This step must be done when replicating to a separate secure cluster, so that the destination cluster trusts the source.
  - Via RavenDB Studio:
    Navigate from the "Manage Server" tab (left side) > "Certificates" to open the Certificate Management view.
    - Learn how to pass certificates here.
  - Via API:
    See the code sample to learn how to define a client certificate in the DocumentStore().
    - To generate and configure a client certificate from the source server:
      - Via code, see CreateClientCertificateOperation.
      - Via RavenDB CLI PowerShell command, see Client Certificate Usage.
    - Learn the rationale behind configuring client certificates in The RavenDB Security Authorization Approach.
- Create Target Database in Destination Server
  - Consider creating an empty target database, because data transfer can overwrite existing data.
  - If the source database is encrypted, be sure that the target database is encrypted as well.
- Define External Replication Task in the Source Database
  Learn more about defining the task in the dedicated section.
  - Optional Parameters
    - Task Name
    - Delay Time
    - Preferred Node
  - Required Parameters
    - Connection String
  - Save
    - Click "Save" to activate the External Replication task.
    - Check the target database to see if data has been transferred. This can take 20-30 seconds or longer, depending on the dataset size.
    - If the data did not transfer properly, check the notifications (top-right of Studio) on the responsible node to see if there were any errors.
Definition
To access the External Replication Task Studio interface:
a. Open the Databases view in the source server.
b. Select the source database.
c. Click the Tasks tab.
d. Select Ongoing Tasks.
e. Click Add a database task.
f. Click External Replication to access the following interface.
Create New External Replication Task
- Source Database
  Be sure that you are defining the task from the correct source database.
- Task Name (Optional)
  - Choose a name for the task.
  - If no name is given, the RavenDB server will create one for you based on the defined connection string.
- Set Replication Delay Time (Optional)
  - If a delay time is set, each data change is replicated only after this time period has passed.
  - Having a delayed instance of a database allows you to "go back in time" and undo contamination of your data caused by an attack, a faulty patch script, or other human error.
  - This doesn't replace the need to safely back up your databases, but it does provide a way to stay online while repairing.
- Set Preferred Responsible Node (Optional)
  - Select a preferred mentor node from the Database Group to be the responsible node for this External Replication Task.
  - If not selected, the cluster will assign a responsible node (see Members Duties).
- Create a new RavenDB connection string
  - Select a connection string from the pre-defined list, or create a new connection string to be used.
  - The connection string defines the external database and its server URL to replicate to.
    - Name
      Give the connection string a meaningful name.
    - Database
      Copy the exact name of the destination database.
      - If the source database is encrypted, make sure that the destination is encrypted as well.
    - Discovery URL
      Copy the URL from the destination server here.
      - Be sure to copy only the server URL, without extraneous details.
    - Save
      Click "Save" to activate the External Replication task.
If the destination database is in a cluster
You can set multiple connection strings to multiple different nodes on different machines so that if one is down, the other can keep the destination updated.
See Offline Behavior - When the destination node is down.
If you define only one node's connection string, RavenDB will wait until that node is online and will then update any missing information.
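The settings described in this section can be summarized in a small, library-free Python sketch. It is not the RavenDB client API: the class names `ConnectionString` and `ExternalReplicationTask`, the helper `server_url_only`, and the fallback task-name format are all hypothetical, chosen only to illustrate which settings are required, which are optional, how a delay postpones each change, and what "only the server URL" means for a Discovery URL.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional
from urllib.parse import urlsplit


def server_url_only(url: str) -> str:
    """Reduce a pasted URL to scheme://host[:port], dropping path, query, and fragment."""
    parts = urlsplit(url.strip())
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute server URL: {url!r}")
    return f"{parts.scheme}://{parts.netloc}"


@dataclass
class ConnectionString:
    name: str                  # a meaningful name for the connection string
    database: str              # exact name of the destination database
    discovery_urls: List[str]  # one server URL per destination node

    def __post_init__(self) -> None:
        if not self.discovery_urls:
            raise ValueError("at least one discovery URL is required")
        self.discovery_urls = [server_url_only(u) for u in self.discovery_urls]


@dataclass
class ExternalReplicationTask:
    connection_string: ConnectionString   # required
    name: Optional[str] = None            # optional
    delay: timedelta = timedelta(0)       # optional; zero means no delay
    preferred_node: Optional[str] = None  # optional; otherwise the cluster assigns one

    def __post_init__(self) -> None:
        if self.name is None:
            # Hypothetical format; the name the real server generates may differ.
            self.name = f"External Replication to {self.connection_string.name}"

    def eligible_at(self, changed_at: datetime) -> datetime:
        """Earliest time at which a change made at `changed_at` may be replicated."""
        return changed_at + self.delay


# Two destination nodes are listed; the pasted Studio URL is trimmed to the server URL.
cs = ConnectionString(
    name="ToReplica",
    database="Orders",
    discovery_urls=[
        "https://b.example.com:8080/studio/index.html#databases",
        "https://c.example.com:8080",
    ],
)
task = ExternalReplicationTask(cs, delay=timedelta(hours=6))
print(cs.discovery_urls)  # ['https://b.example.com:8080', 'https://c.example.com:8080']
print(task.eligible_at(datetime(2024, 1, 1, 0, 0)))  # 2024-01-01 06:00:00
```

The example host names and the six-hour delay are placeholders. In Studio, the same values are entered in the task form rather than in code.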
Details in Tasks List View
Tasks List View Details
- External Replication Task Details:
  - Task Status - Active / Not Active / Not on Node / Reconnect
  - Connection String - The connection string used
  - Destination Database - The external database to which the data is being replicated
  - Actual Destination URL - The server URL to which the data is actually being replicated,
    the one currently in use out of the available Topology Discovery URLs
  - Topology Discovery URLs - List of the available destination Database Group server URLs
- Database Group Topology:
  Visual representation showing the responsible node for the External Replication Task
Offline Behavior
- When the source cluster is down (and there is no leader):
  - Creating a new Ongoing Task is a cluster-wide operation,
    so a new Ongoing External Replication Task cannot be scheduled.
  - If an External Replication Task was already defined and active when the cluster went down,
    then the task will not be active, and no replication will take place.
- When the node responsible for the external replication task is down:
  - If the responsible node for the External Replication Task is down,
    then another node from the Database Group will take ownership of the task so that the external replica stays up to date.
- When the destination node is down:
  - The external replication will wait until the destination is reachable again and proceed from where it left off.
  - If there is a cluster on the other side, and the URLs of the destination database group nodes are listed in the connection string, then when the destination node is down, the replication task will simply start transferring data to one of the other nodes specified.
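The destination-side behavior can be pictured as a simple selection over the URLs listed in the connection string. This is a minimal sketch, not RavenDB's actual implementation: `pick_destination` and the `is_reachable` callback are hypothetical names, and the ordering, health checks, and retry timing used by the real server are not modeled here.

```python
from typing import Callable, Iterable, Optional


def pick_destination(urls: Iterable[str],
                     is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Return the first listed URL that is reachable, or None to signal 'wait and retry'."""
    for url in urls:
        if is_reachable(url):
            return url
    return None


# Node A is down. With two URLs listed, replication can continue against node B.
down = {"https://a.example.com"}
urls = ["https://a.example.com", "https://b.example.com"]
print(pick_destination(urls, lambda u: u not in down))      # https://b.example.com

# With only node A listed, nothing is reachable, so the task has to wait for it.
print(pick_destination(urls[:1], lambda u: u not in down))  # None
```

Returning `None` corresponds to the single-URL case described above, where replication waits until the destination is reachable again and then proceeds from where it left off.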