Studio: External Replication Task



General Information about External Replication Task

What is being replicated:

What is not being replicated:

Why are some cluster-level features not replicated?

RavenDB is designed so that data ownership is at the cluster level. This architecture prevents conflicts between clusters, which matters most when ACID transactions are important.
To learn more, see Data Ownership in a Distributed System.

It is also best to ensure that each cluster defines policies, configurations, and ongoing tasks that are relevant to it.

Conflicts:

  • Two databases that have an External Replication task defined between them will detect and resolve document conflicts according to each database's conflict resolution policy.
  • It is recommended to have the same conflict resolution policy configuration on both the source and the target databases.

Step-by-Step Guide

  1. Pass Certificate from Source Server to Destination Server
    This step must be done if replicating to a separate secure cluster so that the destination cluster trusts the source.
  2. Create Target Database in Destination Server
  3. Define External Replication Task in the Source Database
    Learn more about defining the task in the dedicated section.
    • Optional Parameters
      • Task Name
      • Delay Time
      • Preferred Node
    • Required Parameters
      • Connection String
    • Save
      • Click "Save" to activate the External Replication task.
      • Check the target database to see if data has been transferred. Allow at least 20-30 seconds for this, depending on the dataset size.
      • If the data did not transfer properly, check the notifications (top-right of Studio) on the responsible node to see if there were any errors.
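
The check in the Save step can also be scripted. The following is a minimal sketch of a polling loop and is not part of RavenDB: `wait_for_replication` and its parameters are hypothetical names, and `is_replicated` stands for whatever check you choose to run against the target database (for example, comparing document counts).

```python
import time
from typing import Callable


def wait_for_replication(
    is_replicated: Callable[[], bool],
    timeout_s: float = 120.0,
    interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_replicated() until it returns True or timeout_s elapses.

    All names and defaults here are assumptions made for illustration.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_replicated():
            return True
        if clock() >= deadline:
            return False  # give up; check the notifications for errors
        sleep(interval_s)
```

If the function returns False, the next step is the same as in the list above: look at the notifications on the responsible node.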

Definition

To access the External Replication Task Studio interface:

a. Open the Databases view in the source server.
b. Select the source database.
c. Click the Tasks tab.
d. Select Ongoing Tasks.
e. Click Add a database task.
f. Click External Replication to access the following interface.

Figure 1. External Replication Task Definition

Create New External Replication Task

  1. Source Database
    Be sure that you are defining the task from the correct source database.

  2. Task Name (Optional)

    • Choose a name for the task.
    • If no name is given, the RavenDB server will create one for you based on the defined connection string.
  3. Set Replication Delay Time (Optional)

    • If a delay time is set, each data change is replicated only after this time period has passed.
    • Having a delayed instance of a database allows you to "go back in time" and recover from data contamination caused by an attack, a faulty patch script, or human error.
  4. Set Preferred Responsible Node (Optional)

    • Select a preferred mentor node from the Database Group to be the responsible node for this External Replication Task
    • If not selected, then the cluster will assign a responsible node (see Members Duties)
  5. Create a new RavenDB connection string

    • Select a connection string from the pre-defined list -or- create a new connection string to be used.
    • The connection string defines the external database and its server URL to replicate to.
      External Replication: Connection String
      1. Name
        Give the connection string a meaningful name.
      2. Database
        Copy the exact name of the destination database.
        • If the source database is encrypted, make sure that the destination is encrypted as well.
      3. Discovery URL
        Copy the URL from the destination server here.
        External Replication: URL
        • Be sure to copy only the server URL, without extraneous details.
      4. Save
        Click "Save" to activate the External Replication task.

    If the destination database is in a cluster

    You can list the URLs of several destination nodes, on different machines, in the connection string so that if one node is down, another can keep the destination updated.
    See Offline Behavior - When the destination node is down.
    If you define only one node's URL, RavenDB will wait until that node is back online and will then send any missing data.
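
The fields described in this section can be summarized as a small data model. The sketch below is illustrative only: `ConnectionStringSketch`, `TaskDefinitionSketch`, and `server_url_only` are hypothetical names, not the RavenDB client API. It assumes that "extraneous details" means anything after the host and port, such as a path or fragment copied from the browser address bar.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional
from urllib.parse import urlsplit


def server_url_only(pasted: str) -> str:
    """Reduce a pasted address to scheme://host[:port].

    Assumption: path, query, and fragment are the "extraneous details".
    """
    parts = urlsplit(pasted.strip())
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not a server URL: {pasted!r}")
    return f"{parts.scheme}://{parts.netloc}"


@dataclass
class ConnectionStringSketch:
    name: str                   # a meaningful name for the connection string
    database: str               # exact name of the destination database
    discovery_urls: List[str]   # one or more destination server URLs

    def __post_init__(self) -> None:
        if not self.discovery_urls:
            raise ValueError("at least one discovery URL is required")
        # Keep only the server part of every URL that was pasted in.
        self.discovery_urls = [server_url_only(u) for u in self.discovery_urls]


@dataclass
class TaskDefinitionSketch:
    connection_string: ConnectionStringSketch  # required
    task_name: Optional[str] = None            # None: the server creates a name
    delay: timedelta = timedelta(0)            # zero: no replication delay
    preferred_node: Optional[str] = None       # None: the cluster assigns a node


# Usage: two destination nodes, one of them pasted with a Studio path attached.
connection = ConnectionStringSketch(
    name="to-backup-cluster",
    database="Orders",
    discovery_urls=[
        "https://b.example.com:8080/studio/index.html#databases",
        "https://c.example.com:8080",
    ],
)
task = TaskDefinitionSketch(connection, delay=timedelta(hours=1))
```

Only the connection string is required; the other three fields mirror the optional settings in steps 2 to 4.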

Details in Tasks List View

Figure 2. External Replication Task - Task List View

Tasks List View Details

  1. External Replication Task Details:

    • Task Status - Active / Not Active / Not on Node / Reconnect
    • Connection String - The connection string used
    • Destination Database - The external database to which the data is being replicated
    • Actual Destination URL - The server URL to which the data is actually being replicated,
      the one that is currently used out of the available Topology Discovery URLs
    • Topology Discovery URLs - List of the URLs of the available destination Database Group servers
  2. Database Group Topology:
    Visual representation showing the responsible node for the External Replication Task

Offline Behavior

  • When the source cluster is down (and there is no leader):

    • Creating a new Ongoing Task is a cluster-wide operation,
      so a new Ongoing External Replication Task cannot be scheduled.

    • If an External Replication Task was already defined and active when the cluster went down,
      then the task will not be active and no replication will take place.

  • When the node responsible for the external replication task is down:

    • If the responsible node for the External Replication Task is down,
      then another node from the Database Group will take ownership of the task so that the external replica stays up to date.
  • When the destination node is down:

    • The external replication will wait until the destination is reachable again and proceed from where it left off.

    • If there is a cluster on the other side, and the URL addresses of the destination database group nodes are listed in the connection string, then when the destination node is down, the replication task will simply start transferring data to one of the other nodes specified.
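
The destination-side behavior above can be pictured with a short sketch. This is a simplified model under stated assumptions, not RavenDB's implementation: `pick_destination` and `ReplicationCursorSketch` are hypothetical names, changes are represented as plain ascending sequence numbers, and the first reachable URL is chosen (the text only says "one of the other nodes").

```python
from typing import Callable, List, Optional, Sequence, Tuple


def pick_destination(
    discovery_urls: Sequence[str],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the first reachable URL, or None if every listed node is down."""
    for url in discovery_urls:
        if is_reachable(url):
            return url
    return None


class ReplicationCursorSketch:
    """Remembers the last change sent so replication resumes where it left off."""

    def __init__(self) -> None:
        self.last_sent = 0

    def send_pending(
        self,
        changes: Sequence[int],
        discovery_urls: Sequence[str],
        is_reachable: Callable[[str], bool],
    ) -> Tuple[Optional[str], List[int]]:
        target = pick_destination(discovery_urls, is_reachable)
        if target is None:
            # No destination is reachable: wait, and keep the position unchanged.
            return None, []
        batch = [seq for seq in changes if seq > self.last_sent]
        if batch:
            self.last_sent = max(batch)
        return target, batch
```

With a single URL in the connection string, `pick_destination` returns None for as long as that node is down, which corresponds to the waiting behavior in the first bullet.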