Hub/Sink Replication


Hub/Sink replication is used to maintain a live replica of a database or a chosen part of it, through a secure connection between ongoing Hub and Sink replication tasks.

For example, RavenDB instances running onboard mobile library buses can collect data locally (e.g. bus GPS coordinates, book returns and loans) and replicate it via local Sink tasks to the central library's Hub whenever they are online.

  • Learn more about Hub/Sink replication here.
  • You can use the Studio to define Hub and Sink tasks.


What is and is not replicated?

After the tasks are defined, changed documents whose replication is allowed by both the Hub and the Sink filters will replicate.

If you want the destination to hold the entire database, including documents that existed before the tasks were defined, first import the database into the destination.

Once the data is on the destination server, an ongoing Hub/Sink replication task will keep the two databases up to date.

What is being replicated:

What is not being replicated:

Why are some cluster-level features not replicated?

To provide an architecture that prevents conflicts between clusters, especially when ACID transactions are important, RavenDB is designed so that data ownership is at the cluster level.
To learn more, see Data Ownership in a Distributed System.

It is also best to ensure that each cluster defines the policies, configurations, and ongoing tasks that are relevant to it.

Defining Replication Tasks

To start replication via Hub and Sink tasks, you need to define:

  1. A Hub task
  2. One or more Hub Accesses
    • Multiple Sink tasks can connect to a Hub using each Access you define for it.
    • Each Access has an associated certificate that the Sink uses to authenticate with the Hub. The certificate also identifies the specific Access and the filters that apply to the connection.
  3. One or more Sink tasks
  4. Filtering
    • You can enable or disable replication filtering, and specify the paths of documents whose replication is allowed.
    • Allowed paths are defined separately for the Hub and for the Sink.
    • You can filter incoming and outgoing replication by defining separate lists of allowed paths for incoming and outgoing documents.
      • Only documents that are allowed by both the hub and sink filters will be replicated.
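
The "allowed by both filters" rule can be illustrated with a small, self-contained sketch. It does not use the RavenDB client, and the matching rule it applies (a path ending in `*` matches by prefix, any other path must equal the document ID) is an assumption made for illustration rather than a statement of how the server evaluates paths. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class ReplicationFilterSketch
{
    // Assumed matching rule (illustration only): a trailing '*' means
    // "any document ID starting with this prefix"; otherwise the path
    // must equal the document ID, ignoring case.
    public static bool Matches(string allowedPath, string documentId)
    {
        if (allowedPath.EndsWith("*", StringComparison.Ordinal))
        {
            var prefix = allowedPath.Substring(0, allowedPath.Length - 1);
            return documentId.StartsWith(prefix, StringComparison.OrdinalIgnoreCase);
        }

        return string.Equals(allowedPath, documentId, StringComparison.OrdinalIgnoreCase);
    }

    // A document is replicated only if the Hub list AND the Sink list allow it.
    public static bool IsReplicated(string documentId, string[] hubPaths, string[] sinkPaths)
    {
        return hubPaths.Any(p => Matches(p, documentId))
            && sinkPaths.Any(p => Matches(p, documentId));
    }

    public static void Main()
    {
        var hubPaths = new[] { "products/*", "orders/*" };
        var sinkPaths = new[] { "products/*" };

        // Allowed by both lists.
        Console.WriteLine(IsReplicated("products/1-A", hubPaths, sinkPaths));
        // Allowed by the Hub only, so it is filtered out.
        Console.WriteLine(IsReplicated("orders/1-A", hubPaths, sinkPaths));
    }
}
```

In this sketch the first call yields true and the second yields false, because `orders/*` appears only in the Hub's list.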

Defining a Replication Hub

Use PutPullReplicationAsHubOperation to register a new Hub task,
and configure it using a PullReplicationDefinition class.

await store.Maintenance.SendAsync(new PutPullReplicationAsHubOperation 
    (new PullReplicationDefinition {
        Name = "Hub1_Bidirectional",
        Mode = PullReplicationMode.SinkToHub | PullReplicationMode.HubToSink,
        WithFiltering = true
    }));
  • PutPullReplicationAsHubOperation definition

    public PutPullReplicationAsHubOperation(string name)
    public PutPullReplicationAsHubOperation(PullReplicationDefinition pullReplicationDefinition)
  • PullReplicationDefinition parameters

    Parameter | Type | Description
    DelayReplicationFor | TimeSpan | Amount of time to wait before starting replication
    Disabled | bool | Disable the task or leave it enabled
    MentorNode | string | Preferred mentor node
    Mode | PullReplicationMode | Data direction (HubToSink, SinkToHub, or both)
    Name | string | Task name
    TaskId | long | Task ID
    WithFiltering | bool | Allow replication filtering

Defining a Hub Access

Use RegisterReplicationHubAccessOperation to define a Hub Access,
and configure it using a ReplicationHubAccess class.

await store.Maintenance.SendAsync(new RegisterReplicationHubAccessOperation
   ("Hub1_Bidirectional", new ReplicationHubAccess {
        Name = "Access1",
        AllowedSinkToHubPaths = new[]
        {
            "products/*",
        },
        AllowedHubToSinkPaths = new[]
        {
            "products/*",
        },
        CertificateBase64 = Convert.ToBase64String(pullCert.Export(X509ContentType.Cert))
    }));
  • RegisterReplicationHubAccessOperation definition

    public RegisterReplicationHubAccessOperation(string hubName, ReplicationHubAccess access)
  • ReplicationHubAccess parameters

    Parameter | Type | Description
    Name | string | Access name
    CertificateBase64 | string | The Access certificate (public portion, Base64-encoded)
    AllowedHubToSinkPaths | string[] | Allowed paths from Hub to Sink
    AllowedSinkToHubPaths | string[] | Allowed paths from Sink to Hub

To remove an existing Access, use UnregisterReplicationHubAccessOperation.

  • UnregisterReplicationHubAccessOperation definition:
    public UnregisterReplicationHubAccessOperation(string hubName, string thumbprint)
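
As a minimal sketch of how this might look in practice, the helper below removes the Access registered for the "Hub1_Bidirectional" Hub by passing the thumbprint of the certificate that was used to register it. The helper class and method names are hypothetical, and the `using` directives for the RavenDB client namespaces are an assumption that may need adjusting for your client version.

```csharp
using System.Security.Cryptography.X509Certificates;
using System.Threading.Tasks;
using Raven.Client.Documents;                        // assumed namespace
using Raven.Client.Documents.Operations.Replication; // assumed namespace

public static class HubAccessCleanup // hypothetical helper
{
    // Removes the Access identified by the certificate's thumbprint
    // from the Hub task named "Hub1_Bidirectional".
    public static Task RemoveAccessAsync(IDocumentStore store, X509Certificate2 pullCert)
    {
        return store.Maintenance.SendAsync(
            new UnregisterReplicationHubAccessOperation(
                "Hub1_Bidirectional",   // hubName: the Hub task the Access belongs to
                pullCert.Thumbprint));  // thumbprint of the Access certificate
    }
}
```

The Access is addressed by its certificate thumbprint rather than by its name, matching the constructor signature shown above.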

Defining a Replication Sink

Use UpdatePullReplicationAsSinkOperation to define a Sink task,
and configure it using a PullReplicationAsSink class.

await store.Maintenance.SendAsync(new UpdatePullReplicationAsSinkOperation
   (new PullReplicationAsSink {
        ConnectionStringName = dbName + "_ConStr",
        Mode = PullReplicationMode.SinkToHub | PullReplicationMode.HubToSink,
        CertificateWithPrivateKey = Convert.ToBase64String(pullCert.Export(X509ContentType.Pfx)),
        HubName = "Hub1_Bidirectional",
        AllowedHubToSinkPaths = new[]
        {
            "employees/8-A"
        },
        AllowedSinkToHubPaths = new[]
        {
            "employees/8-A"
        }
    }));
  • UpdatePullReplicationAsSinkOperation definition

    public UpdatePullReplicationAsSinkOperation(PullReplicationAsSink pullReplication)
  • PullReplicationAsSink parameters

    Parameter | Type | Description
    Mode | PullReplicationMode | Data direction (HubToSink, SinkToHub, or both)
    AllowedHubToSinkPaths | string[] | Allowed paths from Hub to Sink
    AllowedSinkToHubPaths | string[] | Allowed paths from Sink to Hub
    CertificateWithPrivateKey | string | A certificate with the Sink's private key
    CertificatePassword | string | Certificate password
    AccessName | string | Name of the Access to connect to
    HubName | string | Name of the Hub to connect to
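
The sketch below shows one way the AccessName and CertificatePassword parameters listed above might be used together, for a Sink that receives data from the "Hub1_Bidirectional" Hub through the "Access1" Access with a password-protected certificate. The helper class, the method name, and the `certPassword` and `connectionStringName` arguments are hypothetical, and the RavenDB `using` directives are an assumption.

```csharp
using System;
using System.Security.Cryptography.X509Certificates;
using System.Threading.Tasks;
using Raven.Client.Documents;                        // assumed namespace
using Raven.Client.Documents.Operations.Replication; // assumed namespace

public static class SinkSetupSketch // hypothetical helper
{
    public static Task DefineSinkAsync(
        IDocumentStore sinkStore,
        X509Certificate2 pullCert,
        string connectionStringName, // name of an existing connection string
        string certPassword)         // hypothetical password protecting the exported PFX
    {
        return sinkStore.Maintenance.SendAsync(
            new UpdatePullReplicationAsSinkOperation(new PullReplicationAsSink
            {
                ConnectionStringName = connectionStringName,
                Mode = PullReplicationMode.HubToSink,
                HubName = "Hub1_Bidirectional",
                AccessName = "Access1",
                // Export the certificate with its private key, protected by the password
                CertificateWithPrivateKey = Convert.ToBase64String(
                    pullCert.Export(X509ContentType.Pfx, certPassword)),
                CertificatePassword = certPassword
            }));
    }
}
```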

Defining a Connection String

The Sink needs a connection string to locate the Hub task it connects to.

Use PutConnectionStringOperation to define a connection string,
and configure it using a RavenConnectionString class.

await storeA.Maintenance.SendAsync(
    new PutConnectionStringOperation<RavenConnectionString>(new RavenConnectionString
    {
        Database = dbNameB,
        Name = dbName + "_ConStr",
        TopologyDiscoveryUrls = store.Urls
    }));

Learn about Connection Strings here.

Usage Sample

// Load a certificate (including its private key) from a PFX file
var pullCert = new X509Certificate2("/path/to/cert.pfx", 
    (string)null, X509KeyStorageFlags.Exportable);

// Define a Hub task
await store.Maintenance.SendAsync(new PutPullReplicationAsHubOperation(
    new PullReplicationDefinition
    {
        Name = "Hub1_SinkToHub_Filtered",
        Mode = PullReplicationMode.SinkToHub | PullReplicationMode.HubToSink,
        WithFiltering = true
    }));

// Define Hub access
await store.Maintenance.SendAsync(new RegisterReplicationHubAccessOperation(
    "Hub1_SinkToHub_Filtered", new ReplicationHubAccess
    {
        Name = "Access1",
        AllowedSinkToHubPaths = new[]
        {
            "products/*",
            "orders/*"
        },
        
        // The public portion of the certificate, in base 64
        CertificateBase64 = Convert.ToBase64String(pullCert.Export(X509ContentType.Cert))
    }));

// Define a Connection String
await store.Maintenance.SendAsync(
    new PutConnectionStringOperation<RavenConnectionString>(new RavenConnectionString
    {
        Database = dbNameB,
        Name = dbNameB + "_ConStr",
        TopologyDiscoveryUrls = store.Urls
    }));

// Define a Sink task
await store.Maintenance.SendAsync(
    new UpdatePullReplicationAsSinkOperation(new PullReplicationAsSink
    {
        ConnectionStringName = dbNameB + "_ConStr",
        Mode = PullReplicationMode.SinkToHub,
        CertificateWithPrivateKey = Convert.ToBase64String(pullCert.Export(X509ContentType.Pfx)),
        HubName = "Hub1_SinkToHub_Filtered"
    }));

Failover

Since the Sink task always initiates the replication, it is also the Sink's responsibility to reconnect on network failure.


Hub Failure

As part of the connection handshake, the Sink fetches an ordered list of nodes from the Hub cluster. If a preferred node is defined (by explicitly selecting a mentor node), it will be at the top of this list.
The Sink will try to connect to the first node in the list, and proceed down the list with every failed attempt.
If the connection fails with all nodes, the Sink will request the list again.
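
The retry order described above can be sketched as a small, self-contained simulation. It does not use the RavenDB client. All names are hypothetical, node connectivity is faked with a delegate, and the `maxRounds` limit exists only so the example terminates; the text above does not describe such a limit.

```csharp
using System;
using System.Collections.Generic;

public static class SinkFailoverSketch
{
    // Walk the ordered node list and return the first node that accepts
    // the connection, or null if every node fails.
    public static string ConnectToFirstAvailable(
        IReadOnlyList<string> orderedNodes, Func<string, bool> tryConnect)
    {
        foreach (var node in orderedNodes)
        {
            if (tryConnect(node))
                return node;
        }

        return null;
    }

    // When all nodes fail, request the list again and start over.
    // maxRounds is a hypothetical cap that keeps this sketch finite.
    public static string Connect(
        Func<IReadOnlyList<string>> fetchOrderedNodes,
        Func<string, bool> tryConnect,
        int maxRounds)
    {
        for (var round = 0; round < maxRounds; round++)
        {
            var nodes = fetchOrderedNodes(); // preferred (mentor) node comes first
            var connected = ConnectToFirstAvailable(nodes, tryConnect);
            if (connected != null)
                return connected;
        }

        return null;
    }

    public static void Main()
    {
        var nodes = new[] { "A", "B", "C" };    // "A" stands in for the preferred node
        var down = new HashSet<string> { "A" }; // simulate a failure of node A

        var result = Connect(() => nodes, n => !down.Contains(n), maxRounds: 3);
        Console.WriteLine(result); // the Sink falls back to node B
    }
}
```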


Sink Failure

If the failure occurs on the Sink node, the Sink cluster will select a different node for the job.

Backward Compatibility

RavenDB versions earlier than 5.1 support Pull Replication, which allows you to define Hub and Sink tasks and replicate data from Hub to Sink.

From RavenDB 5.1 onward, Pull Replication is replaced by Hub/Sink Replication, which provides everything Pull Replication does and adds Sink-to-Hub replication and Replication Filtering.

  • Pull Replication tasks defined on a RavenDB version earlier than 5.1 remain operative after you upgrade to version 5.1 or later.

  • A Hub or Sink task running on a RavenDB version earlier than 5.1 can connect to a Hub or Sink defined on RavenDB 5.1 or later.
    You do not need to upgrade the task's instance to keep the task operative.

Upgrade from a version earlier than 5.1 if you want to use the features that Hub/Sink Replication adds, i.e. Sink-to-Hub replication and Replication Filtering.