Highly Available Tasks



License

Please see available license types and check your own license to verify whether the Highly Available Tasks feature is activated in your database.

  • If your license provides highly available tasks, the responsibilities of a failed cluster node are automatically reassigned to another available node.

    Supported tasks include:

    • Backup
    • Data subscription
    • All ETL types
  • If your license does not provide highly available tasks, the responsibilities of a failed node will be resumed when the node returns online.

  • The scenarios below demonstrate the behavior of a system licensed for highly available tasks.

Constraints

  1. A task is defined per Database Group.

  2. A task is executed by a single Database Node only.
    The Backup Task is an exception in the case of a cluster partition, see Backup Task - When Cluster or Node are Down.

  3. A Database Node can be assigned many tasks.

  4. A node must be in a Member state in the Database Group in order to perform a task.

  5. The cluster must be in a functional state.

Responsible Node

  • The Responsible Node is the node that is responsible for performing a specific Ongoing Task.

  • Each node checks whether it is the Responsible Node for the task by executing a local function that is based on the
    unique hash value of the task and the current Database Topology.

  • Since the Database Topology is eventually consistent across the cluster,
    the nodes will eventually agree on a single Responsible Node, which satisfies the constraints above.
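The local check described above can be pictured with a small sketch. This is not RavenDB's actual algorithm: the FNV-1a-style hash, the modulo selection, and the names ResponsibleNodeSketch, StableHash, PickResponsibleNode, and IsResponsible are all assumptions made for illustration. It only shows that a deterministic function of the task's hash and the topology gives the same answer on every node that sees the same topology.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative sketch only; not RavenDB's implementation.
public static class ResponsibleNodeSketch
{
    // Deterministic FNV-1a-style hash over UTF-16 code units, so every node
    // computes the same value (string.GetHashCode may differ between processes).
    public static ulong StableHash(string text)
    {
        ulong hash = 14695981039346656037UL;
        foreach (char c in text)
        {
            hash ^= c;
            hash *= 1099511628211UL;
        }
        return hash;
    }

    // Picks exactly one node out of the nodes that are in a Member state.
    public static string PickResponsibleNode(string taskName, IEnumerable<string> members)
    {
        // Sort the tags so the result does not depend on enumeration order.
        List<string> ordered = members.OrderBy(tag => tag, StringComparer.Ordinal).ToList();
        if (ordered.Count == 0)
            throw new InvalidOperationException("No Member node is available.");
        int index = (int)(StableHash(taskName) % (ulong)ordered.Count);
        return ordered[index];
    }

    // The local check each node runs: "am I the Responsible Node for this task?"
    public static bool IsResponsible(string localNodeTag, string taskName, IEnumerable<string> members)
    {
        return PickResponsibleNode(taskName, members) == localNodeTag;
    }

    public static void Main()
    {
        string[] members = { "A", "B", "E" };
        foreach (string node in members)
            Console.WriteLine(node + " responsible: " + IsResponsible(node, "Employees ETL", members));
    }
}
```

Because every node evaluates the same function over the same inputs, exactly one node answers "yes" once all nodes have received the same topology.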

Additional Reading

Learn more about the relations and states of database nodes.

Tasks Relocation

Upon a Database Topology change, all existing tasks will be re-evaluated and re-distributed among the functional nodes.

For example:

Let's assume that we have a 5-node cluster [A, B, C, D, E] with a database on [A, B, E] and a task on node B.

Node B has network issues and is separated from the cluster, so nodes [A, C, D, E] are on one side and node [B] is on the other.

The Cluster Observer will note that it can't reach node B and issue a Raft Command in order to move node B to a Rehab state.

Once this change has propagated, it will trigger a re-assessment of all tasks on all reachable nodes.
In our example, the task will move to either A or E.

Meanwhile, node B, which has no communication with the Cluster Leader,
moves itself to a Candidate state and removes all its tasks.
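The scenario above can be replayed in a few lines. This is a simplified sketch, not RavenDB code: the NodeState enum, the RelocationSketch class, and the character-sum hash are hypothetical stand-ins, chosen so that the task named "Backup" starts on node B as in the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified node states for the sketch.
public enum NodeState { Member, Promotable, Rehab }

public static class RelocationSketch
{
    // Hypothetical stand-in for the real selection function:
    // only nodes in a Member state are eligible to run the task.
    public static string Reassign(string taskName, IDictionary<string, NodeState> databaseGroup)
    {
        List<string> members = databaseGroup
            .Where(kv => kv.Value == NodeState.Member)
            .Select(kv => kv.Key)
            .OrderBy(tag => tag, StringComparer.Ordinal)
            .ToList();
        if (members.Count == 0)
            throw new InvalidOperationException("No Member node can run the task.");
        int hash = taskName.Sum(c => (int)c); // toy deterministic hash
        return members[hash % members.Count];
    }

    public static void Main()
    {
        var group = new Dictionary<string, NodeState>
        {
            ["A"] = NodeState.Member,
            ["B"] = NodeState.Member,
            ["E"] = NodeState.Member
        };
        Console.WriteLine("Before: " + Reassign("Backup", group)); // prints "Before: B"

        // The Cluster Observer cannot reach B and moves it to Rehab,
        // which triggers a re-evaluation on the reachable nodes.
        group["B"] = NodeState.Rehab;
        Console.WriteLine("After:  " + Reassign("Backup", group)); // prints "After:  A"
    }
}
```

With this toy hash the task lands on A after the change; with a different hash it could just as well land on E. The only guarantee the sketch illustrates is that a node in Rehab is never chosen.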

Pinning a Task

It is sometimes preferable to prevent a task from failing over to a different responsible node.
One example is a heavy-duty backup task, which is better left in the continuous care of its original node than reassigned in the middle of a backup operation.
Another example is an ETL task that transfers artificial documents. In this case, a reassigned task might skip some of the artificial documents that were created on the original node.

The failover of a task to another responsible node can be prevented by pinning the task to a mentor node.

  • A pinned task is handled only by the node it is pinned to, as long as that node is a database group member.
  • If the node the task is pinned to fails, the task is not executed until the node is back online.
    When the node comes back online, the task resumes from the point of failure.
  • If the node remains offline for the period set by Cluster.TimeBeforeAddingReplicaInSec, the Cluster Observer will attempt to select an available node to replace it in the database group, and will redistribute the failed node's tasks, including pinned ones, among the database group members.
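The three rules above can be summarized as a small decision function. This sketch reflects one reading of those rules and is not RavenDB's implementation: GroupNodeState, PinnedTaskSketch, and ResolveRunner are hypothetical names, an empty string stands for "nobody runs the task for now", and the fallback simply picks the first Member by tag.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified states of a node that is still listed in the database group.
public enum GroupNodeState { Member, Rehab }

public static class PinnedTaskSketch
{
    // Returns the tag of the node that should run the pinned task,
    // or "" when the task has to wait for its pinned node.
    public static string ResolveRunner(string pinnedTo, IDictionary<string, GroupNodeState> group)
    {
        if (group.TryGetValue(pinnedTo, out GroupNodeState state))
        {
            // The pinned node is still in the group:
            // run there if it is healthy, otherwise wait for it to return.
            return state == GroupNodeState.Member ? pinnedTo : "";
        }

        // The pinned node was replaced in the group after the timeout:
        // the task is redistributed like any other task.
        return group
            .Where(kv => kv.Value == GroupNodeState.Member)
            .Select(kv => kv.Key)
            .OrderBy(tag => tag, StringComparer.Ordinal)
            .FirstOrDefault() ?? "";
    }

    public static void Main()
    {
        var group = new Dictionary<string, GroupNodeState>
        {
            ["A"] = GroupNodeState.Member,
            ["B"] = GroupNodeState.Member,
            ["E"] = GroupNodeState.Member
        };
        Console.WriteLine("Healthy:  '" + ResolveRunner("B", group) + "'");

        group["B"] = GroupNodeState.Rehab;   // B is down, the task waits
        Console.WriteLine("Down:     '" + ResolveRunner("B", group) + "'");

        group.Remove("B");                   // B was replaced by node C
        group["C"] = GroupNodeState.Member;
        Console.WriteLine("Replaced: '" + ResolveRunner("B", group) + "'");
    }
}
```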

A task can be pinned to a selected node via Studio or using code.

Pinning via Studio

Pinning an ETL Task Using Studio


Pinning using code

To pin a task to the node that runs it, set the task's PinToMentorNode configuration option to true.
In the following example, a RavenDB ETL task is pinned.

// Assumes: using Raven.Client.Documents.Operations.ETL;
AddEtlOperation<RavenConnectionString> operation = new AddEtlOperation<RavenConnectionString>(
    new RavenEtlConfiguration
    {
        ConnectionStringName = "raven-connection-string-name",
        Name = "Employees ETL",
        Transforms =
        {
            new Transformation
            {
                Name = "Script #1",
                Collections =
                {
                    "Employees"
                },
                Script = @"loadToEmployees ({
                        Name: this.FirstName + ' ' + this.LastName,
                        Title: this.Title
                });"
            }
        },

        // Pin the task to prevent failover to another node
        PinToMentorNode = true

    });

// `store` is assumed to be an initialized IDocumentStore
AddEtlOperationResult result = store.Maintenance.Send(operation);