Distributed Database
-
In RavenDB, a database can be replicated across multiple nodes, depending on its Replication Factor.
-
A node where a database resides is referred to as a
Database Node
.
The group of Database Nodes that assemble the distributed database, is called aDatabase Group
. -
Each Database Node has a full copy of the database, it contains all the database documents and it indexes them locally.
This greatly simplifies executing query requests, since there is no need to orchestrate an aggregation of data from various nodes. -
In this page:
The Database Record
Each database instance keeps its configuration (e.g. Index definitions, Database topology) in a Database Record object.
Upon database creation, this object is passed through Rachis to all nodes in the cluster.
After that, each node updates its own database record
on its own upon any new Raft command received,
i.e. when an index has changed.
Replication Factor
When creating a database it is possible to specify the exact nodes for the Database Group
, or just the number of nodes needed.
This will implicitly set the Replication Factor
to the specified amount of nodes.
It is possible to pass the Replication Factor
explicitly and let RavenDB choose the nodes on which the database will reside.
Either way, once the database is created by getting a Consensus,
the Cluster Observer begins monitoring the Database Group in order to maintain the Replication Factor
.
Database Topology
The Database Topology
describes the relations inside the Database Group
between the Database Nodes
.
Each Database Node
can be in one of the following states:
State | Description |
---|---|
Member | Fully updated and functional database node. |
Promotable | A node that has been recently added to the group and is being updated. |
Rehab | A former Member node that is assumed to be not up-to-date due to a partition. |
States Flow
In general, all nodes in a newly created database are in a Member
state.
When adding a new Database Node
to an already existing database group, a Mentor Node
is selected in order to update it.
The new node will be in a Promotable
state until it receives and indexes all the documents from the mentor node.
Nodes Order
The database topology is kept in a list that is always ordered with Member
nodes first, then Rehabs
and Promotables
are last.
The order is important since it defines the client's order of access into the Database Group
, (see Load balancing client requests).
The order can be changed with the Client-API
or via the Studio.
Replication
All Members
have master-master Replication in order to keep the documents in sync across the nodes.
Dynamic Database Distribution
If any of the Database Nodes
is down or partitioned, the Cluster Observer will recognize it and act as following:
-
If Cluster.TimeBeforeMovingToRehabInSec (default: 60 seconds) time has passed and the node is still unreachable,
the node will be moved to aRehab
state. -
If the node is for Cluster.TimeBeforeAddingReplicaInSec (default: 900 seconds) still in
Rehab
,
a new database node will be automatically added to the database group to replace theRehab
node. -
If the
Rehab
node is online again, it will be assigned with a Mentor Node to update him with the recent changes. -
The first node to be up-to-date stays, while the other is deleted.
Deletion
The Rehab
node is actually deleted only when it is re-connected to the cluster,
and only after it has finished sending all its new documents that it may have (while it was disconnected) to the other nodes in the Database Group.
The Dynamic Database Distribution feature can be toggled on and off with the following request:
URL | Method | URL Params |
---|---|---|
/admin/databases/dynamic-node-distribution | POST |
name=[database-name ], enable=[bool ] |
Sharding
Currently, RavenDB 4.x doesn't offer sharding as an out of the box solution.
It is on the development roadmap to be implemented in a future version. Track it here.