Advanced Replication Topics
Replication Handshake Procedure
Whenever the replication process between two databases is initiated it has to determine the process state.
- The first message is a request to establish a TCP connection of type replication with a protocol version.
- The destination verifies that the protocol version matches and that the request is authorized.
- Once the source gets the OK message it queries the destination about the latest ETag it got from him.
- The destination sends back a heartbeat message with both the latest
ETag
he got from the source and the current Change Vector of the database. - The
ETag
is used as a starting point for the replication process but it is then been filtered by the destination's currentchange vector
, meaning we will skip documents with higherETag
and lowerChange Vector
, this is done to prevent the Ripple Effect.
Preventing the Ripple Effect
RavenDB Database Group is a fully connected graph of replication channels, meaning that if there are n
nodes in a Database Group
there are n*(n-1)
replication channels.
We wanted to prevent the case where inserting data into one database will cause the data to propagate multiple times through multiple paths to all the other nodes.
We have managed to do so by delaying the propagation of data coming from the replication logic itself.
If the sole source of incoming data is replication we will not replicate it right away, we will wait up to 15 seconds
before sending data.
This will allow the destination to inform us about his current change vector and most of the time the data will get filtered at the source.
On a stable system the steady state will have a Spanning Tree
of replication channels, n-1
of the fastest channels available that are doing the actual work and the rest are just sending heartbeats.