Transactional behavior with RavenDB is divided into two modes: single requests and multiple requests.
In the single-request mode, a user can perform all requested operations (read and/or write) in a single request.
A batch of multiple write operations is executed atomically in a single transaction by calling
SaveChanges(), which generates a single HTTP request to the database.
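The all-or-nothing behavior of a batch can be illustrated with a small in-memory simulation. This is only a sketch: InMemoryStore, Session, and their methods are hypothetical names and not the RavenDB client API, and treating the deletion of a missing document as an error is an assumption made purely to show a failing batch.

```python
import copy


class InMemoryStore:
    """Hypothetical stand-in for a database (not a RavenDB API)."""

    def __init__(self):
        self.documents = {}

    def apply_batch(self, commands):
        # Work on a copy so that a failing command leaves the store untouched.
        staged = copy.deepcopy(self.documents)
        for op, doc_id, body in commands:
            if op == "put":
                staged[doc_id] = body
            elif op == "delete":
                # Assumption of this sketch: deleting a missing document fails.
                if doc_id not in staged:
                    raise KeyError(f"cannot delete missing document {doc_id}")
                del staged[doc_id]
            else:
                raise ValueError(f"unknown operation {op}")
        # Every command succeeded: publish the staged state in one step.
        self.documents = staged


class Session:
    """Collects writes locally and submits them as one batch."""

    def __init__(self, store):
        self.store = store
        self.pending = []

    def store_document(self, doc_id, body):
        self.pending.append(("put", doc_id, body))

    def delete(self, doc_id):
        self.pending.append(("delete", doc_id, None))

    def save_changes(self):
        # A single call stands in for the single HTTP request.
        batch, self.pending = self.pending, []
        self.store.apply_batch(batch)


store = InMemoryStore()
session = Session(store)
session.store_document("users/1", {"name": "Ada"})
session.store_document("users/2", {"name": "Grace"})
session.save_changes()  # both documents become visible together
```

Nothing is written until save_changes is called, and a batch that fails part-way leaves the store exactly as it was.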
Multiple reads & writes:
Interleaved reads and writes or conditional execution can be achieved by running a patching script.
In the script you can read documents, make decisions based on their content and update or put document(s) within the scope of a single transaction.
If you only need to modify a document in a transaction, JSON Patch syntax allows you to do that.
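The read, decide, and update flow described above can be sketched as a Python simulation. This is not an actual RavenDB patching script: run_script, transfer, and the account documents are hypothetical, and the staged copy is only one simple way to model "everything happens in one transaction".

```python
import copy


def run_script(documents, script):
    """Run `script` against a staged copy; commit only if it returns normally."""
    staged = copy.deepcopy(documents)
    script(staged)  # any exception here aborts before the commit below
    documents.clear()
    documents.update(staged)


def transfer(amount, source_id, target_id):
    """Build a script that moves `amount` between two account documents."""

    def script(docs):
        source = docs[source_id]            # read
        if source["balance"] < amount:      # decide based on content
            raise ValueError("insufficient funds")
        source["balance"] -= amount         # update the first document
        docs[target_id]["balance"] += amount  # update the second document

    return script


accounts = {
    "accounts/1": {"balance": 100},
    "accounts/2": {"balance": 20},
}
run_script(accounts, transfer(30, "accounts/1", "accounts/2"))
```

Because the script runs against a staged copy, a failed condition leaves both documents unchanged.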
Across multiple requests, RavenDB does not support a single transaction that spans all of the requested operations.
Instead, users are expected to utilize optimistic concurrency to achieve similar behavior.
Your changes will get committed only if no one else has changed the data you are modifying in the meantime.
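A minimal sketch of the optimistic concurrency idea, assuming each document carries a version that changes on every write. VersionedStore, ConcurrencyError, and the integer version are hypothetical; the integer simply stands in for whatever the server uses to detect that a document has changed.

```python
class ConcurrencyError(Exception):
    """Raised when a document changed after it was read (hypothetical name)."""


class VersionedStore:
    def __init__(self):
        self._docs = {}  # doc_id -> (version, body)

    def load(self, doc_id):
        version, body = self._docs.get(doc_id, (0, None))
        body_copy = dict(body) if body is not None else None
        return version, body_copy

    def save(self, doc_id, body, expected_version):
        current_version, _ = self._docs.get(doc_id, (0, None))
        # Commit only if nobody has written since this caller read the document.
        if current_version != expected_version:
            raise ConcurrencyError(
                f"{doc_id}: expected version {expected_version}, "
                f"found {current_version}"
            )
        self._docs[doc_id] = (current_version + 1, dict(body))
        return current_version + 1


store = VersionedStore()
store.save("orders/1", {"status": "new"}, expected_version=0)

# Two users read the same version of the document.
version_a, order_a = store.load("orders/1")
version_b, order_b = store.load("orders/1")

order_a["status"] = "paid"
store.save("orders/1", order_a, expected_version=version_a)  # succeeds

order_b["status"] = "cancelled"
try:
    store.save("orders/1", order_b, expected_version=version_b)
except ConcurrencyError:
    # The second writer must reload and retry instead of overwriting.
    rejected = True
```

The second writer is rejected rather than silently overwriting the first, which is the behavior that replaces a long-running interactive transaction.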
No support for interactive transactions
RavenDB client uses HTTP to communicate with the RavenDB server.
This means that RavenDB doesn't allow you to open a transaction on the server side, perform multiple operations over a network connection, and then commit or roll it back.
This model, known as the interactive transactions model, is incredibly costly, both in terms of engine complexity and in its impact on the overall performance of the system.
In one study the cost of managing the transaction state across multiple network operations was measured at over 40% of the total system performance.
This is because the server needs to maintain locks and state across potentially very large time frames.
RavenDB's approach differs from the classical SQL model, which relies on interactive transactions. Instead, RavenDB uses the batch transaction model. In conjunction with optimistic concurrency, it allows us to provide the same capabilities as interactive transactions,
with much better performance.
Key to that design decision is our ability to provide similar guarantees about the state of your data without experiencing the overhead of interactive transactions.
Batch transaction model
RavenDB uses the batch transaction model, where a RavenDB client submits all the operations to be run in a single transaction in one network call.
This allows the storage engine inside RavenDB to avoid holding locks for an extended period of time and gives plenty of room to optimize the performance.
This decision is based on the typical interaction pattern by which RavenDB is used.
RavenDB serves as a transactional system of record for business applications, where the common workflow involves presenting data to users,
allowing them to make modifications, and subsequently save these changes.
A single request loads the data which is then presented to the user.
After a period of contemplation or "think time," the user submits a set of updates, which are then saved to the database.
This model fits the batch transaction model a lot more closely than the interactive one, as there's no necessity to keep a transaction open during the user's "think time."
All changes that are sent via SaveChanges are persisted in a single unit.
If you modify documents concurrently and want to ensure they won't be affected by the lost update problem,
then you must enable optimistic concurrency (turned off by default) across all sessions that modify those documents.
RavenDB employs the multi-master model, allowing writes to be made to any node in the cluster.
These writes are then propagated asynchronously to the other nodes via replication.
The interaction of transactions and distributed work is anything but trivial. Let's start from the obvious problem:
- RavenDB allows you to perform concurrent write operations on multiple nodes.
- RavenDB explicitly allows you to write to a node that was partitioned from the rest of the network.
Taken together, this appears to violate the CAP theorem,
which states that a system can provide only 2 out of 3 guarantees around consistency, availability, and partition tolerance.
RavenDB's answer to distributed transactional work is nuanced and was designed to give you as the user the choice
so you can utilize RavenDB for each of your scenarios:
- Single-node operations are available and partition tolerant (AP) but cannot meet the consistency guarantee.
- If you need to guarantee uniqueness or replicate the data for redundancy across more than one node,
you can choose to have higher consistency at the cost of availability (CP).
When running in a multi-node setup, RavenDB still uses transactions. However, they are single-node transactions.
That means that the set of changes that you write in a transaction is committed only to the node you are writing to.
It will then asynchronously replicate to the other nodes.
To achieve consistency across the entire cluster, please refer to the Cluster-wide transactions section below.
This is an important observation because you can get into situations where two clients wrote (even with optimistic concurrency turned on)
to the same document and both of them committed successfully (each one to a separate node).
RavenDB attempts to minimize this situation by designating a preferred node for writes for each database,
but since writing to the preferred node isn't guaranteed, this might not alleviate the issue.
In such a case, the data will replicate across the cluster, and RavenDB will detect that there were conflicting modifications to the document.
It will then apply the conflict resolution strategy that you have chosen.
That can include selecting a manual resolution, running a resolution script to reconcile the conflicting versions,
or simply selecting the latest version. You are in control of this behavior.
This behavior was influenced by the Dynamo paper, which emphasizes the importance of writes.
The assumption is that if you are writing data to the database, you expect it to be persisted.
RavenDB will do its utmost to provide that to you, allowing you to write to the database even in the case of partitions or partial failure states.
However, handling replication conflicts is a consideration you have to take into account when using single-node transactions in RavenDB (see below for running a cluster-wide transaction).
If no conflict resolution script is defined for a collection, then by default RavenDB resolves the conflict using the latest version based on the
@last-modified property of conflicted versions of the document.
That might result in the lost update anomaly.
If you care about avoiding lost updates, you need to ensure you have the conflict resolution script defined accordingly or use a cluster-wide transaction.
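The lost update produced by latest-version resolution can be seen in a short sketch. resolve_latest is a hypothetical helper, each conflicting version is modeled as a plain dictionary with an @last-modified key holding a datetime, and the stock example is invented for illustration.

```python
from datetime import datetime, timezone


def resolve_latest(versions):
    """Pick the conflicting version with the newest @last-modified value."""
    return max(versions, key=lambda doc: doc["@last-modified"])


# Both nodes started from stock == 10 and each sold some units.
from_node_a = {
    "stock": 9,  # node A sold one unit
    "@last-modified": datetime(2024, 1, 1, 10, 0, 0, tzinfo=timezone.utc),
}
from_node_b = {
    "stock": 8,  # node B sold two units, slightly later
    "@last-modified": datetime(2024, 1, 1, 10, 0, 5, tzinfo=timezone.utc),
}

resolved = resolve_latest([from_node_a, from_node_b])
# resolved["stock"] is 8: node A's sale is gone, although applying both
# sales to the original value would have produced 7.
```

A resolution script that merges the two versions (here, by combining both decrements) is one way to avoid losing the earlier write.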
Replication & transaction boundary
The following is an important aspect of RavenDB's transactional behavior with regard to asynchronous replication.
When replicating modifications to another server, RavenDB will ensure that the transaction boundaries are maintained.
If there are several document modifications in the same transaction, they will be sent in the same replication batch, keeping the transaction boundary on the destination as well.
However, special attention is needed when a document is modified in two separate transactions but the first transaction has not been replicated yet.
Read more about that in How revisions replication help data consistency.
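The idea of keeping a transaction inside one replication batch can be sketched as follows. This is an illustration under stated assumptions, not RavenDB's replication code: build_batches, the (transaction id, document id) tuples, and the max_batch_size limit are all hypothetical, and changes from the same transaction are assumed to be contiguous in the input.

```python
from itertools import groupby


def build_batches(changes, max_batch_size):
    """Split ordered changes into batches without splitting a transaction.

    `changes` is a list of (tx_id, doc_id) tuples in commit order.
    A transaction larger than `max_batch_size` is sent as its own
    oversized batch rather than being cut in two.
    """
    batches, current = [], []
    for _tx_id, group in groupby(changes, key=lambda change: change[0]):
        tx_changes = list(group)
        # Close the current batch if this transaction would not fit whole.
        if current and len(current) + len(tx_changes) > max_batch_size:
            batches.append(current)
            current = []
        current.extend(tx_changes)
    if current:
        batches.append(current)
    return batches


changes = [
    ("tx1", "orders/1"), ("tx1", "customers/1"),
    ("tx2", "orders/2"),
    ("tx3", "orders/3"), ("tx3", "customers/3"),
]
batches = build_batches(changes, max_batch_size=3)
```

With a limit of 3, the first batch carries tx1 and tx2 and the second carries tx3, so the destination never sees half of a transaction.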
RavenDB also supports cluster-wide transactions.
This feature modifies the way RavenDB commits a transaction, and it is meant to address scenarios where you prefer to get a failure if the transaction cannot be persisted to a majority of the nodes in the cluster.
In other words, this feature is for scenarios where you want to favor consistency over availability.
For cluster-wide transactions, RavenDB uses the Raft protocol.
This protocol ensures that the transaction is acknowledged by a majority of the nodes in the cluster and once committed, the changes will be visible on any node that you'll use henceforth.
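The majority requirement can be expressed in a few lines. This sketch shows only the quorum arithmetic, not the Raft protocol itself; majority and can_commit are hypothetical helper names.

```python
def majority(cluster_size):
    """Smallest number of nodes that forms a majority of the cluster."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def can_commit(acknowledgements, cluster_size):
    """A cluster-wide transaction commits only with a majority of acks."""
    return acknowledgements >= majority(cluster_size)


# In a 3-node cluster, 2 acknowledgements are enough; 1 is not.
three_node_ok = can_commit(2, 3)
three_node_partitioned = can_commit(1, 3)
```

This is why a node that is cut off from the rest of the cluster can still accept single-node writes but cannot complete a cluster-wide transaction.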
Similar to single-node transactions, RavenDB requires that you submit the cluster-wide transaction as a single request of all the changes you want to commit to the database.
Cluster-wide transactions have the notion of atomic guards to prevent an overwrite of a document modified in a cluster transaction by a change made in another cluster transaction.
The usage of atomic guards makes cluster-wide transactions conflict-free.
There is no way to create a conflict between two versions of the same document.
If the document has been updated by someone else in the meantime, a
ConcurrencyException will be thrown.