RavenDB version 2.5.

Transaction support in RavenDB

In RavenDB all actions performed on documents are fully ACID (Atomicity, Consistency, Isolation, Durability).

The indexing mechanism is built on a BASE model (Basically Available, Soft state, Eventual consistency).

RavenDB can also perform distributed transactions by implementing the two-phase commit (2PC) protocol.

Standard transactions

RavenDB has built-in transaction support. The storage engine that RavenDB uses under the hood (Esent) supports the ACID transactional model.

ACID for document operations

Each document operation, or each batch of operations applied to a set of documents and sent in a single HTTP request, executes in a single transaction. The ACID properties of RavenDB for standard transactions are:

  • Atomicity - All operations are atomic. Each one either succeeds completely or fails completely; there is no partial result. In particular, operations on multiple documents are applied all together or not at all.

  • Consistency and Isolation / Consistency of Scans - In a single transaction, all operations operate under snapshot isolation. Even if you access multiple documents, you'll get all of their state as it was at the beginning of the request.

  • Visibility - All changes are made visible immediately on commit. Thus, if a transaction is committed after updating two docs, you'll always see the updates to those two docs at the same time. (That is, you either see the updates to both, or you don't see the update to either one.)

  • Durability - If an operation has completed successfully, it was fsync'ed to disk. Reads will never return any data that hasn't been flushed to disk.

All of these guarantees hold when you call SaveChanges or perform any other action that creates an HTTP request, for example through DatabaseCommands.
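A minimal sketch of a standard transaction, assuming a server at http://localhost:8080 and a hypothetical BlogPost class (neither is taken from this page). Both documents are sent by a single SaveChanges call, that is, in one HTTP request, so they are stored together or not at all.

```csharp
using Raven.Client;
using Raven.Client.Document;

// Hypothetical entity, used only for illustration.
public class BlogPost
{
    public string Id { get; set; }
    public string Title { get; set; }
}

public static class SingleRequestTransaction
{
    public static void Main()
    {
        // The URL is an assumption; point it at your own server.
        using (IDocumentStore store = new DocumentStore { Url = "http://localhost:8080" }.Initialize())
        using (IDocumentSession session = store.OpenSession())
        {
            // Store only registers the documents with the session; nothing is sent yet.
            session.Store(new BlogPost { Id = "blogs/1", Title = "First" });
            session.Store(new BlogPost { Id = "blogs/2", Title = "Second" });

            // One HTTP request, one transaction: both documents are saved, or neither is.
            session.SaveChanges();
        }
    }
}
```

This needs a running RavenDB server and the RavenDB client library, so treat it as an outline rather than a ready-made program.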

BASE for query operations

The transaction model is different when indexes are involved, because indexes are BASE, not ACID. The following properties apply to query operations:

  • Basically Available - Index query results are always available, but they might be stale.

  • Soft state - The state of the system can change over time, because indexing takes some time to complete. Indexing is incremental: the fewer documents that remain to be indexed, the more accurate the index results are.

  • Eventual consistency - The database will eventually become consistent once it stops receiving new documents and the indexing operation finishes.
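The sketch below shows one way a client might deal with stale results. It assumes the same hypothetical BlogPost class and server URL as an illustration, and uses the client's query statistics and the WaitForNonStaleResults customization. The first query accepts whatever the index currently holds and checks whether it was stale; the second asks the server to wait for indexing to catch up before answering.

```csharp
using System;
using System.Linq;
using Raven.Client;
using Raven.Client.Document;

// Hypothetical entity, used only for illustration.
public class BlogPost
{
    public string Id { get; set; }
    public string Title { get; set; }
}

public static class StaleQueryExample
{
    public static void Main()
    {
        // The URL is an assumption; point it at your own server.
        using (IDocumentStore store = new DocumentStore { Url = "http://localhost:8080" }.Initialize())
        using (IDocumentSession session = store.OpenSession())
        {
            // Basically available: the query returns at once, possibly with stale results.
            RavenQueryStatistics stats;
            var posts = session.Query<BlogPost>()
                .Statistics(out stats)
                .Where(p => p.Title == "Some new title")
                .ToList();

            if (stats.IsStale)
            {
                Console.WriteLine("Index is still catching up; got {0} possibly stale results.", posts.Count);
            }

            // Trade latency for accuracy: wait until the index is no longer stale.
            var fresh = session.Query<BlogPost>()
                .Customize(x => x.WaitForNonStaleResults())
                .Where(p => p.Title == "Some new title")
                .ToList();

            Console.WriteLine("Non-stale result count: {0}", fresh.Count);
        }
    }
}
```

Waiting on every query gives up the availability that the BASE model provides, so it is usually reserved for the few places that really need up-to-date results.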

DTC transactions

Sometimes we need multiple calls to SaveChanges for one reason or another, but we want those calls to be contained within a single atomic operation. RavenDB supports System.Transactions for multiple operations against a RavenDB server, or even against multiple RavenDB databases and servers (distributed transactions).

The client code for this is as simple as:

    // Requires: using System.Transactions;
    // `session` is an IDocumentSession that has already been opened.
    using (var transaction = new TransactionScope())
    {
        BlogPost entity = session.Load<BlogPost>("blogs/1");

        entity.Title = "Some new title";

        session.SaveChanges(); // will create HTTP request

        session.Delete(entity);

        session.SaveChanges(); // will create HTTP request

        transaction.Complete(); // marks the scope as complete; the commit runs when the scope is disposed
    }

As you can see, a DTC transaction can span multiple requests. If any of this code fails at any point, none of the changes will be applied to the RavenDB document store.
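The same pattern can cover more than one database. The following is a sketch only: the database names "Blogs" and "Archive", the server URL, and the BlogPost class are all assumptions made for illustration, and both databases must already exist on the server.

```csharp
using System.Transactions;
using Raven.Client;
using Raven.Client.Document;

// Hypothetical entity, used only for illustration.
public class BlogPost
{
    public string Id { get; set; }
    public string Title { get; set; }
}

public static class MultiDatabaseTransaction
{
    public static void Main()
    {
        // The URL and database names are assumptions.
        using (IDocumentStore store = new DocumentStore { Url = "http://localhost:8080" }.Initialize())
        using (IDocumentSession blogs = store.OpenSession("Blogs"))
        using (IDocumentSession archive = store.OpenSession("Archive"))
        using (var transaction = new TransactionScope())
        {
            blogs.Store(new BlogPost { Id = "blogs/1", Title = "Some new title" });
            blogs.SaveChanges();   // HTTP request to the first database

            archive.Store(new BlogPost { Id = "blogs/1", Title = "Some new title" });
            archive.SaveChanges(); // HTTP request to the second database

            // If an exception is thrown before this line, neither database keeps the change.
            transaction.Complete();
        }
    }
}
```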

RavenDB sends a transaction identifier and its timeout (the Raven-Transaction-Information header) along with every HTTP request made under this transaction scope, as shown here:

    POST /bulk_docs HTTP/1.1
    Raven-Transaction-Information: 975ee0bf-cac9-4b8e-ba29-377de722f037, 00:01:00
    Accept-Encoding: gzip
    Content-Type: application/json; charset=utf-8
    Host: 127.0.0.1:8081
    Content-Length: 300
    Expect: 100-continue

    [
      {
        "Key": "blogs/1",
        "Etag": null,
        "Method": "PUT",
        "Document": {
          "Title": "Some new title",
          "Category": null,
          "Content": null,
          "Comments": null
        },
        "Metadata": {
          "Raven-Entity-Name": "Blogs",
          "Raven-Clr-Type": "ConsoleApplication5.Blog, ConsoleApplication5",
          "@id": "blogs/1",
          "@etag": "00000000-0000-0200-0000-000000000002"
        }
      }
    ]
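To make the header format concrete, here is a small parsing sketch. TransactionInfo is a hypothetical helper, not part of the RavenDB client, and it assumes the header always consists of the two comma-separated parts seen in the sample above: an identifier followed by a timeout in hh:mm:ss form.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; not part of the RavenDB client API.
public sealed class TransactionInfo
{
    public string Id { get; private set; }
    public TimeSpan Timeout { get; private set; }

    // Parses a value such as "975ee0bf-cac9-4b8e-ba29-377de722f037, 00:01:00".
    public static TransactionInfo Parse(string headerValue)
    {
        if (headerValue == null)
            throw new ArgumentNullException("headerValue");

        string[] parts = headerValue.Split(',');
        if (parts.Length != 2)
            throw new FormatException("Expected '<transaction id>, <timeout>'.");

        return new TransactionInfo
        {
            Id = parts[0].Trim(),
            Timeout = TimeSpan.Parse(parts[1].Trim(), CultureInfo.InvariantCulture)
        };
    }
}
```

For the sample request, TransactionInfo.Parse would yield the id 975ee0bf-cac9-4b8e-ba29-377de722f037 and a timeout of one minute.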

Calling transaction.Complete() leads to separate requests to other HTTP endpoints, identified by the same transaction id. Under the 2PC implementation of the DTC protocol, a commit is a two-phase operation. Phase one, called Prepare, is the first request; the actual work is done here, but the transaction is not yet committed:

    POST /transaction/prepare?tx=975ee0bf-cac9-4b8e-ba29-377de722f037 HTTP/1.1
    Accept-Encoding: gzip
    Content-Type: application/json; charset=utf-8
    Host: 127.0.0.1:8081
    Content-Length: 0

If the Prepare phase succeeds, the transaction is then committed by sending this request:

    POST /transaction/commit?tx=975ee0bf-cac9-4b8e-ba29-377de722f037 HTTP/1.1
    Accept-Encoding: gzip
    Content-Type: application/json; charset=utf-8
    Host: 127.0.0.1:8081
    Content-Length: 0

All the intermediate states are durable between the requests of a DTC transaction, and any document that has been modified is locked against writes from other transactions. All other transactions will see the last committed state until the transaction is committed. Once the DTC transaction has been committed, standard transaction rules apply.
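The Prepare and Commit phases described above can be observed from the System.Transactions side with a small in-memory participant. This is a sketch of the generic mechanism only, not of RavenDB's implementation: RecordingResource and TwoPhaseCommitSketch are hypothetical names, and the resource merely records which callbacks the transaction manager invokes.

```csharp
using System;
using System.Collections.Generic;
using System.Transactions;

// Hypothetical in-memory participant; it only records which callbacks ran.
public sealed class RecordingResource : IEnlistmentNotification
{
    public readonly List<string> Log = new List<string>();

    // Phase one: do the work and vote on the outcome.
    public void Prepare(PreparingEnlistment preparingEnlistment)
    {
        Log.Add("prepare");
        preparingEnlistment.Prepared(); // vote "yes"
    }

    // Phase two: make the prepared work permanent.
    public void Commit(Enlistment enlistment)
    {
        Log.Add("commit");
        enlistment.Done();
    }

    // Called instead of Prepare/Commit when the scope is abandoned.
    public void Rollback(Enlistment enlistment)
    {
        Log.Add("rollback");
        enlistment.Done();
    }

    public void InDoubt(Enlistment enlistment)
    {
        Log.Add("in-doubt");
        enlistment.Done();
    }
}

public static class TwoPhaseCommitSketch
{
    // Runs one transaction scope and returns the callbacks the resource received.
    public static List<string> Run(bool complete)
    {
        var resource = new RecordingResource();
        using (var scope = new TransactionScope())
        {
            Transaction.Current.EnlistVolatile(resource, EnlistmentOptions.None);
            if (complete)
            {
                scope.Complete();
            }
        }
        return resource.Log;
    }
}
```

Calling TwoPhaseCommitSketch.Run(true) records prepare followed by commit, while Run(false) records only rollback, which mirrors the /transaction/prepare and /transaction/commit requests shown above.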

While RavenDB supports System.Transactions, you should use it only when you really need it (for example, to coordinate between multiple transactional resources), because System.Transactions and distributed transactions carry an additional cost compared with the standard API and the transactional SaveChanges.

Comments

Guz

Using a transaction, is it possible to fill a collection and block everyone else from querying this collection?

Ayende Rahien

No, it isn't. Collections are virtual concepts; they have no real existence and cannot be "locked". Until you commit the transaction, no one will see the documents that you have created in the transaction.

Prashanth Menon

"The implementation details of this are not important" ... I would hesitate excluding details from documentation, especially those that claim to be on advanced topics.

Do transactions work across shards? If so, how? The example makes it seem as though RavenDB supports full interactive transactions. How would these work with long-lived transactions? Please do provide all the gory details.

Thanks, Prashanth

Ayende Rahien

Implementation details are subject to change, the external behavior of the system is not. Indeed, we have changed the implementation details of supporting System.Transactions.

Transactions work across shards by having each node in the cluster participate as a resource. Long-lived transactions are supported, but note that there is also the transaction timeout for you to consider.

Prashanth Menon

I'm not sure I understand. Are shards mastered by a single server? Can a single transaction modify entities that reside on different shards mastered by different servers?

Ayende Rahien

No, there is no master shard. A single transaction can modify entries on multiple shards.

That isn't recommended, because of the inherent problems in DTC.

Prashanth Menon

You'll have to forgive me, I don't work in .NET and am unfamiliar with the nomenclature. It looks like you rely on Microsoft's DTC, which seems to be an implementation of 2PC. I'm also inferring that RavenDB does synchronous multi-master replication if it provides ACID semantics. A few questions:

  1. The FAQ mentions that though transactions may indicate a success, RavenDB does not provide read-your-writes without some additional parameters. Do you guarantee that when the system returns a successful commit, that it will be durable? How is this guarantee made, and if so, why is there a delay to begin with? What happens to updates to a key "A" that occur on a node for which a separate transaction that includes "A" isn't yet durable?
  2. How are conflicts detected between documents? Does RavenDB employ a form of MVCC?
  3. Have you measured your transaction performance against the likes of Oracle or DB2? Perhaps the TPC-C benchmark?

Ayende Rahien

Prashanth, Yes, we support DTC which is 2PC.

DTC does async commits; what we guarantee is that once the DTC TX is committed, the data is durable. For that matter, the data is durable even for multiple operations inside the same transaction; they just aren't visible to any other transaction.

Note that even without DTC, all document operations on RavenDB are fully ACID.

Our on-disk data store is ACID, and supports all the standard transactional semantics.

I can't answer the key "A" question, because I don't follow it.

Conflicts between docs do not really happen (unless you use replication, which is a separate thing). There is optimistic concurrency to control updates to the docs. If transaction 1 is updating doc A, and transaction 2 is also trying to update it, what will happen is: a) Transaction 2 will get a notification that doc A is already being modified in an uncommitted transaction. b) If Transaction 2 tries to update doc A, it will fail, because the doc was modified in another transaction.

Transaction benchmarks are pretty meaningless, because the data model itself is different, as is the way you approach just working with it.

Nate Allan

This is my top concern in my evaluation of RavenDB, and I think you've merely glossed over it here. I respect that you reserve the right to change the implementation, but you've not fully explained all of the "external" facing semantics. Really all you've discussed here is atomic commits, your example makes no reads. What about isolation, consistency, and durability? Here is an example of the type of detail I would expect you to publish: http://hbase.apache.org/acid-semantics.html

Thanks.

Ayende Rahien

Nate, There are several things to note here:

There are standard transactions, and there are DTC transactions. This post topic discusses DTC transactions, not standard transactions.

For standard transactions (per request), and document operations:

Atomicity

All operations are atomic. Each one either succeeds completely or fails completely; there is no partial result. In particular, operations on multiple documents are applied all together or not at all.

Consistency and Isolation / Consistency of Scans

In a single transaction, all operations operate under snapshot isolation. Even if you access multiple documents, you'll get all of their state as it was at the beginning of the request.

Visibility

All changes are made visible immediately on commit. Thus, if a transaction is committed after updating 2 docs, you'll always see the updates to those two docs at the same time. (That is, you either see the updates to both, or you don't see the update to either one.)

Durability

If an operation has completed successfully, it was fsync'ed to disk. Reads will never return any data that hasn't been flushed to disk.

Things are slightly more complex when indexes are involved, because indexes are BASE, not ACID. Also, a DTC transaction can span multiple requests. In that scenario, all the intermediate states are still durable, and any document that has been modified is locked against writes from other transactions. All other transactions will see the committed state until the transaction is committed. Once the DTC transaction has been committed, standard transaction rules apply.

Marcel Valdez

I have to agree with Nate Allan that everything you wrote in the comments should be part of the article; I found it very useful and interesting. If your objective was not to tire the reader, I believe you just left the reader wanting more (disappointed).

dario-g

Will TransactionScope work correctly on operations performed on multiple database instances (multitenancy)? If I want to save data in two databases at the same time, is TransactionScope the right way to do it?

Ayende Rahien

Yes, you can use TransactionScope with multiple databases; for that matter, you can do that with multiple servers.

dario-g

Can I wrap only the SaveChanges calls in the TransactionScope, or should it span something more?

Arun

I assume from this that you provide transactions across different shards. My scenario is as follows:

  1. I have a server holding the common data, which is used universally in the system (it is common to all users).
  2. I have n shards that hold the actual user details.

When I write to the ShardedDocumentStore, I also need to update the common data store. Is it possible to put those two sessions under a single transaction scope? I tried exactly that, and the Raven log showed the data being stored successfully in both sessions. But when I look for the data in the sharded document store, it is not displayed, and a query replies "couldn't find the document". The same code works without any TransactionScope, but then there is no guarantee that all of my work will complete successfully every time. Is there a way to use both a sharded and a normal document store under a single transaction?
