RavenDB version 2.0.

Transaction support in RavenDB

All the previous examples have assumed that a single unit of work can be achieved with a single IDocumentSession and a single call to SaveChanges, and for the most part this is true. Sometimes, however, we need multiple calls to SaveChanges, and we want those calls to be contained within a single atomic operation.

RavenDB supports System.Transactions for multiple operations against a RavenDB server, or even against multiple RavenDB servers.

The client code for this is as simple as::

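    // Requires "using System.Transactions;".
    // "session" is assumed to be an already opened IDocumentSession.
    using (var transaction = new TransactionScope())
    {
        BlogPost entity = session.Load<BlogPost>("blogs/1");

        entity.Title = "Some new title";

        // First batch, sent as part of the ambient transaction.
        session.SaveChanges();

        session.Delete(entity);
        // Second batch, same transaction.
        session.SaveChanges();

        // Without this call, disposing the scope rolls everything back.
        transaction.Complete();
    }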
    

If any of this code fails before transaction.Complete() is called, none of the changes are applied to the RavenDB document store.

The implementation details of this are not important, although it is possible to see that RavenDB does indeed send a transaction ID along with all of the HTTP requests under this transaction scope, as shown here:

POST /bulk_docs HTTP/1.1
Raven-Transaction-Information: 975ee0bf-cac9-4b8e-ba29-377de722f037, 00:01:00
Accept-Encoding: deflate,gzip
Content-Type: application/json; charset=utf-8
Host: 127.0.0.1:8081
Content-Length: 300
Expect: 100-continue

[
  {
    "Key": "blogs/1",
    "Etag": null,
    "Method": "PUT",
    "Document": {
      "Title": "Some new title",
      "Category": null,
      "Content": null,
      "Comments": null
    },
    "Metadata": {
      "Raven-Entity-Name": "Blogs",
      "Raven-Clr-Type": "ConsoleApplication5.Blog, ConsoleApplication5",
      "@id": "blogs/1",
      "@etag": "00000000-0000-0200-0000-000000000002"
    }
  }
]

Committing the transaction involves a separate call to another HTTP endpoint with that transaction ID:

POST /transaction/commit?tx=975ee0bf-cac9-4b8e-ba29-377de722f037 HTTP/1.1
Accept-Encoding: deflate,gzip
Content-Type: application/json; charset=utf-8
Host: 127.0.0.1:8081
Content-Length: 0

While RavenDB supports System.Transactions, you should use it only when you really need it (for example, to coordinate between multiple transactional resources), because System.Transactions and distributed transactions carry additional cost compared to simply using the standard API and the transactional SaveChanges.
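For comparison, here is a minimal sketch of the standard API, where several document operations are saved with one SaveChanges call and no TransactionScope is involved. The server URL and the shape of the BlogPost class are assumptions made for illustration.

```csharp
using Raven.Client;
using Raven.Client.Document;

// Hypothetical entity used only for this sketch.
public class BlogPost
{
    public string Id { get; set; }
    public string Title { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Assumed server URL; adjust for your environment.
        using (IDocumentStore store = new DocumentStore { Url = "http://localhost:8080" }.Initialize())
        using (IDocumentSession session = store.OpenSession())
        {
            // Modify an existing document, if it is there.
            BlogPost existing = session.Load<BlogPost>("blogs/1");
            if (existing != null)
            {
                existing.Title = "Some new title";
            }

            // Queue a new document in the same unit of work.
            session.Store(new BlogPost { Title = "A second post" });

            // One call sends every pending change in a single batch.
            session.SaveChanges();
        }
    }
}
```

When all the changes fit into one SaveChanges call like this, there is nothing for System.Transactions to coordinate.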

Comments

Posted by Guz

Using a transaction, is it possible to fill a collection and block everyone else from querying that collection?

Posted by Ayende Rahien

No, it isn't. Collections are virtual concepts; they have no real existence and cannot be "locked". Until you commit the transaction, no one will see the documents that you have created in the transaction.

Posted by Prashanth Menon

"The implementation details of this are not important" ... I would hesitate to exclude details from documentation, especially documentation that claims to cover advanced topics.

Do transactions work across shards? If so, how? The example makes it seem as though RavenDB supports full interactive transactions. How would these work with long-lived transactions? Please do provide all the gory details.

Thanks, Prashanth

Posted by Ayende Rahien

Implementation details are subject to change; the external behavior of the system is not. Indeed, we have changed the implementation details of supporting System.Transactions.

Transactions work across shards by having each node in the cluster participate as a resource. Long-lived transactions are supported, but note that there is also the transaction timeout for you to consider.
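As a rough illustration of the timeout mentioned above, the sketch below sets an explicit timeout on a TransactionScope using only System.Transactions. The five-minute value is arbitrary, and how RavenDB applies the timeout on the server side is not described in this thread, so treat that part as an open question rather than documented behavior.

```csharp
using System;
using System.Transactions;

public static class Program
{
    public static void Main()
    {
        // The default scope timeout is TransactionManager.DefaultTimeout (one minute
        // unless configured otherwise). Values above TransactionManager.MaximumTimeout
        // are capped by System.Transactions.
        var options = new TransactionOptions
        {
            Timeout = TimeSpan.FromMinutes(5) // arbitrary value for illustration
        };

        using (var transaction = new TransactionScope(TransactionScopeOption.Required, options))
        {
            // RavenDB session work (Load, Store, SaveChanges) would go here.

            transaction.Complete();
        }

        Console.WriteLine("Default timeout: " + TransactionManager.DefaultTimeout);
        Console.WriteLine("Maximum timeout: " + TransactionManager.MaximumTimeout);
    }
}
```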

Posted by Prashanth Menon

I'm not sure I understand. Are shards mastered by a single server? Can a single transaction modify entities that reside on different shards mastered by different servers?

Posted by Ayende Rahien

No, there is no master shard. A single transaction can modify entries on multiple shards.

That isn't recommended, because of the inherent problems in DTC.

Posted by Prashanth Menon

You'll have to forgive me, I don't work in .NET and am unfamiliar with the nomenclature. It looks like you rely on Microsoft's DTC, which seems to be an implementation of 2PC. I'm also inferring that RavenDB does synchronous multi-master replication if it provides ACID semantics. A few questions:

  1. The FAQ mentions that even though a transaction may report success, RavenDB does not provide read-your-writes without some additional parameters. Do you guarantee that when the system returns a successful commit, the data is durable? If so, how is this guarantee made, and why is there a delay to begin with? What happens to updates to a key "A" that occur on a node for which a separate transaction that includes "A" isn't yet durable?
  2. How are conflicts detected between documents? Does RavenDB employ a form of MVCC?
  3. Have you measured your transaction performance against the likes of Oracle or DB2? Perhaps the TPC-C benchmark?

Posted by Ayende Rahien

Prashanth, yes, we support DTC, which is 2PC.

DTC does async commits; what we guarantee is that once the DTC transaction is committed, the data is durable. For that matter, the data is durable even for multiple operations inside the same transaction; they just aren't visible to any other transaction.

Note that even without DTC, all document operations on RavenDB are fully ACID.

Our on-disk data store is ACID, and supports all the standard transactional semantics.

I can't answer the key "A" question, because I don't follow it.

Conflicts between docs do not really happen (unless you use replication, which is a separate thing). There is optimistic concurrency to control updates to the docs. If transaction 1 is updating doc A, and transaction 2 is also trying to update it, what will happen is:

  a) Transaction 2 will get a notification that doc A is already being modified in an uncommitted transaction.
  b) If transaction 2 tries to update doc A, it will fail, because the doc was modified in another transaction.
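A minimal sketch of the optimistic concurrency side of this, assuming the RavenDB 2.0 client where the session exposes Advanced.UseOptimisticConcurrency and a failed write surfaces as ConcurrencyException from Raven.Abstractions.Exceptions. The server URL and the BlogPost class are illustrative, and the exact error raised when the conflicting change sits in an uncommitted DTC transaction is not specified in this thread.

```csharp
using System;
using Raven.Abstractions.Exceptions;
using Raven.Client;
using Raven.Client.Document;

// Hypothetical entity used only for this sketch.
public class BlogPost
{
    public string Id { get; set; }
    public string Title { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Assumed server URL; adjust for your environment.
        using (IDocumentStore store = new DocumentStore { Url = "http://localhost:8080" }.Initialize())
        using (IDocumentSession session = store.OpenSession())
        {
            // Ask the session to send the etag it loaded, so a concurrent change is detected.
            session.Advanced.UseOptimisticConcurrency = true;

            BlogPost post = session.Load<BlogPost>("blogs/1");
            if (post == null)
            {
                return; // nothing to update
            }

            post.Title = "Some new title";

            try
            {
                session.SaveChanges();
            }
            catch (ConcurrencyException e)
            {
                // Someone else changed blogs/1 after we loaded it: reload and retry, or give up.
                Console.WriteLine("Update rejected: " + e.Message);
            }
        }
    }
}
```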

Transaction benchmarks are pretty meaningless, because the data model itself is different, as is the way you approach just working with it.

Posted by Nate Allan

This is my top concern in my evaluation of RavenDB, and I think you've merely glossed over it here. I respect that you reserve the right to change the implementation, but you've not fully explained all of the "external" facing semantics. Really, all you've discussed here is atomic commits; your example makes no reads. What about isolation, consistency, and durability? Here is an example of the type of detail I would expect you to publish: http://hbase.apache.org/acid-semantics.html

Thanks.

Posted by Ayende Rahien

Nate, there are several things to note here:

There are standard transactions, and there are DTC transactions. This post topic discusses DTC transactions, not standard transactions.

For standard transactions (per request), and document operations:

** Atomicity **

All operations are atomic. Either they succeed or they fail; there is no partially applied state. In particular, operations on multiple documents happen atomically: all of them are applied, or none are.

** Consistency and Isolation / Consistency of Scans **

In a single transaction, all operations operate under snapshot isolation. Even if you access multiple documents, you'll get all of their state as it was at the beginning of the request.

** Visibility **

All changes made by a transaction become visible immediately on commit. Thus, if a transaction is committed after updating two docs, you'll always see the updates to those two docs at the same time. (That is, you either see the updates to both, or you don't see the update to either one.)

** Durability **

If an operation has completed successfully, it was fsync'ed to disk. Reads will never return any data that hasn't been flushed to disk.

Things are slightly more complex when indexes are involved, because indexes are BASE, not ACID. A DTC transaction can also span multiple requests. In that scenario, all the intermediate states are still durable, and any document that has been modified is locked for writes from another transaction. All other transactions will see the previously committed state until the DTC transaction is committed. Once the DTC transaction has been committed, standard transaction rules apply.

Posted by Marcel Valdez

I have to agree with Nate Allan that everything you wrote in the comments should be part of the article; I found it very useful and interesting. If your objective was not to tire the reader, I believe you just left the reader wanting more (disappointed).

Posted by dario-g

Will TransactionScope work correctly on operations performed on multiple database instances (multitenancy)? If I want to save data in two databases at the same time, is TransactionScope the right way to do it?

Posted by Ayende Rahien

Yes, you can use TransactionScope with multiple databases. For that matter, you can do that with multiple servers.
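A minimal sketch of what that could look like on one server, assuming two existing databases with the hypothetical names "Sales" and "Billing", an assumed server URL, and illustrative Order and Invoice classes. Each session targets its own database, and both SaveChanges calls run inside the same TransactionScope.

```csharp
using System.Transactions;
using Raven.Client;
using Raven.Client.Document;

// Hypothetical entities used only for this sketch.
public class Order
{
    public string Id { get; set; }
    public string Customer { get; set; }
}

public class Invoice
{
    public string Id { get; set; }
    public string OrderId { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Assumed server URL; both databases are assumed to exist already.
        using (IDocumentStore store = new DocumentStore { Url = "http://localhost:8080" }.Initialize())
        using (var transaction = new TransactionScope())
        using (IDocumentSession sales = store.OpenSession("Sales"))
        using (IDocumentSession billing = store.OpenSession("Billing"))
        {
            sales.Store(new Order { Id = "orders/1", Customer = "customers/1" });
            sales.SaveChanges();

            billing.Store(new Invoice { Id = "invoices/1", OrderId = "orders/1" });
            billing.SaveChanges();

            // Both writes are committed together, or neither is.
            transaction.Complete();
        }
    }
}
```

As noted in the article, this goes through distributed transactions, so it costs more than a plain SaveChanges against a single database.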

Posted by dario-g

Should the TransactionScope span only the SaveChanges calls, or should it span something more?
