Always profile! The case of the mysterious performance regression
This is part of the same issue as the previous post. I was tracking a performance regression between RavenDB 4.1 and RavenDB 4.2, there was a roughly 8% performance difference between the two (favoring the older version) which was reported to us. The scenario was very large and complex documents (hundreds of objects in a document, > 16KB of JSON each).
The two code bases have had a lot of (small) changes between them, so it was hard to figure out exactly what was the root cause for the regression. Eventually I found something utterly bizarre. One of the things that we have to do when you save a document is check if the document has been modified. If so, we need to save it, otherwise, we can skip it. Here is the relevant piece of code in 4.1:
So this costs 0.5 ms (for very large documents), seems perfectly reasonable. But when looking at this on 4.2, we have:
This cost six times as much, what the hell?! To clarify, Blittable is the name of the document format that RavenDB uses. It is a binary JSON format that is highly efficient. You can think about this as comparing two JSON documents, because this is what it is doing.
I mentioned that there are differences between these versions? There have been quite a few (thousands of commits worth), but this particular bit of code hadn’t changed in years. I just couldn’t figure out what was going on. Then I looked deeper. Here are the cost of these calls. Here is the 4.1 version:
And here is the 4.2 version:
There are a few interesting things here. First, we can see that we are using Enumerable.Contains and that is where most of the time goes. But much more interesting, in 4.1, we are calling this method a total of 30,000 times. In 4.2, we are calling it 150,000 times!!! Note that CompareBlittable is recursive, so even though we call it on 10,000 documents, we get more calls. But why the difference between these version?
I compared the code for these two functions, and they were virtually identical. In 4.2, we mostly change some error message texts, nothing major, but somehow the performance was so different. It took a while to figure out that there was another difference. In 4.1, we checked the changes in the documents in the order of the properties on the document, but on 4.2, we optimized things slightly and just used the index of the property. A feature of the blittable format is that properties are lexically sorted.
Here is the document in question, in our test, we are modifying Property6, as you can see here:
There are a total of 40 properties in this document. And much nesting. In this case, in 4.2, we are scanning for changes in the document using the lexical sorting, which means:
The CompareBlittable() function will exit on the first change it detect, and in 4.1, it will get to the changed Property6 very quickly. On 4.2, it will need to scan most of the (big) document before it find a change. That is a large part of the cost difference between these versions.
Now that I know what the issue is, we have to consider whatever behavior is better for us. I decided to use the order of inserted properties, instead of the lexical order. The reasoning is simple. If a user care about that, they can much more easily change the order of properties in the document than the names of the properties. In C#, you can just change the order the properties shows up in the class definition.
I have to say, this was much harder to figure out than I expected, because the change happened in a completely different location and was very much none obvious in how it worked.