Optimizing RavenDB by adding Thread.Sleep(5)

This post is here because we recently had to add this code to RavenDB:

Yes, we added a sleep to RavenDB, and we did it to increase performance.

The story started out with a reported performance regression. On a previous version of RavenDB, the user was able to insert 32,000 documents per second. Same code, same machine, new version of RavenDB, but the performance is 13,000 documents per second.

That is, as we call it internally, and Issue. More specifically issue: RavenDB-14777. Smile

Deeper investigation revealed that the problem was that we are too fast, therefor we are too slow. Adding a sleep fixed the being too fast thing, so we were faster again.

You might need to read the previous paragraph a few times to make sense of it, I’m particularly proud of it. Here is what actually happened. Our bulk insert code is reading from the network and as soon as we have some data, we start parallelizing the write to disk and the read from the network. The idea is that we want to be reduce the user time, so we maximize the amount of work we do. This is a fairly standard optimization for us and has paid many dividends in performance. The way it works, we read from the network until there is nothing available in memory and we have to wait for I/O, at which point we start writing to the disk and wait for the network I/O to resume the operation.

However, the issue is that the versions that the user was trying also included a runtime change. The old version run on .NET Core 2.2 and the new version run on .NET Core 3.1. There has been many optimizations as a result of this change, and it seems that the read from network path has benefited from these.

As a result, we would be able to read the data from the buffer more quickly, which meant that we would end up faster with waiting for network I/O. And that meant that we would do a lot more disk writes because we were better in reading from the network. And that, in turn, slowed down the whole thing enough to be noticeable.

Our change means that we’ll only queue a new disk operation if there has been 5 milliseconds with no new network traffic (or a bunch of other conditions that you don’t really care about). This way, we retain the parallel work and not saturate the disk with small writes.

As I said earlier, we had to pump the brakes to get into real high speed.

RavenDB

RavenDB Cloud

Try

Experience interactive demos and playground server

RavenDB Docs

RavenDB Cloud Docs

Documentation Guide

Download

Features

Performance

Comparison

What’s New

Demo

Bootcamp

Webinars

Workshops

Inside RavenDB Book

GitHub

StackOverflow

Articles

Whitepapers

Events

Promotional Materials

Unlock your business potential

Use Cases

Articles

Whitepapers

Press Releases

Industry Reports

Performance

Comparison

Proof of Concept Program

Academic Program

Events

What’s New

Roadmap

On-premise Pricing

Cloud Pricing

Support

Proof of Concept Program

Academic Program

Optimizing RavenDB by adding Thread.Sleep(5)

Woah, already finished? 🤯

Related Articles

CollabTalk Podcast | Episode 123 with Oren Eini–Building a business with Open Source foundations

RavenDB’s storage engine: Voron–unlocking the secret

Certificates from the Ground Up

Watch Live Demo

RavenDB

RavenDB Cloud

Try

RavenDB Docs

RavenDB Cloud Docs

Documentation Guide

Download

Features

Performance

Comparison

What’s New

Demo

Bootcamp

Webinars

Workshops

Inside RavenDB Book

GitHub

StackOverflow

Articles

Whitepapers

Events

Promotional Materials

Use Cases

Articles

Whitepapers

Press Releases

Industry Reports

Performance

Comparison

Proof of Concept Program

Academic Program

Events