The performance degradation in the load testing tool

During benchmark run on a large dataset, I started to notice that longer benchmarks were showing decidedly worse numbers than short ones. In other words, a benchmark that is run for 1 minute is orders of magnitude higher latencies than a benchmark that is run for 30 seconds. And the longer the benchmark, the worst things off.

That raised a lot of red flags, and spawn a pretty serious investigation. We take performance very seriously and the observed behavior was that we were getting slower over time. We suspected a leak, high number of GC calls, or memory fragmentation. The scenario under test was a web application using the RavenDB API to talk to RavenDB. We run both the web application and the server under profilers and found a few hot spots, but nothing really major. There was no smoking gun.

Then we noticed that the load testing machine was sitting there with 100% CPU. I initially thought that this is us generating too much load for the machine, but that wasn’t it. We are using wrk2, which is capable of handling million of requests per seconds.

We were generating the requests dynamically using a Lua script, and in one of the scenarios under test, we have code like this:

path = "/orders/user/" .. page * pageSize .. "/" .. pageSize .. "/?userId=" .. item.id .. "&deep=y"

That isn’t the most optimal way to do things, I’ll admit. We can do better by using something like table.concat(), but the problem was that regardless of how you build the string, this is supposed to be fairly cheap. The wrk2 project is using LuaJIT, which has a reputation as a really scripting system. I never really thought that this would be a problem. Sure, it is a little wasteful, but it isn’t too bad, a few string temporaries and maybe some realloc() calls, but nothing major.

Instead, this resulted in us getting far worse results over time. It took a while to actually figure out why, but the root cause is in the way LuaJIT handles string hashing.

a = lj_getu32(str);
h ^= lj_getu32(str+len-4); b = lj_getu32(str+(len>>1)-2); h ^= b; h -= lj_rol(b, 14); b += lj_getu32(str+(len>>2)-1);

Strings in Lua are interned, which means that there is just a single copy of a string per value. That means that hashing is important, but the way it does hashing is to take the first 4 bytes, the last 4 bytes and the 4 bytes in the middle and use that for a hash. And that is it.

If you have a bunch of strings where those 3 locations match… well, welcome to hash collisions. At which point, what is supposed to be a O(1) call becomes an O(N) call. And creating strings will turn the operations into an O(N^2) operation!

Here is the reproduction code:

Change the prefix to be an empty string for a major performance boost. The actual bug is well known (5 or 6 years), but it was only recently fixed and not on the version that wrk2 is using.

We had to toss out the entire benchmarking set and start over because of this.

We were generating requests with random data, so some of them would hit this problem hard, and some would avoid it by magic. I was not expecting to debug hash collision in Lua code while trying to get some performance numbers from overloading RavenDB, quite random, literally.

RavenDB

RavenDB Cloud

Try

Experience interactive demos and playground server

RavenDB Docs

RavenDB Cloud Docs

Documentation Guide

Download

Features

Performance

Comparison

What’s New

Demo

Bootcamp

Webinars

Workshops

Inside RavenDB Book

GitHub

StackOverflow

Articles

Whitepapers

Events

Promotional Materials

Unlock your business potential

Use Cases

Articles

Whitepapers

Press Releases

Industry Reports

Performance

Comparison

Proof of Concept Program

Academic Program

Events

What’s New

Roadmap

On-premise Pricing

Cloud Pricing

Support

Proof of Concept Program

Academic Program

The performance degradation in the load testing tool

Woah, already finished? 🤯

Related Articles

CollabTalk Podcast | Episode 123 with Oren Eini–Building a business with Open Source foundations

RavenDB’s storage engine: Voron–unlocking the secret

Certificates from the Ground Up

Watch Live Demo

RavenDB

RavenDB Cloud

Try

RavenDB Docs

RavenDB Cloud Docs

Documentation Guide

Download

Features

Performance

Comparison

What’s New

Demo

Bootcamp

Webinars

Workshops

Inside RavenDB Book

GitHub

StackOverflow

Articles

Whitepapers

Events

Promotional Materials

Use Cases

Articles

Whitepapers

Press Releases

Industry Reports

Performance

Comparison

Proof of Concept Program

Academic Program

Events