Production postmortem: The server ate all my memory

A customer reported a scenario where RavenDB was using stupendous amounts of memory. In the orders of tens of GB on a system that didn’t have that much load.

Our first suspicion was that this is an issue with reading the metrics, since RavenDB will try to keep as much of the data in memory, which sometimes leads users to worry. I spoke about this at length in the past.

In this case, that wasn’t the case. We were able to drill down into the exact cause of the memory usage and we found out that RavenDB was using an abnormally high amount of memory. The question was why that was, exactly.

We looked into the common operations on the server, and we found a suspicious query, it looked something like this:

from index ‘Sales/Actions’
where endsWith(WorkflowStage, ‘/Final’)

The endsWith query was suspicious, so we looked into that further. In general, endsWith requires us to scan all the unique terms for a particular field, but in most cases, there aren’t that many unique values for a field. In this case, however, that wasn’t the case, here are some of the values for WorkflowStage:

Workflows/3a1af12a-b5d2-4c96-9348-177ebaacab6c/Step-2
Workflows/6aacc86c-2f28-4b8b-8dee-1024314d5add/Final

In total, there were about 250 million sales in the database, each one of them with a unique WorflowStage value.

What does this mean, in terms of RavenDB query execution? Well, the fields are indexed, but we need to effectively do:

This isn’t the actual code, but it will show you what is going on.

In other words, in order to process this query, we have to scan (and materialize) all 250 million unique terms for this field. Obviously that is going to consume a lot of memory.

But what is the solution to that? Instead of doing an expensive endsWith query, we can move the computation from the query time to the index time.

In other words, instead of indexing the WorkflowStage field as is, we’ll extract the information we want from it. The index would have one of those:

IsFinalWorkFlowStage = doc.WorkflowStage.EndsWith(“/Final”),

WorkflowStagePostfix = doc.WorkflowStage.Split(‘/’).Last()

The first one will check whether the value is final or not, while the second just gets the (one of hopefully a few) postfixes for the field. We can then query using equality instead of endsWith, leading to far better performance and greatly reduced memory usage, since we don’t need to materialize any values during the query.

RavenDB

RavenDB Cloud

Try

Experience interactive demos and playground server

RavenDB Docs

RavenDB Cloud Docs

Documentation Guide

Download

Features

Performance

Comparison

What’s New

Demo

Bootcamp

Webinars

Workshops

Inside RavenDB Book

GitHub

StackOverflow

Articles

Whitepapers

Events

Promotional Materials

Unlock your business potential

Use Cases

Articles

Whitepapers

Press Releases

Industry Reports

Performance

Comparison

Proof of Concept Program

Academic Program

Events

What’s New

Roadmap

On-premise Pricing

Cloud Pricing

Support

Proof of Concept Program

Academic Program

Production postmortem: The server ate all my memory

Woah, already finished? 🤯

Related Articles

CollabTalk Podcast | Episode 123 with Oren Eini–Building a business with Open Source foundations

RavenDB’s storage engine: Voron–unlocking the secret

Certificates from the Ground Up

Watch Live Demo

RavenDB

RavenDB Cloud

Try

RavenDB Docs

RavenDB Cloud Docs

Documentation Guide

Download

Features

Performance

Comparison

What’s New

Demo

Bootcamp

Webinars

Workshops

Inside RavenDB Book

GitHub

StackOverflow

Articles

Whitepapers

Events

Promotional Materials

Use Cases

Articles

Whitepapers

Press Releases

Industry Reports

Performance

Comparison

Proof of Concept Program

Academic Program

Events