Understanding query processing and wildcards in RavenDB

I recently got an interesting question about how RavenDB is processing certain types of queries. The customer in question had a document with a custom analyzer and couldn’t figure out why certain queries didn’t work.

For the purpose of the discussion, let’s consider the following analyzer:

In other words, when using this analyzer, we’ll have the following transformations:

“Singing avocadoes” – will be: “sing”, “avocadoes”
“Sterling silver” – will be: “ster”, “silver”
“Singularity Trailer” – will be “singularity”, “trailer”

As a reminder, this is used in a reverse index, which gives us the ability to lookup a term and find all the documents containing that term.

An analyzer is applied on the text that is being indexed, but also on the queries. In other words, because during indexing I changed “singing” to “sing”, I also need to do the same for the query. Otherwise a query for “singing voice” will have no results, even if the “singing” term was in the original data.

The rules change when we do a prefix search, though. Consider the following query:

What should we be searching on here? Remember, this is using an analyzer, but we are also doing a prefix search. Lets consider our options. If we pass this through an analyzer, the query will change its meaning. Instead of searching for terms starting with “sing”, we’ll search for terms starting with “s”.

That will give us results for “Sterling Silver”, which is probably not expected. In this case, by the way, I’m actually looking for the term “singularity”, and processing the term further would prevent that.

For that reason, RavenDB assumes that queries using wildcard searches are not subject to an analyzer and will not process them using one. The reasoning is simple, by using a wildcard you are quite explicitly stating that this is not a real term.

RavenDB

RavenDB Cloud

Try

Experience interactive demos and playground server

RavenDB Docs

RavenDB Cloud Docs

Documentation Guide

Download

Features

Performance

Comparison

What’s New

Demo

Bootcamp

Webinars

Workshops

Inside RavenDB Book

GitHub

StackOverflow

Articles

Whitepapers

Events

Promotional Materials

Unlock your business potential

Use Cases

Articles

Whitepapers

Press Releases

Industry Reports

Performance

Comparison

Proof of Concept Program

Academic Program

Events

What’s New

Roadmap

On-premise Pricing

Cloud Pricing

Support

Proof of Concept Program

Academic Program

Understanding query processing and wildcards in RavenDB

Woah, already finished? 🤯

Related Articles

CollabTalk Podcast | Episode 123 with Oren Eini–Building a business with Open Source foundations

RavenDB’s storage engine: Voron–unlocking the secret

Certificates from the Ground Up

Watch Live Demo

RavenDB

RavenDB Cloud

Try

RavenDB Docs

RavenDB Cloud Docs

Documentation Guide

Download

Features

Performance

Comparison

What’s New

Demo

Bootcamp

Webinars

Workshops

Inside RavenDB Book

GitHub

StackOverflow

Articles

Whitepapers

Events

Promotional Materials

Use Cases

Articles

Whitepapers

Press Releases

Industry Reports

Performance

Comparison

Proof of Concept Program

Academic Program

Events