Searching

One of the most common features that real-world applications provide is search. Often a Where clause with a simple condition is enough. For example, to get all users whose age is greater than 20, use the following code:

users = session.Query<User>("UsersByAge").Where(x => x.Age > 20).ToList();

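The User class and the UsersByName index (used by the name queries below) are defined as follows. The UsersByAge index is not shown here and is assumed to map the Age field in the same way: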

public class User
{
	public string Id { get; set; }

	public string Name { get; set; }

	public byte Age { get; set; }

	public ICollection<string> Hobbies { get; set; } 
}

documentStore.DatabaseCommands.PutIndex("UsersByName", new IndexDefinition
{
	Map = "from user in docs.Users select new { user.Name }",
	Indexes = { { "Name", FieldIndexing.Analyzed } }
});

The Where clause is also suitable for a very simple text field search. For example, the following query retrieves users whose name starts with Jo:

users = session.Query<User>("UsersByName").Where(x => x.Name.StartsWith("Jo")).ToList();

Every query is ultimately transformed into a Lucene query. The query above is translated into Name:Jo*.

Warning

Using the string.Contains() method as a condition in a Where clause throws NotSupportedException, because a search term such as *term* (note the wildcards at both the beginning and the end) can cause performance issues. Due to Raven's safe-by-default paradigm, such an operation is forbidden. If you really need this behavior, see the Query escaping section below.

Information

Note that the results of a query may differ depending on the analyzer that was applied.

Multiple terms

When you need more complex text searching, use the Search extension method (in the Raven.Client namespace). This method lets you pass several search terms that will be matched against a particular field. Here is sample code that uses the Search extension to get users named John or Adam:

users = session.Query<User>("UsersByName").Search(x => x.Name, "John Adam").ToList();

Each search term (separated by a space character) is checked independently. A document is returned if it matches at least one of the passed terms.
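To make the matching rule concrete, here is a minimal sketch in plain C# that imitates it without RavenDB. The search string is split on spaces, and a value matches when any of its lowercased words equals any of the terms. The class and method names are hypothetical, and real matching is done by Lucene against the analyzed index, so treat this only as an illustration of the "any term" semantics.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the RavenDB client.
public static class TermMatchSketch
{
    // Splits text on spaces and lowercases each piece,
    // roughly what an analyzer does with simple text.
    public static string[] SplitTerms(string text)
    {
        return text
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(t => t.ToLowerInvariant())
            .ToArray();
    }

    // Returns true when at least one word of the value equals one of the search terms.
    public static bool MatchesAny(string value, string searchTerms)
    {
        var terms = SplitTerms(searchTerms);
        var words = SplitTerms(value);
        return words.Any(w => terms.Contains(w));
    }

    public static void Main()
    {
        Console.WriteLine(MatchesAny("John Smith", "John Adam")); // True
        Console.WriteLine(MatchesAny("Adam", "John Adam"));       // True
        Console.WriteLine(MatchesAny("Johnny", "John Adam"));     // False
    }
}
```

Note that "Johnny" does not match because whole terms are compared, not prefixes.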

In the same way, you can look for users that have a particular hobby. Create the index:

documentStore.DatabaseCommands.PutIndex("UsersByHobbies", new IndexDefinition
{
	Map = "from user in docs.Users select new { user.Hobbies }",
	Indexes = { { "Hobbies", FieldIndexing.Analyzed } }
});

Now you are able to execute the following search:

users = session.Query<User>("UsersByHobbies")
	.Search(x => x.Hobbies, "looking for someone who likes sport books computers").ToList();

As a result, you will get users that are interested in sport, books, or computers.

Multiple fields

With the Search extension you can also search across multiple indexed fields. First, let's introduce the index:

documentStore.DatabaseCommands.PutIndex("UsersByNameAndHobbies", new IndexDefinition
{
	Map = "from user in docs.Users select new { user.Name, user.Hobbies }",
	Indexes = { { "Name", FieldIndexing.Analyzed }, { "Hobbies", FieldIndexing.Analyzed } }
});

Now we can search using the Name and Hobbies properties:

users = session.Query<User>("UsersByNameAndHobbies")
               .Search(x => x.Name, "Adam")
               .Search(x => x.Hobbies, "sport").ToList();

Boosting

Indexing in RavenDB is built on the Lucene engine, which provides a term-boosting mechanism. Boosting affects the relevance of matching documents based on the terms found: each search term can be associated with a boost factor that influences the final search results, and the higher the boost factor, the more relevant the term. RavenDB supports this as well. To give users more accurate results, specify the boost argument. For example:

users = session.Query<User>("UsersByHobbies")
			   .Search(x => x.Hobbies, "I love sport", boost:10)
			   .Search(x => x.Hobbies, "but also like reading books", boost:5).ToList();

The search above promotes users who do sports over book readers, placing them at the top of the result list.
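The effect of boost factors on ordering can be illustrated with a deliberately simplified sketch. It assumes a toy scoring rule in which a document's score is the sum of the boosts of every clause it matches. This is not Lucene's actual scoring formula, which also weighs factors such as term frequency, and all names below are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical toy scorer, not Lucene's real relevance calculation.
public static class BoostSketch
{
    // A search clause: the terms to look for and the boost applied when one of them matches.
    public sealed class Clause
    {
        public string[] Terms;
        public double Boost;
    }

    // Adds up the boost of every clause that has at least one term among the hobbies.
    public static double Score(IEnumerable<string> hobbies, IEnumerable<Clause> clauses)
    {
        var set = new HashSet<string>(hobbies, StringComparer.OrdinalIgnoreCase);
        return clauses.Where(c => c.Terms.Any(t => set.Contains(t))).Sum(c => c.Boost);
    }

    public static void Main()
    {
        var clauses = new[]
        {
            new Clause { Terms = new[] { "sport" }, Boost = 10 },
            new Clause { Terms = new[] { "books" }, Boost = 5 }
        };

        Console.WriteLine(Score(new[] { "sport" }, clauses));          // 10
        Console.WriteLine(Score(new[] { "books" }, clauses));          // 5
        Console.WriteLine(Score(new[] { "sport", "books" }, clauses)); // 15
    }
}
```

Sorting by this score in descending order would place the sport enthusiast ahead of the book reader, which mirrors the ordering described above.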

Search options

To specify the logic of a search expression, pass the options argument to the Search method. It is a SearchOptions enum with the following values:

  • Or,
  • And,
  • Not,
  • Guess (default).

By default, RavenDB attempts to guess and match up the semantics between terms. Consecutive searches are OR'ed together; otherwise AND semantics are used.

The following query:

users = session.Query<User>("UsersByNameAndHobbiesAndAge")
			   .Search(x => x.Hobbies, "computers")
			   .Search(x => x.Name, "James")
			   .Where(x => x.Age == 20).ToList();

is translated into ( Hobbies:(computers) Name:(James)) AND (Age:20) (where no boolean operator is given, OR is used).

You can also specify exactly what the query logic should be. The option affects the query term on which it is used. The following query:

users = session.Query<User>("UsersByNameAndHobbies")
			   .Search(x => x.Name, "Adam")
			   .Search(x => x.Hobbies, "sport", options: SearchOptions.And).ToList();

will result in the following Lucene query: Name:(Adam) AND Hobbies:(sport)

If you want to negate the term, use SearchOptions.Not:

users = session.Query<User>("UsersByName")
		.Search(x => x.Name, "James", options: SearchOptions.Not).ToList();

According to Lucene syntax, it is transformed into the query: -Name:(James).

You can treat SearchOptions values as bit flags and combine the defined enum values, e.g.:

users = session.Query<User>("UsersByNameAndHobbies")
		.Search(x => x.Name, "Adam")
		.Search(x => x.Hobbies, "sport", options: SearchOptions.Not | SearchOptions.And)
		.ToList();

It will produce the following Lucene query: Name:(Adam) AND -Hobbies:(sport).
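The way the options shape the final query string can be sketched with a small standalone builder. It is only a model of the rules described above, limited to consecutive Search clauses, and is not RavenDB's query generator. The ToyOptions enum, the Clause class and the Build method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical model of how search options combine; not the RavenDB implementation.
public static class SearchOptionsSketch
{
    // Stand-in for the SearchOptions flags described above.
    [Flags]
    public enum ToyOptions { Or = 1, And = 2, Not = 4, Guess = 8 }

    public sealed class Clause
    {
        public string Field;
        public string Terms;
        public ToyOptions Options = ToyOptions.Guess;
    }

    // Joins consecutive search clauses into a Lucene-style query string.
    public static string Build(IEnumerable<Clause> clauses)
    {
        var sb = new StringBuilder();
        foreach (var c in clauses)
        {
            if (sb.Length > 0)
            {
                // And joins with an explicit operator; Or and Guess leave a space,
                // which Lucene reads as OR.
                sb.Append((c.Options & ToyOptions.And) != 0 ? " AND " : " ");
            }
            if ((c.Options & ToyOptions.Not) != 0)
            {
                sb.Append('-'); // negate the clause
            }
            sb.Append(c.Field).Append(":(").Append(c.Terms).Append(')');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        var query = Build(new[]
        {
            new Clause { Field = "Name", Terms = "Adam" },
            new Clause { Field = "Hobbies", Terms = "sport", Options = ToyOptions.Not | ToyOptions.And }
        });
        Console.WriteLine(query); // Name:(Adam) AND -Hobbies:(sport)
    }
}
```

Because the enum is marked with [Flags], Not and And can be tested independently with a bitwise AND, which is what makes the combined option in the last example work.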

Query escaping

The code examples in this section use hard-coded search terms. In a real use case, however, the user supplies the term. You can control the escaping strategy applied to the provided query by specifying the EscapeQueryOptions parameter. It is an enum with the following values:

  • EscapeAll (default),
  • AllowPostfixWildcard,
  • AllowAllWildcards,
  • RawQuery.

By default, all special characters contained in the query are escaped (EscapeAll). However, you can make the search more flexible. EscapeQueryOptions.AllowPostfixWildcard enables searching against a field using a search term that ends with a wildcard character:

users = session.Query<User>("UsersByName")
	.Search(x => x.Name, "Jo* Ad*", 
			escapeQueryOptions:EscapeQueryOptions.AllowPostfixWildcard).ToList();

The next option, EscapeQueryOptions.AllowAllWildcards, extends the previous one by allowing the wildcard character at the beginning as well as at the end of the search term:

users = session.Query<User>("UsersByName")
	.Search(x => x.Name, "*oh* *da*", 
			escapeQueryOptions: EscapeQueryOptions.AllowAllWildcards).ToList();

Warning

RavenDB allows such queries, but be aware that leading wildcards drastically slow down searches. Consider whether you really need to find substrings; in most cases, looking for whole words is enough. There are also alternatives that avoid expensive wildcard matches, e.g. indexing a reversed version of the text field or creating a custom analyzer.
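The reversed-field alternative mentioned above rests on a simple observation: a suffix search on the original text is a prefix search on the reversed text. The sketch below shows only that string transformation in plain C#. The helper names are hypothetical, and it does not handle surrogate pairs or combining characters.

```csharp
using System;

// Hypothetical helper illustrating the reversed-text idea.
public static class ReversedFieldSketch
{
    // Reverses a string, e.g. "Johnson" -> "nosnhoJ".
    public static string Reverse(string value)
    {
        var chars = value.ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }

    // Turns a suffix to look for ("son") into a prefix term for the reversed field ("nos*").
    public static string ToReversedPrefixTerm(string suffix)
    {
        return Reverse(suffix) + "*";
    }

    public static void Main()
    {
        Console.WriteLine(Reverse("Johnson"));          // nosnhoJ
        Console.WriteLine(ToReversedPrefixTerm("son")); // nos*
    }
}
```

Assuming an index that stores the reversed name in a separate field (such a field is not defined anywhere in this section), the resulting term ends with a wildcard only, so it could be searched with EscapeQueryOptions.AllowPostfixWildcard instead of a leading wildcard.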

With the last option, the query is not escaped and the raw term is passed to Lucene:

users = session.Query<User>("UsersByName")
	.Search(x => x.Name, "*J?n*",
			escapeQueryOptions: EscapeQueryOptions.RawQuery).ToList();
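The four escaping strategies can be compared side by side with a small standalone sketch. It approximates the behavior described in this section and is not RavenDB's escaping code: the Mode enum mirrors the option names, the list of special characters is an assumption based on Lucene's query syntax, and wildcards are only preserved at the very start or end of each space-separated term.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical approximation of the escaping strategies; not the RavenDB implementation.
public static class EscapeSketch
{
    public enum Mode { EscapeAll, AllowPostfixWildcard, AllowAllWildcards, RawQuery }

    // Characters assumed to be special in Lucene query syntax.
    private const string Special = "+-&|!(){}[]^\"~*?:\\";

    // Escapes one term, keeping a leading and/or trailing '*' when the mode allows it.
    private static string EscapeTerm(string term, Mode mode)
    {
        bool keepLeading = mode == Mode.AllowAllWildcards && term.Length > 1 && term[0] == '*';
        bool keepTrailing = mode != Mode.EscapeAll && term.Length > 1 && term[term.Length - 1] == '*';

        int start = keepLeading ? 1 : 0;
        int end = keepTrailing ? term.Length - 1 : term.Length;

        var sb = new StringBuilder();
        if (keepLeading) sb.Append('*');
        for (int i = start; i < end; i++)
        {
            if (Special.IndexOf(term[i]) >= 0) sb.Append('\\'); // prefix special characters
            sb.Append(term[i]);
        }
        if (keepTrailing) sb.Append('*');
        return sb.ToString();
    }

    // Applies the chosen strategy to every space-separated term.
    public static string Escape(string terms, Mode mode)
    {
        if (mode == Mode.RawQuery) return terms; // passed through untouched
        return string.Join(" ", terms.Split(' ').Select(t => EscapeTerm(t, mode)));
    }

    public static void Main()
    {
        Console.WriteLine(Escape("Jo* Ad*", Mode.EscapeAll));            // Jo\* Ad\*
        Console.WriteLine(Escape("Jo* Ad*", Mode.AllowPostfixWildcard)); // Jo* Ad*
        Console.WriteLine(Escape("*oh* *da*", Mode.AllowPostfixWildcard)); // \*oh* \*da*
        Console.WriteLine(Escape("*oh* *da*", Mode.AllowAllWildcards));  // *oh* *da*
        Console.WriteLine(Escape("*J?n*", Mode.RawQuery));               // *J?n*
    }
}
```

The third call shows why AllowPostfixWildcard is the safer choice for user input: a leading wildcard typed by the user is escaped and therefore treated as a literal character rather than triggering an expensive search.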

Highlights

The Highlights feature was added to RavenDB to enhance the search UX.

Usage

Let's consider a class and an index as follows:

public class SearchItem
{
	public string Id { get; set; }

	public string Text { get; set; }
}

public class ContentSearchIndex : AbstractIndexCreationTask<SearchItem>
{
	public ContentSearchIndex()
	{
		Map = (docs => from doc in docs
					   select new { doc.Text });

		Index(x => x.Text, FieldIndexing.Analyzed);
		Store(x => x.Text, FieldStorage.Yes);
		TermVector(x => x.Text, FieldTermVector.WithPositionsAndOffsets);
	}
}

To use Highlights, call one of the Highlight query extension methods. Basic usage can be as simple as:

FieldHighlightings highlightings;

var results = session.Advanced.LuceneQuery<SearchItem>("ContentSearchIndex")
				 .Highlight("Text", 128, 1, out highlightings)
				 .Search("Text", "raven")
				 .ToArray();

var builder = new StringBuilder()
	.AppendLine("<ul>");

foreach (var result in results)
{
	var fragments = highlightings.GetFragments(result.Id);
	builder.AppendLine(string.Format("<li>{0}</li>", fragments.First()));
}

var ul = builder
	.AppendLine("</ul>")
	.ToString();

This returns the list of results, and for each result displays the first fragment found, up to 128 characters long.

Customization

IDocumentQuery<T> Highlight(
	string fieldName,
	int fragmentLength,
	int fragmentCount,
	out FieldHighlightings highlightings);

IDocumentQuery<T> Highlight<TValue>(
	Expression<Func<T, TValue>> propertySelector,
	int fragmentLength,
	int fragmentCount,
	out FieldHighlightings highlightings);

where:

  • fieldName or propertySelector marks the field or property to highlight,
  • fragmentLength is the maximum length of the text fragments that will be returned,
  • fragmentCount is the maximum number of fragments that will be returned,
  • highlightings returns an instance of FieldHighlightings that contains the highlight fragments for each returned result.

By default, the highlighted text is wrapped in <b></b> tags. To change this behavior, use the SetHighlighterTags method:

IDocumentQuery<T> SetHighlighterTags(string preTag, string postTag);

IDocumentQuery<T> SetHighlighterTags(string[] preTags, string[] postTags);

For example, to wrap the highlighted text in **, execute the following query:

FieldHighlightings highlightings;

var results = session.Advanced.LuceneQuery<SearchItem>("ContentSearchIndex")
				 .Highlight("Text", 128, 1, out highlightings)
				 .SetHighlighterTags("**", "**")
				 .Search("Text", "raven")
				 .ToArray();
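To show what the pre and post tags do to a fragment, here is a standalone sketch that wraps each occurrence of a term in a piece of text. It uses plain case-insensitive substring matching, whereas the actual highlighter works on analyzed terms stored with the index, so the helper below (a hypothetical name) only illustrates the shape of the output.

```csharp
using System;
using System.Text;

// Hypothetical illustration of tag wrapping; not the highlighter used by RavenDB.
public static class HighlightSketch
{
    // Wraps every case-insensitive occurrence of term in text with the given tags.
    public static string Wrap(string text, string term, string preTag, string postTag)
    {
        if (string.IsNullOrEmpty(term)) return text;

        var sb = new StringBuilder();
        int pos = 0;
        while (true)
        {
            int hit = text.IndexOf(term, pos, StringComparison.OrdinalIgnoreCase);
            if (hit < 0) break;
            sb.Append(text, pos, hit - pos);   // text before the match
            sb.Append(preTag).Append(text, hit, term.Length).Append(postTag);
            pos = hit + term.Length;           // continue after the match
        }
        sb.Append(text, pos, text.Length - pos); // remaining text
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Wrap("A raven is a bird", "raven", "<b>", "</b>")); // A <b>raven</b> is a bird
        Console.WriteLine(Wrap("A raven is a bird", "raven", "**", "**"));    // A **raven** is a bird
    }
}
```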

Note

The default <b></b> tags are coloured, and the colours are returned in the following order: yellow, lawngreen, aquamarine, magenta, palegreen, coral, wheat, khaki, lime, deepskyblue, deeppink, salmon, peachpuff, violet, mediumpurple, palegoldenrod, darkkhaki, springgreen, turquoise and powderblue.