Indexing basics

To achieve very fast response times, RavenDB handles indexing in the background whenever data is added or changed. This approach allows the server to respond quickly even when large amounts of data have changed. The drawback is that query results might be stale (more about staleness in the next section). Under the hood, the server uses Lucene for indexing, which is why queries use Lucene syntax (with some additions).

Stale indexes

The notion of stale indexes stems from a core principle in RavenDB's design: the user should never suffer for giving the server a big task. As far as RavenDB is concerned, it is better to be stale than offline, so it returns results to queries even if it knows they may not be fully up to date.

Indeed, RavenDB returns quickly for every client request, even if the request triggers re-indexing of hundreds of thousands of documents. Because the previous request returned so quickly, the next query can be issued a millisecond later; results will still be returned, but they will be marked as stale.
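The stale marker can be inspected from client code. The following is a minimal sketch, assuming the RavenDB 3.x .NET client, an already opened `IDocumentSession`, and an existing `Employees/ByFirstName` index; the `Employee` class and the `StaleQueryExample` helper are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client;

// Hypothetical document class, standing in for the Employee type used in this article.
public class Employee
{
    public string Id { get; set; }
    public string FirstName { get; set; }
}

// Hypothetical helper class, not part of the RavenDB API.
public static class StaleQueryExample
{
    // Runs the query immediately and reports whether the index was stale at query time.
    public static List<Employee> FindRoberts(IDocumentSession session, out bool isStale)
    {
        RavenQueryStatistics stats;

        List<Employee> employees = session
            .Query<Employee>("Employees/ByFirstName")
            .Statistics(out stats)
            .Where(x => x.FirstName == "Robert")
            .ToList();

        isStale = stats.IsStale;
        return employees;
    }

    // Asks the server to wait until the index has caught up with all changes
    // made before the query was issued, trading response time for freshness.
    public static List<Employee> FindRobertsWaiting(IDocumentSession session)
    {
        return session
            .Query<Employee>("Employees/ByFirstName")
            .Customize(x => x.WaitForNonStaleResultsAsOfNow())
            .Where(x => x.FirstName == "Robert")
            .ToList();
    }
}
```

The first method keeps the default behavior (fast, possibly stale) and leaves the decision to the caller. The second is better reserved for cases where fresh results matter more than latency.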

Information

You can read more about stale indexes here.

Lucene

As mentioned earlier, RavenDB uses Lucene syntax (with some additions) for querying. The easiest option for us would have been to expose a single method that accepts a Lucene-flavored query as a string and stop there (we did expose such a method).

However, we did not stop at that point. We went much further and exposed LINQ-based querying with strong typing, which hides the complexity of the Lucene syntax:

// Method syntax
List<Employee> employees = session
	.Query<Employee>("Employees/ByFirstName")
	.Where(x => x.FirstName == "Robert")
	.ToList();

// Query syntax (an alternative to the snippet above)
IRavenQueryable<Employee> employees = from employee in session.Query<Employee>("Employees/ByFirstName")
	where employee.FirstName == "Robert"
	select employee;

Of course, we did not forget about our advanced users. They can build queries manually by using low-level commands or DocumentQuery (API reference here) in advanced session operations:

List<Employee> employees = session
	.Advanced
	.DocumentQuery<Employee>("Employees/ByFirstName")
	.WhereEquals(x => x.FirstName, "Robert")
	.ToList();

In the end, all of the above queries execute the Query command with (in this case) a very simple Lucene-flavored query, FirstName:Robert:

store
	.DatabaseCommands
	.Query(
		"Employees/ByFirstName",
		new IndexQuery
		{
			Query = "FirstName:Robert"
		});

Types of indexes

You probably know that indexes can be divided by their origin into static and auto indexes (if not, read about it here). A more interesting division is by functionality, and here we have Map and Map-Reduce indexes.

Map indexes (sometimes referred to as simple indexes) contain one or more mapping functions that indicate which fields from documents should be indexed (in other words, they determine which documents can be searched by which fields).

Map-Reduce indexes, on the other hand, allow complex aggregations to be performed in a two-step process: first the appropriate records are selected (using the Map function), then the specified Reduce function is applied to these records to produce a smaller set of results.
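To make the distinction concrete, here is a minimal sketch of one index of each kind, assuming the RavenDB 3.x .NET client and its `AbstractIndexCreationTask` base classes. The `Employee` class, its `Country` property, and the `Employees_CountByCountry` index are hypothetical and serve only as an illustration.

```csharp
using System.Linq;
using Raven.Client.Indexes;

// Hypothetical document class; the Country property is an assumption made for this example.
public class Employee
{
    public string Id { get; set; }
    public string FirstName { get; set; }
    public string Country { get; set; }
}

// Map index: makes Employee documents searchable by FirstName.
// By convention the class name Employees_ByFirstName becomes the index name "Employees/ByFirstName".
public class Employees_ByFirstName : AbstractIndexCreationTask<Employee>
{
    public Employees_ByFirstName()
    {
        Map = employees => from employee in employees
                           select new { employee.FirstName };
    }
}

// Map-Reduce index (hypothetical): counts employees per country.
public class Employees_CountByCountry
    : AbstractIndexCreationTask<Employee, Employees_CountByCountry.Result>
{
    // Shape of a single reduce result.
    public class Result
    {
        public string Country { get; set; }
        public int Count { get; set; }
    }

    public Employees_CountByCountry()
    {
        // Step 1 (Map): emit one entry with Count = 1 for every employee.
        Map = employees => from employee in employees
                           select new { Country = employee.Country, Count = 1 };

        // Step 2 (Reduce): group the mapped entries by country and sum the counts.
        Reduce = results => from result in results
                            group result by result.Country into g
                            select new { Country = g.Key, Count = g.Sum(x => x.Count) };
    }
}
```

Note that the Map and Reduce functions output the same fields (Country and Count), since the reduce step may be applied repeatedly to its own output.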

Map Indexes

We urge you to read more about Map indexes here.

Map-Reduce Indexes

More detailed information about Map-Reduce indexes can be found here.

Default index

Each RavenDB database comes with a built-in index called Raven/DocumentsByEntityName. Its purpose is to index all documents by two metadata fields: Raven-Entity-Name and Last-Modified (if they are present). Thanks to this, we can query for documents that belong to a given collection or filter them by modification date.

public class RavenDocumentsByEntityName : AbstractIndexCreationTask
{
	public override bool IsMapReduce
	{
		get { return false; }
	}

	public override string IndexName
	{
		get { return "Raven/DocumentsByEntityName"; }
	}

	public override IndexDefinition CreateIndexDefinition()
	{
		return new IndexDefinition
		{
			Map = @"from doc in docs 
					select new 
					{ 
						Tag = doc[""@metadata""][""Raven-Entity-Name""], 
						LastModified = (DateTime)doc[""@metadata""][""Last-Modified""],
						LastModifiedTicks = ((DateTime)doc[""@metadata""][""Last-Modified""]).Ticks 
					};",

			Indexes =
			{
				{ "Tag", FieldIndexing.NotAnalyzed },
				{ "LastModified", FieldIndexing.NotAnalyzed },
				{ "LastModifiedTicks", FieldIndexing.NotAnalyzed }
			},

			SortOptions =
			{
				{ "LastModified", SortOptions.String },
				{ "LastModifiedTicks", SortOptions.Long }
			},

			DisableInMemoryIndexing = true,
			LockMode = IndexLockMode.LockedIgnore,
		};
	}
}
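
As a rough sketch of how this index can be used from client code, the query below filters on the Tag field defined in the map above. It assumes the RavenDB 3.x .NET client and an already opened `IDocumentSession`. The `Employee` class, the `DefaultIndexExample` helper, and the collection name "Employees" are assumptions made for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client;

// Hypothetical document class, standing in for the Employee type used in this article.
public class Employee
{
    public string Id { get; set; }
    public string FirstName { get; set; }
}

// Hypothetical helper class, not part of the RavenDB API.
public static class DefaultIndexExample
{
    // Returns the documents whose Raven-Entity-Name metadata equals "Employees".
    // The collection name is an assumption; it depends on the conventions in use.
    public static List<Employee> LoadEmployeesCollection(IDocumentSession session)
    {
        return session
            .Advanced
            .DocumentQuery<Employee>("Raven/DocumentsByEntityName")
            .WhereEquals("Tag", "Employees")
            .ToList();
    }
}
```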