Unbounded results API

Query streaming

By default RavenDB limits the number of returned query results to protect you against unbounded result sets. You can use a paging mechanism to control the range and the number of items that a server will return. However, there are times when you really want to get all of them in a single request. This is especially important for exporting purposes on a running system, because if you used the paging you could get duplicates or miss some items.

The unbounded results implementation is based on the following approach:

  • do it in a single request (we don't want to store any state to keep the client and the server in sync),
  • use a streaming based model (to avoid memory usage problems in case of loading millions of records),
  • freeze the returned stream (what you read is a snapshot of the data as it was when you started reading it).

Important side notes:

  • the index already exists. Creation of a index won't occur and the query error with an IndexDoesNotExistsException exception.
  • WaitForNonStaleResults and its various friends are not respected/used.

In order to take advantage of the query results streaming use the code:

var query = session.Query<User>("Users/ByActive").Where(x => x.Active);

using (var enumerator = session.Advanced.Stream(query))
{
	while (enumerator.MoveNext())
	{
		User activeUser = enumerator.Current.Document;
	}
}

As you can see to get all of the query results you need to iterate by using the enumerator returned by Stream() method. The important thing is that the streaming API does not track the entities in the session, and will not includes changes there when SaveChanges() is called.

The same way you can also stream the results of the Lucene query:

var luceneQuery = session.Advanced.LuceneQuery<User>("Users/ByActive").Where("Active:true");

using (var enumerator = session.Advanced.Stream(luceneQuery))
{
	while (enumerator.MoveNext())
	{
		User activeUser = enumerator.Current.Document;
	}
}

Apart from the query parameter you can also pass the second out parameter to retrieve QueryHeaderInformation:

QueryHeaderInformation queryHeaderInformation;
session.Advanced.Stream(query, out queryHeaderInformation);

public class QueryHeaderInformation
{
	public string Index { get; set; }
	public bool IsStable { get; set; }
	public DateTime IndexTimestamp { get; set; }
	public int TotalResults { get; set; }
	public Etag ResultEtag { get; set; }
	public Etag IndexEtag { get; set; }
}

Documents streaming

By using the Stream() method you are also able to download the documents directly without specifying any query. There are two methods for this purpose. The first one accepts ETag of a document that you want to starts from:

using (var enumerator = session.Advanced.Stream<User>(fromEtag: Etag.Empty,
                                                      start: 0, pageSize: int.MaxValue))
{
	while (enumerator.MoveNext())
	{
		User activeUser = enumerator.Current.Document;
	}
}

If you use the second one then you will have to provide a prefix key of the documents you want to stream and optionally a string value that should match the documet key after the specified prefix:

using (var enumerator = session.Advanced.Stream<User>(startsWith: "users/",
                                                      matches: "*Ra?en",
                                                      start: 0, pageSize: int.MaxValue))
{
	while (enumerator.MoveNext())
	{
		User activeUser = enumerator.Current.Document;
	}
}

The query above will return all documents where ID starts with "users/" and the end matches the expression "*Ra?en" (e.g. "users/Admins/Raven", "Users/Ragen").

Note that here you still have an option to page the results. The parameters start and pageSize have the default values here.

Async version

There is also an asynchronous version of Unbounded results API available. Here is the sample usage presented:

using (var asyncSession = store.OpenAsyncSession())
{
	var query = asyncSession.Query<User>("Users/ByActive").Where(x => x.Active);

	using (var enumerator = await asyncSession.Advanced.StreamAsync(query))
	{
		while (await enumerator.MoveNextAsync())
		{
			User activeUser = enumerator.Current.Document;
		}
	}
	
	using (var enumerator = await asyncSession.Advanced.StreamAsync<User>(Etag.Empty))
	{
		while (await enumerator.MoveNextAsync())
		{
			User activeUser = enumerator.Current.Document;
		}
	}
}