Querying: Paging

Paging, or pagination, is the process of splitting a dataset into pages, reading one page at a time. This is useful for optimizing bandwidth traffic and hardware usage or simply because no user can handle huge amounts of data at once.

Warning

Starting from version 4.0, if the page size is not specified on client side, the server will assume int.MaxValue (2,147,483,647) and all the results will be downloaded. It is recommended to set a page size explicitly to avoid long response times caused by sending excessive amounts of data over the network or high memory consumption caused by handling large quantities of documents.

Performance

By default, if the number of returned results exceeds 2048, the server will issue a Performance Hint notification (visible in the Studio) with information about query details. You can decide if this behavior is desired or not. The threshold can be adjusted by changing the PerformanceHints.MaxNumberOfResults configuration value.

Limits

When Corax is used as the search engine, indexes of more than int.MaxValue (2,147,483,647) documents can be created and used.
To match this capacity, queries over Corax indexes can skip a number of results that exceeds int.MaxValue and take documents from this location.

Example I - No Paging

The queries below will return all the results available.

IList<Product> results = session
    .Query<Product, Products_ByUnitsInStock>()
    .Where(x => x.UnitsInStock > 10)
    .ToList();
IList<Product> results = session
    .Advanced
    .DocumentQuery<Product, Products_ByUnitsInStock>()
    .WhereGreaterThan(x => x.UnitsInStock, 10)
    .ToList();
public class Products_ByUnitsInStock : AbstractIndexCreationTask<Product>
{
    public Products_ByUnitsInStock()
    {
        Map = products => from product in products
                          select new
                          {
                              UnitsInStock = product.UnitsInStock
                          };
    }
}

Example II - Basic Paging

Let's assume that our page size is 10, and we want to retrieve the 3rd page. To do this, we need to issue following query:

IList<Product> results = session
    .Query<Product, Products_ByUnitsInStock>()
    .Where(x => x.UnitsInStock > 10)
    .Skip(20)   // skip 2 pages worth of products
    .Take(10)   // take up to 10 products
    .ToList();  // execute query
IList<Product> results = session
    .Advanced
    .DocumentQuery<Product, Products_ByUnitsInStock>()
    .WhereGreaterThan(x => x.UnitsInStock, 10)
    .Skip(20)   // skip 2 pages worth of products
    .Take(10)   // take up to 10 products
    .ToList();  // execute query
public class Products_ByUnitsInStock : AbstractIndexCreationTask<Product>
{
    public Products_ByUnitsInStock()
    {
        Map = products => from product in products
                          select new
                          {
                              UnitsInStock = product.UnitsInStock
                          };
    }
}

Finding the Total Results Count When Paging

While paging, you sometimes need to know the exact number of results returned from the query. The Client API supports this explicitly:

IList<Product> results = session
    .Query<Product, Products_ByUnitsInStock>()
    .Statistics(out QueryStatistics stats)      // fill query statistics
    .Where(x => x.UnitsInStock > 10)
    .Skip(20)
    .Take(10)
    .ToList();

long totalResults = stats.TotalResults;
IList<Product> results = session
    .Advanced
    .DocumentQuery<Product, Products_ByUnitsInStock>()
    .Statistics(out QueryStatistics stats)      // fill query statistics
    .WhereGreaterThan(x => x.UnitsInStock, 10)
    .Skip(20)
    .Take(10)
    .ToList();

long totalResults = stats.TotalResults;
public class Products_ByUnitsInStock : AbstractIndexCreationTask<Product>
{
    public Products_ByUnitsInStock()
    {
        Map = products => from product in products
                          select new
                          {
                              UnitsInStock = product.UnitsInStock
                          };
    }
}

While the query will return with just 10 results, totalResults will hold the total number of matching documents.

Paging Through Tampered Results

For some queries, the server will skip over some results internally and invalidate the TotalResults value. When executing a Distinct query or index producing multiple index entries per document (a fanout index), then TotalResults will contain the total count of matching documents found, but it will not take into account results that were skipped as a result of the Distinct operator.

Whenever SkippedResults is greater than 0 and a query involved some non-stored fields, it implies that we skipped over some results in the index.

In order to do proper paging in those scenarios, you should use SkippedResults when telling RavenDB how many documents to skip. For each page, the starting point should be .Skip((currentPage * pageSize) + SkippedResults).

For example, let's page through all the results:

IList<Product> results;
int pageNumber = 0;
int pageSize = 10;
long skippedResults = 0;

do
{
    results = session
        .Query<Product, Products_ByUnitsInStock>()
        .Statistics(out QueryStatistics stats)
        .Skip((pageNumber * pageSize) + (int)skippedResults)
        .Take(pageSize)
        .Where(x => x.UnitsInStock > 10)
        .Distinct()
        .ToList();

    skippedResults += stats.SkippedResults;
    pageNumber++;
}
while (results.Count > 0);
IList<Product> results;
int pageNumber = 0;
int pageSize = 10;
long skippedResults = 0;

do
{
    results = session
        .Advanced
        .DocumentQuery<Product, Products_ByUnitsInStock>()
        .Statistics(out QueryStatistics stats)
        .Skip((pageNumber * pageSize) + skippedResults)
        .Take(pageSize)
        .WhereGreaterThan(x => x.UnitsInStock, 10)
        .Distinct()
        .ToList();

    skippedResults += stats.SkippedResults;
    pageNumber++;
}
while (results.Count > 0);
public class Products_ByUnitsInStock : AbstractIndexCreationTask<Product>
{
    public Products_ByUnitsInStock()
    {
        Map = products => from product in products
                          select new
                          {
                              UnitsInStock = product.UnitsInStock
                          };
    }
}

IList<Order> results;
int pageNumber = 0;
int pageSize = 10;
long skippedResults = 0;

do
{
    results = session
        .Query<Order, Orders_ByOrderLines_ProductName>()
        .Statistics(out QueryStatistics stats)
        .Skip((pageNumber * pageSize) + (int)skippedResults)
        .Take(pageSize)
        .ToList();

    skippedResults += stats.SkippedResults;
    pageNumber++;
}
while (results.Count > 0);
IList<Order> results;
long pageNumber = 0;
long pageSize = 10;
long skippedResults = 0;

do
{
    results = session
        .Advanced
        .DocumentQuery<Order, Orders_ByOrderLines_ProductName>()
        .Statistics(out QueryStatistics stats)
        .Skip((pageNumber * pageSize) + skippedResults)
        .Take(pageSize)
        .ToList();

    skippedResults += stats.SkippedResults;
    pageNumber++;
}
while (results.Count > 0);
public class Orders_ByOrderLines_ProductName : AbstractIndexCreationTask<Order>
{
    public Orders_ByOrderLines_ProductName()
    {
        Map = orders => from order in orders
                        from line in order.Lines
                        select new
                        {
                            Product = line.ProductName
                        };
    }
}

The situation would be different if a Distinct query and a projection applied to stored fields only. To get the correct results here, you shouldn't include SkippedResults into the paging formula. Let's take a look at the example (note the usage of Store method in the index definition):

IList<string> results;
int pageNumber = 0;
const int pageSize = 10;

do
{
    results = session
        .Query<Orders_ByStoredProductName.Result, Orders_ByStoredProductName>()
        .Select(x => x.Product)
        .Skip((pageNumber * pageSize))
        .Take(pageSize)
        .Distinct()
        .ToList();

    pageNumber++;
}
while (results.Count > 0);
IList<Orders_ByStoredProductName.Result> results;
int pageNumber = 0;
const int pageSize = 10;

do
{
    results = session
        .Advanced
        .DocumentQuery<Order, Orders_ByStoredProductName>()
        .SelectFields<Orders_ByStoredProductName.Result>("Product")
        .Skip((pageNumber * pageSize))
        .Take(pageSize)
        .Distinct()
        .ToList();

    pageNumber++;
}
while (results.Count > 0);
public class Orders_ByStoredProductName : AbstractIndexCreationTask<Order>
{
    public class Result
    {
        public string Product { get; set; }
    }

    public Orders_ByStoredProductName()
    {
        Map = orders => from order in orders
                        from line in order.Lines
                        select new Result
                        {
                            Product = line.ProductName
                        };

        Store("Product", FieldStorage.Yes);
    }
}