Querying: Paging
Paging, or pagination, is the process of splitting a dataset into pages, reading one page at a time. This is useful for optimizing bandwidth traffic and hardware usage or simply because no user can handle huge amounts of data at once.
Warning
Starting from version 4.0, if the page size is not specified on client side, the server will assume int.MaxValue (2,147,483,647) and all the results will be downloaded. It is recommended to set a page size explicitly to avoid long response times caused by sending excessive amounts of data over the network or high memory consumption caused by handling large quantities of documents.
You can also set DocumentConventions.ThrowIfQueryPageSizeIsNotSet
convention to true to guard yourself from executing queries without the page size explicitly set. We recommend turning this convention on, especially during development or testing phases to detect early the queries that potentially can return an excessive amount of results.
Performance
By default, if the number of returned results exceeds 2048, the server will issue a Performance Hint
notification (visible in the Studio) with information about query details. You can decide if this behavior is desired or not.
The threshold can be adjusted by changing the PerformanceHints.MaxNumberOfResults
configuration value.
Example I - No Paging
The queries below will return all the results available.
IList<Product> results = session
.Query<Product, Products_ByUnitsInStock>()
.Where(x => x.UnitsInStock > 10)
.ToList();
IList<Product> results = session
.Advanced
.DocumentQuery<Product, Products_ByUnitsInStock>()
.WhereGreaterThan(x => x.UnitsInStock, 10)
.ToList();
public class Products_ByUnitsInStock : AbstractIndexCreationTask<Product>
{
public Products_ByUnitsInStock()
{
Map = products => from product in products
select new
{
UnitsInStock = product.UnitsInStock
};
}
}
Example II - Basic Paging
Let's assume that our page size is 10
, and we want to retrieve the 3rd page. To do this, we need to issue following query:
IList<Product> results = session
.Query<Product, Products_ByUnitsInStock>()
.Where(x => x.UnitsInStock > 10)
.Skip(20) // skip 2 pages worth of products
.Take(10) // take up to 10 products
.ToList(); // execute query
IList<Product> results = session
.Advanced
.DocumentQuery<Product, Products_ByUnitsInStock>()
.WhereGreaterThan(x => x.UnitsInStock, 10)
.Skip(20) // skip 2 pages worth of products
.Take(10) // take up to 10 products
.ToList(); // execute query
public class Products_ByUnitsInStock : AbstractIndexCreationTask<Product>
{
public Products_ByUnitsInStock()
{
Map = products => from product in products
select new
{
UnitsInStock = product.UnitsInStock
};
}
}
Finding the Total Results Count When Paging
While paging, you sometimes need to know the exact number of results returned from the query. The Client API supports this explicitly:
IList<Product> results = session
.Query<Product, Products_ByUnitsInStock>()
.Statistics(out QueryStatistics stats) // fill query statistics
.Where(x => x.UnitsInStock > 10)
.Skip(20)
.Take(10)
.ToList();
int totalResults = stats.TotalResults;
IList<Product> results = session
.Advanced
.DocumentQuery<Product, Products_ByUnitsInStock>()
.Statistics(out QueryStatistics stats) // fill query statistics
.WhereGreaterThan(x => x.UnitsInStock, 10)
.Skip(20)
.Take(10)
.ToList();
int totalResults = stats.TotalResults;
public class Products_ByUnitsInStock : AbstractIndexCreationTask<Product>
{
public Products_ByUnitsInStock()
{
Map = products => from product in products
select new
{
UnitsInStock = product.UnitsInStock
};
}
}
While the query will return with just 10 results, totalResults
will hold the total number of matching documents.
Paging Through Tampered Results
For some queries, the server will skip over some results internally and invalidate the TotalResults
value. When executing a Distinct
query or index producing multiple index entries per document (a fanout index), then TotalResults
will contain the total count of matching documents found, but it will not take into account results that were skipped as a result of the Distinct
operator.
Whenever SkippedResults
is greater than 0 and a query involved some non-stored fields, it implies that we skipped over some results in the index.
In order to do proper paging in those scenarios, you should use SkippedResults
when telling RavenDB how many documents to skip. For each page, the starting point should be .Skip((currentPage * pageSize) + SkippedResults)
.
For example, let's page through all the results:
IList<Product> results;
int pageNumber = 0;
int pageSize = 10;
int skippedResults = 0;
do
{
results = session
.Query<Product, Products_ByUnitsInStock>()
.Statistics(out QueryStatistics stats)
.Skip((pageNumber * pageSize) + skippedResults)
.Take(pageSize)
.Where(x => x.UnitsInStock > 10)
.Distinct()
.ToList();
skippedResults += stats.SkippedResults;
pageNumber++;
}
while (results.Count > 0);
IList<Product> results;
int pageNumber = 0;
int pageSize = 10;
int skippedResults = 0;
do
{
results = session
.Advanced
.DocumentQuery<Product, Products_ByUnitsInStock>()
.Statistics(out QueryStatistics stats)
.Skip((pageNumber * pageSize) + skippedResults)
.Take(pageSize)
.WhereGreaterThan(x => x.UnitsInStock, 10)
.Distinct()
.ToList();
skippedResults += stats.SkippedResults;
pageNumber++;
}
while (results.Count > 0);
public class Products_ByUnitsInStock : AbstractIndexCreationTask<Product>
{
public Products_ByUnitsInStock()
{
Map = products => from product in products
select new
{
UnitsInStock = product.UnitsInStock
};
}
}
IList<Order> results;
int pageNumber = 0;
int pageSize = 10;
int skippedResults = 0;
do
{
results = session
.Query<Order, Orders_ByOrderLines_ProductName>()
.Statistics(out QueryStatistics stats)
.Skip((pageNumber * pageSize) + skippedResults)
.Take(pageSize)
.ToList();
skippedResults += stats.SkippedResults;
pageNumber++;
}
while (results.Count > 0);
IList<Order> results;
int pageNumber = 0;
int pageSize = 10;
int skippedResults = 0;
do
{
results = session
.Advanced
.DocumentQuery<Order, Orders_ByOrderLines_ProductName>()
.Statistics(out QueryStatistics stats)
.Skip((pageNumber * pageSize) + skippedResults)
.Take(pageSize)
.ToList();
skippedResults += stats.SkippedResults;
pageNumber++;
}
while (results.Count > 0);
public class Orders_ByOrderLines_ProductName : AbstractIndexCreationTask<Order>
{
public Orders_ByOrderLines_ProductName()
{
Map = orders => from order in orders
from line in order.Lines
select new
{
Product = line.ProductName
};
}
}
The situation would be different if a Distinct
query and a projection applied to stored fields only. To get the correct results here, you shouldn't include SkippedResults
into the paging formula. Let's take a look at the example (note the usage of Store
method in the index definition):
IList<string> results;
int pageNumber = 0;
const int pageSize = 10;
do
{
results = session
.Query<Orders_ByStoredProductName.Result, Orders_ByStoredProductName>()
.Select(x => x.Product)
.Skip((pageNumber * pageSize))
.Take(pageSize)
.Distinct()
.ToList();
pageNumber++;
}
while (results.Count > 0);
IList<Orders_ByStoredProductName.Result> results;
int pageNumber = 0;
const int pageSize = 10;
do
{
results = session
.Advanced
.DocumentQuery<Order, Orders_ByStoredProductName>()
.SelectFields<Orders_ByStoredProductName.Result>("Product")
.Skip((pageNumber * pageSize))
.Take(pageSize)
.Distinct()
.ToList();
pageNumber++;
}
while (results.Count > 0);
public class Orders_ByStoredProductName : AbstractIndexCreationTask<Order>
{
public class Result
{
public string Product { get; set; }
}
public Orders_ByStoredProductName()
{
Map = orders => from order in orders
from line in order.Lines
select new Result
{
Product = line.ProductName
};
Store("Product", FieldStorage.Yes);
}
}