When it comes to index creation, the only difference between simple indexes and the map-reduce ones
is an additional reduce function defined in the index definition.
To deploy an index, we need to create a definition and deploy it using one of the methods described in the
creating and deploying article.
Example I - Count
Let's assume that we want to count the number of products for each category.
To do it, we can create the following index using LoadDocument inside:
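The logic of such an index can be sketched in plain Python. This is only an illustration of the map and reduce steps, not RavenDB index syntax: the sample documents and the names `load_document`, `map_products`, and `reduce_counts` are hypothetical, and the sketch assumes each product stores its category's document ID in a `Category` field.

```python
from collections import defaultdict

# Hypothetical sample data; IDs and field names are made up for illustration.
categories = {
    "categories/1": {"Name": "Beverages"},
    "categories/2": {"Name": "Condiments"},
}
products = [
    {"Name": "Chai", "Category": "categories/1"},
    {"Name": "Chang", "Category": "categories/1"},
    {"Name": "Aniseed Syrup", "Category": "categories/2"},
]


def load_document(doc_id):
    """Stand-in for LoadDocument: fetch a related document by its ID."""
    return categories[doc_id]


def map_products(docs):
    """Map step: emit one entry per product, carrying a count of 1."""
    for product in docs:
        category = load_document(product["Category"])
        yield {"Category": category["Name"], "Count": 1}


def reduce_counts(entries):
    """Reduce step: group entries by category name and sum their counts."""
    totals = defaultdict(int)
    for entry in entries:
        totals[entry["Category"]] += entry["Count"]
    return [{"Category": name, "Count": count} for name, count in sorted(totals.items())]


results = reduce_counts(map_products(products))
print(results)
```

The map step resolves the category name through the loaded document, so the reduce key is the category name rather than its document ID.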
In addition to storing the aggregation results in the index, the map-reduce index can also output
those reduce results as documents to a specified collection. In order to create these documents,
called "artificial", you need to define the target collection using the outputReduceToCollection
property in the index definition.
Writing map-reduce outputs into documents allows you to define additional indexes on top of them
that give you the option to create recursive map-reduce operations. This makes it cheap and easy
to, for example, recursively create daily, monthly, and yearly summaries on the same data.
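As a rough sketch of why chaining is cheap, the plain-Python example below feeds the output of each aggregation level into the next one. It does not use RavenDB; the `summarize` helper, the invoice fields, and the date format are assumptions made for the illustration.

```python
from collections import defaultdict


def summarize(docs, key_of):
    """Group documents by key_of(doc) and sum their Amount field."""
    totals = defaultdict(int)
    for doc in docs:
        totals[key_of(doc)] += doc["Amount"]
    return [{"Date": key, "Amount": amount} for key, amount in sorted(totals.items())]


# Hypothetical source documents.
invoices = [
    {"Date": "2024-01-05", "Amount": 10},
    {"Date": "2024-01-05", "Amount": 5},
    {"Date": "2024-01-20", "Amount": 20},
    {"Date": "2024-02-01", "Amount": 7},
]

# Each level reads the previous level's output documents, not the raw invoices.
daily = summarize(invoices, lambda d: d["Date"])        # key: YYYY-MM-DD
monthly = summarize(daily, lambda d: d["Date"][:7])     # key: YYYY-MM
yearly = summarize(monthly, lambda d: d["Date"][:4])    # key: YYYY

print(daily, monthly, yearly, sep="\n")
```

Each higher level only has to process the already-reduced documents of the level below it, which is typically far fewer than the raw documents.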
In addition, you can also apply the usual operations on artificial documents (e.g. data
subscriptions or ETL).
If the aggregation value for a given reduce key changes, we overwrite the output document. If the
given reduce key no longer has a result, the output document will be removed.
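The overwrite-and-remove behavior can be pictured with a small plain-Python model. It is a sketch of the rule described above, not of how the server implements it; `sync_output_documents` and the dictionary layout are hypothetical.

```python
def sync_output_documents(collection, reduce_results):
    """Make the output collection mirror the current reduce results.

    collection:     dict mapping reduce key -> existing output document
    reduce_results: dict mapping reduce key -> current aggregation result
    """
    # Create or overwrite the output document for every current reduce key.
    for key, result in reduce_results.items():
        collection[key] = dict(result)

    # Remove output documents whose reduce key no longer has a result.
    for key in list(collection):
        if key not in reduce_results:
            del collection[key]

    return collection


output_collection = {
    "Beverages": {"Category": "Beverages", "Count": 2},
    "Condiments": {"Category": "Condiments", "Count": 1},
}

# After re-indexing: Beverages changed, Condiments has no products left.
current_results = {"Beverages": {"Category": "Beverages", "Count": 3}}

sync_output_documents(output_collection, current_results)
print(output_collection)
```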
To help organize these output documents, the map-reduce index can also create an additional
collection of artificial reference documents. These documents aggregate the output documents
and store their document IDs in an array field ReduceOutputs.
The document IDs of reference documents follow a pattern that you define. The format you
give to their document ID also determines how the output documents are grouped.
Because reference documents have well-known, predictable IDs, they are easier to plug into
indexes and other operations, and can serve as an intermediary for the output documents whose
IDs are less predictable. This allows you to chain map-reduce indexes in a recursive fashion,
see Example II.
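To illustrate how the ID pattern groups output documents, the plain-Python sketch below builds reference documents from a set of output documents. The output document IDs, the `sales/monthly/` pattern, and the helper names are hypothetical; only the `ReduceOutputs` field name comes from the text above.

```python
from collections import defaultdict

# Hypothetical output documents with hard-to-predict IDs.
output_docs = {
    "DailySales/abc123": {"Date": "2024-01-05", "Total": 15},
    "DailySales/def456": {"Date": "2024-01-20", "Total": 20},
    "DailySales/ghi789": {"Date": "2024-02-01", "Total": 7},
}


def reference_id(doc):
    """Assumed pattern: one reference document per month of the output's Date."""
    return "sales/monthly/" + doc["Date"][:7]


def build_references(outputs):
    """Group output document IDs under the reference ID their fields produce."""
    grouped = defaultdict(list)
    for doc_id, doc in sorted(outputs.items()):
        grouped[reference_id(doc)].append(doc_id)
    return {ref_id: {"ReduceOutputs": ids} for ref_id, ids in grouped.items()}


references = build_references(output_docs)
print(references)
```

Because the pattern here uses only the year and month, all daily outputs of one month land in the same reference document; a pattern that included the day would instead produce one reference document per output.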
Optional collection name for the reference documents - by default it is <outputReduceToCollection>/References.
Document ID format for reference documents. This ID references the fields of the reduce function output, which determines how the output documents are aggregated. The type of this parameter differs depending on whether the index is created using IndexDefinition or AbstractIndexCreationTask.
It is forbidden to output reduce results to collections such as the following:
A collection that the current index is already working on.
E.g., an index on a DailyInvoices collection outputs to DailyInvoices.
A collection that the current index is loading a document from.
E.g., an index with LoadDocument(id, "Invoices") outputs to Invoices.
Two collections, each processed by a map-reduce index,
when each index outputs to the other collection.
E.g., an index on the Invoices collection outputs to the DailyInvoices collection,
while an index on DailyInvoices outputs to Invoices.
When an attempt to create such an infinite indexing loop is
detected, a detailed error is generated.
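The three forbidden cases can be expressed as a small validation routine. This is a plain-Python sketch of the rules listed above, not the server's actual check; the dictionary shape (`sources`, `loads`, `output`) and the function name are assumptions made for the example.

```python
def validate_output(index, existing_indexes):
    """Raise ValueError if the index's output collection would create a loop."""
    out = index["output"]

    # Case 1: outputting to a collection the index itself maps over.
    if out in index["sources"]:
        raise ValueError(f"{index['name']} already works on {out}")

    # Case 2: outputting to a collection the index loads documents from.
    if out in index["loads"]:
        raise ValueError(f"{index['name']} loads documents from {out}")

    # Case 3: two indexes that output into each other's source collection.
    for other in existing_indexes:
        if other["output"] in index["sources"] and out in other["sources"]:
            raise ValueError(f"{index['name']} and {other['name']} feed each other")


daily = {"name": "Daily", "sources": {"Invoices"}, "loads": set(), "output": "DailyInvoices"}
back = {"name": "Back", "sources": {"DailyInvoices"}, "loads": set(), "output": "Invoices"}

validate_output(daily, [])  # accepted: no loop yet

try:
    validate_output(back, [daily])
except ValueError as error:
    print("rejected:", error)
```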
Output to an Existing collection
Creating a map-reduce index that defines an output collection which already
exists and contains documents will result in an error.
Delete all documents from the target collection before creating the index,
or output results to a different collection.
Modification of Artificial Documents
Artificial documents can be loaded and queried just like regular documents.
However, it is not recommended to edit artificial documents manually, because
any update to the index results overwrites all manual modifications made to them.
Map-Reduce Indexes on a Sharded Database
On a sharded database, the behavior of map-reduce
indexes is altered in a few ways that database operators should be aware of.
Read here about map-reduce indexes on a sharded database.
Read here about querying
map-reduce indexes on a sharded database.