You are currently browsing the legacy 5.3 version of the documentation.

Create Map-Reduce Index

Map-Reduce indexes let you perform complex data aggregations that can be queried at very little cost, regardless of the data size.
The aggregation is done during the indexing phase, not at query time.
When new data comes into the database, or existing documents are modified,
the Map-Reduce index recalculates the aggregated data,
so the aggregation results are always available and up to date.
The aggregation is computed in two consecutive stages: the Map and the Reduce.
The Map stage:
This first stage runs the defined Map function(s) on each document, indexing the specified fields.
The Reduce stage:
This second stage groups the entries that were indexed in the Map stage by the specified fields,
and then runs the Reduce function to get a final aggregation result per field value.
The Map-Reduce results can be visualized in the Map-Reduce Visualizer.
The Map Stage

Define a Map Function:

In the following example, we want to get these aggregated values:
- The number of orders each company has made
- The total amount each company has spent on all its orders
Let's define the following Map function on the Orders collection:

The Map Function

Index Name - An index name can be composed of letters, digits, ., /, -, and _. The name must be unique in the scope of the database.
- Uniqueness is evaluated in a case-insensitive way - you can't create indexes named both usersbyname and UsersByName.
- The characters _ and / are treated as equivalent - you can't create indexes named both users/byname and users_byname.
- If the index name contains the character ., it must have some other character on both sides to be valid. /./ is a valid index name, but ./, /., and /../ are all invalid.
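The naming rules above can be sketched as a small check. This is only an illustration of the rules as stated on this page, not RavenDB's own validation code; the function names are hypothetical, and "letters" is assumed to mean ASCII letters.

```python
import re

# Characters the rules above allow in an index name.
# Assumption: "letters" means ASCII letters only.
_ALLOWED = re.compile(r"[A-Za-z0-9./_-]+")


def is_valid_index_name(name: str) -> bool:
    """Check the allowed characters and the '.' placement rule for one name."""
    if not _ALLOWED.fullmatch(name):
        return False
    for i, ch in enumerate(name):
        if ch != ".":
            continue
        # A '.' may not be the first or last character ...
        if i == 0 or i == len(name) - 1:
            return False
        # ... and needs some other (non-'.') character on both sides.
        if name[i - 1] == "." or name[i + 1] == ".":
            return False
    return True


def uniqueness_key(name: str) -> str:
    """Two names with the same key collide: case is ignored, '_' equals '/'."""
    return name.lower().replace("_", "/")
```

Two names may not coexist in the same database when `uniqueness_key` returns the same value for both, as is the case for users/byname and users_byname.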
The Map function defines the following 3 fields that will be indexed:
- Company (order.Company) -
  The company that made the order.
- OrdersCount -
  In the Map stage, per single Order document, the value of this field is '1',
  as each order document in the Orders collection was made by a single, specific company.
  This field will be aggregated later in the Reduce stage, accumulating the data from all the Orders documents, per company.
  The accumulated value of this field will represent the number of all orders a company has made.
- TotalOrdersAmount -
  In the Map stage, per single Order document, the value of this field is the total order amount for that document
  (the sum of all products in the 'Lines' field in the document, taking the discount into account).
  This field will be aggregated later in the Reduce stage, accumulating the data from all the Orders documents, per company.
  The accumulated value of this field will represent the total amount spent by a company on all orders.
Next, click 'Add Reduction' to continue and add the 'Reduce' function. See The Reduce Stage.
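To make the three fields concrete, here is a plain Python sketch of what the Map stage computes for a single Orders document. It is not the index definition entered in the Studio. The line field names Quantity, PricePerUnit, and Discount are assumptions about the document shape, and Discount is assumed to be a fraction between 0 and 1.

```python
def map_order(order: dict) -> dict:
    """Emit one index entry for one Orders document."""
    # Assumed line shape: Quantity, PricePerUnit, and a fractional Discount.
    total = sum(
        line["Quantity"] * line["PricePerUnit"] * (1 - line["Discount"])
        for line in order["Lines"]
    )
    return {
        "Company": order["Company"],
        "OrdersCount": 1,            # each order counts once for its company
        "TotalOrdersAmount": total,  # amount of this single order only
    }


# A made-up order document, for illustration only.
order = {
    "Company": "companies/1-A",
    "Lines": [
        {"Quantity": 2, "PricePerUnit": 10.0, "Discount": 0.0},
        {"Quantity": 1, "PricePerUnit": 20.0, "Discount": 0.5},
    ],
}
entry = map_order(order)
```

At this point nothing has been aggregated yet: every order produces its own entry, and the per-company totals only appear after the Reduce stage.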

The Reduce Stage

Define a Reduce Function:

The Reduce Function

- In the Reduce function above, results are grouped by the Company field,
  so that we can get the data per company. (group result by result.Company)
- The index results will show in the following format:
  - Company - will be the company for which we see the results.
  - OrdersCount - is the aggregation of the orders count value from the Map stage
    (How many orders were made by each company).
  - TotalOrdersAmount - is the aggregation of the total orders amount made by each company
    (How much money the company has spent all together, on all orders).
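The grouping and summing described above can be sketched in plain Python as follows. This is an illustration of the Reduce logic, not the function entered in the Studio; the sample entries are made up.

```python
from collections import defaultdict


def reduce_entries(entries):
    """Group Map entries by Company and sum both aggregated fields."""
    groups = defaultdict(lambda: {"OrdersCount": 0, "TotalOrdersAmount": 0})
    for e in entries:
        g = groups[e["Company"]]
        g["OrdersCount"] += e["OrdersCount"]
        g["TotalOrdersAmount"] += e["TotalOrdersAmount"]
    # One result per company, with the same three fields the Map stage emits.
    return [{"Company": company, **g} for company, g in groups.items()]


# Made-up Map-stage entries: two orders for one company, one for another.
map_entries = [
    {"Company": "companies/1-A", "OrdersCount": 1, "TotalOrdersAmount": 30},
    {"Company": "companies/2-A", "OrdersCount": 1, "TotalOrdersAmount": 5},
    {"Company": "companies/1-A", "OrdersCount": 1, "TotalOrdersAmount": 12},
]
results = reduce_entries(map_entries)
```

Here `results` holds one entry per company, for example 2 orders and a total of 42 for companies/1-A.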
Optional: The results of the Map-Reduce index can be saved in a new collection.
Learn more in Saving Map-Reduce Results in a Collection (Artificial Documents)

Important Guidelines

Both the Map and the Reduce functions must be pure functions with no external input.
That is, calling them with the same input must always return the same output,
so Random, DateTime.Now, and similar calls are not allowed.
The Reduce output must have the same structure as the Map output.
RavenDB will raise an error if the two functions produce different shapes.
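One way to see why the shapes must match: the Reduce function may be applied to its own earlier output as well as to fresh Map entries, so both have to look the same. The Python sketch below illustrates that property under this assumption. It does not show RavenDB internals, and the batches are made up.

```python
from itertools import groupby


def reduce_entries(entries):
    """Pure reduce: same input always gives the same output, same shape in and out."""
    by_company = lambda e: e["Company"]
    results = []
    for company, group in groupby(sorted(entries, key=by_company), key=by_company):
        group = list(group)
        results.append({
            "Company": company,
            "OrdersCount": sum(e["OrdersCount"] for e in group),
            "TotalOrdersAmount": sum(e["TotalOrdersAmount"] for e in group),
        })
    return results


# Two made-up batches of Map entries.
batch_1 = [
    {"Company": "companies/1-A", "OrdersCount": 1, "TotalOrdersAmount": 30},
    {"Company": "companies/2-A", "OrdersCount": 1, "TotalOrdersAmount": 5},
]
batch_2 = [
    {"Company": "companies/1-A", "OrdersCount": 1, "TotalOrdersAmount": 12},
]

# Reducing everything at once, or reducing partial results again, must agree.
all_at_once = reduce_entries(batch_1 + batch_2)
in_steps = reduce_entries(reduce_entries(batch_1) + reduce_entries(batch_2))
```

If the Reduce function returned a different set of fields than the Map function, the second call in `in_steps` would have no OrdersCount or TotalOrdersAmount to read.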

Map-Reduce Query Results

Map-Reduce Query Result

In the query results, the number of orders per company is represented in the OrdersCount column.
The total amount of all orders per company is represented in the TotalOrdersAmount column.
The column names correspond to the Map-Reduce fields definition.
The Map-Reduce results can also be visualized in Map-Reduce Visualizer.

Multi-Map-Reduce

Multi-Map-Reduce indexes allow us to aggregate data from multiple collections.
In the example below we define three maps, on the Companies, Suppliers, and Employees collections.
In each map, we output a count for the type of document being mapped, as well as the relevant City.

Define Multi Maps

In the Reduce stage we group by City and then sum up the results from all the maps,
to get the final count per city for each collection.

The Multi-Map-Reduce
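The idea can be sketched in plain Python: three map functions that emit entries of one common shape, followed by a single reduce. The counter field names (Companies, Suppliers, Employees) and the Address.City document shape are assumptions made for this illustration, as are the sample documents.

```python
from collections import defaultdict

COUNTERS = ("Companies", "Suppliers", "Employees")


def make_map(counter):
    """Build the map function for one collection; all maps emit the same shape."""
    def map_doc(doc):
        entry = {"City": doc["Address"]["City"]}
        # Count 1 for the type being mapped, 0 for the other two.
        entry.update({name: int(name == counter) for name in COUNTERS})
        return entry
    return map_doc


def reduce_by_city(entries):
    """Group entries from all maps by City and sum each counter."""
    totals = defaultdict(lambda: dict.fromkeys(COUNTERS, 0))
    for e in entries:
        for name in COUNTERS:
            totals[e["City"]][name] += e[name]
    return [{"City": city, **counts} for city, counts in sorted(totals.items())]


# Made-up documents from the three collections.
companies = [{"Address": {"City": "London"}}, {"Address": {"City": "Paris"}}]
suppliers = [{"Address": {"City": "London"}}]
employees = [{"Address": {"City": "London"}}, {"Address": {"City": "London"}}]

entries = (
    [make_map("Companies")(d) for d in companies]
    + [make_map("Suppliers")(d) for d in suppliers]
    + [make_map("Employees")(d) for d in employees]
)
city_counts = reduce_by_city(entries)
```

Because every map emits all three counters, the reduce can treat entries from the different collections uniformly.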

Saving Map-Reduce Results in a Collection (Artificial Documents)

The results of the Map-Reduce index can be saved as output documents in a new output collection.
These output documents can be grouped further by reference documents, which are documents that contain the document IDs of output documents.
The documents created by Map-Reduce indexes are called Artificial Documents.
Learn more about using Artificial Documents from the client code in Map-Reduce Indexes: Reduce Results as Artificial Documents.

Save Map-Reduce Results into a Collection

1. Specify the name of the collection in which the output documents will be saved.
   Note: the specified collection must be empty (contain no documents).
2. Specify a pattern for the reference document IDs. By including reduce function fields, this pattern determines which output documents will be included in each reference document.
3. Specify the name of the collection for the reference documents. By default, this is <output collection name>/Reference.
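How a pattern decides which output documents share a reference document can be sketched as follows. Everything here is hypothetical: the `{Field}` placeholder syntax is borrowed from Python's `str.format` for the illustration, and the output document IDs, the Year field, and the pattern are made up. The sketch only shows the grouping idea: outputs whose fields produce the same reference ID end up listed in the same reference document.

```python
from collections import defaultdict


def build_reference_docs(outputs, pattern):
    """Group output document IDs by the reference ID the pattern yields.

    outputs: output document ID -> reduce fields of that document
    pattern: hypothetical ID pattern with {Field} placeholders
    """
    references = defaultdict(list)
    for doc_id, fields in outputs.items():
        reference_id = pattern.format(**fields)
        references[reference_id].append(doc_id)
    return dict(references)


# Made-up output documents of a hypothetical index reduced by Company and Year.
outputs = {
    "out/1": {"Company": "companies/1-A", "Year": 2020},
    "out/2": {"Company": "companies/2-A", "Year": 2020},
    "out/3": {"Company": "companies/1-A", "Year": 2021},
}
reference_docs = build_reference_docs(outputs, "orders/yearly/{Year}")
```

With this pattern the two 2020 outputs are listed together under orders/yearly/2020, while the 2021 output gets its own reference document.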

An Artificial Document in the collection CompaniesOrders

A Reference Document in the collection CompaniesOrders/Reference

Artificial Documents -vs- Regular Documents

Artificial documents are created directly by the index.
They behave just like standard documents, except that they are not replicated to other nodes in the database group.
Artificial documents are updated whenever the index completes indexing a batch of documents.

While artificial documents can be loaded and queried just like regular documents, editing them manually is not recommended, since any update of the index results will overwrite the manual modifications.

Artificial Documents Usage

You can set up indexes on top of the Artificial Documents collection, including additional Map-Reduce indexes,
giving you the option to create recursive Map-Reduce operations.
You can set up a RavenDB ETL Task on the Artificial Documents collection to send its documents to a dedicated database on a separate cluster for further processing. Other ongoing tasks, such as SQL ETL and Subscriptions, can be used as well.

Limitations

- RavenDB will detect and generate an error if you have a cycle of artificial documents.
  You can't define another Map-Reduce index that outputs artificial documents if that would trigger (directly or indirectly) the same index.
  Otherwise, the indexes could end up running in an infinite loop.
- An empty collection must be used as the target collection for the artificial documents.
  This is mandatory since the Map-Reduce index overwrites any existing document in the collection.
- You have no control over the IDs of the artificial documents.
  These identifiers are generated by RavenDB based on the hash of the reduce key.
- Artificial documents are not sent over replication.
  Each node in the database group has its own (independent) copy of the index results.
  Therefore:
  1. It is recommended to use artificial documents with Subscriptions only on a single node.
     A Subscription failover to another node may cause the subscription to send Artificial Documents
     that the subscription has already acknowledged.
  2. Artificial documents cannot use Revisions or Attachments.
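The cycle limitation can be pictured as a graph problem: each index reads some collections and may write artificial documents to one output collection, and a cycle exists when following those outputs leads back to an index already visited. The sketch below is a generic depth-first cycle check over a hypothetical description of the indexes. It is not RavenDB's detection code, and the index and collection names are made up.

```python
def has_cycle(indexes):
    """indexes: index name -> (set of source collections, output collection or None)."""
    # Edge A -> B when B reads the collection that A writes artificial documents to.
    edges = {
        a: [b for b, (sources, _) in indexes.items() if out is not None and out in sources]
        for a, (_, out) in indexes.items()
    }
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached an index that is still being explored
        visiting.add(node)
        found = any(visit(nxt) for nxt in edges[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(name) for name in indexes)


# A chain is fine: Orders -> CompaniesOrders -> (no further output).
chain = {
    "Orders/ByCompany": ({"Orders"}, "CompaniesOrders"),
    "CompaniesOrders/Totals": ({"CompaniesOrders"}, None),
}

# A loop is not: the second index writes to a collection the first one reads.
loop = {
    "Orders/ByCompany": ({"Orders", "Summaries"}, "CompaniesOrders"),
    "CompaniesOrders/Totals": ({"CompaniesOrders"}, "Summaries"),
}
```

In the `loop` setup every update of one index would change the input of the other, which is the infinite-loop situation the limitation guards against.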
