Indexes: Creating and Deploying Indexes


  • Indexes are used by the server to satisfy queries.
    They are at the heart of RavenDB's efficiency and should be understood before indexes and queries are defined in production.

  • Static indexes can do a number of operations on the data behind the scenes so that queries that use this already processed data are as fast as possible.
    Indexes keep the processed data in a separate storage so that the raw data isn't affected.

  • Whenever a user issues a query that doesn't specify an index, RavenDB's Query Optimizer will try to find an existing auto-index that fulfills the query.

    • If one doesn't yet exist, RavenDB will either create an auto-index or optimize an existing one if it almost satisfies the query.
  • Indexes process data assigned to them as the data changes. For example, if changes are made to documents in the collection "Orders", the indexes that are defined to handle queries on "Orders" will be triggered to update the index with the new data.

    • These behind-the-scenes processes take much of the burden off queries. Indexes also need to process an entire dataset only once, after which they process only new or changed data.
      Still, indexing consumes machine resources, and this should be considered when defining indexes and queries.
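
The "process everything once, then only new data" behavior can be pictured with a small in-memory model. This is a sketch under stated assumptions, not RavenDB's implementation: the class `IncrementalIndexSketch`, its change log, and its position counter are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model (hypothetical, not RavenDB code): the "index" remembers how far
// into a change log it has read and only processes entries added after that.
final class IncrementalIndexSketch {
    private int processedUpTo = 0; // number of change-log entries already indexed
    private int indexedCount = 0;  // total entries indexed so far

    /** Processes only entries added since the previous call; returns how many. */
    int update(List<String> changeLog) {
        int processedNow = 0;
        for (int i = processedUpTo; i < changeLog.size(); i++) {
            indexedCount++;        // stand-in for the real indexing work
            processedNow++;
        }
        processedUpTo = changeLog.size();
        return processedNow;
    }

    int indexedCount() {
        return indexedCount;
    }

    public static void main(String[] args) {
        List<String> orders = new ArrayList<>(Arrays.asList("orders/1", "orders/2", "orders/3"));
        IncrementalIndexSketch index = new IncrementalIndexSketch();
        System.out.println(index.update(orders)); // first pass covers the whole dataset
        orders.add("orders/4");
        System.out.println(index.update(orders)); // later passes cover only what is new
    }
}
```

The point of the model is the cost profile: the first call is proportional to the dataset size, while later calls are proportional to the number of changes.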

Auto and Static Indexes

  • Indexes created by issuing a query are called dynamic or Auto indexes.
    • They can be easily identified. Their name starts with the Auto/ prefix.
    • If no Auto-Index exists to satisfy a query, a new Auto-Index will be created and maintained automatically.
  • Indexes created explicitly by the user are called static.

Static indexes

There are a couple of ways to create a static index and send it to the server. We can use maintenance operations or create a custom class.


Using AbstractIndexCreationTask

AbstractIndexCreationTask lets you avoid hard-coding index names in every query.

Note

We recommend creating and using indexes in this form due to its simplicity. There are many benefits and few disadvantages.

Naming Convention

There is only one naming convention: each _ in the class name will be translated to / in the index name.

e.g.

In the Northwind samples, there is an index called Orders/Totals. To get such an index name, we need to create a class called Orders_Totals.

public class Orders_Totals extends AbstractIndexCreationTask {
    /// ...
}
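
The naming rule itself is a plain character substitution. The sketch below only mirrors the documented rule; `IndexNameSketch` and `indexNameFor` are hypothetical helpers written for illustration and are not part of the RavenDB client.

```java
// Illustrative only: reproduces the documented convention that each '_' in
// the class name becomes '/' in the index name. Hypothetical helper, not a
// RavenDB client API.
final class IndexNameSketch {
    static String indexNameFor(String className) {
        return className.replace('_', '/');
    }

    public static void main(String[] args) {
        System.out.println(indexNameFor("Orders_Totals")); // prints "Orders/Totals"
    }
}
```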

Sending to Server

An index is of little use until it is deployed to the server. To deploy it, create an instance of the class that inherits from AbstractIndexCreationTask and call its execute method.

// deploy index to database defined in `DocumentStore.getDatabase` method
// using default DocumentStore `conventions`
new Orders_Totals().execute(store);

// deploy index to `Northwind` database
// using default DocumentStore `conventions`
new Orders_Totals().execute(store, store.getConventions(), "Northwind");

Safe By Default

If an index exists on the server and the stored definition is the same as the one that was sent, it will not be overwritten. The indexed data will not be deleted and indexing will not start from scratch.
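
To make the rule concrete, here is a toy in-memory model of the decision described above. It is a sketch under assumptions: `SafeDeploySketch`, its map of stored definitions, and the plain string comparison are hypothetical stand-ins, and the server's actual comparison logic may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (hypothetical, not RavenDB server code): redeploying an identical
// definition is a no-op, so existing indexed data is left alone.
final class SafeDeploySketch {
    // index name -> stored definition (stand-in for the server's state)
    private final Map<String, String> stored = new HashMap<>();
    private int rebuilds = 0;

    /** Returns true only if the definition was written and indexing (re)starts. */
    boolean deploy(String name, String definition) {
        if (definition.equals(stored.get(name))) {
            return false; // same definition: nothing is overwritten or reindexed
        }
        stored.put(name, definition);
        rebuilds++;
        return true;
    }

    int rebuilds() {
        return rebuilds;
    }

    public static void main(String[] args) {
        SafeDeploySketch server = new SafeDeploySketch();
        System.out.println(server.deploy("Orders/Totals", "map v1")); // true
        System.out.println(server.deploy("Orders/Totals", "map v1")); // false
    }
}
```

In practice this is why calling execute on every application startup is harmless when the definition has not changed.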

Creating an Index with Custom Configuration

If you need to create an index with custom configuration, you can set the configuration options in the index class constructor like so:

public class Orders_Totals extends AbstractIndexCreationTask {
    public Orders_Totals() {
        // ...
        configuration.put("MapTimeoutInSec","30");
        setConfiguration(configuration);
    }
}

Example

public static class Orders_Totals extends AbstractIndexCreationTask {
    public static class Result {
        private String employee;
        private String company;
        private double total;

        public String getEmployee() {
            return employee;
        }

        public void setEmployee(String employee) {
            this.employee = employee;
        }

        public String getCompany() {
            return company;
        }

        public void setCompany(String company) {
            this.company = company;
        }

        public double getTotal() {
            return total;
        }

        public void setTotal(double total) {
            this.total = total;
        }
    }

    public Orders_Totals() {
        map = "docs.Orders.Select(order => new { " +
            "    Employee = order.Employee, " +
            "    Company = order.Company, " +
            "    Total = Enumerable.Sum(order.Lines, l => ((decimal)((((decimal) l.Quantity) * l.PricePerUnit) * (1M - l.Discount)))) " +
            "})";
    }

    public static void main(String[] args) {
        try (IDocumentStore store = new DocumentStore(new String[]{ "http://localhost:8080" }, "Northwind")) {
            store.initialize();

            new Orders_Totals().execute(store);

            try (IDocumentSession session = store.openSession()) {
                List<Order> orders = session
                    .query(Result.class, Orders_Totals.class)
                    .whereGreaterThan("Total", 100)
                    .ofType(Order.class)
                    .toList();
            }
        }
    }
}

Using Maintenance Operations

The PutIndexesOperation maintenance operation (its API reference can be found here) can also be used to send one or more indexes to the server.

The benefit of this approach is that you can choose the name as you see fit and change the various settings available in IndexDefinition. The drawback is that you will have to use string-based index names when querying.

IndexDefinition indexDefinition = new IndexDefinition();
indexDefinition.setName("Orders/Totals");
indexDefinition.setMaps(Collections.singleton(
    "from order in docs.Orders " +
    " select new " +
    " { " +
    "    order.employee, " +
    "    order.company, " +
    "    total = order.lines.Sum(l => (l.quantity * l.pricePerUnit) * (1 - l.discount)) " +
    "}"
));

store
    .maintenance()
    .send(new PutIndexesOperation(indexDefinition));

IndexDefinitionBuilder

IndexDefinitionBuilder is a very useful class that enables you to create IndexDefinitions using strongly-typed syntax with access to low-level settings not available when the AbstractIndexCreationTask approach is used.

IndexDefinitionBuilder builder = new IndexDefinitionBuilder();
builder.setMap(
        "from order in docs.Orders \n" +
                "select new \n" +
                " {\n" +
                "    order.employee,\n" +
                "    order.company,\n" +
                "    total = order.lines.Sum(l => (l.quantity * l.pricePerUnit) * (1 - l.discount))\n" +
                "}");

store.maintenance()
        .send(new PutIndexesOperation(builder.toIndexDefinition(store.getConventions())));

Remarks

Information

The Maintenance Operations and IndexDefinitionBuilder approaches are not recommended and should be used only if you cannot achieve the same result by inheriting from AbstractIndexCreationTask.

Side-by-Side

Since RavenDB 4.0, all index updates are side-by-side by default. The new index will replace the existing one once it becomes non-stale. If you want to force an index to swap immediately, you can do so from the Studio.
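
The side-by-side behavior can be pictured as a small state machine: the old definition keeps answering queries until the replacement has caught up. The sketch below is a hypothetical model for illustration only; `SideBySideSketch` and its method names are not RavenDB APIs.

```java
// Toy model (hypothetical, not RavenDB code): an updated index runs next to
// the existing one and takes over only once it is no longer stale.
final class SideBySideSketch {
    private String active;      // definition currently answering queries
    private String replacement; // pending definition, or null if none

    SideBySideSketch(String initialDefinition) {
        this.active = initialDefinition;
    }

    /** Registers an updated definition without touching the active index. */
    void deployUpdate(String newDefinition) {
        replacement = newDefinition;
    }

    /** Called when the replacement has indexed all data (is non-stale). */
    void replacementBecameNonStale() {
        if (replacement != null) {
            active = replacement;
            replacement = null;
        }
    }

    String servingDefinition() {
        return active;
    }

    boolean hasPendingReplacement() {
        return replacement != null;
    }

    public static void main(String[] args) {
        SideBySideSketch index = new SideBySideSketch("v1");
        index.deployUpdate("v2");
        System.out.println(index.servingDefinition()); // still "v1" while v2 catches up
        index.replacementBecameNonStale();
        System.out.println(index.servingDefinition()); // now "v2"
    }
}
```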

Auto indexes

Auto-indexes are created when a query that does not specify an index name is executed and, after in-depth query analysis, no matching auto-index is found on the server side.

Note

The query optimizer doesn't take into account the static indexes when it determines what index should be used to handle a query.

Naming Convention

Auto-indexes can be recognized by the Auto/ prefix in their name. Their name also contains the name of the collection that was queried and a list of the fields that were required to find valid query results.

For instance, issuing a query like this

List<Employee> employees = session
    .query(Employee.class)
    .whereEquals("firstName", "Robert")
    .andAlso()
    .whereEquals("lastName", "King")
    .toList();

or, in RQL:

from Employees
where FirstName = 'Robert' and LastName = 'King'

will result in the creation of an index named Auto/Employees/ByFirstNameAndLastName.
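
The name above can be reproduced from the collection name and the queried fields. The following is a minimal sketch that covers only the simple equality case shown here; `AutoIndexNameSketch` is a hypothetical helper, and the server may apply further rules (for example for other query shapes) that this does not model.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: rebuilds the auto-index name from the example above.
// Hypothetical helper, not a RavenDB API; real naming may involve more rules.
final class AutoIndexNameSketch {
    static String autoIndexName(String collection, List<String> fields) {
        return "Auto/" + collection + "/By" + String.join("And", fields);
    }

    public static void main(String[] args) {
        // prints "Auto/Employees/ByFirstNameAndLastName"
        System.out.println(autoIndexName("Employees", Arrays.asList("FirstName", "LastName")));
    }
}
```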

Auto Indexes and Indexing State

To reduce server load, auto-indexes that are not queried for the period defined in the Indexing.TimeToWaitBeforeMarkingAutoIndexAsIdleInMin setting (30 minutes by default) are marked as Idle. You can read more about the implications of marking an index as Idle here.

Setting this configuration option to a high value may degrade performance, because auto-indexes that are no longer queried will keep doing indexing work that nothing needs. This is not a recommended configuration.
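
The idle rule amounts to comparing the time since the last query against the configured threshold. The sketch below illustrates that comparison only; `IdleCheckSketch` is a hypothetical helper, and details such as whether the server uses a strict comparison or checks additional conditions are assumptions.

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative only: models the "not queried for N minutes" rule.
// Hypothetical helper, not RavenDB code; the strict '>' is an assumption.
final class IdleCheckSketch {
    // Default taken from the text above (30 minutes).
    static final Duration DEFAULT_IDLE_THRESHOLD = Duration.ofMinutes(30);

    static boolean shouldMarkIdle(Instant lastQueried, Instant now, Duration threshold) {
        return Duration.between(lastQueried, now).compareTo(threshold) > 0;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        Instant lastQueried = now.minus(Duration.ofMinutes(45));
        System.out.println(shouldMarkIdle(lastQueried, now, DEFAULT_IDLE_THRESHOLD));
    }
}
```

With a larger threshold the same unqueried index stays active longer, which is the extra work the warning above refers to.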

If Indexes Exhaust System Resources

  • Indexes process data assigned to them as the data changes. For example, if changes are made in the collection "Orders", the indexes that are defined to handle queries on "Orders" will be triggered to update.
  • These processes utilize machine resources.
    If indexing drains system resources, it usually means either that the indexes were defined in a way that causes inefficient processing, or that your license, cloud instance, or hardware needs to be adjusted to meet your usage needs.