Querying Unlike Documents Using a Multi-Map Index

by Ronnie Overby

Learn how to take advantage of Multi-Map Indexes to query different kinds of documents as if they were the same.

Introduction

A typical database stores many different kinds of things, and RavenDB databases are no exception. This is why RavenDB has the notion of Collections, which you can see in the Management Studio: they group documents by kind, which makes them easier to browse.

There may be a need to query the various kinds of documents as if they were the same. Take, for example, a web site using RavenDB to store its data. The developer of the web site has created these classes to represent the site's entities:

[Figure: Entity Classes Diagram]

Notice that these classes have varying properties and are not presently sharing a base class or interface. Now, assume the requirement that an end user of the web site can search all of the Page, BlogPost, and Event documents in the system from a single text box. How might this be implemented?

One possibility is to execute three separate queries (one for each entity type) and then aggregate the result sets for the end user to view. That may work, but it would be neither efficient nor elegant. Another possibility is to create a multi-map index that can be queried directly. Let's take a look at that approach.

Multi-Map Indexes

"Multi-map" may sound complicated, but it is actually simple. Let's first get the computer-sciency terminology out of the way:

  • Map - Think of a map as a function or a transformation that is applied to every document that we are going to index.
  • Multi-Map - This just refers to the fact that we are going to map more than one kind of document (three in this case).
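
The two terms above can be illustrated with plain in-memory LINQ, without RavenDB. This is only a rough sketch of the idea: the Page and Event classes here are trimmed-down stand-ins, and MapPages, MapEvents, and MapSketch are hypothetical names used for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down stand-ins for two of the site's entities (illustration only).
class Page { public string Title; public string Body; }
class Event { public string Title; public string Details; }

static class MapSketch
{
    // A "map": one function applied to every document of a given kind.
    static IEnumerable<object[]> MapPages(IEnumerable<Page> pages)
    {
        return from p in pages select new object[] { p.Body, p.Title };
    }

    static IEnumerable<object[]> MapEvents(IEnumerable<Event> events)
    {
        return from e in events select new object[] { e.Details, e.Title };
    }

    static void Main()
    {
        var pages = new[] { new Page { Title = "About", Body = "Who we are" } };
        var events = new[] { new Event { Title = "Meetup", Details = "Bring a laptop" } };

        // "Multi-map": the outputs of several maps end up in one set of entries.
        List<object[]> entries = MapPages(pages).Concat(MapEvents(events)).ToList();

        foreach (object[] entry in entries)
        {
            string[] parts = entry.Select(v => v.ToString()).ToArray();
            Console.WriteLine(string.Join(" | ", parts));
        }
    }
}
```

In RavenDB the maps run on the server as part of the index rather than in application code, but the shape of the transformation is the same: unlike documents go in, and entries with a common structure come out.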

To put it plainly, we are going to use a multi-map index to transform the Page, BlogPost, and Event documents into something similar. This will make them easy to search against. The first step is to look at each kind of document and determine what needs to be indexed. In the case of our web site search functionality, we want to search over the highlighted fields below:

[Figure: Highlighted searchable fields]

We'll need to transform the data in those highlighted fields into something similar that can be indexed and ultimately queried. Here's the multi-map index creation task:

    
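    // Namespaces as found in the RavenDB client of this article's era;
    // they may differ in other client versions.
    using System.Linq;
    using Raven.Abstractions.Indexing; // FieldIndexing
    using Raven.Client.Indexes;        // AbstractMultiMapIndexCreationTask
    
    public class SearchIndex : AbstractMultiMapIndexCreationTask<SearchIndex.Result>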
    {
        public class Result
        {
            public object[] Content { get; set; }
        }
    
        public override string IndexName
        {
            get
            {
                return "SearchableItems/ByContent";
            }
        }
    
        public SearchIndex()
        {
            // One map per document type. Each map projects that type's
            // searchable fields into the shared Content array.
            AddMap<BlogPost>(items => from x in items
                                      select new Result { Content = new object[] { x.Author, x.Content, x.Summary, x.Title, x.Tags } });
    
            AddMap<Event>(items => from x in items
                                   select new Result { Content = new object[] { x.Details, x.Title } });
    
            AddMap<Page>(items => from x in items
                                  select new Result { Content = new object[] { x.Body, x.Title } });
    
            // Analyze the Content field so the index supports full-text search.
            Index(x => x.Content, FieldIndexing.Analyzed);
        }
    }
    

The important part of the code is the constructor of the SearchIndex class. You can see that for each of the document types, we are extracting an object[]. RavenDB takes care of flattening any collections that we may be pulling out of our documents, such as the Tags property of the BlogPost class. The last line of the constructor, Index(x=>x.Content, FieldIndexing.Analyzed), tells RavenDB that the index should support full text searching.
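
To make "flattening" concrete, here is a small in-memory sketch of the idea. It is not RavenDB's implementation, only an illustration of the effect: a nested collection such as Tags ends up as individual values alongside the other fields. FlattenSketch and Flatten are hypothetical names, and the sample values are made up.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

static class FlattenSketch
{
    // Recursively expands nested collections into a single sequence.
    // Strings are enumerable too, so they are treated as single values.
    static IEnumerable<object> Flatten(IEnumerable values)
    {
        foreach (object value in values)
        {
            IEnumerable nested = value as IEnumerable;
            if (nested != null && !(value is string))
            {
                foreach (object inner in Flatten(nested))
                    yield return inner;
            }
            else
            {
                yield return value;
            }
        }
    }

    static void Main()
    {
        // Shaped like the BlogPost map output: scalar fields plus a Tags array.
        var content = new object[] { "Some author", "Post body", new[] { "ravendb", "indexes" } };

        string[] terms = Flatten(content).Select(v => v.ToString()).ToArray();
        Console.WriteLine(string.Join(", ", terms));
        // prints "Some author, Post body, ravendb, indexes"
    }
}
```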

The other two pieces of the code are the overridden IndexName property, which defines how we specify the index to query, and the nested Result class, which defines the structure of the map result.

Index Creation

With the SearchIndex class in place, we can have this index created or updated automatically every time the application starts by calling:

    
        IndexCreation.CreateIndexes(typeof(SearchIndex).Assembly, documentStore);
    

Querying the Index

Before implementing the final part, querying the database using the multi-map index, it would be a good idea to have our three document types implement a common interface. This lets the RavenDB client load all of the query results into memory through that interface. Additionally, if the web site were being built using ASP.NET MVC, we could create a single partial view for displaying all of the results without caring what type they are. For example, the following interface could be implemented by Page, BlogPost, and Event:

    
    interface IAmSearchable
    {
        string Title { get; }
        string Summary { get; }
        string Url { get; }
    }
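
The entity classes do not all have Summary or Url fields of their own, so each class has to decide how to satisfy the interface. The sketch below shows one possible way, not the only one. The URL scheme, the Slug helper, and the choice to reuse Details as the summary are all assumptions made for illustration, and the classes are trimmed to the members needed here.

```csharp
using System;

interface IAmSearchable
{
    string Title { get; }
    string Summary { get; }
    string Url { get; }
}

static class UrlHelper
{
    // Hypothetical helper: turns "Hello World" into "hello-world".
    public static string Slug(string text)
    {
        return text.Trim().ToLowerInvariant().Replace(' ', '-');
    }
}

class BlogPost : IAmSearchable
{
    public string Title { get; set; }
    public string Summary { get; set; }   // BlogPost already has a Summary field

    // Assumed URL scheme, derived from the title.
    public string Url { get { return "/blog/" + UrlHelper.Slug(Title); } }
}

class Event : IAmSearchable
{
    public string Title { get; set; }
    public string Details { get; set; }

    // Event has no Summary field, so this sketch reuses Details.
    public string Summary { get { return Details; } }

    // Assumed URL scheme, derived from the title.
    public string Url { get { return "/events/" + UrlHelper.Slug(Title); } }
}

static class InterfaceSketch
{
    static void Main()
    {
        IAmSearchable[] results =
        {
            new BlogPost { Title = "Hello World", Summary = "First post" },
            new Event { Title = "Launch Party", Details = "Join us downtown" }
        };

        // A view can render every result the same way, whatever its concrete type.
        foreach (IAmSearchable item in results)
            Console.WriteLine(item.Title + " -> " + item.Url + " (" + item.Summary + ")");
    }
}
```

The interface only shapes how results are displayed. It plays no part in how the index is defined or what it searches over.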
    

Now, the only thing left to do is to implement the query:

    
    public ActionResult Search(string searchTerms)
    {
        using (var session = documentStore.OpenSession())
        {
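            // Materialize the results while the session is still open;
            // the view is rendered after this method returns.
            var results = session.Advanced
                .LuceneQuery<IAmSearchable, SearchIndex>()
                .Search("Content", searchTerms)
                .ToList();
    
            return View(results);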
        }
    }
    

Summary

With a simple multi-map index put into place, we were able to execute a polymorphic query against all of the document types that we were interested in. Multi-map indexes are a powerful feature of RavenDB.

Comments

Mikael Henriksson

Brilliant!! No stupid N+1 or inner/outer joins to get the data you need to display for the results!

Travis Laborde

I fail to see how these three classes can implement IAmSearchable. While all three have "title" fields, only one has a "summary" field and none have a "url" field. While the page and the event could say that the body and details fields implement the summary part of the interface, where would we get the urls from? And why?

James

The way I read it, the Url is generated however you see fit. Perhaps the BlogPost.Url would be based on the title, the Event.Url would be based on the start date + title, and so on without the interface exposing stuff (like start date) that isn't relevant to all of the search results.

This is just a first impression; I hope I'm not muddying the waters with bad info.

Kamran

This is just saying, you need a common interface to display in your view (i.e. @Model.Title) and could be implemented however you want on each domain object; it has nothing to do with the index creation or reading.

I think the confusion lies in how the LuceneQuery<IAmSearchable, SearchIndex> maps back to BlogPost, Event, and Page. I haven't looked into this at all, but my guess is that this just tells Raven to return the documents as long as they implement this type (which they should). It would not "populate" these fields, besides maybe Title, because your class does that when it gets constructed, or when you read the property if it's read-only.

That's my interpretation.

Patt

With this type of multi-map, do we still need to do some reduce?

wasfa

where is AbstractMultiMapIndexCreationTask??
