Indexes: Indexing Hierarchical Data


Use the Recurse method in an index definition to traverse the layers of a hierarchical document and index its nested elements.


Hierarchical Data

One significant advantage of document databases is that they impose few limits on how data is structured. Hierarchical data demonstrates this well. Take, for example, a comment thread, implemented with classes such as:

public class BlogPost
{
    public string Author { get; set; }
    public string Title { get; set; }
    public string Text { get; set; }

    // Blog post readers can leave comments
    public List<BlogPostComment> Comments { get; set; }
}

public class BlogPostComment
{
    public string Author { get; set; }
    public string Text { get; set; }

    // Comments can be left recursively
    public List<BlogPostComment> Comments { get; set; }
}

Readers of a post created using the BlogPost structure above can add BlogPostComment entries to its Comments field, and readers of those comments can reply with comments of their own, creating a recursive hierarchical structure.

BlogPosts/1-A, for example, is a blog entry posted by John that contains several layers of comments left by various authors.

BlogPosts/1-A:

{
    "Author": "John",
    "Comments": [
        {
            "Author": "Moon",
            "Comments": [
                {
                    "Author": "Bob"
                },
                {
                    "Author": "Adel",
                    "Comments": [
                        {
                            "Author": "Moon"
                        }
                    ]
                }
            ]
        }
    ],
    "@metadata": {
        "@collection": "BlogPosts"
    }
}
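
For reference, a document shaped like BlogPosts/1-A could be built in memory roughly as follows. This is a minimal sketch, not part of the original sample: the two classes are redeclared so the snippet stands alone, SampleData and BuildSamplePost are hypothetical names, and Title and Text are left unset because the sample document does not show them.

```csharp
using System;
using System.Collections.Generic;

public class BlogPost
{
    public string Author { get; set; }
    public string Title { get; set; }
    public string Text { get; set; }
    public List<BlogPostComment> Comments { get; set; }
}

public class BlogPostComment
{
    public string Author { get; set; }
    public string Text { get; set; }
    public List<BlogPostComment> Comments { get; set; }
}

// Hypothetical helper class, used only for this illustration
public static class SampleData
{
    // Mirrors the nesting of the BlogPosts/1-A sample document
    public static BlogPost BuildSamplePost() => new BlogPost
    {
        Author = "John",
        Comments = new List<BlogPostComment>
        {
            new BlogPostComment
            {
                Author = "Moon",
                Comments = new List<BlogPostComment>
                {
                    new BlogPostComment { Author = "Bob" },
                    new BlogPostComment
                    {
                        Author = "Adel",
                        Comments = new List<BlogPostComment>
                        {
                            new BlogPostComment { Author = "Moon" }
                        }
                    }
                }
            }
        }
    };

    public static void Main()
    {
        BlogPost post = BuildSamplePost();

        // Walk down to the deepest comment: John -> Moon -> Adel -> Moon
        Console.WriteLine(post.Comments[0].Comments[1].Comments[0].Author); // prints "Moon"
    }
}
```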

Indexing Hierarchical Data

To index the elements of a hierarchical structure like the one demonstrated above, use RavenDB's Recurse method.

In the sample below, we use Recurse to go through comments in the post thread and index them by their authors.

private class BlogPosts_ByCommentAuthor : 
    AbstractIndexCreationTask<BlogPost, BlogPosts_ByCommentAuthor.Result>
{
    public class Result
    {
        public IEnumerable<string> Authors { get; set; }
    }

    public BlogPosts_ByCommentAuthor()
    {
        Map = blogposts => from blogpost in blogposts
                           let authors = Recurse(blogpost, x => x.Comments)
                           select new Result
                           {
                               Authors = authors.Select(x => x.Author)
                           };
    }
}

The same index can also be defined with a PutIndexesOperation:

store.Maintenance.Send(new PutIndexesOperation(
    new IndexDefinition
    {
        Name = "BlogPosts/ByCommentAuthor",
        Maps =
        {
            @"from blogpost in docs.BlogPosts
              from comment in Recurse(blogpost, (Func<dynamic, dynamic>)(x => x.Comments))
              select new
              {
                  Authors = comment.Author
              }"
        }
    }));

The index can also be defined as a JavaScript index:

public class BlogPosts_ByCommentAuthor_JS : AbstractJavaScriptIndexCreationTask
{
    public class Result
    {
        public string[] Authors { get; set; }
    }

    public BlogPosts_ByCommentAuthor_JS()
    {
        Maps = new HashSet<string>
        {
            @"map('BlogPosts', function (blogpost) {
                return recurse(blogpost, x => x.Comments).map(function (comment) {
                    if (comment.Author != null) {
                        return {
                            Authors: comment.Author
                        };
                    }
                });
            });"
        };
    }
}
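
To make the traversal concrete, the following plain C# sketch flattens a comment thread and collects the authors, which is roughly the shape of data the index definitions above work with. It is only an illustration under stated assumptions: Comment, HierarchySketch, and Flatten are hypothetical names, the code does not call RavenDB, and it is not RavenDB's implementation of Recurse. It walks the comments only, starting from a post's top-level Comments list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for BlogPostComment, reduced to the fields the walk needs
public class Comment
{
    public string Author { get; set; }
    public List<Comment> Comments { get; set; }
}

public static class HierarchySketch
{
    // Depth-first, pre-order walk: yields each comment, then all of its replies
    public static IEnumerable<Comment> Flatten(IEnumerable<Comment> roots)
    {
        if (roots == null)
            yield break; // a comment without replies has no Comments list

        foreach (var comment in roots)
        {
            yield return comment;
            foreach (var reply in Flatten(comment.Comments))
                yield return reply;
        }
    }

    // Builds the same comment thread as the BlogPosts/1-A sample
    public static List<Comment> SampleThread() => new List<Comment>
    {
        new Comment
        {
            Author = "Moon",
            Comments = new List<Comment>
            {
                new Comment { Author = "Bob" },
                new Comment
                {
                    Author = "Adel",
                    Comments = new List<Comment> { new Comment { Author = "Moon" } }
                }
            }
        }
    };

    public static void Main()
    {
        List<string> authors = Flatten(SampleThread())
            .Select(c => c.Author)
            .ToList();

        Console.WriteLine(string.Join(", ", authors)); // prints "Moon, Bob, Adel, Moon"
    }
}
```

Every level of the thread contributes an author to a single flat list, so a search by author name can match a comment regardless of how deeply it is nested.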

Querying the created index

  • The index can be queried from client code, using either Query or DocumentQuery.

    IList<BlogPost> results = session
        .Query<BlogPosts_ByCommentAuthor.Result, BlogPosts_ByCommentAuthor>()
        .Where(x => x.Authors.Any(a => a == "John"))
        .OfType<BlogPost>()
        .ToList();

    The same query, using DocumentQuery:

    IList<BlogPost> results = session
        .Advanced
        .DocumentQuery<BlogPost, BlogPosts_ByCommentAuthor>()
        .WhereEquals("Authors", "John")
        .ToList();
  • The index can also be queried using Studio.

    • Use Studio's List of Indexes view to define and query the index.

      Figure: List of Indexes view

    • Use the Query view to see the results and the list of terms indexed by the Recurse method.

      Figure: Query View

      Figure: Click to View Index Terms

      Figure: Index Terms