Indexing Hierarchical Data
Use the Recurse method to traverse the layers of a hierarchical document and index its fields.
Hierarchical data
A significant advantage of document databases is that they impose few limits on how data is structured. Hierarchical data structures illustrate this well. Consider, for example, the common comment thread, implemented using objects such as:
public class BlogPost
{
    public string Author { get; set; }
    public string Title { get; set; }
    public string Text { get; set; }

    // Blog post readers can leave comments
    public List<BlogPostComment> Comments { get; set; }
}

public class BlogPostComment
{
    public string Author { get; set; }
    public string Text { get; set; }

    // Allow nested comments, enabling replies to existing comments
    public List<BlogPostComment> Comments { get; set; }
}
Readers of a post created using the above BlogPost structure can add BlogPostComment entries to the post's Comments field, and readers of these comments can reply with comments of their own, creating a recursive hierarchical structure.
For example, the following document, BlogPosts/1-A, represents a blog post by John that contains multiple layers of comments from various authors.
BlogPosts/1-A:
{
    "Author": "John",
    "Title": "Post title..",
    "Text": "Post text..",
    "Comments": [
        {
            "Author": "Moon",
            "Text": "Comment text..",
            "Comments": [
                {
                    "Author": "Bob",
                    "Text": "Comment text.."
                },
                {
                    "Author": "Adel",
                    "Text": "Comment text..",
                    "Comments": [
                        {
                            "Author": "Moon",
                            "Text": "Comment text.."
                        }
                    ]
                }
            ]
        }
    ],
    "@metadata": {
        "@collection": "BlogPosts"
    }
}
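To make the shape of this thread concrete, the following plain C# sketch builds a thread like the sample document in memory and walks it depth-first to collect the comment authors. It is only an illustration of the recursive structure: `CommentTreeSketch`, `Flatten`, and `BuildSamplePost` are hypothetical names introduced here, the walk is not RavenDB's Recurse implementation, and it makes no claim about how the index treats the root post itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class BlogPostComment
{
    public string Author { get; set; }
    public string Text { get; set; }
    public List<BlogPostComment> Comments { get; set; }
}

public class BlogPost
{
    public string Author { get; set; }
    public string Title { get; set; }
    public string Text { get; set; }
    public List<BlogPostComment> Comments { get; set; }
}

public static class CommentTreeSketch
{
    // Hypothetical helper (not a RavenDB API): visits each comment before
    // its replies and treats a missing Comments list as "no replies".
    public static IEnumerable<BlogPostComment> Flatten(IEnumerable<BlogPostComment> comments)
    {
        if (comments == null)
            yield break;

        foreach (var comment in comments)
        {
            yield return comment;
            foreach (var reply in Flatten(comment.Comments))
                yield return reply;
        }
    }

    // Builds an in-memory thread shaped like the sample document.
    public static BlogPost BuildSamplePost() => new BlogPost
    {
        Author = "John",
        Title = "Post title..",
        Text = "Post text..",
        Comments = new List<BlogPostComment>
        {
            new BlogPostComment
            {
                Author = "Moon",
                Text = "Comment text..",
                Comments = new List<BlogPostComment>
                {
                    new BlogPostComment { Author = "Bob", Text = "Comment text.." },
                    new BlogPostComment
                    {
                        Author = "Adel",
                        Text = "Comment text..",
                        Comments = new List<BlogPostComment>
                        {
                            new BlogPostComment { Author = "Moon", Text = "Comment text.." }
                        }
                    }
                }
            }
        }
    };

    public static void Main()
    {
        var post = BuildSamplePost();
        var authors = Flatten(post.Comments).Select(c => c.Author);
        Console.WriteLine(string.Join(", ", authors)); // Moon, Bob, Adel, Moon
    }
}
```

Note that "Moon" appears twice, at two different depths of the thread. An index over comment authors has to reach both occurrences, which is why a single-level projection of the Comments field would not be enough.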
Index hierarchical data
To index the elements of a hierarchical structure like the one above, use RavenDB's Recurse method.
The sample index below shows how to use Recurse to traverse the comments in the post thread and index them by their authors.
We can then query the index for all blog posts that contain comments by specific authors.
public class BlogPosts_ByCommentAuthor :
    AbstractIndexCreationTask<BlogPost, BlogPosts_ByCommentAuthor.IndexEntry>
{
    public class IndexEntry
    {
        public IEnumerable<string> Authors { get; set; }
    }

    public BlogPosts_ByCommentAuthor()
    {
        Map = blogposts =>
            from blogpost in blogposts
            let authors = Recurse(blogpost, x => x.Comments)
            select new IndexEntry
            {
                Authors = authors.Select(x => x.Author)
            };
    }
}
public class BlogPosts_ByCommentAuthor_JS : AbstractJavaScriptIndexCreationTask
{
    public class Result
    {
        public string[] Authors { get; set; }
    }

    public BlogPosts_ByCommentAuthor_JS()
    {
        Maps = new HashSet<string>
        {
            @"map('BlogPosts', function (blogpost) {
                var authors =
                    recurse(blogpost.Comments, function(x) {
                        return x.Comments;
                    })
                    .filter(function(comment) {
                        return comment.Author != null;
                    })
                    .map(function(comment) {
                        return comment.Author;
                    });

                return {
                    Authors: authors
                };
            });"
        };
    }
}
store.Maintenance.Send(new PutIndexesOperation(
    new IndexDefinition
    {
        Name = "BlogPosts/ByCommentAuthor",
        Maps =
        {
            @"from blogpost in docs.BlogPosts
              let authors = Recurse(blogpost, (Func<dynamic, dynamic>)(x => x.Comments))
              let authorNames = authors.Select(x => x.Author)
              select new
              {
                  Authors = authorNames
              }"
        }
    }));
Query the index
The index can be queried for all blog posts that contain comments made by specific authors.
Query the index using code:
List<BlogPost> results = session
    .Query<BlogPosts_ByCommentAuthor.IndexEntry, BlogPosts_ByCommentAuthor>()
    // Query for all blog posts that contain comments by 'Moon':
    .Where(x => x.Authors.Any(a => a == "Moon"))
    .OfType<BlogPost>()
    .ToList();

List<BlogPost> results = await asyncSession
    .Query<BlogPosts_ByCommentAuthor.IndexEntry, BlogPosts_ByCommentAuthor>()
    // Query for all blog posts that contain comments by 'Moon':
    .Where(x => x.Authors.Any(a => a == "Moon"))
    .OfType<BlogPost>()
    .ToListAsync();

List<BlogPost> results = session
    .Advanced
    .DocumentQuery<BlogPost, BlogPosts_ByCommentAuthor>()
    // Query for all blog posts that contain comments by 'Moon':
    .WhereEquals("Authors", "Moon")
    .ToList();

from index "BlogPosts/ByCommentAuthor"
where Authors == "Moon"
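The queries above all compare the multi-valued Authors field against a single name, which matches when any one of the indexed values equals that name. The sketch below mimics this matching rule in memory with LINQ to Objects. It does not talk to a RavenDB server: `EntrySketch`, its `Id` property, `QuerySketch`, and the second document ID are hypothetical, and the author lists are assumed sample values rather than the actual contents of the index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an index entry; the Id property is an
// assumption added so the sketch can report which post matched.
public class EntrySketch
{
    public string Id { get; set; }
    public List<string> Authors { get; set; }
}

public static class QuerySketch
{
    // Assumed sample data: one entry per blog post, each holding the
    // comment authors collected from that post's thread.
    public static List<EntrySketch> BuildSampleEntries() => new List<EntrySketch>
    {
        new EntrySketch
        {
            Id = "BlogPosts/1-A",
            Authors = new List<string> { "Moon", "Bob", "Adel", "Moon" }
        },
        new EntrySketch
        {
            Id = "BlogPosts/2-A", // hypothetical second post
            Authors = new List<string> { "Bob" }
        }
    };

    // In-memory analog of .Where(x => x.Authors.Any(a => a == author)):
    // an entry matches if at least one of its authors equals the name.
    public static List<string> PostsWithCommentsBy(IEnumerable<EntrySketch> entries, string author) =>
        entries
            .Where(e => e.Authors.Any(a => a == author))
            .Select(e => e.Id)
            .ToList();

    public static void Main()
    {
        var matches = PostsWithCommentsBy(BuildSampleEntries(), "Moon");
        Console.WriteLine(string.Join(", ", matches)); // BlogPosts/1-A
    }
}
```

A post is returned once even when the author commented several times in its thread, because the predicate only asks whether a match exists.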
Query the index using the Studio:
- Query the index from the Studio's List of Indexes view.
  [Image: List of Indexes view]
- View the query results in the Query view.
  [Image: Query view]
- View the list of terms indexed by the Recurse method.
  [Image: Index terms]