Creating a Tag Cloud
A Tag Cloud is a nice way of showing your users the available tags, while highlighting tags that are more frequently used. In this short tutorial we will see how to make one with RavenDB.
For the purpose of this tutorial, we will assume our tagged content is posts in a blog. Each Post object can have multiple tags, and is modeled as follows:
public class Post
{
    public string Id { get; set; }
    public string Title { get; set; }
    public string Slug { get; set; }
    public string Content { get; set; }
    public ICollection<string> Tags { get; set; }
    public DateTimeOffset CreatedAt { get; set; }
    // ...
}
The index
To create a Tag Cloud, we need the number of posts cataloged under each tag. To get that, we define a Map/Reduce index. Its Map function emits every tag of every post; its Reduce function groups those entries by tag name and counts how many posts fall under each unique tag. It looks like this:
public class Tags_Count : AbstractIndexCreationTask<Post, Tags_Count.ReduceResult>
{
    public class ReduceResult
    {
        public string Name { get; set; }
        public int Count { get; set; }
    }

    public Tags_Count()
    {
        Map = posts => from post in posts
                       from tag in post.Tags
                       select new { Name = tag.ToString().ToLower(), Count = 1 };

        Reduce = results => from tagCount in results
                            group tagCount by tagCount.Name
                            into g
                            select new { Name = g.Key, Count = g.Sum(x => x.Count) };

        Sort(result => result.Count, SortOptions.Int);
    }
}
Here is how it works: the Map function selects each tag of each post as an anonymous object with a Count of 1. The Reduce function is then called on the collection of items produced by the Map function; it groups them by tag name and sums the Count properties in each group. The output of the Reduce function, and therefore of the index, is a list of tag names, each with the number of posts cataloged under it.
The tag name was normalized to lowercase, to make sure grouping is done correctly regardless of case.
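To make the Map and Reduce steps concrete, here is a minimal in-memory sketch of the same logic using plain LINQ to Objects. It does not touch RavenDB at all; the class name TagCountSketch and the method CountTags are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of RavenDB): reproduces the index logic in memory.
public static class TagCountSketch
{
    // Each inner sequence holds the tags of one post.
    public static Dictionary<string, int> CountTags(IEnumerable<IEnumerable<string>> postTags)
    {
        // Map step: emit one (Name, Count = 1) entry per tag per post,
        // normalizing the name to lowercase as the index does.
        var mapped = from tags in postTags
                     from tag in tags
                     select new { Name = tag.ToLower(), Count = 1 };

        // Reduce step: group the entries by tag name and sum their counts.
        var reduced = from entry in mapped
                      group entry by entry.Name
                      into g
                      select new { Name = g.Key, Count = g.Sum(x => x.Count) };

        return reduced.ToDictionary(x => x.Name, x => x.Count);
    }
}
```

For two posts tagged { "RavenDB", "NoSQL" } and { "ravendb" }, CountTags yields "ravendb" with a count of 2 and "nosql" with a count of 1, which shows why the lowercase normalization matters for grouping.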
Now, given a tag name, we can actually query RavenDB to get the number of posts tagged with it:
int count = 0;
var result = session.Query<Tags_Count.ReduceResult, Tags_Count>()
    .Where(x => x.Name == "RavenDB")
    .FirstOrDefault();
if (result != null)
    count = result.Count;
Building the Tag Cloud
Just as we can query our index for the number of posts under a specific tag name, we can use it to get all tag names ordered by how often they are used:
var result = session.Query<Tags_Count.ReduceResult, Tags_Count>()
    .OrderByDescending(x => x.Count)
    .ToArray();
With this array in hand, we can easily produce HTML that shows the most significant tags in a big font and less significant tags in a smaller one. The specifics of creating this HTML, choosing the minimum frequency to work with, and allowing for several font sizes are all out of scope for this tutorial.
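Although the rendering details are out of scope, one possible way to turn counts into font sizes is to scale each count linearly between the smallest and largest count and map it to a fixed number of size levels. The sketch below is only an illustration under that assumption; TagCloudSizing and AssignLevels are hypothetical names, and the input dictionary could be built from the query result with result.ToDictionary(x => x.Name, x => x.Count).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of RavenDB): maps tag counts to size levels.
public static class TagCloudSizing
{
    // Returns a level from 1 (smallest font) to 'levels' (largest font) for each tag.
    public static Dictionary<string, int> AssignLevels(IDictionary<string, int> counts, int levels)
    {
        if (levels < 1) throw new ArgumentOutOfRangeException("levels");
        if (counts.Count == 0) return new Dictionary<string, int>();

        int min = counts.Values.Min();
        int max = counts.Values.Max();
        int range = max - min;

        return counts.ToDictionary(
            pair => pair.Key,
            // When every tag has the same count, put them all on level 1;
            // otherwise scale linearly, rounding down (long avoids overflow).
            pair => range == 0
                ? 1
                : 1 + (int)((long)(pair.Value - min) * (levels - 1) / range));
    }
}
```

With counts of 1, 5 and 10 and five levels, the tags land on levels 1, 2 and 5 respectively. Each level can then be mapped to a CSS class or font size of your choosing.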
It is important to note, however, that by default the array will contain only the first 128 unique tags with their respective Count. If you want to show more than 128 tags, and you actually have that many, you should use paging. If you know in advance that the cloud will use fewer tags, it is recommended that you limit the request further by specifying the page size with .Take().
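As a rough sketch of the paging mentioned above, the helper below reads a query page by page with the standard LINQ Skip and Take operators until a page comes back short. PagingSketch and ReadAllPages are hypothetical names introduced for illustration; the helper works against any IQueryable<T> and assumes the query has a stable ordering, such as the OrderByDescending used earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of RavenDB): pages through any IQueryable<T>.
public static class PagingSketch
{
    public static List<T> ReadAllPages<T>(IQueryable<T> query, int pageSize)
    {
        if (pageSize < 1) throw new ArgumentOutOfRangeException("pageSize");

        var all = new List<T>();
        int skipped = 0;
        while (true)
        {
            // Each iteration requests one page of at most pageSize items.
            List<T> page = query.Skip(skipped).Take(pageSize).ToList();
            all.AddRange(page);

            // A short (or empty) page means there is nothing left to read.
            if (page.Count < pageSize) break;
            skipped += pageSize;
        }
        return all;
    }
}
```

Against the index it could be called as PagingSketch.ReadAllPages(session.Query<Tags_Count.ReduceResult, Tags_Count>().OrderByDescending(x => x.Count), 128). Keep in mind that every page is a separate request to the server, so for a tag cloud a single query limited with .Take() is usually the better fit.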
Comments
For the OrderByDescending in the final example the following has to be added to the index, otherwise the results will be ordered lexically descending (9, 82, 7, 33, ...):
Sort(result => result.Count, SortOptions.Int);
You are absolutely correct, I've fixed the code sample.
just like linq
That is pretty much the intent, yes.
How could I query for tags that match a range of values? For example, I might have var localTags = new [] { "RavenDB", "NH", "EF" } and I want to get all the posts that have at least one of these tags. Can this be done in a Raven query?
You can do that by using an In, like so.