RavenDB version 2.5.

Faceted Search

When displaying a large amount of data, paging is often used to make viewing it manageable. However, it's also useful to give some context for the entire data set and an easy way to drill down into particular categories. The common approach to doing this is "faceted search", as shown in the image below. Note how the count of each category within the current search is shown across the top.

[Image: Facets]

To achieve this in RavenDB, let's say you have a document like this:

    {
        DateOfListing: "2000-09-01T00:00:00.0000000+01:00",
        Manufacturer: "Jessops",
        Model: "blah",
        Cost: 717.502206059872,
        Zoom: 9,
        Megapixels: 10.4508949012733,
        ImageStabiliser: false
    }
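The later steps query this document through a `Camera` type that is not shown here. A minimal sketch of what such a class might look like follows; the property types (in particular `double` for `Cost` and `Megapixels`, and `DateTimeOffset` for `DateOfListing`) and the `Id` property are assumptions, not taken from the original.

```csharp
using System;

// Assumed shape of the Camera class; property names mirror the document fields above.
public class Camera
{
    public string Id { get; set; }                    // assumption: holds the document id
    public DateTimeOffset DateOfListing { get; set; } // assumption: the sample value carries a +01:00 offset
    public string Manufacturer { get; set; }
    public string Model { get; set; }
    public double Cost { get; set; }                  // assumption: double, to match the Dx range values
    public int Zoom { get; set; }
    public double Megapixels { get; set; }            // assumption: double, to match the Dx range values
    public bool ImageStabiliser { get; set; }
}

public static class CameraSketch
{
    // Builds an instance holding the same values as the sample document.
    public static Camera Sample()
    {
        return new Camera
        {
            DateOfListing = new DateTimeOffset(2000, 9, 1, 0, 0, 0, TimeSpan.FromHours(1)),
            Manufacturer = "Jessops",
            Model = "blah",
            Cost = 717.502206059872,
            Zoom = 9,
            Megapixels = 10.4508949012733,
            ImageStabiliser = false
        };
    }

    public static void Main()
    {
        var camera = Sample();
        Console.WriteLine("{0} {1}, zoom {2}", camera.Manufacturer, camera.Model, camera.Zoom);
    }
}
```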

Step 1

You need to set up your facet definitions and store them in RavenDB as a document, like so:

    _facets = new List<Facet>
    {
        new Facet
        {
            Name = "Manufacturer"
        },
        new Facet
        {
            Name = "Cost_Range",
            Mode = FacetMode.Ranges,
            Ranges =
            {
                "[NULL TO Dx200.0]",
                "[Dx200.0 TO Dx400.0]",
                "[Dx400.0 TO Dx600.0]",
                "[Dx600.0 TO Dx800.0]",
                "[Dx800.0 TO NULL]",
            }
        },
        new Facet
        {
            Name = "Megapixels_Range",
            Mode = FacetMode.Ranges,
            Ranges =
            {
                "[NULL TO Dx3.0]",
                "[Dx3.0 TO Dx7.0]",
                "[Dx7.0 TO Dx10.0]",
                "[Dx10.0 TO NULL]",
            }
        }
    };

    // Store the facet definitions under a well-known document id.
    session.Store(new FacetSetup { Id = "facets/CameraFacets", Facets = _facets });
    // Store only registers the document with the session; SaveChanges sends it to the server.
    session.SaveChanges();

This tells RavenDB that you would like to get the following facets:

  • For the Manufacturer field, look at the documents and return a count for each unique term found
  • For the Cost field, return the count of the following ranges:
    • Cost <= 200.0
    • 200.0 <= Cost <= 400.0
    • 400.0 <= Cost <= 600.0
    • 600.0 <= Cost <= 800.0
    • Cost >= 800.0
  • For the Megapixels field, return the count of the following ranges:
    • Megapixels <= 3.0
    • 3.0 <= Megapixels <= 7.0
    • 7.0 <= Megapixels <= 10.0
    • Megapixels >= 10.0
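To make the range semantics in the list above concrete, here is a small stand-alone sketch that parses the range strings and counts values against them. It does not call RavenDB; `FacetRangeSketch` and its methods are hypothetical helpers, and the sketch assumes both ends of each range are inclusive, as the list suggests.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper for illustration; not part of the RavenDB client API.
public static class FacetRangeSketch
{
    // Parses one bound such as "Dx200.0" or "NULL"; NULL means "unbounded".
    private static double? ParseBound(string text)
    {
        if (text == "NULL")
            return null;
        // Drop the two-character "Dx" prefix used in the range strings.
        return double.Parse(text.Substring(2), CultureInfo.InvariantCulture);
    }

    // Returns true when value falls inside a range such as "[Dx200.0 TO Dx400.0]".
    // Both ends are treated as inclusive (an assumption based on the list above).
    public static bool Contains(string range, double value)
    {
        var parts = range.Trim('[', ']').Split(new[] { " TO " }, StringSplitOptions.None);
        double? lower = ParseBound(parts[0]);
        double? upper = ParseBound(parts[1]);
        return (lower == null || value >= lower.Value)
            && (upper == null || value <= upper.Value);
    }

    // Counts how many values fall into each range.
    public static Dictionary<string, int> Count(IEnumerable<string> ranges, IEnumerable<double> values)
    {
        var list = values.ToList();
        return ranges.ToDictionary(r => r, r => list.Count(v => Contains(r, v)));
    }

    public static void Main()
    {
        var ranges = new[]
        {
            "[NULL TO Dx200.0]", "[Dx200.0 TO Dx400.0]", "[Dx400.0 TO Dx600.0]",
            "[Dx600.0 TO Dx800.0]", "[Dx800.0 TO NULL]"
        };
        var costs = new[] { 150.0, 200.0, 717.502206059872 };
        var counts = Count(ranges, costs);
        foreach (var range in ranges)
            Console.WriteLine("{0}: {1}", range, counts[range]);
    }
}
```

Under the inclusive assumption, a cost of exactly 200.0 is counted in both the first and the second range, so the counts of adjacent ranges can add up to more than the number of documents.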

Step 2

Next, you need to create an index to work against. This can be set up like so:

    store.DatabaseCommands.PutIndex("CameraCost",
        new IndexDefinition
        {
            Map = @"from camera in docs
                    select new
                    {
                        camera.Manufacturer,
                        camera.Model,
                        camera.Cost,
                        camera.DateOfListing,
                        camera.Megapixels
                    }"
        });

Step 3

Finally, you can write the following code to get back the data shown below.

    var facetResults = session.Query<Camera>("CameraCost")
        .Where(x => x.Cost >= 100 && x.Cost <= 300)
        .ToFacets("facets/CameraFacets");

This is equivalent to requesting the following URL:

    http://localhost:8080/facets/CameraCost?facetDoc=facets/CameraFacets&query=Cost_Range:[Dx100 TO Dx300.0]
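The URL above is shown unencoded for readability; the query contains spaces, a colon, and brackets, which generally need to be percent-encoded when sent over HTTP. A minimal sketch of building such a URL follows, assuming the endpoint shape shown above; `FacetUrlSketch.Build` is a hypothetical helper, not part of the RavenDB client.

```csharp
using System;

// Hypothetical helper for illustration; not part of the RavenDB client API.
public static class FacetUrlSketch
{
    // Composes the facets URL and percent-encodes each variable part.
    public static string Build(string serverUrl, string indexName, string facetDoc, string luceneQuery)
    {
        return string.Format("{0}/facets/{1}?facetDoc={2}&query={3}",
            serverUrl.TrimEnd('/'),
            Uri.EscapeDataString(indexName),
            Uri.EscapeDataString(facetDoc),
            Uri.EscapeDataString(luceneQuery));
    }

    public static void Main()
    {
        // Values taken from the example above.
        var url = Build("http://localhost:8080", "CameraCost",
            "facets/CameraFacets", "Cost_Range:[Dx100 TO Dx300.0]");
        Console.WriteLine(url);
    }
}
```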

The data returned represents the count of the faceted data that satisfies the query Where(x => x.Cost >= 100 && x.Cost <= 300):

    {
        Manufacturer: [
            {
                Range: 'canon',
                Count: 42
            },
            {
                Range: 'jessops',
                Count: 50
            },
            {
                Range: 'nikon',
                Count: 46
            },
            {
                Range: 'phillips',
                Count: 44
            },
            {
                Range: 'sony',
                Count: 35
            }
        ],
        Cost_Range: [
            {
                Range: '[NULL TO Dx200.0]',
                Count: 115
            },
            {
                Range: '[Dx200.0 TO Dx400.0]',
                Count: 102
            }
        ],
        Megapixels_Range: [
            {
                Range: '[NULL TO Dx3.0]',
                Count: 42
            },
            {
                Range: '[Dx3.0 TO Dx7.0]',
                Count: 79
            },
            {
                Range: '[Dx7.0 TO Dx10.0]',
                Count: 82
            },
            {
                Range: '[Dx10.0 TO NULL]',
                Count: 14
            }
        ]
    }
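A result of this shape is typically turned into navigation labels such as "canon (42)". The sketch below walks a structure that mirrors the data above; `FacetEntrySketch` is a stand-in type defined here for illustration, and the actual type returned by `ToFacets` in the RavenDB client may differ.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for one entry of the result above; not the RavenDB client type.
public class FacetEntrySketch
{
    public string Range { get; set; }
    public int Count { get; set; }
}

public static class FacetResultSketch
{
    // Formats one entry as a drill-down label, e.g. "canon (42)".
    public static string Label(FacetEntrySketch entry)
    {
        return string.Format("{0} ({1})", entry.Range, entry.Count);
    }

    public static void Main()
    {
        // Two of the facets from the result above, keyed by facet name.
        var results = new Dictionary<string, List<FacetEntrySketch>>
        {
            {
                "Manufacturer", new List<FacetEntrySketch>
                {
                    new FacetEntrySketch { Range = "canon", Count = 42 },
                    new FacetEntrySketch { Range = "jessops", Count = 50 }
                }
            },
            {
                "Cost_Range", new List<FacetEntrySketch>
                {
                    new FacetEntrySketch { Range = "[NULL TO Dx200.0]", Count = 115 },
                    new FacetEntrySketch { Range = "[Dx200.0 TO Dx400.0]", Count = 102 }
                }
            }
        };

        foreach (var facet in results)
        {
            Console.WriteLine(facet.Key);
            foreach (var entry in facet.Value)
                Console.WriteLine("  " + Label(entry));
        }
    }
}
```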

Stale results

Faceted search does not take index staleness into account. You cannot wait for non-stale results by customizing your query with one of the WaitForNonStaleResultsXXX methods.
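If up-to-date counts matter, one possible workaround is to poll until the index is no longer stale before issuing the facet query. The sketch below shows only the polling loop; `StaleIndexSketch` is a hypothetical helper, the staleness check is simulated, and the RavenDB call mentioned in the comment is an assumption rather than a confirmed recipe.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper for illustration; not part of the RavenDB client API.
public static class StaleIndexSketch
{
    // Polls isStale until it returns false or the timeout elapses.
    // Returns true when the index stopped being stale in time.
    public static bool WaitUntilNotStale(Func<bool> isStale, TimeSpan timeout)
    {
        var watch = Stopwatch.StartNew();
        while (isStale())
        {
            if (watch.Elapsed >= timeout)
                return false;
            Thread.Sleep(50); // back off briefly between checks
        }
        return true;
    }

    public static void Main()
    {
        // Simulated check that reports "stale" for the first two calls.
        int calls = 0;
        Func<bool> isStale = () => ++calls <= 2;

        // Against a real server the check might be (assumption):
        //   () => store.DatabaseCommands.GetStatistics().StaleIndexes.Contains("CameraCost")
        bool ready = WaitUntilNotStale(isStale, TimeSpan.FromSeconds(5));
        Console.WriteLine(ready
            ? "Index is up to date; run the facet query."
            : "Timed out; facet counts may be stale.");
    }
}
```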

Comments


Vishnoo Rath

It would really help faster adoption of RavenDB if a GitHub link to a repo with the entire VS 2010 solution were provided at the end of each topic. We could then download it, change certain settings, and run or debug the code to see how it all comes together.

Fitzchak Yitzchaki

Agreed! I added the following feature request: http://issues.hibernatingrhinos.com/issue/RDoc-61. Question: would you (or anyone) like to start this project? If so, please open a discussion about this on the group, or email us at support@hibernatingrhinos.com.

Georgios

Probably easier to make/maintain/use using LINQPad!
