Suggestions
RavenDB's indexing mechanism is built upon the Lucene engine, which has a great suggestions feature. This capability has also been introduced to RavenDB and allows a significant improvement of search functionality, enhancing the overall user experience of the application.
Let's consider an example where users can look for products by name. The index and query would look as follows:
public static class Products_ByName extends AbstractIndexCreationTask {
    public Products_ByName() {
        QProduct p = QProduct.product;
        map =
            " from product in docs.Products " +
            " select new " +
            " { " +
            "     product.Name " +
            " }; ";
        index(p.name, FieldIndexing.ANALYZED); // (optional) splitting name into multiple tokens
        suggestion(p.name); // configuring suggestions
    }
}
QProduct p = QProduct.product;
IRavenQueryable<Product> query = session
    .query(Product.class, Products_ByName.class)
    .where(p.name.eq("chaig"));
Product product = query.firstOrDefault();
If the database has the Northwind samples deployed, this query will not return any results, but we can ask RavenDB for help by using:
if (product == null) {
    SuggestionQueryResult suggestionResult = query.suggest();
    System.out.println("Did you mean?");
    for (String suggestion : suggestionResult.getSuggestions()) {
        System.out.println("\t" + suggestion);
    }
}
It will produce the suggestions:
Did you mean?
chang
chai
The suggest method has an overload that takes one parameter, SuggestionQuery, which allows you to specify the suggestion query options:
- Field - the name of the field that you want to find suggestions in,
- Term - the search term provided by the user,
- MaxSuggestions - the number of suggestions to return (default: 15),
- Distance - the enum that indicates which string distance algorithm should be used: JaroWinkler, Levenshtein (default) or NGram,
- Accuracy - the minimal accuracy required from a string distance for a suggestion match (default: 0.0),
- Popularity - determines whether the returned terms should be in order of popularity (default: false).
SuggestionQuery suggestionQuery = new SuggestionQuery();
suggestionQuery.setField("Name");
suggestionQuery.setTerm("chaig");
suggestionQuery.setAccuracy(0.4f);
suggestionQuery.setMaxSuggestions(5);
suggestionQuery.setDistance(StringDistanceTypes.JARO_WINKLER);
suggestionQuery.setPopularity(true);
session
    .query(Product.class, Products_ByName.class)
    .suggest(suggestionQuery);
The same suggestion query can also be executed through the database commands:
SuggestionQuery suggestionQuery = new SuggestionQuery();
suggestionQuery.setField("Name");
suggestionQuery.setTerm("chaig");
suggestionQuery.setAccuracy(0.4f);
suggestionQuery.setMaxSuggestions(5);
suggestionQuery.setDistance(StringDistanceTypes.JARO_WINKLER);
suggestionQuery.setPopularity(true);
store
    .getDatabaseCommands()
    .suggest("Products/ByName", suggestionQuery);
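To make the interplay of the Distance, Accuracy and MaxSuggestions options more concrete, here is a minimal, self-contained sketch. It is not RavenDB's or Lucene's actual implementation: the class `SuggestionSketch`, its method names, and the normalization of the Levenshtein distance into a 0.0 to 1.0 similarity score are all assumptions made for illustration. The idea is that each indexed term gets a similarity score against the user's term, candidates scoring below the accuracy threshold are dropped, and at most the requested number of suggestions is returned.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: a hypothetical in-memory model, NOT RavenDB's or Lucene's code.
class SuggestionSketch {

    // Classic dynamic-programming Levenshtein edit distance (insert, delete, substitute).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalization: 1.0 for identical strings, 0.0 when every character differs.
    static float similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0f;
        }
        return 1.0f - (float) levenshtein(a, b) / maxLen;
    }

    // Keeps candidates whose similarity reaches the accuracy threshold,
    // orders them from most to least similar, and caps the result size.
    static List<String> suggest(String term, List<String> indexedTerms,
                                float accuracy, int maxSuggestions) {
        List<String> matches = new ArrayList<>();
        for (String candidate : indexedTerms) {
            if (!candidate.equals(term) && similarity(term, candidate) >= accuracy) {
                matches.add(candidate);
            }
        }
        matches.sort(Comparator.comparingDouble((String c) -> -similarity(term, c)));
        return matches.subList(0, Math.min(maxSuggestions, matches.size()));
    }
}
```

In this model, calling `SuggestionSketch.suggest("chaig", terms, 0.5f, 5)` on a list containing "chai", "chang", "chef" and "tofu" keeps only the first two, since each is a single edit away from the misspelled term. Raising the accuracy narrows the result set; lowering it toward 0.0 admits increasingly distant terms.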
Suggest over multiple words
RavenDB allows you to perform a suggestion query over multiple words. In order to use this functionality, pass the words that you are looking for in Term using the special RavenDB syntax (more details here):
SuggestionQuery suggestionQuery = new SuggestionQuery();
suggestionQuery.setField("Name");
suggestionQuery.setTerm("<<chaig tof>>");
suggestionQuery.setAccuracy(0.4f);
suggestionQuery.setMaxSuggestions(5);
suggestionQuery.setDistance(StringDistanceTypes.JARO_WINKLER);
suggestionQuery.setPopularity(true);
SuggestionQueryResult resultsByMultipleWords = store
    .getDatabaseCommands()
    .suggest("Products/ByName", suggestionQuery);
System.out.println("Did you mean?");
for (String suggestion : resultsByMultipleWords.getSuggestions()) {
    System.out.println("\t" + suggestion);
}
This will produce the results:
Did you mean?
chai
chang
chartreuse
chef
tofu
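The results above contain suggestions for both "chaig" and "tof", which suggests that the words inside `<<` and `>>` are looked up individually. The following sketch shows one way such a term could be split into words. It is purely illustrative: `MultiWordTermSketch` and `splitTerm` are hypothetical names, and how RavenDB actually parses the term on the server is an assumption that this page does not confirm.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: a hypothetical helper, not part of the RavenDB client API.
class MultiWordTermSketch {

    // Returns the individual words of a "<<word1 word2>>" term,
    // or the term itself as a single word when the markers are absent.
    static List<String> splitTerm(String term) {
        String body = term.trim();
        boolean wrapped = body.length() >= 4 && body.startsWith("<<") && body.endsWith(">>");
        if (!wrapped) {
            return Collections.singletonList(body);
        }
        // Strip the surrounding markers, then split on runs of whitespace.
        body = body.substring(2, body.length() - 2).trim();
        List<String> words = new ArrayList<>();
        for (String word : body.split("\\s+")) {
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }
}
```

Under this assumption, `splitTerm("<<chaig tof>>")` yields the two words "chaig" and "tof", each of which would then be matched against the indexed terms separately, with the per-word suggestions merged into a single result list.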
Remarks
Warning
Suggestions do not take advantage of the encryption bundle. You should never use this feature on information that should be encrypted, because you risk storing sensitive data on disk in an unsecured manner.
Increased indexing time
Indexes with suggestions turned on tend to use much more CPU power than other indexes. This can impact indexing speed (querying is not impacted).