Query for Suggestions


  • Given a string term, the Suggestion feature will offer similar terms from your data.

  • Word similarities are found using string distance algorithms.

  • Examples in this article demonstrate getting suggestions with a dynamic query.
    To get suggestions with an index query, see Query for suggestions with index.
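
To make "string distance" concrete, here is a minimal sketch of the Levenshtein edit distance, one of the algorithms named later in this article. It is an illustrative stand-alone implementation, not RavenDB's own code, and the function name `levenshtein` is chosen here for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits that turn a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into an empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (free when the chars match)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # Compare the misspelled term with a few candidate terms
    for candidate in ("chai", "chang", "tofu"):
        print(candidate, levenshtein("chaig", candidate))
```

A smaller distance means a closer term, which is why a one-edit neighbour such as 'chai' is a natural suggestion for 'chaig'.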



What are terms

  • All queries in RavenDB use an index - learn more about that here.
    Whether making a dynamic query which generates an auto-index or using a static index,
    the data from your documents is 'broken' into terms that are kept in the index.

  • This tokenization process (which terms are generated) depends on the analyzer used;
    analyzers differ in the way they split the text stream. Learn more in Analyzers.

  • The terms can then be queried to retrieve matching documents that contain them.
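
The effect of the analyzer on the generated terms can be sketched with two toy analyzers. This is a simplified model under stated assumptions: the function names, the document ids, and the in-memory dictionary are hypothetical and only illustrate how the same field values can yield different term lists.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set


def keyword_lowercase(text: str) -> List[str]:
    """Toy analyzer: keep the whole value as one lower-cased term."""
    return [text.lower()]


def whitespace_lowercase(text: str) -> List[str]:
    """Toy analyzer: lower-case the value and split it on whitespace."""
    return text.lower().split()


def build_terms(
    docs: Dict[str, str], analyzer: Callable[[str], List[str]]
) -> Dict[str, Set[str]]:
    """Map each generated term to the ids of the documents that produced it."""
    terms: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, value in docs.items():
        for term in analyzer(value):
            terms[term].add(doc_id)
    return dict(terms)


# Hypothetical 'Name' values keyed by made-up document ids
names = {"doc-1": "Chai", "doc-2": "Chang", "doc-3": "Chartreuse verte"}

if __name__ == "__main__":
    print(sorted(build_terms(names, keyword_lowercase)))
    print(sorted(build_terms(names, whitespace_lowercase)))
```

With the first analyzer 'chartreuse verte' stays a single term; with the second it becomes two terms, so the set of terms available for suggestions changes with the analyzer.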

When to use suggestions

Querying for suggestions is useful in the following scenarios:

  • When a query has no results:

    • When searching for documents that match some condition on a given string term,
      a misspelled term will return no results.
      You can then ask RavenDB to suggest similar terms that do exist in the index.

    • The suggested terms can then be used in a new query to retrieve matching documents,
      or simply presented to the user asking what they meant to query.

  • When looking for alternative terms:

    • When simply searching for additional alternative terms for a term that does exist.

The resulting suggested terms do not include the term you searched for;
they contain only the similar terms.
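
The two scenarios can be sketched as a small "search, then suggest" helper. This is a stand-alone illustration, not the RavenDB API: the in-memory `INDEX`, the helper name, and the use of `difflib.get_close_matches` from the Python standard library as the similarity measure are all assumptions made for the example.

```python
from difflib import get_close_matches
from typing import Dict, List, Tuple

# Hypothetical in-memory "index": term -> ids of the documents that contain it
INDEX: Dict[str, List[str]] = {
    "chai": ["doc-1"],
    "chang": ["doc-2"],
    "tofu": ["doc-3"],
}


def search_or_suggest(term: str) -> Tuple[List[str], List[str]]:
    """Return (ids of matching documents, similar terms that exist in the index)."""
    term = term.lower()
    matches = INDEX.get(term, [])
    # The searched term itself is never offered as a suggestion
    candidates = [existing for existing in INDEX if existing != term]
    suggestions = get_close_matches(term, candidates, n=5, cutoff=0.5)
    return matches, suggestions


if __name__ == "__main__":
    print(search_or_suggest("chaig"))  # misspelled term: no matches, only suggestions
    print(search_or_suggest("chai"))   # existing term: matches plus alternative terms
```

The suggested terms can then be fed into a new search or shown to the user as "did you mean" options.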

Suggest terms - for single term

Consider this example:
Based on the Northwind sample data, the following query has no resulting documents,
as no document in the Products collection contains the term chaig in its Name field.

# This dynamic query on the 'Products' collection has NO documents
products = list(session.query(object_type=Product).where_equals("name", "chaig"))

  • Executing the above query will generate the auto-index Auto/Products/ByName.
    This auto-index will contain a list of all available terms from the document field Name.
    The generated terms are visible in the Studio - see image below.

  • If you suspect that the term chaig in the query criteria is misspelled,
    you can ask RavenDB to suggest existing terms that are similar to chaig, as follows:

# Query for suggested terms for single term:
# ==========================================
suggestions = (
    session.query(object_type=Product)
    .suggest_using(lambda builder: builder.by_field("name", "chaig"))
    .execute()
)

# Alternatively, define the suggestion request for a single term
suggestion_request = SuggestionWithTerm("name")
suggestion_request.term = "chaig"

# Query for suggestions
suggestions = (
    session.query(object_type=Product)
    # Call 'suggest_using' - pass the suggestion request
    .suggest_using(suggestion_request).execute()
)

// The equivalent RQL:
// Query for terms from field 'Name' that are similar to 'chaig'
from "Products"
select suggest(Name, "chaig")

# The resulting suggested terms:
# ==============================

print("Suggested terms in field 'name' that are similar to 'chaig':")
for suggested_term in suggestions["name"].suggestions:
    print(f"\t{suggested_term}")

# Suggested terms in field 'name' that are similar to 'chaig':
#     chai
#     chang

Suggest terms - for multiple terms

# Query for suggested terms for multiple terms:
# =============================================

suggestions = (
    session
    # Make a dynamic query on collection 'Products'
    .query(object_type=Product)
    # Call 'suggest_using'
    .suggest_using(
        lambda builder: builder
        # Request to get terms from field 'name' that are similar to 'chaig' OR 'tof'
        .by_field("name", ["chaig", "tof"])
    ).execute()
)

# Alternatively, define the suggestion request for multiple terms
suggestion_request = SuggestionWithTerms("name")
# Looking for terms from field 'name' that are similar to 'chaig' OR 'tof'
suggestion_request.terms = ["chaig", "tof"]

# Query for suggestions
suggestions = (
    session.query(object_type=Product)
    # Call 'suggest_using' - pass the suggestion request
    .suggest_using(suggestion_request).execute()
)

// The equivalent RQL:
// Query for terms from field 'Name' that are similar to 'chaig' OR 'tof'
from "Products" select suggest(Name, $p0)
{ "p0" : ["chaig", "tof"] }

# The resulting suggested terms:
# ==============================
#
# Suggested terms in field 'Name' that are similar to 'chaig' OR to 'tof':
#      chai
#      chang
#      tofu

Suggest terms - for multiple fields

# Query for suggested terms in multiple fields:
# =============================================

suggestions = (
    session
    # Make a dynamic query on collection 'Companies'
    .query(object_type=Company)
    # Call 'suggest_using' to get suggestions for terms that are
    # similar to 'chop-soy china' in first document field (e.g. 'name')
    .suggest_using(lambda builder: builder.by_field("name", "chop-soy china"))
    # Call 'and_suggest_using' to get suggestions for terms that are
    # similar to 'maria larson' in an additional field (e.g. 'Contact.Name')
    .and_suggest_using(lambda builder: builder.by_field("contact.name", "maria larson")).execute()
)

# Alternatively, define suggestion requests for multiple fields:

request1 = SuggestionWithTerm("name")
# Looking for terms from field 'Name' that are similar to 'chop-soy china'
request1.term = "chop-soy china"

request2 = SuggestionWithTerm("contact.name")
# Looking for terms from nested field 'Contact.Name' that are similar to 'maria larson'
# (SuggestionWithTerm takes a single string; use SuggestionWithTerms for a list)
request2.term = "maria larson"

suggestions = (
    session.query(object_type=Company)
    # Call 'suggest_using' - pass the suggestion request for the first field
    .suggest_using(request1)
    # Call 'and_suggest_using' - pass the suggestion request for the second field
    .and_suggest_using(request2).execute()
)

// The equivalent RQL:
// Query for suggested terms from field 'Name' and field 'Contact.Name'
from "Companies"
select suggest(Name, "chop-soy china"), suggest(Contact.Name, "maria larson")

# The resulting suggested terms:
# ==============================
#
# Suggested terms in field 'name' that are similar to 'chop-soy china':
#     chop-suey chinese
#
# Suggested terms in field 'contact.name' that are similar to 'maria larson':
#     maria larsson
#     marie bertrand
#     aria cruz
#     paula wilson
#     maria anders

Suggest terms - customize options and display name

# Query for suggested terms - customize options and display name:
# ===============================================================
suggestions = (
    session
    # Make a dynamic query on collection 'Products'
    .query(object_type=Product)
    # Call 'suggest_using'
    .suggest_using(
        lambda builder: builder.by_field("name", "chaig")
        # Customize suggestion options
        .with_options(
            SuggestionOptions(
                accuracy=0.4,
                page_size=5,
                distance=StringDistanceTypes.JARO_WINKLER,
                sort_mode=SuggestionSortMode.POPULARITY,
            )
        )
        # Customize display name for results
        .with_display_name("SomeCustomName")
    ).execute()
)

# Alternatively, define the suggestion request
suggestion_request = SuggestionWithTerm("name")
# Looking for terms from field 'Name' that are similar to term 'chaig'
suggestion_request.term = "chaig"
# Customize options
suggestion_request.options = SuggestionOptions(
    accuracy=0.4,
    page_size=5,
    distance=StringDistanceTypes.JARO_WINKLER,
    sort_mode=SuggestionSortMode.POPULARITY,
)
# Customize display name
suggestion_request.display_field = "SomeCustomName"

# Query for suggestions
suggestions = (
    session.query(object_type=Product)
    # Call 'suggest_using' - pass the suggestion request
    .suggest_using(suggestion_request).execute()
)

// The equivalent RQL:
// Query for suggested terms - customize options and display name
from "Products"
select suggest(
    Name,
    'chaig',
    '{ "Accuracy" : 0.4, "PageSize" : 5, "Distance" : "JaroWinkler", "SortMode" : "Popularity" }'
) as "SomeCustomName"

# The resulting suggested terms:
# ==============================

print("Suggested terms:")
# Results are available under the custom name entry
for suggested_term in suggestions["SomeCustomName"].suggestions:
    print(f"\t{suggested_term}")

# Suggested terms:
#     chai
#     chang
#     chartreuse verte
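
The example above requests the JARO_WINKLER distance and also returns 'chartreuse verte', a much longer term. The sketch below shows one textbook formulation of Jaro-Winkler similarity to illustrate why a shared start of the string still earns a moderate score. It is an assumption that RavenDB's scoring behaves like this formulation; the function names and the 0.7 boost threshold follow the commonly published definition, not RavenDB's source.

```python
def jaro(a: str, b: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Characters count as matching only if they are at most `window` positions apart
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, char in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_matched[j] and b[j] == char:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order
    a_seq = [c for c, used in zip(a, a_matched) if used]
    b_seq = [c for c, used in zip(b, b_matched) if used]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (
        matches / len(a) + matches / len(b) + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(a: str, b: str, prefix_scale: float = 0.1,
                 boost_threshold: float = 0.7) -> float:
    """Jaro similarity, boosted for a common prefix of up to four characters."""
    score = jaro(a, b)
    if score <= boost_threshold:
        return score  # the prefix boost applies only to already-similar strings
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)


if __name__ == "__main__":
    for candidate in ("chai", "chang", "chartreuse verte"):
        print(candidate, round(jaro_winkler("chaig", candidate), 4))
```

Under this formulation 'chartreuse verte' scores above 0.4 against 'chaig', so a lower accuracy setting lets it through, while 'chai' scores far higher.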

The auto-index terms in Studio

Based on the Northwind sample data, these are the terms generated for index Auto/Products/ByName:

Figure 1. Auto-index terms

Terms generated for index Auto/Products/ByName

  1. The field name - derived from the document field that was used in the dynamic-query.
    In this example the field name is Name.

  2. The terms generated from the data that the Products collection documents have in their Name field.

Syntax

Suggest using:

# Method for requesting suggestions for term(s) in a field:
def suggest_using(
    self, suggestion_or_builder: Union[SuggestionBase, Callable[[SuggestionBuilder[_T]], None]]
) -> SuggestionDocumentQuery[_T]: ...

# Method for requesting suggestions for term(s) in another field in the same query:
def and_suggest_using(
    self, suggestion_or_builder: Union[SuggestionBase, Callable[[SuggestionBuilder[_T]], None]]
) -> SuggestionDocumentQuery[_T]: ...

| Parameter | Type | Description |
|---|---|---|
| suggestion_or_builder (Union) | SuggestionBase | Suggestion instance. Pass suggest_using a SuggestionBase instance (SuggestionWithTerm or SuggestionWithTerms) that holds the term or terms to generate suggestions for. |
| | Callable[[SuggestionBuilder[_T]], None] | Suggestion builder. Pass suggest_using a method that takes a SuggestionBuilder as a parameter and uses its fluent API to build a suggestion definition that matches your needs. |

| Return type | Description |
|---|---|
| SuggestionDocumentQuery[_T] | The generated suggestions query, which can now be executed using execute() or further altered. When execute() is called, it returns the suggestions in a Dict[str, SuggestionResult] dictionary. |
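
Reading the returned dictionary can be sketched without a server by using a stand-in class. `FakeSuggestionResult` and `did_you_mean` are hypothetical names for illustration; only the `suggestions` attribute and the lookup by field name (or display name) are taken from the examples in this article.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FakeSuggestionResult:
    """Stand-in for the client's SuggestionResult (hypothetical, for illustration)."""
    name: str
    suggestions: List[str] = field(default_factory=list)


def did_you_mean(results: Dict[str, FakeSuggestionResult], key: str) -> Optional[str]:
    """Return the first suggested term stored under `key`, or None if there is none."""
    result = results.get(key)
    if result is None or not result.suggestions:
        return None
    return result.suggestions[0]


# Sample data copied from the multiple-fields example output in this article
results = {
    "name": FakeSuggestionResult("name", ["chop-suey chinese"]),
    "contact.name": FakeSuggestionResult(
        "contact.name", ["maria larsson", "marie bertrand"]
    ),
}

if __name__ == "__main__":
    for key in results:
        print(key, "->", did_you_mean(results, key))
```

Guarding against a missing key or an empty list keeps the calling code safe when no similar terms exist.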

Builder operations:

def by_field(self, field_name: str, term_or_terms: Union[str, List[str]]) -> SuggestionOperations[_T]: ...

def with_display_name(self, display_name: str) -> SuggestionOperations[_T]: ...
def with_options(self, options: SuggestionOptions) -> SuggestionOperations[_T]: ...

| Parameter | Type | Description |
|---|---|---|
| field_name | str | The index field to search for similar terms |
| term_or_terms (Union) | str or List[str] | Term or list of terms to get suggested similar terms for |
| display_name | str | A custom name for the suggestions result |
| options | SuggestionOptions | Non-default options to use in the operation |

Suggestion options:

DEFAULT_ACCURACY = 0.5
DEFAULT_PAGE_SIZE = 15
DEFAULT_DISTANCE = StringDistanceTypes.LEVENSHTEIN
DEFAULT_SORT_MODE = SuggestionSortMode.POPULARITY

def __init__(
    self,
    page_size: int = DEFAULT_PAGE_SIZE,
    distance: StringDistanceTypes = DEFAULT_DISTANCE,
    accuracy: float = DEFAULT_ACCURACY,
    sort_mode: SuggestionSortMode = DEFAULT_SORT_MODE,
):
    self.page_size = page_size
    self.distance = distance
    self.accuracy = accuracy
    self.sort_mode = sort_mode

| Option | Type | Description | Default |
|---|---|---|---|
| page_size | int | Maximum number of suggested terms that will be returned | 15 |
| distance | StringDistanceTypes | String distance algorithm to use (NONE / LEVENSHTEIN / JARO_WINKLER / NGRAM) | LEVENSHTEIN |
| accuracy | float | Suggestion accuracy | 0.5 |
| sort_mode | SuggestionSortMode | Indicates the order by which results are returned (NONE / POPULARITY) | POPULARITY |
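
How the options could interact is sketched below. This is a simplified model, not RavenDB's implementation: it assumes accuracy acts as a minimum similarity score, that POPULARITY orders terms by how many documents contain them, and that a non-popularity sort leaves terms ordered by similarity. The function name and the sample scores are hypothetical.

```python
from typing import Dict, List


def pick_suggestions(
    scores: Dict[str, float],
    popularity: Dict[str, int],
    accuracy: float = 0.5,
    page_size: int = 15,
    sort_by_popularity: bool = True,
) -> List[str]:
    """Filter candidate terms by score, order them, and cap the result size."""
    # 1. Keep only terms whose similarity reaches the accuracy threshold
    kept = [term for term, score in scores.items() if score >= accuracy]
    # 2. Order by popularity (ties broken by score), or by score alone
    if sort_by_popularity:
        kept.sort(key=lambda term: (-popularity.get(term, 0), -scores[term]))
    else:
        kept.sort(key=lambda term: -scores[term])
    # 3. Return at most page_size terms
    return kept[:page_size]


# Hypothetical similarity scores and document counts for the term 'chaig'
scores = {"chai": 0.96, "chang": 0.89, "chartreuse verte": 0.6}
popularity = {"chai": 1, "chang": 3, "chartreuse verte": 1}

if __name__ == "__main__":
    print(pick_suggestions(scores, popularity, accuracy=0.7))
    print(pick_suggestions(scores, popularity, accuracy=0.4, sort_by_popularity=False))
```

Lowering accuracy widens the result set, while page_size bounds it regardless of how many terms pass the threshold.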