Full-Text Search with Index


  • Before reading this article, see Full-Text search with dynamic queries to learn about the search method.

  • All capabilities provided by search with a dynamic query can also be used when querying a static-index.

  • However, unlike a dynamic search query, where an auto-index is created for you,
    a static-index requires explicit configuration:

    • You must configure the index-field in which you want to search.
      See examples below.

    • You can configure which analyzer will be used to tokenize this field.
      See selecting an analyzer.



Indexing single field for FTS

The index:

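# Import paths assume the 'ravendb' Python client package; adjust to your installed version
from ravendb import AbstractIndexCreationTask
from ravendb.documents.indexes.definitions import FieldIndexing


class Employees_ByNotes(AbstractIndexCreationTask):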
    # The IndexEntry class defines the index-fields
    class IndexEntry:
        def __init__(self, employee_notes: str = None):
            self.employee_notes = employee_notes

    def __init__(self):
        super().__init__()
        # The 'Map' function defines the content of the index-fields
        self.map = (
            "from employee in docs.Employees "
            "select new "
            "{ "
            " employee_notes = employee.Notes[0]"
            "}"
        )

        # Configure the index-field for FTS:
        # Set 'FieldIndexing.SEARCH' on index-field 'employee_notes'
        self._index("employee_notes", FieldIndexing.SEARCH)

        # Optionally: Set your choice of analyzer for the index-field.
        # Here the text from index-field 'employee_notes' will be tokenized by 'WhitespaceAnalyzer'.
        self._analyze("employee_notes", "WhitespaceAnalyzer")

        # Note:
        # If no analyzer is set, the default 'RavenStandardAnalyzer' is used.

  • Use the search method to run a full-text search when querying the index.

  • Refer to Full-Text search with dynamic queries for all available Search options,
    such as using wildcards, searching for multiple terms, etc.

employees = list(
    session
    # Query the index
    .query_index_type(Employees_ByNotes, Employees_ByNotes.IndexEntry)
    # Call 'search':
    # pass the index field that was configured for FTS and the term to search for.
    .search("employee_notes", "French")
    .of_type(Employee)
)
# * Results will contain all Employee documents that have 'French' in their 'Notes' field.

# * Search is case-sensitive since field was indexed using the 'WhitespaceAnalyzer'
#   which preserves casing.

The equivalent RQL:

from index "Employees/ByNotes"
where search(employee_notes, "French")
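
The case-sensitivity note above follows from how the field is tokenized. The snippet below is a rough, standalone illustration of that effect. It does not call RavenDB or Lucene: `whitespace_tokens` and `lowercased_tokens` are hypothetical helpers that only approximate a whitespace analyzer and a lowercasing analyzer, and the sample note text is made up.

```python
import re
from typing import List


def whitespace_tokens(text: str) -> List[str]:
    # Split on whitespace only: casing and attached punctuation are preserved.
    return text.split()


def lowercased_tokens(text: str) -> List[str]:
    # Simplified stand-in for a lowercasing analyzer (an assumption, not RavenDB code):
    # split on non-alphanumeric characters, then lowercase each token.
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


note = "Fluent in French and Italian."  # hypothetical 'Notes' value

# The search term is analyzed the same way as the indexed text,
# so a match requires an identical token on both sides.
exact_case_match = "French" in whitespace_tokens(note)
wrong_case_match = "french" in whitespace_tokens(note)
lowercased_match = lowercased_tokens("FRENCH")[0] in lowercased_tokens(note)

print(exact_case_match)   # True
print(wrong_case_match)   # False
print(lowercased_match)   # True
```

Note that whitespace-only splitting also keeps trailing punctuation, so in this sketch the token for the last word is `Italian.` rather than `Italian`.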

Indexing multiple fields for FTS

The index:

from typing import List


class Employees_ByEmployeeData(AbstractIndexCreationTask):
    class IndexEntry:
        def __init__(self, employee_data: List = None):
            self.employee_data = employee_data

    def __init__(self):
        super().__init__()
        self.map = (
            "from employee in docs.Employees "
            "select new {"
            "  employee_data = "
            "  {"
            # Multiple document-fields can be indexed
            # into the single index-field 'employee_data'
            "    employee.FirstName,"
            "    employee.LastName,"
            "    employee.Title,"
            "    employee.Notes"
            "  }"
            "}"
        )
        # Configure the index-field for FTS:
        # Set 'FieldIndexing.SEARCH' on index-field 'employee_data'
        self._index("employee_data", FieldIndexing.SEARCH)

        # Note:
        # Since no analyzer is set, the default 'RavenStandardAnalyzer' is used.

Sample query:

employees = list(
    session
    # Query the static-index
    .query_index_type(Employees_ByEmployeeData, Employees_ByEmployeeData.IndexEntry)
    .open_subclause()
    # A logical OR is applied between the following two search calls
    .search("employee_data", "Manager")
    # A logical AND is applied between the following two terms
    .search("employee_data", "French Spanish", operator=SearchOperator.AND)
    .close_subclause()
    .of_type(Employee)
)

# * Results will contain all Employee documents that have:
#   ('Manager' in any of the 4 document-fields that were indexed)
#   OR
#   ('French' AND 'Spanish' in any of the 4 document-fields that were indexed)

# * Search is case-insensitive since the default analyzer is used

The equivalent RQL:

from index "Employees/ByEmployeeData"
where (search(employee_data, "Manager") or search(employee_data, "French Spanish", and))
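
To make the OR/AND semantics of the sample query concrete, here is a small in-memory sketch of the same matching logic. It is not RavenDB code: the `tokens` and `search` helpers and the three employee records are hypothetical, and the tokenization is a simplified, case-insensitive approximation of the default analyzer.

```python
import re
from typing import Dict, Iterable, List, Set


def tokens(values: Iterable[str]) -> Set[str]:
    # Merge the terms of several document-fields into one searchable set,
    # mimicking multiple fields indexed into a single index-field.
    found: Set[str] = set()
    for value in values:
        found.update(t.lower() for t in re.findall(r"[A-Za-z0-9]+", value))
    return found


def search(entry_tokens: Set[str], terms: str, require_all: bool = False) -> bool:
    # require_all=False: any term may match (OR between terms).
    # require_all=True: every term must match (AND between terms).
    wanted = [t.lower() for t in terms.split()]
    check = all if require_all else any
    return check(t in entry_tokens for t in wanted)


# Hypothetical data: FirstName, LastName, Title, Notes
employees: Dict[str, List[str]] = {
    "employees/1": ["Anne", "Smith", "Sales Manager", "Speaks German."],
    "employees/2": ["Bob", "Jones", "Sales Representative", "Fluent in French and Spanish."],
    "employees/3": ["Carl", "Reed", "Sales Representative", "Reads French."],
}

matching = [
    doc_id
    for doc_id, fields in employees.items()
    if search(tokens(fields), "Manager")
    or search(tokens(fields), "French Spanish", require_all=True)
]
print(matching)  # ['employees/1', 'employees/2']
```

In this sketch `employees/3` is excluded because it contains 'French' but not 'Spanish', and it has no 'Manager' term in any of the four fields.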

Boosting search results

  • To prioritize results, you can assign a boost value to the searched terms
    in either of the following ways:

    • Add a boost value to the relevant index-field inside the index definition.
      Refer to the article Indexes - Boosting.

    • Add a boost value to the queried terms at query time.
      Refer to the article Boost search results.
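
As a rough intuition for what a boost value does to result ordering, the toy sketch below ranks entries by the sum of the boosts of the terms they contain. This is only an illustration under simplifying assumptions: the actual relevance scoring performed by the server's search engine is more involved, and the `score` helper, the boost values, and the sample entries are all hypothetical.

```python
from typing import Dict, List, Set


def score(entry_tokens: Set[str], boosted_terms: Dict[str, float]) -> float:
    # Toy scoring: add up the boost of every searched term found in the entry.
    return sum(boost for term, boost in boosted_terms.items() if term in entry_tokens)


# Hypothetical index entries, already tokenized and lowercased
entries: Dict[str, Set[str]] = {
    "employees/1": {"sales", "manager", "german"},
    "employees/2": {"sales", "representative", "french"},
    "employees/3": {"sales", "manager", "french"},
}

# 'manager' is given a much higher boost than 'french'
boosts = {"manager": 10.0, "french": 1.0}

ranked: List[str] = sorted(
    (doc_id for doc_id in entries if score(entries[doc_id], boosts) > 0),
    key=lambda doc_id: score(entries[doc_id], boosts),
    reverse=True,
)
print(ranked)  # ['employees/3', 'employees/1', 'employees/2']
```

Entries that match the higher-boosted term rise to the top, and an entry matching both terms ranks first.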