Full-Text Search with Index
- Before reading this article, please refer to Full-Text search with dynamic queries to learn about the Search method.
- All capabilities provided by Search with a dynamic query can also be used when querying a static-index.
- However, unlike a dynamic search query, where an auto-index is created for you, when using a static-index:
  - You must configure the index-field in which you want to search. See the examples below.
  - You can configure which analyzer will be used to tokenize this field. See selecting an analyzer.
Indexing single field for FTS
The index:
public class Employees_ByNotes :
    AbstractIndexCreationTask<Employee, Employees_ByNotes.IndexEntry>
{
    // The IndexEntry class defines the index-fields
    public class IndexEntry
    {
        public string EmployeeNotes { get; set; }
    }

    public Employees_ByNotes()
    {
        // The 'Map' function defines the content of the index-fields
        Map = employees => from employee in employees
            select new IndexEntry()
            {
                EmployeeNotes = employee.Notes[0]
            };

        // Configure the index-field for FTS:
        // Set 'FieldIndexing.Search' on index-field 'EmployeeNotes'
        Index(x => x.EmployeeNotes, FieldIndexing.Search);

        // Optionally: Set your choice of analyzer for the index-field.
        // Here the text from index-field 'EmployeeNotes' will be tokenized by 'WhitespaceAnalyzer'.
        Analyze(x => x.EmployeeNotes, "WhitespaceAnalyzer");

        // Note:
        // If no analyzer is set then the default 'RavenStandardAnalyzer' is used.
    }
}
Query with Search:
- Use Search to make a full-text search when querying the index.
- Refer to Full-Text search with dynamic queries for all available Search options, such as using wildcards, searching for multiple terms, etc.
Query:
List<Employee> employees = session
    // Query the index
    .Query<Employees_ByNotes.IndexEntry, Employees_ByNotes>()
    // Call 'Search':
    // pass the index field that was configured for FTS and the term to search for.
    .Search(x => x.EmployeeNotes, "French")
    .OfType<Employee>()
    .ToList();

// * Results will contain all Employee documents that have 'French' in their 'Notes' field.
//
// * Search is case-sensitive since the field was indexed using the 'WhitespaceAnalyzer',
//   which preserves casing.
Query (async):
List<Employee> employees = await asyncSession
    // Query the index
    .Query<Employees_ByNotes.IndexEntry, Employees_ByNotes>()
    // Call 'Search':
    // pass the index field that was configured for FTS and the term to search for.
    .Search(x => x.EmployeeNotes, "French")
    .OfType<Employee>()
    .ToListAsync();

// * Results will contain all Employee documents that have 'French' in their 'Notes' field.
//
// * Search is case-sensitive since the field was indexed using the 'WhitespaceAnalyzer',
//   which preserves casing.
DocumentQuery:
List<Employee> employees = session.Advanced
    // Query the index
    .DocumentQuery<Employees_ByNotes.IndexEntry, Employees_ByNotes>()
    // Call 'Search':
    // pass the index field that was configured for FTS and the term to search for.
    .Search(x => x.EmployeeNotes, "French")
    .OfType<Employee>()
    .ToList();

// * Results will contain all Employee documents that have 'French' in their 'Notes' field.
//
// * Search is case-sensitive since the field was indexed using the 'WhitespaceAnalyzer',
//   which preserves casing.
RQL:
from index "Employees/ByNotes"
where search(EmployeeNotes, "French")
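To see why the query on this index is case-sensitive, the following self-contained sketch approximates the two tokenization behaviors with plain string operations. It is only an illustration and not the Lucene analyzer code: the class `TokenizationSketch` and its two methods are hypothetical names introduced here, and the real 'RavenStandardAnalyzer' does more than lowercasing.

```csharp
using System;
using System.Linq;

public static class TokenizationSketch
{
    private static readonly char[] Whitespace = { ' ', '\t', '\r', '\n' };

    // Approximates whitespace tokenization: split on whitespace, keep the original casing.
    public static string[] TokenizeOnWhitespace(string text) =>
        text.Split(Whitespace, StringSplitOptions.RemoveEmptyEntries);

    // Approximates a lowercasing analyzer: same split, then lowercase every token.
    public static string[] TokenizeAndLowercase(string text) =>
        TokenizeOnWhitespace(text)
            .Select(token => token.ToLowerInvariant())
            .ToArray();
}
```

For the input "Fluent in French", `TokenizeOnWhitespace` keeps the token `French` unchanged, so a search for `french` would find no matching term, whereas `TokenizeAndLowercase` produces `french` and a lowercased search term would match.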
Indexing multiple fields for FTS
The index:
public class Employees_ByEmployeeData :
    AbstractIndexCreationTask<Employee, Employees_ByEmployeeData.IndexEntry>
{
    public class IndexEntry
    {
        public object[] EmployeeData { get; set; }
    }

    public Employees_ByEmployeeData()
    {
        Map = employees => from employee in employees
            select new IndexEntry()
            {
                EmployeeData = new object[]
                {
                    // Multiple document-fields can be indexed
                    // into the single index-field 'EmployeeData'
                    employee.FirstName,
                    employee.LastName,
                    employee.Title,
                    employee.Notes
                }
            };

        // Configure the index-field for FTS:
        // Set 'FieldIndexing.Search' on index-field 'EmployeeData'
        Index(x => x.EmployeeData, FieldIndexing.Search);

        // Note:
        // Since no analyzer is set, the default 'RavenStandardAnalyzer' is used.
    }
}
Sample query:
Query:
List<Employee> employees = session
    // Query the static-index
    .Query<Employees_ByEmployeeData.IndexEntry, Employees_ByEmployeeData>()
    // A logical OR is applied between the following two Search calls:
    .Search(x => x.EmployeeData, "Manager")
    // A logical AND is applied between the following two terms:
    .Search(x => x.EmployeeData, "French Spanish", @operator: SearchOperator.And)
    .OfType<Employee>()
    .ToList();

// * Results will contain all Employee documents that have:
//   ('Manager' in any of the 4 document-fields that were indexed)
//   OR
//   ('French' AND 'Spanish' in any of the 4 document-fields that were indexed)
//
// * Search is case-insensitive since the default analyzer is used
Query (async):
List<Employee> employees = await asyncSession
    // Query the static-index
    .Query<Employees_ByEmployeeData.IndexEntry, Employees_ByEmployeeData>()
    // A logical OR is applied between the following two Search calls:
    .Search(x => x.EmployeeData, "Manager")
    // A logical AND is applied between the following two terms:
    .Search(x => x.EmployeeData, "French Spanish", @operator: SearchOperator.And)
    .OfType<Employee>()
    .ToListAsync();

// * Results will contain all Employee documents that have:
//   ('Manager' in any of the 4 document-fields that were indexed)
//   OR
//   ('French' AND 'Spanish' in any of the 4 document-fields that were indexed)
//
// * Search is case-insensitive since the default analyzer is used
DocumentQuery:
List<Employee> employees = session.Advanced
    // Query the static-index
    .DocumentQuery<Employees_ByEmployeeData.IndexEntry, Employees_ByEmployeeData>()
    .OpenSubclause()
    // A logical OR is applied between the following two Search calls:
    .Search(x => x.EmployeeData, "Manager")
    // A logical AND is applied between the following two terms:
    .Search(x => x.EmployeeData, "French Spanish", @operator: SearchOperator.And)
    .CloseSubclause()
    .OfType<Employee>()
    .ToList();

// * Results will contain all Employee documents that have:
//   ('Manager' in any of the 4 document-fields that were indexed)
//   OR
//   ('French' AND 'Spanish' in any of the 4 document-fields that were indexed)
//
// * Search is case-insensitive since the default analyzer is used
RQL:
from index "Employees/ByEmployeeData"
where (search(EmployeeData, "Manager") or search(EmployeeData, "French Spanish", and))
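The way the two Search calls combine can be pictured with a small self-contained sketch that evaluates the same condition against a set of already-tokenized terms. This only models the boolean logic and does not reflect how the search engine evaluates queries; `SearchLogicSketch` and its methods are hypothetical names, and the terms are assumed to be lowercased, as the default analyzer is case-insensitive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SearchLogicSketch
{
    // True if at least one of the searched terms is among the indexed terms (OR).
    public static bool MatchesAny(ISet<string> indexedTerms, params string[] searched) =>
        searched.Any(term => indexedTerms.Contains(term));

    // True if every searched term is among the indexed terms (AND).
    public static bool MatchesAll(ISet<string> indexedTerms, params string[] searched) =>
        searched.All(term => indexedTerms.Contains(term));

    // Mirrors the sample query: 'manager' OR ('french' AND 'spanish').
    public static bool MatchesSampleQuery(ISet<string> indexedTerms) =>
        MatchesAny(indexedTerms, "manager") ||
        MatchesAll(indexedTerms, "french", "spanish");
}
```

Under these assumptions, an entry whose terms include `manager` matches, an entry with both `french` and `spanish` matches, and an entry with only `french` does not.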
Boosting search results
- To prioritize results, you can provide a boost value to the searched terms. This can be applied in either of the following ways:
  - Add a boost value to the relevant index-field inside the index definition. Refer to the article indexes - boosting.
  - Add a boost value to the queried terms at query time. Refer to the article Boost search results.
Searching with wildcards
- When making a full-text search with wildcards in the search terms, the presence of wildcards (*) in the terms sent to the search engine is determined by the transformations applied by the analyzer used in the index.
- Note the different behavior in the following cases, as described below.
- When using Corax as the search engine, this behavior will only apply to indexes that are newly created or have been reset.
When using StandardAnalyzer or NGramAnalyzer:
Usually, the same analyzer used to tokenize field content at indexing time is also used to process the terms provided in the full-text search query before they are sent to the search engine to retrieve matching documents.
However, in the following cases:
- When making a dynamic search query
- or when querying a static index that uses the default StandardAnalyzer
- or when querying a static index that uses the NGramAnalyzer
the queried terms in the Search method are processed with the LowerCaseKeywordAnalyzer before being sent to the search engine.
This analyzer does not remove the *, so the terms are sent with the *, as provided in the search terms.
For example:
public class Employees_ByNotes_usingDefaultAnalyzer :
    AbstractIndexCreationTask<Employee, Employees_ByNotes_usingDefaultAnalyzer.IndexEntry>
{
    public class IndexEntry
    {
        public string EmployeeNotes { get; set; }
    }

    public Employees_ByNotes_usingDefaultAnalyzer()
    {
        Map = employees => from employee in employees
            select new IndexEntry()
            {
                EmployeeNotes = employee.Notes[0]
            };

        // Configure the index-field for FTS:
        Index(x => x.EmployeeNotes, FieldIndexing.Search);

        // Since no analyzer is explicitly set
        // then the default 'RavenStandardAnalyzer' will be used at indexing time.

        // However, when making a search query with wildcards,
        // the 'LowerCaseKeywordAnalyzer' will be used to process the search terms
        // prior to sending them to the search engine.
    }
}
Query:
List<Employee> employees = session
    .Query<Employees_ByNotes_usingDefaultAnalyzer.IndexEntry,
        Employees_ByNotes_usingDefaultAnalyzer>()
    // If you request to include explanations,
    // you can see the exact term that was sent to the search engine.
    .ToDocumentQuery()
    .IncludeExplanations(out var explanations)
    .ToQueryable()
    // Provide a term with a wildcard to the Search method:
    .Search(x => x.EmployeeNotes, "*rench")
    .OfType<Employee>()
    .ToList();

// Results will contain all Employee documents that have terms that end with 'rench'
// (e.g. French).

// Checking the explanations, you can see that the search term 'rench'
// was sent to the search engine WITH the leading wildcard, i.e. '*rench'
// since the 'LowerCaseKeywordAnalyzer' is used in this case.
var explanation = explanations.GetExplanations(employees[0].Id)[0];
Assert.Contains($"EmployeeNotes:*rench", explanation);
Query (async):
List<Employee> employees = await asyncSession
    .Query<Employees_ByNotes_usingDefaultAnalyzer.IndexEntry,
        Employees_ByNotes_usingDefaultAnalyzer>()
    // If you request to include explanations,
    // you can see the exact term that was sent to the search engine.
    .ToDocumentQuery()
    .IncludeExplanations(out var explanations)
    .ToQueryable()
    // Provide a term with a wildcard to the Search method:
    .Search(x => x.EmployeeNotes, "*rench")
    .OfType<Employee>()
    .ToListAsync();

// Results will contain all Employee documents that have terms that end with 'rench'
// (e.g. French).

// Checking the explanations, you can see that the search term 'rench'
// was sent to the search engine WITH the leading wildcard, i.e. '*rench'
// since the 'LowerCaseKeywordAnalyzer' is used in this case.
var explanation = explanations.GetExplanations(employees[0].Id)[0];
Assert.Contains($"EmployeeNotes:*rench", explanation);
DocumentQuery:
List<Employee> employees = session.Advanced
    .DocumentQuery<Employees_ByNotes_usingDefaultAnalyzer.IndexEntry,
        Employees_ByNotes_usingDefaultAnalyzer>()
    // If you request to include explanations,
    // you can see the exact term that was sent to the search engine.
    .IncludeExplanations(out var explanations)
    // Provide a term with a wildcard to the Search method:
    .Search(x => x.EmployeeNotes, "*rench")
    .OfType<Employee>()
    .ToList();

// Results will contain all Employee documents that have terms that end with 'rench'
// (e.g. French).

// Checking the explanations, you can see that the search term 'rench'
// was sent to the search engine WITH the leading wildcard, i.e. '*rench'
// since the 'LowerCaseKeywordAnalyzer' is used in this case.
var explanation = explanations.GetExplanations(employees[0].Id)[0];
Assert.Contains($"EmployeeNotes:*rench", explanation);
RQL:
from index "Employees/ByNotes/usingDefaultAnalyzer"
where search(EmployeeNotes, "*rench")
include explanations()
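The effect of processing a search term as a single lowercased keyword, as opposed to running it through a tokenizer, can be approximated with plain string handling. The sketch below is a rough, self-contained illustration and not the Lucene or RavenDB implementation; `QueryTermSketch` and both of its methods are hypothetical names used only here.

```csharp
using System;
using System.Linq;

public static class QueryTermSketch
{
    // Keyword-style processing: the whole term is treated as one token and only lowercased.
    // Characters such as '*' are left in place.
    public static string LowercaseKeyword(string term) => term.ToLowerInvariant();

    // Tokenizer-style processing: every character that is not a letter or digit
    // acts as a separator, then each token is lowercased. A '*' is lost here.
    public static string[] LowercaseLetterTokens(string term)
    {
        char[] cleaned = term
            .Select(c => char.IsLetterOrDigit(c) ? c : ' ')
            .ToArray();

        return new string(cleaned)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(token => token.ToLowerInvariant())
            .ToArray();
    }
}
```

In this approximation, `LowercaseKeyword("*Rench")` keeps the leading wildcard and yields `*rench`, while `LowercaseLetterTokens("*Rench")` yields the single token `rench` with the wildcard dropped.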
When using a custom analyzer:
- When you set a custom analyzer in your index to tokenize field content, the search terms in a query on that index are processed according to the custom analyzer's logic.
- The * will remain in the terms if the custom analyzer allows it. It is the user's responsibility to ensure that wildcards are not removed by the custom analyzer if they should be included in the query.
- Note: An exception to the above is when the wildcard is used as a suffix in the search term (e.g. Fren*). In this case the wildcard will be included in the query regardless of the analyzer's logic.
For example:
public class Employees_ByNotes_usingCustomAnalyzer :
    AbstractIndexCreationTask<Employee, Employees_ByNotes_usingCustomAnalyzer.IndexEntry>
{
    public class IndexEntry
    {
        public string EmployeeNotes { get; set; }
    }

    public Employees_ByNotes_usingCustomAnalyzer()
    {
        Map = employees => from employee in employees
            select new IndexEntry()
            {
                EmployeeNotes = employee.Notes[0]
            };

        // Configure the index-field for FTS:
        Index(x => x.EmployeeNotes, FieldIndexing.Search);

        // Set a custom analyzer for the index-field:
        Analyze(x => x.EmployeeNotes, "CustomAnalyzers.RemoveWildcardsAnalyzer");
    }
}
// The custom analyzer:
// ====================
const string RemoveWildcardsAnalyzer =
    @"
    using System.IO;
    using Lucene.Net.Analysis;
    using Lucene.Net.Analysis.Standard;

    namespace CustomAnalyzers
    {
        public class RemoveWildcardsAnalyzer : StandardAnalyzer
        {
            public RemoveWildcardsAnalyzer() : base(Lucene.Net.Util.Version.LUCENE_30)
            {
            }

            public override TokenStream TokenStream(string fieldName, System.IO.TextReader reader)
            {
                // Read input stream and remove wildcards (*)
                string text = reader.ReadToEnd();
                string processedText = RemoveWildcards(text);
                StringReader newReader = new StringReader(processedText);

                return base.TokenStream(fieldName, newReader);
            }

            private string RemoveWildcards(string input)
            {
                // Replace wildcard characters with an empty string
                return input.Replace(""*"", """");
            }
        }
    }";

// Deploying the custom analyzer:
// ==============================
store.Maintenance.Send(new PutAnalyzersOperation(new AnalyzerDefinition()
{
    Name = "CustomAnalyzers.RemoveWildcardsAnalyzer",
    Code = RemoveWildcardsAnalyzer,
}));
Query:
List<Employee> employees = session
    .Query<Employees_ByNotes_usingCustomAnalyzer.IndexEntry,
        Employees_ByNotes_usingCustomAnalyzer>()
    .ToDocumentQuery()
    .IncludeExplanations(out var explanations)
    .ToQueryable()
    // Provide a term with wildcards to the Search method:
    .Search(x => x.EmployeeNotes, "*French*")
    .OfType<Employee>()
    .ToList();

// Even though wildcards were provided,
// the results will contain only Employee documents that contain the exact term 'French'.

// The search term was sent to the search engine WITHOUT the leading wildcard,
// as the custom analyzer's logic strips it out.

// This can be verified by checking the explanations:
var explanation = explanations.GetExplanations(employees[0].Id)[0];
Assert.Contains($"EmployeeNotes:french", explanation);
Assert.DoesNotContain($"EmployeeNotes:*french", explanation);
Query (async):
List<Employee> employees = await asyncSession
    .Query<Employees_ByNotes_usingCustomAnalyzer.IndexEntry,
        Employees_ByNotes_usingCustomAnalyzer>()
    .ToDocumentQuery()
    .IncludeExplanations(out var explanations)
    .ToQueryable()
    // Provide a term with wildcards to the Search method:
    .Search(x => x.EmployeeNotes, "*French*")
    .OfType<Employee>()
    .ToListAsync();

// Even though wildcards were provided,
// the results will contain only Employee documents that contain the exact term 'French'.

// The search term was sent to the search engine WITHOUT the leading wildcard,
// as the custom analyzer's logic strips it out.

// This can be verified by checking the explanations:
var explanation = explanations.GetExplanations(employees[0].Id)[0];
Assert.Contains($"EmployeeNotes:french", explanation);
Assert.DoesNotContain($"EmployeeNotes:*french", explanation);
DocumentQuery:
List<Employee> employees = session.Advanced
    .DocumentQuery<Employees_ByNotes_usingCustomAnalyzer.IndexEntry,
        Employees_ByNotes_usingCustomAnalyzer>()
    .IncludeExplanations(out var explanations)
    // Provide a term with wildcards to the Search method:
    .Search(x => x.EmployeeNotes, "*French*")
    .OfType<Employee>()
    .ToList();

// Even though wildcards were provided,
// the results will contain only Employee documents that contain the exact term 'French'.

// The search term was sent to the search engine WITHOUT the leading wildcard,
// as the custom analyzer's logic strips it out.

// This can be verified by checking the explanations:
var explanation = explanations.GetExplanations(employees[0].Id)[0];
Assert.Contains($"EmployeeNotes:french", explanation);
Assert.DoesNotContain($"EmployeeNotes:*french", explanation);
RQL:
from index "Employees/ByNotes/usingCustomAnalyzer"
where search(EmployeeNotes, "*French*")
include explanations()
When using the Exact analyzer:
When using the default Exact analyzer in your index (which is KeywordAnalyzer), the wildcards in your search terms remain untouched when querying the index.
The terms are sent to the search engine exactly as produced by the analyzer.
For example:
public class Employees_ByFirstName_usingExactAnalyzer :
    AbstractIndexCreationTask<Employee, Employees_ByFirstName_usingExactAnalyzer.IndexEntry>
{
    public class IndexEntry
    {
        public string FirstName { get; set; }
    }

    public Employees_ByFirstName_usingExactAnalyzer()
    {
        Map = employees => from employee in employees
            select new IndexEntry()
            {
                FirstName = employee.FirstName
            };

        // Set the Exact analyzer for the index-field:
        // (The field will not be tokenized)
        Indexes.Add(x => x.FirstName, FieldIndexing.Exact);
    }
}
Query:
List<Employee> employees = session
    .Query<Employees_ByFirstName_usingExactAnalyzer.IndexEntry,
        Employees_ByFirstName_usingExactAnalyzer>()
    .ToDocumentQuery()
    .IncludeExplanations(out var explanations)
    .ToQueryable()
    // Provide a term with a wildcard to the Search method:
    .Search(x => x.FirstName, "Mich*")
    .OfType<Employee>()
    .ToList();

// Results will contain all Employee documents with FirstName that starts with 'Mich'
// (e.g. Michael).

// The search term, 'Mich*', is sent to the search engine
// exactly as was provided to the Search method, WITH the wildcard.
var explanation = explanations.GetExplanations(employees[0].Id)[0];
Assert.Contains($"FirstName:Mich*", explanation);
Query (async):
List<Employee> employees = await asyncSession
    .Query<Employees_ByFirstName_usingExactAnalyzer.IndexEntry,
        Employees_ByFirstName_usingExactAnalyzer>()
    .ToDocumentQuery()
    .IncludeExplanations(out var explanations)
    .ToQueryable()
    // Provide a term with a wildcard to the Search method:
    .Search(x => x.FirstName, "Mich*")
    .OfType<Employee>()
    .ToListAsync();

// Results will contain all Employee documents with FirstName that starts with 'Mich'
// (e.g. Michael).

// The search term, 'Mich*', is sent to the search engine
// exactly as was provided to the Search method, WITH the wildcard.
var explanation = explanations.GetExplanations(employees[0].Id)[0];
Assert.Contains($"FirstName:Mich*", explanation);
DocumentQuery:
List<Employee> employees = session.Advanced
    .DocumentQuery<Employees_ByFirstName_usingExactAnalyzer.IndexEntry,
        Employees_ByFirstName_usingExactAnalyzer>()
    .IncludeExplanations(out var explanations)
    // Provide a term with a wildcard to the Search method:
    .Search(x => x.FirstName, "Mich*")
    .OfType<Employee>()
    .ToList();

// Results will contain all Employee documents with FirstName that starts with 'Mich'
// (e.g. Michael).

// The search term, 'Mich*', is sent to the search engine
// exactly as was provided to the Search method, WITH the wildcard.
var explanation = explanations.GetExplanations(employees[0].Id)[0];
Assert.Contains($"FirstName:Mich*", explanation);
RQL:
from index "Employees/ByFirstName/usingExactAnalyzer"
where search(FirstName, "Mich*")
include explanations()