Indexes: Term Vectors
- A term vector is a representation of a text document as a vector of identifiers.
  Lucene indexes can contain term vectors for the documents they index.
- Term vectors can be used for various purposes, including similarity searches, information filtering and retrieval, and indexing.
  An index over books, for example, may have term vectors enabled on the book's subject field, so that this field can be used to search for books with similar subjects.
- RavenDB features like MoreLikeThis leverage stored term vectors to accomplish their goals.
Creating an index and enabling Term Vectors on a field
Indexes that include term vectors can be created and configured using the API or Studio.
Using the API
To create an index and enable term vectors on a specific field, we can either:
A. Create an index using AbstractIndexCreationTask and specify the term vectors there.
B. Define the term vectors in an IndexDefinition (directly or using the IndexDefinitionBuilder).
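// Requires: using System.Linq; using Raven.Client.Documents.Indexes;
public class BlogPosts_ByTagsAndContent : AbstractIndexCreationTask<BlogPost>
{
    public BlogPosts_ByTagsAndContent()
    {
        // Index the Tags and Content fields of each blog post
        Map = posts => from doc in posts
                       select new
                       {
                           doc.Tags,
                           doc.Content
                       };

        // Enable full-text search on Content
        Indexes.Add(x => x.Content, FieldIndexing.Search);

        // Store term vectors for Content, including token positions and offsets
        TermVectors.Add(x => x.Content, FieldTermVector.WithPositionsAndOffsets);
    }
}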
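// Requires: using Raven.Client.Documents.Indexes;
//           using Raven.Client.Documents.Operations.Indexes;
IndexDefinitionBuilder<BlogPost> indexDefinitionBuilder =
    new IndexDefinitionBuilder<BlogPost>("BlogPosts/ByTagsAndContent")
    {
        Map = posts => from doc in posts
                       select new
                       {
                           doc.Tags,
                           doc.Content
                       },
        // Enable full-text search on Content
        Indexes =
        {
            { x => x.Content, FieldIndexing.Search }
        },
        // Store term vectors for Content, including token positions and offsets
        TermVectors =
        {
            { x => x.Content, FieldTermVector.WithPositionsAndOffsets }
        }
    };

// Convert the builder to an index definition and deploy it to the server
IndexDefinition indexDefinition = indexDefinitionBuilder
    .ToIndexDefinition(store.Conventions);

store.Maintenance.Send(new PutIndexesOperation(indexDefinition));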
Available Term Vector options include:
public enum FieldTermVector
{
    /// <summary>
    /// Do not store term vectors
    /// </summary>
    No,

    /// <summary>
    /// Store the term vectors of each document. A term vector is a list of the document's
    /// terms and their number of occurrences in that document.
    /// </summary>
    Yes,

    /// <summary>
    /// Store the term vector + token position information
    /// </summary>
    WithPositions,

    /// <summary>
    /// Store the term vector + token offset information
    /// </summary>
    WithOffsets,

    /// <summary>
    /// Store the term vector + token position and offset information
    /// </summary>
    WithPositionsAndOffsets
}
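To make the difference between these options concrete, here is a minimal sketch of the information each one adds. It is not RavenDB or Lucene code: `TermEntry` and `TermVectorSketch` are hypothetical names, and the tokenizer (split on non-alphanumeric characters, lowercase everything) is a simplifying assumption that does not match any particular Lucene analyzer. End offsets here are exclusive, which is a choice made for this sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container for what a term vector can record about one term.
public sealed class TermEntry
{
    // Number of occurrences: the information kept by FieldTermVector.Yes
    public int Frequency;

    // Ordinal token positions: the extra information of WithPositions
    public List<int> Positions = new List<int>();

    // Character offsets (start inclusive, end exclusive): the extra information of WithOffsets
    public List<(int Start, int End)> Offsets = new List<(int Start, int End)>();
}

public static class TermVectorSketch
{
    // Builds a term -> TermEntry map using a naive tokenizer (assumption, not Lucene's analysis).
    public static Dictionary<string, TermEntry> Build(string text)
    {
        var vector = new Dictionary<string, TermEntry>();
        int position = 0;
        int i = 0;

        while (i < text.Length)
        {
            // Skip separator characters
            if (!char.IsLetterOrDigit(text[i]))
            {
                i++;
                continue;
            }

            // Consume one token
            int start = i;
            while (i < text.Length && char.IsLetterOrDigit(text[i]))
            {
                i++;
            }

            string term = text.Substring(start, i - start).ToLowerInvariant();

            if (!vector.TryGetValue(term, out TermEntry entry))
            {
                entry = new TermEntry();
                vector[term] = entry;
            }

            entry.Frequency++;
            entry.Positions.Add(position++);
            entry.Offsets.Add((start, i));
        }

        return vector;
    }
}
```

For the input "Red bike, red helmet", the entry for the term "red" has a frequency of 2, positions 0 and 2, and offsets (0, 3) and (10, 13). Storing positions and offsets costs more index space than frequencies alone, so it is worth enabling them only for fields whose consumers need that detail.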
See the Lucene API documentation to learn which methods and constants are available.
Using Studio
As an example, let's use one of Studio's sample indexes, Product/Search, which has term vectors enabled on its Name field so that a feature like MoreLikeThis can use this field to select a product and find products similar to it.

Term vector enabled on index field
We can now use a query like:
from index 'Product/Search'
where morelikethis(id() = 'products/7-A')
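The same query can also be sent from client code. The following is a minimal sketch that passes the RQL above unchanged through the session's raw query API. The `Product` class is a hypothetical, trimmed-down model of the sample documents, `MoreLikeThisExample` is an illustrative name, and the code assumes an initialized document store pointing at a database that contains the sample data and the Product/Search index.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client.Documents;
using Raven.Client.Documents.Session;

// Hypothetical minimal model of a product document (the real documents have more fields).
public class Product
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public static class MoreLikeThisExample
{
    // Returns the products that the index considers similar to 'products/7-A'.
    // Assumes 'store' is already initialized and targets a database with the sample data.
    public static List<Product> FindSimilar(IDocumentStore store)
    {
        using (IDocumentSession session = store.OpenSession())
        {
            return session.Advanced
                .RawQuery<Product>(
                    "from index 'Product/Search' where morelikethis(id() = 'products/7-A')")
                .ToList();
        }
    }
}
```

Passing the RQL through as a raw query keeps the client code identical to what can be run in Studio, which makes it easy to try the query there first.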