Indexes: Indexing Related Documents


  • Whenever a document references another document, the referenced document is called a Related Document.

  • In the image below, document products/34-A references documents categories/1-A and suppliers/16-A,
    which are therefore its Related Documents.

Referencing related documents

Example I - basic


What is tracked:

  • Both the documents from the indexed collection and the indexed related documents are tracked for changes.
    Re-indexing is triggered by any change in either collection.
    (See the section "Document changes that cause re-indexing" below.)

The index:

  • Following the above Product - Category relationship from the Northwind sample database,
    an index defined on the Products collection can index data from the related Category document.

public class Products_ByCategoryName : AbstractIndexCreationTask<Product>
{
    public class IndexEntry
    {
        public string CategoryName { get; set; }
    }

    public Products_ByCategoryName()
    {
        Map = products => from product in products
            
            // Call LoadDocument to load the related Category document
            // The document ID to load is specified by 'product.Category'
            let category = LoadDocument<Category>(product.Category)
            
            select new IndexEntry
            {
                // Index the Name field from the related Category document
                CategoryName = category.Name
            };
                
            // Any change to either Products or Categories will trigger reindexing 
    }
}

public class Products_ByCategoryName_JS : AbstractJavaScriptIndexCreationTask
{
    public Products_ByCategoryName_JS()
    {
        Maps = new HashSet<string>()
        {
            // Call method 'load' to load the related Category document
            // The document ID to load is specified by 'product.Category'
            // The Name field from the related Category document will be indexed
            
            @"map('products', function(product) {
                let category = load(product.Category, 'Categories')
                return {
                    CategoryName: category.Name
                };
            })"
            
            // Any change to either Products or Categories will trigger reindexing 
        };
    }
}

The query:

  • We can now query the index for Product documents by CategoryName,
    i.e. get all Products that reference a Category with the specified name.

IList<Product> matchingProducts = session
    .Query<Products_ByCategoryName.IndexEntry, Products_ByCategoryName>()
    .Where(x => x.CategoryName == "Beverages")
    .OfType<Product>()
    .ToList();

IList<Product> matchingProducts = await asyncSession
    .Query<Products_ByCategoryName.IndexEntry, Products_ByCategoryName>()
    .Where(x => x.CategoryName == "Beverages")
    .OfType<Product>()
    .ToListAsync();

from index "Products/ByCategoryName"
where CategoryName == "Beverages"

Example II - list


The documents:

// The referencing document
public class Author
{
    public string Id { get; set; }
    public string Name { get; set; }
    
    // Referencing a list of related document IDs
    public List<string> BookIds { get; set; }
}

// The related document
public class Book
{
    public string Id { get; set; }
    public string Name { get; set; }
}

The index:

  • This index indexes the names of all related Book documents.

public class Authors_ByBooks : AbstractIndexCreationTask<Author>
{
    public class IndexEntry
    {
        public IEnumerable<string> BookNames { get; set; }
    }

    public Authors_ByBooks()
    {
        Map = authors => from author in authors
            select new IndexEntry
            {
                // For each Book ID, call LoadDocument and index the book's name
                BookNames = author.BookIds.Select(x => LoadDocument<Book>(x).Name)
            };
        
        // Any change to either Authors or Books will trigger reindexing 
    }
}

public class Authors_ByBooks_JS : AbstractJavaScriptIndexCreationTask
{
    public Authors_ByBooks_JS()
    {
        Maps = new HashSet<string>()
        {
            // For each Book ID, call 'load' and index the book's name
            @"map('Authors', function(author) {
                return {
                    BookNames: author.BookIds.map(x => load(x, 'Books').Name)
                }
            })"
            
            // Any change to either Authors or Books will trigger reindexing 
        };
    }
}

The query:

  • We can now query the index for Author documents by a book's name,
    i.e. get all Authors whose list of books includes a book with the specified name.

// Get all authors that have books with title: "The Witcher"
IList<Author> matchingAuthors = session
    .Query<Authors_ByBooks.IndexEntry, Authors_ByBooks>()
    .Where(x => x.BookNames.Contains("The Witcher"))
    .OfType<Author>()
    .ToList();

// Get all authors that have books with title: "The Witcher"
IList<Author> matchingAuthors = await asyncSession
    .Query<Authors_ByBooks.IndexEntry, Authors_ByBooks>()
    .Where(x => x.BookNames.Contains("The Witcher"))
    .OfType<Author>()
    .ToListAsync();

// Get all authors that have books with title: "The Witcher"
from index "Authors/ByBooks"
where BookNames == "The Witcher"

Tracking implications

  • Indexing related data with tracking can be a useful way to query documents by their related data.
    However, that may come with performance costs.

  • Re-indexing is triggered whenever any document in the collection referenced by LoadDocument changes.
    Even when only a single field from the related document is indexed,
    a change to any other field of that document also causes re-indexing.
    (See the section "Document changes that cause re-indexing" below.)

  • Frequent re-indexing will increase CPU usage and reduce performance,
    and index results may be stale for prolonged periods.

  • Tracking indexed related data is best suited to cases where the related collection is known to change infrequently.
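
The effect described above can be observed with a short experiment. The following is only a sketch, under these assumptions: the index Products_ByCategoryName from Example I is already deployed, `store` is an initialized document store, and a document with ID categories/1-A exists. The class shapes and the names `RelatedChangeDemo` and `TouchCategoryAndQuery` are hypothetical and used for illustration only. The sketch edits a field of the related Category that the index does not use, then queries the index while waiting for it to become non-stale.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Raven.Client.Documents;

// Minimal document shapes (assumed here for illustration)
public class Category
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
}

public class Product
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Category { get; set; }
}

public static class RelatedChangeDemo
{
    // Hypothetical helper: modifies a related document, then queries the index
    public static List<Product> TouchCategoryAndQuery(IDocumentStore store)
    {
        using (var session = store.OpenSession())
        {
            var category = session.Load<Category>("categories/1-A");
            if (category != null)
            {
                // 'Description' is not indexed, but because the document is tracked,
                // saving this change is still expected to trigger re-indexing
                category.Description = "Updated description";
                session.SaveChanges();
            }
        }

        using (var session = store.OpenSession())
        {
            // Query by index name and wait (up to 15 seconds)
            // until the index has caught up with the change above
            return session.Advanced
                .DocumentQuery<Product>("Products/ByCategoryName")
                .WhereEquals("CategoryName", "Beverages")
                .WaitForNonStaleResults(TimeSpan.FromSeconds(15))
                .ToList();
        }
    }
}
```

Waiting for non-stale results keeps the example deterministic, but it makes the query wait for indexing to finish, so it is usually reserved for tests and demos.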

Document changes that cause re-indexing

  • The following changes made to a document will trigger re-indexing:

    • Any modification to any document field (not just to the indexed fields)
    • Adding or deleting an attachment
    • Creating a new time series (modifying an existing one will not trigger re-indexing)
    • Creating a new counter (modifying an existing one will not trigger re-indexing)
  • Any such change triggers re-indexing, whether it is made to a document in the indexed collection or to one of the indexed related documents.

LoadDocument syntax

Syntax for LINQ-index:

T LoadDocument<T>(string relatedDocumentId);

T LoadDocument<T>(string relatedDocumentId, string relatedCollectionName);

T[] LoadDocument<T>(IEnumerable<string> relatedDocumentIds);

T[] LoadDocument<T>(IEnumerable<string> relatedDocumentIds, string relatedCollectionName);

Syntax for JavaScript-index:

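object load(relatedDocumentId, relatedCollectionName);

Parameters

| Parameter             | Type                | Description                             |
|-----------------------|---------------------|-----------------------------------------|
| relatedDocumentId     | string              | ID of the related document to load      |
| relatedCollectionName | string              | The related collection name             |
| relatedDocumentIds    | IEnumerable<string> | A list of related document IDs to load  |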
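
As an illustration of the overload that accepts a list of IDs, here is a possible variation on the index from Example II. It is a sketch, not the documented implementation: the index name `Authors_ByBookNames_MultiLoad` is hypothetical, and the Author and Book classes are repeated only to keep the snippet self-contained. It loads all related Book documents in a single LoadDocument call and passes the related collection name explicitly.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client.Documents.Indexes;

// Same document shapes as in Example II
public class Author
{
    public string Id { get; set; }
    public string Name { get; set; }
    public List<string> BookIds { get; set; }
}

public class Book
{
    public string Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical index name, chosen for this sketch only
public class Authors_ByBookNames_MultiLoad : AbstractIndexCreationTask<Author>
{
    public class IndexEntry
    {
        public IEnumerable<string> BookNames { get; set; }
    }

    public Authors_ByBookNames_MultiLoad()
    {
        Map = authors => from author in authors

            // Load all related Book documents with one call,
            // using the overload that takes a list of IDs and a collection name
            let books = LoadDocument<Book>(author.BookIds, "Books")

            select new IndexEntry
            {
                // Index the Name field of every loaded Book
                BookNames = books.Select(book => book.Name)
            };

        // As with the single-ID overload, the loaded Book documents are tracked
    }
}
```

The index entries should be equivalent to those produced by calling LoadDocument once per ID, as in Example II; which form to use is mostly a matter of readability.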