Indexes: Indexing Related Documents


  • Whenever a document references another document, the referenced document is called a Related Document.

  • In the image below, document products/34-A references documents categories/1-A and suppliers/16-A,
    which are considered Related Documents.

    [Image: Referencing related documents]

Example I - basic


What is tracked:

  • Both the documents from the indexed collection and the indexed related documents are tracked for changes.
    Re-indexing will be triggered by any change in either collection.
    (See "Document changes that cause re-indexing" below.)

The index:

  • Following the above Product - Category relationship from the Northwind sample database,
    an index defined on the Products collection can index data from the related Category document.

class Products_ByCategoryName extends AbstractCsharpIndexCreationTask {
    constructor() {
        super();
        
        // Call LoadDocument to load the related Category document
        // The document ID to load is specified by 'product.Category'
        // The Name field from the related Category document will be indexed
        
        this.map = `docs.Products.Select(product => new {
            CategoryName = (this.LoadDocument(product.Category, "Categories")).Name 
        })`;

        // Since NoTracking was Not specified,
        // then any change to either Products or Categories will trigger reindexing 
    }
}

class Products_ByCategoryName_JS extends AbstractJavaScriptIndexCreationTask {
    constructor () {
        super();

        const { load } = this.mapUtils();

        this.map("Products", product => {
            return {
                // Call method 'load' to load the related Category document
                // The document ID to load is specified by 'product.Category'
                // The Name field from the related Category document will be indexed                
                categoryName: load(product.Category, "Categories").Name

                // Since NoTracking was Not specified,
                // then any change to either Products or Categories will trigger reindexing
            };
        });
    }
}

The query:

  • We can now query the index for Product documents by CategoryName,
    i.e. get all matching Products that reference a Category that has the specified name term.

const matchingProducts = await session
    .query({indexName: "Products/ByCategoryName"})
    .whereEquals("CategoryName", "Beverages")
    .all();

from index "Products/ByCategoryName"
where CategoryName == "Beverages"

Example II - list


The documents:

// The referencing document
class Author {
    constructor(id, name, bookIds) {
        this.id = id;
        this.name = name;
        
        // Referencing a list of related document IDs
        this.bookIds = bookIds;
    }
}
// The related document
class Book {
    constructor(id, name) {
        this.id = id;
        this.name = name;
    }
}

The index:

  • This index will index all names of the related Book documents.

class Authors_ByBooks extends AbstractCsharpIndexCreationTask {
    constructor() {
        super();

        // For each Book ID, call LoadDocument and index the book's name
        this.map = `docs.Authors.Select(author => new {
            BookNames = author.bookIds.Select(x => (this.LoadDocument(x, "Books")).name) 
        })`;

        // Since NoTracking was Not specified,
        // then any change to either Authors or Books will trigger reindexing
    }
}

class Authors_ByBooks_JS extends AbstractJavaScriptIndexCreationTask {
    constructor() {
        super();

        const { load } = this.mapUtils();

        this.map("Authors", author => {
            return {
                // For each Book ID, call 'load' and index the book's name
                BookNames: author.bookIds.map(x => load(x, "Books").name)

                // Since NoTracking was Not specified,
                // then any change to either Authors or Books will trigger reindexing
            };
        });
    }
}

The query:

  • We can now query the index for Author documents by a book's name,
    i.e. get all Authors that have the specified book's name in their list.

const matchingAuthors = await session
    .query({indexName: "Authors/ByBooks"})
    .whereEquals("BookNames", "The Witcher")
    .all();

// Get all authors that have books with title: "The Witcher"
from index "Authors/ByBooks"
where BookNames = "The Witcher"
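
The way a list of related names ends up searchable per author can be sketched in plain JavaScript, with no server involved. This is only an illustrative in-memory model: the sample documents, `buildAuthorEntries`, and `findAuthorsByBookName` are hypothetical names, not part of the RavenDB client API. It assumes that an equality query on a multi-value index field matches when any one of the values matches.

```javascript
// Illustrative sample data (hypothetical IDs, not taken from a real database)
const booksById = {
    "books/1": { id: "books/1", name: "The Witcher" },
    "books/2": { id: "books/2", name: "Narrenturm" },
    "books/3": { id: "books/3", name: "Solaris" },
};

const authors = [
    { id: "authors/1", name: "Andrzej Sapkowski", bookIds: ["books/1", "books/2"] },
    { id: "authors/2", name: "Stanisław Lem", bookIds: ["books/3"] },
];

// Mimics the map function: one entry per Author,
// holding the names of all the Book documents it references
function buildAuthorEntries(authorDocs, books) {
    return authorDocs.map(author => ({
        id: author.id,
        BookNames: author.bookIds.map(bookId => books[bookId].name),
    }));
}

// Mimics the query: an Author matches if ANY of its indexed book names equals the term
function findAuthorsByBookName(entries, bookName) {
    return entries
        .filter(entry => entry.BookNames.includes(bookName))
        .map(entry => entry.id);
}

const authorEntries = buildAuthorEntries(authors, booksById);
const witcherAuthors = findAuthorsByBookName(authorEntries, "The Witcher");
console.log(witcherAuthors);
```

Each author produces a single entry whose BookNames field holds several values, which is why one author can be found by any of their books.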

Tracking implications

  • Indexing related data with tracking can be a useful way to query documents by their related data.
    However, that may come with performance costs.

  • Re-indexing will be triggered whenever any document in the collection that is referenced by LoadDocument is changed.
    Even when indexing just a single field from the related document, any change to any other field will cause re-indexing.
    (See "Document changes that cause re-indexing" below.)

  • Frequent re-indexing will increase CPU usage and reduce performance,
    and index results may be stale for prolonged periods.

  • Tracking indexed related data is more useful when the indexed related collection is known not to change much.
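
The cost described above can be made concrete with a small in-memory model. This is a rough sketch only: `TrackingIndexModel` and its members are hypothetical and do not reflect how the server implements tracking. It simply counts how many Product entries are rebuilt when a single tracked Category changes a field that is not even indexed.

```javascript
// Hypothetical model of an index that tracks related documents.
// None of these names belong to the RavenDB client API.
class TrackingIndexModel {
    constructor(products, categories) {
        this.products = products;     // Map: product ID -> product document
        this.categories = categories; // Map: category ID -> category document
        this.entries = new Map();     // Map: product ID -> index entry
        this.reindexCount = 0;        // Number of entries built so far

        for (const productId of products.keys()) {
            this.indexProduct(productId);
        }
    }

    // Build (or rebuild) the index entry for a single Product
    indexProduct(productId) {
        const product = this.products.get(productId);
        const category = this.categories.get(product.Category);
        this.entries.set(productId, { CategoryName: category ? category.Name : null });
        this.reindexCount++;
    }

    // Any change to a tracked Category rebuilds every Product entry that references it,
    // regardless of which Category field was modified
    updateCategory(categoryId, changes) {
        Object.assign(this.categories.get(categoryId), changes);
        for (const [productId, product] of this.products) {
            if (product.Category === categoryId) {
                this.indexProduct(productId);
            }
        }
    }
}

// Illustrative sample data
const sampleProducts = new Map([
    ["products/1-A", { Name: "Chai", Category: "categories/1-A" }],
    ["products/2-A", { Name: "Chang", Category: "categories/1-A" }],
    ["products/3-A", { Name: "Tofu", Category: "categories/7-A" }],
]);
const sampleCategories = new Map([
    ["categories/1-A", { Name: "Beverages", Description: "Soft drinks" }],
    ["categories/7-A", { Name: "Produce", Description: "Dried fruit" }],
]);

const trackingModel = new TrackingIndexModel(sampleProducts, sampleCategories);
const initialWork = trackingModel.reindexCount;

// Only Description changes; the indexed Name field stays the same
trackingModel.updateCategory("categories/1-A", { Description: "Coffees, teas, beers" });
const extraWork = trackingModel.reindexCount - initialWork;

console.log(`Entries rebuilt after one Category change: ${extraWork}`);
```

In this model the indexed values do not change at all, yet every referencing Product is processed again. The more documents reference a frequently modified related document, the more work each change causes.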

Example III - no tracking


What is tracked:

  • Only the documents from the indexed collection are tracked for changes and can trigger re-indexing.
    Any change made to the indexed related documents will Not trigger re-indexing.
    (See "Document changes that cause re-indexing" below.)

The index:

class Products_ByCategoryName_NoTracking extends AbstractCsharpIndexCreationTask {
    constructor() {
        super();

        // Call NoTracking.LoadDocument to load the related Category document w/o tracking
        this.map = `docs.Products.Select(product => new {
            CategoryName = (this.NoTracking.LoadDocument(product.Category, "Categories")).Name 
        })`;

        // Since NoTracking is used -
        // then only the changes to Products will trigger reindexing
    }
}

class Products_ByCategoryName_NoTracking_JS extends AbstractJavaScriptIndexCreationTask {
    constructor() {
        super();

        const { noTracking } = this.mapUtils();

        this.map("Products", product => {
            return {
                // Call 'noTracking.load' to load the related Category document w/o tracking
                categoryName: noTracking.load(product.Category, "Categories").Name
            };
        });
        
        // Since noTracking is used -
        // then only the changes to Products will trigger reindexing
    }
}

The query:

  • When querying the index for Product documents by CategoryName,
    results reflect the Category data as it was when each Product was last indexed.
    Later changes to a Category are Not reflected until a referencing Product is itself re-indexed.

const matchingProducts = await session
    .query({indexName: "Products/ByCategoryName/NoTracking"})
    .whereEquals("CategoryName", "Beverages")
    .all();

from index "Products/ByCategoryName/NoTracking"
where CategoryName == "Beverages"

No-tracking implications

  • Indexing related data with no-tracking can be a useful way to query documents by their related data.
    However, that may come with some data accuracy costs.

  • Re-indexing will Not be triggered when documents in the collection that is referenced by LoadDocument are changed.
    Although this may save system resources, the index entries and the indexed terms may not reflect the current state of the data.

  • Indexing related data without tracking is useful when the indexed related data is fixed and not supposed to change.
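
The accuracy trade-off can be illustrated with a minimal in-memory sketch. The helper `buildProductEntry` and the sample documents are hypothetical, and the model assumes that an entry is rebuilt only when the Product itself changes, at which point the related Category is read again in its current state.

```javascript
// Hypothetical stand-in for the index map function (not a RavenDB API)
function buildProductEntry(product, categoriesById) {
    const category = categoriesById[product.Category];
    return { CategoryName: category.Name };
}

// Illustrative sample data
const categoryDocs = { "categories/1-A": { Name: "Beverages" } };
const chai = { Id: "products/1-A", Name: "Chai", Category: "categories/1-A" };

// Initial indexing of the Product
const noTrackingEntries = { [chai.Id]: buildProductEntry(chai, categoryDocs) };

// The Category is renamed. With no-tracking, nothing is re-indexed,
// so the entry still holds the old name.
categoryDocs["categories/1-A"].Name = "Drinks";
const staleName = noTrackingEntries[chai.Id].CategoryName;

// A later change to the Product itself does trigger re-indexing,
// and the rebuilt entry picks up the current Category name.
chai.Name = "Chai Tea";
noTrackingEntries[chai.Id] = buildProductEntry(chai, categoryDocs);
const refreshedName = noTrackingEntries[chai.Id].CategoryName;

console.log({ staleName, refreshedName });
```

Between the rename and the next Product change, a query for the new category name would miss this Product, while a query for the old name would still return it.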

Document changes that cause re-indexing

  • The following changes done to a document will trigger re-indexing:

    • Any modification to any document field (not just to the indexed fields)
    • Adding/Deleting an attachment
    • Creating a new Time series (modifying existing will not trigger)
    • Creating a new Counter (modifying existing will not trigger)
  • Any such change done on any document in the indexed collection will trigger re-indexing.

  • Any such change done on any document in the indexed related documents will trigger re-indexing
    only if NoTracking was Not used in the index definition.
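
The rules above can be summarized as a small decision function. This is a sketch for reasoning about the rules only: `triggersReindexing` and the change-kind labels are hypothetical names chosen for illustration, not identifiers from RavenDB.

```javascript
// Hypothetical labels for the kinds of change listed above
const TRIGGERING_CHANGES = new Set([
    "field-modified",      // any document field, indexed or not
    "attachment-added",
    "attachment-deleted",
    "time-series-created", // modifying an existing time series does not trigger
    "counter-created",     // modifying an existing counter does not trigger
]);

// Decide whether a change to a document causes the index to re-index.
//   changeKind        - one of the labels above, or any other string
//   isRelatedDocument - true if the document is loaded via LoadDocument / load,
//                       false if it belongs to the indexed collection
//   noTracking        - true if the index loads related documents with NoTracking
function triggersReindexing({ changeKind, isRelatedDocument, noTracking }) {
    if (!TRIGGERING_CHANGES.has(changeKind)) {
        return false;
    }
    if (isRelatedDocument && noTracking) {
        return false;
    }
    return true;
}

const relatedChangeTracked = triggersReindexing({
    changeKind: "field-modified", isRelatedDocument: true, noTracking: false,
});
const relatedChangeNoTracking = triggersReindexing({
    changeKind: "field-modified", isRelatedDocument: true, noTracking: true,
});
const counterIncremented = triggersReindexing({
    changeKind: "counter-modified", isRelatedDocument: false, noTracking: false,
});

console.log({ relatedChangeTracked, relatedChangeNoTracking, counterIncremented });
```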

LoadDocument syntax

Syntax for LINQ-index:

T LoadDocument<T>(string relatedDocumentId);

T LoadDocument<T>(string relatedDocumentId, string relatedCollectionName);

T[] LoadDocument<T>(IEnumerable<string> relatedDocumentIds);

T[] LoadDocument<T>(IEnumerable<string> relatedDocumentIds, string relatedCollectionName);

Syntax for JavaScript-index:

object load(relatedDocumentId, relatedCollectionName);

Parameters:

  relatedDocumentId      string               ID of the related document to load
  relatedCollectionName  string               The related collection name
  relatedDocumentIds     IEnumerable<string>  A list of related document IDs to load