Index Attachments



Index attachments details

The index:


  • To index attachments' details, call attachmentsFor() within the index definition.

  • attachmentsFor() provides access to the name, size, hash, and content-type of each attachment a document has. These details can then be used when defining the index-fields.

  • To index attachments' content, see the examples below.

class Employees_ByAttachmentDetails extends AbstractJavaScriptIndexCreationTask {
    constructor () {
        super();

        const { attachmentsFor } = this.mapUtils();

        this.map("employees", employee => {
            // Call 'attachmentsFor' to get attachments details
            const attachments = attachmentsFor(employee);

            return {
                // Can index info from document properties:
                employeeName: employee.FirstName + " " + employee.LastName,

                // Index DETAILS of attachments:
                attachmentNames: attachments.map(x => x.Name),
                attachmentContentTypes: attachments.map(x => x.ContentType),
                attachmentSizes: attachments.map(x => x.Size)
            }
        });
    }
}
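
To make the shape of the resulting index-entry concrete, the following sketch runs the same projection in plain Node.js, outside RavenDB. It is only an illustration: the `attachmentsFor` function here is a hypothetical stand-in that returns hard-coded details, and the sample employee, the `__attachments` property, and the `mapEmployee` helper are all made up for this example. In a real index the server supplies `attachmentsFor`.

```javascript
// Hypothetical stand-in for the server-side 'attachmentsFor' helper.
// It returns hard-coded details; a real index reads them from the document.
function attachmentsFor(document) {
    return document.__attachments;
}

// Made-up sample document (values are illustrative, not taken from Northwind)
const sampleEmployee = {
    FirstName: "Jane",
    LastName: "Doe",
    __attachments: [
        { Name: "photo.jpg", ContentType: "image/jpeg", Size: 25000 },
        { Name: "notes.txt", ContentType: "text/plain", Size: 300 }
    ]
};

// Same projection as the map function of the index above
function mapEmployee(employee) {
    const attachments = attachmentsFor(employee);

    return {
        employeeName: employee.FirstName + " " + employee.LastName,
        attachmentNames: attachments.map(x => x.Name),
        attachmentContentTypes: attachments.map(x => x.ContentType),
        attachmentSizes: attachments.map(x => x.Size)
    };
}

const indexEntry = mapEmployee(sampleEmployee);
console.log(indexEntry);
```

The sketch produces a single entry for the document, and each attachment-related field holds an array with one value per attachment.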

Query the Index:


You can now query for Employee documents based on the details of their attachments.

const employees = await session
    // Query the index for matching employees
    .query({ indexName: "Employees/ByAttachmentDetails" })
    // Filter employee results by their attachments' details
    .whereEquals("attachmentNames", "photo.jpg")
    .whereGreaterThan("attachmentSizes", 20_000)
    .all();

// Results:
// ========
// Running this query on the Northwind sample data,
// results will include 'employees/4-A' and 'employees/5-A'.
// These 2 documents contain an attachment named 'photo.jpg' with a matching size.

Equivalent RQL:

from index "Employees/ByAttachmentDetails"
where attachmentNames == "photo.jpg" and attachmentSizes > 20000

Index details & content - by attachment name

Sample data:


  • Each Employee document in the Northwind sample data already includes a photo.jpg attachment.

  • For the examples that follow, we store a text attachment (notes.txt) on 3 documents
    in the 'Employees' collection.

const session = documentStore.openSession();

for (let i = 1; i <= 3; i++) {
    // Load an employee document:
    const employee = await session.load(`employees/${i}-A`);
    
    // Store the employee's notes as an attachment on the document:
    const stream = Buffer.from(employee.Notes[0]);
    session.advanced.attachments.store(`employees/${i}-A`, "notes.txt", stream, "text/plain");
}

await session.saveChanges();
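
The snippet above passes a Buffer created from a string. As a small side illustration in plain Node.js (no RavenDB involved), the sketch below shows that `Buffer.from` encodes a string as UTF-8 by default, which lines up with the "utf8" default of `getContentAsString()` listed in the Syntax section. The note text is made up for this example.

```javascript
// Made-up note text; '\u00e9' is the single character 'é'
const note = "R\u00e9sum\u00e9 on file at the Dallas office.";

// Buffer.from(string) encodes the string as UTF-8 by default
const noteBuffer = Buffer.from(note);

// Decoding with the same encoding restores the original text
const decoded = noteBuffer.toString("utf8");

// Each 'é' takes 2 bytes in UTF-8, so the byte length
// exceeds the character count by 2 for this text
console.log(note.length, noteBuffer.length, decoded === note);
```

The byte length of non-ASCII text is larger than its character count, so the stored size of a text attachment is not necessarily the length of the original string.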

The index:


  • To index the details & content for a specific attachment, call loadAttachment() within the index definition.

  • In addition to accessing the attachment details, loadAttachment() provides access to the attachment's content, which can be used when defining the index-fields.

class Employees_ByAttachment extends AbstractJavaScriptIndexCreationTask {
    constructor () {
        super();

        const { loadAttachment } = this.mapUtils();

        this.map("employees", employee => {
            // Call 'loadAttachment' to get attachment's details and content
            // pass the attachment name, e.g. "notes.txt"
            const attachment = loadAttachment(employee, "notes.txt");

            return {
                // Index DETAILS of attachment:
                attachmentName: attachment.Name,
                attachmentContentType: attachment.ContentType,
                attachmentSize: attachment.Size,

                // Index CONTENT of attachment:
                // Call 'getContentAsString' to access content
                attachmentContent: attachment.getContentAsString()
            }
        });

        // It can be useful to configure Full-Text search on the attachment content index-field
        this.index("attachmentContent", "Search");

        // Documents with an attachment named 'notes.txt' will be indexed,
        // allowing you to query them by either the attachment's details or its content.
    }
}

Query the Index:


You can now query for Employee documents based on the attachment's details and/or its content.

const employees = await session
    // Query the index for matching employees
    .query({indexName: "Employees/ByAttachment"})
    // Can make a full-text search
    // Looking for employees with an attachment content that contains 'Colorado' OR 'Dallas'
    .search("attachmentContent", "Colorado Dallas")
    .all();

// Results:
// ========
// Results will include 'employees/1-A' and 'employees/2-A'.
// Only these 2 documents have an attachment named 'notes.txt'
// that contains either 'Colorado' or 'Dallas'.

Equivalent RQL:

from index "Employees/ByAttachment"
where search(attachmentContent, "Colorado Dallas")
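
The search above matches content that contains either of the two terms. The sketch below approximates that "any term" rule in plain Node.js. It is a deliberate simplification: RavenDB's analyzers do more than this. The `matchesAnyTerm` helper, the document IDs, and the note texts are all hypothetical.

```javascript
// Simplified approximation of the matching rule: content matches when it
// contains ANY of the whitespace-separated search terms (case-insensitive).
// This is NOT how RavenDB implements full-text search.
function matchesAnyTerm(content, searchTerms) {
    // Split the content into lowercase alphanumeric tokens
    const tokens = new Set(
        content.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean)
    );

    return searchTerms
        .toLowerCase()
        .split(/\s+/)
        .filter(Boolean)
        .some(term => tokens.has(term));
}

// Made-up attachment contents, keyed by a hypothetical document ID
const notesByDocument = {
    "docs/1": "Studied psychology at Colorado State University.",
    "docs/2": "Completed a sales course in Dallas.",
    "docs/3": "Holds a degree in English literature."
};

const matchingIds = Object.keys(notesByDocument)
    .filter(id => matchesAnyTerm(notesByDocument[id], "Colorado Dallas"));

console.log(matchingIds);
```

Only the first two made-up documents match, since each contains one of the two terms and the third contains neither.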

Index details & content - all attachments

The index:


  • To index the details & content of ALL attachments, call loadAttachments() within the index definition.

  • The index below uses the Fanout index pattern: it generates an index-entry for each attachment of each document.

class Employees_ByAllAttachments extends AbstractJavaScriptIndexCreationTask {
    constructor () {
        super();

        const { loadAttachments } = this.mapUtils();

        this.map("employees", employee => {
            // Call 'loadAttachments' to get details and content for ALL attachments
            const allAttachments = loadAttachments(employee);

            // This will be a Fanout index -
            // the index will generate an index-entry for each attachment per document

            return allAttachments.map(attachment => ({

                // Index DETAILS of attachment:
                attachmentName: attachment.Name,
                attachmentContentType: attachment.ContentType,
                attachmentSize: attachment.Size,

                // Index CONTENT of attachment:
                // Call 'getContentAsString' to access content
                attachmentContent: attachment.getContentAsString()
            }));
        });

        // It can be useful to configure Full-Text search on the attachment content index-field
        this.index("attachmentContent", "Search");
    }
}
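
To illustrate what "Fanout" means for the number of index-entries, the sketch below runs the same projection in plain Node.js over two made-up documents. The `loadAttachments` function is a hypothetical stand-in for the server-side helper, and the `__attachments` property, the document IDs, and the contents are invented for this example.

```javascript
// Hypothetical stand-in for the server-side 'loadAttachments' helper
function loadAttachments(document) {
    return document.__attachments.map(a => ({
        Name: a.Name,
        ContentType: a.ContentType,
        Size: Buffer.byteLength(a.content),
        getContentAsString: () => a.content
    }));
}

// Made-up documents: the first has 2 attachments, the second has 1
const sampleDocuments = [
    {
        id: "docs/1",
        __attachments: [
            { Name: "photo.jpg", ContentType: "image/jpeg", content: "placeholder image data" },
            { Name: "notes.txt", ContentType: "text/plain", content: "Made-up note text" }
        ]
    },
    {
        id: "docs/2",
        __attachments: [
            { Name: "photo.jpg", ContentType: "image/jpeg", content: "placeholder image data" }
        ]
    }
];

// Same projection as the map function of the index above:
// one entry is returned per attachment
function mapDocument(document) {
    return loadAttachments(document).map(attachment => ({
        attachmentName: attachment.Name,
        attachmentContentType: attachment.ContentType,
        attachmentSize: attachment.Size,
        attachmentContent: attachment.getContentAsString()
    }));
}

const indexEntries = sampleDocuments.flatMap(mapDocument);
console.log(indexEntries.length);
console.log(indexEntries.map(e => e.attachmentName));
```

Two documents yield three entries here, one per attachment, which is why the number of entries in a Fanout index can exceed the number of indexed documents.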

Query the Index:


const employees = await session
    // Query the index for matching employees
    .query({indexName: "Employees/ByAllAttachments"})
    // Filter employee results by their attachments details and content
    .whereGreaterThan("attachmentSize", 20_000)
    .orElse()
    .search("attachmentContent", "Colorado Dallas")
    .all();

// Results:
// ========
// Results will include:
// 'employees/1-A' and 'employees/2-A' that match the content criteria
// 'employees/4-A' and 'employees/5-A' that match the size criteria

Equivalent RQL:

from index "Employees/ByAllAttachments"
where attachmentSize > 20000 or search(attachmentContent, "Colorado Dallas")

Leveraging indexed attachments

  • Access to the indexed attachment content opens the door to many different applications,
    including many that can be integrated directly into RavenDB.

  • This blog post demonstrates how image recognition can be applied to indexed attachments using the additional sources feature. The resulting index allows filtering and querying based on image content.

Syntax

attachmentsFor(document);

Parameter | Type   | Description
document  | object | The document whose attachments' details you want to load

// Returns a list of attachment details objects, each with the following shape:
{    
    name;         // string
    hash;         // string
    contentType;  // string
    size;         // number
}

loadAttachment(document, attachmentName);

Parameter      | Type   | Description
document       | object | The document whose attachment you want to load
attachmentName | string | The name of the attachment to load

// Returns the following attachment object:
{
    // Properties accessing DETAILS:
    // =============================
    name;         // string
    hash;         // string
    contentType;  // string
    size;         // number
    
    // Methods accessing CONTENT:
    // ==========================
    getContentAsStream();
    getContentAsString(encoding);
    getContentAsString(); // Default encoding is "utf8"
}

loadAttachments(document);

// Returns a list with one attachment object (as described above) per attachment.