Using NuGet Packages to Power Up RavenDB Indexes
published: May 13, 2021 |
updated: October 30, 2025
Uncover the power of NuGet packages within RavenDB indexes to offload work like image EXIF indexing or ML.NET analysis.
In RavenDB 5.1, you can now use third-party NuGet packages and load binary data (or “attachments”) within your indexes without any additional code deployment necessary. In this article, I’ll showcase how to use these two features together to index EXIF metadata on images and run ML.NET sentiment analysis offline on product reviews.
What Makes RavenDB Indexes So Powerful?
In traditional NoSQL or RDBMS databases, indexing is usually an afterthought until queries slow down. In RavenDB, you cannot query without an index, which makes your app fast by default. Traditional indexes are just lookup tables. In MongoDB, the aggregation pipeline can perform Map-Reduce operations, but pipelines are executed by the client code at a point in time. In contrast, RavenDB indexes can perform complex operations that combine the benefits of MongoDB’s aggregation pipeline with the power of RDBMS-style queries. Since indexes are built in parallel in the background, queries respond in milliseconds even under high load. This makes RavenDB one of the fastest NoSQL databases on the market. For a deeper dive, learn what role indexes play in RavenDB compared to MongoDB and PostgreSQL.
Indexing Image EXIF Data Using Attachments
Imagine a photo gallery web app. If you want to show the date a photo was taken, you need to read its EXIF metadata. EXIF data is stored in the image binary and follows a specific structure.



Photos/WithExifAttributes:

from photo in docs.Photos
let attachment = LoadAttachment(photo, AttachmentsFor(photo)[0].Name)
select new {
    photo.Title,
    attachment.Name,
    attachment.Size
}
This is madness! Can we just stop for a moment and acknowledge that it’s so cool we can load the photo while indexing and have full access to it?
I’ll issue a query and filter by the photo title:
from index 'Photos/WithExifAttributes'
where Title = 'Chateau in France'
The Studio interface can show “raw” index entries which reveal the photo filename and size:


- Get the attachment data as a Stream
- Read the image metadata into a “directory structure”
- Open the EXIF Sub IFD directory which holds some useful metadata
- Get the date the photo was taken
from photo in docs.Photos
let attachment = LoadAttachment(photo, AttachmentsFor(photo)[0].Name)
let directories = new DynamicArray(
    ImageMetadataReader.ReadMetadata(attachment.GetContentAsStream()))
let ifdDirectory = directories.OfType<ExifSubIfdDirectory>().FirstOrDefault()
The first step is to load the attachment data using attachment.GetContentAsStream() which I pass to the ImageMetadataReader.ReadMetadata static utility. This will return an enumeration of “directories” as MetadataExtractor calls them (it’s a tree structure).
The new DynamicArray expression creates an instance of a class RavenDB uses to wrap an enumerable so that you can safely perform dynamic LINQ operations on it. The OfType<ExifSubIfdDirectory>().FirstOrDefault() LINQ expression retrieves the first metadata directory matching the EXIF Sub IFD directory type.
Next, I get the date the photo was taken as DateTaken:

from photo in docs.Photos
let attachment = LoadAttachment(photo, AttachmentsFor(photo)[0].Name)
let directories = new DynamicArray(
    ImageMetadataReader.ReadMetadata(attachment.GetContentAsStream()))
let ifdDirectory = directories.OfType<ExifSubIfdDirectory>().FirstOrDefault()
let dateTime = DirectoryExtensions.GetDateTime(ifdDirectory, ExifDirectoryBase.TagDateTimeOriginal)
select new {
    DateTaken = dateTime,
    photo.Title,
    attachment.Name,
    attachment.Size
}
You’ll notice I am using LINQ’s FirstOrDefault() method which can return a null value for ifdDirectory. Indexes in RavenDB are resilient to errors and what happens behind the scenes is some magic that will add null propagation when accessing any values that could be null. This avoids any NullReferenceException issues that could cause indexing to fail. I wish I had a null propagation fairy in my regular .NET code!
I use the DirectoryExtensions.GetDateTime static method to retrieve the photo’s “original date” field. Images can contain a lot of different date-time fields and it is not consistent between file formats. For this photo, the TagDateTimeOriginal field holds the timestamp the photo was taken so I am using that.
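For comparison, here is a minimal sketch of the same lookup in regular .NET code, where there is no null propagation fairy and the missing-directory and missing-tag cases have to be handled by hand. It assumes the MetadataExtractor NuGet package is referenced; the ExifDateReader class and the command-line argument are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using System.Linq;
using MetadataExtractor;
using MetadataExtractor.Formats.Exif;

public static class ExifDateReader
{
    // Returns the "date taken" timestamp, or null when the image has no
    // EXIF Sub IFD directory or no DateTimeOriginal tag.
    public static DateTime? ReadDateTaken(Stream image)
    {
        // Same call the index makes on the attachment stream.
        var directories = ImageMetadataReader.ReadMetadata(image);

        // FirstOrDefault() returns null if the image carries no EXIF Sub IFD data.
        var subIfd = directories.OfType<ExifSubIfdDirectory>().FirstOrDefault();
        if (subIfd == null)
            return null;

        // TryGetDateTime avoids an exception when the tag is absent or unparsable.
        return subIfd.TryGetDateTime(ExifDirectoryBase.TagDateTimeOriginal, out var taken)
            ? taken
            : (DateTime?)null;
    }

    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("usage: pass the path to a local image file");
            return;
        }

        using var file = File.OpenRead(args[0]);
        var dateTaken = ReadDateTaken(file);
        Console.WriteLine(dateTaken?.ToString("o") ?? "no EXIF date found");
    }
}
```

Running something like this against a few sample photos first is a quick way to check which date tag your images actually populate before committing to it in an index.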
I can now query photos by date! RavenDB supports date range queries when filtering by a date field so I can use the filter expression:
where DateTaken between '2015-01-01' and '2015-03-01'

- Upload a photo to the database and associate it with a document
- Create an index to query the EXIF image data with the following steps:
  - Load the photo during indexing
  - Load a third-party NuGet package assembly
  - Read the image metadata using the API
  - Add the EXIF metadata as additional fields in the index
- Query documents by date taken
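The first and last steps above happen on the client side. A rough sketch of what they could look like with the RavenDB .NET client follows. The server URL, database name, Photo class, and file name are all assumptions made for illustration, and the index itself is expected to already exist on the server.

```csharp
using System;
using System.IO;
using System.Linq;
using Raven.Client.Documents;

// Hypothetical document class for the Photos collection.
public class Photo
{
    public string Id { get; set; }
    public string Title { get; set; }
}

public static class GalleryExample
{
    public static void Main()
    {
        // Assumed server URL and database name; replace with your own.
        using var store = new DocumentStore
        {
            Urls = new[] { "http://localhost:8080" },
            Database = "Gallery"
        }.Initialize();

        // 1. Store the document and attach the image binary to it.
        using (var session = store.OpenSession())
        using (var image = File.OpenRead("chateau.jpg")) // hypothetical local file
        {
            var photo = new Photo { Title = "Chateau in France" };
            session.Store(photo);
            session.Advanced.Attachments.Store(photo, "chateau.jpg", image, "image/jpeg");
            session.SaveChanges(); // the stream must stay open until this call
        }

        // 2. Query the index by the DateTaken field it computes.
        using (var session = store.OpenSession())
        {
            var photos = session.Advanced
                .RawQuery<Photo>(
                    "from index 'Photos/WithExifAttributes' " +
                    "where DateTaken between $start and $end")
                .AddParameter("start", new DateTime(2015, 1, 1))
                .AddParameter("end", new DateTime(2015, 3, 1))
                .WaitForNonStaleResults() // only sensible in demos and tests
                .ToList();

            foreach (var p in photos)
                Console.WriteLine(p.Title);
        }
    }
}
```

Passing the dates as query parameters rather than concatenating them into the RQL string keeps the query text constant and avoids quoting mistakes.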
User Review Sentiment Analysis with ML.NET
Let’s change gears now and examine how we could leverage some offline machine learning to index the sentiment of user reviews. “Sentiment analysis” is classifying text as positive, neutral, or negative. Sounds simple but it’s complicated stuff! In a sentiment analysis machine learning program, the rough steps would be to:
- Obtain a source dataset of values
- Train the AI model on the dataset
- Use the trained model to perform sentiment analysis against new data

In Reviews/BySentiment, I’ll select the Author and Body fields from the document:

SentimentAnalyzer namespace:

I call Sentiments.Predict and pass the review text, which returns a Prediction boolean where true is “positive” and false is “negative” sentiment. I’ll select that value out into the Sentiment field:
from review in docs.Reviews
select new {
    review.Body,
    review.Author,
    Sentiment = Sentiments.Predict(
        review.Body).Prediction ? "Positive" : "Negative"
}
What this enables me to do is query for “positive” or “negative” sounding reviews:
from index 'Reviews/BySentiment'
where Sentiment == 'Positive'
The query returns the expected positive-sounding review:
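To get a feel for the classifier before putting it in an index, you can call the same method from a small console program. This is only a sketch: it assumes the SentimentAnalyzer NuGet package is referenced, and the SentimentLabelExample class and the review texts are made up for illustration.

```csharp
using System;
using SentimentAnalyzer;

public static class SentimentLabelExample
{
    // Mirrors the index expression: true maps to "Positive", false to "Negative".
    public static string ToLabel(string reviewText) =>
        Sentiments.Predict(reviewText).Prediction ? "Positive" : "Negative";

    public static void Main()
    {
        // Hypothetical review texts, for illustration only.
        var reviews = new[]
        {
            "Absolutely loved it, would buy again.",
            "Broke after two days and support never replied."
        };

        foreach (var review in reviews)
            Console.WriteLine($"{ToLabel(review)}: {review}");
    }
}
```

Note that the boolean prediction has no “neutral” outcome, so the index can only ever contain the two labels.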

NuGet Packages from Custom Sources
I’ve shown how you can pull in third-party NuGet packages, but you aren’t limited to the official Microsoft package source. You can use any package source when adding a NuGet package, such as your company’s MyGet feed.
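If you deploy indexes from code rather than the Studio, the package source can be supplied there as well. The sketch below assumes the .NET client's AdditionalAssembly.FromNuGet accepts the package source URL as its third argument, so check that against the client version you use. The package name, version, feed URL, index name, server URL, and database name are all hypothetical.

```csharp
using System.Collections.Generic;
using Raven.Client.Documents;
using Raven.Client.Documents.Indexes;
using Raven.Client.Documents.Operations.Indexes;

public static class CustomSourceIndexExample
{
    public static void Main()
    {
        // Assumed server URL and database name; replace with your own.
        using var store = new DocumentStore
        {
            Urls = new[] { "http://localhost:8080" },
            Database = "Gallery"
        }.Initialize();

        var definition = new IndexDefinition
        {
            Name = "Photos/WithPrivatePackage", // hypothetical index name
            // A trivial map; a real index would call into the package here.
            Maps = { "from photo in docs.Photos select new { photo.Title }" },
            AdditionalAssemblies = new HashSet<AdditionalAssembly>
            {
                AdditionalAssembly.FromNuGet(
                    "Acme.Imaging",                                   // hypothetical package
                    "1.0.0",                                          // hypothetical version
                    "https://www.myget.org/F/acme/api/v3/index.json") // hypothetical feed URL
            }
        };

        // Sends the definition to the server, which is then responsible
        // for fetching the package from the given source.
        store.Maintenance.Send(new PutIndexesOperation(definition));
    }
}
```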
How Does Running Additional Code Affect Performance?
You may be curious what performance implications this has for the indexing process (remember, it doesn’t impact query performance). This is additional code that is running, so it will certainly incur overhead during indexing.
For this article, I used the Free tier on RavenDB Cloud, which is 2 vCPUs and 512MB of RAM. This is TINY when you think about what a production database server might require, but I want to show you how fast RavenDB can be given these hardware constraints. I’ll use this sample IMDB dataset that has 50,000 movie reviews and create the same kind of sentiment analysis index to compare the indexing performance on a larger sample size.
The first version will not use any NuGet packages and will be our baseline. The Map operation just returns the text from each review:
from review in docs
select new {
    review.Text
}
In RavenDB, you can view indexing performance for every index in a lot of detail. Indexing happens in batches. Think of putting a bunch of documents into a bucket and sending each bucket of documents down the “indexing assembly line.” There’s the time to process each bucket and then the total time to process all the buckets.
In this case, we have 49 batches of 1024 documents each and the total time to rebuild the index is about 2.75s (about 11-12k docs/s):

from review in docs
select new {
    review.Text,
    Sentiment = Sentiments.Predict(
        review.Text).Prediction ? "Positive" : "Negative"
}
This time, it takes a total of 16 seconds to build the index, or about 5 times as long. Each batch takes between 500ms and 1s depending on whether it needed to commit to disk:


Conclusion
Play with these new features yourself within the Live Test Playground or follow the step-by-step tutorials to get started with RavenDB. NuGet package support and attachment indexing enable new use cases that other databases aren’t able to match. Learn about other powerful features released in RavenDB 5.1 like Replication and Document Compression.