Bulk Insert: How to Work With Bulk Insert Operation
- `BulkInsert` is useful when inserting a large quantity of data from the client to the server.
- It is an optimized, time-saving approach with a few limitations, such as the possibility that interruptions will occur during the operation.
In this page:
Syntax
BulkInsertOperation BulkInsert(string database = null, CancellationToken token = default);
Parameters | Type | Description |
---|---|---|
**database** | `string` | The name of the database to perform the bulk operation on. If `null`, the DocumentStore database will be used. |
**token** | `CancellationToken` | Cancellation token used to halt the worker operation. |

Return Value | Description |
---|---|
`BulkInsertOperation` | Instance of `BulkInsertOperation` used for interaction. |
BulkInsertOperation BulkInsert(string database, BulkInsertOptions options, CancellationToken token = default);
Parameters | Type | Description |
---|---|---|
**database** | `string` | The name of the database to perform the bulk operation on. If `null`, the DocumentStore database will be used. |
**options** | `BulkInsertOptions` | Options to configure BulkInsert. |
**token** | `CancellationToken` | Cancellation token used to halt the worker operation. |

Return Value | Description |
---|---|
`BulkInsertOperation` | Instance of `BulkInsertOperation` used for interaction. |
BulkInsertOperation BulkInsert(BulkInsertOptions options, CancellationToken token = default);
Parameters | Type | Description |
---|---|---|
**options** | `BulkInsertOptions` | Options to configure BulkInsert. |
**token** | `CancellationToken` | Cancellation token used to halt the worker operation. |

Return Value | Description |
---|---|
`BulkInsertOperation` | Instance of `BulkInsertOperation` used for interaction. |
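As an illustration, a minimal sketch of calling the overload that takes a database name, options, and a cancellation token (the database name `"Northwind"` is a hypothetical example, and a configured `DocumentStore` named `store` is assumed):

```csharp
using var cts = new CancellationTokenSource();

// Pass a target database, options, and a cancellation token explicitly
using (BulkInsertOperation bulkInsert = store.BulkInsert(
    database: "Northwind",   // hypothetical database name
    options: new BulkInsertOptions { SkipOverwriteIfUnchanged = true },
    token: cts.Token))
{
    bulkInsert.Store(new Employee { FirstName = "Jane" });
}
```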
BulkInsertOperation
The following methods can be used when creating a bulk insert.
Methods
Signature | Description |
---|---|
`void Abort()` | Abort the operation. |
`void Store(object entity, IMetadataDictionary metadata = null)` | Store the entity. The identifier is generated automatically on the client side. Optionally, metadata can be provided for the stored entity. |
`void Store(object entity, string id, IMetadataDictionary metadata = null)` | Store the entity, with the `id` parameter explicitly declaring the entity identifier. Optionally, metadata can be provided for the stored entity. |
`Task StoreAsync(object entity, IMetadataDictionary metadata = null)` | Store the entity asynchronously. The identifier is generated automatically on the client side. Optionally, metadata can be provided for the stored entity. |
`Task StoreAsync(object entity, string id, IMetadataDictionary metadata = null)` | Store the entity asynchronously, with the `id` parameter explicitly declaring the entity identifier. Optionally, metadata can be provided for the stored entity. |
`void Dispose()` | Dispose of the operation. |
`ValueTask DisposeAsync()` | Dispose of the operation asynchronously. |
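For illustration, a minimal sketch of storing an entity with an explicit identifier and metadata (the document id and the metadata key are hypothetical; `MetadataAsDictionary` is assumed to be the client's `IMetadataDictionary` implementation):

```csharp
using (BulkInsertOperation bulkInsert = store.BulkInsert())
{
    var metadata = new MetadataAsDictionary
    {
        ["Source"] = "legacy-import" // hypothetical metadata entry
    };

    // Explicitly declare the document identifier instead of
    // letting the client generate one
    bulkInsert.Store(
        new Employee { FirstName = "Jane", LastName = "Doe" },
        "employees/1-A",
        metadata);
}
```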
Limitations
- BulkInsert is designed to efficiently push large volumes of data. Data is therefore streamed and processed by the server in batches. Each batch is fully transactional, but there are no transaction guarantees between batches, and the operation as a whole is non-transactional. If the bulk insert operation is interrupted mid-way, some of your data might be persisted on the server while some of it might not.
  - Make sure that your logic accounts for the possibility of an interruption that leaves some of your data not yet persisted on the server.
  - If the operation was interrupted and you choose to re-insert the whole dataset in a new operation, you can set `SkipOverwriteIfUnchanged` to `true` so that the operation overwrites existing documents only if they have changed since the last insertion.
  - If you need full transactionality, using a session may be a better option. Note that if a session is used, all of the data is processed in a single transaction, so the server must have sufficient resources to handle the entire data set included in that transaction.
- Bulk insert is not thread-safe; a single bulk insert should not be accessed concurrently.
  - Using multiple bulk inserts concurrently on the same client is supported.
  - Usage in an async context is also supported.
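Since a single bulk insert must not be shared between threads, while multiple bulk inserts may run concurrently on the same client, parallel ingestion can be sketched by giving each task its own operation (the four-way partitioning and document counts here are hypothetical):

```csharp
var tasks = Enumerable.Range(0, 4).Select(part => Task.Run(async () =>
{
    BulkInsertOperation bulkInsert = null;
    try
    {
        // Each task owns its own bulk insert; a single operation
        // is never shared between threads
        bulkInsert = store.BulkInsert();
        for (int i = part * 1000; i < (part + 1) * 1000; i++)
        {
            await bulkInsert.StoreAsync(new Employee
            {
                FirstName = "FirstName #" + i
            });
        }
    }
    finally
    {
        if (bulkInsert != null)
            await bulkInsert.DisposeAsync().ConfigureAwait(false);
    }
}));

await Task.WhenAll(tasks);
```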
Example
Create bulk insert
Here we create a bulk insert operation and insert a million documents of type `Employee`:
using (BulkInsertOperation bulkInsert = store.BulkInsert())
{
for (int i = 0; i < 1000 * 1000; i++)
{
bulkInsert.Store(new Employee
{
FirstName = "FirstName #" + i,
LastName = "LastName #" + i
});
}
}
The same operation, performed asynchronously:
BulkInsertOperation bulkInsert = null;
try
{
bulkInsert = store.BulkInsert();
for (int i = 0; i < 1000 * 1000; i++)
{
await bulkInsert.StoreAsync(new Employee
{
FirstName = "FirstName #" + i,
LastName = "LastName #" + i
});
}
}
finally
{
if (bulkInsert != null)
{
await bulkInsert.DisposeAsync().ConfigureAwait(false);
}
}
BulkInsertOptions
The following options can be configured for BulkInsert.
`CompressionLevel`:

Value | Description |
---|---|
`Optimal` | Compress as effectively as possible, even if the operation takes longer to complete. |
`Fastest` (Default) | Compress as quickly as possible, even if the resulting output is not optimally compressed. |
`NoCompression` | Do not compress. |
Default compression level

For RavenDB versions up to `6.2`, bulk-insert compression is Disabled (`NoCompression`) by default.
For RavenDB versions from `7.0` on, bulk-insert compression is Enabled (set to `Fastest`) by default.
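To get consistent behavior regardless of the client version's default, the compression level can be set explicitly. A sketch, assuming `BulkInsertOptions.CompressionLevel` accepts a `System.IO.Compression.CompressionLevel` value:

```csharp
using System.IO.Compression;

using (var bulkInsert = store.BulkInsert(new BulkInsertOptions
{
    // Explicitly disable compression rather than relying on the
    // version-dependent default
    CompressionLevel = CompressionLevel.NoCompression
}))
{
    // ...
}
```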
SkipOverwriteIfUnchanged
:
Use this option to avoid overwriting documents when an inserted document is identical to the existing one.
Enabling this flag can spare the server many operations that are triggered by document change,
such as re-indexing and updates to subscription and ETL tasks.
There is a slight potential cost in the additional comparison that has to be made between
the existing documents and the ones being inserted.
using (var bulk = store.BulkInsert(new BulkInsertOptions
{
    SkipOverwriteIfUnchanged = true
}))
{
    // Insert documents here; documents identical to existing ones
    // will not be overwritten
}