
Bulk Insert: How to Work With Bulk Insert Operation

Quick example

Open a bulk insert from the document store and store entities through it.
Each Store call is buffered and streamed to the server in batches; the operation finalizes when the bulk insert is disposed.

using (BulkInsertOperation bulkInsert = store.BulkInsert())
{
    foreach (var employee in employees)
    {
        bulkInsert.Store(employee);
    }
}

For the full set of BulkInsert overloads and their parameters, see the Syntax section at the end of this page.

BulkInsertOperation

The following methods are available on a bulk insert operation.

Methods

| Signature | Description |
| --- | --- |
| void Abort() | Abort the operation. |
| string Store(object entity, IMetadataDictionary metadata = null) | Store the entity. The identifier is generated automatically on the client side. Metadata can optionally be provided for the stored entity. |
| void Store(object entity, string id, IMetadataDictionary metadata = null) | Store the entity, using the id parameter to explicitly declare the entity identifier. Metadata can optionally be provided for the stored entity. |
| Task<string> StoreAsync(object entity, IMetadataDictionary metadata = null) | Store the entity asynchronously. The identifier is generated automatically on the client side. Metadata can optionally be provided for the stored entity. |
| Task StoreAsync(object entity, string id, IMetadataDictionary metadata = null) | Store the entity asynchronously, using the id parameter to explicitly declare the entity identifier. Metadata can optionally be provided for the stored entity. |
| void Dispose() | Dispose of the bulk insert, which finalizes the operation. |
| ValueTask DisposeAsync() | Dispose of the bulk insert asynchronously. |
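
The async members listed above can be combined as in the following minimal sketch. It assumes `store` is an already initialized document store and that the client version in use exposes DisposeAsync (so that `await using` compiles). The Employee class and the AsyncBulkInsertExample wrapper are illustrative names, not part of the client API.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Raven.Client.Documents;
using Raven.Client.Documents.BulkInsert;

// Illustrative entity type (an assumption, not defined on this page).
public class Employee
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class AsyncBulkInsertExample
{
    // `store` is assumed to be an initialized document store.
    public static async Task InsertAsync(IDocumentStore store, IEnumerable<Employee> employees)
    {
        // `await using` calls DisposeAsync when the block exits,
        // which is the point where the bulk insert finalizes.
        await using (BulkInsertOperation bulkInsert = store.BulkInsert())
        {
            foreach (var employee in employees)
            {
                // The identifier is generated on the client side and returned.
                string id = await bulkInsert.StoreAsync(employee);
            }
        }
    }
}
```

Each StoreAsync call is awaited before the next one starts, so the single bulk insert instance is never accessed concurrently.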

Limitations

  • BulkInsert is designed to efficiently push large volumes of data.
    Data is therefore streamed and processed by the server in batches.
    Each batch is fully transactional, but there are no transaction guarantees between the batches, and the operation as a whole is non-transactional.
    If the bulk insert operation is interrupted mid-way, some of your data might be persisted on the server while some of it might not.
    • Make sure that your logic accounts for the possibility of an interruption that leaves some of your data not yet persisted on the server.
    • If the operation was interrupted and you choose to re-insert the whole dataset in a new operation, you can set SkipOverwriteIfUnchanged to true so the operation overwrites existing documents only if they changed since the last insertion.
    • If you need full transactionality, using a session may be a better option.
      Note that if a session is used, all of the data is processed in a single transaction, so the server must have sufficient resources to handle the entire data set included in the transaction.
  • Bulk insert is not thread-safe.
    A single bulk insert should not be accessed concurrently.
    • Using multiple bulk inserts concurrently on the same client is supported.
    • Usage in an async context is also supported.
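
One way to handle an interrupted operation, as described in the first limitation above, is to re-send the whole dataset with SkipOverwriteIfUnchanged enabled. The following is only a sketch under stated assumptions: the InsertWithRetry helper, the maxAttempts parameter, and the Employee class are hypothetical, every entity is assumed to carry a stable Id so that a retry targets the same documents, and catching the general Exception type is a simplification that real code would likely narrow.

```csharp
using System;
using System.Collections.Generic;
using Raven.Client.Documents;
using Raven.Client.Documents.BulkInsert;

// Illustrative entity type with a caller-assigned, stable Id (an assumption).
public class Employee
{
    public string Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class BulkInsertRetryExample
{
    // Hypothetical helper: re-sends the full dataset after an interruption.
    public static void InsertWithRetry(
        IDocumentStore store,
        IReadOnlyList<Employee> employees,
        int maxAttempts = 3)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                using (BulkInsertOperation bulkInsert = store.BulkInsert(new BulkInsertOptions
                {
                    // Documents persisted by an earlier, interrupted attempt
                    // are not overwritten if they have not changed.
                    SkipOverwriteIfUnchanged = true
                }))
                {
                    foreach (var employee in employees)
                    {
                        // Explicit IDs keep retries pointed at the same documents.
                        bulkInsert.Store(employee, employee.Id);
                    }
                }
                return; // Disposed without an exception: the operation finalized.
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Interrupted: some batches may already be persisted.
                // The loop opens a new bulk insert and re-sends everything.
            }
        }
    }
}
```

On the final attempt the exception filter no longer matches, so the failure propagates to the caller instead of being swallowed.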

Example

Create bulk insert

Here we create a bulk insert operation and insert a million documents of type Employee:

using (BulkInsertOperation bulkInsert = store.BulkInsert())
{
    for (int i = 0; i < 1000 * 1000; i++)
    {
        bulkInsert.Store(new Employee
        {
            FirstName = "FirstName #" + i,
            LastName = "LastName #" + i
        });
    }
}

BulkInsertOptions

The following options can be configured for BulkInsert.

CompressionLevel

| Value | Type | Description |
| --- | --- | --- |
| Optimal | CompressionLevel | Compression level to be used when compressing static files. |
| Fastest (Default) | CompressionLevel | Compression level to be used when compressing HTTP responses with GZip or Deflate. |
| NoCompression | CompressionLevel | Does not compress. |

Default compression level

For RavenDB versions up to 6.2, bulk-insert compression is disabled (NoCompression) by default.
For RavenDB versions from 7.0 on, bulk-insert compression is enabled (Fastest) by default.
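
Because the default differs between versions, setting the level explicitly keeps behavior the same across upgrades. The sketch below assumes that the CompressionLevel option is the System.IO.Compression.CompressionLevel enum, whose members match the values in the table above. The CompressionExample wrapper is an illustrative name.

```csharp
using System.IO.Compression;
using Raven.Client.Documents;
using Raven.Client.Documents.BulkInsert;

public static class CompressionExample
{
    // Opens a bulk insert with compression explicitly turned off,
    // regardless of the client version's default.
    // The caller is responsible for disposing the returned operation.
    public static BulkInsertOperation OpenUncompressed(IDocumentStore store)
    {
        return store.BulkInsert(new BulkInsertOptions
        {
            // Assumption: the option takes System.IO.Compression.CompressionLevel.
            CompressionLevel = CompressionLevel.NoCompression
        });
    }
}
```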


SkipOverwriteIfUnchanged

Use this option to avoid overwriting a document when the inserted document is unchanged from the existing one.

Enabling this flag can spare the server many operations that are triggered by document changes, such as re-indexing and updates to subscriptions or ETL tasks.
There is a slight potential cost in the additional comparison that has to be made between the existing documents and the ones that are being inserted.

using (var bulk = store.BulkInsert(new BulkInsertOptions
{
    SkipOverwriteIfUnchanged = true
}))
{
    // ...
}

Track progress with OnProgress

A long-running bulk insert can take some time to complete. To track its progress - for example, to show a progress bar or to log how many documents have been inserted so far - subscribe to the OnProgress event.

The bulk insert delivers a progress snapshot to your handler each time the server reports new progress, with counters such as DocumentsProcessed, BatchCount, and the ID of the document processed most recently (LastProcessedId).
See Classes for the full property list.

  • Updates arrive asynchronously over the Changes stream that the server opens for the operation.
    Your handler is invoked by the server's progress reports, not by your Store calls - a single progress event typically covers many Store calls at once.
  • The subscription opens after the first Store call. Attach the handler before storing anything to avoid missing early updates.
  • Each snapshot is cumulative from the start of the bulk insert. DocumentsProcessed, BatchCount, Total, and the other counters grow over time and do not reset between events.
    To find the number of documents added between two consecutive events, subtract the older snapshot's DocumentsProcessed from the newer one's. For example, if event A reports DocumentsProcessed = 1000 and event B reports DocumentsProcessed = 1500, then 500 documents were added in the interval.
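
The subtraction described in the last point can be wrapped in a small helper. This is a minimal sketch: ProgressDeltaTracker is a hypothetical class, not part of the RavenDB client, and it assumes the cumulative counter fits in a long.

```csharp
// Hypothetical helper (not part of the RavenDB client): remembers the previous
// cumulative count and returns the difference on each update.
public sealed class ProgressDeltaTracker
{
    private long _previous;

    // Takes the cumulative DocumentsProcessed value from the newest snapshot
    // and returns how many documents were added since the previous call.
    public long Update(long documentsProcessed)
    {
        long delta = documentsProcessed - _previous;
        _previous = documentsProcessed;
        return delta;
    }
}
```

Inside an OnProgress handler, the tracker would be called with the DocumentsProcessed value of each snapshot. With the figures from the example above, the first call with 1000 returns 1000 and the second call with 1500 returns 500. The helper is not synchronized, so it assumes that the handler is not invoked concurrently.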

Example: Print progress to the console

using (BulkInsertOperation bulkInsert = store.BulkInsert())
{
    // Attach the handler before the first Store call so early updates are not missed.
    bulkInsert.OnProgress += (sender, args) =>
    {
        // Each event carries a cumulative snapshot since the bulk insert started.
        Console.WriteLine(
            $"Processed {args.Progress.DocumentsProcessed} documents " +
            $"(last: {args.Progress.LastProcessedId})");
    };

    // Each employee inserted here advances the counters reported to the handler above.
    foreach (var employee in employees)
    {
        bulkInsert.Store(employee);
    }
}

Syntax

Method signatures

Opens a bulk insert against the given database, or the document store's default database when database is null.

public BulkInsertOperation BulkInsert(
    string database = null,
    CancellationToken token = default);

Usage:

using (var bulkInsert = store.BulkInsert())
{
    // ...
}

| Parameter | Type | Description |
| --- | --- | --- |
| database | string | The name of the database to perform the bulk operation on. If null, the DocumentStore.Database is used. |
| token | CancellationToken | Cancellation token used to halt the worker operation. |

| Return value | Description |
| --- | --- |
| BulkInsertOperation | Instance of BulkInsertOperation used for interaction. |

Event signature

Raised while a bulk insert is running, each time the server reports new progress. Each invocation carries a cumulative progress snapshot since the start of the operation.

public event EventHandler<BulkInsertOnProgressEventArgs> OnProgress;

Usage:

bulkInsert.OnProgress += (sender, args) =>
{
    // Read args.Progress for the current snapshot.
};

Classes

The event arguments delivered to an OnProgress handler.

public class BulkInsertOnProgressEventArgs : EventArgs
{
    public BulkInsertProgress Progress { get; }
}

| Property | Type | Description |
| --- | --- | --- |
| Progress | BulkInsertProgress | The cumulative progress snapshot for this update. |
