Documents Compression



Overview

  • As a document database, RavenDB is schema-less. This has many advantages,
    but it requires the data structure to be managed on a per-document basis.
    In extreme cases, most of the data you store is the documents' structure.

  • The Zstd compression algorithm is used to learn your data model, identify common patterns,
    and create dictionaries that represent the redundant structural data across documents in a collection.

  • The algorithm is trained by each compression operation and continuously improves its compression ratio
    to maintain the most efficient compression model.
    In many datasets, this can reduce the storage space by more than 50%.

  • Compression and decompression are fully transparent to the user.
    Reading and querying large compressed datasets is usually as fast as reading and querying
    their uncompressed versions, because the compressed data is loaded much faster.

  • Compression is applied only to the content of documents and revisions.
    It is not applied to attachments, counters, or time series data.

  • Detailed information about the database's physical storage is visible in the Storage Report view.
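
The dictionary idea from the overview can be illustrated with a small, self-contained sketch. It is not RavenDB's implementation and it does not use Zstd: it swaps in DEFLATE with a preset dictionary from Node.js's built-in `zlib` module, simply because that is available everywhere. The sample documents and the "dictionary" (a single sample document standing in for a trained dictionary) are assumptions made for illustration.

```javascript
// Illustration only: DEFLATE with a preset dictionary (Node.js built-in zlib),
// NOT Zstd and NOT the way RavenDB builds or stores its dictionaries.
const zlib = require("node:zlib");

// Hypothetical stand-in for a trained dictionary: a sample document
// that carries the structure shared by documents in the collection.
const dictionary = Buffer.from(JSON.stringify({
    company: "companies/1-A",
    employee: "employees/1-A",
    orderedAt: "2024-01-01T00:00:00.000Z",
    shipTo: { city: "Reims", country: "France" },
    freight: 10.5
}));

// A new document with the same field names but different values.
const order = Buffer.from(JSON.stringify({
    company: "companies/85-A",
    employee: "employees/5-A",
    orderedAt: "2024-03-17T09:30:00.000Z",
    shipTo: { city: "Lyon", country: "France" },
    freight: 32.38
}));

// Compress the same document without and with the shared dictionary.
const plain = zlib.deflateSync(order);
const withDictionary = zlib.deflateSync(order, { dictionary });

// Decompression needs the same dictionary to restore the original bytes.
const restored = zlib.inflateSync(withDictionary, { dictionary });

console.log({
    originalBytes: order.length,
    plainBytes: plain.length,
    withDictionaryBytes: withDictionary.length,
    roundTrips: restored.equals(order)
});
```

A single small document has little internal repetition, so compressing it alone gains little. With a dictionary, the field names are encoded as references to the shared data, which is why the dictionary-based output is smaller here.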

Compression -vs- Compaction

  • The following table summarizes the differences between Compression and Compaction:

    Compression
      Action:                        Reduce storage space using the Zstd compression algorithm
      Items that can be compressed:  - Documents in collections that are configured for compression
                                     - Revisions for all collections
      Triggered by:                  The server
      Triggered when:                The compression feature is configured,
                                     and any of the following occurs for the configured collections:
                                     - New documents are stored
                                     - Existing documents are modified and saved
                                     - A compact operation is triggered
                                       (existing documents are then compressed)

    Compaction
      Action:                        Remove empty gaps on disk that still occupy space after deletes
      Items that can be compacted:   Documents and/or indexes on the specified database
      Triggered by:                  Client API code
      Triggered when:                The compact database operation is called explicitly
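
The "items that can be compressed" rules in the comparison above can be restated as a small predicate. This is only a sketch: `wouldBeCompressed` and the `item` shape are hypothetical and not part of the RavenDB client API, the real decision is made by the server, and the sketch assumes collection names are compared exactly as written.

```javascript
// Hypothetical helper, not part of the RavenDB client API.
// Predicts whether an item would be compressed under a given
// documentsCompression configuration, following the rules listed above.
function wouldBeCompressed(documentsCompression, item) {
    const {
        collections = [],
        compressRevisions = false,
        compressAllCollections = false
    } = documentsCompression ?? {};

    // Revisions are governed by one flag that covers all collections.
    if (item.isRevision) {
        return compressRevisions;
    }

    // Documents are compressed when all collections are, or when theirs is listed.
    // Assumption: names are matched exactly as written.
    return compressAllCollections || collections.includes(item.collection);
}

const config = { collections: ["Orders", "Employees"], compressRevisions: false };

console.log(wouldBeCompressed(config, { collection: "Orders" }));
console.log(wouldBeCompressed(config, { collection: "Products" }));
console.log(wouldBeCompressed(config, { collection: "Orders", isRevision: true }));
```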

Set compression for all collections

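// Compression is configured in the database record.
// Assumes `store` is an initialized DocumentStore; the operation classes
// used below are exported by the "ravendb" package.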

// Retrieve the database record
const dbrecord = await store.maintenance.server.send(new GetDatabaseRecordOperation(store.database));

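// Set compression on ALL collections
// (the spread keeps any existing settings and also works if none are set yet)
dbrecord.documentsCompression = {
    ...dbrecord.documentsCompression,
    compressAllCollections: true
};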

// Update the database record
await store.maintenance.server.send(new UpdateDatabaseOperation(dbrecord, dbrecord.etag));

Set compression for selected collections

// Retrieve the database record
const dbrecord = await store.maintenance.server.send(new GetDatabaseRecordOperation(store.database));

// Turn on compression for specific collections
// Turn off compression for all revisions, on all collections
dbrecord.documentsCompression = {
    collections: ["Orders", "Employees"], 
    compressRevisions: false
};

// Update the database record
await store.maintenance.server.send(new UpdateDatabaseOperation(dbrecord, dbrecord.etag));

Syntax

  • Documents compression is configured using the following object in the database record:

// The documentsCompression object
{
    collections;            // string[], List of collections to compress 
    compressRevisions;      // boolean, set to true to compress revisions 
    compressAllCollections; // boolean, set to true to compress all collections
}
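
Because the configuration is a plain object, it can be convenient to validate it before writing it to the database record. The following is a minimal sketch, not part of the RavenDB client API: `buildDocumentsCompression` is a hypothetical helper, and its `false` and empty-list defaults are choices made for this example, not documented server defaults.

```javascript
// Hypothetical helper, not part of the RavenDB client API.
// Builds an object with the documentsCompression shape shown above
// and rejects values of the wrong type.
function buildDocumentsCompression({
    collections = [],
    compressRevisions = false,
    compressAllCollections = false
} = {}) {
    const namesAreValid = Array.isArray(collections) &&
        collections.every(name => typeof name === "string" && name.length > 0);
    if (!namesAreValid) {
        throw new TypeError("collections must be an array of non-empty strings");
    }
    if (typeof compressRevisions !== "boolean" || typeof compressAllCollections !== "boolean") {
        throw new TypeError("compressRevisions and compressAllCollections must be booleans");
    }

    // Drop duplicate collection names while keeping their original order.
    return {
        collections: [...new Set(collections)],
        compressRevisions,
        compressAllCollections
    };
}

const compression = buildDocumentsCompression({
    collections: ["Orders", "Employees", "Orders"]
});
console.log(compression);
```

The result can then be assigned to `dbrecord.documentsCompression` before sending the update operation, as in the examples above.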