Time Series Design
- Time series are sequences of numerical values, associated with timestamps and sorted chronologically.
- RavenDB time series are stored and managed as document extensions, achieving much greater speed and efficiency compared to storing them as JSON-formatted data within a document.
- In this page:
Time series architecture
Time series as a document extension
- Each time series belongs to, or extends, one particular document.
- The document and the time series reference each other:
  - The document's metadata keeps a reference to the time series name.
    The time series data itself is stored in a separate location.
  - The segments containing the time series data keep a reference to the document ID.
The HasTimeSeries flag
- When a document has one or more time series, RavenDB automatically adds the HasTimeSeries flag to the document's metadata under @flags.
- When all time series are deleted from the document, RavenDB automatically removes the flag.
{
"Name": "Paul",
"@metadata": {
"@collection": "Users",
"@timeseries": [
"my time series name"
],
"@flags": "HasTimeSeries"
}
}
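As a minimal sketch of how a client might read this metadata, the helpers below inspect a plain document object shaped like the example above. The function names are hypothetical, and the code assumes that `@flags` is a comma-separated string that may hold several flags.

```javascript
// Hypothetical helper: checks a plain document object for the HasTimeSeries flag.
// Assumption: "@flags" is a comma-separated string, so it may hold several flags.
function hasTimeSeries(doc) {
  const metadata = doc["@metadata"] || {};
  const flags = String(metadata["@flags"] || "")
    .split(",")
    .map((flag) => flag.trim());
  return flags.includes("HasTimeSeries");
}

// Hypothetical helper: lists the time series names recorded in the metadata.
function timeSeriesNames(doc) {
  const metadata = doc["@metadata"] || {};
  return metadata["@timeseries"] || [];
}

const user = {
  Name: "Paul",
  "@metadata": {
    "@collection": "Users",
    "@timeseries": ["my time series name"],
    "@flags": "HasTimeSeries",
  },
};

console.log(hasTimeSeries(user), timeSeriesNames(user));
```

Both helpers fall back to empty values when the metadata is missing, so a document without time series simply reports no flag and no names.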
The time series entry
Each time series entry is composed of a TimeSeriesEntry object which contains:
Parameter | Type | Description |
---|---|---|
timestamp | Date | |
tag | string | |
values | number[] | |
value | number | |
Doubles with higher precision, i.e. more digits after the decimal point, are much less compressible.
In other words, 1.672 takes up more space than 1672.
Segmentation
- At the server storage level, time series data is divided into segments.
- Each segment contains a number of consecutive entries from the same time series and aggregated values that RavenDB automatically updates in the segment's header. See section Segment properties for more details.
- Segments size and limitations:
  - Segments have a maximum size of 2 KB.
    In practice, this limit means that a segment can contain at most 32k entries.
    Time series larger than that are always stored in multiple segments.
  - In practice, segments usually contain far fewer than 32k entries, depending on the size of the entries (after compression).
    For example, in the Northwind sample dataset, the Companies documents all have a time series called StockPrice.
    These time series are stored in segments that have ~10-20 entries each.
  - The maximum time gap between the first and last entries in a segment is ~24.86 days
    (equivalent to 2147483647 milliseconds, the maximum value of a 32-bit signed integer in C#).
    Adding an entry that is further than that from the first segment entry adds it as the first entry of a new segment.
    As a consequence, segments of sparsely-updated time series can be significantly smaller than 2 KB.
  - The maximum number of unique tags allowed per segment is 127.
    Exceeding that number causes the creation of a new segment.
- Aggregated values:
  RavenDB automatically stores and updates aggregated values in each segment's header.
  These values summarize commonly-used values regarding the segment, including:
  - The segment's Max value
  - The segment's Min value
  - The segment's values Sum
  - The segment's Count of entries
  - The segment's First timestamp
  - The segment's Last timestamp

  When segment entries store multiple values, e.g. each entry contains a Latitude value and a Longitude value, the six aggregated values are stored for each value separately.
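The segment rules above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: all names are hypothetical, entries are assumed to arrive in chronological order, the 2 KB size limit is not modeled, and Count, First, and Last are kept once per segment for brevity rather than per value.

```javascript
// Illustrative in-memory model only; names and layout are hypothetical.
const MAX_SPAN_MS = 2147483647; // max gap between first and last entry of a segment
const MAX_TAGS = 127;           // max number of unique tags in a segment

function createSegment() {
  return { entries: [], tags: new Set(), first: null, last: null, count: 0, perValue: [] };
}

// Returns true when the entry respects the time-span and tag limits.
function fits(segment, entry) {
  if (segment.count === 0) return true;
  const span = entry.timestamp.getTime() - segment.first.getTime();
  if (span > MAX_SPAN_MS) return false;
  const isNewTag = entry.tag !== undefined && !segment.tags.has(entry.tag);
  return !(isNewTag && segment.tags.size >= MAX_TAGS);
}

// Appends the entry and updates the header aggregates,
// keeping one min/max/sum record per value position.
function append(segment, entry) {
  segment.entries.push(entry);
  if (entry.tag !== undefined) segment.tags.add(entry.tag);
  if (segment.count === 0) segment.first = entry.timestamp;
  segment.last = entry.timestamp;
  segment.count += 1;
  entry.values.forEach((v, i) => {
    const agg = segment.perValue[i] || { min: v, max: v, sum: 0 };
    agg.min = Math.min(agg.min, v);
    agg.max = Math.max(agg.max, v);
    agg.sum += v;
    segment.perValue[i] = agg;
  });
}

// Appends to the last segment, or opens a new one when the entry does not fit.
function appendToSeries(segments, entry) {
  let current = segments[segments.length - 1];
  if (!current || !fits(current, entry)) {
    current = createSegment();
    segments.push(current);
  }
  append(current, entry);
  return segments;
}

const segments = [];
const start = new Date("2024-01-01T00:00:00Z");
appendToSeries(segments, { timestamp: start, values: [10, 20], tag: "a" });
appendToSeries(segments, { timestamp: new Date(start.getTime() + 1000), values: [30, 5] });
// This entry is more than MAX_SPAN_MS after the first one, so it opens a second segment.
appendToSeries(segments, { timestamp: new Date(start.getTime() + MAX_SPAN_MS + 1), values: [1, 1] });
```

Keeping the aggregates in the header means a query for a range's minimum, maximum, or sum can, in principle, read the header of a fully covered segment instead of decoding every entry.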
Compression
Time series data is stored using a format called Gorilla compression.
On top of the Gorilla compression, the time series segments are compressed using the LZ4 algorithm.
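To give an intuition for why high-precision doubles compress poorly, here is a sketch of the XOR step used by Gorilla-style encoders, where each value is XORed with the previous one and only the span between the first and last set bit needs to be stored. This is not RavenDB's implementation; the function names are hypothetical and the code only measures that span.

```javascript
// Returns the raw 64-bit pattern of a double as a BigInt.
function bitsOf(value) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, value);
  return view.getBigUint64(0);
}

// Counts the bits between the first and last set bit of (prev XOR next).
// In XOR-based schemes, fewer such bits generally means a shorter encoding.
function meaningfulXorBits(prev, next) {
  let xor = bitsOf(prev) ^ bitsOf(next);
  if (xor === 0n) return 0;
  while ((xor & 1n) === 0n) xor >>= 1n; // drop trailing zero bits
  return xor.toString(2).length;        // leading zero bits are never printed
}

const integral = meaningfulXorBits(1672, 1673);
const fractional = meaningfulXorBits(1.672, 1.673);
console.log({ integral, fractional });
```

Consecutive whole numbers such as 1672 and 1673 differ in a single mantissa bit, while 1.672 and 1.673 differ across a much wider span of bits, which is consistent with the note that values with more digits after the decimal point take up more space.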
Updating Time Series
Document-change event
- Time series name update:
  Creating/deleting a time series adds/removes its name to/from the metadata of the document it belongs to.
  This modification triggers a document-change event, thereby initiating various processes within RavenDB such as ongoing tasks, revision creation, subscriptions, etc.
- Time series entries updates:
  As long as a new time series is not created, or an existing one is not removed,
  modifying time series entries does not invoke a document-change event.
No conflicts
Time series actions do not cause conflicts. Updating a time series is designed to succeed without causing a concurrency conflict: as long as the document it extends exists, updating a time series will always succeed.
- Updating time series concurrently by multiple cluster nodes:
  When a time series' data is replicated by multiple nodes, the data from all nodes is merged into a single series.
  When multiple nodes append different values at the same timestamp:
  - If the nodes try to append a different number of values for the same timestamp, the greater number of values is applied.
  - If the nodes try to append the same number of values, the first values from each node are compared.
    The append whose first value sorts higher lexicographically (not numerically) is applied.
    For example, lexicographic order would sort numbers like this: 1 < 10 < 100 < 2 < 21 < 22 < 3
  - If an existing value at a certain timestamp is deleted by one node and updated by another node, the deletion is applied.
- Updating time series by multiple clients to the same node:
  - When a time series' value at a certain timestamp is appended by multiple clients more or less simultaneously, RavenDB uses the last-write strategy.
  - When an existing value at a certain timestamp is deleted by a client and updated by another client, RavenDB still uses the last-write strategy.
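The multi-node merge rules above can be sketched as a small pure function. This is a hypothetical model, not RavenDB's code: the function name and the shape of its arguments are assumptions, and the text form used for the lexicographic comparison is assumed to be JavaScript's `String(number)`.

```javascript
// Hypothetical model of the merge rules for two nodes writing the same timestamp.
// Each side is either { deleted: true } or { values: number[] }.
function resolveNodeConflict(a, b) {
  // Rule: a deletion beats an update of the same timestamp.
  if (a.deleted) return a;
  if (b.deleted) return b;
  // Rule: the append carrying the greater number of values is applied.
  if (a.values.length !== b.values.length) {
    return a.values.length > b.values.length ? a : b;
  }
  // Rule: same count, so compare the first values as text, not as numbers.
  const left = String(a.values[0]);
  const right = String(b.values[0]);
  return right > left ? b : a; // a tie keeps `a`; the text does not define ties
}

// Array.prototype.sort without a comparator compares elements as strings,
// which reproduces the lexicographic order from the example above.
const lexicographic = [3, 22, 100, 1, 21, 10, 2].sort();

const winner = resolveNodeConflict({ values: [10] }, { values: [2] });
console.log(lexicographic, winner);
```

Note that under these rules an append of `2` wins over an append of `10`, because `"2"` sorts after `"10"` when the values are compared as text.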
Transactions
When a session transaction that includes a time series modification fails for any reason,
the time series modification is reverted.
Case insensitive
All time series operations are case insensitive. For example:
session.timeSeriesFor("users/john", "HeartRate")
.deleteAt(timeStampOfEntry);
is equivalent to
session.timeSeriesFor("users/john", "HEARTRATE")
.deleteAt(timeStampOfEntry);