Design: Time Series
- Time series are sequences of numerical values, associated with timestamps and sorted chronologically.
- RavenDB Time Series are stored and managed as document extensions, gaining much greater speed and efficiency than they would have had as JSON-formatted data within a document.
Time Series Structure
Document Extension
Each time series belongs to, or extends, one particular document. The document and the time series reference each other through:
- A reference to the time series in the document's metadata.
  The time series' name is kept in the document's metadata.
  The time series' data is stored in a separate location.
- A reference to the document in the time series data.
Time Series Entries
Each time series entry (TimeSeriesEntry) is composed of:

| Parameter | Type | Description |
|---|---|---|
| Timestamp | DateTime (UTC) | The time of the event or data represented by the entry. Time is measured up to millisecond resolution. |
| Tag | string | An optional tag for an entry. Can be any string up to 255 bytes. Possible uses for the tag: descriptions or metadata for individual entries; storing a document id, which can then be referenced when querying a time series. This is the only component of the entry that is not numerical. |
| Values | double[] | An array of up to 32 double values. |

Doubles with higher precision, i.e. more digits after the decimal point, are much less compressible. In other words, 1.672 takes up more space than 1672.
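The entry constraints above can be modeled in a few lines. This is a minimal Python sketch, not the RavenDB client type: the `Entry` class, the UTF-8 byte count for the tag, and truncation (rather than rounding) to milliseconds are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple

MAX_TAG_BYTES = 255  # tag size limit stated in the table above
MAX_VALUES = 32      # values-per-entry limit stated in the table above


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for a time series entry (illustration only)."""
    timestamp: datetime
    values: Tuple[float, ...]
    tag: Optional[str] = None

    def __post_init__(self):
        if self.timestamp.tzinfo is None:
            raise ValueError("timestamp must be timezone-aware")
        # Normalize to UTC and truncate to millisecond resolution (assumed behavior).
        ts = self.timestamp.astimezone(timezone.utc)
        ts = ts.replace(microsecond=ts.microsecond // 1000 * 1000)
        object.__setattr__(self, "timestamp", ts)

        # Store the values as an immutable tuple of floats.
        values = tuple(float(v) for v in self.values)
        if len(values) > MAX_VALUES:
            raise ValueError(f"an entry holds at most {MAX_VALUES} values")
        object.__setattr__(self, "values", values)

        # The tag is the only non-numerical component; its size is counted in bytes.
        if self.tag is not None and len(self.tag.encode("utf-8")) > MAX_TAG_BYTES:
            raise ValueError(f"a tag holds at most {MAX_TAG_BYTES} bytes")


entry = Entry(
    datetime(2024, 1, 1, 12, 0, 0, 123456, tzinfo=timezone.utc),
    (68, 72.5),
    tag="watches/fitbit",
)
```

Here the tag stores a document id (`watches/fitbit` is a made-up id), one of the uses listed in the table.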
The HasTimeSeries Flag
- When a document has one or more time series, RavenDB automatically adds a HasTimeSeries flag to the document's metadata under @flags:
{
    "Name": "Paul",
    "@metadata": {
        "@collection": "Users",
        "@timeseries": [
            "my time series"
        ],
        "@flags": "HasTimeSeries"
    }
}
- When all time series are deleted from a document, RavenDB automatically removes the flag.
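A client that receives the raw document JSON could inspect the flag and the series names as sketched below. The helper names are hypothetical, and the sketch assumes that multiple flags, when present, are comma-separated inside the `@flags` string.

```python
import json


def has_time_series(document: dict) -> bool:
    """Return True if the document metadata carries the HasTimeSeries flag."""
    flags = document.get("@metadata", {}).get("@flags", "")
    # Assumption: several flags may share the string, separated by commas.
    return "HasTimeSeries" in [flag.strip() for flag in flags.split(",")]


def time_series_names(document: dict) -> list:
    """Return the time series names listed in the document metadata."""
    return list(document.get("@metadata", {}).get("@timeseries", []))


doc = json.loads("""
{
    "Name": "Paul",
    "@metadata": {
        "@collection": "Users",
        "@timeseries": ["my time series"],
        "@flags": "HasTimeSeries"
    }
}
""")

print(has_time_series(doc), time_series_names(doc))
```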
Segmentation
At the server storage level, time series data is divided into segments.
Each segment contains a number of consecutive entries from the same time series.
Segments Size and Limitations
- Segments have a maximum size of 2 KB.
  In practice, this limit means that a segment can contain up to 32k entries, and a time series larger than that is always stored in multiple segments.
- Segments usually contain far fewer than 32k entries, depending on the size of the entries (after compression).
  For example, in the Northwind sample dataset, the Companies documents all have a time series called StockPrice. These time series are stored in segments that have ~10-20 entries each.
- The maximum time gap between the first and last entries in a segment is ~24.86 days (int.MaxValue milliseconds).
  Adding an entry that is further than that from the first segment entry adds it as the first entry of a new segment. As a consequence, segments of sparsely-updated time series can be significantly smaller than 2 KB.
- The maximum number of unique tags allowed per segment is 127.
  A higher number than that causes the creation of a new segment.
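The ~24.86 days figure follows directly from int.MaxValue milliseconds. The short Python sketch below reproduces the arithmetic. The `starts_new_segment` helper is hypothetical, and it assumes "further than" means strictly greater than the limit.

```python
INT_MAX_VALUE = 2**31 - 1            # value of C# int.MaxValue (2,147,483,647)
MS_PER_DAY = 24 * 60 * 60 * 1000     # milliseconds in one day

max_gap_days = INT_MAX_VALUE / MS_PER_DAY
print(round(max_gap_days, 2))        # 24.86


def starts_new_segment(first_entry_ms: int, new_entry_ms: int) -> bool:
    """True if the new entry is too far from the segment's first entry.

    Assumption: the boundary is exclusive, i.e. a gap of exactly
    int.MaxValue milliseconds still fits in the same segment.
    """
    return new_entry_ms - first_entry_ms > INT_MAX_VALUE
```

A series that receives one entry per month would therefore end up with one entry per segment under this rule, which is why sparsely-updated series have small segments.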
Aggregate Values
RavenDB automatically stores and updates aggregate values in each segment's header. These summarize commonly-used values regarding the segment, including:
- The segment's First value
- The segment's Last value
- The segment's Max value
- The segment's Min value
- The segment's Count of values
- The segment's values Sum
The existence of aggregate values makes it worthwhile to reference individual segments in indexes and queries.
When segment entries store multiple values, e.g. each entry contains a Latitude value and a Longitude value, the six aggregate values are kept for each value separately.
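To illustrate what "kept for each value separately" means, here is a minimal Python sketch that computes the six aggregates per value position for a list of entries. It is an illustration of the idea only, not RavenDB's header layout. The function name is hypothetical, and all entries are assumed to hold the same number of values.

```python
from typing import Dict, List, Sequence


def segment_aggregates(entries: Sequence[Sequence[float]]) -> List[Dict[str, float]]:
    """Compute First/Last/Min/Max/Sum/Count for each value position.

    `entries` is ordered chronologically; each item is one entry's values.
    """
    aggregates = []
    # zip(*entries) regroups the data by value position (one column per value).
    for column in zip(*entries):
        aggregates.append({
            "First": column[0],
            "Last": column[-1],
            "Min": min(column),
            "Max": max(column),
            "Sum": sum(column),
            "Count": len(column),
        })
    return aggregates


# Each entry holds a Latitude value and a Longitude value.
entries = [(40.5, -74.0), (41.0, -73.5), (40.0, -74.5)]
latitude, longitude = segment_aggregates(entries)
print(latitude)
print(longitude)
```

A query that only needs, say, the maximum over a long range can combine per-segment maxima instead of reading every entry, which is the benefit the text above describes.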
Compression
Time series data is stored using a format called Gorilla compression. On top of the Gorilla compression, the time series segments are compressed using the LZ4 algorithm.
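The following Python sketch gives some intuition for why high-precision doubles compress poorly. It assumes the scheme described in the Gorilla paper, where each value is XORed with its predecessor and only the "meaningful" bits between the leading and trailing zeros are stored. This page does not describe how RavenDB implements its variant, so treat this as an illustration of the principle, with hypothetical helper names.

```python
import struct


def double_bits(x: float) -> int:
    """Return the 64-bit IEEE 754 pattern of x as an unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def trailing_zero_bits(x: float) -> int:
    """Count the zero bits at the low end of x's 64-bit pattern."""
    bits = double_bits(x)
    if bits == 0:
        return 64
    return (bits & -bits).bit_length() - 1


def xor_meaningful_bits(previous: float, current: float) -> int:
    """Bits left after stripping leading and trailing zeros from the XOR."""
    xored = double_bits(previous) ^ double_bits(current)
    if xored == 0:
        return 0
    leading = 64 - xored.bit_length()
    trailing = (xored & -xored).bit_length() - 1
    return 64 - leading - trailing


# Whole numbers leave most of the mantissa empty; decimal fractions do not.
print(trailing_zero_bits(1672.0), trailing_zero_bits(1.672))
print(xor_meaningful_bits(1672.0, 1673.0), xor_meaningful_bits(1.672, 1.673))
```

Consecutive whole-number readings differ in very few bits, so little needs to be stored per entry, while readings such as 1.672 and 1.673 differ across much of the mantissa.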
Updating Time Series
Document Change
- Name Change
  Creating or deleting a time series adds or removes its name to/from the metadata of the document it belongs to.
  This modification triggers a document-change event, and processes such as revisions.
- Data Updates
  Modifying time series data does not invoke a document-change event, as long as it doesn't create a new time series or remove an existing one.
Success
Updating a time series is designed to succeed without causing a concurrency conflict.
As long as the document it extends exists, updating a time series will always succeed.
No Conflicts
Time series actions do not cause conflicts.
Time series updated concurrently by multiple cluster nodes:
When a time series' data is replicated by multiple nodes, the data from all nodes is merged into a single series.
When multiple nodes append different values at the same timestamp:
- If the nodes try to append a different number of values for the same timestamp, the larger number of values is applied.
- If the nodes try to append the same number of values, the first values from each node are compared. The append whose first value sorts higher lexicographically (not numerically) is applied.
  For example, lexicographic order would sort numbers like this: 1 < 10 < 100 < 2 < 21 < 22 < 3
- If an existing value at a certain timestamp is deleted by one node and updated by another node, the deletion is applied.
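The three rules above can be summarized as a small decision function. This Python sketch is a model of the described behavior, not RavenDB's replication code. The `resolve` helper is hypothetical, and it assumes the lexicographic comparison works on the textual form of the first value, and that ties keep the first argument, a case the text does not cover.

```python
from typing import Optional, Tuple

Values = Tuple[float, ...]


def resolve(a: Optional[Values], b: Optional[Values]) -> Optional[Values]:
    """Pick the winner for one timestamp written by two nodes.

    None represents a deletion of the value at that timestamp.
    """
    # Rule 3: a deletion on either node wins over an update.
    if a is None or b is None:
        return None
    # Rule 1: the append carrying more values wins.
    if len(a) != len(b):
        return a if len(a) > len(b) else b
    if not a:
        return a
    # Rule 2: same count, so compare the first values as text, not as numbers.
    return a if str(a[0]) >= str(b[0]) else b


# Lexicographic order differs from numeric order:
print(sorted([3, 22, 21, 2, 100, 10, 1], key=str))

print(resolve((10.0,), (2.0,)))       # (2.0,) wins: "2.0" sorts after "10.0"
print(resolve((1.0,), (1.0, 2.0)))    # the two-value append wins
print(resolve(None, (5.0,)))          # the deletion wins
```

Because every node applies the same deterministic rules, all nodes converge on the same value without raising a conflict.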
Time series updated by multiple clients of the same node:
- When a time series' value at a certain timestamp is appended by multiple clients more or less simultaneously, RavenDB uses the last-write strategy.
- When an existing value at a certain timestamp is deleted by one client and updated by another client, RavenDB still uses the last-write strategy.
Transactions
When a session transaction that includes a time series modification fails for any reason, the time series modification is reverted.
Case Insensitive
All time series operations are case insensitive. E.g.,
session.TimeSeriesFor("users/john", "HeartRate")
    .Delete(baseline.AddMinutes(1));
is equivalent to
session.TimeSeriesFor("users/john", "HEARTRATE")
    .Delete(baseline.AddMinutes(1));
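One way to picture case-insensitive addressing is a store that normalizes the series name before every lookup. The Python sketch below is a hypothetical in-memory model, not the RavenDB implementation. Using `str.casefold()` for normalization is an assumption, since the exact comparison rules are not stated here.

```python
class TimeSeriesStore:
    """Hypothetical in-memory store keyed by a case-normalized series name."""

    def __init__(self):
        self._series = {}  # normalized name -> {timestamp: value}

    @staticmethod
    def _key(name: str) -> str:
        # Assumption: names are compared after simple case folding.
        return name.casefold()

    def append(self, name: str, timestamp: int, value: float) -> None:
        self._series.setdefault(self._key(name), {})[timestamp] = value

    def delete(self, name: str, timestamp: int) -> None:
        self._series.get(self._key(name), {}).pop(timestamp, None)

    def get(self, name: str) -> dict:
        return dict(self._series.get(self._key(name), {}))


store = TimeSeriesStore()
store.append("HeartRate", 1, 70.0)
store.append("heartrate", 2, 72.0)   # same series, different casing
store.delete("HEARTRATE", 1)         # removes the entry appended as "HeartRate"
print(store.get("Heartrate"))
```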