
Time Series Rollups and Retention


Many time series applications produce massive amounts of data at a steady rate. Time Series Policies help you to manage your data in two ways:

  • Creating Rollups - summarizing time series data by aggregating it into the form of a new, lower resolution time series.

  • Limiting a time series' Retention - the amount of time that time series data is kept before being deleted.


Time Series Policies

What are Rollups?

A rollup is a time series that summarizes the data from another time series, with each rollup entry representing a specific time frame in the original time series. Each rollup entry contains 6 values that aggregate the data from all the entries in the original time frame:

  • First - the value of the first entry in the frame.
  • Last - the value of the last entry.
  • Min - the smallest value.
  • Max - the largest value.
  • Sum - the sum of all the values in the frame.
  • Count - the total number of entries in the frame.

This results in a much more compact time series that still contains useful information about the original time series (also called the "named" or "raw" time series). Rollup time series are created automatically according to rollup policies. Rollup policies apply to all time series of every document in the given collection. Each collection can be configured to have multiple policies.
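As a rough illustration of the six aggregates, the sketch below computes them for the values of a single frame. `RollupSketch` and `Aggregate` are hypothetical names used only for this example; they are not part of the RavenDB client API, and the server's own implementation may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the RavenDB client API.
public static class RollupSketch
{
    // Aggregates the values of one frame. The values are assumed to be
    // ordered by timestamp, so the first and last items are First and Last.
    public static (double First, double Last, double Min, double Max, double Sum, int Count)
        Aggregate(IReadOnlyList<double> frame)
    {
        if (frame.Count == 0)
            throw new ArgumentException("A frame needs at least one entry.", nameof(frame));

        return (
            First: frame[0],
            Last: frame[frame.Count - 1],
            Min: frame.Min(),
            Max: frame.Max(),
            Sum: frame.Sum(),
            Count: frame.Count);
    }
}
```

Keeping both Sum and Count means the average of a frame can be derived as Sum / Count.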

The rollup policies for a given collection are not applied independently. A given raw time series is rolled up using the policy with the shortest aggregation frame. Then that rollup time series is rolled up using the policy with the next shortest aggregation frame, and so on.
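The chaining order described above can be sketched as follows. This is a minimal model, not RavenDB code: `RollupChainSketch` is a hypothetical helper, and it uses `TimeSpan` for the aggregation frame as a simplification of the `TimeValue` type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the RavenDB client API.
public static class RollupChainSketch
{
    // Returns (source, target) pairs: the raw time series feeds the policy
    // with the shortest frame, and each rollup feeds the next shortest one.
    public static List<(string Source, string Target)> BuildChain(
        string rawName,
        IEnumerable<(string PolicyName, TimeSpan Frame)> policies)
    {
        var chain = new List<(string Source, string Target)>();
        var source = rawName;

        foreach (var policy in policies.OrderBy(p => p.Frame))
        {
            // Rollup names follow the "<raw name>@<policy name>" format.
            var target = $"{rawName}@{policy.PolicyName}";
            chain.Add((source, target));
            source = target;
        }

        return chain;
    }
}
```

For policies named "byHour" and "byDay", this yields HeartRates -> HeartRates@byHour, then HeartRates@byHour -> HeartRates@byDay, regardless of the order in which the policies were listed.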

Querying with group-by will transparently traverse over the rollups to retrieve the relevant results.

Let's look at an example of rollup data:

[Image: "Rollup time series entries"]


1) Name:
A rollup time series' name has this format:
"<name of raw time series>@<name of time series policy>"
It is a combination of the name of the raw time series and the name of the time series policy separated by a @ character - in the image above these are "HeartRates" and "byHour" respectively. For this reason, neither a time series name nor a policy name can have the character @ in it.
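A small sketch of the naming rule: composing a rollup name from its two parts, and splitting it back. `RollupNameSketch` is a hypothetical helper written for this page; in client code, the policy's own `GetTimeSeriesName` method (shown in the samples at the end of this page) serves the composing role.

```csharp
using System;

// Hypothetical helper for illustration; not part of the RavenDB client API.
public static class RollupNameSketch
{
    private const char Separator = '@';

    // Builds "<raw name>@<policy name>", rejecting names that contain '@'.
    public static string Compose(string rawName, string policyName)
    {
        if (rawName.IndexOf(Separator) >= 0 || policyName.IndexOf(Separator) >= 0)
            throw new ArgumentException("Names must not contain '@'.");

        return $"{rawName}{Separator}{policyName}";
    }

    // Splits a rollup name at the separator into its two parts.
    public static (string RawName, string PolicyName) Split(string rollupName)
    {
        var index = rollupName.IndexOf(Separator);
        if (index < 0)
            throw new ArgumentException("Not a rollup time series name.", nameof(rollupName));

        return (rollupName.Substring(0, index), rollupName.Substring(index + 1));
    }
}
```

Because neither part may contain @, the split is unambiguous.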

2) Timestamp:
The aggregation frame always begins at a round number of one of these time units: a second, minute, hour, day, week, month, or year. So the frame includes all entries starting at a round number of time units, and ending at a round number minus one millisecond (since milliseconds are the minimal resolution in RavenDB time series). The timestamp for a rollup entry is the beginning of the frame it represents.

For example, if the aggregation frame is three days, a frame will start and end at timestamps like:
2020-01-01 00:00:00 - 2020-01-03 23:59:59.999.
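The frame boundaries can be sketched with plain `DateTime` arithmetic. `FrameSketch` is a hypothetical helper: it covers only the simple case of a one-hour frame for the start, and it takes the frame start as given when computing the end. How the server aligns frames of several units is not covered by this sketch.

```csharp
using System;

// Hypothetical helper for illustration; not part of the RavenDB client API.
public static class FrameSketch
{
    // Start of the one-hour frame that contains the given timestamp:
    // the timestamp truncated to a round hour.
    public static DateTime HourFrameStart(DateTime timestamp) =>
        new DateTime(timestamp.Year, timestamp.Month, timestamp.Day,
                     timestamp.Hour, 0, 0, timestamp.Kind);

    // Last timestamp that still belongs to the frame: one millisecond
    // before the next frame begins.
    public static DateTime FrameEnd(DateTime frameStart, TimeSpan frameLength) =>
        frameStart + frameLength - TimeSpan.FromMilliseconds(1);
}
```

With a frame starting at 2020-01-01 00:00:00 and a length of three days, `FrameEnd` gives 2020-01-03 23:59:59.999, matching the example above.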

3) Values:
Each group of six aggregate values summarizes one value from the original entries. If the raw time series has n values per entry, the rollup time series will have 6*n values per entry: the first six summarize the first raw value, the next six summarize the next raw value, and so on. The aggregate values have the names:
"First (<name of raw value>)", "Last (<name of raw value>)", ... respectively.
Because time series entries are limited to 32 values, rollups are limited to the first five values of an original time series entry, or 30 aggregate values.
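The layout of the aggregate values can be sketched as index arithmetic. `RollupLayoutSketch` is a hypothetical helper, and it assumes the aggregates appear in the order in which this page lists them (First, Last, Min, Max, Sum, Count).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the RavenDB client API.
public static class RollupLayoutSketch
{
    // Assumed order, taken from the list of aggregates on this page.
    private static readonly string[] Aggregates =
        { "First", "Last", "Min", "Max", "Sum", "Count" };

    public const int MaxValuesPerEntry = 32;

    // Position within a rollup entry of one aggregate of one raw value.
    public static int IndexOf(int rawValueIndex, string aggregate)
    {
        var offset = Array.IndexOf(Aggregates, aggregate);
        if (offset < 0)
            throw new ArgumentException("Unknown aggregate.", nameof(aggregate));

        return rawValueIndex * Aggregates.Length + offset;
    }

    // Names of the rollup values, capped at the raw values that fit
    // within the 32-value limit (32 / 6 = 5 raw values, 30 aggregates).
    public static List<string> ValueNames(IReadOnlyList<string> rawValueNames)
    {
        var maxRawValues = MaxValuesPerEntry / Aggregates.Length;
        var names = new List<string>();

        for (var i = 0; i < rawValueNames.Count && i < maxRawValues; i++)
            foreach (var aggregate in Aggregates)
                names.Add($"{aggregate} ({rawValueNames[i]})");

        return names;
    }
}
```

Under that assumed order, the Max of the second raw value sits at index 1 * 6 + 3 = 9.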

Usage Flow and Syntax

To configure time series policies for one or more collections:

  • Create time series policy objects.
  • Use those to populate TimeSeriesCollectionConfiguration objects for each collection you want to configure.
  • Use those to populate a TimeSeriesConfiguration object which will belong to the whole database.
  • Finally, use the ConfigureTimeSeriesOperation operation to send the new configurations to the server.

Syntax

The two types of time series policy:

// Rollup policies
public class TimeSeriesPolicy
{
    public string Name;
    public TimeValue RetentionTime;
    public TimeValue AggregationTime;
}

// A retention policy for the raw TS
// Only one per collection
public class RawTimeSeriesPolicy : TimeSeriesPolicy
{
    public string Name;
    public TimeValue RetentionTime;
    // Does not perform aggregation
}

TimeSeriesPolicy:

Property | Description
--- | ---
Name | This string is used to create the names of the rollup time series created by this policy. Name is added to the name of the raw time series - with @ as a separator - to create the name of the resulting rollup time series.
RetentionTime | Time series entries older than this time span (see TimeValue below) are automatically deleted.
AggregationTime | The time series data being rolled up is divided at round time units, into parts of this length of time. Each of these parts is aggregated into an entry of the rollup time series.

RawTimeSeriesPolicy:

Property | Description
--- | ---
Name | The name of the policy. A raw policy does not perform aggregation, so no rollup time series is named after it.
RetentionTime | Time series entries older than this time span (see TimeValue below) are automatically deleted.


The TimeValue struct

public struct TimeValue
{
    public static TimeValue FromSeconds(int seconds);
    public static TimeValue FromMinutes(int minutes);
    public static TimeValue FromHours(int hours);
    public static TimeValue FromDays(int days);
    public static TimeValue FromMonths(int months);
    public static TimeValue FromYears(int years);
}

Each of the above TimeValue methods returns a TimeValue object representing a whole number of the specified time units. These are passed as the aggregation and retention spans of time series policies.

Note: The main reason we use TimeValue rather than something like TimeSpan is that TimeSpan has no notion of months, because a calendar month is not a fixed unit of time (it ranges from 28 to 31 days). TimeValue lets you define retention and aggregation spans of whole calendar months.
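To see why a fixed-length span cannot stand in for a calendar month, the sketch below measures the length of a one-month frame using only the .NET base class library. `MonthSketch` is a hypothetical name used for this example.

```csharp
using System;

// Hypothetical helper for illustration; uses only the .NET base class library.
public static class MonthSketch
{
    // Number of days in the one-month frame that starts at frameStart.
    public static int DaysInMonthFrame(DateTime frameStart) =>
        (frameStart.AddMonths(1) - frameStart).Days;
}
```

A frame starting on 2021-01-01 spans 31 days, one starting on 2021-02-01 spans 28, and one starting on 2020-02-01 (a leap year) spans 29, so no single TimeSpan value describes "one month".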

TimeSeriesCollectionConfiguration and TimeSeriesConfiguration

public class TimeSeriesCollectionConfiguration
{
    public bool Disabled;
    public List<TimeSeriesPolicy> Policies;
    public RawTimeSeriesPolicy RawPolicy;
}

public class TimeSeriesConfiguration
{
    public Dictionary<string, TimeSeriesCollectionConfiguration> Collections;
}

Property | Description
--- | ---
Disabled | If set to true, rollup processes will stop, and time series data will not be deleted by retention policies.
Policies | Populate this List with your rollup policies.
RawPolicy | The RawTimeSeriesPolicy, the retention policy for the raw time series.
Collections | Populate this Dictionary with the TimeSeriesCollectionConfigurations and the names of the corresponding collections.


The Time Series Configuration Operation

public ConfigureTimeSeriesOperation(TimeSeriesConfiguration configuration);

Pass your TimeSeriesConfiguration to this constructor; see the usage example below. For general guidance, see How to use an operation.

Samples

How to create time series policies for a collection and pass them to the server:

//Policy for the original ("raw") time-series,
//to keep the data for one week
var oneWeek = TimeValue.FromDays(7);
var rawRetention = new RawTimeSeriesPolicy(oneWeek);

//Roll-up the data for each day,
//and keep the results for one year
var oneDay = TimeValue.FromDays(1);
var oneYear = TimeValue.FromYears(1);
var dailyRollup = new TimeSeriesPolicy("DailyRollupForOneYear",
                                       oneDay,
                                       oneYear);

//Enter the above policies into a 
//time-series collection configuration
//for the collection 'Sales'
var salesTSConfig = new TimeSeriesCollectionConfiguration
{
    Policies = new List<TimeSeriesPolicy>
    {
        dailyRollup
    },
    RawPolicy = rawRetention
};

//Enter the configuration for the Sales collection
//into a time-series configuration for the whole database
var databaseTSConfig = new TimeSeriesConfiguration
{
    //Create the dictionary explicitly before adding the collection
    Collections = new Dictionary<string, TimeSeriesCollectionConfiguration>
    {
        ["Sales"] = salesTSConfig
    }
};

//Send the time-series configuration to the server
store.Maintenance.Send(new ConfigureTimeSeriesOperation(databaseTSConfig));

How to access a rollup time series:

//Create local instance of the time-series "rawSales"
//in the document "sales/1"
var rawTS = session.TimeSeriesFor("sales/1", "rawSales");

//Create local instance of the rollup time-series - first method:
var dailyRollupTS = session.TimeSeriesFor("sales/1",
                                          "rawSales@DailyRollupForOneYear");

//Create local instance of the rollup time-series - second method:
//using the rollup policy itself and the raw time-series' name
var dailyRollupTS2 = session.TimeSeriesFor("sales/1",
                                           dailyRollup.GetTimeSeriesName("rawSales"));

//Retrieve all the data from both time-series
var rawData = rawTS.Get(DateTime.MinValue, DateTime.MaxValue).ToList();
var rollupData = dailyRollupTS.Get(DateTime.MinValue, DateTime.MaxValue).ToList();