Querying Time Series: Overview & Syntax



Time series query capabilities

Time series query can -

  • Choose a range of time series entries to query from.
  • Filter time series entries by their tags, values and timestamps.
  • Aggregate time series entries into groups by a chosen time resolution,
    e.g. gather the prices of a stock that's been collected over the past two months to week-long groups.
    Entries can also be aggregated by their tags.
  • Select entries by various criteria, e.g. by the min and max values of each aggregated group,
    and project them to the client.
  • Calculate statistical measures: the percentile, slope, or standard deviation of a time series.

Server and client queries

Time series queries are executed by the server and their results are projected to the client,
so they require very little client computation resources.

  • The server runs time series queries using RQL.
  • Clients can phrase time series queries in raw RQL or using high level queries.
    High level queries are translated to RQL by the client before sending them to the server for execution.

Dynamic and index queries

  • Dynamic queries:

    • Time series indexes are Not created automatically by the server when making a dynamic query.
    • Use dynamic queries when time series you query are not indexed,
      or when you prefer that RavenDB would choose an index automatically. See queries always use an index.
      E.g. -

    // Define the time series query text
    const tsQueryText = "from HeartRates";
    
    // Make a dynamic query over the "employees" collection
    const queryResults = await session.query({ collection: "employees" })
         // Query for employees hired after 1994
        .whereGreaterThan("HiredAt", "1994-01-01")
         // Call 'selectTimeSeries' to project the time series entries in the query results
         // Pass the defined time series query text
        .selectTimeSeries(b => b.raw(tsQueryText), TimeSeriesRawResult)
        .all();
    
    // Results:
    // ========
    
    // 1. Results will include all entries from time series "HeartRates" for matching employee documents.
    // 2. Since this is a dynamic query that filters documents,
    //    an auto-index (Auto/employees/ByHiredAt) will be created if it doesn't already exist.
    //    However, it is NOT a time series index !! 
    //    It is a regular documents auto-index that allows querying for documents based on their HiredAt field.
    
    // Access a time series entry value from the results:
    const entryValue = queryResults[0].results[0].values[0];
    from "employees" as e
    where HiredAt > "1994-01-01"
    select timeseries (
        from HeartRates
    )
  • Index queries:

Scaling query results

  • Time series query results can be scaled, multiplied by some number. This doesn't change the values themselves, only the output of the query. Scaling can serve as a stage in a data processing pipeline, or just for the purposes of displaying the data in a more understandable format.

  • There are several use cases for scaling. For example, suppose your time series records the changing speeds of different vehicles as they travel through a city, with some data measured in miles per hour and others in kilometers per hour. Here, scaling can facilitate unit conversion.

  • Another use case involves the compression of time series data. Numbers with very high precision (i.e., many digits after the decimal point) are less compressible than numbers with low precision. Therefore, for efficient storage, you might want to change a value like 0.000018 to 18 when storing the data.
    Then, when querying the data, you can scale by 0.000001 to restore the original value.


Example:

// Add 'scale <number>' to your time series query text 
const tsQueryText = "from HeartRates scale 10";

const queryResults = await session.query({ collection: "users" })
    .selectTimeSeries(b => b.raw(tsQueryText), TimeSeriesRawResult)
    .all();

// The value in the query results is 10 times the value stored on the server
const scaledValue = queryResults[0].results[0].values[0];
from "users" 
select timeseries(
    from HeartRates
    scale 10
)

RQL syntax

A typical time series query can start by locating the documents whose time series we want to query.
For example, we can query for employees above 30:

from Employees as e
where Birthday < '1994-01-01'

Then, you can query their time series entries using either of the following two equivalent syntaxes:


select timeseries

This syntax allows you to encapsulate your query's time series functionality in a select timeseries section.

// Query for entries from time series "HeartRates" for employees above 30
// ======================================================================

// This clause locates the documents whose time series we want to query:
from Employees as e 
where Birthday < '1994-01-01'
 
// Query the time series that belong to the matching documents:
select timeseries (   // The `select` clause defines the time series query.  
    from HeartRates   // The `from` keyword is used to specify the time series name to query.  
)

declare timeseries

This syntax allows you to declare a time series function (using declare timeseries) and call it from your query.
It introduces greater flexibility to your queries as you can, for example, pass arguments to the time series function.

Here is a query written in both syntaxes.
It first queries for users above 30. If they possess a time series named "HeartRates", it retrieves a range of its entries.


With Time Series Function Without Time Series Function
// declare the time series function:
declare timeseries ts(jogger) {
    from jogger.HeartRates 
    between 
       "2020-05-27T00:00:00.0000000Z"
      and 
       "2020-06-23T00:00:00.0000000Z"
}

from Users as jogger
where Age > 30
// call the time series function
select ts(jogger)
from Users as jogger
where Age > 30
select timeseries(
    from HeartRates 
    between 
       "2020-05-27T00:00:00.0000000Z"
      and 
       "2020-06-23T00:00:00.0000000Z")

Combine time series and custom functions

  • You can declare and use both time series functions and custom JavaScript function in a query.
    The custom functions can call the time series functions, pass them arguments, and use their results.

  • In the example below, a custom function (customFunc) is called by the query select clause to fetch and format a set of time series entries, which are then projected by the query.
    The time series function (tsQuery) is called to retrieve the matching time series entries.

  • The custom function returns a flat set of values rather than a nested array, to ease the projection of retrieved values.

const queryResults = await session.advanced
     // Provide RQL to rawQuery
    .rawQuery(`
         // The time series function:
         // =========================
         declare timeseries tsQuery(user) {
             from user.HeartRates
             where (Values[0] > 100)
         }
         
         // The custom JavaScript function:
         // ===============================
         declare function customFunc(user) {
             var results = [];
         
             // Call the time series function to retrieve heart rate values for the user
             var r = tsQuery(user);
         
             // Prepare the results
             for(var i = 0 ; i < r.Results.length; i ++) {
                 results.push({
                     timestamp: r.Results[i].Timestamp, 
                     value: r.Results[i].Values.reduce((a, b) => Raven_Max(a, b)),
                     tag: r.Results[i].Tag  ?? "none"})
             }
             return results;
         }
         
         // Query & project results:
         // ========================
         from "users" as user
         select 
             user.name,
             customFunc(user) as timeSeriesEntries // Call the custom JavaScript function
     `)
     // Execute the query
    .all();
// The time series function:
// =========================
declare timeseries tsQuery(user) {
    from user.HeartRates
    where (Values[0] > 100.0)
}

// The custom JavaScript function:
// ===============================
declare function customFunc(user) {
    var results = [];

    // Call the time series function to retrieve heart rate values for the user
    var r = tsQuery(user);

    // Prepare the results
    for(var i = 0 ; i < r.Results.length; i ++) {
        results.push({
            timestamp: r.Results[i].Timestamp, 
            value: r.Results[i].Values.reduce((a, b) => Raven_Max(a, b)),
            tag: r.Results[i].Tag  ?? "none"})
    }
    return results;
}

// Query & project results:
// ========================
from "users" as user
select
    user.name,
    customFunc(user) as timeSeriesEntries // Call the custom JavaScript function

Use Studio to experiment

You can use Studio to try the RQL samples provided in this article and test your own queries.

"Time Series Query in Studio"

Time Series Query in Studio