see on GitHub

Querying: Time Series Querying Overview & Syntax



Time Series Queries

Time series query can -

  • Choose a range of time series entries to query from.
  • Filter time series entries by their tags, values and timestamps.
  • Aggregate time series entries into groups by a chosen time resolution, e.g. gather the prices of a stock that's been collected over the past two months to week-long groups). Entries can also be aggregated by their tags.
  • Select entries by various criteria, e.g. by the min and max values of each aggregated group, and project them to the client.
  • Calculate statistical measures: the percentile, slope, or standard deviation of a time series.

Server and Client Queries

Time series queries are executed by the server and their results are projected to the client, so they require very little client computation resources.

  • The server runs time series queries using RQL.
  • Clients can phrase time series queries in raw RQL or using LINQ expressions (which will be automatically translated to RQL before their execution by the server).

Dynamic and Indexed Queries

Time series indexes are not created automatically by the server, but static time series indexes can be created by clients (or using the Studio).

  • Use dynamic queries when time series you query are unindexed or when you prefer that RavenDB would choose an index automatically using its query optimizer. E.g. -

    //Look for time series named "HeartRates" in user profiles of users under 30.
    from Employees as e 
    where Birthday > '1990-01-01'
    select timeseries(
        from HeartRates
    )
  • Indexed queries can be performed over static indexes and their results. E.g. -

    from index 'SimpleIndex'
    where Tag = 'watches/fitbit'

Scaling Query Results

Time series query results can be scaled, multiplied by some number. This doesn't change the values themselves, only the output of the query. This can serve as a stage in a data processing pipeline, or just for the purposes of displaying the data in a more undertandable format.

There are many different use cases for this. For example, suppose your time series records the changing speeds of different vehicles as they travel through a city, but some of your data is in miles per hour, and some of it in kilometers per hour. In this case scaling can be used for unit conversion.

Another use case has to do with the compression of time series data. Numbers with very high precision (i.e., many digits after the decimal point) are less compressible than numbers with low precision. So for the purpose of storage, you might want to change a value like 0.000018 to 18. Then, when you query the data, you can scale by 0.000001 to restore the original value.

Scaling is a part of both RQL and LINQ syntax:

  • In RQL, use scale <double> in a time series query, and input your scaling factor as a double.
  • In LINQ, use .Scale(<double>).

Examples:

from Patients
select timeseries(
    from HeartRate
    scale 60
)
var query = session.Query<Patient>()
    .Select(p => RavenQuery.TimeSeries(p, "HeartRate")
    .Scale(60)
    .ToList());

Syntax

You can query time series using two equivalent syntaxes, choose the syntax you're comfortable with.


select timeseries Syntax: Creating a Time Series Section

This syntax allows you to encapsulate your query's time series functionality in a select timeseries section.

//Look for time series named "HeartRate" in user profiles of users under 30.

from Employees as e 
where Birthday > '1990-01-01'
select timeseries(
    from HeartRate
)
  • from Employees as e where Birthday > '1990-01-01'
    This document query locates the documents whose time series we want to query.

    A typical time series query starts by locating a single document.
    For example, to query a time series of stock prices, we can first locate a specific company's profile in the Companies collection, and then query the StockPrices time series that belongs to this profile.

    from Companies
          where Name = 'Apple'
          select timeseries(
              from StockPrices
          )
  • select timeseries
    The select clause defines the time series query.

  • from HeartRate
    The from keyword is used to select the time series we'd query, by its name.


declare timeseries Syntax: Declaring a Time Series Function

This syntax allows you to declare a time series function and call it from your query. It introduces greater flexibility to your queries as you can, for example, pass arguments to/by the time series function.

Here is a query in both syntaxes. It picks users whose age is under 30, and if they own a time series named "HeartRate", retrieves a range of its entries.

With Time Series Function Without Time Series Function
declare timeseries ts(jogger){
    from jogger.HeartRate 
    between 
       '2020-05-27T00:00:00.0000000Z'
      and 
       '2020-06-23T00:00:00.0000000Z'
}

from Users as jog where Age < 30
select ts(jog)
from Users as jog where Age < 30
 select timeseries(
    from HeartRate 
    between 
       '2020-05-27T00:00:00.0000000Z'
      and 
       '2020-06-23T00:00:00.0000000Z')

Queries can use both declared time series functions and custom JavaScript functions.
The JavaScript functions can then call the time series functions, pass them arguments, use and manipulate their results.