Earlier this week, we released RavenDB 5.2 to the world. This is an exciting release for a few reasons: we have a bunch of new features available and, as usual, the x.2 release is our LTS release.
RavenDB 5.2 is compatible with all 4.x and 5.x releases: you can simply update your server binaries and you’ll be up and running in no time. RavenDB 4.x clients can talk to a RavenDB 5.2 server with no issues. Upgrading a cluster (from 4.x or 5.x versions) can be done as a rolling update, and mixed-version clusters are supported (although some features will not be available until a majority of the cluster is running 5.2).
Let’s start with the new features; they are the more interesting part, I’ll admit.
- OLAP ETL (see below for full details) – This is the flagship feature for RavenDB 5.2
- Rolling index deployment makes it easier for RavenDB to control resource usage when introducing new indexes
- Telegraf integration & Grafana template to aid monitoring
- Cluster-wide dashboard to make it easy to track what is going on across the entire cluster
- Subscription tracking allows you to figure out exactly where your subscriptions are and what they are doing
- Read-only certificates allow you to reduce the access level for certain users
- Spatial queries have gotten a lot of performance improvements as well as much nicer analysis and debugging capabilities
- Custom analyzers make it easy to deploy your code for advanced full text scenarios
- Improved cluster-wide transactions reduce the manual work you do to ensure the correct behavior of transactions across the cluster
I’m going to be posting details about all those features, but I want to point out what is probably the most important aspect of this release, even beyond the flagship feature, OLAP ETL: RavenDB 5.2 is an LTS release.
Long Term Support release
LTS stands for Long Term Support. We support such releases for an extended period of time, and they are recommended for production deployments and long term projects.
Our previous LTS release, RavenDB 4.2, was released in May 2019 and is still fully supported. Standard support for RavenDB 4.2 will lapse in July 2022 (a year from now), and we’ll offer extended support for users who want to keep using that version afterward.
We encourage all RavenDB users to migrate to RavenDB 5.2 as soon as they are able.
OLAP ETL
The OLAP ETL feature deserves its own post (which will show up next week), but I wanted to say a few words about it. RavenDB is meant to serve as an application database, serving OLTP workloads. It has some features aimed at reporting, but that isn’t the primary focus.
For almost a decade, RavenDB has supported a native ETL process that pushes data on the fly from RavenDB to a relational database. The idea is that you can push the data into your reporting solution and continue using that infrastructure.
Nowadays, people are working with much larger datasets and there is usually not a single reporting database to work with. There are data lakes (and presumably data seas and oceans, I guess) and the cloud has a much higher presence in the story. Furthermore, there is another interesting aspect for us. RavenDB is often deployed on the edge, but there is a strong desire to see what is going on across the entire system. That means pushing data from all the edge locations into the cloud and offering reports based on that.
To answer those needs, RavenDB 5.2 has the OLAP ETL feature. At its core, it is a simple concept. RavenDB allows you to define a script that will transform your data into a tabular format. So far, that is very much the same as the SQL ETL. The interesting bit happens afterward. Instead of pushing the data into a relational database, we’ll generate a set of Parquet files (columnar data store) and push them to a cloud location.
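To make the "tabular first, columnar afterward" idea a bit more concrete, here is a rough Python sketch. To be clear, this is not the RavenDB ETL script syntax and it does not use any RavenDB API; the document shape, the field names and the `to_rows` / `to_columns` helpers are all made up for illustration. It only shows the two conceptual steps: flatten nested documents into flat rows, then lay those rows out column by column, which is roughly the shape a columnar format such as Parquet stores.

```python
from collections import defaultdict

# Hypothetical sample documents; the field names are invented for illustration.
orders = [
    {
        "id": "orders/1",
        "company": "companies/1",
        "ordered_at": "2021-06-03",
        "lines": [
            {"product": "products/1", "quantity": 2, "price": 10.0},
            {"product": "products/2", "quantity": 1, "price": 4.5},
        ],
    },
    {
        "id": "orders/2",
        "company": "companies/2",
        "ordered_at": "2021-07-15",
        "lines": [
            {"product": "products/1", "quantity": 3, "price": 10.0},
        ],
    },
]


def to_rows(order):
    """Flatten one nested order document into flat rows, one per order line."""
    for line in order["lines"]:
        yield {
            "order_id": order["id"],
            "company": order["company"],
            "month": order["ordered_at"][:7],  # e.g. "2021-06", a natural partition key
            "product": line["product"],
            "total": line["quantity"] * line["price"],
        }


def to_columns(rows):
    """Pivot row dicts into a columnar layout: one list of values per column."""
    columns = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return dict(columns)


rows = [row for order in orders for row in to_rows(order)]
columns = to_columns(rows)
```

A report that only needs, say, the `total` and `month` columns can read just those two lists and skip everything else, which is the main reason columnar files are a good fit for this kind of large scale querying.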
On the cloud, you can use any data lake solution to process those files, run reports, etc. For example, you can use Presto or AWS Athena to run queries over the uploaded files.
You can define the ETL process in a single location or across your entire fleet of databases on the edge; they’ll push the data to the cloud automatically and transparently. All the how, when, failure management, retries and other details are handled for you. RavenDB is also capable of integrating with custom solutions, such as generating a one-time token on each upload (no need to expose your credentials on the edge).
The end result is that you have a simple and native option to run large scale queries across your data, even if you are working in a widely distributed system. And even for those who run just a single cluster, you have a wider set of options on how to report and aggregate your data.