dpg media

How to Boost Queries by 800% Without Abandoning Your Relational System

by Karmel Indych

When DynamoDB, Redis, and Elasticsearch didn’t provide smooth and cost-efficient integration with their main PostgreSQL database, DPG Media integrated RavenDB into their system for faster complex queries. They saw an 800% performance increase and now have a far simpler and more cost-effective system, reducing their costs by 1/3.

DPG Media’s relational system has become so complex that the database is required to journey across many tables in every quest to answer a complex query. This means that their customers need to wait patiently to get the results they’re looking for. RavenDB is designed to answer complex queries at lightning speeds by putting all (or most) of the information onto one document instead of in multiple tables. That means only one quick trip to the server to get the results. It can also include multiple related documents in one trip. Furthermore, queries in RavenDB always use an index, which saves a lot of time and effort. It’s like looking at the index in the back of a huge reference book and then going straight to the document you need instead of going through the whole collection or piecing it together from different tables. RavenDB isn’t the only document database that uses indexes, but Ygor Castor’s team chose RavenDB because its query language is so similar to SQL that they didn’t have to change much of their code base to run queries.

Why they looked for a new data platform

As they wrote in an article about the transition to RavenDB, to speed queries, they originally went with big-name NoSQL databases DynamoDB, Redis, and Elasticsearch. Eventually, they needed to consolidate the data from their PostgreSQL database into a document model, and they didn’t find an efficient solution with the other three. After extensive research and tests, they found that RavenDB can meet their various needs and streamline their services while reducing costs and complexity.

As he was testing RavenDB to see if it fits, Ygor tried to “break Raven” on an older model of Raspberry Pi and on a low-end server. He achieved 15-20K operations/second on the Raspberry Pi and 400K/second on the server with RavenDB’s unique client-side Aggressive Caching that substantially reduces query trips to the server. When he saw that the limit was 400,000 reads per second, he noticed that the limit was on his network and not due to RavenDB. “I did not manage to break Raven. I tried to.” This was enough evidence for him that RavenDB’s performance is top-notch. On commodity hardware, RavenDB provides over 1M transactions per second, which means scaling won’t break your budget. Its efficiency on small machines also makes it ideal for edge/IoT deployments.

Why did they choose RavenDB over MongoDB and CouchDB?

RavenDB was easier than MongoDB or CouchDB to set-up. The RQL also allowed us to use the same logic as we use with SQL when querying data.

Ygor Castor

They also considered MongoDB and CouchDB, but it would be much harder to integrate them with their primary relational database. All of their data is written to SQL. Recently Ygor Castor’s department has been transferring data to RavenDB for their users to query much more quickly. Because the syntax and logic in RavenDB’s RQL is so similar to SQL, they are able to use the same logic when querying. This means that the two data platforms integrate well, and less work is needed.

They wanted RavenDB so much that they wrote a client and a driver

They are based on Elixir, and unfortunately, RavenDB didn’t have an Elixir client. Diving deeply into RavenDB documentation, they realized that it was definitely the solution that they needed and decided to write an Elixir client and a driver between Ecto and RavenDB. Ecto, Elixir’s data framework, typically only works with SQL, but because RavenDB’s RQL syntax is so similar to SQL, Ygor was able to translate it without any issues. 

Now, to minimize latency for their users, they continue writing to their organization’s main PostgreSQL management database and for a CQRS separation of commands, they transfer the data to RavenDB document databases for queries. RavenDB’s approach to data modeling and indexing vastly improves query efficiency compared with relational databases in complex queries

“Our API went from a median of 1~2 seconds to 180ms!”

At the same time, their costs were reduced by about ⅓ and their system is far less complex to maintain. 

Ygor’s and his team replaced DynamoDB, Redis, and Elasticsearch with RavenDB, which integrates smoothly with SQL in a far simpler system that costs much less.

Why they prefer RavenDB

Raven’s RQL is very similar to SQL

Any teammate who is accustomed to SQL can easily learn RQL.

Integration with SQL

Ygor Castor wrote a RavenDB driver for the Ecto data access layer. When they transfer data from PostgreSQL to RavenDB, they simply need to change the targets and it works. Because the syntax in RavenDB is so similar to SQL, they barely needed to make any changes in their code base. The two data platforms work together smoothly and require much less work to integrate than other NoSQL options do.

Simple to set up and learn

Ygor found that setting up new RavenDB databases is a simple process. The learning curve was easier than expected for everything except for RavenDB’s client certificate approach to controlling various users’ access

Reliability and peace of mind

We haven’t felt any downtime.

Ygor Castor

RavenDB clusters are known for providing clients with zero downtime if a server goes down. Looking at their logs, DPG noticed that whenever a server in the cluster went down, the others picked up the work instantly and seamlessly. Their users didn’t feel a thing. RavenDB cloud users can distribute nodes in their cluster to different data centers so that even when an entire data center goes offline, business continues as usual without a glitch. 

Better query speeds and costs with default client-side caching

Client-side caching that invalidates stale data in the cache by checking the ETag is default in RavenDB. Aggressive caching is another native RavenDB approach for even faster query performance where the server notifies the client cache in real-time whenever the data has changed. This approach is not default because it has a small chance of stale data if the data has changed at the same time as a query is made on it and aggressive caching hasn’t yet had a chance to notify the client. Both of these approaches speed queries substantially and reduce workload as the client doesn’t need to return to the server and carry the data for queries that have already been made unless the data has changed. 

Auto-indexes take the guesswork out of development cycles

RavenDB queries are always fast and efficient because they must use indexes instead of scanning the entire data set. If a user issues a query for which there isn’t an index, RavenDB creates or updates a dynamic index on the fly. Auto-indexes that aren’t used for a while are automatically removed to reduce needless work and free up disk space. These auto-indexes prevent expensive and time-consuming full scans.  

Attachments

DPG plans to attach videos and other such binary files to documents making them quick and easy to access. These attachments have a strong link to the document but are not in the document so that they don’t add weight to the document itself. This adds efficiency whenever the document is loaded, but the attachment isn’t needed.

Revision History

By using RavenDB’s Revisions, their users can now easily check changes made and revert to old documents. It leaves a trail of changes or an “automated audit log”.

Counters

They use counters for atomic control of their users’ gaming status.  They used DynamoDB for this previously, but it is built into RavenDB, so this simplifies their system.

Counters are connected to documents but are stored separately. Thus, changes to their values will not trigger indexing updates or modify documents. This means that multiple users can accurately increment counters without concurrency issues. Less users need to access the document which also reduces concurrency issues. Counters typically change frequently, while most documents don’t. Creating them can be included in the same trip to the server as its corresponding document when it is saved. Counters can also be defined to interact with other RavenDB features.   

Document Expirations

Some of their puzzles should be deleted after a few weeks, so they set document expirations to delete documents after the time that they configure. This automates cleaning useless data, which still requires a lot of work in PostgreSQL. The expiration date and time are part of each document’s metadata. 

Full-Text Search

Full-text search is a native feature in RavenDB via Lucene. Elasticsearch also uses Lucene. This means that RavenDB can be an out-of-the-box replacement for Elasticsearch, with added benefits, such as ACID transactions, and more. RavenDB’s native Lucene full-text search engine was found to be as fast and feature-rich as Elasticsearch in their tests. RavenDB’s upcoming indexing and query engine Corax (which is currently in beta) has the same functionality but is much faster. Corax trades off storage space for a substantial speed boost.

Data integrity with default ACID transactions

RavenDB features default ACID transactions without sacrificing performance. At over 1M reads and 150K writes per second, RavenDB makes it easy to maintain data integrity through ACID transactions and guaranteed writes while providing your users with impressively low latency.

Difficulties they had with RavenDB and how they solved them

To configure various users’ access privileges, Castor is accustomed to traditional databases identifying users and setting their permissions. RavenDB’s approach is to give each client a client certificate that has the access permissions set in it. Although this approach is very efficient in a microservices/DDD architecture, it took some getting used to. 

What’s next

They are planning to do more with counters in the future.

Also, some of their teams are considering NoSQL and are looking at MongoDB as a solution. Ygor plans on showing them how useful and simple RavenDB has been to work with.

About DPG Media

DPG Media offers entertainment, news, culture, and games via multiple media including paper, websites, video, and augmented reality. They provide services primarily in Belgium, The Netherlands, and Denmark but also have global media outlets. DPG is a media conglomerate with over 30 brands and has been active for over 50 years. Ygor Castor’s team is in charge of all video and game content, including playlists throughout the conglomerate. 

Tech stack: 6 microservices running on AWS Fargate and written in Elixir. A RavenDB cluster handles over 3TB of data and averages 3 million calls every 15 minutes.

Woah, already finished? 🤯

If you found the article interesting, don’t miss a chance to try our database solution – totally for free!

Try now try now arrow icon