Generating sequential numbers in a distributed manner

On its face, we have a simple requirement:

Generate sequential numbers
Ensure that there can be no gaps
Do that in a distributed manner

Generating the next number in the sequence is literally as simple as ++, so surely that is a trivial task, right?

The problem is with the second requirement. The need to ensure that there are no gaps comes often when dealing with things like invoices. The tax authorities are really keen on “show me all your invoices”, and if there are gaps in the numbers, you may have to provide Good Answers.

You may think that the third one, running in a distributed environment, is the tough challenge, but that isn’t actually the case. If we are running in a single location, that is fairly easy. Run the invoice id generation as a transaction, and you are done. But the normal methods of doing that are usually wrong in edge cases.

Let’s assume that we use an Oracle database, which uses the following mechanism to generate the new invoice id:

invoice_seq.NEXTVAL

Or SQL Server with an identity column:

CREATE TABLE invoices ( invoice_id INT IDENTITY(1,1) PRIMARY KEY, … );

In both cases, we may insert a new value to the invoices table, consuming an invoice id. At some later point in time, we may roll back the transaction. Care to guess what happens then?

You have INVOICE #1000 and INVOICE #1002, but nothing in between. In fact, no way to even tell what happened, usually.

In other words, identity, sequence, serial, or autonumber – regardless of what database platform you use, are not suitable for generating gapless numbers.

The reasoning is quite simple. Assume that you have two concurrent transactions, which generate two new invoices at roughly the same time. You commit the later one before the first one, and roll back the first. You now have:

Invoice #1
Invoice #2
…
Invoice #1000
Invoice #1002

However, you don’t have Invoice #1001, and you cannot roll back the sequence value there, because if you do so, it will re-generate the #1002 on the next call.

Instead, for gapless numbers, we need to create this as a dedicated part of the transaction. So there would be a record in our system that contains the NextInvoiceId, which will be incremented as part of the new invoice creation.

In order to ensure that there are no gaps, you need to ensure that the NextInvoideId record increment is handled as a user operation, not a database operation. In other words, in SQL Server, that is a row in a table, that you manually increment as part of adding a new invoice. Here is what this will look like:

As you can see, we increment the row directly. So it will be rolledback as well.

The downside here is that we can no longer create two invoices concurrently. The second transaction would have to wait on the lock on the row in the next_gapless_ids table.

All of that happens inside a single database server. What happens when we are running in a distributed environment?

The answer in this case, is the exact same thing. You need a transaction, a distributed one, using a consensus algorithm. Here is how you can achieve this using RavenDB’s cluster wide transactions, which use the Raft protocol behind the scenes:

The idea is simple, we have a transaction that modifies the gapless ids document and creates a new invoice at the same time. We have to handle a concurrency exception if two transactions try to create a new invoice at the same time (because they both want to use the same invoice id value), but in essence this is pretty much exactly the same behavior as when you are running on a single node.

In other words, to ensure the right behavior, you need to use a transaction. And if you need a distributed transaction, that is just a flag away with RavenDB.

RavenDB

RavenDB Cloud

Try

Experience interactive demos and playground server

RavenDB Docs

RavenDB Cloud Docs

Documentation Guide

Download

Features

Performance

Comparison

What’s New

Demo

Bootcamp

Webinars

Workshops

Inside RavenDB Book

GitHub

StackOverflow

Articles

Whitepapers

Events

Promotional Materials

Unlock your business potential

Use Cases

Articles

Whitepapers

Press Releases

Industry Reports

Performance

Comparison

Proof of Concept Program

Academic Program

Events

What’s New

Roadmap

On-premise Pricing

Cloud Pricing

Support

Proof of Concept Program

Academic Program

Generating sequential numbers in a distributed manner

Woah, already finished? 🤯

Related Articles

CollabTalk Podcast | Episode 123 with Oren Eini–Building a business with Open Source foundations

RavenDB’s storage engine: Voron–unlocking the secret

Certificates from the Ground Up

Watch Live Demo

RavenDB

RavenDB Cloud

Try

RavenDB Docs

RavenDB Cloud Docs

Documentation Guide

Download

Features

Performance

Comparison

What’s New

Demo

Bootcamp

Webinars

Workshops

Inside RavenDB Book

GitHub

StackOverflow

Articles

Whitepapers

Events

Promotional Materials

Use Cases

Articles

Whitepapers

Press Releases

Industry Reports

Performance

Comparison

Proof of Concept Program

Academic Program

Events