The bug in the random sort

Est. reading time: 2 min
RavenDB News

We needed to randomize a list of values, so I quickly wrote the following code:

What do you expect the output to be?

  • A –> 2431
  • B –> 2571
  • C -> 4998

The number vary slightly, but they are surprisingly consistent overall. C is about 50% likely to be at the head of the list, but I wanted the probability to be 33% obviously.

Can you figure out why this is the case? In order to sort, we need to compare the values. And we do that in a random fashion. We start by comparing A and B, and they have 50% change of either one of them being first.

Then we compare the winner (A | B) to C, and there is a 50% chance of C being first. In other words, because we compare C to the top once, it has a 50% chance of being first. But for A and B, they need to pass two 50% chances to get to the top, so they are only there 25% of the time.

When we figured out why this is happening, I immediately thought of the Monty Hall problem.

The right solution is the Fisher-Yates algorithm, and here is how you want to implement it.

Woah, already finished? 🤯

If you found the article interesting, don’t miss a chance to try our database solution – totally for free!

Try now try now arrow icon