I mentioned that I’m teaching a Cloud Computing course at university in a previous post. That lead to some good questions that I have to field about established wisdom that I have to really think about. One such question that I run into was about the intersection of databases and the cloud.
One of the most important factors for database performance is the I/O rate that you can get. Let’s take a fairly typical off the shelf drive, shall we?
Cost of the drive is less than 500 $ US for a 2TB disk and it can write at close 5GB / sec with sustained writes sitting at 3GB /sec at User Benchmark, it is also rated to hit 1 million IOPS. That is a lot. And that is when you spend less than 500$ on that.
On the other hand, a comparable drive would be Azure P40, which cost 235.52$ per month for 2TB of disk space. It also offers a stunning rate of 7,500 IOPS (with bursts of 30,000!). The write rate is 250MB/sec with bursts of 1GB/sec. The best you can get on Azure, though, is an Ultra disk. Where a comparable disk to the on premise option would cost you literally thousands per month (and would be about a tenth of the performance).
In other words, the cloud option is drastically more costly. To be fair, we aren’t comparing the same thing at all. A cloud disk is more than just renting of the hardware. There is redundancy to consider, the ability to “move” the disk between instances, the ability to take snapshots and restore, etc.
A more comparable scenario would be to look at NVMe instances. If we’ll take L8sv2 instance on Azure, that gives us a 2TB NVMe drive with 400,000 IOPS and 2GB/sec throughput. That is at least within reach of the off the shelf disk I pointed out before. The cost? About 500$ per month. But now we are talking about a machine that has 8 cores and 64 GB of RAM.
The downside of NVMe instances is that the disk are transient. If there is a failure that requires us to stop and start the machine (basically, moving hosts), that would mean that the data is lost. You can reboot the machine, but not stop the cloud allocation of the machine without losing the data.
The physical hardware option is much cheaper, it seems. If we add everything around the disk, we are going to get somewhat different costs. I found a similar server to L8sv2 on Dell for about 7,000 $ US, for example. Pretty sure that you can get it for less if you know what you are doing, but it was my first try and it included 3.2 TB of enterprise grade NVMe drives.
Colocation pricing can run about 100$ a month (again, first search result, you can get for less) and that means that the total monthly cost is roughly 685$. That is comparable to the cloud, actually, but doesn’t account for the fact that you can use the same server for much longer than a single year. It is also probably wasting a lot of money on bigger hardware. What you don’t get, which you probably want, is the envelope around that. The ability to say: “I want another server” (or ten), the ability to move and manage your resources easily, etc. And that is as long as you are managing just hardware resources.
You don’t get any of the services or the expertise in running things. Given that even professional organizations can suffer devastating issues, you want to have an expert manage than, because an armature handling that topic lead to problems.
A lot of the attraction of the cloud comes from a very simple reason. I don’t want to deal with all of that stuff. None of that is your competitive advantage and you would rather just pay and not think about that. The key for the success of the cloud is that globally, you are paying less (in time, effort and manpower) than taking the cost of managing it yourself.
There are two counterpoints here, though.
- At some scale, it would make sense to move out from the cloud to your own hardware. Dropbox did that at some point, moving some of its infrastructure off the cloud to savings of over 75 million dollars. You don’t have to be at Dropbox size to benefit from having some of your own servers, but you do need to hit some tipping point before that would make sense.
- StackOverflow is famously running on their own hardware, and is able to get great results out of that. I wonder how much the age of StackOveflow has to do with that, though.
The cloud is a pretty good abstraction, but it isn’t one that you get for free. There are a lot of scenarios where it makes a lot of sense to have some portions of your system outside of the cloud. The default of “everything is in the cloud”, however, make a lot of sense. Specifically because you don’t need to do complex (and costly) sizing computations. Once you have the system running and the load figured out, you can decide if it make sense to move things to your own severs.
And, of course, this all assumes that we are talking about just the hardware. That is far from the case in today’s cloud. Cloud services are another really important aspect of what you get in the cloud. Consider the complexity of running a Kubernetes cluster, or setting up a system for machine vision or distributed storage or any of the things that the cloud providers has commoditized.
The decision of cloud usage is no longer a simple buy vs. rent but a much more complex one about where do you draw the line of what should be your core concerns and what should be handled outside of your purview.