Questions and Answers about Modern Database Architecture
VoltDB and DBTA teamed up to host a webinar in October titled “Architectural Considerations for a Modern Database to Operate at Scale”. We had a number of excellent questions during that session, which are reproduced below. If you missed the webinar, you may view it here in its entirety.
Unless otherwise noted, the question is coming from a webinar attendee via our chat functionality during the session and voiced by host Stephen Faig. Answers were provided by VoltDB’s Dheeraj Remella. These answers have been edited for clarity and grammar.
Q: Does database as a service capability include database as a service with disaster recovery with fallback?
A: Something to think about is how mature these operational solutions have actually become, and how much of the switchover can happen automatically. Several of our customers have scripted the switchover from a primary cluster to a secondary cluster, but they scripted it in their own datacenter. That’s something to investigate: “Okay, I’m asking for a database, but I also want to ask for a DR cluster from that service, and that service needs to route over to another datacenter perhaps to connect these two clusters together.” So, I would say from a maturity standpoint, it’s an absolutely meaningful ask.
Q: What about cloud storage when considering key architectural decisions?
A: That’s an interesting question. If you’re talking about cloud storage like Amazon S3 or something like that, you could actually use that. If you’re talking about EBS volumes for persistence, then you need to think about what kind of bandwidth you have, because EBS volumes are network-attached storage, like a SAN or NAS. You want to see how much bandwidth is available from your instance to the EBS volume, and think about getting some kind of provisioned IOPS, so that you don’t run out of bandwidth capacity to your EBS volumes.
Q: What are your thoughts about using public cloud for high performance applications?
A: I have firsthand experience working on this. We actually put out a performance benchmark report recently where we evaluated running VoltDB as a high-performance database engine on the public cloud platforms out there. I’m sure the cloud platforms have upgraded their chip generation, memory generation, and things like that for better performance, but public cloud decisions rely on cost. 90% of the time there is going to be some semblance of overallocation because these instances or hosts are meant to be shared. AWS, GCP, and Azure have high performance instances that are specifically dedicated, one instance per host, so that the allocations are true to what is shown in the instance details. Always go for those instances for high performance applications. Even then, you would probably see some kind of degradation compared to bare metal boxes. If you use comparable hardware, your bare metal boxes are definitely going to deliver higher performance than any public cloud can actually give you.
Q: How can we recognize areas where a lower latency can be achieved compared to my current situation?
A: If you have a current situation where each of the layers in your data architecture is optimally configured (has zero fat, so to speak) but you have way too many layers, try to figure out where you should consolidate. For example, if you’re using Kafka just for pure data transportation, there’s probably some more muscle in Kafka that you could perhaps use. If you’re just trying to do aggregations and store them, you could probably look into Kafka Streams as an example. But if you’re trying to make decisions, send out notifications, and store the data in a database where you can query it and things like that, VoltDB may be able to help you. Essentially, look for opportunities where there is fat in your data architecture: layers that are there for one specific purpose and that then rely on other layers. Is there another technology, or another opportunity, where you could find a solution that brings those capabilities together? The fewer layers you have, the better latency you would be able to experience.
Q: How does VoltDB guarantee immediate consistency at scale?
A: This is one of our core differentiators compared to various other databases. VoltDB is built on the serializable isolation level, and in a highly available scenario we need consensus of all the replicas before the transaction is committed. Because of this, even if one of the replicas goes down, or the master goes down, the surviving replica would still be able to give you consistent data. The thing that you need to consider to make sure this kind of consistency works is idempotency of your transactions. No matter how many times a transaction is applied, the result should be the same, so you need to make sure that your transaction is not just purely incremental. Usually, when you’re incrementing or decrementing without any checking, you’re going to run into trouble. Checking your timestamp, and similar checks that you can take into consideration, will allow you to have an immediately consistent database with no cost (in terms of additional gymnastics) associated with it.
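To illustrate the idempotency point, here is a minimal sketch in plain Python. It is not VoltDB code or VoltDB’s API: the `Account` record, the `last_applied_ts` field, and both functions are hypothetical names made up for this example. It assumes every update carries a timestamp and that a replayed update carries the same timestamp as the original.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical record; last_applied_ts is an assumed bookkeeping field."""
    balance: int = 0
    last_applied_ts: int = -1  # timestamp of the newest update already applied


def apply_blind(account: Account, delta: int) -> None:
    """Purely incremental update: every replay changes the balance again."""
    account.balance += delta


def apply_idempotent(account: Account, delta: int, ts: int) -> bool:
    """Apply the update only if its timestamp is newer than the last one applied."""
    if ts <= account.last_applied_ts:
        return False  # duplicate or stale update: leave the record untouched
    account.balance += delta
    account.last_applied_ts = ts
    return True


def demo() -> tuple:
    """Deliver the same update twice (for example, a client retry)."""
    blind, guarded = Account(), Account()
    for _ in range(2):
        apply_blind(blind, 50)
        apply_idempotent(guarded, 50, ts=1000)
    return blind.balance, guarded.balance


print(demo())  # (100, 50): the blind version double-counts the retry
```

The timestamp check is only one way to make an update safe to replay; a unique request ID per update would serve the same purpose.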
Q: How does VoltDB enable mass user access with high concurrency?
A: VoltDB is built for fast transactions. Because the transactions happen in memory, at CPU-to-memory speed, they are usually satisfied within a few microseconds or milliseconds. And because VoltDB distributes the data not just across nodes but across CPU cores within each machine, you have a very high level of parallelism. Think of 32-core machines: if you have an application-driven MySQL sharding environment across 10 nodes, for example, then you have a parallelism factor of 10. But if you turn around and use those 10 nodes in a Volt cluster, and you have 32 cores, and you say 30 of them I’m going to be using for data processing, now you have 10 times 30, or 300, as your parallelism factor. Even if you take high availability into consideration, you still have a parallelism factor of 150. That’s a huge level of parallelism. Because of that, when we are sending transactions in and out, they go very, very fast. Airpush, for example, has a need for 125,000 concurrent application servers getting served from VoltDB. We were able to provide them with a seven-node cluster. So we can do it, and the way we do it is basically that we do the transactions so fast that these queues of work move very fast. I hope that helps. If you want more details about a certain element of it, please reach out to us.
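The arithmetic in this answer can be written out as a short Python sketch. The function and parameter names are hypothetical, and the halving for high availability assumes that each unit of data is kept in two copies, which is one way to arrive at the figure of 150 quoted above; the answer does not spell out the exact configuration.

```python
def parallelism_factor(nodes: int, cores_for_data: int, copies: int = 1) -> int:
    """Independent units of work, assuming one per data-processing core.

    `copies` is the assumed number of copies kept of each unit of data;
    replicas repeat the same work, so they divide the usable parallelism.
    """
    return (nodes * cores_for_data) // copies


# Node-level sharding: one unit of parallelism per node.
sharded_by_node = parallelism_factor(nodes=10, cores_for_data=1)

# Core-level partitioning: 30 of the 32 cores on each node process data.
per_core = parallelism_factor(nodes=10, cores_for_data=30)

# Same cluster with every unit of data stored twice for high availability.
per_core_ha = parallelism_factor(nodes=10, cores_for_data=30, copies=2)

print(sharded_by_node, per_core, per_core_ha)  # 10 300 150
```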