Hi, I’m Henning Diedrich, co-founder and CEO of Eonblast, Inc. I’m a guest contributor to VoltDB’s blog.
In February I was contacted by VoltDB about conducting a benchmarking project. The company had recently released an updated version of a Node.js client driver that had originally been authored by Jacob Wright, one of VoltDB’s community members. When I began looking into Node.js, it became clear that its architecture and scaling goals are quite well aligned with VoltDB’s, so I was intrigued by the idea of running a benchmark to see what the combined technologies could produce. Like all languages and libraries, Node.js is not a panacea for every kind of application, but the idea of matching it with VoltDB seemed very interesting. VoltDB was quite clear that they were not looking to do the benchmarking from inside the company; they wanted the tests to be designed and run by a third party who would bring an objective mind and good computer science. I felt I could do those things, so I accepted the assignment. In the spirit of full disclosure, VoltDB paid me to do the benchmarking and write up these results.
Choosing the Benchmark Platform – EC2
As part of the “objectivity” goal, I felt it would be important to choose a platform that would be familiar to many people, so I recommended running the benchmarks on EC2. It was pretty clear that EC2 would produce worse throughput numbers than running on bare metal, but using a “neutral” infrastructure was the best way to get objective numbers.
After fiddling around with a few different EC2 configurations, I ultimately settled on using Amazon’s m2.4xlarge instances for both the Node client side and the Volt database side of the tests. The main reason for choosing that system was that each machine comes with 8 virtual cores – a feature that’s beneficial to the scaling architectures of both Node.js and VoltDB. Other configuration details were:
Operating system: Ubuntu
Node.js version: 0.6.10
VoltDB Node.js driver version: 0.1.1 (6dcdcf5)
VoltDB DBMS version: 2.2
Client-side benchmark script: 0.71 – 0.74 (d5df513)
The Benchmark Application – Voter
Since the goal of the benchmark was to measure the scaling properties of Node.js itself and VoltDB’s Node.js driver, I wanted to use a test app that could “fire hose” a database. VoltDB has a pretty cool one called Voter, which simulates popular TV talent show applications like American Idol and Britain’s Got Talent: when it’s time to vote, millions of “virtual” voters hammer the infrastructure at the same time.
One of the things I really like about Voter is that each virtual vote actually consists of four SQL statements (three reads and one write). So Voter does a nice job of simulating real-life high-throughput OLTP apps. In addition to Voter, I also ran a few tests with simple Hello World workloads, which I discuss a bit in my long-form report. But the focus of my tests was on Voter.
Benchmark Results – the Short Story – 695k Transactions/Sec
As mentioned above, I ran tests with several different client- and server-side configurations because I wanted to observe how things scaled up. Although the goal of the benchmarks was to measure throughput and scaling of VoltDB’s updated Node.js (client) driver, I also needed to observe how the database (server) was handling incremental workloads coming from those Node clients. In my long-form report, I describe some of the things I observed at different points in my testing iterations.
The picture below tells the story of the largest configuration I tried. Here’s some additional information about what’s going on in the picture:
- The “big test” included eight Node.js client instances, each of which had eight virtual cores (64 total). I allocated one Node fork per core for the main test (so 64 total forks).
- The VoltDB database side used exactly the same machine configuration, except I used 12 machines on that side.
- VoltDB’s Node.js driver consistently delivered throughput of ~11k TPS per Node fork across all of my scaling scenarios. In the end, the “big test” configuration topped out at 695k TPS. It was ultimately the network that seemed to throttle overall throughput, so I probably could have gotten more from the Node clients with a better back-end configuration (e.g., using EC2 cluster instances). It’s also noteworthy that I spent no time trying to tune the VoltDB database for EC2, because my goal was to test client-side throughput and scaling, so it’s likely that I left quite a bit of database throughput “on the table” as well.
- In a few additional small tests, I overloaded the client-side machines with additional Node forks (i.e., forks > virtual EC2 cores) and observed ~20% aggregate throughput improvement before throttling the client driver.
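For readers who want to picture the one-fork-per-core layout described above, here is a minimal sketch using Node's built-in `cluster` and `os` modules. It is not the benchmark script itself: `forkCount`, `startForks`, and the `overload` factor are hypothetical names chosen for illustration, the worker body is left as a callback, and the syntax assumes a current Node.js release rather than the 0.6.10 used in the tests.

```javascript
const cluster = require('cluster');
const os = require('os');

// Hypothetical helper: how many forks to start on a machine with `cores` cores.
// overload = 1 means one fork per core (the main test); values above 1
// model the smaller "forks > cores" experiments.
function forkCount(cores, overload = 1) {
  return Math.max(1, Math.round(cores * overload));
}

// Hypothetical launcher: the primary process forks the workers, and each
// worker runs `work`, which stands in for the benchmark client loop.
function startForks(work, overload = 1) {
  // `isPrimary` is the current name; older Node releases expose `isMaster`.
  const isPrimary =
    cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
  if (isPrimary) {
    const n = forkCount(os.cpus().length, overload);
    for (let i = 0; i < n; i++) {
      cluster.fork(); // each fork re-runs this script as a worker process
    }
    return n;
  }
  work(cluster.worker.id);
  return 0;
}
```

Under these assumptions, calling `startForks(runClient)` on an eight-core machine would start eight worker processes, and eight such machines give the 64 forks used in the big test. `runClient` is a placeholder for whatever each fork should do.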
Summary Observations and Notes
Overall, I was very impressed with the throughput and scaling that can be achieved with VoltDB’s updated Node.js driver. A few other notes and comments are:
- On the database side, I configured VoltDB to run at K=0, which means the database was running with HA switched off. Most applications would not run that way in a production environment, but my goal was to test client-side throughput and scaling; I wasn’t really focused on the database side of the tests. If you wish to read a good independent analysis of VoltDB’s scaling properties at various HA levels, this one is very informative.
- It’s interesting that, during my early testing, VoltDB asked the Nodejitsu team and Felix Geisendörfer to code review the Node.js driver (Felix is the main author of the Node driver for MySQL, a recognized Node expert, and one of the founders of Transloadit). After VoltDB implemented a few of the recommended optimizations, I observed improvements in both the performance and stability of VoltDB’s Node.js driver.
- Although aggregate throughput in the “big test” topped out at 695k TPS, I’m very confident the numbers could have gone a lot higher by tuning the network configuration and, ultimately, adding more machines to each side of the testing infrastructure; both Node and Volt are very scalable. How big could it have gone? I think the benchmark could have easily passed 1M TPS on EC2. And running on bare metal, I could have done a lot better still.
- As I mentioned earlier, the Voter app actually performs four SQL operations per transaction. So, although aggregate throughput in my tests topped out at 695k TPS, that means VoltDB was actually executing 2.8M SQL ops/sec – on an untuned platform. That’s pretty impressive.
- There is a great deal of technical synergy between Node.js and VoltDB. Both are optimized to support asynchronous, event-driven applications. VoltDB delivers its best throughput and scale when driven asynchronously.
- VoltDB could easily be used as the unified state holder for Node.js threads across a parallel cluster – especially in cases where there is no need to communicate across Node threads. And VoltDB supports true ACID transactions that guarantee 100% data accuracy without falling back to vector clocks or eventual consistency. In my domain, online gaming, that’s a huge win.
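The arithmetic behind the headline figures is easy to check. The short sketch below uses only numbers quoted in this post (695k TPS, eight client machines with eight cores each, four SQL statements per vote); the variable names are my own.

```javascript
// Figures quoted in this post.
const totalTps = 695000;      // peak aggregate throughput of the "big test"
const clientMachines = 8;     // Node.js client instances
const coresPerMachine = 8;    // virtual cores per m2.4xlarge instance
const sqlPerTransaction = 4;  // Voter: three reads and one write per vote

const forks = clientMachines * coresPerMachine;     // one fork per core
const tpsPerFork = totalTps / forks;                // average per Node fork
const sqlOpsPerSec = totalTps * sqlPerTransaction;  // SQL statements per second

console.log(forks);                   // 64
console.log(Math.round(tpsPerFork));  // 10859
console.log(sqlOpsPerSec);            // 2780000
```

So the average works out to roughly 11k TPS per fork, and the 2.8M SQL ops/sec figure is 2.78M rounded up.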
In summary, I would say that VoltDB’s Node.js driver proved to be highly stable and performed very consistently at scale. I would also observe that VoltDB and Node.js appear to be highly complementary technologies that can be used to build a variety of high-throughput, transactional applications.
If you’re interested in learning more about the Node.js/VoltDB benchmark, or you’d like to have a go at running the benchmark yourself, you might find the following resources helpful:
- Andy Wilson (VoltDB Engineering) and I presented a webinar on the Node.js benchmark on May 8. You can find a link to the recorded version of our webinar on this page (you may need to scroll down a bit to find it).
- You can download the VoltDB database binaries (either the enterprise or community build), as well as the latest Node.js driver here. All VoltDB database distros include the Voter app.
- If you want to run the benchmark yourself, you can get set-up instructions and my test application here.
by Henning Diedrich