In the newest version of VoltDB, we’ve added a new client API to make calling a single-partition procedure at every partition easier. This blog post explains what it does, why we did it, and how to use it.
Partitioned and Multi-partition transactions
The current version of VoltDB offers two primary kinds of user procedures, single-partition and multi-partition. Single-partition transactions read and write data that lives on a single partition transactionally. Multi-partition transactions (or multi-partition) read and write data across multiple partitions or the entire cluster.
Because multi-partition transactions require more coordination than single-partition transactions, they support less throughput. While your mileage may vary, smallish VoltDB clusters can often support millions of read or write, single-partition transactions per second, each executing multiple SQL statements. Meanwhile most clusters can support tens of thousands of multi-partition reads, and hundreds to thousands of multi-partition writes. Naturally, these numbers assume each transaction is reading or writing a handful of keys, not sorting a table with millions of rows, but they offer some understanding of the scale we are talking about.
This is why we encourage VoltDB application developers to partition their workloads so their highest-volume operations are partitioned. We don’t like to use a rule such as 90% / 10%, because if application developers only have 10k ops/sec, then they could conceivably use mostly multi-partition transactions, but if they have millions of ops/sec, the application may need to be architected for more than 99% partitioned work.
The difference in performance between multi-partition transactions and single-partition transactions is large enough that people have occasionally turned to something we call “all-partition procedures.”
Rather than use a multi-partition transaction, users have figured out ways to call a single-partition transaction for each partition in the system. Originally users would create a single-partition procedure with a “dummy” partitioning parameter, and then call the procedure with the values 1 .. N, where N is the number of partitions. Our old (before 4.x) modulo hashing algorithm meant this would target every partition reliably. This approach does not provide transactional semantics across partitions, and it’s possible for some of these procedures to fail while others succeed.
In 4.x, we acknowledged this use. When we changed the hashing algorithm to a consistent hash ring, we added a system procedure named “@GetPartitionKeys”. A call to “@GetPartitionKeys” would return a set of values a developer could use to partition a procedure call to each partition.
Why do this?
Historically there have been a number of reasons to forsake multi-partition consistency and use the all-partition pattern. Developers initially used them for multi-partition reads, then they would aggregate the data themselves on the client-side. In VoltDB 4.x, we made most multi-partition reads as fast as the all-partition pattern reads, and eliminated the need to do any extra work.
The all-partition pattern also doesn’t make sense for debit-credit transactions across partitions, as it forgoes transactions in a use-case that really needs them.
The primary use case remaining is deleting old data. Many VoltDB applications are continuously inserting data, and need to also continuously age-out data. If you want to limit a table to a certain number of rows, that can now be done in the schema using LIMIT PARTITION ROWS. But if you want to age out data by date, or by some other classification, then you may want to periodically call a procedure to delete data. This is where the all-partition pattern is still the best option.
Here’s some pseudocode for a typical all-partition procedure to delete data that’s more than 30 days old:
- Query to find the newest row that has expired.
If no expired rows, return.
- Delete a batch of rows (maybe 1000, maybe 100, depends on size and other tradeoffs).
Use rank-supporting indexes to make limit + offset instant.
- Return the number of rows deleted.
Then the client just calls the delete procedure on every partition periodically. If the procedure is constantly deleting the max batch size number of rows, then run it more often. If it’s often deleting zero rows, run it a bit less often. Because it’s deleting small batches, and it never blocks execution of other procedures, this is a really unobtrusive way to gracefully age out data.
We’ve written about aging data out of VoltDB before in a previous blog post. Many of the concepts still apply, although some of the specific techniques have been simplified using the new VoltDB v6.7 features described below.
What’s new in v6.7
Until we can make multi-partition procedures fast enough that there are no more use cases for all-partition procedures (something we’re always working on), we’re going to continue suggesting all-partition procedures as a useful pattern.
Given that reality, all-partition procedures don’t meet our standards for ease of use. You can read a guide for how to use this pattern without the v6.7 enhancements.
This is similar to the standard, synchronous procedure call API, but returns an array of pairs, rather than a single response object. Calling this method will invoke a procedure for each partition to the server, block until all invocations have returned or timed out, and return a bundle of the collected response objects. Along with each response object, the APIs provide the partitioning parameter used to target the partition in question. This should be helpful for testing or resending failed requests. Note that each invocation can fail or succeed individually, and it requires a simple for-loop in order to verify all invocations were successful.
There is also an accompanying asynchronous version:
Note this version will only trigger the provided callback once all of the invocations have returned or timed out.
There are a few restrictions with these new APIs. The procedure you call must be partitioned, and it must partition on its first parameter. If you try to invoke an ineligible procedure with this API, you will get an error message.
If you would like to see these new APIs in action, we have updated our example application, “Windowing,” in the examples directory of our kit. This example has a background process that continuously deletes data that is older than needed for the app. The original example app used the all-partition pattern manually, using @GetPartitionKeys and making many invocations, but now it just makes a single call to delete old data across a cluster. It works much as described above in the “Why do this?” section.
We will continue to work to improve multi-partition performance, in the hope that we can obsolete the all-partition pattern. For now, we believe these two methods make it much easier to use the all-partition procedure pattern, without breaking existing apps, or complicating transaction management on the server itself.
We also have plans to make it easier to use this feature from non-Java clients. If you’re using the all-partition pattern in an interesting way or are waiting for these improvements for a specific client, please let us know. Understanding how people use and value VoltDB is the best way to improve our roadmap planning.
by John Hugg
John is a founding engineer at VoltDB. Now at his third database and second distributed systems startup, he currently splits his time between making VoltDB better and telling people about it.