Strategy Q&A from the Humble Cloud Architect
Recently, we hosted a webinar entitled “The Humble Cloud Architect: What is Your Database Strategy?” We had a number of excellent questions during that session, including some that we did not have the time to answer during the webinar. These questions are reproduced and answered below. For those who missed the webinar, you may view it on demand.
Editor’s Note: Unless otherwise noted, the question is coming from a webinar attendee via our chat functionality during the session. Answers were provided by VoltDB’s Seeta Somagani. These answers have been edited for clarity and grammar.
Q: How do you prevent failure in a given layer? Assuming you cannot, what is the best approach?
A: Each layer has to provide its own guarantees or its own SLAs. Taking those into account, you would construct the failure-resolution heuristics, or failure-resistance mechanisms, for your entire application. If the storage layer fails while the processing layer and the query layer are still working, then what you compromise is a business decision. Would you say, take this decision [Unintelligible] the context? Maybe that’s a compromise you can make. Or would you have to stop the application and think, “Without the storage, without the context, I cannot process any more data”? Architecturally, there are many ways of achieving failure resilience, and different companies have contributed here. Netflix, with its open source (OSS) projects, has contributed a lot toward building microservices-based applications with circuit breakers, so that when one part of the application breaks, it doesn’t trigger failures in other parts of the application. The techniques are out there, but most importantly it is crucial to understand what your business is and what your business can compromise, and to make the decision based on that.
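To make the circuit-breaker idea concrete, here is a minimal sketch in Python. It is not Netflix’s implementation or any library’s API; the class name, the thresholds, and the failing read_context function are all hypothetical, chosen only to show how a caller can stop hammering a broken layer and fall back to a business-approved compromise.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    """Hypothetical breaker: opens after max_failures consecutive errors."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # cool-down in seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("dependency is failing; call skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit fully
        return result


def read_context(key):
    """Stand-in for a storage-layer call that is currently failing."""
    raise ConnectionError("storage layer is down")


breaker = CircuitBreaker(max_failures=2, reset_after=60.0)
outcomes = []
for _ in range(4):
    try:
        breaker.call(read_context, "card-123")
    except CircuitOpenError:
        outcomes.append("skipped")  # fall back to the agreed compromise here
    except ConnectionError:
        outcomes.append("failed")

print(outcomes)  # ['failed', 'failed', 'skipped', 'skipped']
```

After two real failures the breaker opens, and later calls are skipped without touching the storage layer. What the application does in the "skipped" branch is exactly the business decision described above.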
Q: How does VoltDB compare with Hadoop Spark, given that both work in memory?
A: VoltDB is an in-memory relational database, so your data is stored in row fashion in memory, and that’s the big difference. It is an actual database, so you have both storage and processing, and in the processing layer your data is processed event by event. Spark itself does not provide real-time processing; Spark Streaming does, but even Spark Streaming processes events in a micro-batch fashion. Consider applications where you need to account for every event: maybe it’s a control application, a fraud detection application, or a personalization application. If you need to respond on an event-by-event basis, then micro-batching is not going to meet the need. Spark is good at what it does. It is good at processing streams without much context, and at processing streams in micro-batches. VoltDB is a solid, horizontally scalable database that lets you process data as a stream.
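The practical difference between event-by-event and micro-batch processing is how long each event waits for an answer. The sketch below is a simplified timing model in Python, not Spark or VoltDB code: the function names, the arrival times, and the one-second batch interval are assumptions, and processing time is ignored.

```python
def per_event_response_times(arrivals_ms):
    """Each event is answered as soon as it arrives (processing time ignored)."""
    return list(arrivals_ms)


def micro_batch_response_times(arrivals_ms, interval_ms):
    """Each event is answered when the batch window containing it closes."""
    return [(t // interval_ms + 1) * interval_ms for t in arrivals_ms]


arrivals = [100, 400, 1200]  # hypothetical arrival times in milliseconds
immediate = per_event_response_times(arrivals)
batched = micro_batch_response_times(arrivals, interval_ms=1000)

# Extra time each event waits under micro-batching.
waits = [b - a for a, b in zip(arrivals, batched)]
print(waits)  # [900, 600, 800]
```

Under these assumptions every event waits for the end of its window before a decision can be returned, which is the gap that matters for fraud detection or control applications.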
Q: What are the ideal applications for VoltDB?
A: The ideal applications are those where you need to make critical decisions in real time and with context. Fraud detection is a great example. When credit card swipes are coming in thousands of times per second and you need to detect fraud and prevent it before the transaction actually happens, that’s the kind of critical decision that VoltDB is really good at. You can apply the same model, the same critical nature of the event, to other areas such as personalization. When you are playing a game, the choices that are put in front of the player are going to determine how much the player likes the game, so that could be a critical decision as well. Those are the critical decisions that VoltDB is built to make, but our customers use VoltDB for other reasons as well. They like being able to rely on a strongly consistent database, even in applications that might not require such high consistency, such as IoT applications or analytics applications for dashboards. The ideal applications are of course the ones with critical decisions, but VoltDB is put to use in many different verticals and many different use cases.
Q: How does VoltDB fit into a Lambda architecture?
A: In the Lambda architecture, data is streamed into two different paths: the fast path and the slower batch path. In the fast path, the data that is currently arriving is processed and decisions are made on that data. The Lambda architecture is basically a way to reconcile having to process big data with having to process fast data. VoltDB fits into the fast data part of the Lambda architecture: the data flows into Volt, Volt keeps the context of, say, an hour if that suits your business, runs the queries and the logic, and is also able to export the data out into the big data layer so that the slower and larger computations can be performed.
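The fast path described here can be sketched in a few lines of Python. This is an illustration of the pattern only, not VoltDB’s export mechanism: the FastPath class, the one-hour window, and the list standing in for the big data layer are all hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class FastPath:
    """Keeps a sliding window of recent events and forwards every event onward."""

    def __init__(self, window_seconds, export):
        self.window_seconds = window_seconds
        self.export = export   # callable that hands events to the batch layer
        self.recent = deque()  # (timestamp, key, value) tuples inside the window

    def ingest(self, timestamp, key, value):
        # Drop context that has aged out of the window.
        while self.recent and self.recent[0][0] <= timestamp - self.window_seconds:
            self.recent.popleft()
        self.recent.append((timestamp, key, value))
        # Every event also goes to the slower batch path.
        self.export((timestamp, key, value))
        # Decision input: this key's total within the current window.
        return sum(v for _, k, v in self.recent if k == key)


batch_store = []  # stand-in for the big data layer
fast = FastPath(window_seconds=3600, export=batch_store.append)

first = fast.ingest(0, "card-1", 10)
second = fast.ingest(1800, "card-1", 5)
third = fast.ingest(4000, "card-1", 1)  # the event at t=0 has aged out

print(first, second, third)  # 10 15 6
print(len(batch_store))      # 3
```

The fast path answers from a bounded window of context, while the batch store receives the full history for the larger computations.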
Q: What are the limitations of VoltDB with regard to capacity, complexity of queries, et cetera?
A: VoltDB is a transactional database. It is not highly suitable for analytical workloads, where you are looking to slice and dice the data, run analytical queries on it, and find correlations among different attributes. This kind of workload is more suitable for an analytical database or a data warehouse that sits behind Volt, something like Hadoop or Teradata, where you would crunch through humongous amounts of data and identify patterns. What Volt is good at is taking those patterns and applying them to streaming events in real time. Volt is good at transactional processing, not so much at analytical processing.
Q: What are the differences between Apache Ignite and VoltDB?
A: Apache Ignite is an in-memory data grid, and there are many differences between the category of in-memory databases and that of in-memory data grids. Data grids are primarily key-value stores, and being key-value stores, the transaction boundaries they provide are more in terms of a single record, so you cannot perform a transaction that, let’s say, withdraws money from one account and deposits it into another. Those are the kind of financial or business transactions that you need actual database transactions for, and in-memory data grids are not particularly good at providing them. I believe Ignite does give you the option of providing a separate storage layer or using its own, which might be beneficial for some applications, but at the same time that also means there are separate parts that are not tightly bound together into one product, and you might see latency coming into effect. Data grids are good for applications where you’re tackling problems with scaling reads or scaling ingestion, and where consistency and the maintenance of everything as a whole is not a big concern.
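The withdraw-and-deposit example is the classic case for a multi-record transaction: both records change, or neither does. The Python sketch below illustrates that all-or-nothing property with an in-process toy; the Bank class and its staging approach are hypothetical and do not represent how VoltDB or any data grid implements transactions.

```python
import threading


class InsufficientFunds(Exception):
    """Raised when the source account cannot cover the transfer."""


class Bank:
    """Toy store where a transfer touches two records atomically."""

    def __init__(self, balances):
        self._balances = dict(balances)
        self._lock = threading.Lock()

    def balance(self, account):
        return self._balances[account]

    def transfer(self, source, target, amount):
        with self._lock:                   # one transfer at a time (isolation)
            staged = dict(self._balances)  # work on a private copy
            staged[source] -= amount
            if staged[source] < 0:
                raise InsufficientFunds(source)
            staged[target] += amount       # KeyError if the target is unknown
            self._balances = staged        # commit both changes together


bank = Bank({"alice": 100, "bob": 20})
bank.transfer("alice", "bob", 30)

try:
    bank.transfer("alice", "carol", 10)  # fails after the staged withdrawal
except KeyError:
    pass

print(bank.balance("alice"), bank.balance("bob"))  # 70 50
```

The failed transfer leaves alice at 70 rather than 60, because the withdrawal was never committed on its own. With only single-record operations, the application would have to detect and repair that half-finished state itself.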
Q: Is VoltDB suitable for analytics or just transactional workloads?
A: VoltDB is primarily built for OLTP applications, but in Volt you can also perform real-time analytics. Using materialized views, VoltDB can keep a picture of what is happening so you can see the trends and insights. So while VoltDB is built to make decisions on absolutely accurate data, you can also use it to run some analytics that drive your decisions in real time.
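The idea behind a materialized view is that the summary is updated as each row arrives, so reading it does not require scanning the table. The following Python sketch shows that idea in general terms; the EventTable class and its region/amount columns are hypothetical, and this is not VoltDB’s view syntax or implementation.

```python
from collections import defaultdict


class EventTable:
    """Toy table that maintains a per-region summary on every insert."""

    def __init__(self):
        self.rows = []
        # The "view": one running count and total per region.
        self.view = defaultdict(lambda: {"count": 0, "total": 0})

    def insert(self, region, amount):
        self.rows.append((region, amount))
        summary = self.view[region]
        summary["count"] += 1
        summary["total"] += amount


table = EventTable()
table.insert("east", 10)
table.insert("east", 5)
table.insert("west", 7)

print(table.view["east"])  # {'count': 2, 'total': 15}
print(table.view["west"])  # {'count': 1, 'total': 7}
```

Each insert does a small, constant amount of extra work, and a dashboard can read the summary at any moment without running an aggregate query over all rows.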
Q: What about having data of a table partitioned over several VoltDB databases?
A: VoltDB will manage your data as one single database. As I understand it, this might be the approach you take when using Redis, where you create a new instance for each partition. But VoltDB manages that for you. VoltDB is a clustered database; you specify the column that you’d like to partition the table by, and VoltDB will manage the partitions across the cluster nodes.
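Partitioning by a column generally means that the column’s value decides which partition holds the row. The Python sketch below shows one common way to do that with a stable hash. The source does not describe VoltDB’s actual hashing scheme, so the CRC32 choice, the function names, and the sample rows are assumptions for illustration only.

```python
import zlib


def partition_for(value, partition_count):
    """Map a partitioning-column value to a partition using a stable hash."""
    return zlib.crc32(str(value).encode("utf-8")) % partition_count


def route(rows, column, partition_count):
    """Distribute rows across partitions by the value of one column."""
    partitions = [[] for _ in range(partition_count)]
    for row in rows:
        partitions[partition_for(row[column], partition_count)].append(row)
    return partitions


rows = [
    {"account": "A-1", "amount": 10},
    {"account": "B-2", "amount": 25},
    {"account": "A-1", "amount": 40},
]
partitions = route(rows, column="account", partition_count=4)

# Rows that share a partitioning value always land in the same partition.
same = partition_for("A-1", 4) == partition_for("A-1", 4)
print(same)  # True
```

Because all rows for one account live together, work that touches a single account can run against one partition, and the application never has to track which node holds which rows.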
Q: How is security of the transmission and storage of database info handled?
A: We have SSL support out of the box for clients, servers, and messages between servers.
What questions do you have for our humble cloud architect? Send them in here and we’ll get right back to you. Or, watch a recording of the webinar on demand.