eBook: Fast Data Front Ends for Hadoop
Benefit from Fast Data
Building streaming data applications that can manage the massive quantities of data generated by mobile devices, M2M, sensors, and other IoT devices is a big challenge many organizations face today.
Traditional tools, such as conventional database systems, do not have the capacity to ingest fast data, analyze it in real time, and make decisions. New technologies, such as Apache Spark and Apache Storm, are gaining interest as possible solutions for handling fast data streams. However, only solutions such as VoltDB provide streaming analytics with full Atomicity, Consistency, Isolation, and Durability (ACID) support.
Employing a solution such as VoltDB, which handles streaming data, provides state, ensures durability, and supports transactions and real-time decisions, is key to benefitting from fast (and big) data.
Data ingestion is a pressing problem for any large-scale system. Several architecture options are available for cleaning and preprocessing data for efficient and fast storage. In this report, we will discuss the advantages and disadvantages of various fast data front ends for Hadoop.