Apache Hadoop & Big Data

 

Apache Hadoop is an open source Big Data framework and ecosystem that enables distributed processing of large data sets across clusters of computers.

Using Hadoop and Big Data with VoltDB

VoltDB serves as a real-time application database alongside Hadoop. Applications such as real-time scoring, policy enforcement, and customer interaction combine VoltDB with analytical results derived from Hadoop and other big data sources. VoltDB can ingest data as fast as it arrives, perform in-memory analytics and make automated decisions in real time, and continuously export processed data into Hadoop.

[Figure: a Hadoop data pipeline with VoltDB]

VoltDB supports high-velocity export of processed data through VoltDB Export, a built-in, transactional feature that feeds processed data to HDFS/Hadoop. Application developers automate the export process by specifying tables in the schema as sources for export. At runtime, any data written to those tables is sent to an export connector, which moves the tuples to the export target safely and with the lowest possible latency. VoltDB provides connectors for export to files (CSV), to Hadoop via WebHDFS, through data serialization formats such as Avro, and to other relational databases via JDBC. Kafka connectors for VoltDB are also available.

VoltDB, the HTTP connector and WebHDFS

VoltDB’s connector to Hadoop receives serialized data from Export tables and writes it out to Hadoop via HTTP requests to WebHDFS.
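To make "HTTP requests to WebHDFS" concrete, here is a minimal sketch of how WebHDFS REST URLs are put together. It only builds the URLs and sends nothing over the network. The host name, port, and file path are illustrative assumptions, and the sketch does not claim to reproduce the exact sequence of calls that VoltDB's connector issues.

```python
from urllib.parse import quote, urlencode

# WebHDFS operations and the HTTP method each one uses in the REST API.
# CREATE and APPEND are both answered with a redirect to a datanode,
# to which the client then sends the actual file contents.
HTTP_METHOD = {"CREATE": "PUT", "APPEND": "POST"}


def webhdfs_url(host: str, port: int, path: str, op: str, **params: str) -> str:
    """Build a WebHDFS URL of the form http://host:port/webhdfs/v1/<path>?op=..."""
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1/{quote(path.lstrip('/'))}?{query}"


# Hypothetical namenode address and target file (assumptions for illustration).
HOST, PORT = "namenode.example.com", 50070
TARGET = "/voltdb/export/alerts.csv"

create_url = webhdfs_url(HOST, PORT, TARGET, "CREATE", overwrite="false")
append_url = webhdfs_url(HOST, PORT, TARGET, "APPEND")

print(HTTP_METHOD["CREATE"], create_url)
print(HTTP_METHOD["APPEND"], append_url)
```

The namenode HTTP port differs between Hadoop versions and deployments, so treat 50070 as a placeholder.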

The VoltDB HTTP connector is a general-purpose export utility that can export to any number of destinations, from simple messaging services to more complex REST APIs. The connector's configuration properties work together to create a consistent export process.

The HTTP connector contains optimizations to support exporting data to Hadoop via the WebHDFS protocol. Developers can choose between two formats for export data when using WebHDFS: comma-separated values (CSV) and Apache Avro. By default, data is written as CSV; developers can switch the output format to Avro by setting the type property. Avro is a data serialization system that includes a binary format used natively by Hadoop utilities such as Pig and Hive. Because it is a binary format, Avro data takes up less network bandwidth than text-based formats such as CSV.
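The bandwidth difference can be illustrated with a small, hand-rolled sketch of how Avro's binary encoding represents primitive values: longs as zig-zag variable-length integers and doubles as 8 little-endian bytes. This is not a full Avro writer (real Avro files also carry a schema header and block markers) and it is not VoltDB's implementation. The sample row and its field meanings are assumptions, and the actual savings depend on the data being exported.

```python
import struct


def encode_long(n: int) -> bytes:
    """Encode a 64-bit integer the way Avro's binary encoding does:
    zig-zag mapping, then a base-128 varint with the low 7 bits first."""
    z = (n << 1) ^ (n >> 63)  # zig-zag: small negatives become small unsigned values
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # set the continuation bit
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_double(x: float) -> bytes:
    """Avro doubles are 8 bytes, IEEE 754, little-endian."""
    return struct.pack("<d", x)


# Hypothetical numeric-heavy export row: (timestamp in ms, user id, score).
row = (1700000000000, 987654321, 0.8734921)

csv_bytes = (",".join(str(v) for v in row) + "\n").encode("utf-8")
binary_bytes = encode_long(row[0]) + encode_long(row[1]) + encode_double(row[2])

print("CSV bytes:   ", len(csv_bytes))
print("binary bytes:", len(binary_bytes))
```

Rows dominated by short strings gain far less, since Avro stores string bytes as-is with a length prefix.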

VoltDB with Hadoop and big data provides developers with a closed-loop system that delivers full visibility into an organization’s data, enriching vast incoming streams of event data with historical analytics to support business decisions.

VoltDB offers a broad set of Big Data ecosystem integrations, certifications, industry partnerships and connectors to enable high-speed data export to Hadoop-based data warehouses and long-term analytics stores such as HPE Vertica, Teradata, and IBM Netezza.

VoltDB Big Data integrations enable developers to take advantage of the speed and cyclical nature of the import-export data pipeline.

 


Hadoop Partners and Certifications:


Hortonworks

Hortonworks is a leading commercial vendor of Apache Hadoop, the open source platform for storing, managing and analyzing Big Data. The Hortonworks Data Platform distribution of Apache Hadoop provides an open and stable foundation for enterprises and a growing ecosystem to build and deploy Big Data solutions. VoltDB is a Certified Hortonworks partner.

Cloudera

Cloudera offers an enterprise-class implementation of Apache Hadoop. The company’s Cloudera Enterprise helps developers benefit from the experience of the open source and Big Data/Hadoop communities. Cloudera Enterprise includes CDH, the world’s most popular open source Hadoop-based platform, as well as advanced system management and data management tools. VoltDB is a Certified Cloudera partner.

MapR

MapR provides developers with an enterprise-grade Hadoop platform. MapR brings dependability, ease of use, and world-record speed to Hadoop, NoSQL, database, and streaming applications in one unified distribution for Hadoop. VoltDB is a MapR Advantage Technology partner.

Developer Resources:

There are numerous resources available to developers.

Get Connected:

  • Developer Central

    One centralized place with all developer resources.

  • A Look at a VoltDB Sample App

    In this blog post, John Hugg walks through a sample app in VoltDB.

  • How VoltDB Works

    A simple introduction to the VoltDB architecture.

  • Build a Sample App

    After downloading VoltDB, follow this tutorial on building a sample app.
