Volt Active Data and Amazon EBS – Optimal use of IO devices

April 03, 2017

Volt Active Data is now available ‘by the hour’ on Amazon EC2. To get the best results and optimal TCO from your Volt Active Data cloud instances, you need to understand how Volt Active Data handles IO, which we’ll discuss below. At the close of this post, we’ll also review the available IO options.

Why would an in-memory database like Volt Active Data need IO?

Even in a replicated, highly-available configuration, you will eventually need to restart the database, either because of planned maintenance or because of an incident at your hosting provider. This means you’ll need something to restore it from, which means disk and hence IO.

A legacy RDBMS is architected on the assumption that all data lives on disk and only a small subset is in memory. The design compromises that follow from this cause all sorts of side effects, one of which is that IO can be both verbose and very tightly coupled to your transactions. Volt Active Data and other in-memory databases, on the other hand, take a different architectural approach, keeping all data in memory. Because all data is in memory, it needs to be flushed to disk every now and then as part of the database durability strategy.

Volt Active Data and IO

As we implied above, Volt Active Data’s approach to IO is different from that of a legacy RDBMS. In a conventional configuration, a Volt Active Data database will have multiple streams of serial command-log IO (one for each core), with changes appended to files in micro-batches. As a consequence, the number of IOPS needed does not increase linearly with the number of transactions and is instead tied to the physical space needed to store the transactions. Volt Active Data also uses serial IO to flush database snapshots to disk, and thus is fundamentally thriftier in its IO expectations than legacy databases.

Traditional IO

When using your own dedicated hardware there are two kinds of IO:

  • Random IO occurs when a series of writes is issued to different locations on the file system. In traditional databases, random IO is used to update data files so that they reflect changes that have happened in RAM. This kind of IO is generally measured in IOPS (input/output operations per second). A generic ‘spinning rust’ drive can do about 160 random IOPS. Many database products that rely on random IO have significant IOPS requirements that may not be easily satisfied in a cloud environment. Volt Active Data does almost no random IO.
  • Serial IO is used when the database is appending to an existing file, with a database journal being a classic example. IOPS work differently here: the key word in the definition above is random. With a physical drive, the actual execution cost is how long it takes to move the drive head into position, not how long it takes to transfer the data. If you are writing many blocks that are stored on the same cylinder on the disk, you can often see a 10-15x improvement in observed throughput, provided nobody else is using the drive and the underlying redundancy scheme doesn’t fragment access (as RAID 5 does). As a consequence, in a traditional DB a pair of mirrored drives, properly configured, can outperform a lot of entry-level SANs for serial IO, especially if every single transaction generates an IOP. If using Volt Active Data on conventional disks, you’ll get the best results if you use separate disks for command logs and snapshots, as this avoids ‘random’ IO.

Deploying Volt Active Data in AWS

IO works differently in AWS than in the ‘traditional world’ described above. As database professionals, we are used to working with dedicated drives or a SAN. In both cases, we are encouraged to think in terms of IOPS. AWS uses Elastic Block Store (EBS) to provide disk volumes to servers. It provides four different types of volume, and for three of them the number of IOPS you get is defined by the size of the volume as well as the volume type. This has interesting implications for deployment. A further complication is the concept of ‘Burst Balance’. In EBS there is a fundamental difference between the IOPS a volume will support in a ten-minute benchmark and what it can support under sustained load.

Burst Balance

“Burst Balance” allows you to get much higher IO throughput for a finite period than your EBS device would normally support. The diagram below shows this for the ‘sc1’ volume type: a 1 TiB volume will support 80 MiB/sec until Burst Balance reaches zero, at which point IO will be constrained to the base throughput of 12 MiB/sec. The way to increase the sustained IO capacity is to make the volume bigger, even if you don’t need the space. Thus, if we need to sustain roughly 100 MiB/sec, we can get that by creating an 8 TiB volume (8 × 12 = 96 MiB/sec). The side effect is that we need to pay for the extra capacity.

[Figure: Burst vs Base Throughput]
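The sizing arithmetic above can be sketched in a few lines of shell. This is a rough illustration only, assuming the sc1 figures quoted in this post (12 MiB/sec base and 80 MiB/sec burst per TiB, with burst topping out at the 250 MiB/sec shown for a 4 TiB volume later on). The function name sc1_throughput is ours, and current AWS limits may differ.

```shell
#!/bin/sh
# Rough sc1 throughput estimate for a volume of a given size (whole TiB).
# Assumed figures, taken from this post: 12 MiB/sec base and 80 MiB/sec
# burst per TiB, with burst capped at 250 MiB/sec.
sc1_throughput() {
    size_tib=$1
    base=$(( size_tib * 12 ))
    burst=$(( size_tib * 80 ))
    if [ "$burst" -gt 250 ]; then
        burst=250
    fi
    # Output: "<base MiB/sec> <burst MiB/sec>"
    echo "$base $burst"
}

sc1_throughput 1   # prints "12 80"
sc1_throughput 8   # prints "96 250"
```

The gap between the two numbers is what Burst Balance pays for: once the balance is spent, only the first figure is available.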

Burst Balance is available as a volume-level statistic. We strongly recommend creating a CloudWatch alarm to make sure it doesn’t reach zero, as this could have calamitous effects on throughput. Burst Balance must also be measured during benchmarks to make sure it is not degrading in an unhelpful manner. To complicate matters:

  1. Burst Balance stats are only generated while the volume is actually being used, not merely because it is mounted.
  2. Burst Balance stats take up to 10 minutes to show up in CloudWatch.
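As a minimal sketch of the alarm recommended above, the function below wraps `aws cloudwatch put-metric-alarm`. The function name, alarm name, threshold and SNS topic ARN are illustrative assumptions, not values from our deployments; pick a threshold that gives you time to react before the balance reaches zero.

```shell
#!/bin/sh
# Create a CloudWatch alarm that fires when a volume's BurstBalance
# (percent remaining) drops below a threshold.
# The alarm name and arguments are illustrative assumptions.
# Set AWS_CLI=echo to preview the command instead of running it.
create_burst_alarm() {
    volume_id=$1
    threshold=$2        # percent of burst balance remaining
    sns_topic_arn=$3    # where the alarm notification is sent
    "${AWS_CLI:-aws}" cloudwatch put-metric-alarm \
        --alarm-name "burst-balance-${volume_id}" \
        --namespace AWS/EBS --metric-name BurstBalance \
        --dimensions "Name=VolumeId,Value=${volume_id}" \
        --statistic Average --period 300 --evaluation-periods 1 \
        --threshold "$threshold" \
        --comparison-operator LessThanThreshold \
        --alarm-actions "$sns_topic_arn"
}
```

For example, `AWS_CLI=echo create_burst_alarm vol-0123456789abcdef0 20 arn:aws:sns:us-east-1:123456789012:ops-alerts` prints the command that would be run, using a made-up volume ID and topic.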

We use a script like this to measure it:

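# Record the benchmark start time (UTC, which is what CloudWatch expects)
STDATE=`date -u '+%Y-%m-%d'`
STDATEMIN=`date -u '+%H:%M'`

# FNAME is the benchmark log file and must be set before this point
./runCluster.sh async-benchmark | tee -a "$FNAME"

# Record the benchmark end time
EDDATE=`date -u '+%Y-%m-%d'`
EDDATEMIN=`date -u '+%H:%M'`

# v.txt holds the ID of the EBS volume under test
VOLUME_ID=`cat v.txt`

# Give the Burst Balance stats time to show up in CloudWatch
sleep 900

aws cloudwatch get-metric-statistics --metric-name BurstBalance \
--namespace AWS/EBS --period 60 \
--start-time ${STDATE}T${STDATEMIN}:00Z \
--end-time ${EDDATE}T${EDDATEMIN}:59Z \
--statistics Average --dimensions Name=VolumeId,Value=${VOLUME_ID}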
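The number that matters after a benchmark is the lowest Burst Balance the volume reached. As a small sketch, the helper below (min_value is our own name, not part of any tool) picks the smallest value from a list of datapoints. It assumes you add the AWS CLI's global `--query 'Datapoints[].Average' --output text` options to the call above so that the averages arrive as plain whitespace-separated numbers.

```shell
#!/bin/sh
# Print the smallest number found on stdin (whitespace-separated values).
min_value() {
    awk '{ for (i = 1; i <= NF; i++)
               if (!seen || $i + 0 < min) { min = $i + 0; seen = 1 } }
         END { if (seen) print min }'
}

# Example with made-up datapoints (percent of burst balance remaining):
printf '100\t97.5\t91.2\t93\n' | min_value   # prints 91.2
```

If the result is heading towards zero over the length of a run, the volume will not sustain that workload.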

AWS Volume Types

sc1 and st1 are conventional ‘spinning rust’ disks; gp2 and io1 are solid state drives. With gp2, the number of IOPS is determined by the size of the volume, reaching a ceiling of 10,000 for large volumes such as the 4 TiB example below; io1 allows you to pick the number of IOPS you need. To make things more complicated, AWS has a specific definition of an IOP: changes to blocks that are ‘next’ to each other are merged, so a set of 8 contiguous 32K changes will be merged into a single 256K ‘IOP’. Consequently, it’s hard to make clear cost predictions for io1, as we need to benchmark our application to establish average IOPS.
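To make the merging rule concrete, here is a back-of-the-envelope sketch. It assumes, as described above, that contiguous changes are merged into operations of up to 256K, and that every write in the run is contiguous with the previous one; the function name merged_iops is ours. A real workload will rarely be this tidy, which is why benchmarking is still needed.

```shell
#!/bin/sh
# Estimate how many 'IOPs' a run of contiguous writes is counted as,
# assuming adjacent changes are merged into operations of up to 256K.
merged_iops() {
    count=$1      # number of contiguous writes
    size_kib=$2   # size of each write in KiB
    total=$(( count * size_kib ))
    # Round up to a whole number of 256K operations
    echo $(( (total + 255) / 256 ))
}

merged_iops 8 32     # prints 1
merged_iops 100 32   # prints 13
```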

Volume types, with official use cases and the cost of a 4 TiB volume per month (2/2017):

sc1: Lowest cost HDD volume
  Official use cases:
  • Throughput-oriented storage for large volumes of data that is infrequently accessed
  • Scenarios where the lowest storage cost is important
  Cost: US$100
  Throughput: 48 MiB/sec sustained, 250 MiB/sec burst

st1: Throughput Optimized HDD
  Official use cases:
  • Streaming workloads requiring consistent, fast throughput at a low price
  • Big data
  • Data warehouses
  • Log processing
  Cost: US$180
  Throughput: 160 MiB/sec sustained, 500 MiB/sec burst

gp2: Default SSD, used in EC2 root volumes
  Official use cases:
  • Recommended for most workloads
  • System boot volumes
  • Virtual desktops
  • Low-latency interactive apps
  • Development and test environments
  Cost: US$397
  Throughput: 160 MiB/sec sustained or burst, no burst limit
  IOPS: 10,000 fixed

io1: Fancy SSD (Provisioned IOPS)
  Official use cases:
  • Critical business applications that require sustained IOPS performance, or more than 10,000 IOPS or 160 MiB/s of throughput per volume
  • Large database workloads, such as:
    • MongoDB
    • Cassandra
    • Microsoft SQL Server
    • MySQL
    • PostgreSQL
    • Oracle
  Cost: US$1,265
  Throughput: 320 MiB/sec sustained or burst
  IOPS: 10,000 fixed
  Using 1,000 IOPS would bring the bill and capability down to US$565, 250 MiB/sec and 1,000 IOPS.
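One way to compare the options is cost per MiB/sec of sustained throughput. The sketch below simply divides the monthly cost of a 4 TiB volume by its sustained throughput, using the February 2017 figures listed above; the function name cost_per_mib is ours, and current AWS pricing will differ.

```shell
#!/bin/sh
# Monthly cost per MiB/sec of sustained throughput for a 4 TiB volume,
# using the February 2017 figures quoted in this post.
cost_per_mib() {
    awk -v cost="$1" -v mibs="$2" 'BEGIN { printf "%.2f\n", cost / mibs }'
}

# Each row: volume type, monthly cost in US$, sustained MiB/sec
for row in "sc1 100 48" "st1 180 160" "gp2 397 160" "io1 1265 320"; do
    set -- $row
    echo "$1: US\$$(cost_per_mib "$2" "$3") per MiB/sec"
done
```

On these figures st1 is the cheapest way to buy sustained throughput, while sc1 is the cheapest way to buy capacity.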

Recommendations for deploying Volt Active Data in EBS include:

  • Always put Volt Active Data data files on a separate volume, so that you can easily back up and move the data if you need to in the future.
  • For m4.xlarge and m4.2xlarge, a 4 TiB sc1 volume ought to be enough. We’ve struggled to saturate any of these instances in tests designed to generate as much IO as possible.
  • For larger instances, you’ll probably need to use st1 and pay attention to Burst Balance.

To try it for yourself, visit the AWS Marketplace.
