Estimating your Amazon EBS Snapshots Cost - Part 1: Block Level Incremental Snapshots
When you use Amazon EBS snapshots as your EC2 backup solution, you have to think about things such as automating the snapshots. To rely on EBS snapshots for backup, however, you will also want to be able to estimate the cost of storing them. Currently, the exact size of EBS snapshots is not available. In part 1 of this two-part series, we will try to better understand how EBS snapshots work.
AWS documentation specifies that EBS snapshots are incremental. But what does that mean exactly? It means that the first snapshot you take of a volume is a full snapshot, containing all the data on the volume. Every subsequent snapshot stores only the changes that were made since the previous one. This is a very efficient method: nothing needs to be calculated when the snapshot starts, and only the minimum required data is copied. The AWS backup mechanism needs only to track the changes that were made to the EBS volume between snapshots (probably using a bitmap).
The AWS “incremental” snapshot backup is actually incremental at the block level. The changes monitored are not changes in files, but rather changes at the disk level. The disk is divided into “blocks”, and every modified block (a block in which a write happened) is marked to be copied at the next snapshot. Even if only part of a block has been changed, all of it will be copied. It is not so important to know which block size AWS uses, though it is probably somewhere between 64KB and 4MB (like most block-level snapshot solutions). Smaller blocks provide a more “accurate” backup, because only the minimal changes are copied. If the blocks are small, however, the metadata needed to manage them increases: there are more blocks per volume and the bitmaps are bigger, so there’s a trade-off. With large amounts of data, larger blocks are usually preferred.
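The trade-off can be made concrete with a little arithmetic. The following is a minimal sketch, not a description of how AWS actually implements it: it assumes one tracking bit per block, reads "64KB" and "4MB" as binary units, and uses the hypothetical helper names bitmap_bytes and copied_bytes.

```python
KIB = 1024
MIB = 1024 * KIB
TIB = 1024 * 1024 * MIB


def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def bitmap_bytes(volume_bytes, block_bytes):
    """Size in bytes of a change bitmap with one bit per block (assumption)."""
    blocks = ceil_div(volume_bytes, block_bytes)
    return ceil_div(blocks, 8)


def copied_bytes(write_offset, write_length, block_bytes):
    """Bytes copied at the next snapshot because of a single write.

    Every block the write touches is copied in full, even if only
    part of it changed.
    """
    first = write_offset // block_bytes
    last = (write_offset + write_length - 1) // block_bytes
    return (last - first + 1) * block_bytes


for block in (64 * KIB, 4 * MIB):
    print(
        f"block={block // KIB} KiB",
        f"bitmap for 1 TiB={bitmap_bytes(TIB, block)} bytes",
        f"copied for one 4 KiB write={copied_bytes(0, 4 * KIB, block)} bytes",
    )
```

Under these assumptions, a 1 TiB volume needs a 2 MiB bitmap with 64 KiB blocks but only 32 KiB with 4 MiB blocks, while a single 4 KiB write causes 64 KiB or 4 MiB, respectively, to be copied at the next snapshot.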
The first full EBS snapshot is content-aware, which means that only the blocks that actually contain data are copied. If you have a 1TB EBS volume that is only half full, then the snapshot will include only that half, and that is what you’ll pay for. How does the AWS snapshot mechanism know whether a volume’s blocks actually contain data? The answer is quite simple and is related to the bitmap we mentioned earlier. AWS’s snapshot mechanism keeps a bit (it doesn’t have to be exactly one bit, but a bit in concept) for each block on the volume. Every time there’s a write request on an EBS volume, the corresponding bit or bits in the bitmap are turned on. Changes are tracked from the “birth” of the EBS volume, so every block that has ever been written on the volume will be included in the first full snapshot. After each snapshot is taken, a new bitmap is created, which tracks all the writes made since that snapshot’s point in time.
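The bitmap mechanism described above can be sketched in a few lines of Python. This is only an illustration of the concept, since the actual AWS implementation is not public: the ChangeTracker class, its method names, and the tiny 16-block volume are all hypothetical.

```python
MIB = 1024 * 1024


class ChangeTracker:
    """Tracks written blocks with one bit per block (conceptual model)."""

    def __init__(self, volume_bytes, block_bytes):
        self.block_bytes = block_bytes
        self.num_blocks = -(-volume_bytes // block_bytes)  # round up
        self.bitmap = bytearray(-(-self.num_blocks // 8))  # all bits off

    def write(self, offset, length):
        """Turn on the bit of every block touched by this write."""
        if length <= 0:
            return
        first = offset // self.block_bytes
        last = (offset + length - 1) // self.block_bytes
        for block in range(first, last + 1):
            self.bitmap[block // 8] |= 1 << (block % 8)

    def snapshot(self):
        """Return the blocks to copy, then start a fresh bitmap."""
        dirty = [
            block
            for block in range(self.num_blocks)
            if self.bitmap[block // 8] & (1 << (block % 8))
        ]
        self.bitmap = bytearray(len(self.bitmap))
        return dirty


# A 16 MiB volume with 1 MiB blocks, half filled since its "birth".
tracker = ChangeTracker(16 * MIB, 1 * MIB)
tracker.write(0, 8 * MIB)
full = tracker.snapshot()         # blocks 0..7: only the written half

tracker.write(3 * MIB + 10, 100)  # small update inside block 3
tracker.write(9 * MIB, 2 * MIB)   # new data in blocks 9 and 10
incremental = tracker.snapshot()  # [3, 9, 10]

print(len(full), "blocks in the full snapshot")
print(incremental, "copied by the next snapshot")
```

The first snapshot copies 8 of the 16 blocks because the other half was never written, and the second copies only the three blocks touched since then. A third snapshot with no writes in between would copy nothing.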
In the following diagram we can see the changes between snapshots. The first one is the full snapshot. The blue area represents data on the volume and is what will be copied to the snapshot. Subsequent snapshots will only copy the green areas, which are the areas that were modified. The green areas can represent newly added data or existing data that has been updated or changed.
In part 2, I will delve deeper into how to assess the size of snapshots and, from that, determine their cost.
About the Author
Uri Wolloch is the founder & CTO of N2W Software. He has over 15 years of software development experience working at various companies in different roles. In the past 10 years, Uri’s professional focus has been on IT infrastructure software and storage. Uri has worked as a software architect at IBM Tivoli focused on data protection software in physical and virtual environments. In 2011, he founded N2W Software, a company providing IT solutions for cloud environments. N2W Software’s new solution, Cloud Protection Manager (CPM), is a comprehensive backup and recovery solution for Amazon EC2, fit for enterprise organizations as well as smaller companies.
Cloud Protection Manager (CPM) allows EC2 users to use EBS snapshots as their means of backup and recovery for EC2 instances, and provides a comprehensive backup & recovery solution that even an enterprise can use.