Capacity, Consumption and Utilization

Mistake #6: Taking too Few or No EBS Snapshots

One of the handiest features of AWS is EBS Snapshots – the ability to create virtual copies of EBS Volumes at a specific point in time. Snapshots are a convenient way to back up EBS Volumes, and they are also efficient: only the data blocks that have changed in the volume since your last snapshot are saved. It is surprising that, with volume backups this easy, we still find so many volumes with too few snapshots or none at all.
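To make the "only changed blocks are saved" idea concrete, here is a toy model in Python. It is a sketch of the concept only, not of how EBS works internally; the `Blocks` type and the `incremental_snapshot` function are hypothetical names invented for this illustration.

```python
from typing import Dict

# Toy model: a volume is a mapping of block index -> block contents.
Blocks = Dict[int, bytes]


def incremental_snapshot(volume: Blocks, last_snapshot_state: Blocks) -> Blocks:
    """Return only the blocks that differ from the state captured last time."""
    return {
        index: data
        for index, data in volume.items()
        if last_snapshot_state.get(index) != data
    }


# First snapshot: nothing was captured before, so every block is stored.
volume = {0: b"boot", 1: b"data-v1", 2: b"logs-v1"}
first = incremental_snapshot(volume, {})

# One block changes; the second snapshot stores just that block.
volume_later = {**volume, 1: b"data-v2"}
second = incremental_snapshot(volume_later, volume)

print(len(first), len(second))
```

In this model the second snapshot holds a single block, which is why frequent snapshots of a slowly changing volume tend to add little storage.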

Mistake #7: Taking too Many EBS Volume Snapshots

In the previous post of this series (Mistake #6 - Taking too Few or No EBS Snapshots) we pointed out how AWS users fail to leverage EBS Snapshots in their backup process. At the other extreme of snapshot usage lies another mistake. Easy as it is, EBS snapshot creation, when done without moderation, quickly leads to sprawl. Too many snapshots add complexity to your backup process. Even though you’re only charged for the differential data, snapshot sprawl can still increase your aggregate storage costs, especially in combination with another common mistake – forgetting to clean up stale resources.
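One common way to keep sprawl in check is a retention rule. The sketch below is an assumption for illustration, not a policy taken from this series: it keeps the most recent few snapshots of each volume and lists older ones past a maximum age as pruning candidates. The `Snapshot` record, the `snapshots_to_prune` function, the default limits and the IDs are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import List, NamedTuple


class Snapshot(NamedTuple):
    snapshot_id: str
    volume_id: str
    start_time: datetime


def snapshots_to_prune(
    snapshots: List[Snapshot],
    now: datetime,
    keep_latest: int = 3,                    # assumed: always keep the newest N per volume
    max_age: timedelta = timedelta(days=30), # assumed: older than this may be pruned
) -> List[Snapshot]:
    """List snapshots that are both outside the newest N and older than max_age."""
    by_volume = defaultdict(list)
    for snap in snapshots:
        by_volume[snap.volume_id].append(snap)

    prune = []
    for volume_snaps in by_volume.values():
        # Newest first, so the slice below skips the snapshots we always keep.
        volume_snaps.sort(key=lambda s: s.start_time, reverse=True)
        for snap in volume_snaps[keep_latest:]:
            if now - snap.start_time > max_age:
                prune.append(snap)
    return prune


now = datetime(2013, 1, 31)
history = [
    Snapshot(f"snap-{label}", "vol-1", now - timedelta(days=age))
    for label, age in [("a", 1), ("b", 10), ("c", 20), ("d", 40), ("e", 60)]
]
candidates = [s.snapshot_id for s in snapshots_to_prune(history, now)]
print(candidates)
```

The function only reports candidates; whether to actually delete them is a decision for whoever owns the backup process.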

Mistake #8: Forgetting to Release Allocated Elastic IPs

This is a variation of the previous Mistake #5: Forgetting to Clean Up Stale Resources, but curious enough to get its own post. Users tend to forget that AWS charges for Elastic IPs only when they’re not in use (it’s a bit counter-intuitive, even though the motivation is clear: preventing users from keeping IP addresses reserved without ever using them).

Mistake #9: Failing to Properly Configure Security Groups

Amazon’s approach to security is based on “shared responsibility” between users and AWS – Security Groups is one of the tools Amazon provides for users to fulfill their part. One would expect that when it comes to security, users don’t err.

Mistake #10: Not taking Advantage of Multiple Availability Zones

AWS ‘Availability Zones’ is a simple feature that lets users distribute a workload across multiple data centers within a given region. We don’t even need to go as far as saying users don’t leverage multiple AWS regions to distribute their workload – the complexity and overhead in that case might be significant. Availability zones, on the other hand, are a simpler tool for taking advantage of distributed workloads in the cloud, yet users commonly overlook this capability.
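As a minimal sketch of what spreading a workload across zones can look like, the Python snippet below assigns instances to zones in round-robin order. The `assign_zones` function and the instance names are hypothetical, the zone names are only examples, and real placement would be requested through the provisioning tooling you use.

```python
from itertools import cycle
from typing import Dict, List


def assign_zones(instance_names: List[str], zones: List[str]) -> Dict[str, str]:
    """Map each instance to a zone, cycling through the zones in order."""
    if not zones:
        raise ValueError("at least one availability zone is required")
    # zip stops when instance_names runs out; cycle repeats the zones as needed.
    return dict(zip(instance_names, cycle(zones)))


placement = assign_zones(
    ["web-1", "web-2", "web-3", "web-4"],
    ["us-east-1a", "us-east-1b"],
)
for name, zone in placement.items():
    print(name, zone)
```

With two zones and four instances, each zone ends up with two instances, so losing one zone leaves half of the fleet running.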

Elastic IPs not in use

In AWS, users are charged for allocated Elastic IPs that are not associated with a running instance or with a network interface (VPC). The best practice, therefore, is to keep only those IP addresses that you will need in the future. Allocated Elastic IPs you don’t plan to use, or those you simply forgot to release, may contribute to unexpectedly high bills.

Newvem tracks the usage of your allocated Elastic IPs and identifies those that haven’t been in use for a significant period. We suggest you consider releasing those allocated IP addresses if you do not plan to use them.
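The check itself can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Newvem's implementation: the records are plain dictionaries shaped like the entries the EC2 DescribeAddresses call returns (`PublicIp`, `AllocationId`, `AssociationId`, `InstanceId`, `NetworkInterfaceId`), the sample values are made up, and `unassociated_addresses` is a hypothetical helper. It looks only at association, not at whether the attached instance is running or how long the address has been unused.

```python
from typing import Dict, List


def unassociated_addresses(addresses: List[Dict[str, str]]) -> List[str]:
    """Return the public IPs of addresses with no instance or interface attached."""
    idle = []
    for address in addresses:
        attached = (
            address.get("AssociationId")
            or address.get("InstanceId")
            or address.get("NetworkInterfaceId")
        )
        if not attached:
            idle.append(address["PublicIp"])
    return idle


# Made-up sample data using documentation-only IP ranges.
sample_addresses = [
    {
        "PublicIp": "203.0.113.10",
        "AllocationId": "eipalloc-aaa",
        "AssociationId": "eipassoc-111",
        "InstanceId": "i-web",
        "NetworkInterfaceId": "eni-111",
    },
    {"PublicIp": "203.0.113.11", "AllocationId": "eipalloc-bbb"},
    {"PublicIp": "203.0.113.12", "InstanceId": ""},
]

release_candidates = unassociated_addresses(sample_addresses)
print(release_candidates)
```

Addresses in the resulting list are candidates for release; confirm that nothing is expected to reattach to them before releasing.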

Compute Utilization Efficiency (High Load)

Newvem continuously monitors servers’ CPU load and notifies you of high CPU loads. We consider an average CPU load of 80% and above to be a high load. High CPU load poses a major service availability risk and can result in service degradation. To protect the system, consider changing the instance size or implementing a different scaling method. We suggest one of the following:

  • Scale up your compute instances – vertical scaling; move your workload to larger servers.
  • Scale out your compute instances – horizontal scaling; use additional servers.
  • Auto-scaling – AWS offers the ability to dynamically and automatically scale up or down according to conditions you define. With Auto Scaling, you can ensure that the number of Amazon EC2 instances you’re using increases seamlessly during demand spikes to maintain performance, and decreases automatically during demand lulls to minimize costs. Auto Scaling is enabled by Amazon CloudWatch and available at no additional charge beyond Amazon CloudWatch fees.
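The 80% rule described above can be sketched in Python as follows. Only the threshold comes from the text; the function names, the instance IDs and the sample readings are hypothetical, and the CPU samples are assumed to be percentages already collected from your monitoring source.

```python
from statistics import mean
from typing import Dict, List

HIGH_LOAD_THRESHOLD = 80.0  # average CPU percentage treated as high load


def is_high_load(cpu_samples: List[float], threshold: float = HIGH_LOAD_THRESHOLD) -> bool:
    """True when the average of the samples is at or above the threshold."""
    if not cpu_samples:
        return False  # no data yet, so nothing to flag
    return mean(cpu_samples) >= threshold


def high_load_instances(samples_by_instance: Dict[str, List[float]]) -> List[str]:
    """Return the IDs of instances whose average CPU load counts as high."""
    return sorted(
        instance_id
        for instance_id, samples in samples_by_instance.items()
        if is_high_load(samples)
    )


readings = {
    "i-web": [85.0, 90.0, 78.0, 92.0],  # average 86.25
    "i-batch": [20.0, 35.0, 30.0],
    "i-new": [],
}
flagged = high_load_instances(readings)
print(flagged)
```

Averaging over a window rather than reacting to a single sample avoids flagging short spikes; the length of that window is a tuning choice left open here.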

Compute Utilization Efficiency (CPU Load)

Newvem continually monitors your servers’ CPU load and notifies you of high loads. High load poses a major downtime risk, so you might need to consider changing the instance type or implementing a different scaling method. We use an arbitrary threshold of 80% CPU load and above to define a high load, and suggest that you scale up or scale out your compute instances (i.e. move your workload to larger servers or use additional servers).


Keywords: compute utilization efficiency, CPU Load, Newvem

Amazon AWS storage basics: Stop the sprawl before it begins!

There is a common perception that cloud storage should not really worry you because it is very cheap and available at any time. But is that really true? I often hear AWS consumers say that AWS storage means S3 (Simple Storage Service) – this is true but it is not the whole truth. There are actually 4 different AWS cloud storage models. We’ll get back to those but first let’s focus on the importance of understanding your AWS S3 footprint.

Compute Footprint Utilization (Idle)

Cloud elasticity gives you the flexibility to choose and provision instances that suit demand at any given time. With many instances deployed across multiple environments, it is sometimes difficult to keep track of all the instances you are using, and active instances that have been idle for a long time get left behind.

Hitchhiker's Guide to The Cloud

Newvem's eBook for Cloud Operations