KnowYourCloud Community

Scale IT!

The practice of shaping demand to fit the available resources can be found, for example, in transportation businesses, where airlines charge more for their service when demand is high and charge less to encourage more demand when it is low. In the real-time, interactive online world, the challenge is instead to ensure that capacity meets demand.

The following diagram shows three cases: over capacity, under capacity, and on-demand capacity. The last can be achieved only by taking advantage of cloud elasticity.

Capacity vs demand

Source: The Rabbit Hole - Laurent Chades’ Blog on Cloud Computing
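
The three cases in the diagram can be expressed as a small check. This is only an illustrative sketch: the function name, the 20% headroom tolerance, and the demand figures are assumptions made for the example, not values taken from the diagram.

```python
def classify_capacity(capacity: float, demand: float, headroom: float = 0.2) -> str:
    """Label one point in time as under, over, or on-demand capacity.

    headroom is an assumed tolerance: capacity of up to (1 + headroom) * demand
    is treated as matching demand rather than as waste.
    """
    if capacity < demand:
        return "under capacity"
    if capacity > demand * (1 + headroom):
        return "over capacity"
    return "on-demand capacity"


# Hypothetical demand samples (requests per second) against a fixed fleet.
demand_samples = [120, 300, 950, 450]
fixed_capacity = 500
labels = [classify_capacity(fixed_capacity, d) for d in demand_samples]
```

With a fixed capacity, only one of the four samples lands in the matching band; an elastic environment would adjust `fixed_capacity` so that every sample does.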

IT systems and applications have capacity limitations. A new SaaS web application often works well in its first phase in production, when it serves from tens up to several hundred users. At this stage there is no need to worry about a scalable hosting environment, and paying a flat fee for fixed computing resources makes sense. Over time, as the SaaS product matures, the number of users increases and overall usage goes up. The SaaS vendor then has to deal with these scalability challenges, ideally without the service suffering from poor performance. So what are the options for maintaining the application’s availability under the new demand? In other words, how do we scale it?

Scaling Types

Vertical Scaling (also known as Scaling Up): Adding more hardware resources, such as processors and memory, to an existing machine. In the traditional world, increasing your computing resources vertically can take weeks; in the cloud, the application can issue an API call to the cloud infrastructure and start a more powerful computing resource.

This cloud feature enables the application to auto-scale under an increasing load and maintain its normal performance levels. Bottlenecks caused by the programming platform, the script logic, the database design, or a poorly structured SQL query, as well as workloads that simply need more powerful computing resources, can all cause overloads and lead to the need for vertical scaling.

Scaling up the hardware to the next more powerful resource can cost two to four times more. Vertical scaling can solve core issues, but it comes with additional costs that should be part of the basic assumptions behind the cloud operation’s overall costs. Another issue to consider when architecting your system to support the SLA is that scaling up leaves the system vulnerable to a single point of failure. Make sure you have a backup in place and perform scaling out as well.

The simplicity of scaling up a single resource helps companies with their first cloud adoption steps. However, it is important to move away from the perception of “let’s buy big to accommodate future demand” and instead start small and use vertical scaling (even manually) to support growth.
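
A minimal sketch of the "start small, scale up when needed" decision might look like the following. The size ladder, the prices, and the 80% CPU threshold are all hypothetical values chosen for illustration; a real deployment would read them from the provider's catalogue and then call that provider's resize API, which is not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceSize:
    name: str
    vcpus: int
    memory_gb: int
    hourly_cost: float  # illustrative prices, not real quotes


# Hypothetical size ladder, ordered from smallest to largest.
LADDER = [
    InstanceSize("small", 2, 4, 0.10),
    InstanceSize("medium", 4, 8, 0.20),
    InstanceSize("large", 8, 16, 0.40),
    InstanceSize("xlarge", 16, 32, 0.80),
]


def next_size_up(current: InstanceSize) -> Optional[InstanceSize]:
    """Return the next larger size, or None when the ladder is exhausted."""
    index = LADDER.index(current)
    return LADDER[index + 1] if index + 1 < len(LADDER) else None


def plan_scale_up(current: InstanceSize, cpu_utilization: float,
                  threshold: float = 0.80) -> Optional[InstanceSize]:
    """Suggest a larger size only when CPU utilization exceeds the threshold."""
    if cpu_utilization <= threshold:
        return None
    return next_size_up(current)
```

For example, `plan_scale_up(LADDER[0], 0.92)` suggests the "medium" size at twice the assumed hourly cost, while the same call on the largest size returns `None`, which is the point where vertical scaling alone runs out.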

Image Source: RightScale Guide

Horizontal Scaling (also known as Scaling Out): Replicating web servers in a load-balanced model to support an increase in concurrent client requests while maintaining application performance. Scaling out demonstrates the great benefit of IaaS elasticity in making capacity meet demand. In the cloud, the application can issue an API call to the cloud infrastructure to start more virtual resources on demand, or to shut them down to avoid underutilized resources. This scaling method, together with the “pay as you go” model, enables software vendors to grow on demand and generally add resources in proportion to the number of new users or the actual demand at any time.

Generally, horizontal scaling increases the administrative effort, in both time and expense. These are not one-time efforts but ongoing maintenance, including updating the auto-scaling scripts, monitoring a large number of resources, backing them up, and more.
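
The core of an auto-scaling script for scaling out can be sketched as a calculation of how many identical instances the current load needs. The function names, the per-instance capacity, and the minimum and maximum bounds are assumptions for illustration; the actual start and stop calls depend on the IaaS provider and are left out.

```python
import math


def desired_instances(requests_per_second: float,
                      capacity_per_instance: float,
                      min_instances: int = 2,
                      max_instances: int = 20) -> int:
    """Return how many instances the load needs, clamped to the allowed range."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


def scaling_action(current: int, desired: int) -> str:
    """Describe the change needed to move from the current to the desired count."""
    if desired > current:
        return f"scale out by {desired - current}"
    if desired < current:
        return f"scale in by {current - desired}"
    return "no change"
```

Keeping the assumed minimum at two instances means the farm never shrinks to a single machine, and the maximum acts as a cost ceiling if demand spikes unexpectedly.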


Image Source: RightScale Guide

Scaling up and out are methods that enable cloud consumers, such as software development and operations teams, to better handle potential overload risks. Software providers that want to use the cloud must consider the auto-scaling options as part of their hardware sizing decisions.

Vertical Scaling | Horizontal Scaling
Cloud adoption: can solve core issues and system bottlenecks that create load. Simple to perform, hence helps with cloud adoption. | Generally relieves the load of client requests on the web servers, hence supports usage growth.
Expensive: scaling up by one pricing level adds operating costs. | Scaling out usually means adding cheap hardware resources, but maintenance adds to the operating costs.
Single point of failure. | Strengthens system robustness.
Limited: there is a maximum machine size you can purchase. | IaaS providers aim to provide a cloud with an infinite amount of resources; in such an ideal environment there is no limit to horizontal scaling.
The scaling operation itself involves downtime. | Demand should be predicted early enough that resource provisioning can be completed in advance of the peak.
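
The last row of the comparison, provisioning in advance of the peak, can be sketched as a lead-time calculation over a demand forecast. The forecast values, the step granularity, and the function name are hypothetical; the sketch only shows that provisioning has to begin a full lead time before demand first exceeds current capacity.

```python
from typing import Optional, Sequence


def provisioning_start(forecast: Sequence[float],
                       current_capacity: float,
                       lead_time_steps: int) -> Optional[int]:
    """Return the step at which provisioning must begin, or None if never needed.

    The result is clamped to step 0 when the shortfall arrives sooner than
    the lead time, meaning provisioning is already late and should start now.
    """
    for step, demand in enumerate(forecast):
        if demand > current_capacity:
            return max(0, step - lead_time_steps)
    return None
```

For a forecast of `[100, 200, 400, 900, 1200]` against a capacity of 500 and a lead time of two steps, the shortfall first appears at step 3, so provisioning has to start at step 1.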

An application is made of tiers

Web applications must take advantage of the dynamic nature of the cloud. These applications are assembled from stateful and stateless components. Software architects should plan applications so that they are broken into independent tiers, and differentiate between the tiers based on their cloud resource utilization and needs. Auto-scaling procedures should be treated like any other feature of the system, and hence be part of product management and the ordinary development cycle.
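
One way to make the tier distinction concrete is to record, per tier, whether it holds state and derive a default scaling approach from that. The rule used below (replicate stateless tiers, resize stateful ones) is a simplifying assumption for illustration, not something the text prescribes; stateful tiers can also be scaled out with techniques such as replication or sharding. The tier names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    stateful: bool


def default_scaling_strategy(tier: Tier) -> str:
    """Assumed rule of thumb: replicate stateless tiers, resize stateful ones."""
    return "vertical" if tier.stateful else "horizontal"


# Hypothetical three-tier web application.
TIERS = [
    Tier("web", stateful=False),
    Tier("application", stateful=False),
    Tier("database", stateful=True),
]
plan = {tier.name: default_scaling_strategy(tier) for tier in TIERS}
```

Keeping this mapping in code, next to the rest of the system, is one way to treat auto-scaling as an ordinary feature that goes through the normal development cycle.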


SaaS offering Levels - We can classify SaaS into four “maturity levels,” whose key attributes are configurability, multi-tenant efficiency, and scalability. Each level is distinguished from the previous one by the addition of one of those three attributes:

Level 1 - Ad-hoc/custom: Each customer has a customized version of the hosted application that runs as its own instance on the host’s servers. Migrating a traditional non-networked or client–server application to this level of SaaS typically requires the least development effort, and reduces operating costs by consolidating server hardware and administration.

Level 2 - Configurable: This adds greater program flexibility through configurable metadata, so many customers use separate instances of the same application code. This lets the vendor meet different customer needs through detailed configuration options, while simplifying common code base maintenance and updating.

Level 3 - Configurable, multi-tenant-efficient: This adds multi-tenancy to the second level, so a single program instance serves all customers. This enables more efficient server resource use without apparent difference to the end user, but ultimately faces scalability limits.

Level 4 - Scalable, configurable, multi-tenant-efficient: The fourth and final SaaS maturity level adds scalability through a multi-tier architecture that supports a load-balanced farm of identical application instances that run on a variable number of servers. The provider can adjust system capacity to match demand by adding or removing servers without further altering the software architecture.
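
The load-balanced farm described in level 4 can be sketched as a round-robin dispatcher over a set of identical instances that grows and shrinks at run time. The class and server names are hypothetical, and round-robin is just one possible balancing policy chosen to keep the example short.

```python
from typing import List


class ServerFarm:
    """Round-robin dispatcher over a variable set of identical instances."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("the farm needs at least one server")
        self._servers = list(servers)
        self._next = 0  # index of the server that receives the next request

    def add_server(self, name: str) -> None:
        """Scale out: the new instance joins the rotation immediately."""
        self._servers.append(name)

    def remove_server(self, name: str) -> None:
        """Scale in: drop an instance, keeping at least one in the farm."""
        if len(self._servers) == 1:
            raise ValueError("cannot remove the last server")
        self._servers.remove(name)
        self._next %= len(self._servers)

    def route(self) -> str:
        """Return the server that should handle the next request."""
        server = self._servers[self._next]
        self._next = (self._next + 1) % len(self._servers)
        return server
```

Because every instance is identical, callers of `route()` are unaffected when servers are added or removed, which is the property that lets capacity change without altering the software architecture.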

Learn more about the 4 multi-tenant levels on our terminology page


Going back to the new SaaS vendor from above: there is no doubt that integrating the application with the IaaS API to achieve service scalability (that is, to reach level 4) should be part of the product’s road map, planned ahead in keeping with the company’s marketing efforts and demand forecast.