Building a Scalable WordPress Setup on AWS

WordPress is one of the most popular open-source blogging platforms. It powers anything from simple blogs to complex portals, thanks to the variety of plugins the community has developed.

In this article we describe the architecture options for deploying WordPress on AWS with scalability and high availability in mind. We will take advantage of the elasticity of the cloud, using more servers when we need them and fewer when we don’t (auto-scaling).

For the purpose of illustration, we will assume a typical web application architecture:

  • Database layer (MySQL master-slave scheme)
  • Storage layer for user uploaded content (e.g., images)
  • Application layer (Multiple web servers running Apache or NGINX)
  • Load balancer solution to distribute HTTP requests to multiple web servers
  • Caching layer (Memcache nodes)
  • Content Delivery Network for fast delivery of static files to users across the globe

Challenges

Shared storage

The various application servers all need access to the same media files. To achieve that we use a shared storage layer such as Amazon Simple Storage Service (S3). To take advantage of this option, we need to install and configure the W3 Total Cache WordPress plugin. The plugin also plays nicely with Amazon CloudFront, so it is easy to set up CDN delivery of your media files for faster page loads across the globe. Alternatively, if you are not running on AWS, you might set up GlusterFS on an array of virtual instances and mount it from the various application servers.
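Once media lives on S3 behind CloudFront, the pages have to reference the CDN hostname instead of the web server. The sketch below illustrates that rewrite in Python. It is not the plugin's code, and the hostnames are hypothetical; only the /wp-content/uploads/ prefix is the WordPress default.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical hostnames, for illustration only.
SITE_HOST = "blog.example.com"
CDN_HOST = "d1234example.cloudfront.net"
# Default WordPress location for user-uploaded media.
MEDIA_PREFIX = "/wp-content/uploads/"


def to_cdn_url(url, site_host=SITE_HOST, cdn_host=CDN_HOST):
    """Point an uploaded-media URL at the CDN; leave every other URL unchanged."""
    parts = urlsplit(url)
    if parts.netloc == site_host and parts.path.startswith(MEDIA_PREFIX):
        return urlunsplit(
            (parts.scheme, cdn_host, parts.path, parts.query, parts.fragment)
        )
    return url
```

Because the rewritten URLs no longer point at any particular application server, it does not matter which server handled the upload or which one renders the page.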

Scaling the DB

We can use Amazon’s managed Relational Database Service (RDS) with Read Replicas to let our DB layer scale as required. Since WordPress typically powers read-heavy sites, this is a good fit. Out of the box, WordPress cannot take advantage of this setup because it can only be configured with a single database hostname. The HyperDB plugin helps here by distributing SELECT queries among the master and the replica instances while sending all write operations to the master MySQL node.
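The read/write split described above can be sketched in a few lines of Python. This is an illustration of the idea only, not HyperDB's implementation; the class name and hostnames are hypothetical.

```python
import itertools


class QueryRouter:
    """Rotate SELECT queries across all nodes; send everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        # The master stays in the read pool, matching the description in the text.
        self._read_pool = itertools.cycle([master, *replicas])

    def host_for(self, sql):
        """Return the hostname that should execute the given SQL statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_pool) if is_read else self.master


router = QueryRouter("master.db.internal", ["replica1.db.internal"])
print(router.host_for("SELECT * FROM wp_posts"))      # first node in the read pool
print(router.host_for("UPDATE wp_options SET x = 1"))  # always the master
```

A production router also has to account for replication lag, for example by reading from the master right after a write, which this sketch leaves out.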

Alternatively, instead of RDS, you could set up your own MySQL installation on EC2 nodes (with or without the support of a cloud management solution like RightScale or Scalr), or use Xeround’s MySQL as a service.

Being able to scale your DB is valuable, but it is critical that you maintain your DB security. For instance, when scaling your DB, you may leave security group ports open unnecessarily, exposing your DB server to attack. An issue like this typically occurs when the same security groups are used to secure DB and non-DB servers. Datapipe Cloud Reports automatically recognizes your database servers, analyzes their vulnerability, and provides drill-downs into specific instances so that issues can be fixed quickly.

Improving performance

To drastically improve the performance of our setup, we can use memcache to avoid repeated expensive calls to the database. Among other things, the W3 Total Cache plugin lets us configure caching with a memcache backend. Amazon ElastiCache is well suited as that backend.
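The pattern behind this is usually called cache-aside: look in the cache first and query the database only on a miss. Below is a minimal Python sketch in which an in-process class stands in for a memcache client. The class, function, and key names are hypothetical, and a real deployment would talk to memcache or ElastiCache over the network.

```python
import time


class TTLCache:
    """In-process stand-in for a memcache client: get/set with an expiry time."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # The entry has expired, so drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, time.monotonic() + ttl_seconds)


def cached_query(cache, key, run_query, ttl_seconds=300):
    """Return the cached result, running the expensive query only on a miss.

    A result of None is treated as a miss and is therefore never cached.
    """
    result = cache.get(key)
    if result is None:
        result = run_query()
        cache.set(key, result, ttl_seconds)
    return result
```

The time-to-live bounds how stale a cached page fragment can get, so it is a trade-off between database load and freshness.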

Alternatively, you could set up memcache running on EC2 instances.

Auto-scaling

In order to make this cost effective, we need to set up auto-scaling for our application servers. This means we will launch extra application servers when traffic is higher and decommission them when the load is low. This allows us to pay only for what we use and is a key characteristic of the cloud that makes it attractive for such use-cases.

We can use either Amazon’s own auto-scaling capabilities or one of the available cloud management solutions (for example, Scalr or RightScale).
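Whichever tool you pick, the core of an auto-scaling policy is a rule that turns a load metric into a target server count. The following Python sketch shows one such rule. The thresholds, group limits, and function name are hypothetical, and in practice the policy is configured in the scaling service rather than written by hand like this.

```python
def desired_capacity(current, avg_cpu, min_size=2, max_size=10,
                     scale_out_above=70.0, scale_in_below=30.0):
    """Return the new number of app servers for a given average CPU percentage."""
    if avg_cpu > scale_out_above:
        target = current + 1   # traffic is high: launch one more server
    elif avg_cpu < scale_in_below:
        target = current - 1   # load is low: decommission one server
    else:
        target = current       # within the comfortable band: do nothing
    # Never leave the configured group limits.
    return max(min_size, min(max_size, target))
```

Keeping a gap between the two thresholds prevents the group from flapping between launching and terminating servers when load hovers around a single cut-off value.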

Load balancing

Elastic Load Balancer (ELB) is easy to set up and is a fully managed service from Amazon. Alternatively, you could set up Nginx or HAProxy on EC2 nodes to act as the load balancer.
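To make the load balancer's job concrete, here is a minimal Python sketch of one common strategy: round-robin selection that skips servers marked unhealthy. It is an illustration only and does not describe how ELB, Nginx, or HAProxy are implemented; the class and server names are hypothetical.

```python
class RoundRobinBalancer:
    """Hand out servers in rotation, skipping any that are marked down."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._next = 0

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        """Return the next healthy server, or raise if none is available."""
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")
```

In a real setup the health flags would be driven by periodic health checks against each application server.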

WordPress Site on AWS Architecture

A Few Final Notes

  • We will be running multiple web servers, each with its own local file system. Any changes made that impact the file system (e.g., plugin installation) need to be replicated on all application servers.
    • The simplest solution is to run those changes on one of the servers (e.g., edit your hosts file so that all changes are done on the same server), and then take a fresh AMI to replace the other running servers.
    • Another solution would be to set up network-attached storage such as GlusterFS on a couple of EC2 instances and mount /var/www to that shared storage.  This way all app servers will be working on the same centralized repository.
    • This is not a problem for media files, which are stored on S3 thanks to the W3 Total Cache plugin.
  • WordPress does not rely on native PHP sessions and works on multi-server setups out of the box. If you use any plugins or other apps that rely on PHP sessions, you will need to share the session data across servers, either via memcache or via persistent storage (e.g., DynamoDB).
  • The example described above operates in a single AWS Availability Zone (AZ). AWS Availability Zones are all identical from the application's point of view, making it easy to scale applications across them. A multi-AZ setup would withstand the outage of a single data center in the Amazon Web Services cloud. This provides new opportunities for highly available applications that were a lot harder to implement with traditional hosting. Datapipe Cloud Reports reveals bad practices when dealing with, for example, availability zones.
  • With the ease of scaling also comes the need to closely monitor your cloud to prevent sprawl. Business Views by Cloud Reports allows you to measure costs, risks, and cloud assets as they apply to specific customers, business divisions, AZs, products, services, and nearly any other group you can define.
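For the multi-AZ note above, the basic idea is to spread application servers evenly so that losing one zone removes only a fraction of capacity. A small Python sketch of that placement rule follows. The function name is hypothetical, and in practice the scaling service handles the placement for you.

```python
def spread_across_zones(instance_count, zones):
    """Assign instances to zones so that per-zone counts differ by at most one."""
    if not zones:
        raise ValueError("at least one zone is required")
    base, extra = divmod(instance_count, len(zones))
    # The first `extra` zones each take one additional instance.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}
```

With five servers in two zones, for example, an outage of either zone still leaves at least two servers running.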

Datapipe Cloud Reports actively prioritizes significant risk to cloud health based on its severity, including security and availability.


About the Author:

Andreas Chatzakis

Andreas is the CTO and co-founder of Spitogatos.gr / HomeGreekHome.com (a high traffic real estate portal in Greece). His background includes 5 years of consulting @ Accenture NL and he is the organizer of Greece’s AWS Usergroup.

Keywords: Amazon AWS elastic cloud services, EC2 Instances, AMI, ELB, WordPress, Scalability, Best Practice, DynamoDB, AWS Auto Scaling, RDS, MySQL, CDN, CloudFront, HyperDB, Memcache, ElastiCache, S3 Storage.

There are 8 comments.

Peter A Vandever —

and the cost?

Daniel Koffler —

Interesting deployment. However, I’m not a huge fan of services that rely on EBS as it seems to be the least stable AWS service and performance tends to vary greatly. 

I recently came across ClouSE, which is a drop-in MySQL DB engine replacement that uses S3 for storage, and was wondering if anyone has any feedback on working with it.

The Oblaksoft guys have an interesting post on running WordPress in AWS where all storage is done using S3 http://www.oblaksoft.com/wordpress-on-s3-newsletter-may-2012/ It’s worth a look.

DSOC Orchestrator TM —

I would use more than one load balancer for high traffic configurations.

Andreas Chatzakis —

Hi Peter, it depends on the number and type of instances you would need for your traffic. The AWS calculator could give an indication: http://calculator.s3.amazonaws.com/calc5.html DSOC Orchestrator TM’s estimate of $600 per month could be valid, but lower costs are possible if your blog has bursty traffic, since auto-scaling means you only use servers when you need them.
