Building a Scalable WordPress Setup on AWS
WordPress is one of the most popular open-source blogging platforms. It powers everything from simple blogs to complex portals, thanks to the variety of plugins the community has developed.
In this article we will describe the architecture options for deploying WordPress on AWS with scalability and high availability in mind. We will take advantage of the elasticity of the cloud, using more servers when we need them and fewer when we don’t (auto-scaling).
For the purpose of illustration, we will assume a typical web application architecture:
- Database layer (MySQL master-slave scheme)
- Storage layer for user uploaded content (e.g., images)
- Application layer (Multiple web servers running Apache or NGINX)
- Load balancer solution to distribute HTTP requests to multiple web servers
- Caching layer (Memcache nodes)
- Content Delivery Network for fast delivery of static files to users across the globe
The various application servers all need access to the same media files. To achieve that we will use a shared storage layer such as Amazon’s Simple Storage Service (S3). To take advantage of this option, we need to install and configure the WordPress plugin “W3 Total Cache”. The plugin also works well with Amazon CloudFront, so it is easy to set up CDN delivery of your media files for faster page loads across the globe. Alternatively, if you are not running on AWS, you might set up GlusterFS on an array of virtual instances and mount it from the various application servers.
Scaling the DB
We can use Amazon’s managed Relational Database Service (RDS) with Read Replicas to let our DB layer scale as required. Since WordPress typically powers read-heavy sites, this is a good fit. Out of the box, WordPress cannot take advantage of this setup, as it can only be configured to use a single database hostname. The HyperDB plugin helps here by distributing SELECT queries among the master and the replica instances while sending all write operations to the master MySQL node.
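The routing idea can be sketched in a few lines of Python. This is only an illustration of read/write splitting, not HyperDB's actual code (HyperDB is a PHP plugin with its own configuration file); the class name `ReadWriteRouter` and the hostnames are hypothetical.

```python
import itertools


class ReadWriteRouter:
    """Send SELECTs to the master and replicas in turn; send writes to the master."""

    def __init__(self, master, replicas):
        self.master = master
        # Reads rotate over the master and every replica.
        self._read_hosts = itertools.cycle([master, *replicas])

    def host_for(self, sql):
        """Return the hostname that should execute the given SQL statement."""
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT":
            return next(self._read_hosts)
        # INSERT, UPDATE, DELETE and anything unrecognized go to the master.
        return self.master


# Hypothetical hostnames, for illustration only.
router = ReadWriteRouter(
    "master.db.example.internal",
    ["replica-1.db.example.internal", "replica-2.db.example.internal"],
)
print(router.host_for("SELECT * FROM wp_posts"))
print(router.host_for("UPDATE wp_options SET option_value = 'x'"))
```

A real setup also has to deal with replication lag, for example when a page reads data it has just written; this sketch ignores that.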
Alternatively, instead of RDS, you could set up your own MySQL installation on EC2 nodes (with or without the support of a cloud management solution like RightScale or Scalr), or use Xeround’s MySQL as a service.
Being able to scale your DB is valuable but it’s critical that you maintain your DB security. For instance, when scaling your DB, you may have security group ports open unnecessarily, opening your DB server to vulnerabilities. An issue like this typically occurs when using the same security groups to secure DB and non-DB servers. Datapipe Cloud Reports automatically recognizes your database servers, analyzes their vulnerability, and provides you with drill downs covering insights on specific instances for a quick fix turnaround.
To drastically improve the performance of our setup, we can use memcache to avoid repeated expensive calls to the database. Among other things, the W3 Total Cache plugin allows us to configure caching with a memcache backend. ElastiCache is well suited as the backend for this purpose.
Alternatively, you could set up memcache running on EC2 instances.
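The caching approach described above is often called cache-aside: look in the cache first, and only hit the database on a miss. Below is a minimal Python sketch of that pattern. `TTLCache` is a hypothetical in-memory stand-in for a memcache client, and `cached_query` and its parameters are assumptions made for illustration; the plugin handles all of this for you in a real WordPress setup.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a memcache client (get/set with expiry)."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, self._clock() + ttl_seconds)


def cached_query(cache, key, run_query, ttl_seconds=300):
    """Return the cached value, or run the query and store its result."""
    value = cache.get(key)
    if value is None:
        value = run_query()  # the expensive database call
        cache.set(key, value, ttl_seconds)
    return value


cache = TTLCache()
posts = cached_query(cache, "recent_posts", lambda: ["Hello world"])
print(posts)
```

The time-to-live is the main tuning knob: a longer value saves more database calls but serves stale content for longer.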
In order to make this cost effective, we need to set up auto-scaling for our application servers. This means we will launch extra application servers when traffic is higher and decommission them when the load is low. This allows us to pay only for what we use and is a key characteristic of the cloud that makes it attractive for such use-cases.
We can use either Amazon’s own auto-scaling capabilities or one of the available cloud management solutions (for example, Scalr or RightScale).
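As a rough illustration of the scaling decision itself, the sketch below adds a server when average CPU is high and removes one when it is low, within fixed bounds. The function name, the thresholds, and the minimum and maximum fleet sizes are all assumptions chosen for the example; they are not AWS defaults, and the managed services above express such rules through their own configuration.

```python
def desired_capacity(current, avg_cpu_percent, *, minimum=2, maximum=10,
                     scale_out_above=70.0, scale_in_below=30.0):
    """Return how many app servers to run after one evaluation period."""
    if avg_cpu_percent > scale_out_above:
        target = current + 1   # traffic is high: launch one more server
    elif avg_cpu_percent < scale_in_below:
        target = current - 1   # load is low: decommission one server
    else:
        target = current       # within the comfortable band: do nothing
    # Never go below the minimum or above the maximum fleet size.
    return max(minimum, min(maximum, target))


print(desired_capacity(3, 85.0))
print(desired_capacity(3, 12.0))
```

Keeping a gap between the two thresholds avoids launching and terminating servers back and forth when load hovers around a single cut-off.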
Elastic Load Balancer (ELB) is a fully managed service from Amazon and is easy to set up. Alternatively, you could set up Nginx or HAProxy on EC2 nodes to act as the load balancer.
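To show what a load balancer does at its simplest, here is a Python sketch of round-robin distribution that skips servers marked as unhealthy. Round-robin is just one common strategy and is used here as an assumption; this is not a description of how ELB, Nginx, or HAProxy work internally, and the class and server names are hypothetical.

```python
class RoundRobinBalancer:
    """Hand out servers in rotation, skipping any marked as down."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._next = 0

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        """Return the next healthy server, or raise if none is available."""
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
balancer.mark_down("web-2")  # e.g. after a failed health check
print([balancer.pick() for _ in range(4)])
```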
A Few Final Notes
- We will be running multiple web servers, each with its own local file system. Any changes made that impact the file system (e.g., plugin installation) need to be replicated on all application servers.
- The simplest solution is to make those changes on one of the servers (e.g., edit your hosts file so that all changes are done on the same server), and then create a fresh AMI from it to replace the other running servers.
- Another solution would be to set up network-attached storage such as GlusterFS on a couple of EC2 instances and mount /var/www to that shared storage. This way all app servers will be working on the same centralized repository.
- This is not a problem for any media files that are stored on S3 thanks to the W3 Total Cache plugin.
- WordPress does not rely on native PHP sessions and works on multi-server setups out of the box. If you use any plugins or other apps that rely on PHP sessions, you will need to share those sessions across servers, either via memcache or via persistent storage (e.g., DynamoDB).
- AWS Availability Zones (AZs) within a region work the same way, making it relatively easy to scale applications across them. The example described above operates in a single AZ. A multi-AZ setup would be resilient to the outage of a single data center in the Amazon Web Services cloud. This provides new opportunities for highly available applications that were a lot harder to implement with traditional hosting. Datapipe Cloud Reports reveals bad practices when dealing with, for example, availability zones.
- With the ease of scaling also comes the need to closely monitor your cloud to prevent sprawl. Business Views by Cloud Reports allows you to measure costs, risks, and cloud assets as they apply to specific customers, business divisions, AZs, products, services, and nearly any other group you can define.
Datapipe Cloud Reports actively prioritizes significant risk to cloud health based on its severity, including security and availability.
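As a small illustration of the multi-AZ note above, the sketch below spreads a number of application servers as evenly as possible over a list of zones, so that losing one zone removes only a fraction of the fleet. The function name and the zone names are hypothetical and used only for the example.

```python
def spread_across_zones(instance_count, zones):
    """Assign instances to zones so that counts differ by at most one."""
    if not zones:
        raise ValueError("at least one availability zone is required")
    base, extra = divmod(instance_count, len(zones))
    # The first `extra` zones each take one additional instance.
    return {zone: base + (1 if i < extra else 0)
            for i, zone in enumerate(zones)}


# Illustrative zone names only.
print(spread_across_zones(5, ["eu-west-1a", "eu-west-1b", "eu-west-1c"]))
```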
About the Author:
Andreas is the CTO and co-founder of Spitogatos.gr / HomeGreekHome.com (a high traffic real estate portal in Greece). His background includes 5 years of consulting @ Accenture NL and he is the organizer of Greece’s AWS Usergroup.
Keywords: Amazon AWS elastic cloud services, EC2 Instances, AMI, ELB, WordPress, Scalability, Best Practice, DynamoDB, AWS Auto Scaling, RDS, MySQL, CDN, CloudFront, HyperDB, Memcache, ElastiCache, S3 Storage.