PHP Sessions with a DynamoDB Backend

The problem

When scaling an application, sharing sessions across multiple web servers is one of the first issues that needs to be tackled. The issue is more complex in autoscaling setups in the cloud, where application servers are added to or removed from the load balancer as traffic and load increase or decrease.

The options

  • Sticky Sessions

Sticky sessions are a convenient feature offered by many load balancing solutions (including Amazon’s ELB): each user is always routed to the same server, so session data can stay on that server. But in an autoscaling scenario, users would lose their sessions every time an instance is terminated.

  • Memory storage

Other solutions do exist and are typically easy to implement on modern programming frameworks. For example, PHP supports the storage of sessions on a Memcache Backend, out-of-the-box, with a few simple configuration changes (http://www.php.net/manual/en/session.configuration.php#ini.session.save-handler ).
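As a minimal sketch of those configuration changes, the snippet below builds the save path for PHP's memcache extension and applies it at runtime. The node hostnames are hypothetical placeholders, and the same two settings could equally be placed in php.ini.

```php
<?php
// Hypothetical node list; replace with your own Memcache/ElastiCache endpoints.
$nodes = array('cache-1.example.internal:11211', 'cache-2.example.internal:11211');

/**
 * Build a session.save_path value for the memcache extension:
 * a comma-separated list of tcp:// URLs, one per node.
 */
function memcacheSavePath(array $nodes)
{
    $urls = array();
    foreach ($nodes as $node) {
        $urls[] = 'tcp://' . $node;
    }
    return implode(',', $urls);
}

$savePath = memcacheSavePath($nodes);

// Only switch handlers if the memcache extension is actually loaded.
if (extension_loaded('memcache')) {
    ini_set('session.save_handler', 'memcache');
    ini_set('session.save_path', $savePath);
}
```

These settings must be applied before session_start() is called.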

But memory storage has its shortcomings:

  1. Even with Amazon’s highly available ElastiCache solution, individual memcache nodes can and will go down (true, even if it does not happen very often);

  2. If you are not careful with capacity planning, allocated memory might be exhausted. At that point, Memcache will erase the least recently accessed records, and again users will find that their sessions are being terminated.

For many applications, the above could be acceptable.

  • Persistent storage

Where storage must be both fast and persistent, a relational database such as MySQL could serve as the backend. But relational databases are not easy to scale, and it is better to let them handle what they are good at: storing relational data. Network-attached storage (for example, GlusterFS) is another option, but performance could be an issue.

 

Enter DynamoDB - 9 Steps

Key-Value storage systems are much better suited to solving the problem. More specifically, for those applications running on AWS another solution is now available: DynamoDB, Amazon’s highly available, consistently performing, and extremely scalable NoSQL DB as a service (zero management as all operations are managed by Amazon).

The AWS SDK for PHP now even includes a drop-in DynamoDB session handler class that can replace PHP’s native session engine.

The following steps are required to use this option in your setup:

1 - Via the AWS management panel, create an AWS IAM user to properly control access to the session data.

       

 AWS will create a set of credentials for this user. Make sure to store them in a secure location.

2 - Next, you need to create a table to store your sessions in. Navigate to the DynamoDB tab in the AWS management panel. Select the region you want your session data to be stored in (make sure it is the same region as your app server instances to reduce latency and data transfer costs!). Create a table with your designated table name and add a hash primary key of type String and name “id”:

3 - Select the minimum allowed throughput so that we perform our tests in the free tier.

4 - Install the latest version of the AWS SDK for PHP on your web server image, following the SDK's installation instructions.

5 - In the SDK root folder, rename config-sample.inc.php to config.inc.php and update it with:

  1. The credentials of the user we created in step 1.

  2. The default_cache_config parameter (for example, apc).

6 - Give the user access to the table by attaching an access policy for that user (IAM section). Before doing so make note of:

    1. The account id (this is your account number without the hyphens as you can see it in https://aws-portal.amazon.com/gp/aws/manageYourAccount ). For example, 846544612030 (a random number for use in our example only).

    2. The region (for example, us-east-1)

    3. The name of the table (for example, php-sessions-test). The above provides the resource identifier for the sessions table, for example: 

      arn:aws:dynamodb:us-east-1:846544612030:table/php-sessions-test
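The way the three values combine into the resource identifier can be sketched as a small helper. The function name dynamoTableArn is hypothetical, and the hyphenated account number is simply the example number above written with hyphens.

```php
<?php
/**
 * Build the ARN of a DynamoDB table from the region, the account number
 * (with or without hyphens) and the table name.
 */
function dynamoTableArn($region, $accountNumber, $tableName)
{
    // The account id is the account number with the hyphens removed.
    $accountId = str_replace('-', '', $accountNumber);

    return sprintf('arn:aws:dynamodb:%s:%s:table/%s', $region, $accountId, $tableName);
}

$arn = dynamoTableArn('us-east-1', '8465-4461-2030', 'php-sessions-test');
echo $arn, PHP_EOL;
```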

7 - Armed with these details, you can go ahead and create the policy. The AWS policy generator offers a graphical interface for building something like this:

{
  "Statement": [
    {
      "Sid": "Stmt1335183103764",
      "Action": [
        "dynamodb:BatchGetItem",
        "dynamodb:DeleteItem",
        "dynamodb:DescribeTable",
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:Query",
        "dynamodb:Scan",
        "dynamodb:UpdateItem"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:dynamodb:us-east-1:846544612030:table/php-sessions-test"
      ]
    }
  ]
}
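If you prefer to generate the policy document rather than click through the policy generator, a rough sketch could look like the following. The function name sessionTablePolicy is hypothetical, the optional "Sid" element is left out, and the ARN is the example resource identifier from step 6.

```php
<?php
/**
 * Return a policy document (as a PHP array) that allows the session
 * handler's DynamoDB actions on a single table.
 */
function sessionTablePolicy($tableArn)
{
    return array(
        'Statement' => array(
            array(
                'Action' => array(
                    'dynamodb:BatchGetItem',
                    'dynamodb:DeleteItem',
                    'dynamodb:DescribeTable',
                    'dynamodb:GetItem',
                    'dynamodb:PutItem',
                    'dynamodb:Query',
                    'dynamodb:Scan',
                    'dynamodb:UpdateItem',
                ),
                'Effect'   => 'Allow',
                'Resource' => array($tableArn),
            ),
        ),
    );
}

$policy = sessionTablePolicy('arn:aws:dynamodb:us-east-1:846544612030:table/php-sessions-test');

// Pretty-printed JSON that can be pasted into the IAM policy editor.
$json = json_encode($policy, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
echo $json, PHP_EOL;
```

Generating the document from the table ARN keeps the policy and the table name from drifting apart.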

8 - Now you need to configure PHP to use DynamoDB instead of the native session handling (local file) as follows:

a - Create a PHP file that instantiates the Amazon DynamoDB client and registers the DynamoDB session handler with PHP, for example:

/path-to/dynamosessions/dynamosessions.php

—————————————————————-

<?php
require_once 'path-to/AWSSDKforPHP/sdk.class.php';

// Instantiate the Amazon DynamoDB client.
// REMEMBER: You need to set 'default_cache_config' in your config.inc.php.
$dynamodb = new AmazonDynamoDB();
$dynamodb->set_hostname("https://dynamodb.us-east-1.amazonaws.com");

// Register the DynamoDB Session Handler.
$handler = $dynamodb->register_session_handler(array(
    'table_name'           => 'php-sessions-test', // the table created in step 2
    'hash_key'             => 'id',                // the hash primary key from step 2
    'session_lifetime'     => 0,
    'consistent_reads'     => true,
    'session_locking'      => false,               // see the locking consideration below
    'max_lock_wait_time'   => 15,
    'min_lock_retry_utime' => 5000,
    'max_lock_retry_utime' => 50000,
));

 b - Then modify php.ini with the following two options:

session.save_handler = user
auto_prepend_file = /path-to/dynamosessions/dynamosessions.php

 9 - Restart your web server service so that the changes in php.ini take effect. If everything went well, your sessions are now powered by Amazon’s highly available and high performance DynamoDB backend. As long as everything works, and depending on your deployment strategy, set the above changes in stone (i.e., make a snapshot of your AMI, update your bootstrap processes, and so on).
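One way to check that sessions really are shared is a small test page that increments a counter and prints which server handled the request. This is only a sketch: the helper name bumpHitCounter and the output format are made up for illustration. Behind a load balancer, the counter should keep increasing even when the server name changes between requests.

```php
<?php
/**
 * Increment a per-session hit counter and return the new value.
 * Takes the session array by reference so it can be exercised without a real session.
 */
function bumpHitCounter(array &$session)
{
    $session['hits'] = isset($session['hits']) ? $session['hits'] + 1 : 1;

    return $session['hits'];
}

// Only touch the real session when served by a web server, not from the command line.
if (PHP_SAPI !== 'cli') {
    session_start();
    $hits = bumpHitCounter($_SESSION);

    header('Content-Type: text/plain');
    echo 'server=' . gethostname() . ' session=' . session_id() . ' hits=' . $hits;
}
```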

We also invite you to check out the related presentation - Enter DynamoDB - 9 Steps Manual

Three Important Considerations

1 - Estimate your read/write throughput requirements before you go live with this in production. If you have been using memcache, you can easily get its usage stats and make a fairly accurate estimate. After go-live, monitor your usage pattern closely and update your read/write throughput accordingly. Getting this right is important because DynamoDB throttles requests that exceed your provisioned capacity, leading to extremely poor performance if you don’t allocate enough throughput. It is better to start with some overallocation and decrease it based on actual usage than the other way around.
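A back-of-the-envelope estimate can be sketched as follows. It assumes one session read and one session write per request, the 1 KB accounting unit described in this article, and a 25% headroom factor. All three are assumptions to adjust, so check the current DynamoDB pricing page for the unit sizes that apply to you.

```php
<?php
/**
 * Rough estimate of read and write throughput for a session table.
 * Assumes one read and one write per request (hypothetical; locking adds more).
 */
function estimateCapacityUnits($requestsPerSecond, $avgSessionBytes, $unitBytes = 1024, $headroom = 1.25)
{
    // Every started unit of data counts as a separate operation.
    $unitsPerOperation = max(1, (int) ceil($avgSessionBytes / $unitBytes));

    // Overallocate a little, then tune down based on real usage.
    $units = (int) ceil($requestsPerSecond * $unitsPerOperation * $headroom);

    return array('read' => $units, 'write' => $units);
}

// Example: 200 requests per second with 2,500-byte sessions.
$estimate = estimateCapacityUnits(200, 2500);
printf("read units: %d, write units: %d\n", $estimate['read'], $estimate['write']);
```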

2 - Don’t forget that DynamoDB counts every 1 KB of data as a separate operation (see the DynamoDB pricing page), so you need to take the average size of each session into account. It is good practice to keep your sessions as small as possible; otherwise cost can become unreasonable for apps that create large session arrays. Do not use session variables as a replacement for caching. For example, we have seen developers saving whole HTML blocks in the session array. Use memcache for that instead, and in general try to store what you can in cookies or in your database, depending on the persistence requirements. If your sessions are large, you’ll probably need to make some code changes before you can use the DynamoDB session handler. For the same reason, you’ll want to optimize your application so that it does not create sessions on pages or occasions that do not really need them (for example, when the user is not signed in or the page has no personalization).
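To get a feel for how large your sessions are, you could log an approximate size per request. The sketch below uses serialize() as a stand-in for the stored format, so the numbers are an approximation rather than what the handler actually writes. The function name sessionSizeReport and the 1 KB unit are assumptions.

```php
<?php
/**
 * Approximate the stored size of a session array and the number of
 * 1 KB units it would occupy.
 */
function sessionSizeReport(array $session, $unitBytes = 1024)
{
    $bytes = strlen(serialize($session));

    return array(
        'bytes' => $bytes,
        'units' => max(1, (int) ceil($bytes / $unitBytes)),
    );
}

// A lean session versus one that (ab)uses the session as an HTML cache.
$lean  = sessionSizeReport(array('user_id' => 42));
$heavy = sessionSizeReport(array('html' => str_repeat('x', 5000)));

printf("lean: %d unit(s), heavy: %d unit(s)\n", $lean['units'], $heavy['units']);
```

In a live application the same function could be called with $_SESSION just before the session is closed.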

3 - Does your application need locking? By default, PHP implements pessimistic locking. The class provided by AWS supports it, but you can set session_locking to false if you don’t need it (this will decrease costs and increase performance). Thorough testing is required to validate your choice. If you do need locking, make sure your application closes sessions as quickly as possible with session_write_close(). For example, group session actions together so that lengthy database queries do not keep the session locked for longer than necessary.
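The "read what you need, then release" pattern might look like the sketch below. The helper name readSessionAndRelease and the session keys are hypothetical. The point is only that session_write_close() runs before the slow work starts.

```php
<?php
/**
 * Copy the requested keys out of the session, then close the session so
 * that any lock is released before slow work begins.
 */
function readSessionAndRelease(array $keys)
{
    $values = array();
    foreach ($keys as $key) {
        $values[$key] = isset($_SESSION[$key]) ? $_SESSION[$key] : null;
    }

    // Writes the session data and ends the session; later changes to
    // $_SESSION in this request will not be saved.
    if (session_status() === PHP_SESSION_ACTIVE) {
        session_write_close();
    }

    return $values;
}

if (PHP_SAPI !== 'cli') {
    session_start();
    $user = readSessionAndRelease(array('user_id', 'locale'));

    // ... lengthy database queries go here, using $user instead of $_SESSION ...
}
```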

About the Authors:

Andreas Chatzakis

Andreas is the CTO and co-founder of Spitogatos.gr / HomeGreekHome.com (a high traffic real estate portal in Greece). His background includes 5 years of consulting @ Accenture NL and he is the organizer of Greece’s AWS Usergroup.

Teo Kotsilinis

Teo is a DevOps professional, with a focus on Web infrastructure, Amazon Web Services and Cloud Management solutions.

 

There are 13 comments.

John Nousis —

Many thanks for this in depth tutorial. It seems that DynamoDB is one of the best ways to scale your app by storing the php sessions there and now there is no excuse for everyone not to use it:)

Jeremy Lindblom —

Great article! A very thorough guide to getting up and running!

Andreas Chatzakis —

Jeremy, thanks for developing the session handler as part of the AWS PHP SDK! I was very impressed with the configurability of the class. 

John, appreciate your feedback & see you on upcoming meetups of the Greek AWS Usergroup! 

Vishesh Joshi —

After registering the db Session handler, do we need to set the session save handler to this handler, or will it take it automatically?

ralph tice —

You probably should also set session.gc_probability = 0
