Developing a Server Asset Scaling Strategy

This is part 2 of a 5-part blog post series. Read part 1 here

Apica’s Amazon Cloud Platform project set out to address “how much power” a company’s web application cloud implementation needs to run applications smoothly. This project uses virtual users with user scenarios created by Apica ZebraTester in a load test generated by Apica LoadTest Portal to determine the points where server performance starts to falter.

Finding the Perfect Fit

The goal of this project was to determine how much server power a web application needs to operate under typical, user-base growth, surge, and massive surge conditions. Once the test has determined how many simultaneous users it takes to see a negative impact on the user experience, it is possible to generate a scaling methodology that automatically assigns more servers and databases to the web application as needed. Having this information in advance works perfectly with the cloud hosting model because businesses can add and remove machines on-demand.
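As a rough illustration of how a load-test result can feed a scaling rule, here is a minimal sketch. It assumes the test has produced a single figure, the number of simultaneous users one server handles before the experience degrades. The function name, the parameters, and the 20 percent headroom default are hypothetical and not part of Apica's tooling.

```python
def servers_needed(concurrent_users: int,
                   users_per_server: int,
                   headroom_percent: int = 20,
                   min_servers: int = 1) -> int:
    """Return how many servers to run for the given user count.

    users_per_server is the per-server limit found by load testing
    (an assumed input). headroom_percent reserves part of that limit
    so new servers come online before existing ones saturate.
    """
    if users_per_server <= 0:
        raise ValueError("users_per_server must be positive")
    if not 0 <= headroom_percent < 100:
        raise ValueError("headroom_percent must be in [0, 100)")

    # Integer ceiling division avoids floating-point rounding surprises.
    numerator = concurrent_users * 100
    denominator = users_per_server * (100 - headroom_percent)
    required = -(-numerator // denominator)
    return max(min_servers, required)
```

With an illustrative limit of 250 users per server and 20 percent headroom, 1,000 simultaneous users would call for five servers. A real policy would also need rules for scaling back down and for scaling the database layer separately.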

When the hosting infrastructure lacks the capacity to handle the user-request load, the implementation fails and becomes unusable. The following chart shows how the number of requests per second the infrastructure can handle at a given user count relates to stability and response time in an unstable environment. As the system becomes overloaded with requests, both instability and response times increase.


In the chart above, throughput peaks at just over 400 requests per second and starts to drop only seconds after the test reaches its full load of 1,600 simultaneous users. Over the next five minutes, session times climb from a few seconds at most to an early peak of more than half a minute two minutes into the test, and the system ultimately fails at the five-minute mark.
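One way to locate the point where performance starts to falter is to scan the load-test samples for the first reading that is either too slow or well below peak throughput. The sketch below assumes samples are available as simple records. The `Sample` type, the field names, and both thresholds are hypothetical choices for illustration, not values taken from the test described here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Sample:
    elapsed_s: int          # seconds since the test reached full load
    requests_per_s: float   # measured throughput
    session_time_s: float   # measured session time


def first_degraded_sample(samples: Sequence[Sample],
                          max_session_time_s: float = 10.0,
                          max_throughput_drop: float = 0.5) -> Optional[Sample]:
    """Return the first sample that breaches either threshold, or None."""
    peak = 0.0
    for sample in samples:
        # Track the highest throughput seen so far.
        peak = max(peak, sample.requests_per_s)
        too_slow = sample.session_time_s > max_session_time_s
        collapsed = sample.requests_per_s < peak * (1 - max_throughput_drop)
        if too_slow or collapsed:
            return sample
    return None
```

Running this over successive tests at different user counts would give the user count at which degradation first appears, which is the figure a scaling rule needs.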

If your application uses more resources than it needs to provide the optimal user experience, the extra resources do not improve the experience at all. Those unused resources waste hardware, power, and space, which ends up costing your business money. According to IT World, around one-third of data center servers sit idle. If you have too few resources, users may experience slow performance or complete system outages, which can lead them to seek out other applications for their needs. Going too far in either direction is bad for business, so determining the ideal amount of server power is crucial.

A Proactive Strategy

The strategy behind the project had three focal points:

  • Capacity Planning: Establish how each system layer scales, and which instances should be used when scaling
  • Identify limitations and bottlenecks within the testing hardware, software, and practices
  • Determine how large a system is needed to handle the initial testing load

The introduction and proliferation of new web-access technology like mobile broadband, smartphones, and tablet devices have permanently changed how often users access web applications. Gone are the days when you only had to worry about usage surges when people were at home or at work; users may access your service at any time from any location, which makes traffic surges larger than ever before. With traffic spikes this dramatic, knowing when and how to scale your company’s server implementations is even more important.

The Importance of Fast Performance

The human experience of using a web application is an extremely important factor in whether the product succeeds. If the servers have enough power to handle users’ demands, those users will find the application fast and smooth to use.

According to the marketing consultants at Kissmetrics, 16 percent of mobile web users will abandon a site that hasn’t loaded after 5 seconds, while an additional 30 percent are willing to wait up to 10 seconds. Nearly half of all users will abandon a mobile site application that takes longer than 10 seconds to load. According to Google, adding just 250 milliseconds to the load time can be a deal-breaker for someone using a website.

As the application takes off, more people will start using it. If the servers are not powerful enough to handle the influx of new users, all users will begin to experience longer load times when using the application. Without a scaling strategy to address the extra demands, the audience is likely to abandon the application and seek out an alternative option.