Infrastructure & Operations

Feb/Mar 2001 COVER STORY


Scaling Your E-business

With load spikes and traffic bursts a fact of life in e-business, using a capacity planning approach based on a reference model for e-business can help you scale up your site.

By Daniel Menascé and Virgílio Almeida



Delays on the Internet frustrate customers and cost e-businesses billions of dollars. While the causes of these delays vary, overloaded networks and servers are the most common ones. The viability of e-business depends on the ability of the IT infrastructure to offer timely and reliable services.

Yet the news is full of reports of horrendous performance by high-profile sites unprepared for an overwhelming and unexpected surge of users. Britannica.com and Victoria's Secret have been well-publicized examples. Another example is the online recruitment firm hotjobs.com, which spent $2 million in TV ads during the 1999 Super Bowl, and subsequently drew hundreds of thousands of visitors to its Web site. The company, however, had neglected to plan the capacity of the site before inviting millions of potential visitors. As a result, many visitors were shut out.

Load spikes and traffic bursts are facts of life in the operation of an e-business. In Election 2000, for example, sites like ABC, Yahoo, Fox, CNN, and others saw their traffic increase by 300%.


In the e-tail arena, e-commerce sites in 2000 saw the same influx of shoppers on the day after Thanksgiving, traditionally one of the busiest shopping days, as brick-and-mortar retailers did, according to NetRatings, Milpitas, Calif., with spikes in apparel and consumer electronics in particular. Apparel sites were the hottest category, with an overall rise of 68% on that Friday following Thanksgiving, as compared to the rest of the week. Landsend.com skyrocketed 93%, followed by Gap.com, 86%, and Spiegel.com, 85%. Consumer electronics sites rose 46%, with CircuitCity.com jumping 126% in unique audience at home on Friday. Outpost.com rose 48%, and 800.com increased 40% in traffic.

To accommodate these types of traffic spikes, e-businesses must be able to build architectures that can scale many times in a short period of time. Many e-tailers have learned this the hard way, however. According to Jupiter Research, New York City, slow site performance plagued e-tailers in 1999 and was considered the third top online shopping problem during Christmas. During the most recent holidays, many sites briefly shut down due to excess traffic.

Developing an infrastructure that is rapidly scalable across local and wide area networks, in a cost-effective way, requires a good understanding of system capacity modeling and planning, an area where many e-businesses are apparently lacking. This article details the most common causes that prevent a site from scaling up and presents a road map to scalability: a capacity planning methodology. A reference model for e-business (see Figure, below) provides a framework for the scalability analysis methods discussed here. A capacity planning approach based on this reference model offers a way to determine scalability problems and to plan alternatives for scaling an e-business system.


Figure: Model for E-business Planning

Three Requirements

Web-based services are offered to potentially tens of millions of users by hundreds of thousands of servers (i.e., content providers and service providers). Users and servers are connected through the Internet. These users or customers count on being able to access any service, anytime. Customers' increasing reliance on information-based services means that e-businesses, and the services they provide, must meet three requirements: availability, scalability, and cost-effectiveness.

Availability means that users and customers can count on being able to access any Web service from anywhere, anytime, regardless of the load at both the Web site and the network. Availability also means that services are provided with quality; i.e., short and predictable response time. Scalability means that servers should be able to provide services to all potential customers, whose number is fast-growing and unknown for a company. Cost-effectiveness means that quality of Web services, represented by availability and fast response times, should be achieved with an IT infrastructure that minimizes cost.

The Scalability Problem

The quality of service of e-business sites depends on many interrelated factors, such as site architecture, network capacity, and system software structure. Unpredictable public behavior compounds the complexity of an e-business site. Usage patterns can change overnight, with spikes in demand occurring for several reasons. For instance, breaking news always causes bursts of traffic on online editions of major newspapers. Or, when a company runs a marketing campaign, the capacity of its site may be inadequate to support the huge number of visitors who react to the campaign.

All of these characteristics of e-business clearly indicate that quantitative techniques are needed to manage the behavior of online companies and to guarantee quality of service. Mission-critical e-business sites require careful planning and design to ensure that the application delivers reliable and scalable services. Organizations must analyze the entire end-to-end system, plus understand and document the characteristics and performance of applications, servers, networks, load balancers, and firewalls. However, in many cases, scalability cannot be achieved because of the existence of bottlenecks—hardware and software resources that limit overall system performance.

Performance analysis is a key technique for understanding scalability problems in e-business. Because it is difficult to estimate traffic, e-business sites must be designed with scalability in mind. In other words, a designer of an online business must know a priori the limits of the system. For instance, a designer must know the maximum number of transactions per second the system can process (i.e., an upper bound on throughput) or the minimum response time that can be achieved by the business site (i.e., a lower bound on response time).

Performance bounding techniques allow designers to calculate optimistic and pessimistic bounds. The former refers to the best possible performance values. Throughput upper bounds and response-time lower bounds are optimistic bounds. Pessimistic bounds refer to the worst possible performance values. Scalability analysis refers to techniques that find a single bottleneck that cannot be sped up. When a bottleneck cannot be removed, the system is considered nonscalable in terms of performance. In short, managers must be aware in advance of the capacity limitations of their e-business systems.

What does "scalable" mean exactly? For purposes of this article, a system is scalable if there is a "straightforward" way to upgrade it to handle an increase in traffic while maintaining adequate performance. Straightforward means that no system or software architectural changes should be required to scale the system. Examples of straightforward changes are: adding more servers to a system that already employs multiple servers, adding more CPUs to a multiprocessor, and replacing existing servers with faster servers that use the same architecture.

One approach to upgrading capacity is scaling horizontally or scaling out, which means adding more servers of the same type. Scaling vertically or scaling up means replacing the existing servers with faster ones.

Another not-so-straightforward way of dealing with increasing demand is to distribute some of the e-business functions and services provided by a site. The unpredictable server demands on the Internet generate peaks and valleys in demand for content and services. Companies are using new ways of setting up their infrastructures. Managers are no longer just throwing Web servers at Web performance problems. Instead, they are changing the way their networks and servers handle traffic. Web site capacity has been improved with new devices aimed at getting specific parts of the job done faster. Load balancers, traffic switches, caches, and secure transaction processors are new components that help e-business sites meet corporate and customer demand.

While neither caching nor load balancing is a new concept, both are being used on the Internet in new ways. Caching, which stores the most recently used Web pages to speed retrieval, reduces network traffic by moving data closer to the users who are accessing it. Caching reduces network congestion because data does not have to travel as far across the public Internet or across an enterprise network to reach the person who needs it.
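The idea of keeping the most recently used pages close at hand can be sketched in a few lines of Python. This is only an illustration of a least-recently-used eviction policy, not a description of any vendor's product; the class name PageCache, the fetch callable, and the URLs are all hypothetical.

```python
from collections import OrderedDict


class PageCache:
    """Tiny least-recently-used cache for Web pages (illustrative only)."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch          # hypothetical callable that hits the origin server
        self.pages = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.pages:
            # Cache hit: mark the page as most recently used.
            self.pages.move_to_end(url)
            self.hits += 1
            return self.pages[url]
        # Cache miss: go to the origin, then store the result.
        self.misses += 1
        page = self.fetch(url)
        self.pages[url] = page
        if len(self.pages) > self.capacity:
            # Evict the least recently used page.
            self.pages.popitem(last=False)
        return page


origin_calls = []


def fetch_from_origin(url):
    # Stand-in for a slow trip across the network to the origin server.
    origin_calls.append(url)
    return "<html>" + url + "</html>"


cache = PageCache(capacity=2, fetch=fetch_from_origin)
for url in ["/home", "/sale", "/home", "/cart", "/sale"]:
    cache.get(url)

print(cache.hits, cache.misses, list(cache.pages))
```

Every hit is a request that never reaches the origin server, which is where the reduction in network traffic comes from. With only two slots the sample request stream evicts pages quickly; a real cache would be sized from measured access patterns.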

A high volume of transactions, though, requires massive computing power: lots of servers and good Web connections. But if a site provides a lot of images or other large-file content, it makes sense to distribute these servers around the country or the world. Many companies are doing this without maintaining multiple physical server sites by using a distributed content service, such as the ones offered by Akamai Technologies Inc., Cambridge, Mass.; CacheFlow Inc., Sunnyvale, Calif.; InfoLibria Inc., Waltham, Mass.; and epicRealm, Richardson, Texas.

Delayed response is magnified if a site delivers streaming audio or video. Strategies to deal with streaming media problems include multiple caching (either within a site or via an external network), "overflow" servers, and satellite transmission.

It is increasingly common for Internet tools to include load balancing capabilities. Server load balancing is a traffic management function used to distribute traffic evenly throughout a network, avoiding bottlenecks or overloading servers. These infrastructure technologies use different methods to meet the same goal: getting the best-possible performance out of an increasingly crowded Internet.
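One simple way to spread requests evenly is to send each new request to the server with the fewest requests in progress. The Python sketch below assumes that policy, often called least connections; it is one of several possible policies and is not drawn from any particular product. The server names are hypothetical.

```python
class LeastConnectionsBalancer:
    """Sends each request to the server with the fewest active requests."""

    def __init__(self, servers):
        # Number of requests currently in progress on each server.
        self.active = {name: 0 for name in servers}

    def acquire(self):
        # Pick the least-loaded server; ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a request finishes so the server becomes eligible again.
        self.active[server] -= 1


balancer = LeastConnectionsBalancer(["web1", "web2", "web3"])
assigned = [balancer.acquire() for _ in range(4)]
balancer.release("web2")          # web2 finishes its request first
next_server = balancer.acquire()  # so it receives the next one

print(assigned, next_server)
```

A production load balancer would also track server health and capacity differences, which this sketch leaves out.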

An additional technology, content management, does not directly affect network performance, but it is being integrated with caching, load balancing, and policy management systems with increasing frequency because it is a critical piece of the e-business infrastructure puzzle. Content management systems are used to oversee the way an enterprise generates, compiles, publishes, updates, and, ultimately, removes Web content from its sites.

Planning Ahead

Planning techniques should help management answer the following typical questions associated with the notion of scalability.

  • Is the online trading site prepared to accommodate the surge in volume that may increase the number of trades per day by up to 75%?
  • Are there enough servers to handle a peak of customers 10 times greater than the monthly average?
  • How can the organization guarantee the quality of electronic customer service for the different scenarios of traffic growth? In a B2B environment, sending and receiving sensitive data, conducting financial transactions, and exchanging credit and production data depend on the secure and fast transmission of information.

E-business sites may become popular very quickly. How fast can the site architecture be scaled up? What components of the site should be upgraded? Database servers? Web servers? Application servers? Network link bandwidth?

Traditional capacity planning approaches tend to be IT resource-centric. However, in e-business systems, an organization has to consider all aspects of the problem: business, functional, customer behavior, and IT resources. For an overall view of e-business, a reference model can be used to analyze e-business sites and plan their capacity and scalability properties. The reference model for e-business, shown in the figure above, creates a framework for a quantitative approach. It also provides a basis for defining conceptual activities in the electronic business and for identifying improvement opportunities.

The reference model consists of four layers grouped into two main blocks. The upper block focuses on the nature of the business and the processes that provide the services offered by the e-business site. The lower block concentrates on the way customers interact with the site and the demand they place upon the resources of the site infrastructure. Each layer of the reference model is associated with two broad classes of descriptors and metrics used to provide a quantitative characterization of the layer.

External metrics and descriptors cover the nature of the business and are visible to management and customers. These metrics are used to assess the performance of the business processes. For instance, an organization could use a metric for e-business that reflects at the same time the behavior of the online store and its customers. One such metric is revenue throughput, measured in dollars/second generated by completed online transactions. Other external metrics could be availability, download times, page views/day, and unique visitors/day.
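As a rough illustration, revenue throughput can be computed by dividing the dollar value of the transactions completed during a measurement window by the length of that window. The order values and the 400-second window in this Python sketch are made up for the example.

```python
def revenue_throughput(completed_order_values, window_seconds):
    """Dollars per second generated by completed online transactions."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return sum(completed_order_values) / window_seconds


# Hypothetical dollar values of orders completed in a 400-second window.
orders = [20.0, 45.0, 120.5, 14.5]
rate = revenue_throughput(orders, window_seconds=400)

print(rate)  # dollars per second
```

Only completed transactions count toward the metric, so abandoned shopping carts and timed-out checkouts lower it even when page views stay high.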

External descriptors give a quantitative overview of the business. For example, external descriptors include information such as the number of registered customers, number of potential customers, maximum number of simultaneous customers in the store, number of items, estimated operational cost, and services available to customers.

Internal descriptors and metrics characterize the site infrastructure and the way customers use services and resources. Internal metrics are oriented to measure the performance of applications and of the IT infrastructure. Examples of such metrics include HTTP requests/second, database transactions/second, server response time, transaction response time, and processor, server, disk, and network utilization. Internal descriptors also include application and architecture information, such as navigational structure, customers' navigation patterns, and characteristics of the components that make up the site.

Capacity Planning

Capacity planning is the process of predicting when future load levels will saturate the system, and of determining the most cost-effective way of delaying system saturation as much as possible. Future load levels are generally a function of a combination of three factors: natural evolution of existing loads, deployment of new applications and services, and changes in customer behavior. This last factor includes traffic surges due to new situations (e.g., breaking news, TV ad campaigns, or the release of a new product) as well as changes in customer navigational patterns due to the availability of new business functions.
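A back-of-the-envelope version of the saturation question can be sketched in Python. The sketch assumes the load grows at a fixed compound monthly rate, which covers only the first of the three factors above; the load, capacity, and growth figures are hypothetical.

```python
def months_until_saturation(current_load, capacity, monthly_growth):
    """Return the first month in which the projected load reaches capacity.

    Assumes compound growth at a fixed monthly rate. Returns None when
    the load never reaches capacity under that assumption.
    """
    if current_load <= 0 or capacity <= 0:
        raise ValueError("load and capacity must be positive")
    if current_load >= capacity:
        return 0
    if monthly_growth <= 0:
        return None
    months = 0
    load = current_load
    while load < capacity:
        load *= 1 + monthly_growth
        months += 1
    return months


# Hypothetical site: 10 requests/second today, saturates at 20,
# with load growing 10% per month.
months_left = months_until_saturation(10.0, 20.0, 0.10)

print(months_left)
```

A projection like this says nothing about surges from breaking news or ad campaigns, which is why the peak load has to be planned for separately.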

Prediction is key to capacity planning because an organization needs to be able to determine how an e-business site will react when changes in load levels and customer behavior occur, or when new business models are developed. This determination requires predictive models and not experimentation.

According to the reference model for e-business, a capacity planning methodology must cover the business level, functional level, customer behavior level, and resource level. The business level deals with the understanding of how new business initiatives may affect the site load. An example of a business-level decision could be a company that decides to increase the degree of security of its transactions. The functional level addresses the e-business functions that support current and future business models. At this level, the reference model would be used to specify the functions that would be affected by the increasing security plan.

At the customer behavior level, an organization must analyze how new site functions will result in new navigational patterns. For example, at this level the planners would need to represent the changes in customer behavior due to the increased security mechanisms.

Finally, IT resource planning deals with the IT resources used to support the e-business. At this level, the capacity planning analysts would have to estimate the additional resource cost of executing new functions under a security model, such as SSL. The main steps involved at this level are: IT environment characterization, workload characterization, performance prediction/modeling, and what-if analysis.

IT Environment Characterization. This step generates a description of the IT infrastructure as well as a workload description. The description of the IT infrastructure includes the type of hardware (e.g., server machines, disk farms, routers, load balancers, firewalls), servers (e.g., Web servers, application servers, database servers, domain name servers), software (e.g., operating systems, middleware, database management systems), network connectivity, network protocols, and payment services.

In this step, an organization must also determine the service-level agreements (SLAs) that the e-business has to meet. Examples of SLAs include: "End-to-end response time for search requests must be less than 8 seconds, 90% of the time;" "Application availability must be at least 99.5%;" and, "Disaster recovery time must not exceed 30 minutes."
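SLAs of this kind are straightforward to check against measurements. The Python sketch below tests the response-time and availability examples quoted above; the sample response times and the three hours of downtime in a 30-day month are invented for illustration.

```python
def meets_response_time_sla(samples, limit_seconds=8.0, required_fraction=0.90):
    """True if enough measured response times fall below the limit."""
    if not samples:
        raise ValueError("need at least one measurement")
    within_limit = sum(1 for s in samples if s < limit_seconds)
    return within_limit / len(samples) >= required_fraction


def availability(uptime_seconds, total_seconds):
    """Fraction of the measurement period during which the site was up."""
    return uptime_seconds / total_seconds


# Hypothetical end-to-end response times (seconds) for search requests.
search_times = [1.2, 2.5, 3.1, 0.9, 7.9, 4.4, 2.2, 9.5, 1.8, 3.3]
search_ok = meets_response_time_sla(search_times)

# Hypothetical month: 30 days with 3 hours of downtime.
month_seconds = 30 * 24 * 3600
uptime_fraction = availability(month_seconds - 3 * 3600, month_seconds)
availability_ok = uptime_fraction >= 0.995

print(search_ok, round(uptime_fraction, 4), availability_ok)
```

Nine of the ten sample requests finish in under 8 seconds, so the search SLA is just met; one more slow request would break it.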

Workload Characterization. Workload characterization is the process of precisely describing, in a qualitative and quantitative manner, the global workload of an e-business site. Load generators can be useful in this process. Many existing tools (see Table: Representative E-business Tools and Services) allow users to capture typical transactions and build parameterized scripts used to stress test the site at different simulated load levels. Stress testing is extremely time-consuming, however, and exhaustive testing of myriad factors is very difficult to carry out. Characterizing and understanding the workload is therefore essential to performing stress testing.

Testing should be part of a capacity planning effort. By itself, however, testing is not capable of predicting capacity/performance tradeoffs.

Performance Prediction/Modeling. Modeling is critical to understanding and predicting the performance characteristics of Web services. Two types of modeling techniques are commonly used for capacity planning: simulation and analytic models. Simulation models mimic detailed interactions and dependencies between server components and may require the collection of very detailed parameters. Analytic models require higher levels of abstraction and can be used to generate quick answers to many different configuration scenarios. Predictive modeling tools are available from many vendors.

E-commerce sites with their unpredictable traffic spikes bring new challenges to performance modeling. Detailed and costly modeling analysis may not be worthwhile when the capacity planning analyst faces a large number of possible future scenarios. Quick bounding studies may be the right solution for these cases.

Consider an e-commerce site that is preparing for a surge of customers due to a special event, such as the Olympic Games, the World Soccer Cup, or an ad campaign. Management does not know how many customers will be attracted to the site. Some analysts estimate that the campaign could add a number of customers that varies between 100,000 and one million new visitors per day. Developing a detailed model to calculate that the proposed system will support 11.725 customers per second may be overkill. Simply knowing that the site can serve approximately nine customers per second for one alternative, or 28 for another alternative, is the right level of information to select one option over another. The options could be adding eight or 15 additional application servers to the current site configuration.
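The comparison above can be sketched in Python. The sketch makes two loud simplifications that are not in the scenario itself: it treats the estimated new visitors as the whole load, and it spreads them evenly over the day unless a peak-to-average factor is supplied. The names "alternative 1" and "alternative 2" are placeholders for the two options.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def arrival_rate(visitors_per_day, peak_to_average=1.0):
    """Customers per second, optionally scaled by a peak-to-average factor."""
    return visitors_per_day * peak_to_average / SECONDS_PER_DAY


low = arrival_rate(100_000)     # low end of the analysts' estimate
high = arrival_rate(1_000_000)  # high end of the analysts' estimate

# Approximate capacities of the two alternatives, in customers/second.
capacities = {"alternative 1": 9, "alternative 2": 28}
covers_high_estimate = {
    name: capacity >= high for name, capacity in capacities.items()
}

print(round(low, 2), round(high, 2), covers_high_estimate)
```

Under these assumptions, one million visitors a day averages out to roughly 11.6 customers per second, so the nine-per-second alternative falls short of even the average rate, while the 28-per-second alternative leaves room for a peak of a little more than twice the average.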

Consider the following example of bounding analysis. The search e-business function requires 0.05 seconds of disk I/O on average. Now consider that disk I/O is the bottleneck for this type of transaction. Then, according to the bounding analysis models (see Scaling for E-business: Technologies, Models, Performance, and Capacity Planning, Menascé and Almeida, Prentice Hall, 2000), the maximum throughput is the inverse of the total time spent at the bottleneck resource. In this example, this leads to 20 (= 1/ 0.05) search requests/second.
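The arithmetic is simple enough to check in a couple of lines of Python. The 0.05-second disk demand comes from the example above; the optional parallel_units parameter is an added assumption that models work spread perfectly evenly over several identical disks.

```python
def max_throughput(bottleneck_seconds, parallel_units=1):
    """Upper bound on throughput: the inverse of the time spent at the bottleneck.

    parallel_units assumes perfectly even balancing across identical
    units, which real systems only approximate.
    """
    if bottleneck_seconds <= 0 or parallel_units < 1:
        raise ValueError("invalid service demand or unit count")
    return parallel_units / bottleneck_seconds


search_bound = max_throughput(0.05)         # disk I/O per search request
two_disk_bound = max_throughput(0.05, 2)    # hypothetical: two balanced disks

print(search_bound, two_disk_bound)  # search requests/second
```

The bound is optimistic: it says the site cannot exceed this rate, not that it will reach it.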

What-If Analysis. How can IT justify to higher levels of management an enormous dollar amount for site expansion without showing any analytics? The capacity planning methodology should be able to show the cost-effectiveness of the proposed solution, answering various what-if questions.

The IT infrastructure of e-business sites is multitiered, as illustrated in the Figure below. When analyzing e-business scalability, one has to take into account the flow of e-business transactions across the various layers. It is important to detect the bottlenecks; i.e., resources where transactions spend most of the service time.


Multitiered Architecture


E-business sites are composed of servers organized in a multitiered architecture. A load balancer distributes incoming requests to any of the Web and authentication servers at the first layer. The next level is composed of transaction servers, which implement the business logic and may need data from back-end database servers.


Once a bottleneck is detected, performance improvements must be achieved by first improving the time spent by transactions at the bottleneck. Any effort on other resources may prove to be futile and wasteful of time and money.
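A short Python sketch shows why effort spent away from the bottleneck is wasted. The per-transaction service demands below are hypothetical; only the principle, that the largest demand sets the throughput bound, comes from the bounding analysis described earlier.

```python
def find_bottleneck(demands):
    """Return the resource with the largest per-transaction service demand."""
    return max(demands, key=demands.get)


def throughput_bound(demands):
    """Maximum transactions/second: inverse of the bottleneck demand."""
    return 1.0 / max(demands.values())


# Hypothetical seconds of service per transaction at each resource.
baseline = {"web": 0.010, "app": 0.020, "db_disk": 0.050}

# Upgrade 1: halve the time spent at the Web server (not the bottleneck).
faster_web = dict(baseline, web=0.005)

# Upgrade 2: halve the time spent at the database disk (the bottleneck).
faster_disk = dict(baseline, db_disk=0.025)

print(find_bottleneck(baseline))
print(throughput_bound(baseline))     # bound before any upgrade
print(throughput_bound(faster_web))   # unchanged by the Web server upgrade
print(throughput_bound(faster_disk))  # doubled by the disk upgrade
```

After the disk upgrade the disk is still the largest demand, so the next round of improvement would start there again; had its demand dropped below 0.020 seconds, the application server would have become the new bottleneck.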


Guaranteeing SLAs

By monitoring traffic at different points of presence, organizations have better control over how the traffic crosses different service providers' networks. Current monitoring tools vary in the way they conduct measurement. Some measure the site as a black box, using agents located in many different geographic regions and connected through both slow and fast links. This approach helps site managers to understand the behavior of the e-business as perceived by clients worldwide.

Other tools also monitor the site from the inside. They measure response time and availability at each server as well as application-level failures.

The Bottom Line

The bottom line in e-business is generating revenue out of Web traffic. This can only be achieved if the IT infrastructure is ready to provide customers with a pleasant experience and with service of high quality. Competitors in e-business are a click away. The IT infrastructure of e-business sites is complex enough to preclude any guesswork when it comes to capacity planning. When planning the site capacity, it is very important to make sure that the site can handle the peak—not just the average—load.

Daniel Menascé is a professor of computer science at George Mason University and the co-director of the E-Center for E-Business. He holds a Ph.D. in computer science from UCLA. He has published and consulted extensively in the areas of Web and e-commerce performance, capacity planning, and software performance engineering. E-mail him at menasce@cs.gmu.edu.

Virgílio A. F. Almeida is a professor of computer science at the Federal University of Minas Gerais (UFMG), Brazil. He holds a Ph.D. in computer science from Vanderbilt University. He has published and consulted extensively in the area of distributed systems and Internet performance. E-mail him at virgilio@dcc.ufmg.br.

