
How To: Conduct Performance Testing

J.D. Meier, Prashant Bansode, Scott Barber, Mark Tomlinson

Applies To

  • Applications Performance Testing
  • Capacity Planning
  • Load Testing
  • Stress Testing

Summary

This How To provides a high-level introduction to a basic approach to performance-testing your applications and the systems that support those applications. Performance testing is typically done to help identify bottlenecks in a system, establish a baseline for future testing, determine compliance with performance goals and requirements, and collect other performance-related data to help stakeholders make informed decisions related to the overall quality of the application being tested. In addition, the results from performance testing and analysis can help you to estimate the hardware configuration required to support the application(s) when you “go live” to production operation.

Contents

  • Objectives
  • Overview
  • Summary of Steps
  • Step 1. Identify Desired Performance Characteristics
  • Step 2. Identify Test Environment
  • Step 3. Create Test Scripts
  • Step 4. Identify Metrics of Interest
  • Step 5. Create Performance Tests
  • Step 6. Execute Tests
  • Step 7. Analyze the Results, Report and Retest
  • Resources

Objectives

  • Gain familiarity with performance testing fundamentals.
  • Learn a basic approach to performance-testing Web applications.
  • Learn how to establish performance testing baselines.

Overview

At its most basic level, the performance testing process can be viewed in terms of planning and preparation, test execution, data analysis, and results reporting. These activities occur multiple times, more or less in parallel. In some cases the activities are applied sequentially and are repeated until performance testing is deemed complete. In the interest of simplicity, this How To follows this type of sequential and iterative approach. The activities below are represented as a simple but effective sequence of steps that are easy to apply to any performance testing project, from small-scale unit-level performance testing to large-scale production simulation and capacity-planning initiatives. The advantage of this approach lies in the fact that the same overall concepts and activities can be applied equally effectively to both expected cases and unexpected, exceptional cases, rather than prescribing specific actions for every possible circumstance.

Performance testing can be thought of as the process of identifying how an application responds to a specified set of conditions and input. To accomplish this, multiple individual performance test scenarios (such as suites, cases, and scripts) are often needed to cover the most important conditions and/or input of interest. To improve the accuracy of a performance test’s output, the application should, if at all possible, be hosted on a hardware infrastructure that is separate from your production environment, while still providing a close approximation of the actual environment. By examining your application’s behavior (the output) under simulated load conditions (the input), you usually can identify whether your application is trending toward or away from the desired performance characteristics.

Some of the most common reasons for conducting performance testing can be summarized as follows:
  • To compare the current performance characteristics of the application with the performance characteristics that equate to end-user satisfaction when using the application.
  • To verify that the application exhibits the desired performance characteristics within the budgeted constraints of resource utilization. These performance characteristics may include several different parameters, such as the time it takes to complete a particular usage scenario (known as response time) or the number of simultaneous requests that can be supported for a particular operation at a given response time. The resource constraints are typically expressed in terms of server resources such as processor utilization, memory, disk input/output (I/O), and network I/O.
  • To analyze the behavior of the Web application at various load levels. The behavior is measured in metrics related to performance characteristics, as well as other metrics that help to identify bottlenecks in the application.
  • To identify bottlenecks in the Web application. Bottlenecks can be caused by several issues such as memory leaks, slow response times, or contention under load.
  • To determine the capacity of the application’s infrastructure and to determine the future resources required to deliver acceptable application performance.
  • To compare different system configurations to determine which one works best for both the application and the business.

Performance testing of Web applications is frequently subcategorized into several types of tests. Two of the most common are load tests and stress tests. Additionally, performance testing can add value at any point in the development life cycle; for example, performance unit testing frequently occurs very early in the life cycle, while endurance testing is generally saved for very late in the life cycle. (For more information, see Explained: Types of Performance Testing.)

Input

Common input items for performance testing include:
  • Desired application performance characteristics
  • Universal metrics of interest
  • Application usage characteristics (scenarios)
  • Workload characteristics
  • Test environment configuration
  • Production environment configuration
  • Performance test plans, techniques, tools, and strategies

Output

Common output items for performance testing include:
  • Result data representing the behavior and performance characteristics of the application under various load and usage conditions
  • Bottleneck suspects deserving additional analysis
  • Updated test plans and performance testing priorities
  • Estimates or predictions about production application performance characteristics
  • Current operating capacity

Steps

  • Step 1. Identify Desired Performance Characteristics
  • Step 2. Identify the Test Environment
  • Step 3. Create Test Scripts
  • Step 4. Identify Metrics of Interest
  • Step 5. Create Performance Tests
  • Step 6. Execute Tests
  • Step 7. Analyze Results, Report, and Retest

Step 1. Identify Desired Performance Characteristics

You should start identifying, or at least estimating, the desired performance characteristics early in the application development life cycle. Record the performance characteristics that your users and stakeholders would equate to a successfully performing application, in a manner that is appropriate to your project’s standards and expectations.
Characteristics that frequently correlate to a user’s or stakeholder’s satisfaction typically include:
  • Response time. For example, the product catalog must be displayed in less than three seconds.
  • Throughput. For example, the system must support 100 transactions per second.
  • Resource utilization. For example, CPU utilization is not more than 75 percent. Other important resources that need to be considered for setting objectives are memory, disk I/O, and network I/O.

{See How To: Quantify End-User Response Time Goals and How To: Identify Performance Testing Objectives for more information about capturing and recording desired performance characteristics.}
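
These characteristics can also be captured in a form that later steps can evaluate automatically. The following is a minimal sketch in Python; the class, objective names, and threshold values are illustrative, not part of any prescribed format:

    # Minimal sketch: record desired performance characteristics as data so that
    # later analysis can compare measured results against them. The objective
    # names and threshold values below are illustrative only.
    from dataclasses import dataclass

    @dataclass
    class PerformanceObjective:
        name: str                        # e.g., "Catalog page response time"
        threshold: float                 # acceptable limit
        unit: str                        # seconds, requests/sec, percent, ...
        higher_is_better: bool = False   # True for throughput-style objectives

    OBJECTIVES = [
        PerformanceObjective("Catalog page response time", 3.0, "seconds"),
        PerformanceObjective("Order throughput", 100.0, "requests/sec", higher_is_better=True),
        PerformanceObjective("Processor utilization", 75.0, "percent"),
    ]

    def meets_objective(obj: PerformanceObjective, measured: float) -> bool:
        """Return True if the measured value satisfies the objective."""
        return measured >= obj.threshold if obj.higher_is_better else measured <= obj.threshold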

Step 2. Identify Test Environment

The degree of similarity between the hardware and network configuration of the application under test conditions and under actual production conditions is often a significant consideration when deciding what performance tests to conduct and what size loads to test. It is important to remember that it is not only the physical environment that impacts performance testing, but also the business or project environment.

In addition to the physical and business environments, you should consider the following when identifying your test environment:
  • Identify the amount of test data needed for each parameter. Determine what kind of data the application consumes at each step of activity through the system. How many records move through the end-to-end transaction? How big are the queried result sets? How much unique data will you need to feed the test automation in order to emulate real-world conditions?
  • Identify critical system components. Does the system have any known bottlenecks or weak points? Are there any integration points that are beyond your control for testing?

Identify Physical Environment

The key factor in identifying your test environment is to completely understand the similarities and differences between the test and production environments. Some critical factors to consider are:
  • Machine configuration
  • Machine hardware (processor, RAM, etc.)
  • Overall machine setup (software installations, etc.)
  • Network architecture and end-user location

Identify the Business Environment

Consider the following test project practices:
  • Document test team roles and contacts. For example, the network expert, the database specialist, the developer, the test scripters, the business analysts, the project manager, the CIO, and so on.
  • Document risks that may result in failure of the testing project. For example, inability to obtain lab resources, tool failure, communications issues, and so on.

Considerations

  • Although few performance testers install, configure, and administrate the application being tested, it is beneficial for them to have access to the servers, software, and administrators who do.
  • Recommendations for configuring the load generation tool:
    • Performance testing is frequently conducted on an isolated network segment to prevent disruption of other business operations. If this is not the case for your test project, ensure that you obtain permission to generate loads during certain hours on the available network.
    • Get to know the IT staff. You will likely need their support to perform tasks such as monitoring overall network traffic and configuring your load-generation tool to simulate a realistic number of Internet Protocol (IP) addresses.
    • Remember to figure out how to get load balancers to treat the generated load as though it were an actual user load.
    • Validate that firewalls, Domain Name System (DNS), routing, and so on treat the generated load like a load that would typically be encountered in an actual production environment, and that the test environment is treated similarly to the production environment.
    • Determine how much load you can generate before the load generation itself becomes a bottleneck (see the monitoring sketch after this list).
  • It is often appropriate to have systems administrators set up resource monitoring software, diagnostic tools, and other utilities on the servers hosting the application under test (AUT).
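
One way to act on the consideration about load generation becoming a bottleneck is to monitor the load-generation machines themselves while a test runs; once a generator approaches saturation, the response times it reports are no longer trustworthy. The following is a rough sketch that assumes the third-party psutil package is installed on the load generator; the 80 percent ceiling is an illustrative value, not a standard:

    # Sketch: sample the load generator's own CPU and memory so you can tell when
    # load generation, rather than the application under test, becomes the
    # bottleneck. Requires the third-party 'psutil' package; stop it with Ctrl+C.
    import psutil

    def watch_load_generator(interval_seconds: float = 5.0, cpu_ceiling: float = 80.0):
        while True:
            cpu = psutil.cpu_percent(interval=interval_seconds)  # averaged over the interval
            mem = psutil.virtual_memory().percent
            print(f"load generator: cpu={cpu:.0f}% mem={mem:.0f}%")
            if cpu >= cpu_ceiling:
                print("WARNING: load generator CPU is near saturation; "
                      "results gathered beyond this point may be unreliable.")

    if __name__ == "__main__":
        watch_load_generator()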

Step 3. Create Test Scripts

Key user scenarios for the application typically surface during the process of identifying the desired performance characteristics of the application (Step 1). If this is not the case for your test project, you will need to explicitly determine the usage scenarios that are the most valuable to script. {For more information, see How To: IdentifyKeyScenariosThatHasn’tBeenWrittenYet.} To create test scripts from the identified or selected scenarios, most performance testers follow an approach similar to the following:
  • Identify the activities or steps involved in each of the scenarios; for example, the “Place an Order” scenario for an e-commerce application may include the following activities:
    • Log on to the application.
    • Browse a product catalog.
    • Select a product.
    • Place order, and so on.
  • For each activity or step, consider and design the data associated with that step. A common method is to use a table similar to the one below:

Scenario Step              | Data Inputs                                            | Data Outputs
Log on to the application  | Username (unique); Password (matched to username)      | (none)
Browse a product catalog   | Catalog Tree/Structure (static); User Type (weighted)  | Product Description; Sku#; Catalog Page Title; Advertisement Category


Only after you have detailed the individual steps can you effectively and efficiently create a test script that emulates the necessary requests against the application to accomplish the scenario, as sketched below. {For more information, see HowTo:blah}
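
The following is a rough, hand-rolled sketch of such a script for the “Place an Order” scenario. Commercial and open-source load-testing tools generate an equivalent artifact for you; the base URL, request paths, parameter names, and the users.csv data file used here are hypothetical:

    # Sketch of a scripted "Place an Order" scenario with parameterized test data.
    # The URLs, parameters, and credentials file are hypothetical placeholders.
    import csv
    import random
    import time
    import urllib.parse
    import urllib.request

    BASE_URL = "http://testserver.example/shop"    # hypothetical application under test

    def timed_get(path: str, params: dict) -> float:
        """Issue one GET request and return its response time in seconds."""
        url = f"{BASE_URL}/{path}?{urllib.parse.urlencode(params)}"
        start = time.perf_counter()
        with urllib.request.urlopen(url) as response:
            response.read()                        # consume the body, as a browser would
        return time.perf_counter() - start

    def place_order_scenario(credentials: dict) -> dict:
        """Run one iteration of the scenario and return the response time of each step."""
        timings = {}
        timings["logon"]  = timed_get("logon",   credentials)     # unique user per iteration
        timings["browse"] = timed_get("catalog", {"page": random.randint(1, 50)})
        timings["select"] = timed_get("product", {"sku": random.choice(["A100", "B200", "C300"])})
        timings["order"]  = timed_get("order",   {"qty": 1})
        return timings

    if __name__ == "__main__":
        # Unique data per iteration helps avoid unrealistic server-side caching
        # (see the first consideration below).
        with open("users.csv", newline="") as f:   # hypothetical data file: username,password
            users = list(csv.DictReader(f))
        for user in users:
            print(place_order_scenario(user))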

Considerations

  • If the request accepts parameters, ensure that the parameter data is populated properly with random and/or unique data to avoid any server-side caching.
  • Remember to account for user abandonment, if it applies to your application.
  • It is useful to create the test script in such a way that it can optionally execute multiple iterations without end, thereby facilitating your ability to vary test duration.
  • If appropriate, set a delay for each iteration of the transaction; this will serve as a control on the test.
  • If the tool does not do so automatically, you will likely want to add a wrapper around the requests in the test script in order to measure the request response time.
  • Beware of allowing your tools to influence your test design. Better tests almost always result from designing tests on the assumption that they can be executed and then adapting the test or the tool when that assumption is proven false, rather than by not designing particular tests based on the assumption that you do not have access to a tool to execute the test.
  • It is generally worth taking the time to make the script match your designed test, rather than changing the designed test to save scripting time.
  • Significant value can be gained from evaluating the data collected from executed tests in order to test or validate script development.
  • Ensure that your test design documents what the script is actually doing, whether in the form of diagrams, e-mail, sketches, or other notes.

Step 4. Identify Metrics of Interest

When identified and captured correctly, metrics provide information about how your application’s performance compares to your performance objectives. In addition, metrics can help you identify problem areas and bottlenecks within your application.

Using the desired performance characteristics identified in step 1, identify metrics to be captured that focus on measuring performance and identifying bottlenecks in the system.
When identifying metrics, you will use either the objectives themselves or indicators that are directly or indirectly related to those objectives. The following table presents examples of metrics corresponding to the performance objectives identified in step 1.

Metric                 | Accepted level
Request execution time | Must not exceed 8 seconds
Throughput             | 100 or more requests per second
% Processor Time       | Must not exceed 75 percent
Available memory       | At least 25 percent of total RAM
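
As a rough illustration of capturing resource metrics for later analysis, the sketch below samples processor time and available memory on the machine where it runs and writes them to a CSV file. It assumes the third-party psutil package; in practice, platform tools such as Windows Performance Monitor or your load-testing tool’s built-in counters are more commonly used:

    # Sketch: sample server resource metrics at a fixed interval and write them to a
    # CSV file for later comparison against the accepted levels above. Requires the
    # third-party 'psutil' package and should run on the machine being measured.
    import csv
    import time
    from datetime import datetime

    import psutil

    def collect_metrics(path: str = "metrics.csv",
                        interval_seconds: float = 15.0,
                        duration_seconds: float = 3600.0):
        end = time.monotonic() + duration_seconds
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["timestamp", "processor_time_pct", "available_memory_pct"])
            while time.monotonic() < end:
                cpu = psutil.cpu_percent(interval=interval_seconds)
                available_pct = 100.0 - psutil.virtual_memory().percent
                writer.writerow([datetime.now().isoformat(), cpu, available_pct])
                f.flush()   # keep the file current in case the collector is interrupted

    if __name__ == "__main__":
        collect_metrics()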

Considerations

  • Although it may seem like a commonsense practice, it is important to verify that system clocks are synchronized on all of the machines from which resource data will be collected. Doing so can save you significant time, and prevent you from having to dispose of the data entirely and repeat the tests after synchronizing the system clocks.
  • Involve the developers and administrators in the process of determining which metrics are likely to add value and which method best integrates the capturing of those metrics into the test.
  • Collecting metrics frequently produces very large volumes of data. While it is tempting to reduce the amount of data through averaging, exercise caution when using this or any other data-reduction technique, because valuable information is commonly lost in the reduction.

Step 5. Create Performance Tests

The details of creating an executable performance test are extremely tool-specific. Regardless of the tool that you are using, creating a performance test typically involves taking a single instance of your test script (or virtual user) and gradually adding more instances and/or more scripts over time, thereby increasing the load on the component or system.

To determine how many instances of a script are necessary to accomplish the objectives of your test, you first need to identify a workload that appropriately represents the usage scenario related to the objective.

Identifying a Workload of Combined User Scenarios

A workload profile consists of an aggregate mix of users performing various operations. Use the following conceptual steps to identify the workload:
  • Identify the distribution (ratio of work). For each key scenario, identify the proportion of the total work it represents, based on the number of users expected to execute that scenario.
  • Identify the peak user loads. Identify the maximum expected number of concurrent users of the Web application. Using the work distribution for each scenario, calculate the percentage of user load per key scenario.
  • Identify the user loads under a variety of conditions of interest. For instance, you might want to identify the maximum expected number of concurrent users for the Web application at normal and peak hours.

For more information about how to create a workload model for your application, see “How to - Model a Workload for a Web Application” at <<Add url>>
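
As a small worked example of the steps above (the scenario ratios and the 1,000-user peak are illustrative values), the work distribution and peak concurrent-user figure translate into per-scenario user loads as follows:

    # Sketch: turn a work distribution (ratio of work per scenario) and a peak
    # concurrent-user figure into per-scenario user loads. All figures are illustrative.
    PEAK_CONCURRENT_USERS = 1000

    WORK_DISTRIBUTION = {        # proportions must sum to 1.0
        "Browse catalog": 0.50,
        "Search":         0.30,
        "Place an order": 0.20,
    }

    def users_per_scenario(total_users: int, distribution: dict) -> dict:
        return {scenario: round(total_users * share) for scenario, share in distribution.items()}

    print(users_per_scenario(PEAK_CONCURRENT_USERS, WORK_DISTRIBUTION))
    # {'Browse catalog': 500, 'Search': 300, 'Place an order': 200}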

Creating the Performance Test

After you have identified the workload for each user scenario to be tested, create your performance test by performing the following conceptual steps:
  • Create a performance test that will take a single instance of the test script that corresponds to the user scenario to be tested. Later, more instances will be added for each additional scenario.
  • Gradually add more instances over time, increasing the load for the user scenario to the maximum workload identified in the previous step (a skeleton of this step-wise ramp-up follows this list). It is important to allow sufficient time between each increase in the number of users, so that the system has enough time to stabilize before the next set of user connections executes the test case.
  • Integrate measurement of resource utilization of interest on the server(s): for example, CPU, Memory, Disk, and Network.
  • If possible, set thresholds in your testing tool according to your performance test objectives; for example, the resource utilization thresholds can be as follows:
    • Processor\% Processor Time: 75 percent
    • Memory\Available MBytes: 25 percent of total physical RAM
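
The skeleton below illustrates the step-wise ramp-up described above using plain Python threads. Real load-testing tools manage the ramp, think time, thresholds, and result collection for you; the user counts and intervals here are illustrative, and run_one_user() stands in for a scenario script such as the Step 3 sketch:

    # Skeleton of a step-wise ramp-up: start with a few virtual users and add more at a
    # fixed interval until the target workload is reached. All figures are illustrative.
    import threading
    import time

    TARGET_USERS   = 200     # peak workload identified for this scenario
    USERS_PER_STEP = 10      # how many virtual users to add at a time
    STEP_SECONDS   = 60      # settle time between increases

    stop_event = threading.Event()

    def run_one_user():
        while not stop_event.is_set():
            # Execute one iteration of the scenario script here (for example,
            # place_order_scenario from the Step 3 sketch), then pause to
            # simulate user think time.
            time.sleep(1.0)

    def ramp_up():
        workers = []
        while len(workers) < TARGET_USERS:
            for _ in range(USERS_PER_STEP):
                t = threading.Thread(target=run_one_user, daemon=True)
                t.start()
                workers.append(t)
            print(f"virtual users now running: {len(workers)}")
            time.sleep(STEP_SECONDS)   # let the system stabilize before the next increase
        # Hold at the peak workload for the planned steady-state period, then stop.
        time.sleep(600)
        stop_event.set()

    if __name__ == "__main__":
        ramp_up()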

Step 6. Execute Tests

After the previous steps have been completed to an appropriate degree for the test you want to execute, do the following:
  • Validate that the test environment matches the configuration that you were expecting and/or designed your test for.
  • Ensure that both the test and the test environment are correctly configured for metrics collection.
  • Before running the real test, execute a quick “smoke test” to make sure that the test script and remote performance counters are working correctly (a minimal check of this kind is sketched after this list).
  • Reset the system (unless your scenario is to do otherwise) and start a formal test execution.
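
As a minimal sketch of the quick “smoke test” mentioned above (reusing the hypothetical place_order_scenario and users.csv from the Step 3 sketch, assumed here to live in a module named place_order), a single iteration is executed and checked before any real load is applied:

    # Sketch of a pre-test "smoke test": run one iteration of the scenario script and
    # confirm that it completes and returns timing data before the formal test starts.
    import csv
    import sys

    from place_order import place_order_scenario   # hypothetical module holding the Step 3 script

    def smoke_test() -> bool:
        with open("users.csv", newline="") as f:    # hypothetical test data file
            first_user = next(csv.DictReader(f))
        timings = place_order_scenario(first_user)
        print("smoke test timings:", timings)
        return all(t > 0 for t in timings.values())

    if __name__ == "__main__":
        sys.exit(0 if smoke_test() else 1)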

Considerations

  • If at all possible, execute every test twice. If the results produced are not very similar, execute the test again. Try to determine what factors account for the difference.
  • Observe your test during execution and pay close attention to any behavior you feel is unusual. Your instincts are usually right, or at least valuable.
  • No matter how far in advance a test is scheduled, give the team 30-minute and 5-minute warnings before launching the test (or starting the day’s testing). Inform the team whenever you are not going to be executing for more than 1 hour in succession.
  • Do not process data, write reports, or draw diagrams on your load-generating machine while generating a load because this can corrupt the data.
  • Turn off any active virus-scanning on load-generating machines during testing to minimize the likelihood of unintentionally corrupting the data.
  • Use the system manually during test execution so that you can compare your observations with the results data at a later time.
  • Remember to simulate ramp-up and cool-down periods appropriately.
  • Do not throw away the first iteration because of script compilation or other reasons. Instead, measure this iteration separately so you will know what the first user after a system-wide reboot can expect.
  • Test execution is never really finished, but eventually you will reach a point of diminishing returns on a particular test. When you stop obtaining valuable information, change your test.
  • If neither you nor your development team can figure out the cause of an issue in twice as much time as it took the test to execute, it may be more efficient to eliminate one or more variables/potential causes and try again.

Step 7. Analyze the Results, Report and Retest

Consider the following important points while analyzing the data returned by your performance test:
  • Analyze the captured data and compare the results against the metric’s acceptable or expected level to determine whether the performance of the application being tested shows a trend toward or away from the performance objectives.
  • If all of the metric values are within accepted limits and none of the set thresholds have been violated, the tests have passed and you have finished testing that particular scenario on that particular configuration.
  • If the test fails, a diagnosis and tuning activity is generally warranted. {See some other HowTo}
  • If you fix any bottlenecks, repeat the testing process from step 4 onwards until the test succeeds.

Note: If required, capture additional metrics in subsequent test cycles. For example, suppose that during the first iteration of load tests the process shows a marked increase in memory consumption, indicating a possible memory leak. In subsequent test iterations, additional memory counters related to generations can be captured, allowing you to study the memory allocation pattern for the application.
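
As a minimal sketch of this comparison (reusing the hypothetical metrics.csv produced by the Step 4 sketch and the illustrative accepted levels from the Step 4 table), the captured samples can be summarized and checked against each metric’s accepted level:

    # Sketch: summarize captured resource samples and compare them against accepted
    # levels. Assumes the metrics.csv layout from the Step 4 sketch; the levels are
    # the illustrative values from the Step 4 table.
    import csv
    import statistics

    ACCEPTED_LEVELS = {
        "processor_time_pct":   ("max", 75.0),  # must not exceed 75 percent
        "available_memory_pct": ("min", 25.0),  # at least 25 percent of RAM should remain free
    }

    def analyze(path: str = "metrics.csv"):
        with open(path, newline="") as f:
            rows = list(csv.DictReader(f))
        for metric, (kind, level) in ACCEPTED_LEVELS.items():
            values = [float(r[metric]) for r in rows if r.get(metric)]
            if not values:
                continue
            worst = max(values) if kind == "max" else min(values)
            passed = worst <= level if kind == "max" else worst >= level
            print(f"{metric}: mean={statistics.mean(values):.1f} worst={worst:.1f} "
                  f"accepted {'<=' if kind == 'max' else '>='} {level} -> "
                  f"{'PASS' if passed else 'FAIL'}")

    if __name__ == "__main__":
        analyze()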

Considerations

  • Immediately share test results and raw data with a broad range of stakeholders.
  • Talk to the consumers of the data to validate that the test achieved the desired results and that the data means what you think it means. Modify the test to get new, better, or different information now while the test objectives are fresh in your mind, especially if the results do not represent what the test was defined to determine.
  • Filter out any unnecessary data. For example, if 87 of the 100 pages tested failed to meet their target response time, remove them from the graph to avoid redundancy.
  • Make your report using strong but factual language, with supporting paragraphs and graphics if available. For example, “The home page fails to achieve target response time for 75% of the users.”
  • Make your report using business language, not simply technical data. For example, do not say, “The CPU utilization is hovering at 85%”; rather, say, “The application server is not powerful enough to support the target load as the application is currently deployed/operating.”
  • Use current results to set priorities for the next test.
  • After each test, tell the team what you expect the next two tests to be so the team members can provide input concerning what they would like to have tested next while you are executing the current test.
  • Always keep supporting data handy and deliver it in the Appendix.

Resources

<<TBD>>


Comments

DouglasBrown Mar 6, 2007 at 3:08 PM 
Hi,
I have added this comment to another how to, but it is relevant here also. One of the objectives of load testing is 'soak' testing: to see whether the application can perform within the expected service levels for a long period of time without errors occurring.

For example, if an application is restarted on a weekly basis then it must be able to run for that long without a reduction in performance. A reduction can be caused by memory leaks or misuse of other resources by the server, e.g. not closing down connections to external data sources.

An example soak test could be to run the application at average load for 25 hours (using automated tools or scripts). You would look at the performance, memory usage, disk space usage, and other statistics to see if anything was increasing or decreasing over time. It may be, for example, that the application performs well but free memory is decreasing, or that performance drops after the logs start to fill up the hard disk.