
How To: Model User Delays and Think-Times

Scott Barber,

Applies To

  • Performance Test Design
  • Workload Modeling for Performance Testing
  • Performance Testing

Summary

This How-To explores the process of determining realistic user delays. For performance testing to yield results that are directly applicable to understanding the performance characteristics of an application in production, the tested workloads must represent reality. To create a reasonably accurate representation of reality one must model user delays with variability and randomness similar to a representative cross-section of users.

Contents

  • Objectives
  • Overview
  • Summary of Steps
  • Step 1. Determine User Delays
  • Step 2. Apply Delay Ranges
  • Step 3. Apply Distributions
  • Consequences of Improperly Modeling User Delays
  • Resources

Objectives

  • Learn how to determine realistic durations and distribution patterns for user delay times.
  • Learn how to incorporate realistic user delays into test designs and test scripts.

Overview

The more accurately users are modeled, the more reliable performance test results will be. One frequently overlooked aspect of accurate user modeling is the modeling of user delays. This How To explains how to determine user delay times to be incorporated into your workload model and subsequently your performance scripts.

Summary of Steps

  • Step 1. Determine User Delays
  • Step 2. Apply Delay Ranges
  • Step 3. Apply Distributions

Step 1. Determine User Delays

Delays that occur while users view content on Web pages, also commonly known as think times, represent the answers to questions such as “How long does it take a user to enter their login credentials?” and “How much time will users spend reading this page?” You can use several different methods to estimate think times associated with user activities on your Web site. The best method, of course, is to use real data collected about your production site. This is rarely possible, however, because testing generally occurs before the site is released to production. This necessitates making educated guesses or approximations regarding activity on the site. The most useful methods for estimating think times include the following:
  • When testing a Web site that is already in production, you can determine the actual values and distribution by extracting the average and standard deviation for user viewing (or typing) time from the log file for each page. With this information, you can easily determine the think time for each page. Your production site may also have Web traffic monitoring software that provides this type of information directly.
  • If you have no log files, you can run simple in-house experiments using employees, customers, clients, friends, or family members to determine, for example, the page-viewing time differences between new and returning users. This type of simplified usability study tends to be a highly effective method of data collection for Web sites that have never been live, and it is also useful for validating data collected using other methods.
  • Time yourself using the site or performing similar actions on a similar site. Obviously, this method is highly vulnerable to personal bias, but it is a reasonable place to start until you get a chance to time actual users during User Acceptance Testing (UAT) or conduct your own usability study.
  • In the absence of any better source of information, you can leverage some of the metrics and statistics that have already been collected by research companies such as Nielsen//NetRatings, Keynote, or MediaMetrix. These statistics provide data on average page-viewing times and user session duration based on an impersonal sample of users and Web sites. Although these numbers are not from your specific Web site, they can work quite well as first approximations.
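For the log-based approach in the first bullet, the following C sketch shows one possible way to turn request timestamps into think-time figures. It assumes you have already extracted one user session's request times, in seconds and in ascending order, from the log. The names think_summary and summarize_think_times are hypothetical. The gap between two requests also includes the server's response time, so treat the result as a rough upper estimate. A standard deviation can be computed from the same gaps if you need one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of the think times observed in one user session. */
typedef struct {
    double mean;  /* average gap between requests, in seconds */
    double min;   /* shortest gap, in seconds */
    double max;   /* longest gap, in seconds */
} think_summary;

/* Treats the gap between consecutive request timestamps (seconds, ascending)
   as the think time for the earlier page. Requires at least two requests. */
think_summary summarize_think_times(const double *request_times, size_t n)
{
    think_summary s;
    double total = 0.0;
    size_t i;

    assert(request_times != NULL && n >= 2);

    s.min = s.max = request_times[1] - request_times[0];
    for (i = 1; i < n; i++) {
        double gap = request_times[i] - request_times[i - 1];
        total += gap;
        if (gap < s.min) s.min = gap;
        if (gap > s.max) s.max = gap;
    }
    s.mean = total / (double)(n - 1);  /* n requests produce n - 1 gaps */
    return s;
}
```

The minimum and maximum gaps returned here can also serve as starting points for the delay ranges discussed in Step 2.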

There is no need to spend a lot of time collecting statistically significant volumes of data or to be excessively precise. All you really need to know is how long a typical user will spend performing an activity, give or take a second or two. However, depending on the nature of your site, you may want to determine user delay times separately for first-time and experienced users.

Step 2. Apply Delay Ranges

Simply determining how much time one person spends visiting your pages, or what the variance in time between users is, is not enough in itself—you must vary delay times by user. It is extremely unlikely that each user will spend exactly the same amount of time on a page. It is also extremely likely that conducting a performance test in which all users spend the same amount of time on a page will lead to unrealistic or at least unreliable results.

To convert the delay times or delay ranges from Step 1 into something that also represents the variability between users, the following three pieces of information are required:
  • The minimum delay time
  • The maximum delay time
  • The distribution or pattern of user delays between those points

If you do not have a minimum and maximum value from your analysis in Step 1, you can apply heuristics as follows to determine acceptable estimates:
  • The minimum value could be:
    • An experienced user who intended to go to the page but will not remain there long (for example, a user who only needs the page to load in order to scan, find, and click the next link)
    • A user who realized that they clicked to the wrong page
    • A user who clicked through a form that had all of its values pre-filled
    • The minimum length of time you think a user needs to type the required information into the form
    • Half of the value that you determined was “typical”
  • The maximum value could be:
    • Session time-out
    • Sufficient time for a user to look up information for a form
    • No longer than it takes a slow reader to read the entire page
    • Double the value that you determined was “typical”

Although you want your estimate to be relatively close to reality, any range that covers ~75 percent of the expected users is sufficient to ensure that you are not unintentionally skewing your results.
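As an illustration, the "half of typical" and "double typical" heuristics can be sketched in C as shown below. The names delay_range and range_from_typical are hypothetical, and capping the maximum at the session time-out is an assumption that combines two of the heuristics listed above. All values are whole seconds.

```c
#include <assert.h>

/* Hypothetical pair of bounds for one page's user delay, in seconds. */
typedef struct {
    int min;
    int max;
} delay_range;

/* Derives a delay range from a single "typical" delay:
   minimum = half of typical, maximum = double typical,
   with the maximum capped at the session time-out (an assumption). */
delay_range range_from_typical(int typical, int session_timeout)
{
    delay_range r;

    assert(typical > 0 && session_timeout >= typical);

    r.min = typical / 2;
    r.max = typical * 2;
    if (r.max > session_timeout)
        r.max = session_timeout;
    return r;
}
```

For example, a typical delay of 10 seconds with a 1,200-second session time-out gives a range of 5 to 20 seconds.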

Step 3. Apply Distributions

There are numerous mathematical models for these types of distributions. Four of these models cover the overwhelming majority of user delay scenarios:
  • Linear or Uniform Distribution
  • Normal Distribution
  • Negative Exponential Distribution
  • Double Hump Normal Distribution

Linear or Uniform Distribution

A uniform distribution between a minimum and a maximum value is the easiest to model. This distribution model simply selects random numbers that are evenly distributed between the upper and lower bounds of the range. This means that the number generated is no more likely to be near the middle of the range than near either end. The figure below shows a uniform distribution of 1,000 values generated between 0 and 25. Use a uniform distribution in situations where there is a reasonably clear minimum and maximum value, but you neither have nor expect to have a distinguishable pattern between those end points.
Unifrom Distribution.GIF
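A minimal C sketch of a uniform delay is shown below. The name uniform_delay is hypothetical, and the standard rand() function stands in for whatever random source your load-testing tool provides. Delays are whole seconds. The modulo step introduces a very slight bias, which is assumed to be acceptable for think-time modeling.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns a delay, in seconds, chosen with (nearly) equal likelihood
   from every whole value between min and max, inclusive. */
int uniform_delay(int min, int max)
{
    assert(min >= 0 && min <= max);
    return min + rand() % (max - min + 1);
}
```

Call srand() once at the start of the test run if each run should produce a different sequence of delays.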

Normal Distribution

A normal distribution, also known as a bell curve, is more difficult to model but is more accurate in almost all cases. This distribution model selects numbers randomly in such a way that the frequency of selection is weighted toward the center, or average value. The figure below shows a normal distribution of 1,000 values generated between 0 and 25 (that is, a mean of 12.5 and a standard deviation of 3.2). Normal distribution is generally considered to be the most accurate mathematical model of quantifiable measures of large cross-sections of people when actual data is unavailable. Use a normal distribution in any situation where you expect the pattern to be weighted toward the midpoint between the end points. The valid range of values for the standard deviation is from 0 (equivalent to a static delay at the midpoint between the maximum and minimum values) to the maximum value minus the minimum value (equivalent to a uniform distribution). If you have no way to determine the actual standard deviation, a reasonable approximation is 25 percent of the delay range (that is, 0.25 times the range).
Normal Distribution.GIF
The following C code sketches a function that approximates a normal distribution by summing uniform random values. The uniform() helper shown here is a simple stand-in; most load-testing tools supply an equivalent.
#include <stdlib.h>

/* Returns a random integer between lo and hi, inclusive. */
int uniform(int lo, int hi)
{
    return lo + rand() % (hi - lo + 1);
}

/* min: minimum value; max: maximum value; stdev: degree of deviation */
int normdist(int min, int max, int stdev)
{
    int range, iterate, result, c;
    /* integers throughout, to avoid the need for floating point math */

    result = 0; /* ensure result is initialized to 0 */

    range = max - min;
    /* calculate range of possible values between the max and min values */

    if (stdev <= 0)
        return min + range / 2; /* no deviation: static delay at the midpoint */

    iterate = range / stdev;
    /* this number of iterations ensures the proper shape of the resulting curve */

    stdev += 1; /* compensation for integer vs. floating point math */
    for (c = iterate; c > 0; c--) /* loop through iterations */
        result += (uniform(1, 100) * stdev) / 100; /* calculate and tally result */

    if (result > range)
        result = range; /* the compensation above can overshoot; stay within max */
    return result + min; /* send final result back */
}

Negative Exponential Distribution

Negative exponential distribution creates a distribution similar to that shown in the graph below. This model skews the frequency of delay times strongly toward one end of the range. This model is most useful for situations such as users clicking on a “play again” link that only activates after multimedia content has completed playing. The following figure shows a negative exponential distribution of 1,000 values generated between 0 and 25.
Negexp Distribution.GIF
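One way to approximate this shape with integer math is sketched below in C. Note that this is a geometric distribution, the discrete counterpart of the negative exponential, and not an exact exponential generator. The name negexp_delay and its mean_above_min parameter are hypothetical. In this sketch the delays are skewed toward the minimum, and any delay that would exceed the maximum is capped there.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns a delay, in seconds, skewed strongly toward min.
   Starting at min, the delay grows one second at a time and stops with
   probability 1 / (mean_above_min + 1) at each step, so the average
   delay is roughly min + mean_above_min before capping at max. */
int negexp_delay(int min, int max, int mean_above_min)
{
    int delay = min;

    assert(min <= max && mean_above_min >= 0);

    while (delay < max && rand() % (mean_above_min + 1) != 0)
        delay++;
    return delay;
}
```

To skew toward the maximum instead, subtract the generated offset from max rather than adding it to min.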

Double Hump Normal Distribution

The double hump normal distribution creates a distribution similar to that shown in the graph below. To understand when this distribution would be used, consider the first time you visit a Web page that has a large amount of text. On that first visit, you will probably want to read the text, but the next time you may simply click through that page on the way to a page located deeper in the site. This is precisely the type of user behavior this distribution represents. The figure below shows that 60 percent of the users who view this page spend about 8 seconds on the page scanning for the next link to click, and the other 40 percent of the users actually read the entire page, which takes about 45 seconds. You can see that both humps are normal distributions with different minimum, maximum, and standard deviation values.
Double Hump Normal Distribution.GIF
To implement this pattern, simply write a snippet of code to generate a number between 1 and 100 to represent a percentage of users. If that number is below a certain threshold (in the graph above, below 61), call the normal distribution function with the parameters to generate delays with the first distribution pattern. If that number is at or above that threshold, call the normal distribution function with the correct parameters to generate the second distribution pattern.
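The selection logic described above might look like the following C sketch. To keep the example self-contained, it uses a simplified bell-shaped helper (the average of four uniform draws) in place of a full normal distribution function; that substitution, the names bell_delay and double_hump_delay, and the ranges in the usage note are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified bell-shaped delay: the average of four uniform draws
   between min and max clusters around the midpoint of the range. */
int bell_delay(int min, int max)
{
    int sum = 0, i;

    assert(min >= 0 && min <= max);

    for (i = 0; i < 4; i++)
        sum += min + rand() % (max - min + 1);
    return sum / 4;
}

/* Picks the first hump for pct_first percent of calls,
   and the second hump for the rest. */
int double_hump_delay(int pct_first, int min1, int max1, int min2, int max2)
{
    int roll = 1 + rand() % 100; /* 1 to 100, a percentage of users */

    if (roll <= pct_first)
        return bell_delay(min1, max1);
    return bell_delay(min2, max2);
}
```

For the page described above, a call such as double_hump_delay(60, 4, 12, 30, 60) would send roughly 60 percent of users through a hump centered near 8 seconds and the other 40 percent through a hump centered near 45 seconds.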

Consequences of Improperly Modeling User Delays

For realistic load tests, any reasonable attempt at applying ranges and distributions is preferable to ignoring the concept of varying user delays. Creating a load test in which every user spends exactly the same amount of time on each page is simply not realistic and will generate misleading results. For example, you can very easily end up with results similar to the following.
response graph 1.GIF
In case you are not familiar with response graphs, each red dot represents a user activity (in this case, a page request); the horizontal axis shows the time, in seconds, from the start of the test run; and individual virtual testers are listed on the vertical axis. This particular response graph is an example of “banding” or “striping.” Banding should be avoided when doing load or performance testing, although it may be valuable as a stress test. From the server’s perspective, this test is the same as 10 users executing the identical actions synchronously: request the home page, wait x seconds, then request page1.
To put a finer point on it, hold a ruler vertically against your screen and move it slowly across the graph from left to right. This is what the server sees: no dots, no dots, no dots, lots of dots, no dots. This is a very poor representation of actual user communities.
The figure below is a much better representation of actual users, achieved by adding some small-range uniform and normally distributed delays to the same test.
response graph 2.GIF
If you perform the same activity with the ruler, you will see that the dots are more evenly distributed this time, which dramatically increases both the realism of the simulated load and the accuracy of the performance test results.

Resources

Last edited Mar 16, 2007 at 9:24 PM by prashantbansode, version 5
