
How to: Modeling the User Experience from Web Server Logs

Scott Barber, Carlos Farre, Larry Brader, Edmond Wong

Applies to

  • Performance Testing

Summary

This How To explains an approach to modeling the user experience, for workload characterization, from Web server log data when designing load tests for performance and scalability testing. It shows the importance of incorporating customer behavior into the modeling process so that Web load tests realistically emulate how users interact with a Web site. It also explains the steps necessary to define input that represents user behavior when creating Web load tests.

Contents

  • Objectives
  • Overview
  • Quantifying the Volume of Application Usage
  • Web Site Metrics in Web Logs
  • User Abandonment
  • Summary of Steps
  • Additional Resources

Objectives

  • Learn the difference between concurrent users and user sessions and why this is important when defining input for Web load tests.
  • Learn about the elements of customer behavior that aid in modeling the user experience when creating load tests.
  • Learn about the metrics that will help in developing realistic workload characterizations.
  • Learn about key client variables to consider when defining the workload characterization.
  • Learn about user abandonment and how it can affect user experience emulation when building load tests.

Overview

The most common purpose of Web load tests is to anticipate the user experience as realistically as possible. If you are performance testing for the purpose of predicting the user experience, it is crucial that test conditions be similar, or at least close, to production usage. A customer visit to a Web site is made up of a series of related requests, which we will refer to as a user session. Users with different behaviors navigating the same Web site are unlikely to cause overlapping requests to the Web server during their sessions, so instead of concurrent users it is more useful and accurate to model the user experience based on user sessions. A user session can be identified as a sequence of actions in a navigational page flow, undertaken by a customer visiting a Web site. During a session, the user can be in different states: browsing, logging in to the system, and so on. The transitions between pages depend on the layout of the site. Customers also have different modes of interacting with the Web site; some are familiar with the site and move quickly from one page to another, while others take longer to decide on their next action. Characterizing user behavior therefore involves modeling the customer sessions based on page flow, frequency of hits, the time users pause between pages, and any other factor specific to how users interact with your Web site.

Quantifying the Volume of Application Usage: The Theory

Determining and expressing an application's usage volume is frequently difficult because Web-based multi-user applications communicate via stateless protocols. Terms like "concurrent users" and "simultaneous users" are frequently used, but they can be misleading when modeling users' visits to a Web site. In Figures 1 and 2 below, each line segment represents a user activity, and different activities are represented by different colors; the red line segment represents the activity "Load the Home Page." User sessions are represented horizontally across the graph. In this hypothetical representation, the same activity takes the same amount of time for each user. The time elapsed between the Start of Model and End of Model lines is one hour.
Figure 1: Server Perspective

Figure 1 shows the activity from the perspective of the server (in this case, a Web server). Reading the graph from top to bottom and left to right, we see that user 1 surfs to page "red," then "blue," "black," "red," "blue," and "black." User 2 also starts with page "red," but then goes to "green," "purple," and so on. Notice that virtually any vertical slice of the graph between the start and end times reveals 10 users accessing the system, so this distribution represents 10 concurrent, or simultaneous, users. What should be clear is that the server knows that 10 activities are occurring at any moment in time, but not how many actual users are interacting with the system to generate those 10 activities.

Now consider a distribution of activities by individual user that would generate the server perspective graph above (see Figure 2).

Figure 2: Actual Distribution of User Activities

In this graph, 23 individual users have been captured. These users conducted some activity during the time span modeled, and they can be thought of as 23 user sessions. Those 23 users all began interacting with the site at different times. There is no particular pattern to the order of activities, with the exception that all users start with the "red" activity. These 23 users actually generate the exact same activities, in the same sequence, shown in Figure 1. Yet, as the representation depicts, at any given time there are only 9 or 10 concurrent users. The volume of usage for this case is therefore better modeled as total hourly users, that is, the count of user sessions between "Start of Model" and "End of Model."
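To make the distinction concrete, the following minimal Python sketch uses hypothetical session data (20 staggered 15-minute sessions, not the sessions from the figures) to show how many sessions occur in an hour versus how many overlap at any single instant:

```python
from datetime import datetime, timedelta

# Hypothetical data: 20 sessions of 15 minutes each, starting 3 minutes apart.
sessions = [(datetime(2007, 3, 16, 10, 0) + timedelta(minutes=3 * i),
             timedelta(minutes=15)) for i in range(20)]

def sessions_in_window(sessions, window_start, window_end):
    """Sessions active at any point inside the window (the 'hourly users')."""
    return sum(1 for start, dur in sessions
               if start < window_end and start + dur > window_start)

def concurrent_at(sessions, instant):
    """Sessions overlapping one instant (the 'vertical slice' of the graph)."""
    return sum(1 for start, dur in sessions
               if start <= instant < start + dur)

start = datetime(2007, 3, 16, 10, 0)
print(sessions_in_window(sessions, start, start + timedelta(hours=1)))  # -> 20
print(concurrent_at(sessions, start + timedelta(minutes=30)))           # -> 5
```

Twenty sessions pass through the hour, yet only five are active at the half-hour mark, which is exactly the gap between "user sessions per hour" and "concurrent users."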

Web Site Metrics in Web Logs

For our purposes, Web site metrics are the variables that help us understand a site's traffic and load patterns from the server's perspective. Web site metrics are generally averages that vary with the flow of users accessing the site, but they paint a picture of how the site is being used: a high-level view of the site's usage that is helpful in creating models for performance testing. These metrics ultimately reside in the Web server logs. (Many software applications parse these logs to present the metrics graphically or otherwise, but those are out of scope for this How To.) Some of the more useful metrics that can be read or interpreted from Web server logs (assuming the Web server is configured to keep logs) include the following; a small parsing sketch appears after the list:
  • Page views per period: A page view is a page request together with all of the dependent requests associated with it (JS, GIF, CSS files, and so on). Time frames can be hourly, daily, or weekly to account for cyclical patterns in which peaks or bursts of users accessing the Web site can occur.
  • User sessions per period: A user session is the sequence of related requests originating from a user's visit to the Web site, as explained above. As with page views, these can be measured over hourly, daily, or weekly time frames.
  • Session duration: The amount of time a user session lasts, measured from the first page request until the last page request is completed. Session duration includes the time users pause while navigating from page to page.
  • Page request distribution: The distribution, in percentages, of page hits by functional type (Home, Login, Pay). The distribution percentages establish a weighting ratio of page hits based on users' actual utilization of the Web site.
  • Interaction speed: The time users take to transition between pages when navigating the Web site, constituting their think-time behavior. Different users interact with the Web site at different rates.
  • User abandonment: How long users will wait for a page to load before becoming dissatisfied and quitting the site because it is too slow. Abandoned sessions are quite normal on the Internet, and they have an impact on load test results.
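As an illustration of reading these metrics directly, here is a minimal Python sketch that derives user sessions and session durations from a log. The field layout (date, time, client IP, URI), the log file name, and the 30-minute idle timeout are all assumptions to adapt to your server's actual log format:

```python
from collections import defaultdict
from datetime import datetime, timedelta

IDLE_TIMEOUT = timedelta(minutes=30)  # assumed gap that separates two sessions

def parse_line(line):
    """Assumed layout: date time client-ip uri (adapt to your #Fields line)."""
    date, time, client_ip, uri = line.split()[:4]
    return client_ip, datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S"), uri

def sessionize(log_lines):
    """Group each client's requests into sessions split on idle gaps."""
    stamps_by_client = defaultdict(list)
    for line in log_lines:
        if not line.strip() or line.startswith("#"):  # skip blanks and directives
            continue
        client, stamp, _uri = parse_line(line)
        stamps_by_client[client].append(stamp)
    sessions = []  # (first request, last request) per session
    for stamps in stamps_by_client.values():
        stamps.sort()
        start = last = stamps[0]
        for stamp in stamps[1:]:
            if stamp - last > IDLE_TIMEOUT:  # idle too long: new session
                sessions.append((start, last))
                start = stamp
            last = stamp
        sessions.append((start, last))
    return sessions

with open("ex070316.log") as log:  # hypothetical log file name
    sessions = sessionize(log)
minutes = [(end - start).total_seconds() / 60 for start, end in sessions]
print(f"{len(sessions)} sessions, average duration "
      f"{sum(minutes) / len(minutes):.1f} minutes")
```

Identifying a "user" by client IP alone is a simplification; proxies and NAT can merge distinct users, which is why many sites also key on cookies or authenticated user names.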

User Abandonment

User abandonment occurs when a customer exits the Web site before completing a task because the site is too slow. People have different tolerance for performance, depending on their psychological profile and the type of page they request. Failing to account for user abandonment will cause loads that are highly unrealistic and improbable. Load tests should simulate user abandonment as realistically as possible; otherwise they may generate types of load that would never occur in real life, and create bottlenecks that might never happen with real users. Load tests should also report the number of users that abandoned the Web site due to poor performance.

In real Web site traffic, when the load becomes too great for the system or application to handle, the site slows down, causing people to abandon it, which decreases the load until the system speeds back up to acceptable rates. Abandonment thus creates a self-policing mechanism that restores performance to previous levels, even at the cost of losing some customers. One reason to correctly account for user abandonment, then, is to see just how many "some" is. Another reason is to determine the actual volume your application can maintain before you start losing customers. Yet another reason is to avoid simulating, and subsequently resolving, bottlenecks that realistically might not even be possible.

If abandonment is not accounted for at all, the load test may wait indefinitely to receive the page or object it requested. When the test eventually receives that object, even if "eventually" takes hours longer than a real user would wait, it simply moves on to the next object as if nothing were wrong. If the request for an object is never acknowledged, the test skips it and makes a note in the test execution log, with no regard for whether that object was critical to the user. There are some cases where not accounting for abandonment is an accurate representation of reality; for instance, when a Web-based application has been created exclusively for an audience that has no choice but to wait, because there is no alternative method to complete a required task. The following rules of thumb related to user abandonment are generally useful (a small helper encoding them appears after the list):
  1. Check the abandonment rate before evaluating response times. If the abandonment rate for a particular page is less than about 2%, consider the possibility of those response times being outliers.
  2. Check the abandonment rate before drawing conclusions about load. Remember, every user who abandons stops applying load. The response-time statistics may look good, but if you have 75% abandonment, the load is roughly 75% lighter than what was being tested for.
  3. If the abandonment rate is more than about 20%, consider disabling the abandonment routine and re-executing the test to help gain information about what’s causing the problem.
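For illustration, the rules above can be captured in a few lines of Python. This is a minimal sketch; the function name and message wording are invented for the example, and the thresholds come straight from the list:

```python
def interpret_abandonment(page, rate):
    """Apply the rules of thumb above to one page's abandonment rate (0.0-1.0)."""
    notes = []
    if rate < 0.02:
        notes.append("below ~2%, so the slow responses may just be outliers")
    else:
        # Rule 2: users who abandon stop applying load.
        notes.append(f"effective load was roughly {rate:.0%} lighter than configured")
    if rate > 0.20:
        notes.append("above ~20%: re-run with abandonment disabled to find the cause")
    return f"{page}: " + "; ".join(notes)

print(interpret_abandonment("Search", 0.01))
print(interpret_abandonment("Browse", 0.75))
```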

Summary of Steps

This How To includes the following steps:
  • Step 1. From the Web server logs, determine the workload characteristics of the Web site in terms of the number of user sessions or user visits, and translate the total number of sessions into concurrent users.
  • Step 2. Determine the page request distribution from the Web logs.
  • Step 3. Determine other client-side variables: user abandonment.
  • Step 4. Create load test scripts to replicate realistic loads, and compare the results with the Web server logs for total user averages and peak values.

Step 1. From the Web server logs, determine the workload characteristics of the Web site in terms of the number of user sessions or user visits, and translate the total number of sessions into concurrent users.

This is the quantitative analysis of the Web server logs in terms of the total number of visits to the site over a period of time.
  • Determine the volume of users by analyzing the distribution across a time period such as a month, week, or day.
  • Determine the volume in terms of total averages and peak loads on an hourly basis within the selected period.
  • Determine the duration of sessions for total averages and peak loads on an hourly basis.
  • Translate the total hourly averages and peak loads into concurrent users, to simulate realistic scalability volume for the load test, by dividing the number of user sessions by the result of the time frame (one hour) divided by the session duration.
Example: the data below was extracted from the Web server logs by importing the text of the log file into Excel and sorting, filtering, and summing the data. Note that this information may also be available to you more simply via a tool such as WebTrends, a service such as Google Analytics, or a custom service provided by your Web hosting company.
  • The customer has a monthly total of 800,000 users.
  • Volume during weekends is negligible, so five days form the working week.
  • The customer has a weekly total average of 800,000 users / 4 = 200,000 users per week.
  • The customer has a daily total average of ~40,000 users per day.
  • The hourly breakdown is represented in the table below.
  • The number of overlapping user sessions (sometimes referred to as concurrent users) is obtained by dividing the number of user sessions by the result of 60 minutes (one hour) divided by the session duration, that is, UserSessions / (60 / SessionDuration). This is the arrival rate needed to perform the total number of sessions in one hour, given that the load test simulates session durations correctly. (A sketch of this arithmetic appears after the table.)
  • Also note that the hourly breakdown shows the peak load occurring between 12:00 PM and 1:00 PM.

Time             User Sessions   Session Duration (min)   % New Users   Concurrent Users
8:00 AM                789                15                   1%              197
9:00 AM               1894                13                   1%              474
10:00 AM              2536                23                   0%              634
11:00 AM              2878                22                   0%              720
12:00 PM              4989                25                   1%             1247
1:00 PM               5084                11                   2%             1271
2:00 PM               2789                10                   3%              697
3:00 PM               2456                11                   1%              614
4:00 PM               2383                13                   2%              596
5:00 PM               2789                14                   1%              697
6:00 PM               2456                15                   1%              614
7:00 PM               2189                15                   1%              547
8:00 PM               2808                12                   2%              702
9:00 PM               2098                11                   3%              525
10:00 PM              1589                10                   1%              397
Total                39727
Total Averages        2648                15                   1%              662
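The translation in the last column can be reproduced with a couple of lines of Python; this is just the arithmetic from the bullet above, not a tool:

```python
def concurrent_users(sessions_per_hour, session_duration_minutes):
    """UserSessions / (60 / SessionDuration), i.e. average overlapping sessions."""
    return sessions_per_hour / (60 / session_duration_minutes)

print(concurrent_users(789, 15))   # 8:00 AM row -> 197.25, reported as 197
print(concurrent_users(2648, 15))  # Total Averages row -> 662.0
```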

Step 2. Determine the page request distribution from the Web logs.

This step identifies the percentage distribution of the pages, and the business transactions that most critically need to be included in a performance test. We can divide the transactions into the following four categories:
  • Frequently Used Transactions
  • Performance-Intensive Transactions
  • Business-Critical Transactions
  • Transactions of Special Interest to a Relevant Stakeholder
Of these four categories, only the frequently used transactions can be conclusively determined from the Web server logs.

In the above customer scenario, the most frequently used transactions were "Search" and "Login"; the most performance-intensive transactions were "Search" and "Browse"; and the most important business-critical transactions were "Product Selection" and "Pay." Below is the page percentage distribution for this scenario, followed by a sketch of how such a distribution can be tallied.

Session Path (1-hour model)             % of Usage   Sessions/hr   Peak Sessions/hr   Concurrent Users   Peak Concurrent Users   % New Users
Home->Login->Search->Logout                40%           1059            1996               265                  499                1%
Home->Browse->Home->Login                  25%            662            1247               166                  312                2%
Home->Search->Logout                       35%            927            1746               232                  437                2%
Home->Browse->Product Selection->Pay        5%            132             249                33                   62                1%
Total                                                    2648            4989               662                 1247
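Given sessionized requests such as those produced in Step 1, a distribution like the table above can be tallied with a short sketch. The input format, with each session reduced to an ordered list of functional page names, is an assumption:

```python
from collections import Counter

# Assumed input: each user session reduced to its ordered functional pages.
session_paths = [
    ["Home", "Login", "Search", "Logout"],
    ["Home", "Browse", "Home", "Login"],
    ["Home", "Search", "Logout"],
    ["Home", "Login", "Search", "Logout"],
]

counts = Counter("->".join(path) for path in session_paths)
total = sum(counts.values())
for path, n in counts.most_common():
    print(f"{path}: {n / total:.0%} of sessions")
```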


Figure: Page Distribution

Step 3. Determine other client-side variables: user abandonment.

From the Web logs, derive the criteria for abandonment, and apply them in your Web tests to account for users abandoning the site. (A sketch of one way to emulate abandonment appears after the table below.)


Page                % who leave at 0-5 s   % at 5-10 s   % at 10-15 s   % at 20+ s
Home                         0%               7%-10%        30%-35%         85%
Login                        0%               1%-3%         10%-15%         80%
Browse                       0%              10%-15%        25%-30%         75%
Search                       0%              10%-15%        25%-30%         75%
Pay                          0%               0%             1%-2%          15%
Logout                       0%               0%-1%          8%-10%         65%
Product Selection            0%               0%-1%          2%-3%          10%
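One way to emulate abandonment in a load test script is to give each simulated user a patience threshold per page and end the session when a response exceeds it. The following is a minimal sketch that approximates the table above with a crude linear ramp between 5 and 20 seconds; real load testing tools such as Visual Studio Team System express this differently:

```python
import random

# Share of users who have given up by ~20 seconds, taken from the table above.
ABANDON_BY_20S = {"Home": 0.85, "Login": 0.80, "Browse": 0.75, "Search": 0.75,
                  "Pay": 0.15, "Logout": 0.65, "Product Selection": 0.10}

def user_abandons(page, response_seconds):
    """Decide whether one simulated user gives up on this page response."""
    if response_seconds < 5:
        return False  # per the table, nobody leaves in the first 5 seconds
    if response_seconds >= 20:
        return random.random() < ABANDON_BY_20S[page]
    # Crude linear ramp between 0% at 5 seconds and the 20-second figure.
    fraction = (response_seconds - 5) / 15
    return random.random() < ABANDON_BY_20S[page] * fraction

# Example: does this simulated user abandon a 12-second Home page load?
print(user_abandons("Home", 12))
```

Notice that users waiting on a "Pay" page are far more patient than users waiting on the Home page, which is why the patience model should be kept per page rather than global.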

Step 4. Create load test scripts to replicate realistic loads, and compare the results with the Web server logs for total user averages and peak values.

Create the load test scripts for the business scenarios identified in Step 2. Apply the quantitative analysis from Step 1 to the concurrent users in the load testing tool so as to match the total number of users in the model time frame. Run your tests and compare the results with the model; the target values are those in the scenario table from Step 2. In Visual Studio Team System, the number of tests executed can be translated into the number of user sessions completed during the test run, as in the sketch below.
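A minimal sketch of that comparison, assuming you export per-scenario test counts from the load tool (the achieved numbers below are hypothetical):

```python
# Targets from the Step 2 scenario table vs. hypothetical achieved counts
# (e.g., "tests executed" per scenario reported by Visual Studio Team System).
model = {"Home->Login->Search->Logout": 1059,
         "Home->Browse->Home->Login": 662,
         "Home->Search->Logout": 927,
         "Home->Browse->Product Selection->Pay": 132}
achieved = {"Home->Login->Search->Logout": 1012,
            "Home->Browse->Home->Login": 688,
            "Home->Search->Logout": 940,
            "Home->Browse->Product Selection->Pay": 129}

for path, target in model.items():
    drift = (achieved[path] - target) / target
    print(f"{path}: target {target}, achieved {achieved[path]} ({drift:+.1%})")
```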


Additional Resources

  • For more information, see "How To: Use IIS Logs for Performance Analysis" at <<Add Url>>

Contributors and Reviewers

  • External Contributors and Reviewers: Andy Eunson
