
How To: Model Application Usage without Empirical Data

Scott Barber

Applies To

  • Performance Test Design
  • Workload Models for Performance Testing
  • Performance Testing

Summary

This How To demonstrates an approach to developing user community models that realistically approximate application usage when there is no empirical data available to model from, by focusing on groups of users and how they interact with the application from the application's perspective. Applications typically allow many ways to accomplish a task and many paths through their navigation. To predict production performance effectively, these variations in individual user patterns must be accounted for. For this to work, the model needs to be available to, and understandable by, the entire team with little or no explanation, which makes conversation about the model easier and results in more accurate user models.

Contents

  • Objectives
  • Overview
  • Summary of Steps
  • Step 1. Determine Key Scenarios
  • Step 2. Determine Individual User Patterns
  • Step 3. Consolidate Individual Patterns Into One or More Collective Models
  • Step 4. Determine Distribution of Activities
  • Step 5. Identify Target Load Levels
  • Step 6. Integrate Model Variance
  • Step 7. Prepare to Implement the Model
  • Additional Considerations
  • Resources

Objectives

  • Learn how to identify individual usage scenarios
  • Learn how to account for variances in individual usage scenarios
  • Learn how to incorporate individual usage scenarios and their variances into user groups
  • Learn how to model groups of users with appropriate variance
  • Learn how to consolidate groups of users into one or more production simulation models
  • Learn how to identify and model special considerations when blending groups of users into single models

Overview

Testing a web site in a way that reliably predicts its performance is often more of an art than a science. More than a few brilliant minds have published detailed and mathematically sound methods to plan for, predict, and model performance characteristics very accurately… as long as there is sufficient empirical data to work from.

As critical as this data is to creating load and usage models that predict performance accurately, it is typically not directly available to the individuals actually conducting the testing, if it exists at all. Since all of the generally accepted methods and techniques for testing and predicting application performance depend on usage and load models, an alternative to empirical data is required. This How To explores an approach to creating load and usage models that serves as a quick and effective substitute for empirical data.

Summary of Steps

  • Step 1. Determine Key Scenarios
  • Step 2. Determine Individual User Patterns
  • Step 3. Consolidate Individual Patterns Into One or More Collective Models
  • Step 4. Determine Distribution of Activities
  • Step 5. Identify Target Load Levels
  • Step 6. Integrate Model Variance
  • Step 7. Prepare to Implement the Model

Step 1. Determine Key Scenarios

It is typically somewhere between impractical and impossible to simulate every possible user task or activity in a performance test. As a result, no matter what method you use to identify key scenarios, you will probably want to apply some limiting heuristics to the number of activities, or key scenarios, that you identify for performance testing. You may find the following limiting heuristics useful:
  • Include the key scenarios implied or mandated by the objectives for the performance testing effort.
  • Include the most common activities.
  • Include high-visibility activities. For example, a user may only register on your web site once, but if that is a bad experience, they may never return.
  • Include business-critical activities. Placing an order may not be common or highly visible on your web site, but if users can’t place orders, you lose revenue.
  • Include performance-intensive activities. Even if these activities are extremely rare, when they happen they can have a significant system-wide impact on performance.
  • Include activities whose performance is mandated by contract, SLA or an influential stakeholder.

Determining these items without empirical data is an indirect process based on evaluating information collected from places like the following:
  • Requirements and Use Cases
  • Contracts
  • Marketing Material
  • Interviews with Stakeholders
  • Information about how similar applications are used
  • Observing and asking questions of Beta-testers and prototype users
  • Your own experiences with how similar applications are used

Once you have collected a list of what you believe are the key tasks or usage scenarios, solicit commentary from the team. Ask what they think is missing, what they think can be de-prioritized, and why. The “why” part is the most important. An activity that doesn’t seem to matter to one person may still be critical to include in the performance test because of side effects it has on the system as a whole that the person suggesting it isn’t important may not be aware of.
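
If it helps your team keep track of these decisions, the selected scenarios and the heuristics that justified them can be recorded in a simple structured form. The following is a minimal sketch in Python; the scenario names, attribute flags, and selection rule are illustrative assumptions, not part of any particular application.

```python
# Minimal sketch: candidate scenarios annotated with the limiting heuristics
# from Step 1. All scenario names and flags are hypothetical examples.

candidate_scenarios = [
    {"name": "Login",            "common": True,  "high_visibility": True,
     "business_critical": True,  "perf_intensive": False, "mandated": False},
    {"name": "Browse catalog",   "common": True,  "high_visibility": True,
     "business_critical": False, "perf_intensive": False, "mandated": False},
    {"name": "Place order",      "common": False, "high_visibility": False,
     "business_critical": True,  "perf_intensive": False, "mandated": True},
    {"name": "Month-end report", "common": False, "high_visibility": False,
     "business_critical": False, "perf_intensive": True,  "mandated": False},
    {"name": "View help page",   "common": False, "high_visibility": False,
     "business_critical": False, "perf_intensive": False, "mandated": False},
]

def is_key_scenario(s):
    """A scenario is 'key' if at least one limiting heuristic applies to it."""
    return any((s["common"], s["high_visibility"], s["business_critical"],
                s["perf_intensive"], s["mandated"]))

key_scenarios = [s["name"] for s in candidate_scenarios if is_key_scenario(s)]
print(key_scenarios)  # 'View help page' is excluded; the rest are kept
```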

Considerations

  • Whenever testing a web site with a significant number of new features/functions, use interviews. By interviewing the individuals responsible for selling/marketing the new features, you will find out which features/functions are going to be expected, and therefore most likely used. By interviewing existing users, you can determine which of the new features/functions they believe they are most likely to use.
  • When testing a web site that is still pre-production, the best option is to roll out a (stable) beta version to a group of representative users, roughly 10-20% of the size of the expected user base, and analyze the log files from their usage of the site.
  • Run simple in-house experiments using employees, customers, clients, friends, or family members to determine, for example, natural user paths and the differences in page-viewing time between new and returning users. This is a highly effective method of collecting data for web sites that have never been live, as well as of validating data collected using other methods.
  • Remember to ask about usage by various user types, roles or personas. It is frequently the case that team members won’t remember to tell you about the less common user types or roles if you don’t explicitly ask.
  • Think about system users and batch processes as well as human end-users. For example, there might be a batch process that runs to update the status of orders while users are performing activities on the site. Account for those processes because they might be consuming resources.
  • For the most part, web servers are very good at serving text and graphics. Static pages with average-sized graphics are probably less critical than dynamic pages, forms, and multimedia pages.

Step 2. Determine Individual User Patterns

With a list of key scenarios, the next step is to determine how individual users actually accomplish the tasks or activities related to those scenarios and the user specific data associated with a user accomplishing that task or activity.

Navigation

Human beings are unpredictable, and web sites commonly offer multiple paths to accomplish the same task or activity. Even with a relatively small number of users, it is almost certain that real users will not only use every path you think they will to complete a task, but will also inevitably invent some that you hadn’t thought of. Each path they take to complete an activity will put a different load on the system. That difference may be trivial, or it may be enormous; there is no way to be certain until you test it. There are many methods to determine the navigation paths users take to complete a task or activity. Some include:
  • Identify user paths within your web application expected to have significant performance impact and that accomplish one or more of the identified key scenarios
  • Read design and/or usage manuals
  • Try to accomplish the activities yourself
  • Observe others trying to accomplish the activity without instruction.

Once the application is released for unscripted user acceptance testing, beta testing or to production, you will be able to determine how the majority of users accomplish activities on the system under test. It is always a good idea to compare your models against reality and make an informed decision about whether to do additional testing based on the similarities and differences found.

Apply the same limiting heuristics to navigation paths as you did when determining activities to decide which paths you want to include in your performance simulation.
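
One lightweight way to record the navigation-path variants you decide to keep is as a list of paths per scenario with rough relative weights. The sketch below is a hypothetical Python representation; the page titles, the deep-link path, and the weights are assumptions for illustration only.

```python
# Minimal sketch: navigation-path variants for one key scenario, with rough
# relative weights. Page titles, the deep-link path, and the weights are
# hypothetical placeholders.

navigation_paths = {
    "Place order": [
        # (ordered page titles, relative weight within the scenario)
        (["Home", "Search Results", "Product Detail", "Cart",
          "Checkout", "Confirmation"], 0.60),
        (["Home", "Category", "Product Detail", "Cart",
          "Checkout", "Confirmation"], 0.30),
        (["Product Detail", "Cart", "Checkout", "Confirmation"], 0.10),  # deep link
    ],
}

# Sanity check: within each scenario, the path weights should sum to 1.0.
for scenario, paths in navigation_paths.items():
    total = sum(weight for _, weight in paths)
    assert abs(total - 1.0) < 1e-9, f"{scenario} path weights sum to {total}"
```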

Considerations

  • Some users will complete more than one activity during a visit to your site.
  • Some users will complete the same activity more than once per visit.
  • Some users may not actually complete any activities during a visit to your site.
  • Navigation paths are often easiest to capture using page titles.
  • If page titles don’t work, or aren’t intuitive for your application, the navigation path may be easier to define in terms of the steps the user takes to complete the activity.
  • First time users frequently follow a different path to accomplish a task than users experienced with the application. Consider this difference and what percentage of new vs. return user navigation paths should be represented in your model.
  • Different users will spend different amounts of time on the site. Some will log out, some will close their browser, and others will leave their session to time out. Take these factors into account when determining or estimating session durations.

Data

Unfortunately, navigation paths alone don’t provide all of the information required to implement a workload simulation. To fully implement the workload model, several more pieces of information are needed. This information includes items such as:
  • How long users may spend on a page
  • What data may need to be entered on each page
  • What conditions may cause a user to change navigation paths

Below is an example of unique data identified for an eCommerce application:
Implementation Data

Scenario | Page/Step  | Data Inputs                                            | Data Outputs                          | Think Time
Login    | Login page | Username (unique), Password (matched to username)      | –                                     | 6 – 9 Sec, Random
Browse   | Login page | Username (unique), Password (matched to username)      | –                                     | 6 – 9 Sec, Random
Browse   | Browse     | Catalog Tree/Structure (static), User Type (weighted)  | Product Description, Title, Category  | 4 – 60 Sec, Random
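
The same implementation data can also be captured in a machine-readable form so that the eventual test scripts can consume it directly. The following Python sketch mirrors the table above; the field names (for example, think_time_sec) and the helper function are assumptions about how you might choose to structure the data, not a prescribed format.

```python
import random

# Minimal sketch: the implementation data from the table above, captured in a
# form a test script could consume. Field names such as think_time_sec are
# assumptions about structure, not a prescribed format.

implementation_data = [
    {"scenario": "Login",  "page": "Login page",
     "inputs": ["Username (unique)", "Password (matched to username)"],
     "outputs": [],
     "think_time_sec": (6, 9)},
    {"scenario": "Browse", "page": "Login page",
     "inputs": ["Username (unique)", "Password (matched to username)"],
     "outputs": [],
     "think_time_sec": (6, 9)},
    {"scenario": "Browse", "page": "Browse",
     "inputs": ["Catalog Tree/Structure (static)", "User Type (weighted)"],
     "outputs": ["Product Description", "Title", "Category"],
     "think_time_sec": (4, 60)},
]

def think_time(step):
    """Return a random think time within the step's range, as in the table."""
    low, high = step["think_time_sec"]
    return random.uniform(low, high)
```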

Considerations

  • Consider including data that leads to exception paths in your performance tests, for example, include some users who mistype their password on the first attempt, but get it correct on a second try.

Step 3. Consolidate Individual Patterns Into One or More Collective Models

There are a wide variety of methods that teams and individuals use to consolidate individual usage patterns into one or more collective models. Some of those include spreadsheets, pivot tables, narrative text, UML collaboration diagrams, Markov Chain diagrams and flow charts. In each case the intent is to make the model as a whole easy to understand, maintain and communicate across the entire team.
One highly effective method is to create visual models that are intuitive to the entire team, including end-users, developers, testers, analysts, and executive stakeholders. The key to this technique is to use language and visual representations that make sense to your team without extensive training. In fact, visual models that convey their intended meaning with no training whatsoever are best. Consider the following visualization of a consolidated usage model for an eCommerce application.
[Figure ImpiricalData1.GIF: consolidated usage model for an eCommerce application]
As you can see, the combination of labeled user activities, flow lines, and color makes it easy for the entire team to understand. Once such a model is created, it is valuable to circulate it to both users and stakeholders for review/comment. Just as when you collected key usage scenarios, ask the team what they think is missing, what they think can be de-prioritized, and why.
Once you are confident that the model is appropriate for performance testing, supplement that model with the individual usage data collected for each navigation path during step 2 such that the model contains all of the data that will be needed to create the actual test. The table below represents one way that supplementary data can be organized in a spreadsheet.
[Figure ImpiricalData2.GIF: supplementary usage data organized in a spreadsheet]
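
If your team prefers something it can version and validate alongside the visual model, the consolidated model can also be expressed as a simple directed graph of activities and the transitions between them. The sketch below is a hypothetical Python rendering of such a model; the page names and transitions are illustrative and would be replaced by those from your own visual model.

```python
# Minimal sketch: a consolidated usage model captured as a simple directed
# graph (activity -> possible next activities). The page names and
# transitions are illustrative and would come from your own visual model.

consolidated_model = {
    "Home":               ["Login", "Browse", "Search"],
    "Login":              ["Browse", "Search", "My Account"],
    "Browse":             ["Product Detail", "Search", "Exit"],
    "Search":             ["Product Detail", "Browse", "Exit"],
    "Product Detail":     ["Add to Cart", "Browse", "Exit"],
    "Add to Cart":        ["Checkout", "Browse", "Exit"],
    "Checkout":           ["Order Confirmation", "Exit"],
    "Order Confirmation": ["Exit"],
    "My Account":         ["Browse", "Exit"],
    "Exit":               [],
}

# Quick consistency check: every destination must itself be defined.
for page, next_pages in consolidated_model.items():
    for nxt in next_pages:
        assert nxt in consolidated_model, f"{page} -> {nxt} is undefined"
```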

Considerations

  • Performance tests frequently consume large amounts of test data. Ensure you include enough data in your data files.
  • Using the same data repeatedly will frequently lead to invalid performance results.
  • Especially while designing and debugging performance tests, test databases can become dramatically overloaded with data. Periodically check whether the database is storing unrealistic volumes of data for the situation you are trying to simulate.
  • Client-side caching. First-time users will download every object on the site, while returning users are likely to have many static objects and/or cookies stored in their local cache. When capturing the uniqueness of a user, consider whether that user represents a first-time user or a user with an established client-side cache.

Step 4. Determine Distribution of Activities

Now that you have determined which scenarios you want to simulate, identified the steps and associated data for those scenarios, and consolidated those scenarios into visual models, you need to determine how often each activity represented in the model is performed relative to the other activities in order to complete the workload model. The most common methods for determining the relative distribution of activities when there is no empirical data to draw from are:
  • Interview the individuals responsible for selling/marketing new features to find out which features/functions are going to be expected, and therefore most likely used. By interviewing existing users, you may also determine which of the new features/functions they believe they are most likely to use.
  • Deploy a beta release to a group of representative users, roughly 10-20% of the size of the expected user base, and analyze the log files from their usage of the site.
  • Run simple in-house experiments using employees, customers, clients, friends, or family members to determine, for example, natural user paths and the differences in page-viewing time between new and returning users.
  • As a last resort, you can use your intuition, or best guess, to estimate based on your own familiarity with the site.

Possibly the most effective method of capturing and communicating the distribution of activities is to return to the visual model, make notes about what percentage of users you anticipate will perform each activity, and then once again circulate the model to both users and stakeholders for review/comment. Just like before, ask them to tell you whether they believe the percentages are reasonable, and why. Often, members of the team will simply write new percentages on the visual model, making it very easy for everyone to see which activities have achieved a consensus and which have not. The following visualization is for the same eCommerce application referenced in Step 3, only now it contains distribution data.

[Figure ImpiricalData3.GIF: consolidated usage model with distribution data]
Sometimes you will find that one workload model is not enough. Research and experience tell us that user activities often vary greatly over time. To ensure test validity, you must evaluate how activities vary by time of day, day of week, day of month, and time of year.
As an example, consider an on-line bill payment site. If all bills go out on the 20th of the month, activity on the site immediately before the 20th will focus on system administrators updating accounts, importing billing information, and so on, while immediately after the 20th, customers will be viewing and paying their bills until the payment due date on the 5th of the next month.
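
Once percentages have been agreed upon, the distribution can be captured as a simple weighted table and used to drive activity selection during the simulation. The following Python sketch assumes illustrative activities and percentages; the pick_activity helper is just one way a test harness might apply the weights.

```python
import random

# Minimal sketch: an activity distribution for one workload model, plus a
# helper that selects a user's activity according to that distribution.
# Activities and percentages are illustrative, not empirical.

workload_model = {
    "Browse":          50.0,   # percent of user sessions
    "Search":          25.0,
    "Login only":      10.0,
    "Place order":     10.0,
    "Create account":   5.0,
}

# The model is only valid if the percentages account for all of the load.
assert abs(sum(workload_model.values()) - 100.0) < 1e-6, "distribution must total 100%"

def pick_activity(model):
    """Choose one activity, weighted by its share of the workload."""
    activities = list(model)
    weights = list(model.values())
    return random.choices(activities, weights=weights, k=1)[0]

# e.g., the activities performed by ten simulated user sessions
print([pick_activity(workload_model) for _ in range(10)])
```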

Considerations

  • Create visual models and circulate them to both users and stakeholders to review/comment.
  • Ensure the model is intuitive to non-technical users, technical designers, and everyone in between.
  • Ensure the model contains all of the supplementary data needed to create the actual test.
  • It is during this step that you would account for user abandonment if applicable to your application. {for more on user abandonment see HowTo:AccountForUserAbandonment}

Step 5. Identify Target Load Levels

Unless or until some degree of empirical data is available (for example, from previous related applications or a predetermined user base), target load levels are exactly that: targets. These targets are most frequently set by the business based on its goals for the application, whether those goals relate to market penetration, revenue generation, or something else. These are the numbers you want to work with to get started. They may or may not end up correlating to the loads the application will actually encounter, but the business will want to know whether, and how well, the application as developed or deployed will support those loads if its targets are met.
Since the models you have constructed represent the frequency of each activity as a percentage of the total load, your models should not need to be updated as a result of determining target load levels.
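
Because each activity is expressed as a percentage of total load, applying a business-supplied target is straightforward arithmetic. The sketch below assumes a hypothetical target of 500 concurrent users and the illustrative distribution from Step 4; neither number comes from real data.

```python
# Minimal sketch: translating a target load level into per-activity user
# counts. The target (500 concurrent users) and the percentages are
# illustrative assumptions, not measurements.

target_concurrent_users = 500   # a business goal, not empirical data

workload_model = {
    "Browse":          50.0,
    "Search":          25.0,
    "Login only":      10.0,
    "Place order":     10.0,
    "Create account":   5.0,
}

users_per_activity = {
    activity: round(target_concurrent_users * pct / 100.0)
    for activity, pct in workload_model.items()
}

print(users_per_activity)
# {'Browse': 250, 'Search': 125, 'Login only': 50, 'Place order': 50, 'Create account': 25}
```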

Step 6. Integrate Model Variance

Since the usage models are “best guesses” until empirical data becomes available, it is a good idea to create no fewer than three usage models for each target load. This has the effect of adding a rough confidence interval to the performance measurements, so that stakeholders can focus not just on the results of one test based on many fallible assumptions, but also on how much inaccuracies in those assumptions are likely to impact the performance characteristics of the application.

The three usage models that teams generally find most valuable are:
  • Anticipated Usage (The model or models you created in step 4)
  • Best Case Usage in terms of performance (i.e. weighted heavily in favor of low performance cost activities)
  • Worst Case Usage in terms of performance (i.e. weighted heavily in favor of high performance cost activities)

The chart below is an example of the information that testing all three of these models can provide. As you can see, in this particular case the Anticipated Usage and Best Case Usage resulted in similar performance characteristics. However, the Worst Case Usage showed a drop-off of nearly 50% in the total load that can be supported compared with the Anticipated Usage. Information such as this could lead to a re-evaluation of the usage model, or possibly to a decision to test with the Worst Case Usage moving forward as a sort of safety factor until empirical data becomes available.
[Figure ImpiricalData4.GIF: comparison of results for the Anticipated, Best Case, and Worst Case Usage models]
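
In practice, the three usage models can share the same activity list and differ only in their weights, which keeps them easy to compare and maintain. The following Python sketch shows one way to express that; all of the percentages are illustrative assumptions about which activities are performance-cheap or performance-expensive.

```python
# Minimal sketch: three variants of the same workload model, as described
# above. All percentages are illustrative; the "best case" shifts weight
# toward low-cost activities, the "worst case" toward high-cost ones.

usage_models = {
    "anticipated": {"Browse": 50, "Search": 25, "Login only": 10,
                    "Place order": 10, "Create account": 5},
    "best_case":   {"Browse": 70, "Search": 15, "Login only": 10,
                    "Place order": 4,  "Create account": 1},
    "worst_case":  {"Browse": 25, "Search": 40, "Login only": 5,
                    "Place order": 20, "Create account": 10},
}

# Each variant must still describe 100% of the load.
for name, model in usage_models.items():
    assert sum(model.values()) == 100, f"{name} does not total 100%"
```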

Step 7. Prepare to Implement the Model

Preparing to implement the model is tightly tied to the method of implementation, typically a load generation tool. For more information about implementing a workload model using VSTS see {HowTo:SomeName}.

Considerations

  • Don’t blindly change your model just because it is difficult to implement in your tool.
  • If you cannot implement your model as designed, ensure that you record the details of the model you do implement (one way of recording this is sketched after this list).
  • Implementing the model frequently includes identifying the metrics to be collected and determining how to collect them; see {HowTo: DoMetricsStuff} for more.
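
One simple way to record the model you actually implement, as suggested in the list above, is to serialize it to a small file that can later be compared against the designed model and, eventually, against production data. The sketch below is a hypothetical example; the file name, fields, and values are assumptions, not a required format.

```python
import json

# Minimal sketch: recording the model as actually implemented in the load
# generation tool. The file name, fields, and values are hypothetical.

implemented_model = {
    "target_concurrent_users": 500,
    "distribution": {"Browse": 50, "Search": 25, "Login only": 10,
                     "Place order": 10, "Create account": 5},
    "deviations_from_design": [
        "Abandonment simulated as a fixed 10% exit rate rather than per-page rates",
    ],
    "metrics_collected": [
        "response time per page",
        "requests per second",
        "server CPU and memory utilization",
    ],
}

with open("implemented_workload_model.json", "w") as f:
    json.dump(implemented_model, f, indent=2)
```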

Resources

<<TBD>>

