J.D. Meier, Carlos Farre, Prashant Bansode, Scott Barber, and Dennis Rea
Microsoft Corporation
September 2007
Performance testing is a type of testing intended to determine the responsiveness, throughput, reliability, and/or scalability of a system under a given workload. Performance testing is commonly conducted to accomplish the following:

- Assess production readiness
- Evaluate against performance criteria
- Compare the performance characteristics of multiple systems or system configurations
- Find the source of performance problems
- Support system tuning
- Find throughput levels
This chapter provides a set of foundational building blocks on which to base your understanding of performance testing principles, ultimately leading to successful performance-testing projects. Additionally, this chapter introduces various terms and concepts used throughout this guide.
Use this chapter to understand the purpose of performance testing and the core activities that it entails.
Performance testing is typically done to help identify bottlenecks in a system, establish a baseline for future testing, support a performance tuning effort, determine compliance with performance goals and requirements, and/or collect other performance-related data to help stakeholders make informed decisions related to the overall quality of the application being tested. In addition, the results from performance testing and analysis can help you to estimate the hardware configuration required to support the application(s) when you “go live” to production operation.
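To make these ideas concrete, the following minimal sketch (not part of the original guide) applies a fixed workload to a Web application and reports the average response time and throughput it observes. The URL and workload sizes are illustrative assumptions; in practice you would use a dedicated load-generation tool rather than a hand-rolled script.

```python
"""A minimal performance-measurement sketch; the URL and workload
numbers are assumptions, not values from this guide."""
import concurrent.futures
import time
import urllib.request

URL = "http://localhost:8080/"   # hypothetical system under test
VIRTUAL_USERS = 10               # assumed concurrent users
REQUESTS_PER_USER = 20           # assumed requests issued by each user

def one_user(_: int) -> list[float]:
    """Simulate one user: issue requests and record each response time."""
    timings = []
    for _ in range(REQUESTS_PER_USER):
        start = time.perf_counter()
        with urllib.request.urlopen(URL) as resp:
            resp.read()
        timings.append(time.perf_counter() - start)
    return timings

wall_start = time.perf_counter()
with concurrent.futures.ThreadPoolExecutor(max_workers=VIRTUAL_USERS) as pool:
    per_user = pool.map(one_user, range(VIRTUAL_USERS))
    response_times = [t for timings in per_user for t in timings]
elapsed = time.perf_counter() - wall_start

print(f"requests completed: {len(response_times)}")
print(f"average response time: {sum(response_times) / len(response_times):.3f} s")
print(f"throughput: {len(response_times) / elapsed:.1f} requests/s")
```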
Figure 1.1 *Core Performance Testing Activities*
The performance testing approach used in this guide consists of the following activities:

1. Identify the test environment.
2. Identify performance acceptance criteria.
3. Plan and design tests.
4. Configure the test environment.
5. Implement the test design.
6. Execute the test.
7. Analyze results, report, and retest.
At the highest level, performance testing is almost always conducted to address one or more risks related to expense, opportunity costs, continuity, and/or corporate reputation. Some more specific reasons for conducting performance testing include:
- Assessing release readiness
- Assessing infrastructure adequacy
- Assessing the adequacy of developed software performance
- Improving the efficiency of performance tuning
For a performance testing project to be successful, both the approach to testing performance and the testing itself must be relevant to the context of the project. Without an understanding of the project context, performance testing is bound to focus on only those items that the performance tester or test team assumes to be important, as opposed to those that truly are important, frequently leading to wasted time, frustration, and conflicts.
The project context is nothing more than those things that are, or may become, relevant to achieving project success. Items that may be relevant to the performance-testing effort in your project context include, but are not limited to, the overall vision or intent of the project, the performance-testing objectives and success criteria, the development life cycle being followed, the project schedule and budget, and the tools and environments available to the team.
When end-to-end performance testing reveals system or application characteristics that are deemed unacceptable, many teams shift their focus from performance testing to performance tuning, to discover what is necessary to make the application perform acceptably. A team may also shift its focus to tuning when performance criteria have been met but the team wants to reduce the amount of resources being used in order to increase platform headroom, decrease the volume of hardware needed, and/or further improve system performance.
Although tuning is not the direct responsibility of most performance testers, the tuning process is most effective when it is a cooperative effort among all of those concerned with the application or system under test, including product vendors, architects, developers, testers, and database, network, and system administrators.
Without the cooperation of a cross-functional team, it is almost impossible to gain the system-wide perspective necessary to resolve performance issues effectively or efficiently.
The performance tester, or performance testing team, is a critical component of this cooperative team as tuning typically requires additional monitoring of components, resources, and response times under a variety of load conditions and configurations. Generally speaking, it is the performance tester who has the tools and expertise to provide this information in an efficient manner, making the performance tester the enabler for tuning.
Tuning follows an iterative process that is usually separate from, but not independent of, the performance testing approach a project is following. In a typical tuning cycle, the team reproduces the performance issue under a controlled load, changes one configuration setting or code item at a time, retests to quantify the effect of that change, and repeats until the desired performance characteristics are achieved.
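As a concrete illustration of such a cycle, the sketch below retests after each candidate change and stops once an assumed 95th-percentile response-time criterion is met. The endpoint, the criterion, and the placeholder tuning actions are all hypothetical.

```python
"""A minimal sketch of an iterative tuning cycle; the endpoint, the
acceptance criterion, and the candidate changes are all hypothetical."""
import statistics
import time
import urllib.request

URL = "http://localhost:8080/catalog/search"  # hypothetical endpoint
TARGET_P95_SECONDS = 1.0                      # assumed acceptance criterion

def run_load_test(url: str, samples: int = 50) -> list[float]:
    """Issue sequential requests and record each response time in seconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as response:
            response.read()
        timings.append(time.perf_counter() - start)
    return timings

def p95(timings: list[float]) -> float:
    """95th-percentile response time, a common acceptance metric."""
    return statistics.quantiles(timings, n=20)[-1]

# Placeholder tuning actions: in a real effort each one would alter exactly
# one setting (cache policy, pool size, and so on) on the system under test.
candidate_changes = {
    "baseline (no change)": lambda: None,
    "enable output caching": lambda: None,
    "increase connection pool size": lambda: None,
}

for name, apply_change in candidate_changes.items():
    apply_change()                        # change one thing at a time
    observed = p95(run_load_test(URL))
    print(f"{name}: p95 = {observed:.3f} s")
    if observed <= TARGET_P95_SECONDS:    # retest until criteria are met
        print("Acceptance criterion met; tuning can stop here.")
        break
```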
Performance tests are usually described as belonging to one of the following three categories:

- Performance test. A technical investigation done to determine or validate the speed, scalability, and/or stability characteristics of the product under test.
- Load test. A performance test that verifies application behavior under normal and peak load conditions.
- Stress test. A performance test that evaluates an application's behavior when it is pushed beyond normal or peak load conditions.
Creating a baseline is the process of running a set of tests to capture performance metric data for the purpose of evaluating the effectiveness of subsequent performance-improving changes to the system or application. A critical aspect of a baseline is that all characteristics and configuration options except those specifically being varied for comparison must remain invariant. Once a part of the system that is not intentionally being varied for comparison to the baseline is changed, the baseline measurement is no longer a valid basis for comparison.
With respect to Web applications, you can use a baseline to determine whether performance is improving or declining and to find deviations across different builds and versions. For example, you could measure load time, the number of transactions processed per unit of time, the number of Web pages served per unit of time, and resource utilization such as memory usage and processor usage.
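For instance, baseline metrics captured from one build can be persisted and compared against later builds. The sketch below is a minimal illustration; the file name, metric names, and the 10 percent regression tolerance are assumptions, and it presumes the metrics were already collected by a test run (lower values are better).

```python
"""A minimal sketch of capturing and reusing a baseline; file name,
metric names, and tolerance are assumptions."""
import json
import pathlib

BASELINE_FILE = pathlib.Path("baseline_metrics.json")  # hypothetical location
TOLERANCE = 0.10   # flag regressions worse than 10% (assumed policy)

def save_baseline(metrics: dict) -> None:
    """Persist the reference measurements for later comparison."""
    BASELINE_FILE.write_text(json.dumps(metrics, indent=2))

def compare_to_baseline(metrics: dict) -> None:
    """Report any metric that regressed beyond the tolerance.

    Assumes lower values are better (e.g., response times in seconds).
    """
    baseline = json.loads(BASELINE_FILE.read_text())
    for name, value in metrics.items():
        reference = baseline[name]
        if value > reference * (1 + TOLERANCE):
            print(f"REGRESSION {name}: {value:.3f} vs baseline {reference:.3f}")

# Example: the first run establishes the baseline, later builds compare to it.
save_baseline({"page_load_s": 1.20, "search_p95_s": 0.85})
compare_to_baseline({"page_load_s": 1.45, "search_p95_s": 0.80})
```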
Benchmarking is the process of comparing your system’s performance against a baseline that you have created internally or against an industry standard endorsed by some other organization.
In the case of a Web application, you would run a set of tests that comply with the specifications of an industry benchmark in order to capture the performance metrics necessary to determine your application's benchmark score. You can then compare your application against other systems or applications that also calculated their score for the same benchmark. You may choose to tune your application's performance to achieve or surpass a certain benchmark score.
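Unlike a baseline, the comparison target here is external. The following tiny sketch is illustrative only; the published figure stands in for a score defined by a hypothetical industry benchmark specification.

```python
"""Comparing a measured result to an assumed, externally published
benchmark figure; both numbers are invented for illustration."""

PUBLISHED_PAGES_PER_SECOND = 250.0   # hypothetical industry-standard score
measured_pages_per_second = 212.5    # score from running the benchmark workload

ratio = measured_pages_per_second / PUBLISHED_PAGES_PER_SECOND
print(f"benchmark score: {ratio:.0%} of the published standard")
if ratio < 1.0:
    print("Below the benchmark target; consider tuning before publishing results.")
```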
The following definitions are used throughout this guide. Every effort has been made to ensure that these terms and definitions are consistent with formal use and industry standards; however, some of these terms are known to have certain valid alternate definitions and implications in specific industries and organizations. Keep in mind that these definitions are intended to aid communication and are not an attempt to create a universal standard.
| Term / Concept | Description |
| --- | --- |
| Capacity | The capacity of a system is the total workload it can handle without violating predetermined key performance acceptance criteria. |
| Capacity test | A capacity test complements load testing by determining your server's ultimate failure point, whereas load testing monitors results at various levels of load and traffic patterns. You perform capacity testing in conjunction with capacity planning, which you use to plan for future growth, such as an increased user base or increased volume of data. For example, to accommodate future loads, you need to know how many additional resources (such as processor capacity, memory usage, disk capacity, or network bandwidth) are necessary to support future usage levels. Capacity testing helps you to identify a scaling strategy in order to determine whether you should scale up or scale out. |
| Component test | A component test is any performance test that targets an architectural component of the application. Commonly tested components include servers, databases, networks, firewalls, and storage devices. |
| Endurance test | An endurance test is a type of performance test focused on determining or validating the performance characteristics of the product under test when subjected to workload models and load volumes anticipated during production operations over an extended period of time. Endurance testing is a subset of load testing. |
| Investigation | Investigation is an activity based on collecting information related to the speed, scalability, and/or stability characteristics of the product under test that may have value in determining or improving product quality. Investigation is frequently employed to prove or disprove hypotheses regarding the root cause of one or more observed performance issues. |
| Latency | Latency is a measure of responsiveness that represents the time it takes to complete the execution of a request. Latency may also represent the sum of the latencies of several subtasks. |
| Metrics | Metrics are measurements obtained by running performance tests, expressed on a commonly understood scale. Some metrics commonly obtained through performance tests include processor utilization over time and memory usage by load. |
| Performance | Performance refers to information regarding your application's response times, throughput, and resource utilization levels. |
| Performance test | A performance test is a technical investigation done to determine or validate the speed, scalability, and/or stability characteristics of the product under test. Performance testing is the superset containing all other subcategories of performance testing described in this chapter. |
| Performance budgets or allocations | Performance budgets (or allocations) are constraints placed on developers regarding allowable resource consumption for their component. |
| Performance goals | Performance goals are the criteria that your team wants to meet before product release, although these criteria may be negotiable under certain circumstances. For example, if a response time goal of three seconds is set for a particular transaction but the actual response time is 3.3 seconds, it is likely that the stakeholders will choose to release the application and defer performance tuning of that transaction to a future release. |
| Performance objectives | Performance objectives are usually specified in terms of response times, throughput (transactions per second), and resource-utilization levels, and typically focus on metrics that can be directly related to user satisfaction. |
| Performance requirements | Performance requirements are those criteria that are absolutely non-negotiable due to contractual obligations, service level agreements (SLAs), or fixed business needs. Any performance criterion that will not unquestionably lead to a decision to delay a release until the criterion passes is not absolutely required, and is therefore not a requirement. |
| Performance targets | Performance targets are the desired values for the metrics identified for your project under a particular set of conditions, usually specified in terms of response time, throughput, and resource-utilization levels. Resource-utilization levels include the amount of processor capacity, memory, disk I/O, and network I/O that your application consumes. Performance targets typically equate to project goals. |
| Performance testing objectives | Performance testing objectives refer to data collected through the performance-testing process that is anticipated to have value in determining or improving product quality. However, these objectives are not necessarily quantitative or directly related to a performance requirement, goal, or stated quality-of-service (QoS) specification. |
| Performance thresholds | Performance thresholds are the maximum acceptable values for the metrics identified for your project, usually specified in terms of response time, throughput (transactions per second), and resource-utilization levels. Resource-utilization levels include the amount of processor capacity, memory, disk I/O, and network I/O that your application consumes. Performance thresholds typically equate to requirements. |
| Resource utilization | Resource utilization is the cost of the project in terms of system resources. The primary resources are processor, memory, disk I/O, and network I/O. |
| Response time | Response time is a measure of how responsive an application or subsystem is to a client request. |
| Saturation | Saturation refers to the point at which a resource has reached full utilization. |
| Scalability | Scalability refers to an application's ability to handle additional workload, without adversely affecting performance, by adding resources such as processor, memory, and storage capacity. |
| Scenarios | In the context of performance testing, a scenario is a sequence of steps in your application. A scenario can represent a use case or a business function such as searching a product catalog, adding an item to a shopping cart, or placing an order. |
| Smoke test | A smoke test is the initial run of a performance test to see whether your application can perform its operations under a normal load. |
| Spike test | A spike test is a type of performance test focused on determining or validating the performance characteristics of the product under test when subjected to workload models and load volumes that repeatedly increase beyond anticipated production operations for short periods of time. Spike testing is a subset of stress testing. |
| Stability | In the context of performance testing, stability refers to the overall reliability, robustness, functional and data integrity, availability, and/or consistency of responsiveness for your system under a variety of conditions. |
| Stress test | A stress test is a type of performance test designed to evaluate an application's behavior when it is pushed beyond normal or peak load conditions. The goal of stress testing is to reveal application bugs that surface only under high load conditions. These bugs can include such things as synchronization issues, race conditions, and memory leaks. Stress testing enables you to identify your application's weak points, and shows how the application behaves under extreme load conditions. |
| Throughput | Throughput is the number of units of work that can be handled per unit of time; for instance, requests per second, calls per day, hits per second, or reports per year. |
| Unit test | In the context of performance testing, a unit test is any test that targets a module of code, where that module is any logical subset of the entire existing code base of the application, with a focus on performance characteristics. Commonly tested modules include functions, procedures, routines, objects, methods, and classes. Performance unit tests are frequently created and conducted by the developer who wrote the module of code being tested. |
| Utilization | In the context of performance testing, utilization is the percentage of time that a resource is busy servicing user requests. The remaining percentage of time is considered idle time. |
| Validation test | A validation test compares the speed, scalability, and/or stability characteristics of the product under test against the expectations that have been set or presumed for that product. |
| Workload | Workload is the stimulus applied to a system, application, or component to simulate a usage pattern with regard to concurrency and/or data inputs. The workload includes the total number of users, concurrent active users, data volumes, and transaction volumes, along with the transaction mix. For performance modeling, you associate a workload with an individual scenario. |
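To tie several of these terms together, here is a small worked example (not from the original guide) that computes response time, latency percentiles, throughput, and a rough utilization figure from a set of recorded request timings. The timing values are invented purely for illustration.

```python
"""Computing several glossary metrics from recorded request timings.
The timing values are invented purely for illustration."""
import statistics

# Response times (seconds) for 200 requests observed during a 10-second run.
response_times = [0.02 + 0.01 * (i % 5) for i in range(200)]
test_duration_s = 10.0

# Response time / latency: how long each request took to complete.
avg_rt = statistics.fmean(response_times)
p95_rt = statistics.quantiles(response_times, n=20)[-1]

# Throughput: units of work handled per unit of time.
throughput = len(response_times) / test_duration_s

# Utilization: percentage of time a resource is busy servicing requests.
# Crude approximation that treats the system as one resource that is busy
# for the full duration of each request.
utilization = min(sum(response_times) / test_duration_s, 1.0)

print(f"average response time: {avg_rt * 1000:.0f} ms")
print(f"95th-percentile response time: {p95_rt * 1000:.0f} ms")
print(f"throughput: {throughput:.1f} requests/s")
print(f"utilization: {utilization:.0%}")
```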
Performance testing helps to identify bottlenecks in a system, establish a baseline for future testing, support a performance tuning effort, and determine compliance with performance goals and requirements. Including performance testing very early in your development life cycle tends to add significant value to the project.
For a performance testing project to be successful, the testing must be relevant to the context of the project, which helps you to focus on the items that are truly important.
If the performance characteristics are unacceptable, you will typically want to shift the focus from performance testing to performance tuning in order to make the application perform acceptably. You will likely also focus on tuning if you want to reduce the amount of resources being used and/or further improve system performance.
Performance, load, and stress tests are subcategories of performance testing, each intended for a different purpose.
Creating a baseline against which to evaluate the effectiveness of subsequent performance-improving changes to the system or application will generally increase project efficiency.