Managing the capacity of business services, applications, and their associated components is an ongoing challenge. A crystal ball and unlimited funding would certainly reduce or eliminate some of the challenges.
Challenges such as:
- Infrastructure capacity planning not aligned with current business initiatives.
- Business relies upon availability of services.
- Difficult to anticipate departmental needs related to IT capacity.
Our Advice
Critical Insight
The quest for availability drives the need for capacity management, which matures into proactive prediction and in turn enhances availability.
Impact and Result
Observe, Manage, and Model!
- Identify what components and services you have and add in the various initiatives that will impact those services and components. (Observe!)
- Set thresholds, manage risk, and use your tools to take actions as capacity and availability become a challenge. (Manage!)
- Use trending, metrics, and other insights to predict capacity requirements and reduce availability challenges. (Model!)
Availability & Capacity Management
Maximize the benefits of infrastructure monitoring investments by diagnosing & assessing transaction performance, from network to server to end-user interface.
This course makes up part of the Infrastructure & Operations Certificate.
- Course Modules: 8
- Estimated Completion Time: 1.5 hours
- Featured Analysts:
  - PJ Ryan, Research Director
  - Gord Harrison, SVP of Research and Advisory
Manage Your Capacity to Increase Your Availability
Maximize availability through proactive capacity management.
Table of Contents
Analyst Perspective
Executive Summary
Thought Model
Phase 1: Observe and Monitor
- Step 1.1: Align Business Initiatives and Intake Sources
- Step 1.2: Business Impact Analysis
- Step 1.3: Comprehend Components
Phase 2: Manage Capacity and Availability
- Step 2.1: Initiate Event Management
- Step 2.2: Establish Thresholds
- Step 2.3: Explore Management Options
- Step 2.4: Appreciate Risk Management
- Step 2.5: Onboard New Requests and Components
Phase 3: Implement Modeling
- Step 3.1: Understand Trending
- Step 3.2: Define Availability Metrics
- Step 3.3: Availability in the Business Context
- Step 3.4: Bring It All Together
Bibliography
EXECUTIVE BRIEF
Analyst Perspective
Capacity management directly impacts availability! No storage capacity to store that incoming email message means email is no longer available. No email now means other services are impacted. This is a domino effect that starts with a lack of capacity.
Availability is essential to any business operation. It is directly impacted by several factors, and capacity is one of the most prominent of those factors. Exceeding capacity will directly, and in most cases immediately, impact availability.
Identify what components and services you have and add in the various initiatives that will impact those services and components. That’s your insight, your visibility. Keep an eye on those components, services, and initiatives (OBSERVE).
Now establish triggers so you know when initiatives are about to overload your components and services. This overloading will directly impact your availability. Keep it within your control through a variety of options. Don’t forget about new initiatives and components. Include them in the overall process (MANAGE).
What about the next steps? Use your tools to identify trends. Predict what will happen in 12, 18, or even 24 months in the future. Establish solid SLAs because you know you can provide a reliable service. Communicate this through metrics and statistics. Now you are MODELING.
Observe, Manage, and Model! Sounds easy enough, but is it? Yes, if you have a plan and a process.
P.J. Ryan, Research Director, Infrastructure & Operations, Info-Tech Research Group
Executive Summary
Your Challenge
Managing the capacity of business services, applications, and their associated components is an ongoing challenge. A crystal ball and unlimited funding would certainly reduce or eliminate some of the challenges.
Challenges such as:
- Infrastructure capacity planning not aligned with current business initiatives.
- The absence of a process for capacity management.
- Difficult to anticipate departmental needs related to IT capacity.
Common Obstacles
Fighting fires, keeping the lights on, project support, and legacy debt maintenance – these obstacles prevent you from addressing current capacity and availability issues. Other common obstacles include:
- No mechanism to manage capacity or measure availability.
- Supply chain issues related to many components.
- No gating or onboarding process for new components or capacity requests.
It feels like every time you overcome one obstacle, another challenge unexpectedly appears out of nowhere – but there is hope.
Info-Tech’s Approach
Observe, Manage and Model!
- Identify what components and services you have and add in the various initiatives that will impact those services and components. (Observe)
- Set thresholds, manage risk, and use your tools to take actions as capacity and availability become a challenge. (Manage)
- Use trending, metrics, and other insights to predict capacity requirements and reduce availability challenges. (Model)
Use the Info-Tech approach to get your capacity under control and provide increased availability.
Info-Tech Insight
The quest for availability drives the need for capacity management, which matures into proactive prediction and in turn enhances availability.
Manage Your Capacity to Increase Your Availability
Maximize availability through proactive capacity management.
Business Challenges
- Infrastructure is not aligned with current business initiatives.
- There is a lack of infrastructure resources for new initiatives.
- Existing capacity challenges lead to availability issues.
Capacity Management Plan
- Observe current components and business capacity requirements.
- Manage capacity adjustment solutions.
- Model future capacity requirements through proactive forecasting and adjustments.
Insight
The desire for better availability drives the evolution of monitoring into event management. This evolution enables better capacity management, which in turn improves availability.
Info-Tech’s Approach
Observe
Identify what components and services you have and add in the various initiatives that will impact those services and components.
Manage
Set thresholds, manage risk, and use your tools to take actions as capacity and availability become a challenge.
Model
Use trending, metrics, and other insights to predict capacity requirements and reduce availability challenges.
Understand what happens when capacity/availability management fails
The goal of capacity management is to optimize organizational performance by ensuring that the right level of resources are available, while also maximizing resource utilization and minimizing costs. (Source: Day.io)
Services become unavailable.
If availability and capacity management are not constantly practiced, an inevitable consequence is downtime or a reduction in the quality of that service. Critical sub-component failures can knock out important systems on their own.
Money is wasted.
In response to fears about availability, it’s entirely possible to massively overprovision or switch entirely to a pay-as-you-go model. This, unfortunately, brings with it a whole host of other problems, including overspending. Remember: infinite capacity means infinite potential cost.
IT remains reactive.
If IT is constantly putting out capacity/availability-related fires, there is no room for optimization and activities to increase organizational maturity. Effective availability and capacity management will allow IT to focus on other work.
Save money and drive efficiency with an effective capacity management and availability plan
Overprovisioning happens because of the old style of infrastructure provisioning (hardware refresh cycles) and because capacity managers don’t know how much they need (as a result of either inaccurate or nonexistent information).
According to 451 Research, 59% of enterprises have had to wait 3+ months for new capacity. It is little wonder, then, that so many opt to overprovision.
Capacity management is about ensuring that IT services are available, and with lead times like that, overprovisioning can be more attractive than the alternative.
Fortunately, there is hope. An effective capacity management and availability plan can help you:
- Observe your services, applications, and components as well as existing and upcoming initiatives.
- Properly Manage your capacity.
- Model your future capacity needs.
Balancing overprovisioning and spending is the capacity manager’s struggle.
Capacity and availability
Availability and capacity are not the same, but they are related and can be effectively managed together as part of a single process.
If an IT department is unable to meet demand due to insufficient capacity, users will experience downtime or a degradation in service.
To be clear, capacity is not the only factor in availability – reliability, serviceability, etc. are significant as well.
But no organization can effectively manage availability without paying sufficient attention to capacity.
“Availability management is concerned with the design, implementation, measurement, and management of IT services to ensure that the stated business requirements for availability are consistently met.” (OGC, Best Practice for Service Delivery, 12)
“Capacity management aims to balance supply and demand [of IT storage and computing services] cost-effectively…” (OGC, Business Perspective, 90)
Integrate the three levels of capacity management
Successful capacity management involves a holistic approach that incorporates all three levels.
- Business: The highest level of capacity management, business capacity management, involves predicting changes in the business’ needs and developing requirements so that IT can adapt to those needs. Example: an influx of new clients from a failed competitor.
- Service: Service capacity management focuses on ensuring that IT services are monitored to determine if they are meeting pre-determined SLAs. The data gathered here can be used for incident and problem management. Example: increased website traffic.
- Component: Component capacity management involves tracking the functionality of specific components (servers, hard drives, etc.), effectively tracking their utilization and performance, and making predictions about future concerns. Example: insufficient web server compute.
The C-suite cares about business capacity as part of the organization’s strategic planning. Service leads care about their assigned services. IT infrastructure is concerned with components, but not for their own sake. Components support services, and services are ultimately designed to facilitate business.
Consider the relationship between component capacity and service capacity
End users’ thoughts about IT are based on what they see. They are, in other words, concerned with service availability. Does the organization have the ability to provide access to needed services?
Service
- CRM
- ERP
Component
- Switch
- SMTP server
- Archive database
- Storage
“You don’t ask the CEO or the guy in charge ‘What kind of response time is your requirement?’ He doesn’t really care. He just wants to make sure that all his customers are happy.” (Todd Evans, Capacity and Performance Management SME, IBM.)
Manage availability and keep your stakeholders happy
If you run out of capacity, you will inevitably encounter availability issues like downtime and performance degradation. End users do not like downtime, and neither do their managers.
There are three variables that are monitored, measured, and analyzed as part of availability management more generally (Valentic).
Uptime:
The availability of a system is the percentage of time the system is “up” (and not degraded), which can be calculated using the following formula: uptime / (uptime + downtime) × 100%. The more components there are in a system, the lower the availability, as a rule.
Reliability:
The length of time a component/service can go before there is an outage that brings it down, typically measured in hours.
Maintainability:
The amount of time it takes for a component/service to be restored in the event of an outage, also typically measured in hours.
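The uptime formula above, and the rule that more components mean lower availability, can be illustrated with a short Python sketch. The function names are illustrative and not from the source, and the chained calculation assumes that every component must be up for the service to be up and that components fail independently, which real systems only approximate.

```python
from math import prod


def availability_pct(uptime_hours: float, downtime_hours: float) -> float:
    """Apply uptime / (uptime + downtime) x 100%."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("the observation window must be longer than zero hours")
    return uptime_hours / total * 100.0


def serial_availability_pct(component_pcts: list[float]) -> float:
    """Availability of a chain where every component is required.

    Assumption: failures are independent, so the fractions multiply.
    """
    return prod(pct / 100.0 for pct in component_pcts) * 100.0


# One hour of downtime in a 720-hour (30-day) month.
monthly = availability_pct(uptime_hours=719, downtime_hours=1)

# Three required components, each available 99.9% of the time.
chain = serial_availability_pct([99.9, 99.9, 99.9])

print(f"single system: {monthly:.3f}%")
print(f"three-component chain: {chain:.3f}%")
```

Feeding reliability (hours between outages) in as uptime and maintainability (hours to restore) in as downtime gives the same percentage from those two measures. The chain result is lower than any single component, which is the point of the rule stated under Uptime.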
Enter the cloud: changes in the capacity manager role
There can be no doubt – the rise of the public cloud has fundamentally changed the nature of capacity management.
Features of the public cloud | Implications for capacity management
Instant, or near-instant, instantiation | Lead times drop; capacity management is less about ensuring equipment arrives on time.
Pay-as-you-go services | Capacity no longer needs to be purchased in bulk. Pay only for what you use and shut down instances that are no longer necessary.
Essentially unlimited scalability | Potential capacity is infinite, but so are potential costs.
Offsite hosting | Redundancy, but at the price of the increasing importance of your internet connection.
Use best practices to optimize your cloud resources
Even in the era of elasticity, capacity planning is crucial. Spot instances – the spikes in the graphs above – are more expensive, but if your capacity needs vary substantially, reserving instances for all of the space you need can cost even more money. Efficiently planning capacity will help you draw this line.
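One way to reason about where to draw that line is to compare the cost of different reservation levels against an hourly demand profile. The sketch below is a simplified model, not a pricing tool: the demand figures, the two hourly rates, and the function names are all hypothetical, and it assumes reserved capacity is paid for every hour whether used or not while anything above it is bought at a higher burst rate.

```python
def total_cost(demand, reserved_units, reserved_rate, burst_rate):
    """Cost of serving `demand` (units needed per hour) with a fixed reservation.

    Reserved units are billed every hour; demand above the reservation
    is billed at the (assumed higher) burst rate.
    """
    reserved_cost = reserved_units * reserved_rate * len(demand)
    overflow_units = sum(max(0, d - reserved_units) for d in demand)
    return reserved_cost + overflow_units * burst_rate


def cheapest_reservation(demand, reserved_rate, burst_rate):
    """Try every reservation level from 0 to peak demand and keep the cheapest."""
    levels = range(0, max(demand) + 1)
    return min(levels, key=lambda r: total_cost(demand, r, reserved_rate, burst_rate))


# Hypothetical profile: steady demand of 4 units with one spike to 10.
demand = [4, 4, 4, 4, 10, 4, 4, 4]
best = cheapest_reservation(demand, reserved_rate=1.0, burst_rate=3.0)

print("reserve", best, "units, cost", total_cost(demand, best, 1.0, 3.0))
print("reserve for peak, cost", total_cost(demand, max(demand), 1.0, 3.0))
```

With these made-up numbers, reserving for the steady baseline and paying the burst rate for the spike costs less than reserving for the peak, even though the burst rate is three times higher per unit.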
Evaluate business impact; not all systems are created equal
Limited resources are a reality. Detailed visibility into every single system is often not feasible and could be too much information.
Simple and effective. Sometimes a simple display can convey all of the information necessary to manage critical systems. In cars it is important to know your speed, how much fuel is in the tank, and whether or not you need to change your oil/check your engine.
Where to begin? Specialized information is sometimes necessary, but it can be difficult to navigate.
Execute a business impact analysis (BIA) as part of a broader availability plan
Business impact analyses are an invaluable part of a broader IT strategy. Conducting a BIA benefits a variety of processes, including event management, disaster recovery, business continuity, and availability and capacity management.
STEP 1: Record applications and dependencies
Utilize your asset management records and document the applications and systems that IT is responsible for managing and recovering during a disaster.
STEP 2: Define impact scoring scale
Ensure an objective analysis of application criticality by establishing a business impact scale that applies to all applications.
STEP 3: Estimate impact of downtime
Leverage the scoring criteria from the previous step and establish an estimated impact of downtime for each application.
STEP 4: Identify desired RTO and RPO
Define what the RTOs/RPOs should be based on the impact of a business interruption and the tolerance for downtime and data loss.
STEP 5: Determine current RTO/RPO
Conduct tabletop planning and create a flowchart of your current capabilities. Compare your current state to the desired state from the previous step.
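The comparison in the final step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1-to-5 impact scale, the record fields, and the sample applications and hour values are hypothetical, and a real BIA would draw them from the scoring scale and tabletop exercise described above.

```python
from dataclasses import dataclass


@dataclass
class AppRecovery:
    name: str
    impact_score: int     # hypothetical scale: 1 (minor) to 5 (severe)
    desired_rto_h: float  # target time to restore service, in hours
    current_rto_h: float  # what current capabilities can achieve
    desired_rpo_h: float  # tolerable data loss, in hours
    current_rpo_h: float


def recovery_gaps(apps):
    """Return (name, impact, rto_gap_h, rpo_gap_h) for apps missing a target,
    ordered so the highest-impact applications come first."""
    gaps = []
    for app in apps:
        rto_gap = max(0.0, app.current_rto_h - app.desired_rto_h)
        rpo_gap = max(0.0, app.current_rpo_h - app.desired_rpo_h)
        if rto_gap > 0 or rpo_gap > 0:
            gaps.append((app.name, app.impact_score, rto_gap, rpo_gap))
    return sorted(gaps, key=lambda gap: gap[1], reverse=True)


# Hypothetical records for illustration only.
apps = [
    AppRecovery("ERP", 5, desired_rto_h=4, current_rto_h=12, desired_rpo_h=1, current_rpo_h=1),
    AppRecovery("Fax", 1, desired_rto_h=72, current_rto_h=24, desired_rpo_h=24, current_rpo_h=24),
    AppRecovery("CRM", 3, desired_rto_h=8, current_rto_h=8, desired_rpo_h=4, current_rpo_h=6),
]

for name, impact, rto_gap, rpo_gap in recovery_gaps(apps):
    print(f"{name}: impact {impact}, RTO gap {rto_gap}h, RPO gap {rpo_gap}h")
```

Sorting by impact score keeps attention on the systems where a gap matters most, and applications that already meet their targets drop out of the list.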
Info-Tech Insight
Engaging in detailed capacity planning for an insignificant service draws time and resources away from more critical capacity planning exercises. Time spent tracking and planning use of the ancient fax machine in the basement is time you’ll never get back.
Combine historical data with the needs you’ve solicited to holistically project your future needs
Predicting the future is difficult, but when it comes to capacity management, foresight is necessary.
Critical inputs
In order to project your future needs, the following inputs are necessary:
- Usage trends: While it is true that past performance is no indication of future demand, trends are still a good way to validate requests from the business.
- Line of business requests: An understanding of the projects the business has in the pipeline is important for projecting future demand.
- Institutional knowledge: Read between the lines. As experts on information technology, the IT department is well-equipped to translate needs into requirements.
Info-Tech Insight
If the factor, unit, utilization rate, etc., is trending upward or downward over time, it will continue in that direction unless something changes to impact that trend. Learn from your historical data!
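A minimal sketch of this idea fits a straight line to historical utilization and estimates how long until a limit is reached. It assumes evenly spaced monthly readings and a linear trend, which is a simplification; the sample data, the 80% limit, and the function names are hypothetical.

```python
def linear_trend(values):
    """Least-squares slope and intercept for evenly spaced observations."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until(values, limit):
    """Periods until the fitted trend reaches `limit`.

    Returns 0.0 if the trend is already at the limit and None if the
    trend is flat or falling, so the limit is never reached.
    """
    slope, intercept = linear_trend(values)
    current = intercept + slope * (len(values) - 1)
    if current >= limit:
        return 0.0
    if slope <= 0:
        return None
    return (limit - current) / slope


# Hypothetical monthly storage utilization, in percent.
history = [50, 52, 54, 56, 58, 60]
print("months until 80% utilization:", periods_until(history, limit=80))
```

A forecast like this is only as good as the assumption that nothing changes the trend, so it works best alongside line of business requests and institutional knowledge rather than in place of them.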
Establish visibility into critical systems
You may have seen “If you can’t measure it, you can’t manage it” or a variation thereof floating around the internet. This adage is consumable and makes sense…doesn’t it?
“It is wrong to suppose that if you can’t measure it, you can’t manage it – a costly myth.” (W. Edwards Deming, statistician and management consultant, author of The New Economics)
While it is true that total monitoring is not absolutely necessary for management, when it comes to availability and capacity – objectively quantifiable service characteristics – a monitoring strategy is unavoidable. Capturing fluctuations in demand, and adjusting for those fluctuations, is among the most important functions of a capacity manager, even if hovering over employees with a stopwatch is poor management.
Three steps to manage your capacity and increase your availability
Observe
Identify what components and services you have and add in the various initiatives that will impact those services and components.
Manage
Set thresholds, manage risk, and use your tools to take action as capacity and availability become a challenge.
Model
Use trends, metrics, and other insights to predict capacity requirements and reduce availability challenges.
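As a small illustration of the Manage step, the sketch below classifies utilization readings against warning and critical thresholds. The 75% and 90% defaults, the component names, and the readings are hypothetical; appropriate thresholds depend on each component and on how long it takes to add capacity.

```python
from enum import Enum


class Status(Enum):
    OK = "ok"
    WARNING = "warning"
    CRITICAL = "critical"


def classify(utilization_pct: float,
             warning_pct: float = 75.0,
             critical_pct: float = 90.0) -> Status:
    """Map a utilization reading to a status using two thresholds."""
    if utilization_pct >= critical_pct:
        return Status.CRITICAL
    if utilization_pct >= warning_pct:
        return Status.WARNING
    return Status.OK


def needs_action(readings: dict[str, float]) -> dict[str, Status]:
    """Keep only the components whose reading crossed a threshold."""
    statuses = {name: classify(pct) for name, pct in readings.items()}
    return {name: s for name, s in statuses.items() if s is not Status.OK}


# Hypothetical readings, in percent utilized.
readings = {"storage": 92.0, "web server compute": 78.5, "switch": 40.0}
for name, status in needs_action(readings).items():
    print(f"{name}: {status.value}")
```

The warning level is the trigger to start planning an adjustment, while the critical level signals that availability is at immediate risk.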