Eliminating planned downtime: the real impact and how to avoid it

Computer Technology Review, May, 2004 by Robert Peglar

Today's enterprises count on their data centers to be available 24 hours a day, seven days a week. But no matter how well those data centers may be architected and maintained, there will be times when applications and systems are unavailable because of planned downtime.

Planned downtime is by far the biggest reason that applications and systems are unavailable, accounting for 90% of the time a company's systems are offline. Typically, somewhat more than half of this planned downtime is attributable to database backup. The maintenance, upgrading and replacement of application and system software, hardware and networks typically accounts for most of the rest of the time in this category (Enterprise Management Associates, Inc., 2002).

Planned downtime, by definition, can be anticipated. Despite the planned aspect of these interruptions, however, even a minor modification to storage subsystems can cause servers and applications to be unavailable for an extended period of time.

Thus, although companies spend a great deal of time and resources to accommodate planned downtime and often regard such interruptions as a necessary part of doing business, enterprises will always seek solutions that let them minimize and ultimately eliminate them.

Proof Point: Reducing planned maintenance can have impressive results. Eliminating a daily 2-hour maintenance window on a system that supports a revenue process of $1,000 per hour adds $2,000 each day. Assuming the business runs 24x7, 365 days a year, this one step will increase revenues by $730,000 per year.
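The arithmetic behind this proof point can be sketched in a few lines of Python. This is an illustration only: the function name and the 365-day default are assumptions, not part of the original analysis.

```python
def annual_revenue_recovered(window_hours_per_day, revenue_per_hour, days_per_year=365):
    """Revenue regained per year by eliminating a daily maintenance window.

    Assumes the system would earn revenue at a constant hourly rate
    during every hour that the window currently takes offline.
    """
    daily_gain = window_hours_per_day * revenue_per_hour
    return daily_gain * days_per_year


# The figures from the proof point: a 2-hour daily window at $1,000 per hour.
print(annual_revenue_recovered(2, 1_000))  # 730000
```

Changing the inputs shows how quickly the figure scales: the result is linear in both the window length and the hourly revenue.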

Downtime's Impact on IT

The impact on the IT department may represent only a small piece of the overall shock that system downtime can deliver to a company's business processes.

When important systems go down, business processes also fail.

Obviously, any estimate of business losses due to system unavailability must start from the per-hour value of the systems involved. That value will, of course, vary by industry, by company within each industry, and by application. It is clear, however, that many enterprises, particularly those engaged in large-scale financial transactions or e-commerce, may lose several thousand dollars (and in some cases, tens of thousands of dollars) in revenue for each minute a crucial storage system is unavailable.

Proof Point: Quantify the revenue impact of a key system going offline. In extreme cases, the revenue value of eliminating a single hour of downtime will more than justify the entire price of upgrading a storage system. For example, the cost of a single hour of downtime for a system doing credit card sales authorizations is estimated to be between $2.2M and $3.1M.
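A short Python sketch shows how the hourly estimate translates into a per-minute cost and into a break-even point for an upgrade. The function names and the upgrade price are hypothetical and chosen only for illustration; the hourly figures are the ones quoted above.

```python
def cost_per_minute(cost_per_hour):
    """Convert an hourly downtime cost into a per-minute cost."""
    return cost_per_hour / 60


def breakeven_hours(upgrade_price, cost_per_hour):
    """Hours of avoided downtime needed to pay back an upgrade."""
    return upgrade_price / cost_per_hour


# Hourly estimates for credit card sales authorization, from the proof point.
low, high = 2_200_000, 3_100_000
print(round(cost_per_minute(low)), round(cost_per_minute(high)))  # 36667 51667

# Hypothetical upgrade price, an assumption made only for this example.
hypothetical_upgrade_price = 1_500_000
print(breakeven_hours(hypothetical_upgrade_price, low) < 1)  # True
```

At these rates each minute of downtime costs tens of thousands of dollars, so an upgrade priced below the hourly loss pays for itself in less than one avoided hour.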

Beyond the tangible costs associated with revenue disruption, an enterprise's business can suffer more subtle damage, especially if its IT infrastructure is subject to repeated breakdown. When business systems go offline, companies can lose their good reputation as well as revenue. Customers become frustrated, relationships become frayed, and formerly loyal clients turn to alternative sources.

Of course, a company faces more than disruption of revenue while business processes are offline. Employees sit idle or are unable to work at full capacity because of poor system or application performance, wasting more resources and jeopardizing key customer relationships.

Problems of Manually Managed Systems

There are also problems other than downtime itself that stem from storage systems dependent on manual configuration and management. Highly skilled personnel with multiple skill sets are required to manage, configure, and optimize the performance of large, distributed storage infrastructures.

Companies invest both time and money to hire, develop and support the best people they can find. Unfortunately, when systems are managed manually, even a company with the best employee recruitment, training and retention program won't always be able to solve all the problems it may face in the course of an IT business shift. This is because even when the best people are recruited and well trained, they will still make mistakes--many of which either lengthen planned downtime or cause subsequent unplanned downtime.

Proof Point: Quality help at any level demands a premium. All computations of the costs associated with salary/wages must be at a fully burdened rate. That is, they must include the cost of all benefits that accrue to the employee in addition to salary. These benefits include vacation time, medical plans, and any other corporate contributions that are part of the pay package. Most companies figure the burden at something between 33% and 50% on top of the salary. Thus, a $50-an-hour wage could actually represent as much as $75 per hour in cost. Most senior IT personnel and consultants make substantially more than this.
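The burdened-rate calculation can be sketched as follows. The sketch treats the 33%-50% figure as overhead added on top of base pay, which is the reading the $50-to-$75 example implies; the function name is an assumption made for illustration.

```python
def fully_burdened_rate(base_hourly, burden_fraction):
    """Hourly cost of an employee once benefits are added to base pay.

    burden_fraction is the benefits overhead expressed as a fraction
    of base pay (0.33 for 33%, 0.50 for 50%).
    """
    return base_hourly * (1 + burden_fraction)


# The article's example: a $50 hourly wage at the top of the 33%-50% range.
print(fully_burdened_rate(50, 0.50))  # 75.0

# The same wage at the bottom of the range.
print(round(fully_burdened_rate(50, 0.33), 2))  # 66.5
```

Multiplying the burdened rate by the staff hours spent on a maintenance window gives the labor cost of that window, which is the figure to weigh against the cost of automating the task.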

Retention is also a problem. Because large enterprise storage infrastructures are so complex, keeping the skilled staff who know how to manage them is a critical issue.

 


Content provided in partnership with Thomson Gale