Contingency Planning
Keep the following topics in mind during contingency planning.
Disaster Planning
Just as every business needs to plan and budget for future growth, you need to plan for total or partial loss of business data. To determine what provisions to make, estimate the cost in time and money to rebuild or replace critical data. Consider the following questions:
Do you know the cost of reconstructing your company's financial, personnel, and other business data?
Do you know if your business insurance would cover any or all of the cost of replacing data?
Do you know how long it would take to reconstruct your business data? How might this affect future business?
Do you know the cost per hour of computer downtime?
To prevent a natural disaster or sabotage from becoming a financial disaster for your business, test your plan for recovering and restoring critical data. Keep copies of your disaster recovery plan at on-site and off-site locations. Key personnel also need to keep a copy of the plan at home.
Because many books and magazine articles discuss disaster planning and recovery in detail, this section suggests topics to explore further instead of presenting detailed disaster plans to implement. Your insurance company can provide you with current and specific information for your situation.
The following are some important issues to consider when developing a comprehensive disaster plan to incorporate into your daily operations.
What data do you need to back up, and how often do you need to do backups?
What critical computer or other hardware configuration information is not saved during normal tape backups and needs to be saved separately?
What data needs to be stored on-site or off-site, and how does it need to be stored?
What training enables operators and administrators to respond quickly and effectively in an emergency?
Assessing the Probability of Failure
Mean time between failures (MTBF) information supplied by some equipment manufacturers is generally helpful only if you do extensive analysis and modeling based on your company's pattern of use. Otherwise, use MTBF information only as a relative measure of reliability.
Maintaining a record of past failures and their causes can be very helpful. This information can help you categorize failures by type, such as:
Hardware failure on a server, client, or network component.
Software failure of the operating system or applications on a server or client.
Administrative error.
User error.
Deliberate damage, such as sabotage or a virus.
The following questions can help you analyze failures and your procedures for handling them:
What was done, or what can be done, to solve the problem?
How long did the solution take, or how long would it take?
What did the solution cost, or what would it cost?
What actions have you taken to reduce the recurrence of each recorded failure?
What changes have you made that might affect the number of failures? Changes might include the size of local area networks (LANs) or wide area networks (WANs), or the number of:
Servers
Clients
Users
Administrators
Intermediary devices
External connections
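The failure categories above can be kept in a simple structured log. The following is a minimal sketch, not a prescribed format: the `FailureRecord` fields and the `summarize` helper are hypothetical names chosen for illustration, and the two sample entries are taken from the examples in Table 11.1.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class FailureType(Enum):
    # Categories taken from the list of failure types in this section.
    HARDWARE = "hardware failure"
    SOFTWARE = "software failure"
    ADMIN_ERROR = "administrative error"
    USER_ERROR = "user error"
    DELIBERATE = "deliberate damage"


@dataclass
class FailureRecord:
    # Hypothetical record layout; extend it with cause, solution, and cost as needed.
    description: str
    kind: FailureType
    downtime_hours: float


def summarize(records):
    """Return {FailureType: (number of failures, total downtime hours)}."""
    totals = defaultdict(lambda: [0, 0.0])
    for record in records:
        totals[record.kind][0] += 1
        totals[record.kind][1] += record.downtime_hours
    return {kind: (count, hours) for kind, (count, hours) in totals.items()}


# Sample entries based on the two failures described in Table 11.1.
failure_log = [
    FailureRecord("Network adapter failure on sales file server", FailureType.HARDWARE, 3),
    FailureRecord("Router failure between development and testing", FailureType.HARDWARE, 16),
]

for kind, (count, hours) in summarize(failure_log).items():
    print(f"{kind.value}: {count} failures, {hours} hours of downtime")
```

Grouping by category makes it easier to see which type of failure accounts for most of the downtime before you decide where to spend on prevention.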
Estimating Replacement Costs
There are several ways to measure the costs of recovering from problems. Some are easy to calculate, such as:
Replacing file servers, mail servers, or print servers.
Replacing servers running applications such as Microsoft® SQL Server™ or Microsoft® Systems Management Server.
Replacing gateway servers running Routing and Remote Access, Microsoft® SNA Server, Proxy Service, or Novell NetWare.
Replacing workstations for personnel.
Replacing computer components, such as hard disks and network adapters.
Replacing products that have a set shelf life.
Far more difficult to measure, but just as devastating, are the invisible costs of computer downtime, such as lost sales, lost customer goodwill, lost productivity, increased costs for makeup time, missed contractual obligations, and loss of competitiveness.
If you have kept records of failures, you might find them useful in your contingency planning. You can investigate ways to avoid each failure, or to minimize the downtime associated with the failure. If you have cost information for the failures, you can then compare the cost of each failure to the cost of preventing or minimizing the failure. Table 11.1 describes two examples of failure, the costs related to each failure, and the effects of implementing solutions or workarounds to avoid future failures.
Table 11.1 Examples of the Effects of Failure
Category | Example One | Example Two
---|---|---
Failure description | File server in sales department down, network adapter failure | Router failure between development and testing department
Effect | Lost sales | Lost productivity of employees
Total downtime last year | Three hours | 16 hours
Costs of failure per hour | $10,000 | Average hourly wage of 10 affected employees is $18/hr
Annual downtime costs | $30,000 | $2,880
Possible resolution or workaround | Three spare network adapters at $100 each | Put an alternate router in place or obtain a spare router
Expected costs of resolution or workaround | $300 | $500 – $2,000
Estimated savings during first year with resolution in place | $29,700 | $880 – $2,380
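The arithmetic behind Table 11.1 can be reproduced with a short calculation. This is a sketch under one assumption the table makes implicitly: that the resolution eliminates all of the downtime recorded for the previous year. The function names are hypothetical and chosen only for illustration.

```python
def annual_downtime_cost(hours_down, cost_per_hour):
    """Cost of one year's downtime at a flat hourly cost."""
    return hours_down * cost_per_hour


def first_year_savings(annual_cost, resolution_cost):
    """Savings in the first year, assuming the resolution removes all downtime."""
    return annual_cost - resolution_cost


# Example One: sales file server, 3 hours down at $10,000 per hour in lost sales.
sales_cost = annual_downtime_cost(3, 10_000)
sales_savings = first_year_savings(sales_cost, 3 * 100)  # three adapters at $100 each

# Example Two: router failure, 16 hours down, 10 employees at $18 per hour.
router_cost_per_hour = 10 * 18
router_cost = annual_downtime_cost(16, router_cost_per_hour)
router_savings_low = first_year_savings(router_cost, 2_000)  # most expensive resolution
router_savings_high = first_year_savings(router_cost, 500)   # least expensive resolution

print(f"Example One: cost ${sales_cost:,}, savings ${sales_savings:,}")
print(f"Example Two: cost ${router_cost:,}, "
      f"savings ${router_savings_low:,} to ${router_savings_high:,}")
```

The same two functions can be applied to each entry in your own failure records to compare the cost of a failure with the cost of preventing it.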