Establishing an IT Disaster Recovery Plan

Subject: Tech & Engineering
Pages: 15
Words: 5599
Reading time: 22 min
Study level: Master

Abstract

Businesses today face not only uncertain markets but also various threats and disasters, both man-made and natural. Natural disasters include tsunamis, tornadoes, earthquakes and others, while man-made disasters include terrorist attacks. What these disasters have in common is that they strike quickly and without warning, and they destroy business infrastructure, including IT systems and networks as well as buildings. To ensure that organizations continue to function after a disaster has struck, a viable Disaster Recovery Plan (DRP) combined with a Business Continuity Plan (BCP) needs to be in place. These plans should specify the centres from which data should be recovered, the alternate sites where operations can be resumed, what data to back up, the detailed procedures to be used, and the structure of the team that would implement the DRP, including who will do what. Such a plan avoids confusion in the face of a disaster and minimizes the time needed to recover data. This thesis describes the implementation of a DRP and BCP for an IT organization. It sets out the steps to be followed and provides network diagrams, templates for risk assessment and other supporting material.

Introduction

Businesses worldwide operate under conditions that change with the political, economic, and natural environment. Business houses are constantly under the threat of disasters such as earthquakes, terrorist attacks, fires, riots, power outages, and stock market crashes. Because of such risks, intellectual assets, such as classified documents and source code, and physical assets, such as infrastructure and hardware, run the risk of compromise. A plan must therefore be in place to allow businesses to recover their intellectual and physical assets and resume operations at the earliest opportunity. It is also essential to assure clients and business partners, who have invested time and resources, that in case of a disaster their investments can be recovered within an acceptable time frame. To handle such situations, it is important to have an IT disaster recovery plan that counters the effects of disasters. Having a Disaster Recovery Plan (DRP) and a Business Continuity Plan (BCP) becomes especially important when a business expands into overseas markets. This paper discusses the important elements of a DRP and BCP for a company with global operations.

Why IT Backup Plan is required

Recent events such as the 9/11 attacks, Hurricane Katrina and the tsunami in South East Asia show that disasters, both natural and man-made, can strike with very little warning and completely destroy the infrastructure at a particular location, including buildings, cabling, IT systems and even whole towns and cities. Benton (2007) defined disaster recovery as “the process, policies and procedures of restoring operations critical to the resumption of business, including regaining access to data (records, hardware, software, etc.), communications (incoming, outgoing, toll-free, fax, etc.), workspace, and other business processes after a natural or human-induced disaster”. While disaster recovery in general also involves reconstructing buildings, relocating people, building roads, restoring power and communications and many other activities, this paper is limited to the disaster recovery plan for the IT systems of a company (Meade, 1993).

In the current environment, threats from terrorists and from nature place IT systems at high risk. Since many companies have very strict rules regarding the retrieval and storage of sensitive information, data tends to be centralized. If a disaster strikes the central server room where the data is stored and no backup exists elsewhere, all the company’s soft assets would be lost forever. Information about customers, business strategies and records, marketing and trading information and other details would become irrecoverable. In such a scenario, a strategic plan that protects all computer-based operations necessary for the company’s day-to-day survival is imperative.

If a company loses sensitive data, it loses not only its soft assets but also the confidence of its customers, and may well go bankrupt. With the increasing use of IT systems and dependence on business-critical information, protecting irreplaceable data has become a top business need. Since many companies rely on IT systems and regard them as critical infrastructure, regular backup is crucial so that, even after a disaster strikes, the company can resume operating within a short period of time. Many large companies allocate up to 4 percent of their IT budget to disaster recovery systems. It is estimated that, of the companies that lost data and could not replace it, 43 percent went bankrupt, 51 percent had to shut down within two years, and only six percent survived in the long run (Swartz, 2004).

An IT disaster recovery plan is therefore required to ensure that a company can recover quickly after a disaster, retain customer confidence and continue in business.

What to Backup

The question of what to back up is best answered by asking ‘what are the company’s soft assets?’ An IT company may regard its database structures and the source code of its applications as crucial. For example, a company such as Microsoft would consider the source code of Windows XP, MS Office and other software applications as critical and would want to ensure that the code can be recovered at any point in time. A banking company would consider the financial records of its customers, its own receivables and its credit/debit records as very important. Banks treat account details, credit card payment and receipt details, information about mortgages and loans, and Forex accounts as critical and would want backups of such records. A large investment and share trading company, or a bank that deals in futures, would consider its stock portfolio as very important. Government defence bodies would consider details of their troop deployment, the state of munitions and aircraft, and the status and position of different missile systems as crucial to the protection of their country and would want this information to be safe and recoverable at any point in time. The data to be backed up therefore depends on what the company considers crucial, and it varies from one organization to another (Toigo, 2005).

Another issue is the question of data formats and the type of backup. An organization typically stores information in encrypted form, as binary code, or as documents such as MS Word, XLS, PDF and image files, and these formats have to be preserved according to the organization’s needs. To protect their data systems, many organizations encrypt data using 128-bit or 256-bit encryption. At any point during the recovery process, the encryption key should be available to authorized personnel with the required level of clearance (Toigo, 2005).

Different techniques are used for backing up data. One of these is the incremental backup, which writes only the data that has changed since the last backup. Banks and large organizations hold data volumes in the range of terabytes; if a full backup of this data were taken every day, massive resources would be required, the time taken would be excessive, and the system would slow down. An incremental backup avoids this problem by writing only the changed data to the backup area. Also, since the backup process slows down the system, companies run it as a day-end process, late at night when very few users are logged in (Hiatt, 2007).
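The idea of an incremental backup can be illustrated with a minimal Python sketch. It assumes that a change is detected by comparing a SHA-256 digest of each file with the digest recorded at the previous run; the function names and the in-memory `manifest` dictionary are hypothetical and are not taken from the cited sources. Production tools often use cheaper change detection, such as modification times, rather than hashing every file.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def incremental_backup(source: Path, target: Path, manifest: dict[str, str]) -> list[str]:
    """Copy only the files whose contents changed since the last run.

    `manifest` maps relative paths to the digest recorded at the previous
    backup and is updated in place. Returns the relative paths copied.
    """
    copied = []
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(source).as_posix()
        digest = file_digest(path)
        if manifest.get(rel) != digest:  # new file or changed contents
            dest = target / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, dest)  # copies data and file metadata
            manifest[rel] = digest
            copied.append(rel)
    return copied
```

On the first run every file is copied; on later runs only new or modified files are written, which is what keeps the nightly backup window short. In practice the manifest would be persisted between runs, for example as a file stored alongside the backup.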

It is worth remembering the following statement: “When it comes to back up, members of organization are paranoid. While some feel that every little bit of email or document that they have created (which would probably be deleted by the recipient) has to be backed up, others tend to develop paranoia that their documents or writing would be available for everyone to see and they would not want to share it with others. The management has to step in at a certain stage and frame a policy on what is worth backing and what is best left on the PC of a warehouse assistant clerk” (Kaye, 2006).

Understanding different levels of Disasters

There are four levels of disaster that an organization may face, from Level 1 to Level 4, and both the effects and the disaster recovery plan differ by level. Level 1 is the least severe, while Level 4 is regarded as a catastrophe.

Disasters can be classified into (Preston, 1999):

  • Level 1 Disaster: Causes minor outage. An example of Level 1 disaster is modem failure. Some or all business processes at a location might experience minor damage, but processes will continue to run with reduced efficiency. Full processing capability of mission critical business processes and related infrastructure and people can be restored within an hour. Recovery at an alternate site may not be required (Preston, 1999).
  • Level 2 Disaster: Causes moderate outage. An example of Level 2 disaster is LAN failure. Some or all business processes at a location might experience moderate damage. Processes may or may not continue since the equipment is below the minimum capacity to run. Full processing capability of mission critical business processes and related infrastructure and people may be restored within 4 hours. An alternate recovery site may not be required for continuing business but alternate equipment or communication links may be required (Preston, 1999).
  • Level 3 Disaster: Causes severe disaster. An example of Level 3 disaster is riots. Infrastructure ceases to function. Full processing capability of all business processes from that location and related infrastructure may be restored within 1-2 days. Use of alternate recovery site will be required (Preston, 1999).
  • Level 4 Disaster: Is a catastrophe, such as earthquake, war, or a major terrorist attack. This type of disaster results in major disruption of services. Full processing capability cannot be achieved for a substantial period of time. Recovery will require use of alternate recovery site (Preston, 1999).

The following table gives details of these threat levels.

Type of Disaster: Minor Outage (Level 1)
Description: Some or all business processes at a location experience minor damage / outage but processes will continue on a degraded basis. Full processing capability of mission critical business processes and related infrastructure and people can be restored within 1 hour by getting the necessary infrastructure, people and data operational. Recovery at alternate site is determined not to be required. It is assumed that the usual office premises and people are available to the business. Examples:
  1. A link between two locations is temporarily unavailable.
  2. Modem fails.
  3. Sparks in electrical connections force temporary shutdown of servers / routers in that area. Operations resume as soon as electrical connections are repaired.
  4. Virus or hacking attacks, or disruption due to improper behaviour of employees.

Type of Disaster: Moderate Outage (Level 2)
Description: Some or all business processes at a location experience moderate damage / outage. Processes may or may not continue on a degraded basis. Full processing capability of mission critical business processes and related infrastructure and people may be restored within 4 hours. An alternate site may not be required for continuing business but alternate equipment or route (in case of communication links) may be required depending on the criticality of the business process and infrastructure. It is assumed that the usual office premises and people are available to the business. Examples:
  1. Power surge damages equipment.
  2. Link failure (that can be recovered within 4 hours).
  3. LAN failure.

Type of Disaster: Disaster (Level 3)
Description: A centre has experienced severe disaster. There is a total shutdown of infrastructure. Full processing capability of all business processes from that location and related infrastructure and people may be restored within 1-2 days. Use of alternate recovery site will be required. It is assumed that premises and equipment are inaccessible, but people can congregate elsewhere if required. Examples:
  1. Flood / rain / snow makes office premises at one of the offices inaccessible.
  2. Riots / arson at a location near one of the offices renders the office premises inaccessible.
  3. Extended power cut.

Type of Disaster: Catastrophe (Level 4)
Description: A centre has experienced a major disaster that will likely result in a major disruption of services. Full processing capability cannot be achieved for a substantial period of time. Recovery will require use of alternate processing site as well as offsite offices for employees over an extended period of time. Examples:
  1. War
  2. Earthquake
  3. Terrorist attacks / bombing
  4. Extended communal riots etc.

Table 1. Four Levels of Threats (Preston, 1999).
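The four levels in Table 1 lend themselves to a small lookup structure. The following Python sketch is illustrative only: the restoration targets are transcribed from the table (with 48 hours taken as the upper bound of "1-2 days"), while the `classify` rule, its parameters and the class name are hypothetical assumptions, not part of Preston's scheme.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DisasterLevel:
    level: int
    name: str
    restore_within_hours: Optional[float]  # None: no fixed target in Table 1
    alternate_site_required: bool


# Values transcribed from Table 1; 48 hours is the upper bound of "1-2 days".
LEVELS = {
    1: DisasterLevel(1, "Minor Outage", 1, False),
    2: DisasterLevel(2, "Moderate Outage", 4, False),
    3: DisasterLevel(3, "Disaster", 48, True),
    4: DisasterLevel(4, "Catastrophe", None, True),
}


def classify(expected_outage_hours: float, premises_accessible: bool) -> DisasterLevel:
    """Pick the lowest level whose restoration target covers the outage.

    Assumed rule: inaccessible premises imply at least Level 3, because
    Levels 1 and 2 assume that the usual office premises are available.
    """
    floor = 1 if premises_accessible else 3
    for number in sorted(LEVELS):
        candidate = LEVELS[number]
        if number < floor:
            continue
        target = candidate.restore_within_hours
        if target is None or expected_outage_hours <= target:
            return candidate
    return LEVELS[4]
```

For example, `classify(3, True)` yields the Level 2 entry, whereas `classify(3, False)` yields Level 3 because the premises cannot be used and an alternate recovery site is needed.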

A disaster may impact an organization in the following ways (Gilchrist, 2001):

  • The organization may not be able to operate from the affected site.
  • The organization may lose critical resources, such as systems, documents, and people.
  • The organization may not be able to interact and provide services to business partners, clients, brokers, vendors, and other related financial institutions.
  • In addition to incurring financial losses, disasters may impact the credibility of the company. In extreme cases, the company may lose many of the clients.

Objectives of the DRP

The DRP is intended to provide a framework within which companies can take decisions promptly during a business disruption. The objectives of this plan are (Kaye, 2006):

  • To identify major business risks.
  • To proactively minimize the risks to an acceptable level by taking appropriate preventive and/or alternative measures.
  • To effectively manage the consequences of business interruption caused by any event through contingency plans.
  • To effectively manage the process of returning to normal operations in a planned and efficient manner.

The scope of the corporate business continuity management plan document must include plans for restoring:

  • SBUs (Strategic Business Units) and all the Projects being executed by the SBUs
  • Shared services
  • Information Systems at all locations of the company

Framing the IT Disaster Recovery Plan

Information is the key to survival for organizations. Information could be stored either electronically or as hard copies. Disaster Recovery Plan (DRP) is a set of procedures designed to restore information systems. A DRP mostly deals with technological issues and also recommends infrastructure that should be implemented to prevent damages when a disaster occurs. A disaster can make the business processes totally or partially unavailable. Business Continuity Plan (BCP) focuses on sustaining the business processes of a company during and after a disaster and this plan is a continuation of the DRP and cannot be implemented in isolation. A BCP lists the actions to be taken, the resources to be used, and the procedures to be followed before, during, and after a disaster. An IT disaster recovery plan is implemented for an organization in this section (Facer, 2001).

The DRP team within a company is responsible for performing the business impact analysis, a process of classifying information system resources based on their criticality, and for developing and maintaining the DRP. The tasks that need to be covered are included in the BCP document, which the DRP team should also keep up to date. This responsibility includes periodic reviews of the document, both scheduled (time-driven) and unscheduled (event-driven). The DRP defines a Recovery Time Objective (RTO) that specifies the time frame for recovering critical business processes. The DRP must also meet the needs of critical business processes in the event of a disruption extending beyond this time frame. Recovery capability is defined for each Strategic Business Unit (SBU), including all projects being executed under the SBU, and for each shared service, location and Offshore Development Centre (ODC). In the event of a minor or moderate disaster, the recovery capability should ensure that business processes continue to work without affecting any dependent critical business processes. For example, if the main power grid is disrupted, standby facilities such as generators must ensure that power remains available (Facer, 2001).

In this paper, a DRP is implemented for an IT company called ABC Ltd. The following illustration shows how the company is organized.

Figure 1. Assets and Nodes of ABC Ltd. for DRP (adapted from Preston, 1999).

The above figure shows how the assets and nodes of ABC Ltd. are organized. The company has its headquarters in New York and a number of branch units in areas such as Washington, Rochester and Syracuse. The company also has a number of offshore development centres, identified as ABC Europe, ABC Japan, ABC Australia, etc. In addition, the company has a number of clients, identified as Client 1, Client 2 and so on.

Defining the Organization Chart for DRP

Before implementing a DRP, it is essential that an organization chart be created that would identify key employees who would be members of the DRP team. The following figure illustrates the organization chart of ABC Ltd.

Figure 2. Organization Chart for ABC Ltd (Margaret, 2007).

Protecting Intellectual assets with the DRP

In a business relationship, a client invests internal resources, such as personnel and funds, to set up infrastructure. Clients may also provide a company with resources in the form of confidential information, raw source code, initial drawings and machinery. The company serving its clients has similarly invested funds and other resources in the business engagement. These investments represent assets. Companies must take preventive actions, such as setting up a dedicated security team or formulating policies, that help reduce damage when disasters occur.

IT Team Security Structure

The IT Security Team of a company is responsible for implementing and maintaining the corporate security policy at all ODC locations and other support units. A dedicated Security Officer should be assigned to each unit. In addition, the company needs to conduct security awareness programs for all ODCs. The following figure shows a typical IT Security Team structure.

Figure 3. Structure of an IT Security Team (Brunetto, 2006).

This figure shows the structure of the IT Security Team of ABC Ltd. It shows the various SBUs and their locations, and lists the responsibilities of the IT security team at the SBU and the centre.

The DRP Network Diagram

The DRP would need to cover all these units and assets. To allow quick back up and DRP procedures for the company, the following network diagram is proposed.

Figure 4. Network Diagram for DRP (adapted from Preston, 1999).

In the diagram, the connectivity is allowed through a primary ISDN Back Up Line and a Dial Up Line. A separate ISDN line for backup is required since the backup process consumes extra bandwidth and may slow down regular business processes.

Based on the corporate security policy, all locations with direct Internet access should be secured by deploying firewalls. A dedicated team of professionals, certified in the relevant technologies, can manage the firewalls centrally. A change management procedure is also needed so that any desired change can be incorporated into the existing setup at short notice. If backup hardware exists when a disaster occurs, it can be used in the disaster recovery plan to restore services. Gateways can be protected by installing Checkpoint Firewall Modules in the organization’s network. This enterprise-wide implementation is managed from a central management console. At each location a De-Militarized Zone (DMZ) must be created to protect important servers. It is also necessary to ensure that the policies installed on the Checkpoint Firewall Modules are based on the corporate network security policies. Precautions must be taken against Internet hacking and vulnerabilities, which are holes or weak points in the network. The following figure shows a sample firewall installation for a location (Preston, 1999).

Figure 5. Firewall Network Diagram for DRP (adapted from Preston, 1999).

The Firewall would ensure that unauthorized users would not be able to enter the network when back up processes are running or when a DRP plan is being implemented during a disaster.

Steps to Implement a DRP

Developing the DRP involves the following steps (Preston, 1999).

  • Risk Assessment
  • Business Impact Analysis
  • Strategy Selection
  • Business Continuity Plan Documentation
  • Testing
  • Maintenance

The following sections provide details of these steps.

Risk Assessment

In this phase, risks to the business processes are identified, existing mitigation measures are assessed, and further mitigation measures are recommended wherever necessary. The activities in this phase help DRP administrators determine the extent of the potential threats and the risks associated with the company’s IT infrastructure and IT applications. A threat is any circumstance or event that can potentially cause harm to the business. The risk assessment phase includes the following (Hiatt, 2007):

  • Inventory: Identifies and documents the various business processes, hardware, software, communication links, documents, and associated people using standard templates developed by the risk assessment team.
  • Threat analysis: Identifies various threats to the business processes. It also identifies the probability of a threat being executed and the potential impact a threat will have on the business in the event of its execution. This is done using a standard template developed by the risk assessment team. The risk assessment team identifies a list of over 35 possible threats to any asset. Based on this list each location is assessed for the probability of each threat being executed and the potential impact on the business processes.
  • Vulnerability analysis: Scans critical servers and hardware devices owned by the company periodically for identifying vulnerabilities and taking corrective actions based on the audit reports. These reports should be studied for their completeness and adequacy. In addition, while arriving at the probability of a threat being executed, the existing vulnerabilities of each location must be analysed.
  • Business Risk Assessment: Includes a detailed assessment of the practices followed by the business units with respect to risk management. The risk assessment team should conduct detailed interviews using standard questionnaires with senior representatives of the business units to understand the risk management practices of the individual business units.
  • Single Point of Failure (SPOF) Analysis: Identifies the most vulnerable business process. A SPOF is the weakest link in a business process. Each SBU must identify the SPOFs at its locations.
  • Risk Matrix: Analyses the identified risk, derived by qualitative analysis of various threats and vulnerabilities to business processes through threats and vulnerabilities analysis, business risk assessment and SPOF analysis. The risk areas are classified as Very High Risk Areas, High Risk Areas, Medium Risk Areas, and Low Risk Areas. You can also recommend mitigation measures for each risk area identified.
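The risk matrix step can be sketched in a few lines of Python. The four risk areas come from the text above; the three-point rating scale, the multiplication of probability by impact and the score thresholds are illustrative assumptions, as are the example threats and their ratings.

```python
# Assumed three-point scale for both probability and impact.
RATINGS = {"low": 1, "medium": 2, "high": 3}


def risk_area(probability: str, impact: str) -> str:
    """Map a threat's probability and impact to one of the four risk areas.

    The score thresholds are illustrative, not values from the source.
    """
    score = RATINGS[probability] * RATINGS[impact]
    if score >= 9:
        return "Very High"
    if score >= 6:
        return "High"
    if score >= 3:
        return "Medium"
    return "Low"


# Hypothetical assessment for one location: (threat, probability, impact).
threats = [
    ("LAN failure", "high", "medium"),
    ("Earthquake", "low", "high"),
    ("Modem failure", "medium", "low"),
    ("Virus attack", "high", "high"),
]

matrix = {name: risk_area(p, i) for name, p, i in threats}
```

With these sample ratings, the virus attack lands in the Very High risk area and the modem failure in the Low risk area. A real assessment would draw the ratings from the threat and vulnerability templates rather than hard-coding them.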

The following figure illustrates the risk analysis for the company.

Figure 6. Risk Analysis for DRP (Hiatt, 2007).
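One simple way to approach the single point of failure analysis listed among the risk assessment activities is to look for capabilities that only one asset can provide. The sketch below assumes this interpretation; the inventory, the asset names and the function name are hypothetical, although the standby generator and the ISDN backup line echo measures mentioned elsewhere in this paper.

```python
def single_points_of_failure(providers: dict[str, list[str]]) -> dict[str, str]:
    """Return {capability: asset} for every capability served by exactly one asset."""
    return {cap: assets[0] for cap, assets in providers.items() if len(assets) == 1}


# Hypothetical inventory for one location: capability -> assets that can provide it.
providers = {
    "power": ["main grid", "generator"],
    "wan link": ["leased line", "ISDN backup line"],
    "database": ["db-server-1"],
    "firewall": ["fw-1"],
}

spofs = single_points_of_failure(providers)
```

Here the database server and the firewall are flagged because neither has a redundant counterpart, while power and the WAN link each have a standby.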

A number of templates have to be used at this stage to gather information about a project. These would provide micro information at a project level or at a client level. Some templates that need to be used include (Ambs, 2000):

  • Template for DRP Resource Requirements: This template is used to gather data for resources that are required to prepare a DRP.
  • Template For Project: This template is used to gather data about a project and helps to create a DRP at a project level.
  • Template For Project Team Details: This template is used to gather details of the project team members. The data is used to identify key members who may need to be moved to an alternate recovery site in case of a disaster.
  • Template For Client Team Details: This template is used to gather data about the client team details. Members identified here can be contacted in case of a disaster.
  • Template For Resource Requirement at Project Locations: This template is used to gather details of resources required at the alternate recovery site.
  • Template For Project DR Alternate Site: This template is used to gather data for an alternate recovery site.
  • Template At DR Location For People And Resources: This template is useful to gather data about people and other resources required at the alternate site.
  • Template For Min Required Resources At Alternate Site: This template is used to gather data about the minimum resources required at the alternate recovery site. Details of software and hardware that would be required need to be listed.
  • Template For Project Recovery Plan: This template is used to gather data for project recovery.

A sample template is shown below:

Project Disaster Recovery Plan – Project DR Procedures

Backup and Recovery Procedures
Indicate backup procedures and other details for each software resource (e.g. database, code under development) and paper-based resource (e.g. hard copy of the contract signed with the customer).

Backup Procedures
  Frequency of Backup: Weekly
  Location of Stored Data: CA
  File Naming Convention: 8.3
  Description:
  Responsibility of taking Backup: Jane Doe

Recovery Testing Procedures
Indicate how frequently backed up data will be tested for recovery, what the sampling methodology will be, who will test for recovery, who will approve test results, etc.
  Frequency of Recovery Testing: Monthly
  Sampling Method for Recovery Testing: Random
  Description:
  Responsibility: John Doe

Recovery Procedures
Describe the procedures that will be used to recover the resource in the event of a disaster, as a detailed step-by-step procedure to get the application/function up and running.
  Description: Install Oracle and import all data.
  Responsibility: Mike

Table 2. Sample Template for Risk Assessment (Ambs, 2000)
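If such templates are kept in machine-readable form, overdue backups can be flagged automatically. The following Python sketch models only the backup section of the sample template; the class name, the `is_overdue` method and the 30-day approximation of "Monthly" are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed mapping from template frequencies to intervals; 30 days approximates a month.
INTERVALS = {"Weekly": timedelta(days=7), "Monthly": timedelta(days=30)}


@dataclass
class BackupProcedure:
    frequency: str
    storage_location: str
    file_naming_convention: str
    owner: str

    def is_overdue(self, last_run: date, today: date) -> bool:
        """True when more than one full interval has passed since the last backup."""
        return today - last_run > INTERVALS[self.frequency]


# Values taken from the sample template above.
project_backup = BackupProcedure("Weekly", "CA", "8.3", "Jane Doe")
```

A call such as `project_backup.is_overdue(date(2024, 1, 1), date(2024, 1, 9))` returns `True`, since eight days exceed the weekly interval.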

Business Impact Analysis

The overall objective in this phase of the project is to gain an understanding of the business processes and to lay the framework of a business continuity plan for the business units. A Business Impact Analysis (BIA) must be performed with the objective of (Benton, 2007):

  • Evaluating the risk to the business due to systems and/or process failures.
  • Identifying critical business processes and the associated computing applications.
  • Estimating the impact of disruption.
  • Defining the recovery time objectives for critical business processes.

The following figure illustrates the methodology used for BIA.

Figure 7. Business Impact Analysis (Benton, 2007).

This figure shows the business impact analysis approach. BIA is performed by interviewing business process owners using detailed questionnaires / templates. The primary areas on which the interviews should focus are (Benton, 2007):

  • Identification of critical business processes and critical resources and applications associated with critical business processes.
  • Interfaces between various business processes.
  • Identification of outage impacts of business function unavailability and maximum allowable downtimes.
  • Prioritisation of recovery processes through recovery time objectives.

The resultant BIA documented for each business process describes the following:

  • The outage impact for the business process.
  • The criticality of each business process based on the outage impact. The business processes are classified into four levels of criticality – Mission Critical, High Criticality, Medium Criticality, and Low Criticality Business Process.
  • The minimum human resource required to sustain the business process during a disaster.
  • Criticality of locations from where the business processes are executed.
  • Criticality of the IT infrastructure that support the business processes.
  • Existing recovery times for the business processes in terms of hardware acquisition time and software installation time.
  • Recovery time objectives for the business processes depending on the criticality of the business process.
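The relationship between criticality, existing recovery time and RTO can be made concrete with a short Python sketch. The four criticality levels are those named above; the RTO figures assigned to them, the sample processes and the decision to add hardware acquisition and software installation times together are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical RTO targets in hours; the source names the four levels
# but does not assign numbers to them.
RTO_HOURS = {
    "Mission Critical": 1,
    "High Criticality": 4,
    "Medium Criticality": 24,
    "Low Criticality": 72,
}


@dataclass
class BusinessProcess:
    name: str
    criticality: str
    hardware_acquisition_hours: float
    software_installation_hours: float

    @property
    def rto_hours(self) -> float:
        return RTO_HOURS[self.criticality]

    @property
    def existing_recovery_hours(self) -> float:
        # Assumes the two activities happen one after the other.
        return self.hardware_acquisition_hours + self.software_installation_hours

    @property
    def meets_rto(self) -> bool:
        return self.existing_recovery_hours <= self.rto_hours


def recovery_order(processes):
    """Sort processes so that the tightest RTO is recovered first."""
    return sorted(processes, key=lambda p: p.rto_hours)


processes = [
    BusinessProcess("Payroll", "Low Criticality", 24, 8),
    BusinessProcess("Client build server", "Mission Critical", 2, 1),
    BusinessProcess("Email", "High Criticality", 1, 2),
]
gaps = [p.name for p in processes if not p.meets_rto]
```

In this sample, the client build server is the only process whose existing recovery time (3 hours) exceeds its RTO (1 hour), so it is the one that needs a mitigation strategy such as standby hardware.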

Strategy Selection and Implementation

Based on the risks identified in the risk analysis phase and the RTO defined in the BIA phase, strategies are identified to adequately mitigate the risks and satisfy the RTO. The strategies for each business process and its associated resources include (Margaret, 2007):

  • Infrastructure Strategy: Includes hardware, software, and networking redundancy.
  • Alternate Site Strategy: Defines the alternate site from where the business process will be recovered in case of disaster.
  • Equipment Strategies: Ensure availability of necessary equipment at the alternate site.
  • People Strategies: Ensure availability of critical personnel at the alternate site. For example, specialized software such as databases and operating systems needs skilled people who know what must be done to get the applications running quickly.
  • Other Strategies: Handle insurance, service level agreements, and annual maintenance contracts to transfer risks that cannot be mitigated directly.

To tackle the operational contingencies of a large organization, the Business Continuity Management Plan (BCMP) outlines the BCP concept of operations. The concept of operations is based on the risk mitigation strategies identified by the BCMP and approved by the corporate centre.

DRP – BCP Structure

Based on the size, geographical spread, and complexity of the organization structure, the DRP is divided into individual BCPs, one for each SBU, shared service, and location. The location BCP covers the infrastructure and support functions for the location, whereas the business unit BCP covers the Software Development Life Cycle (SDLC) for all projects executed from the SBU site. The shared services BCP includes the continuity plan for support services, such as finance, accounts, and human resources. Depending on the type and extent of the BCP event, the relevant BCP is invoked. The following illustration gives the BCMP structure for a company (Pfleeger, 2002).

Figure 8. BCMP structure for ABC Ltd.

Crisis Management Team Structure

Each BCP identifies a Crisis Management Team (CMT) that will take charge of respective operations in the event of a disaster. The composition of the various Crisis Management Teams is depicted in the following figure (Swartz, 2004).

Figure 9. CMTP structure for a Location DRP (Swartz, 2004).

Process Flow to identify disaster and activate DRP

Communication lines should be established that follow guidelines for reporting and managing disasters. The process flow diagram in the following figure describes the various stages of reporting a disaster.

Figure 10. Process Flow diagram for reporting disasters (Kaye, 2006).

The CMT may decide to activate some BCP procedures even before the DAT reports back to the CMT with the Damage Assessment Report. This ensures that, in a severe disaster, business processes with a low recovery time objective (RTO) are activated immediately, without waiting for a detailed assessment of the extent of damage.
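
The early-activation rule can be illustrated with a minimal Python sketch. The process names, the RTO values, and the four-hour threshold below are assumptions made for illustration only; an actual plan would take them from its own risk assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessProcess:
    name: str
    rto_hours: float  # recovery time objective; the values used below are hypothetical


def processes_to_activate_early(processes, rto_threshold_hours):
    """Return the processes whose RTO is too short to wait for the damage report."""
    return [p for p in processes if p.rto_hours <= rto_threshold_hours]


# Hypothetical inventory, not taken from the plan described in this paper.
inventory = [
    BusinessProcess("payroll", 72),
    BusinessProcess("customer support desk", 2),
    BusinessProcess("source code repository", 4),
    BusinessProcess("internal wiki", 120),
]

early = processes_to_activate_early(inventory, rto_threshold_hours=4)
print([p.name for p in early])
```

Processes above the threshold are left for the normal path, in which the CMT waits for the Damage Assessment Report before deciding what to invoke.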

DRP Invoking Procedures

DRP activation depends on the level of disaster. The BCP documents the following procedures to be used during a disaster (Preston, 1999):

  1. Procedures for invoking relevant BCPs
  2. Procedures for communication of disaster. This includes procedures for:
     • First notification of disaster and further escalation to CMT
     • Notification of disaster to SBU heads
     • Notification of disaster to employees
     • Notification of disaster to customers
     • Notification of disaster to media / media management
  3. Procedures for emergency evacuation, including roles and responsibilities of the personnel involved in evacuation
  4. Recovery procedures for infrastructure items and IT applications

Project Specific Disaster Recovery Plan

Each project should prepare a DRP, using predefined templates, before the project starts. Each project DRP identifies an alternate site from which the project will be executed if the primary location is inaccessible; the site is chosen based on the requirements of the project and the availability of infrastructure at the alternate site. This information is available from the templates used in the risk assessment (Toigo, 2005).

  • The plan should identify critical project team members who will be shifted to the designated alternate location in case of such an incident. Where an employee may need to travel to onsite locations during a disaster, travel and other necessary documents should be kept ready.
  • Data backup for all Projects should be stored at a predetermined location.
  • In case of a disaster where the primary site becomes inaccessible, each SBU from that location communicates requirements to the CMT to shift project team members.
  • CMT facilitates transportation of key employees to alternate locations through the Administration department.

Notification Procedures

A structure for notifying the relevant parties of a disaster should be in place. This structure is also called a call tree. A call tree for notifying the occurrence of a disaster is shown in the following figure.

Figure 11. Call tree to notify disasters (Toigo, 2005).
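
A call tree of this kind can be modelled as a simple mapping from each caller to the contacts that caller must notify. The sketch below is an assumption-laden illustration: the roles and their arrangement are hypothetical and are not read from the figure. It walks the tree breadth-first, so each level is notified before the next one.

```python
from collections import deque

# Hypothetical call tree: each key notifies the contacts listed against it.
CALL_TREE = {
    "incident reporter": ["CMT head"],
    "CMT head": ["SBU head", "administration", "media management"],
    "SBU head": ["project manager"],
    "project manager": ["employees", "customers"],
}


def notification_order(tree, root):
    """Walk the call tree breadth-first and return (caller, callee) pairs."""
    calls = []
    seen = {root}
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        for callee in tree.get(caller, []):
            if callee in seen:  # never call the same contact twice
                continue
            seen.add(callee)
            calls.append((caller, callee))
            queue.append(callee)
    return calls


for caller, callee in notification_order(CALL_TREE, "incident reporter"):
    print(f"{caller} -> {callee}")
```

Keeping the tree as data rather than as prose makes it easy to check that every contact is reachable from the first reporter and that nobody is called twice.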

Figure 11 shows the structure used to notify affected parties about a disaster. The emergency procedures for a project DRP are:

  • Transfer control to on-site, if required.
  • If recovery is required from the alternate location, acquire resources / infrastructure from CML.
  • Initiate recovery of processes, data, and applications as per the RTO or identified priority.
  • Make arrangements for transportation of people (as identified in the project DRP).
  • Resume operations at the alternate location.
  • Confirm that all mission-critical services are restored.
  • Use the call tree to notify affected parties that services have been restored from the alternate location.
  • Transfer control back to off-shore.
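
Two of the steps above lend themselves to a short sketch: recovering services in RTO order, and confirming that all mission-critical services are restored before control is handed back. The service names, RTO values, and criticality flags below are hypothetical, and setting `restored` stands in for the real recovery procedure.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    rto_hours: float
    mission_critical: bool
    restored: bool = False


def recovery_sequence(services):
    """Order services so that the shortest RTO is recovered first."""
    return sorted(services, key=lambda s: s.rto_hours)


def ready_to_hand_back(services):
    """Control returns only once every mission-critical service is restored."""
    return all(s.restored for s in services if s.mission_critical)


# Hypothetical services for one project DRP.
services = [
    Service("build server", 8, mission_critical=True),
    Service("email", 4, mission_critical=True),
    Service("intranet portal", 48, mission_critical=False),
]

order = recovery_sequence(services)
print([s.name for s in order])

order[0].restored = True  # stand-in for the actual recovery of the first service
print(ready_to_hand_back(services))  # one mission-critical service is still down

order[1].restored = True
print(ready_to_hand_back(services))  # non-critical services do not block hand-back
```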

Testing

Testing helps to evaluate the ability of recovery staff to implement the plan quickly and effectively. Each element of the BCP and DRP should be tested to confirm the accuracy of individual recovery procedures and the overall effectiveness of the plan. Plan testing is designed to determine (Pfleeger, 2002):

  • Whether the recovery teams are ready to cope with a disruption
  • Whether recovery inventories stored off-site are adequate to support recovery operations
  • Whether the business continuity plan has been properly maintained

Test Plan

Before conducting the test, a detailed test plan should be developed. The test plan includes (Pfleeger, 2002):

  • Scope of the Test – Defines the boundaries of the test. For example, it lists the location, area, projects, components, and data covered.
  • Test objectives.
  • Test Scenario – This includes
  • Type of Test – For example, Structured Walkthrough Test, Component Test, or Full Function Test
  • Test Schedule
  • Description of the Test Scenario
  • Success Criteria For the Test – including the method used to evaluate the test results.
  • Test Participants
  • Sequence of Activities
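
One way to make sure no section of the test plan is skipped is to hold the plan as a structured record and check it for gaps before the test is scheduled. The following is a minimal sketch under that assumption; the class name, field names, and sample values are illustrative and do not come from a published template.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DrpTestPlan:
    scope: str = ""
    objectives: list = field(default_factory=list)
    test_type: str = ""
    schedule: str = ""
    scenario_description: str = ""
    success_criteria: list = field(default_factory=list)
    participants: list = field(default_factory=list)
    sequence_of_activities: list = field(default_factory=list)


def missing_sections(plan):
    """Return the names of the sections that are still empty."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name)]


# Hypothetical draft with only three sections filled in.
draft = DrpTestPlan(
    scope="Location BCP, primary data centre",
    test_type="Structured Walkthrough Test",
    participants=["CMT", "recovery team"],
)
print(missing_sections(draft))
```

A draft is ready for review only when `missing_sections` returns an empty list.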

In addition, maintenance procedures should be implemented for the DRP. To prevent Level 1 incidents, such as virus and hacking attacks or incidents caused by improper behaviour of employees, a security policy should also be implemented. The policy would specify rules of conduct at work and rules for email, data storage, and personal storage devices such as iPods, MP3 players, and mobile phones with cameras.

Conclusion

The paper has examined the importance of data backup and a disaster recovery plan for organizations. It has also discussed in detail the DRP and BCP implementation plan for an organization, including network diagrams for data backup and DRP and the steps to invoke the DRP and BCP.

References

Ambs Ken. 2000. Optimizing restoration capacity in the AT&T network. Interfaces Journal. Volume 30. Issue 1. pp: 26-40.

Benton Dick. 2007. Disaster Recovery: A Pragmatist’s Viewpoint. Disaster Recovery Journal.

Brunetto Guy. 2006. Disaster recovery: How will your company survive? Journal of Strategic Finance. Volume 82. Issue 9. pp: 57-62.

Facer Dave. 2001. Rethinking: Business continuity. Journal of Risk Management. Volume 46. Issue 10. pp: 17-21.

Gilchrist Bruce. 2001. Coping with Catastrophe: Implications to Information Systems Design. Journal of the American Society for Information Science. pp: 271-278.

Hiatt Charlotte J. 2007. A Primer for Disaster Recovery Planning in an IT Environment, 2nd Edition. ISBN-10: 1878289810.

Kaye David, Graham Julia. April 2006. A Risk Management Approach to Business Continuity: Aligning Business Continuity with Corporate Governance. Rothstein Associates Inc. ISBN 1-931332-36-3.

Pember Margaret. 2007. Information disaster planning: An integral component of corporate risk management. ARMA Records Management Quarterly. Volume 30. Issue 2. pp: 31-39.

Margulies Stuart. 2006. Preparation for the DRP test: (Degrees of reading power), 2nd Edition. Educational Design publications. ISBN-13: 978-0876942857.

Meade Peter. 1993. Taking the risk out of disaster recovery services. Journal of Risk Management. Volume 40. Issue 2. pp: 20-26.

Preston W. Curtis. 1999. UNIX Backup and Recovery. O’Reilly Media, Inc. ISBN-10: 1565926420.

Presswire. 2008. Price Waterhouse Coopers. New survey raises serious concerns about the effectiveness of disaster recovery plans. M2 Presswire. pp: 2-3.

Pfleeger Charles P. 2002. Security in Computing, 3rd Edition. Prentice Hall PTR. ISBN-13: 978-0130355485.

Swartz Nikki. 2004. Survey Assesses the State of Information Security Worldwide. Information Management Journal. Volume 38. Issue 1. pp: 16-20.

Toigo Jon William. 2005. Disaster Recovery Planning: For Computers and Communication Resources. Wiley Publications. ISBN-10: 0471121754.