Tata Power Confidential

Risk Abatement Plan

Corporate IT

Version 1.0


Tata Power Company Ltd

Corporate Information Technology

January 2005

Approval History

Rev. 0, October 2005.

Signature
Created By / SHA/CPS/FKT / October, 2005
Checked By / E. R. Batliwala / October, 2005
Approved By / VP


Risk Abatement Plan - Corporate IT

Definition:

Disaster:

  • An unforeseen and often sudden event that causes great damage, destruction and human suffering. Though often caused by nature, disasters can have human origins. Wars and civil disturbances that destroy homelands and displace people are included among the causes of disasters. Other causes can be: building collapse, blizzard, drought, epidemic, earthquake, explosion, fire, flood, hazardous material or transportation incident (such as a chemical spill), hurricane, nuclear incident, tornado, or volcano. FEMA: “Disaster Assistance Programs”
  • A condition in which an information resource is unavailable, as a result of a natural or man-made occurrence, that is of sufficient duration to cause significant disruption in the accomplishment of agency program objectives, as determined by agency management.

Likely Sources of Disaster:

The disaster sources include the following:

  1. Fires due to accidents, equipment failures, electrical faults, chemicals etc.
  2. Acts of terrorism, security breaches, malicious intent and deeds of friends and foes.
  3. Earthquakes, flooding of server and networking equipment.
  4. Failures due to pests (fiber cuts, cable cuts etc.).
  5. Unpredictable behavior of hardware and software (generally untested and uncertified).
  6. Cyber terrorism, Trojans, software time bombs, hacking, information security breaches, DoS (Denial of Service) attacks etc.
  7. Human errors, mistakes due to lack of knowledge or bad judgment etc.

Risk Abatement:

To manage the IT risks, the following areas are identified as vital:

  1. Systems, Hardware, and Software: These include redundant networks, computers, media, backups, software etc. that are deployed when systems or hardware fail. Such systems are of a hot-standby nature and must be kept in working condition. They are stored physically at a different location and can be moved to the affected location or brought into action from a convenient location.
  2. Tools and Instruments: These help in speedy diagnosis of faults and in restoration of the services. They should be stored at locations other than the disaster locations. They must be maintained in working condition, and access to them should be fast and easy during disasters.
  3. Organized Expert Manpower (CERT): CERTs (Computer Emergency Response Teams) consist of trained, specialized and organized manpower whose task is to locate faults, problems etc., and restore the computer and network services during emergencies in the shortest possible time.
  4. Policies, Procedures, Information, Documents, Planning, Diagnostic Checklists: These are intended to assist the CERT and other agencies in their task of service restoration. They must be well documented and tested, and must be easily accessible during emergencies.
  5. Emergency Assistance Agency Agreements

CERTs (Computer Emergency Response Teams):

As soon as the emergency is recognized, the CERT members shall meet in the shortest possible time and shall be led by the CERT leader. The CERT members are experts in their respective technology areas, and their responsibility is to identify, correct and restore the affected services in the shortest possible time. All team members are expected to keep themselves current in technology and should be available during emergencies. The CERT can get additional help from other agencies and groups should the need arise.

The following chart indicates the general response of the CERT, though the actual response may depend upon the situation or the path decided by the CERT.

[Chart: Start of Emergency → Emergency Response → Restoration → Original Services Restored with full functionality, ensuring restoration of any bypassing of interlocks, protection mechanisms etc.]

CERTs and Membership:

The tables below propose the various CERTs and their members.

SAP CERT

S.N. / Name / Responsibility / Contact Number / Address
1 / Mr. R. K. Patra / Leader / 9223220842 / 1/14 Telec, Plot 30, Sector 17, Vashi, Navi Mumbai 400 703
2 / Mr. Suresh Patil / Member / 022-25542269 / Type 3D/21 Tata Colony, Aziz Baug Chembur
3 / Mr. F. K. Tamboly / Member / 9223220843 / 804C Ratty Lodge, Kingsway, Dadar, Mumbai - 400 014
4 / SAP/TTIL Rep / Member (Ms Meera) / 9322687384

WAN/LAN and Email CERT

S.N. / Name / Responsibility / Contact Number / Address
1 / Mr. S. H. Agarwal / Leader / 9223220815 / Flat No.187, BPCL Staff Quarters, Aziz Baug, Chembur Mumbai 400 074
2 / Mr. F. K. Tamboly / Member / 9223220843 / 804C Ratty Lodge, Kingsway, Dadar, Mumbai - 400 014
3 / Mr. F. A. Mistry / Member / 9223287287 / Flat No 43, 4th Floor, Jamasji Apartments, 32 Sleater Road, Mumbai 400 007
4 / Mr. G. Kingslin / Member / 022-55744108 / 3/22, Tata Housing Colony, Near Shalimar Industrial Estate, Matunga Mumbai – 400 019
6 / TCS/ADS Rep / Member (Mr. Swapnil) / 9224487179

IT Applications CERT

S.N. / Name / Responsibility / Contact Number / Address
1 / Mr. N. C. Shah / Leader / 9223220814 / 1/10 Telec, Plot 30, Sector 17, Vashi, Navi Mumbai 400 703
2 / Mr. C. P. Shah / Member / 9223220817 / 15/6, JPM Society, Sant Ramdas Rd., Mulund (E), Mumbai-400 081
3 / Mr. G. J. R. Nadar / Member / 022-25543285 / Type III D/25, Tata Colony, Chembur, Mumbai - 400074
4 / Mr. S. M. Zare / Member / 022-25625045 / B/504, Cypress, Near Swapna Nagari, Mulund (W), Mumbai 400 080

Emergency Assistance Agencies:

The concept of “Emergency Assistance Agencies” is similar to the assistance agreements in use with neighboring organizations such as BPCL, HPCL, RCF etc. during fire emergencies. Similar assistance could be sought in order to restore SAP, Email and WAN in case of intractable emergency situations. Such “Assistance Agencies” need to be identified and assistance contracts signed with them.

IT Risk Abatement Measures Already in Place:

Corporate IT already has several measures to tackle the risks to IT services. The following measures are in place or in progress:

  1. OSPF based Fault Tolerant Backbone WAN Rings with L3 Switches.
  2. IT and Information Security Policies and Procedures in Place
  3. Antivirus System Implemented and Operational for Desktop, Server and Email. This is Updated Regularly.
  4. Implementation of Information Security Systems: Internet Firewall based on Linux, VPN and Intrusion Detection System. An Intrusion Prevention System is still required.
  5. Regular Awareness and Training Program Conducted for Employees.
  6. Installation and Commissioning of Secondary Recovery Assistance Server for SAP, Fall Back Server for Exchange, and EVA Server for Enterprise Storage.
  7. DR Server sites in a different seismic zone to be worked out.
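The fault tolerance of the backbone ring in item 1 can be illustrated with a small connectivity check. This is only a minimal sketch under stated assumptions: the ring order and the name "Station-X" are hypothetical, and OSPF itself is not modelled. The code merely shows that a ring remains connected after any single link failure, while two simultaneous failures can split it.

```python
from collections import deque

# Hypothetical ring order; the site list is an illustrative assumption,
# not the deployed WAN topology.
RING = ["Dharavi", "Carnac", "Technopolis", "Station-X"]


def ring_links(sites):
    """Return the links of a ring that visits the sites in the given order."""
    count = len(sites)
    return {frozenset((sites[i], sites[(i + 1) % count])) for i in range(count)}


def is_connected(sites, links):
    """Breadth-first search: True if every site is reachable from the first one."""
    seen = {sites[0]}
    queue = deque([sites[0]])
    while queue:
        here = queue.popleft()
        for link in links:
            if here in link:
                (other,) = link - {here}
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return seen == set(sites)


def survives_single_failure(sites):
    """True if the ring stays connected whichever single link is cut."""
    links = ring_links(sites)
    return all(is_connected(sites, links - {failed}) for failed in links)
```

For example, `survives_single_failure(RING)` holds for this four-site ring, which is the property the ring design relies on when a fiber is cut.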

Time Line for Risk-Readiness

Activity / 2005 / 2006
Oct-Dec / Jan-Mar / Apr-Jun / Jul-Sep / Oct-Dec / Jan-Mar
Formal CERT Formation
Deployment of Additional L3 Switches
SAP Recovery Site (Seismic)
Web Apps & Email Recovery Site (Seismic)
Assistance Agency – Identification and Agreement
Deployment of Intrusion Prevention System
Fire Fighting Equipment
Disaster Recovery Drill Plan Deployment

Appendix




IT Contingency Plan

S.N. / Contingency (What-if) / Action / By / Comments
1. / Main Internet Link Failure via VSNL/TBB / Resume service via Technopolis L3 Switch to Tata Tele Gateway at Technopolis. / ADS/TCS/GK / Users to use Dial Up Connectivity to ISP (VSNL) in case of Extreme Urgency.
2. / L3 Switch Failure at Dharavi SAP / Replace L3 Switch by Spare Preconfigured Switch at Dharavi New Server Room rack. / ADS/TCS/GK / Shall replace the spare L3 Switch ASAP.
3. / UPS Battery Drained / Switch to Raw Power. Sockets available in server room. / ADS/TCS/Basis; UPS OEM to fix Problem on UPS / Exercise Safety precautions for grounding and voltage limits.
4. / Primary Exchange Email Server Failure / Restart Email service using Spare Server kept at Carnac. / ADS/TCS/TPC / Delay of 6 Hrs.
5. / Backup Email Servers not Available / Use ISP (Tata Indicom) dial up service for messaging. / Users / Extreme urgency only.
6. / Strike, Civil Disorder leading to ADS Admin Personnel not being available. / TPC System Admin to Take over. / GK/Suresh Patil; CERT to assist / Remote Admin or Locally Resident Administrator.
7. / Power Grid Failure / Use Station DG Sets for Powering the Network Infrastructure. Use Broad Band UPS. Use Lap Tops for messaging. / ADS/TPC/Individual Stations; CERT to assist; Users / Extreme Urgency/Requirements only. Exercise usual precaution for voltage limits and grounding.
8. / Fiber Cuts/Links Not Available in main backbone. OSPF ring not available. / Use ISP (Tata Indicom) to Use Web Based Email. / Users / Extreme Requirements only. Travel to Dharavi to use IT services.
9. / Total AC Failure in Server Room / Keep Watch on Temperature limits before shut down. / ADS/TCS/Basis
10. / Fire in Server Room / Apply Fire Fighting Procedure. / Sys Admin on Duty; CERT to Assist / Inform TPC Admin staff for seeking help. Call CERT Leader.
11. / Backup Tapes Not Working for urgent restoration. / Restore using the backup on SAN disk. / ADS/TCS; CERT to assist / Perform restoration and reject the tape using TPC media disposal procedure.
12. / Bright Store 6000 Backup Device not Working / For Backup and Restore: Use SAN for backup. / ADS/TCS/BASIS; HP to restore Bright Store Backup Device on priority.
13. / Major Virus Attacks on Network / Remove Internet Connectivity. System Administrator to urge all users to disconnect their PCs (affected or not) from network. Use isolated virus free machines for work. Distribute AV patches on RO CDs to users. / Trend Micro/ADS/TPC; CERT to take charge / Use virus free and latest Antivirus enabled PC to connect to internet to download security patches from Trend Micro site.
14. / Major Transportation Disorder / Provide Company Transport to Admin Staff. / TPC Transport; CERT to assist / If Restoration of Public Transport is Unlikely.
15. / Firewall/Proxy Server Failure / Use Alternate Firewall/Proxy server kept at Dharavi New Server room. Sys Admin to inform users. / ADS/TCS / Firewall/Proxy Server to be restored within 4 Hours by reinstallation of software.
16. / Domain Controller Failure / Use Alternate DC kept at Carnac Server room. / ADS/TCS / DC to be restored within 4 Hours by reinstallation of software.
17. / Dharavi Site failure due to Earthquake / Bring Up Remote Site for SAP and Email. / CERT to take over. Remote Assistance Agency to assist. ADT/TCS/TPC
18. / Major SAP OS problems / Advice from SAP and OSS Connectivity. CERT to take Charge. / Basis; CERT
19. / Major Windows Email Server OS Problems / Seek Microsoft Assistance. CERT to take charge. / TCS/ADS/TPC; CERT / Use ISP messaging for extremely urgent requirements.
20. / Hacking, DOS attacks / Disconnect Internet Connectivity. CERT to take Charge. / ADS/TCS/TPC; CERT / Look for help from cert.org (CERT Coordination Center, Carnegie Mellon University) and CERT-In.

Notes:

  1. It is possible to dial in to the Tata Indicom ISP server via a Tata Indicom direct phone line by dialing 155155, as this service is provided by Tata Indicom by default.
  2. See CERT document for details for different CERTs in TPC.
  3. Standby link arrangements with VSNL from Technopolis to be worked out.
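The email entries of the contingency table (rows 4 and 5) amount to an ordered fallback chain. The following is a minimal sketch of that idea only: the option names are paraphrased from the table, and how availability is actually detected is outside its scope and left as an assumption (a set passed in by the caller).

```python
# Fallback order for email, paraphrased from rows 4 and 5 of the table.
EMAIL_FALLBACKS = [
    "Primary Exchange server",
    "Spare server at Carnac",                # row 4: expect a delay of about 6 hours
    "ISP (Tata Indicom) dial-up messaging",  # row 5: extreme urgency only
]


def choose_email_path(available):
    """Return the first fallback option that is currently available.

    `available` is a set of option names supplied by the caller; None is
    returned when no option is usable and the CERT has to decide.
    """
    for option in EMAIL_FALLBACKS:
        if option in available:
            return option
    return None
```

Keeping the order in one list makes the escalation path explicit and easy to review during a disaster recovery drill.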

Action Plan for Disaster Recovery - SAP

Sr. No / Incident / Action to be taken
1 / DEVDRS to be kept in sync with production by log shipping through the Oracle Data Guard feature / DR Site is at 4th floor, Carnac, Block B
2 / In case of unavailability of Production Server / Start the DEVDRS database (BASIS)
3 / Start SAP Service on DEVDRS (BASIS)
4 / Basis Personnel will lock all SAP User IDs
5 / User IDs will be unlocked according to time slots allotted to them. (This presupposes that the plan of time allotment is approved.)
6 / User complaint of SAP unavailability is answered with “We are in DR situation. Please refer to your instruction sheet.” (This presupposes that the instruction sheet is available with all users.)
7 / Users log in and perform their functions in DEVDRS. (This presupposes that all User Login Pads are configured to log in to DEVDRS.)
8 / After production system is available again / Restoration to be carried out:
  1. Backup of DEVDRS
  2. Transport of Tape to Primary Site
  3. Restore at Primary Site
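Step 5 of the action plan (unlocking User IDs by allotted time slot) could be supported by a simple lookup. This is a sketch only: the group names and slot times below are hypothetical, since the plan itself notes that the time allotment still has to be approved.

```python
from datetime import time

# Hypothetical time-slot plan; groups and hours are illustrative assumptions.
SLOTS = {
    "finance": (time(9, 0), time(11, 0)),
    "materials": (time(11, 0), time(13, 0)),
}


def groups_to_unlock(now):
    """Return the user groups whose slot contains `now`.

    A slot includes its start time and excludes its end time, so two
    back-to-back slots never overlap.
    """
    return sorted(
        group for group, (start, end) in SLOTS.items() if start <= now < end
    )
```

Basis personnel could consult such a table at each slot boundary to decide which User IDs to unlock and which to lock again.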

SAP Service Restoration Scheme – Secondary Server

Exchange 2000 Server Disaster

Recovery Guide

for

Tata Power Company Ltd

Prepared by

Shailendra Shenoy

Version 1.1: Last Updated 19 June 2003.

CONTENTS

Disaster Recovery Procedures for EMAIL Server
Backup of Windows 2000 Server
Creating Windows 2000 Backup Sets
Considerations for Backing Up Domain Controllers
Backing Up the System State of a Domain Controller
Recommendations for Backing Up a Domain Controller
Exchange 2000 Server Backup Procedures
Backing Up Exchange 2000 Databases
Rebuilding the Server EMAIL after a Disaster
Restoring Windows 2000 Backup Sets
Reinstallation of Exchange 2000 in Disaster Recovery mode
To recover the Exchange 2000 databases
Restoring Exchange Server onto Alternate Server
Summary of Steps to Rebuild EMAIL on to alternate server
Performing a Repair of Windows 2000/Exchange 2000
Selection of Backup Types and Rotation Schedules
Summary of Recommendations for Server Availability
Useful Recovery Resources
Exchange 2000 Server Disaster Recovery Technical Papers
Other Technical Papers
Additional Disaster Recovery Documentation
Microsoft Knowledge Base Articles

Disaster Recovery Procedures for EMAIL Server

The following table lists the summary of steps to be followed, in logical sequence, to rebuild the EMAIL Exchange server after a hardware failure or other disaster. The same process can be followed for the other Exchange Server at Jojobera.

Required Preventative Steps

  • Keep a Windows backup set of the Exchange 2000 server you want to rebuild.
  • Keep Exchange 2000 database backups of all databases you might need to replace.

Disaster Recovery Steps – on same server

  1. (Optional) Copy or move the existing Exchange 2000 database and log files (if possible) on the Exchange 2000 server being restored.
  2. (Optional) Attempt a repair of Windows 2000 or a repair of Exchange 2000 before restoring the Exchange 2000 server.
  3. Reinstall Windows 2000 on a newly formatted hard drive, using a random computer name and placing the server into a temporary workgroup instead of a domain during setup. Use the same logical drive configuration as before.
  4. Reinstall any Windows 2000 service packs, patches, or updates previously running on the server being rebuilt. Then reinstall any other applications (other than Exchange 2000).
  5. Restore the Windows backup set made on the server being rebuilt. Restart the server.
  6. Reinstall Exchange 2000 in Disaster Recovery mode.
  7. Reinstall any Exchange 2000 hotfixes or Exchange 2000 service packs that were running on the server prior to the disaster.
  8. Restore the Exchange 2000 database backups that were made on the server you are rebuilding prior to the disaster.

The following sections describe each of the above steps in detail.
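Because the same-server recovery steps must be followed in sequence, a small checklist helper can help an administrator track progress during a rebuild. The sketch below is an illustration, not part of the Microsoft procedure: the step texts are abbreviated from the list above, and the helper functions are hypothetical.

```python
# The eight same-server recovery steps, abbreviated; True marks a step
# that the guide labels optional.
STEPS = [
    ("Copy or move existing database and log files", True),
    ("Attempt a repair of Windows 2000 or Exchange 2000", True),
    ("Reinstall Windows 2000", False),
    ("Reinstall service packs, patches and other applications", False),
    ("Restore the Windows backup set and restart", False),
    ("Reinstall Exchange 2000 in Disaster Recovery mode", False),
    ("Reinstall Exchange 2000 hotfixes and service packs", False),
    ("Restore the Exchange 2000 database backups", False),
]


def next_mandatory_step(done):
    """Return the number of the first mandatory step not yet in `done`, or None."""
    for number, (_, optional) in enumerate(STEPS, start=1):
        if not optional and number not in done:
            return number
    return None


def out_of_order(done):
    """Return mandatory steps that were skipped before a later completed step."""
    if not done:
        return []
    last = max(done)
    return [
        number
        for number, (_, optional) in enumerate(STEPS, start=1)
        if number < last and not optional and number not in done
    ]
```

Here `done` is the set of step numbers already completed; a non-empty result from `out_of_order` signals that the sequence was not respected.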

Backup of Windows 2000 Server

Creating Windows 2000 Backup Sets

To completely back up the operating system of a server running Windows 2000, you must back up both its System State data and its operating system files. A backup of Windows 2000, including both the System State data and the boot and system partitions, is called a Windows backup set. A Windows backup set must contain the following data and must be backed up as part of the same backup job:

  • The System State data
  • The boot partition (the disk partition from which your computer starts. This partition contains files in the root directory such as NTLDR and BOOT.INI)
  • The system partition (the disk partition to where Windows is installed)

Note: If you installed Windows 2000 to the hard disk partition that is used to start your computer (known as the active partition), your boot partition and system partition will be the same.

You back up a computer’s System State data using the System State data option in Backup. When you perform a System State backup, Backup automatically backs up all of the System State data that is relevant to your computer. Because of the dependencies among System State components, you cannot back up or restore individual components of System State data using Backup. However, you can restore some types of System State files to an alternate location.

Note: When you back up the System State data, a copy of your registry files is also saved in the systemroot/repair/regback folder. If your registry files become damaged or are accidentally erased, you can use these copied files to repair your registry without performing a full restore of the System State data. This method of repairing the registry is only recommended for advanced users.

To back up your computer’s Windows operating system files, back up the boot partition (the partition that contains the files that start Windows 2000) and the system partition (the partition where the Windows 2000 folders reside, such as the WINNT folder, Documents and Settings, and Program Files folders).

Important: In preparing to restore the Windows 2000 operating system configuration information, you must restore the server’s System State data and its operating system files; these data and files must be part of the same backup set.

Create Windows backup sets frequently, weekly if possible. In general, the older your Windows backup sets are, the more likely you are to experience problems that you must resolve before you can restore Exchange 2000.
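The two rules above (a Windows backup set needs all three components in the same job, and should be refreshed about weekly) can be expressed as simple checks. This is a minimal sketch with hypothetical helper names; it does not read real backup catalogs, and the seven-day threshold is only the "weekly if possible" guideline restated.

```python
from datetime import date, timedelta

# Components that must be captured in the same backup job.
REQUIRED = {"System State", "boot partition", "system partition"}


def missing_components(job_contents):
    """Return the components a backup job lacks to count as a Windows backup set."""
    return REQUIRED - set(job_contents)


def is_stale(backup_date, today, max_age_days=7):
    """True if the backup set is older than the recommended weekly interval."""
    return today - backup_date > timedelta(days=max_age_days)
```

A job that reports no missing components and is not stale satisfies both recommendations; anything else should be re-run before it is relied upon for a rebuild.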

To create a Windows backup set

  1. Click Start, point to Programs, point to Accessories, point to System Tools, and then click Backup.
  2. In Backup, click the Backup tab. In the console tree, click the boxes next to the drive letters for your boot partition and system partition, and then click the box next to System State (Figure 1).

Figure 1: Selecting a System State Backup

  3. In the Backup destination list, perform one of the following steps:
     • Select File if you want to back up files and folders to a file. If you do not have a tape device installed on your computer, this option is selected by default and cannot be changed.
     • Select a tape device if you want to back up files and folders to a tape.
  4. Next to the Backup media or file name box, click Browse to select a location and file name for your backup.
  5. Click Start Backup.
  6. In Backup Job Information, in the Backup description text box, type a backup description, set the appropriate options, and then click Start Backup.
  7. After the backup is complete, verify the backup was successful.

Important: Do not back up the Exchange IFS drive, i.e. M:. If the logical drives you are backing up (system and boot partitions) also contain the Exchange databases, then you should exclude the folder containing the Exchange files, i.e. ‘D:\Program Files\Exchsrvr’, from the backup set.
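The exclusion rule above can be checked mechanically before a file list is handed to a backup job. The sketch below is an assumption-laden illustration: the helper name is hypothetical, and the two excluded locations are the ones named in this guide. It relies on Python's `PureWindowsPath`, which compares Windows paths case-insensitively.

```python
from pathlib import PureWindowsPath

# Locations this guide says must stay out of the Windows backup set.
EXCLUDED = [
    PureWindowsPath("M:/"),                        # Exchange IFS drive
    PureWindowsPath(r"D:\Program Files\Exchsrvr"),  # Exchange files folder
]


def is_excluded(path):
    """True if `path` is an excluded location or lies underneath one."""
    candidate = PureWindowsPath(path)
    return any(
        candidate == excluded or excluded in candidate.parents
        for excluded in EXCLUDED
    )
```

Filtering a candidate list with `is_excluded` before the backup starts keeps the Exchange databases out of the Windows backup set, so they are protected only by the Exchange database backups.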