<SYSTEM NAME>

OPERATIONS & MAINTENANCE MANUAL

<Version No>

Effective Date: <DD MMM YYYY>

Deliverable / Name / Title / Signature / Date
Prepared by:
Reviewed by: / <Project Manager & TSL>
<OPS Representative>
Tested by: (For H/HH Systems except network systems) / SS Testing Representative
Approved and Authorized for Use / <Senior Manager of PM>
Verified for Use / Manager, Systems Operations

Distribution List

<Manager, Systems Operations>

<IT Technical Library>

<IT Programme Office>

<Project Manager Name>

<Technical Support Leader>

Copyright © by Airport Authority Hong Kong

All rights reserved. Permission to reproduce this document or to prepare derivative works from this document for internal use is granted, provided that the copyright statement is included with all reproductions and derivative works.

File Ref.: ITTMP010.docx / Page 1 of 16

How to Use This Document

This is a template for developing the Operations & Maintenance Manual for IT systems/services as part of the project deliverable. It is recommended that you follow these steps:

  1. Replace all text enclosed within “<” and “>” signs (e.g. <System Name>) with the correct field values.
  2. To facilitate editing of the document, please amend the following fields under “Prepare > Properties” so that the changes are reflected automatically in the document’s header and footer areas:
  • “Subject” for “System Name”
  • “Keyword” for “Version No.”
  • “Comments” for “Effective Date”

Press “F9” to refresh the field content if necessary.

  3. Do not delete any section from this document. If a section does not apply, enter “Not Applicable” in it.
  4. Refer to IT001 “Standard for IT Systems Operations & Maintenance Manual” for guidance on preparing the O&M document.
  5. Please delete this page when preparing your Operations & Maintenance Manual.

Document Change Record

< Provide information on how the development and distribution of the Operations and Maintenance Manual was controlled and tracked. Use the table below to provide the version number, a brief description of the change, the name of the person who made the change and the date of the version. >

Version Number / Description of Change / Person Making Change / Date
1.0 / Initial version of the template / Project Manager / DD/MM/YYYY

Table of Contents

1 Introduction
1.1 Purpose
1.2 Audience
2 General Information
2.1 System Overview
2.2 System Architecture and Interfaces Overview
2.3 System Owner and Technical Support Leader
2.4 System Classification
2.5 Service Level Agreement (SLA)
3 System Operations
3.1 Description of System Configurations
3.1.1 Hardware Configurations
3.1.2 Software Configurations
3.1.3 Network Configurations
3.1.4 Hard Disk Configurations and Directory Structure
3.2 Definition of Job Schedules and Operation Procedures
3.3 Data Backup and Data Restoration
3.4 System Startup and Shutdown Procedure
3.5 Health Check Procedure After System Restart
3.6 File Maintenance and Housekeeping
3.7 System Account Information
3.8 Testing Environment
4 System Monitoring, Recovery and Escalation
4.1 System Monitoring
4.2 Event Management (Categorization)
4.3 System Recovery Procedure
4.4 Incident Management and Escalation
4.5 Contingency Plan
4.6 System Limitations
4.7 Vendor Support Information
5 APPENDIX

1 Introduction

1.1 Purpose

Enter information describing the intended use of the system.

1.2 Audience

< Enter information describing the audience for this document, such as system administrators, the Technical Support Leader, the Quality Control Manager and SOCC. >

2 General Information

2.1 System Overview

This section provides a brief description of the system, including its purpose and uses.

-Brief description of the purpose of the system and its key features, including how it meets the business needs and objectives

-Primary users of the system and their locations

-Major inputs and outputs of the system

-The business impact on the airport’s operations under different failure scenarios

2.2 System Architecture and Interfaces Overview

< This section describes the organization of the system using a chart that depicts its components and their interrelationships.

Provide information describing the major structure of the system and its components (servers, network, database, etc.), or provide a reference to another O&M Manual.

-Logical structure of the system and the relationship between its components

-Network design, drawing list and configuration diagram depicting intra / inter system connections, including external interfaces

-Systems and their inter-dependencies

-External interface (3rd parties)

-Internal interface

-Mechanism to exchange information

-Physical interface, if any, e.g. type of connection (serial, Ethernet), connection route (via which box / port), and number of such connections

-Value, quantity and throughput of every system interface

-Inter-dependency between interfaces

-Flow chart depicting how information moves from the application to the databases >

2.3 System Owner and Technical Support Leader

This section provides the names of the System Owner and the Technical Support Leader.

System Owner (SO) / Name and Title
Technical Support Leader (TSL) / Name and Title

The latest System Owner and TSL information should have been posted to the following hyperlink.

2.4 System Classification

This section describes the Confidentiality, Integrity and Availability (C/I/A) classification of the system as recorded in the IT system registry.

Risk category / High-level Impact Rating / Overall impact
(C / I / A) / Financial (H/M/L) / Operational (H/M/L) / Regulatory (H/M/L) / Public image (H/M/L) / Litigation (H/M/L) / (H/M/L)
C
I
A

The latest C/I/A Information has been posted to the following hyperlink.

2.5 Service Level Agreement (SLA)

This section states the agreed Service Reliability and Service Window of the system.

Service Reliability (%) / “HH” Availability (99.99%)
“H” Availability (99.65%)
“M” Availability (99.30%)
“L” Availability (99.10%)
Service Window / “A” 00:00 ~ 23:59 (Monday ~ Sunday)
“B” 08:00 ~ 20:00 (except Saturday, Sunday and Bank Holiday)
“C” 09:00 ~ 18:00 (except Saturday, Sunday and Bank Holiday)

The latest Service Reliability (%) and Service Window have been posted to the following hyperlink.
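
As a rough illustration of what the Service Reliability figures above imply, the sketch below converts an availability percentage into a downtime allowance for a given number of service hours. The function name and the 365-day year used for Service Window “A” are assumptions made for illustration only. Windows “B” and “C” depend on the local holiday calendar, so their service hours would have to be supplied separately.

```python
def allowed_downtime_minutes(availability_pct: float, service_hours: float) -> float:
    """Return the downtime allowance, in minutes, implied by an availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return service_hours * 60 * (100 - availability_pct) / 100


# Service Window "A" (00:00 ~ 23:59, Monday ~ Sunday) over an ASSUMED 365-day year.
HOURS_PER_YEAR_WINDOW_A = 24 * 365

for level, pct in [("HH", 99.99), ("H", 99.65), ("M", 99.30), ("L", 99.10)]:
    minutes = allowed_downtime_minutes(pct, HOURS_PER_YEAR_WINDOW_A)
    print(f"{level}: {minutes:.1f} minutes per year")
```

Under these assumptions, “HH” availability leaves roughly 53 minutes of downtime per year within Service Window “A”.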

3 System Operations

3.1 Description of System Configurations

This section describes the default and custom configurations, the configuration options and their associated definitions, or provides a reference to where they are stored.

-Main configurations are: hardware, software, network, hard disk, database, directory structure, user account information and inventory list

-Server (host & client) locations and communication room connection details

3.1.1 Hardware Configurations

<Server A>

Hostname
Model
Processor
Memory
Drive Controller
NIC
Internal Storage
External Storage
Tape Drive
CD-ROM / DVD-ROM
Power Unit
Chassis
Rack Mount / Yes/No
Other Devices
Operating Temperature
Typical Heat Dissipation
Maximum Heat Dissipation
Location (Cabinet ID)

3.1.2 Software Configurations

<Server A>

Software Name / Version

The latest software configurations should have been posted to the following hyperlink.

H:\99 Common Access\01 SAM Inventory/SWI-<TSL>V<Version<Modified Date>.xlsx

3.1.3 Network Configurations

<Server A>

Hostname
IP Address
Subnet Mask
Default Gateway
Domain
DNS Server
Connection / < Switch Name Port Number >
Duplex / Half / Full
Speed / 10 Mbps / 100 Mbps / 1 Gbps
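
When filling in the table above, a quick consistency check can catch transcription errors. The sketch below, using Python's standard ipaddress module, tests whether the Default Gateway falls inside the subnet defined by the IP Address and Subnet Mask. The function name is hypothetical and the addresses are placeholders from the reserved documentation ranges, not real system values.

```python
import ipaddress


def gateway_in_subnet(ip_address: str, subnet_mask: str, default_gateway: str) -> bool:
    """Return True if the default gateway lies in the host's own subnet."""
    interface = ipaddress.ip_interface(f"{ip_address}/{subnet_mask}")
    return ipaddress.ip_address(default_gateway) in interface.network


# Placeholder values only; substitute the entries recorded for each server.
print(gateway_in_subnet("192.0.2.10", "255.255.255.0", "192.0.2.1"))
print(gateway_in_subnet("192.0.2.10", "255.255.255.0", "198.51.100.1"))
```

The first call reports a gateway inside the host's subnet and the second reports one outside it, which would normally indicate a typing error in one of the three fields.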

3.1.4 Hard Disk Configurations and Directory Structure

< This section provides information on the hard disk configuration and the directory or file system structure for each server. >

For Windows Platform

Hard Disk / Partition / Remark

For Unix Platform

Partition Table

<Server A>

Path / Resource Assigned to / Description

3.2 Definition of Job Schedules and Operation Procedures

This section lists the frequency of the automatic and manual background processes and scheduled jobs performed on a daily, weekly, monthly or other regular basis, including those involving interfacing systems. For routine server restarts or change-overs, the maintenance windows, service impact and checklist must be provided and well documented.

-Job frequency, startup & shutdown procedure and sequences

-Job estimated completion time

-Job housekeeping and output retention period

Schedules

Job Description / Type (Auto/Manual) / Schedule / Job Duration / Retention Period / Procedure

3.3 Data Backup and Data Restoration

This section briefly describes the procedures for regularly scheduled backups of the entire system, including system, program, data storage and the storage of backup logs.

-Database

-Server

-COTS products

-User data

-Type of backup, frequency, mechanism, tools/scripts used, system operational impact, backup log location

-Estimated data backup and data restoration time with proven status

-Reconciliation with checksum or volumes

-Retention Period

-Recovery mechanism for most common failure scenarios
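
The "reconciliation with checksum" item above can be illustrated with a minimal sketch that compares SHA-256 digests of a source file and its restored copy. The function names are hypothetical, and the choice of SHA-256 is an assumption; the actual backup tool may provide its own verification mechanism, which should take precedence.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def reconcile(source: Path, restored: Path) -> bool:
    """Return True if both files have the same SHA-256 digest."""
    return sha256_of(source) == sha256_of(restored)
```

Reading in chunks keeps memory use flat for large backup files. A mismatch should be recorded in the backup log together with both digests.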

3.4 System Startup and Shutdown Procedure

< This section describes the system startup and shutdown procedure. It must clearly show the sequence of boot-up and shutdown steps. >

3.5 Health Check Procedure After System Restart

This section describes the health check procedure used to confirm the health and functionality of the system after a scheduled or unscheduled restart. Also provide information on reconciliation acknowledgment and feedback messages.

3.6 File Maintenance and Housekeeping

This section specifies the housekeeping strategy and data retention for system logs, application logs, configuration files and temporary files, in order to keep the system healthy. The annual growth of the database and file system should be provided if available.

-Specify the usage of files, system logs, configuration files, file sizes, etc.

-Temporary files used for input and output should also be clearly specified, together with their housekeeping strategy and data retention
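
A retention rule of the kind described above might be checked with a sketch like the following, which lists files whose last modification time is older than the retention period. The function name, the "*.log" pattern and the retention values are hypothetical. The sketch only reports candidates; deleting or archiving them is left to the approved housekeeping procedure.

```python
import time
from pathlib import Path


def files_past_retention(directory: Path, retention_days: int, pattern: str = "*.log") -> list[Path]:
    """List files matching `pattern` last modified before the retention cutoff."""
    cutoff = time.time() - retention_days * 24 * 60 * 60
    return sorted(
        path
        for path in directory.glob(pattern)
        if path.is_file() and path.stat().st_mtime < cutoff
    )
```

For example, files_past_retention(Path("/var/log/myapp"), 30) would return the log files in that (hypothetical) directory that have not been modified for more than 30 days.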

3.7 System Account Information

< This section specifies the user account information and the purpose of each account. >

<Server A>

Account Name / Description / Purpose / Account Owner

3.8 Testing Environment

< This section is used to:

-highlight the key differences and variations between the UAT setup and the Production setup

-based on the differences, if any, analyze and summarize the impact, i.e. what cannot be tested, the impact of this, and the validity of the UAT results in supporting production rollout. >

Hardware Configuration

<Server A>

Hostname
Model
Processor
Memory
Drive Controller
NIC
Internal Storage
External Storage
Tape Drive
CD-ROM / DVD-ROM
Power Unit
Other Devices
Location (Cabinet ID)

Software Configuration

<Server A>

Software Name / Version

The latest software configurations should have been posted to the following hyperlink.

H:\08 Mgt Services\22 SAM\01 Software Inventory Repository/SWI-<TSL>V<Version<Modified Date>.xlsx

Network Configuration

<Server A>

Hostname
IP address
Subnet mask
Default Gateway
Domain
DNS server
Connection / < Switch Name Port Number >
Duplex / < Half / Full >
Speed / < 10 Mbps / 100 Mbps / 1 Gbps >

4 System Monitoring, Recovery and Escalation

4.1 System Monitoring

< This section clearly states the system monitoring requirements, including application, system interface, database, network, etc. The crucial performance information and alerts provided by the system enable SOCC to provide support in a timely and effective manner.

All pre-defined monitoring thresholds are subject to periodic review by the TSL or PM as needed.

The system monitoring software must be able to quickly identify and report failures by any means, so that system incidents can be managed or repaired either automatically or through manual intervention.

System monitoring should cover the following key areas, monitored by WEMS or related monitoring tools:

a) Resource Monitoring

Host / System Resource / Warning Threshold / Exception Threshold

b) Application Process Monitoring

Host / Process or Service Name / Exception Threshold

c) Performance and Capacity Monitoring

Host / Performance/Capacity Name / Exception Threshold

d) System Security Monitoring

Host / Account / Exception Threshold

e) Scheduled Tasks Monitoring

Host / Scheduled Job / Exception Threshold

f) Interface Monitoring and Management

Host / URL / Exception Threshold

g) Custom-Made Monitoring Based on Different Applications

Host / Critical Process / Exception Threshold

h) Network Equipment

Host / Objects to Monitor / Warning Threshold / Exception Threshold

i) Management Information Base (MIB)

Host / SNMP OID Value / SNMP Trap Alert

j) Miscellaneous Requirements

Host / Monitoring Requirement / Exception Threshold

Attach screen dumps from WEMS. >
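
To show how the Warning Threshold and Exception Threshold columns in the tables above relate to each other, the sketch below maps a single reading to a status. It assumes that higher readings are worse (as with CPU or disk usage) and uses hypothetical threshold values. The "Normal" label is an assumption for readings below both thresholds, and the sketch does not reflect how WEMS itself evaluates thresholds.

```python
def classify_reading(value: float, warning_threshold: float, exception_threshold: float) -> str:
    """Map a resource reading to a status, assuming higher values are worse."""
    if warning_threshold > exception_threshold:
        raise ValueError("warning_threshold must not exceed exception_threshold")
    if value >= exception_threshold:
        return "Exception"
    if value >= warning_threshold:
        return "Warning"
    return "Normal"


# HYPOTHETICAL thresholds for disk usage in percent; real values come from the tables above.
for usage in (55.0, 82.5, 97.0):
    print(usage, classify_reading(usage, warning_threshold=80.0, exception_threshold=95.0))
```

Validating that the warning threshold does not exceed the exception threshold guards against rows where the two columns were entered the wrong way round.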

4.2 Event Management (Categorization)

Events reported via system messages received from system monitoring will be categorized into “Informational”, “Warning” and “Exception”. This section must clearly state the recovery procedures or actions to be taken by SOCC in response to the reported events.

Informational events are typically used to check the status of a device or service, or to confirm the successful completion of an activity (e.g. backup jobs and housekeeping tasks). No action will be taken for these events. All informational events should be configured to be sent to a mailbox.

Informational Events

Events / Schedule / Log File

Warning Events

Host / System Resource / Warning Threshold / Action to be taken

4.3 System Recovery Procedure

In order to minimize service downtime and impact on the business, all possible application/system error codes must be clearly stated in this section, with a detailed description of the affected area, the problem symptoms, the time to resume service and the detailed recovery procedure for SOCC to execute in a timely manner.

Host / System Monitoring / Exception Events / Time Capped / Recovery Procedure

4.4 Incident Management and Escalation

This section describes how to conduct and document incident management activities, and the formal escalation procedures to be used by SOCC in response to system criticality and incident severity.

The following information must be clearly stated:

-System Manager / Owner

-Incident Manager

-Escalation matrix by elapsed time

-Call flow

-Full list of all system interface users, including user names, company, phone no., fax, work location, etc.

Incident Management

Incident Category / Incident Area / Support Group

Support Escalation & Contact

Name / Contact / Service Hour
1st Level / SOCC / 2182 0030 / 24 x 7
2nd Level
3rd Level

Elapsed Time for Escalation

Severity Level / Time / Escalated by / To
Critical / SOCC / 2nd Level Support
2nd Level Support / 3rd Level Support
High / SOCC / 2nd Level Support
2nd Level Support / 3rd Level Support
Medium / SOCC / 2nd Level Support
2nd Level Support / 3rd Level Support
Low / SOCC / 2nd Level Support
2nd Level Support / 3rd Level Support

The latest second line support escalation information has been posted to the following hyperlink.
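
The escalation table above leaves the Time column to be filled in for each system. As a minimal sketch of how such a matrix can be evaluated, the example below returns the escalation steps that are already due for a given severity and elapsed time. All time limits shown are HYPOTHETICAL placeholders, and the class and function names are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationStep:
    after_minutes: int  # elapsed time since the incident was logged
    escalated_by: str
    escalated_to: str


def build_steps(first_limit: int, second_limit: int) -> list[EscalationStep]:
    """Build the two-step chain used for every severity level in the table."""
    return [
        EscalationStep(first_limit, "SOCC", "2nd Level Support"),
        EscalationStep(second_limit, "2nd Level Support", "3rd Level Support"),
    ]


# HYPOTHETICAL time limits in minutes; replace with the agreed values for the system.
ESCALATION_MATRIX = {
    "Critical": build_steps(15, 30),
    "High": build_steps(30, 60),
    "Medium": build_steps(60, 120),
    "Low": build_steps(120, 240),
}


def due_escalations(severity: str, elapsed_minutes: int) -> list[EscalationStep]:
    """Return the escalation steps whose time limit has already been reached."""
    return [
        step
        for step in ESCALATION_MATRIX[severity]
        if elapsed_minutes >= step.after_minutes
    ]


for step in due_escalations("Critical", elapsed_minutes=20):
    print(f"{step.escalated_by} -> {step.escalated_to} (due after {step.after_minutes} min)")
```

Keeping the matrix as data rather than as branching logic means the time limits can be revised without touching the evaluation code.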

4.5 Contingency Plan

This section provides information and steps to execute the contingency procedure during system failure.

-Provide a background of the contingency measure

-Under what circumstances to activate the system contingency

-Describe the roles and responsibilities in activation of system contingency

-Provide detailed steps describing the activation of system contingency

4.6 System Limitations

< This section specifies any functional, technical or capacity-related limitations of the system. >

Services / Issue / Limitation

4.7 Vendor Support Information

This section provides the necessary vendor contract information and the procedures for obtaining vendor support.

-Maintenance agreement

-License ownership, system media, vendor manual & warranty certificate

-Support hours

-Spare parts

-Contacts

Maintenance Contacts Information

Server/Application / Support Vendor / Contacts / Phone / Mobile / Email Address

Maintenance Agreement Information

Server/Application / Response Time / On-site Time / Maintenance Period / Maintenance Contract Information

Spare Parts Information

Model / Hardware Configurations / Usage / Location

The latest server maintenance service inventory has been posted to the following hyperlink.

\\hkairport\Share\IT\06 Operations\03 Sys Production Control\03 OS & Server\Maintenance Info\Shared\Share Index.htm

5 APPENDIX

< This section is optional and provides supplementary information about the system. >

--- End of O&M ---
