PERFORMANCE MONTHLY REPORT – NOVEMBER / DECEMBER 2011

Due to the way the final workweeks have fallen in 2011, this report covers an 8-week period spanning November and December.

Part 1 - Week beginning 7th November – Week beginning 28th November 2011 (weeks 45-48)

Part 2 - Week beginning 5th December – Week beginning 26th December 2011 (weeks 49-52)

Part 1: Service Outages:

1.  Blackboard (Elearning) – 1 Unscheduled Outage

Date / Duration / Cause
7/11 / 5 minutes / No cause was recorded.

Impact: Connectivity to hosted Blackboard systems was temporarily lost. Blackboard was then unavailable through UCD Connect for a further 25 minutes, but users could access it through Direct Login.

Action: Direct service came back without any intervention. Restoring SSO access required the Connect team to restart the Blackboard connector.

Improvement: Blackboard applied an Oracle patch over the Christmas break. The patch addresses some system-wide performance issues and may also resolve this problem.

2.  UCD Connect – 1 Unscheduled Outage

Date / Duration / Cause
25/11 / 2 hours 10 minutes / LDAP stopped responding overnight because the backup was too large and filled the disk to 100%.

Impact: Access to UCD Connect was unavailable.

Action: LDAP was restarted and the LDAP database initialised automatically. This took 2 hours to complete, after which front end services were restored. After a further 10 minutes of coping with login demand, users could access the system normally.

Improvement: Reduced the number of backups held on disk from 5 to 4 in order to decrease the capacity needed for backups.
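The retention change described above can be illustrated with a minimal sketch. It is not the script used by IT Services: the function names and the assumption of one file per backup in a single directory are hypothetical, chosen only for illustration. The sketch keeps the newest `keep` backups and reports how full the disk is, using only the Python standard library.

```python
import shutil
from pathlib import Path


def disk_used_percent(path: Path) -> float:
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def prune_backups(backup_dir: Path, keep: int = 4) -> list[Path]:
    """Delete all but the `keep` newest files in `backup_dir`.

    Assumption (hypothetical layout): one file per backup, ordered by
    modification time. Returns the paths that were removed.
    """
    if keep < 0:
        raise ValueError("keep must not be negative")
    backups = sorted(
        (p for p in backup_dir.iterdir() if p.is_file()),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first
    )
    removed = backups[keep:]
    for old in removed:
        old.unlink()
    return removed
```

Running something like `prune_backups` before each new backup, and raising an alert when `disk_used_percent` passes a chosen threshold, would give warning before the disk reaches 100%.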

3.  Crumlin Network unavailable – 1 Unscheduled Outage

Date / Duration / Cause
23/11 / 40 minutes / 3rd party contractors invited on site by Crumlin staff installed cabling and network splitters on the network, which caused network loops and shut down the entire network.

Impact: There was no network connectivity to Belfield for Crumlin Hospital.

Action: Crumlin staff were advised to remove the illegal network device from the network.

Improvement: Crumlin staff invited a 3rd party IT company to carry out the networking work after IT Services had advised them against what they were proposing. In this case our advice was ignored and the end result was a campus outage.

4.  UCD Connect unavailable – 1 Unscheduled Outage

Date / Duration / Cause
30/11 / 5 minutes / CPIP connectors were failing for users trying to log in and use mail, Blackboard and Calendar through Connect.

Impact: Connect was unavailable to users who were not already logged in, and access to other services using the CPIP connector was also unavailable.

Action: CPIP servers were restarted and normal service resumed.

Improvement: Increased monitoring has been put in place in order to notify us sooner if one server goes down. Also, upgraded CPIP servers with greater capacity are being implemented.

Service Availability Levels 9.00am-9.00pm – Part 1: November / December 2011

Period / 45 / 46 / 47 / 48 / Monthly Avg.
Beginning / 7-Nov-11 / 14-Nov-11 / 21-Nov-11 / 28-Nov-11
Network / 100.00% / 100.00% / 100.00% / 100.00% / 100.00%
UCD Connect / 100.00% / 100.00% / 96.39% / 99.86% / 99.06%
Staff Email / 100.00% / 100.00% / 100.00% / 100.00% / 100.00%
Student Email / 100.00% / 100.00% / 100.00% / 100.00% / 100.00%
Staff File Sharing & Connect files / 100.00% / 100.00% / 100.00% / 100.00% / 100.00%
Software Applications / 100.00% / 100.00% / 100.00% / 100.00% / 100.00%
Staff Printing / 100.00% / 100.00% / 100.00% / 100.00% / 100.00%
Student Printing / 100.00% / 100.00% / 100.00% / 100.00% / 100.00%
Internet / 100.00% / 100.00% / 100.00% / 100.00% / 100.00%
Elearning / 99.86% / 100.00% / 100.00% / 100.00% / 99.97%
Infoview / 100.00% / 100.00% / 100.00% / 100.00% / 100.00%
Banner / 100.00% / 100.00% / 100.00% / 100.00% / 100.00%
Remote Sites / 100.00% / 100.00% / 98.89% / 100.00% / 99.72%
Overall Services / 99.99% / 100.00% / 99.64% / 99.99% / 99.90%
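The weekly figures in this table appear consistent with measuring downtime against a window of five working days of 12 hours each (3,600 minutes per week). The report does not state the formula, so the window size below is inferred from the numbers and the function name is illustrative only. Under that assumption, the following sketch reproduces the 99.86%, 96.39% and 98.89% figures from the outage durations listed in Part 1.

```python
# Assumed measurement window: Monday to Friday, 9.00am-9.00pm (inferred, not stated).
WINDOW_MINUTES_PER_WEEK = 5 * 12 * 60  # 3,600 minutes


def weekly_availability(outage_minutes: float) -> float:
    """Percentage of the weekly window during which the service was available."""
    if not 0 <= outage_minutes <= WINDOW_MINUTES_PER_WEEK:
        raise ValueError("outage must fit inside the weekly window")
    return round(100.0 * (1 - outage_minutes / WINDOW_MINUTES_PER_WEEK), 2)


print(weekly_availability(5))    # Blackboard, 7/11
print(weekly_availability(130))  # UCD Connect, 25/11 (2 hours 10 minutes)
print(weekly_availability(40))   # Crumlin network, 23/11
```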


Part 2: Service Outages:

1.  Daedalus Data Centre fire – 1 Unscheduled Outage for all services.

Date / Duration / Cause
8/12 / 40 minutes to 4 hours / At 3pm on Thursday afternoon there was a serious incident involving a localised fire in the Daedalus Data Centre, caused by a large research cluster. The specialised fire protection systems in the data centre worked very effectively, releasing FM200 gas into the room and shutting down all power. As a result, no equipment was damaged other than the cluster that caused the fire, and the damage to that cluster is localised.

Impact: All services were unavailable for various periods of time to staff and students.

Action: Campus network was restored in 1 hour and 10 minutes.

Essential student services (Blackboard & Email) were restored within 2.5 hours.

Essential staff services were restored within 4 hours.

Improvement: The fire suppression system and data recovery plan both operated correctly, as required.

2.  Network Connection to Lyons Estate – 1 Unscheduled Outage

Date / Duration / Cause
15/12 / 15 minutes / An outage on HEAnet equipment caused the connection to fail.

Impact: There was no network available from Belfield to Lyons Estate.

Action: HEAnet resolved the radio link issue.

Improvement: Due to the location of the equipment providing the link, it is an identified risk that events outside our control will interfere with the link. Both HEAnet and IT Services monitor the link constantly so that service can be resumed quickly as soon as an issue is identified.

3.  Network shared files unavailable – 1 Unscheduled Outage

Date / Duration / Cause
15/12 / 15 minutes / Controllers on Daedalus EVA 01 and EVA 03 both rebooted at 12.51pm, causing volumes on the Staff cluster to become unavailable. All resources were back online by 1.10pm.

Impact: There was no access to shared files for staff during this time.

Action: The controllers rebooted automatically. The reboot caused the connection to storage to be lost on the Staff Data Cluster. The cluster was then brought back online.

Improvement: Recommended action: update the EVA controller firmware to the latest version.


4.  Connect files unavailable – 1 Unscheduled Outage

Date / Duration / Cause
21/12 / 10 minutes / Not all file systems had mounted completely following the full power-on of the Daedalus Data Centre the previous day.

Impact: Connect files was unavailable to staff and students.

Action: The missing file systems were remounted and Xythos was stopped and restarted on all clients.

Improvement: Recommend automatic mounting of the file systems.
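Alongside automatic mounting, a simple post-power-on check could flag any file system that failed to mount before users notice. The following is a minimal sketch only: the mount point paths are hypothetical placeholders, not the real Connect files layout, and the function name is illustrative.

```python
import os


def missing_mounts(expected_mount_points):
    """Return the expected mount points that are not currently mounted."""
    return [p for p in expected_mount_points if not os.path.ismount(p)]


# Hypothetical mount points, for illustration only.
EXPECTED = ["/mnt/connect_files_01", "/mnt/connect_files_02"]

if __name__ == "__main__":
    for path in missing_mounts(EXPECTED):
        print(f"WARNING: {path} is not mounted")
```

A check of this kind could be run after any full power-on of the data centre and before dependent services are started.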

Service Availability Levels 9.00am-9.00pm – Part 2: November / December 2011

Period / 49 / 50 / 51 / 52 / Monthly Avg.
Beginning / 5-Dec-11 / 12-Dec-11 / 19-Dec-11 / 26-Dec-11
Network / 98.06% / 100.00% / 100.00% / 100.00% / 99.51%
UCD Connect / 93.75% / 100.00% / 100.00% / 100.00% / 98.44%
Staff Email / 93.75% / 100.00% / 100.00% / 100.00% / 98.44%
Student Email / 95.42% / 100.00% / 100.00% / 100.00% / 98.85%
Staff File Sharing & Connect files / 93.75% / 99.58% / 99.65% / 100.00% / 98.25%
Software Applications / 95.42% / 100.00% / 100.00% / 100.00% / 98.85%
Staff Printing / 95.42% / 100.00% / 100.00% / 100.00% / 98.85%
Student Printing / 67.08% / 100.00% / 100.00% / 100.00% / 91.77%
Internet / 96.53% / 100.00% / 100.00% / 100.00% / 99.13%
Elearning / 95.42% / 100.00% / 100.00% / 100.00% / 98.85%
Infoview / 93.75% / 100.00% / 100.00% / 100.00% / 98.44%
Banner / 93.75% / 100.00% / 100.00% / 100.00% / 98.44%
Remote Sites / 96.53% / 99.58% / 100.00% / 100.00% / 99.03%
Overall Services / 92.97% / 99.94% / 99.97% / 100.00% / 98.22%
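The "Monthly Avg." column appears to be the unweighted mean of the four weekly figures. The report does not say so explicitly, so the sketch below is an inference and the function name is illustrative. Under that assumption it reproduces the 98.44% (UCD Connect) and 91.77% (Student Printing) averages shown above.

```python
def monthly_average(weekly_percentages):
    """Unweighted mean of the weekly availability figures, to two decimal places."""
    if not weekly_percentages:
        raise ValueError("at least one weekly figure is required")
    return round(sum(weekly_percentages) / len(weekly_percentages), 2)


print(monthly_average([93.75, 100.00, 100.00, 100.00]))  # UCD Connect, weeks 49-52
print(monthly_average([67.08, 100.00, 100.00, 100.00]))  # Student Printing, weeks 49-52
```

A few rows in the table may differ from this calculation by 0.01 in the last digit, which would be consistent with the averages having been taken over unrounded weekly values, though that is a guess.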

2. Support Statistics December 2011

Overall totals
Total cases logged / 1809
Total cases logged by Students / 606
Total cases logged by Staff / 1203
Overall top 5 queries logged
NON ITServices Call / 445
Applications / 275
Service Outages / 210
Account Related Calls / 172
Customer Equipment / 143