/ Safety Management Systems / March 20, 2011
Occurrence or Hazard Investigation / Version 1
Introduction / Page 1

Occurrence or Hazard Investigation

Introduction

Operators need to have a process in place to investigate or follow up any event or condition that has the potential to affect their operation, whether in the realm of safety, operational efficiency or finances.

Any event or condition that can have a negative effect should be deemed a threat. To react properly to a threat, you must first understand what it is and then respond in proportion to it. You can view this as a scale of response. The goal is a response that doesn’t waste personnel’s time or company resources but still mitigates the threat to an acceptable level.

In the section on Risk Management you will find definitions and explanations of the basic aspects of risk management. In this section we concentrate on the specific activities used to identify the risks following an event or a report identifying a potential hazard.

Throughout the explanation it will be beneficial for you to keep the flow chart handy so that you can follow the logic and flow of the explanations.

We will cover two versions of the process: the full (long form) version, which is used for major investigations following serious events, and the short form, which should be used in the majority of investigations.

The reason for having two versions is that the majority of events will only require a short form investigation. However, if the event is quite serious or had the potential to become a major accident, you may want to conduct a full-blown investigation. The major differences between the two are the number of people involved in the process and the length of time needed to complete it.

A short form can be done in a few days whereas a full investigation could take weeks.

If you’re going to use both types, you need to develop the criteria you will use to choose between them. One approach is to start every investigation as if it will be a full investigation and then review the event to see if it warrants being investigated by the short form. In this way, if you err, it will be on the cautious side.

Definitions

Assumption / Accepting the risk and proceeding.
Avoidance / Use of an alternative approach that does not have as high a level of risk.
Consequence / The possible negative outcomes of the current conditions that are creating uncertainty.
Hazard / A source of potential harm, or a situation with a potential for causing harm, in terms of human injury; damage to health, property, the environment, and other things of value; or some combination of these.
Hazard Matrix / A table used in the prioritizing of analyzed risks.
Hazard Risk Index / The analysis of a hazard by estimating its probability and severity, the result of which is the Risk Index.
Mitigation / The measures to eradicate the hazard or to reduce the probability or the severity of a risk, thereby reducing the Risk Index.
Potential Remedial Action / Possible action, such as procedural or equipment changes, that is used to lower the Risk Index.
Probability / An expression of how likely the risk is to cause loss, damage, or injury.
Risk / The potential consequences of a hazard, measured in terms of severity and probability.
Risk Control / Controlling risks involves the development of a risk reduction plan and then tracking to the plan.
Risk Management / The sum of all proactive management-directed activities that are intended to acceptably accommodate the possibility of failure.
Risk Management Process / A systematic way of identifying, analyzing, and managing risks.
System Safety / A risk management process wherein a systematic process is employed to identify and control risks throughout the life cycle of a project, program or activity.
System Deficiency / The circumstances which permit hazards of a like nature to exist within a system.
Severity / Severity is a measure of the negative impact which could result from an occurrence caused by a hazard.
Terminate / Action will be taken to immediately cease operations until acceptable corrective action is taken.
Tolerate / Risks that have a Risk Index so low that they will be tolerated without further action.
Transfer / An attempt to pass the risk to another entity, external or internal.
Treat / Action will be taken to correct the situation and develop mitigation activities.
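
The relationship between Probability, Severity, the Hazard Risk Index and the Hazard Matrix can be illustrated with a short sketch. The 1 to 5 rating scales, the choice to multiply the two ratings, and the threshold values are assumptions made for illustration only; none of them come from this document, and each company should substitute its own matrix and limits. Transfer is left out because it does not depend on the index value alone.

```python
from enum import Enum


class Response(Enum):
    TOLERATE = "tolerate"    # index so low that no further action is needed
    TREAT = "treat"          # correct the situation and develop mitigation
    TERMINATE = "terminate"  # cease operations until corrective action is taken


# Hypothetical 1-5 rating scales; replace with the company's own.
MIN_RATING, MAX_RATING = 1, 5

# Hypothetical thresholds on the index (which ranges from 1 to 25 here).
TOLERATE_MAX = 4
TREAT_MAX = 14


def risk_index(probability: int, severity: int) -> int:
    """Combine probability and severity ratings into a single Risk Index."""
    for name, value in (("probability", probability), ("severity", severity)):
        if not MIN_RATING <= value <= MAX_RATING:
            raise ValueError(f"{name} must be between {MIN_RATING} and {MAX_RATING}")
    return probability * severity


def response_for(index: int) -> Response:
    """Map a Risk Index to a response using the assumed thresholds."""
    if index <= TOLERATE_MAX:
        return Response.TOLERATE
    if index <= TREAT_MAX:
        return Response.TREAT
    return Response.TERMINATE


def hazard_matrix() -> list[list[int]]:
    """Build the matrix: rows are probability ratings, columns are severity ratings."""
    ratings = range(MIN_RATING, MAX_RATING + 1)
    return [[risk_index(p, s) for s in ratings] for p in ratings]
```

Under these assumed values, a hazard rated probability 4 and severity 4 has an index of 16. Mitigation that lowers the probability rating to 2 brings the index down to 8, which moves the response from Terminate to Treat.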

Full Investigation Process

Any process has a starting point. In this case we have a section called the Trigger Event or Condition. There are four items listed that can provide you with this starting point.

Phase 1 Trigger Event or Condition

/ Internal reports are generated by staff, normally as part of your occurrence and hazard reporting system. This is the formal method of reporting. However, it could be a letter, email, phone call or text message. This is your informal reporting system. The goal here is to provide staff with an easy way to advise senior management that there is an issue. Staff should be trained to properly fill out the company reports, but safety issues should not be slowed down because we don’t have a proper copy of the form to fill out. If senior managers receive an informal report they should be fully aware of the process to be followed to ensure that the designated staff receives this information as quickly as possible.
/ External reports are normally received from the National Aviation Authorities, Safety Boards, Air Traffic Services or police agencies. They can also include information received from the local community, such as a report by a member of the public. This can take the form of a written item (paper or digital) or a phone call. Since some of these reports do not arrive through normal channels, staff throughout the organization need to know how to deal with this information. This includes how to respond to a phone call or an in-person complaint, as well as who needs to receive this information for follow up and how to get it to them.
/ These items are normally gathered from the final reports following any audits or reviews. However, you may wish to have a protocol in place so that if, during an audit or review, the auditing team discovers an item that they consider time critical, they know who to contact and how to report it. This is normally not an issue with internal company audits; however, it should be raised with external auditors and review teams as well.
/ The obvious event that requires an investigation is an occurrence. The company should, however, develop response levels based on the severity of the occurrence. The responses can range from simply tracking the data (flat tire) to an all-out investigation (aircraft has a hard landing resulting in damage).

Phase 2 Preliminary Assessment and Decision Making

This phase of the process is mainly focused on:

  • Gathering of initial data
  • Briefing the senior staff as required
  • Deciding on the level of response needed

This phase is conducted by the company-designated personnel, normally the Flight Safety Officer or Quality Assurance Manager. For the purposes of this flow chart we will refer to him or her as the investigator. See Appendix A for a description of some basic knowledge and skills the investigator should have.

/ The first step in the process is for the investigator to review all the preliminary data. The purpose is to give the safety department the information it needs to make a go-forward decision.
/ Once the preliminary data has been gathered and reviewed, the investigator makes the initial determination if this data represents an incident (event) or a hazard. If it’s determined that we are dealing with a hazard, then the investigator proceeds to “Conduct Hazard Process”.
/ This process is explained in detail in Chapter xyz.
/ The next step is to decide if this warrants an investigation. On many occasions you will find the event in question is something you are already dealing with. For example, if you’re experiencing a rash of flat tires you probably will not investigate every one. However, you will want to collect basic data and compare it with the data you already have to try to find a common fault. This type of event does not require the expense of resources beyond the basic data gathering.
However, if the event is serious or rare, or the investigator in coordination with senior management decides there is a possible lesson to be learned from the event, then the company should conduct an investigation. The level of investigation will be discussed shortly.
When doing this step one must remember there are really only four choices:
  • Do nothing
  • Data gather only
  • Short form investigation
  • Full and in-depth investigation

/ When you do a simple data gathering exercise, then the data is recorded in the company database for future use in establishing trends.
/ Depending on what the company has established as criteria for a full investigation, the investigator now reviews the initial data from the event to make the decision as to whether to investigate or not. You may wish to establish a policy that states the default response to any incident or occurrence is to conduct a full investigation, unless after review of the preliminary data by the investigator it meets the company criteria to either data collect or conduct a short form investigation.
This forces the investigator to properly evaluate each incident and also cuts down the likelihood that an important investigation will be missed.
For a sample listing of events or conditions requiring a full investigation see page xyz.
/ Once the decision has been made to conduct a short form investigation then that decision is formally recorded in the events file and the short form investigation process is started.
To review this process, please see page xyz.
/ At this point the investigator formally opens a safety event file and gives the file a unique number. If this event has Transport Canada or Transportation Safety Board participation or involvement, those file numbers should be ascertained from the respective agencies and documented in the file for cross reference purposes.
/ When conducting an investigation, senior management (Accountable Executive or his/her delegate) should be informed as soon as possible of the preliminary data. This can be done in person or via telephone or email. This is especially critical if preliminary data indicates possible serious repercussions for the company in the areas of operations, finances or legal. You may decide to have a time frame for the initial report such as 24 or 48 hours.
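
The decision step in this phase, with its four possible responses and the suggested policy of defaulting to a full investigation, can be sketched as follows. The three criteria flags are hypothetical stand-ins for whatever criteria the company establishes; the rule that the more cautious response wins when more than one criterion is met is also an assumption, chosen so that any error falls on the cautious side.

```python
from dataclasses import dataclass
from enum import Enum


class ResponseLevel(Enum):
    DO_NOTHING = 1
    DATA_GATHER_ONLY = 2
    SHORT_FORM = 3
    FULL = 4


@dataclass
class PreliminaryData:
    # Hypothetical flags set by the investigator after reviewing the
    # preliminary data against the company's own criteria.
    no_action_criteria_met: bool = False
    data_gather_criteria_met: bool = False  # e.g. a known issue already being tracked
    short_form_criteria_met: bool = False


def decide_response(data: PreliminaryData) -> ResponseLevel:
    """Default to a full investigation; downgrade only when criteria are met.

    Criteria are checked from the most cautious response downwards, so if
    several are met the more thorough response is chosen (an assumption).
    """
    if data.short_form_criteria_met:
        return ResponseLevel.SHORT_FORM
    if data.data_gather_criteria_met:
        return ResponseLevel.DATA_GATHER_ONLY
    if data.no_action_criteria_met:
        return ResponseLevel.DO_NOTHING
    return ResponseLevel.FULL
```

An event for which no criterion has been recorded, for example `decide_response(PreliminaryData())`, results in a full investigation, which is the behaviour the default policy is meant to guarantee.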

Phase 3 Investigation

/ Interviews should be conducted as soon as possible with all parties directly involved in the incident. The focus needs to be on determining the root cause of the event and not to establish punishment. Therefore some items that the company should keep in mind are as follows:
  • The investigator should not be involved in the company disciplinary process.
  • Company needs to clearly establish a process to establish liability for punishment and then strictly adhere to it. For an example of a culpability flow chart please see page xyz.
  • As soon as it is clearly established that disciplinary action is to be considered the file should be handed over to identified operational personnel responsible for that process.
  • Individuals being interviewed need to fully understand this process. If they do, they will be more likely to participate fully when they know that they are not at fault.
The company may wish to establish a protocol stating that anyone involved in an event is required to complete a short written statement as soon as possible after the event. This of course assumes the individual requires no medical aid, as this takes priority. The same applies to witness statements.
/ Witness statements should be gathered as soon as possible. Normally interviews are a good follow up to a written statement, as the investigator can request clarification of the written statement.
/ Gather all statements and properly document them on the file. When gathering written statements you should record where the statement was written, who if anyone was present during the writing of the statement, and the time and date it was completed.
/ At this point the investigator must gather all the information pertaining to the event. This can include any documents such as training records, policies, aircraft documentation, log books, operational manuals, etc. You should have a policy in your response plan document that details who should quarantine what documents following an accident or serious event.
/ In the case of a flight school or an aircraft operator the company may wish to establish an automatic grounding whenever someone is involved in an operational event. If the company does this, then it also needs to provide a methodology for a quick review and return to flight status. In the case of an AMO a similar process could be used for signing authorities.
/ The investigator now analyses all the information and starts to establish possible root cause scenarios. The goal is to find actions or inactions on the part of the individual involved and what conditions within the system allowed this event to take place.
/ Once all the data has been analyzed the investigator must now develop findings as to cause and also identify any hazards as a result of this event. Any hazards found should be put in the hazard assessment process. See page xyz.
/ Once the draft is completed, the parties directly involved should be allowed to review it for the purpose of ensuring that any factual data in the report is accurate. They may comment on the findings, but the findings are not to be changed unless a correction to the factual data has a bearing on them.
/ The investigator now takes the report and reviews all comments from interested parties. He/she will then produce the final draft of the report.
/ The report is now given to the system manager for review and comments. Once again the System Manager can comment on findings, but the findings should not be changed.
/ The system manager reviews the report for accuracy of factual data.
/ The System Manager now develops mitigation. The mitigation needs to be well defined and explained, in writing. The proposed mitigation should be reviewed by the investigator for comments. However the final decision of the mitigation rests with the system manager.
/ System Manager comments are returned to the investigator for inclusion in the report as required.
/ The investigator completes the final report and forwards the report to the Senior Manager. In most small to medium companies this would be the accountable executive.
/ Senior management can now add comments or direction for the implementation of the corrective actions.
/ Mitigation actions and initial results need to be recorded. At a later date the results will be revisited and reviewed for effectiveness.
/ The results are now recorded for future reference.
/ The information can be sent to the safety committee for review. This step is optional but strongly recommended for small companies. If your company has a safety committee under the Canada Labour Code, giving the information to the committee is mandatory.
/ Mitigation follow up can take the form of dedicated review or it can be incorporated within company audits that are conducted on a regular basis. The method to use depends on company policy and procedures, but can also be a result of the severity of the hazard in question.
/ The investigator and the system manager should review the effectiveness of the mitigation. If the mitigation is working adequately, then the file is closed and the results recorded in the database. If it is not then the process returns to the “Develop Mitigation” box and the process starts over from that point.
/ This event is now case closed and the final report with all mitigation verification complete should be sent to the senior manager (Accountable Executive) for final information purposes. This step is simply the final close out, as the senior manager would have received briefings and various copies of reports throughout the investigation.
/ The event is now case closed.
/ All information must be documented. This can be done either in paper files or in an electronic database. You will need to decide how long you will keep your data. It should be at least 5 years, preferably 7.
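
The record-keeping points made in this process (a unique file number, cross-referenced agency file numbers, the place, time and witnesses for each written statement, and a retention period of at least 5 years) can be sketched as a simple data structure. All field names and the file number format are hypothetical, and counting the retention period from the date the file is closed is an assumption; the document leaves that choice to the company.

```python
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import Optional


@dataclass
class WrittenStatement:
    author: str
    location_written: str      # where the statement was written
    completed_at: datetime     # time and date completed
    persons_present: list[str] = field(default_factory=list)


@dataclass
class SafetyEventFile:
    file_number: str           # unique within the company, format is hypothetical
    opened_on: date
    # Agency name -> that agency's own file number, for cross reference.
    agency_file_numbers: dict[str, str] = field(default_factory=dict)
    statements: list[WrittenStatement] = field(default_factory=list)
    closed_on: Optional[date] = None

    def retain_until(self, years: int = 7) -> Optional[date]:
        """Earliest disposal date, or None while the file is still open."""
        if years < 5:
            raise ValueError("retention should be at least 5 years")
        if self.closed_on is None:
            return None
        try:
            return self.closed_on.replace(year=self.closed_on.year + years)
        except ValueError:
            # Closed on 29 February and the target year is not a leap year.
            return self.closed_on.replace(year=self.closed_on.year + years, day=28)
```

The same structure works for a paper file index or an electronic database; the point is that each item the process asks you to record has a defined place.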

Short Form Investigation Process

The short form investigation will be used the majority of the time. The goal in any investigation is to get the information as efficiently and as effectively as possible. There is no point in doing a full-blown investigation of a relatively minor event.

/ The process for conducting a short form investigation is similar to the full investigation. The only thing that is cut out or limited is the number of other parties involved in the process.
/ The first step is to open a file. The same numbering system is used as if you were conducting a full investigation. An investigation is an investigation. The goal is the same for both the full and short forms; the only difference is a slight change in the methodology.
/ You will need to decide the timeframe for the initial report to senior management. Where you’ve made a decision to do a short form, you may choose to have this report done on a weekly basis vs. “within 24 hours” or whatever you decided as the reporting time for a full investigation.
/ Interviews are conducted with all involved parties such as the personnel involved in the event as well as witnesses.
/ If a flight crew member is involved and if they have been grounded, you will want to conduct this process as soon as realistically possible and return the individual to flight status.
/ Collect all data such as log sheets, log books, training records etc.
/ The investigator now analyses all the information and starts to establish possible root cause scenarios. The goal is to find actions or inactions on the part of the individual involved and what conditions within the system allowed this event to take place.
/ The investigator develops any findings and identifies any hazards. Hazards are identified and sent to the Hazard process because in any investigation you could find numerous hazards. In some cases you may come across a hazard that had nothing to do with the event.
/ Any hazards need to be run through the Hazard Assessment and Mitigation Process.
/ The investigator meets with the system manager and reviews all findings. The system manager is now responsible to develop the mitigation. The system manager can do this on their own, however they should consider using the investigator as a resource in the mitigation development.
/ Once the mitigation has been developed, the investigator can finalize and produce the final report.
/ The system manager now must take the previously developed mitigation and implement it. The system manager must report back to the company through the SMS program as to the effectiveness and status of the mitigation implementation.
/ The investigator records all results from the mitigation implementation.
/ At a specified timeframe, anywhere from 1 to 6 months, the mitigation effectiveness is reviewed. This can be done as part of a formal audit or as a standalone check.
/ The investigator now assesses whether the mitigation is working. If it is, then it is simply a matter of recording that information and case closing the event. You will note that the event is not closed until you have determined the effectiveness of the mitigation.
If the mitigation is not working effectively enough then the investigator and system manager review the hazard again and see what other options might exist. As far as the process is concerned, you return to the “Review Findings with System Manager and Develop Mitigation” box and start the process again from that point.
/ You will note that in the short form process there is no need to send the final report to the senior manager. This process is used for minor events; therefore unless the senior executive requests it, there is really no need to forward separate reports for each investigation. You should however submit a general report on a monthly basis that can be reviewed by both the senior manager and the safety committee.
/ All information must be documented. This can be done either in paper files or in an electronic database. You will need to decide how long you will keep your data. It should be at least 5 years, preferably 7.
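
The follow-up part of the short form process (review the mitigation 1 to 6 months after implementation, close the event only if the mitigation is working, otherwise return to mitigation development) can be sketched as below. The status names, the 3 month default and the use of 30-day months are assumptions for illustration, not requirements of the process.

```python
from datetime import date, timedelta
from enum import Enum


class Status(Enum):
    DEVELOP_MITIGATION = "review findings with system manager and develop mitigation"
    CLOSED = "case closed"


def review_due(implemented_on: date, months: int = 3) -> date:
    """Date of the effectiveness review, 1 to 6 months after implementation."""
    if not 1 <= months <= 6:
        raise ValueError("review interval should be between 1 and 6 months")
    # A month is approximated as 30 days to keep the sketch simple.
    return implemented_on + timedelta(days=30 * months)


def after_review(mitigation_effective: bool) -> Status:
    """The event is closed only once the mitigation is shown to work."""
    if mitigation_effective:
        return Status.CLOSED
    return Status.DEVELOP_MITIGATION
```

Whether the review is carried out as a standalone check or as part of a formal audit, the closure rule is the same: an ineffective mitigation sends the event back to mitigation development rather than to the closed state.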

Investigator Skills and Knowledge

The individual you put in charge of safety or occurrence investigations should have a working knowledge of the following areas: