OIWG Meeting Agenda Oct 05, 2016
1.Welcome, Call to Order, Introductions— Glenn Rounds
2.Review WECC Antitrust Policy—Glenn Rounds
All WECC meetings are conducted in accordance with the WECC Antitrust Policy and the NERC Antitrust Compliance Guidelines. All participants must comply with the policy and guidelines. This meeting is a closed session. The discussion should be limited to confidential information and should not extend to any matter that can be discussed in open session. You are also reminded that you have signed a non-disclosure agreement that governs any and all confidential information that is discussed during this meeting.
3.Approve Agenda
4.Review Sept 07, 2016 Meeting Minutes
5.Review of Previous Action Items—Glenn Rounds
Action Items for Closed Events
None
Action Items for Open Events
December 6, 2015 – Category 1a
The event was a human failure event: the fault current setting on the breaker-failure relay was set too high, which caused backup tripping of the source when a circuit breaker failed to trip because of mechanical issues.
The OIWG reviewed the report concerning this event. The OIWG would like to request that the entity report back at a future date to ensure that the corrective actions took place.
STATUS: 4/06/2016 –The entity is waiting for the plant to shut down to verify the circuit.
June 01, 2016 – Category 1a
A 115-kV transmission line fault inadvertently triggered a breaker failure scheme, locking out a 230-kV bus and tripping off 436 MW of generation as well as another 115-kV and 230-kV line.
AI – Regarding the June 01, 2016 – Category 1a event, Tim Reynolds will follow up with the questions below:
- Will this event be filed as a misoperation?
STATUS: 8/03/2016 –Yes, this event will be filed as a misoperation.
- Was the mis-wiring done to design? Is there peer review for drawings?
STATUS: 8/03/2016 –There actually was no mis-wiring. The wiring was done to design, but there was an oversight during the process regarding the possibility that the transformer could induce a small amount of current back onto the line for an unusual event like the one that occurred. Yes, there is peer review for drawings. For major relay scheme changes, we generally have consultants do the design and then our engineers review it. However, for incremental improvements in our protection systems, we sometimes conduct that design and review in-house. In this instance, consultants helped with the original design for this transmission line, but when we later decided to connect the relays to a transformer low side CT, we only had an in-house review.
- Are there additional corrective actions to be performed?
STATUS: 8/03/2016 –We are still investigating additional corrective actions that we could take. We have contacted our consultants for a peer review, but they have been slow to respond so far.
- Is there any other peer review other than the consultants?
- When do you plan to be done with the corrective action?
June 04, 2016 – Category 1a
A failed cross arm on a 230-kV line caused a sustained single-phase fault. Relays on one end opened an autotransformer bank, tripping all circuit breakers. Power circuit breakers on the other end reclosed but tripped back out for a permanent fault.
AI – Regarding the June 04, 2016 – Category 1a event, Tim Reynolds will follow up with the questions below:
- Was there a protection peer review of the bank settings?
June 28, 2016 – Category 1h
Two hours after a database update was implemented, operators started to lose control of field devices and receive failure alarms, and eventually the EMS stopped updating field data.
AI – Regarding the June 28, 2016 – Category 1h event, Tim Reynolds will follow up with the questions below:
- Does the Vendor have a recommendation to not complete failovers until the vendor can replicate and fix the software issue?
STATUS: 8/30/2016 –No. The Vendor recommended otherwise, and informed us to proceed with regular DB switches since the problem could not be replicated.
- Regarding the information in Section 19, what triggers emergency measures? Are there time elements? Does the procedure clearly define triggers and steps to be taken?
STATUS: 8/30/2016 –Section 19 (excerpt below) addresses SOCC evacuation if there are reliability concerns, and specifically addresses BUCC activation within a two-hour clock if the primary control system becomes inoperable. On June 28, 2016, the entity did not activate the BUCC. The problem experienced that day (where field data appeared static and was not updating) was resolved in less than an hour. The entity did notify its neighbors and power plants and had started dispatching personnel to key locations such as 345-kV substations.
Excerpt from their Emergency Procedure Manual: Emergency evacuation of the SOCC will occur if there are reliability concerns due to the control center not functioning as intended and it becomes inoperable, or uninhabitable. The SOCC may also be evacuated due to personnel safety concerns, for example: a bomb threat, gas leak, or a toxic chemical spill near the SOCC. SOCC Management may, at their discretion, execute any portion(s) of this plan under emergency or non‐emergency conditions. In the event that the primary control system becomes inoperable, the BUCC must be activated within two hours.
- Section 20 states that patches have been provided by the vendor; how does that compare with the vendor not being able to replicate the trouble? Has this been tested in a QA environment? What patch QA process is completed at the entity to ensure no repeat?
STATUS: 8/30/2016 –Although the entity and the vendor could not replicate the trouble, the vendor informed the company that a problem with the data acquisition function had been seen with other customers, and provided patches to the entity to address it. The patches were tested in the company’s QA environment with no identified issues. The patches were then applied to the backup system to test in a production environment and a new issue was identified with the vendor-supplied patch. The company has backed out the patch and is now awaiting a new patch from the vendor. The company has not had another similar event after multiple on-lining of new databases in the production environment.
- Who is the vendor and do they have their own root cause analysis process for this issue?
- Why did it take two hours to test a field device?
- Is there a heartbeat monitor that is specific for the EMS (separate system that monitors it)?
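The two-hour BUCC activation clock quoted from the Emergency Procedure Manual above can be sketched as a simple deadline calculation. This is only an illustrative sketch: the function names and the clock times are hypothetical, and only the two-hour limit comes from the excerpt.

```python
from datetime import datetime, timedelta

# Two-hour limit taken from the procedure excerpt; all names are hypothetical.
BUCC_ACTIVATION_LIMIT = timedelta(hours=2)


def bucc_deadline(inoperable_since: datetime) -> datetime:
    """Latest time by which the BUCC must be activated."""
    return inoperable_since + BUCC_ACTIVATION_LIMIT


def time_remaining(inoperable_since: datetime, now: datetime) -> timedelta:
    """Time left on the activation clock, floored at zero once it has expired."""
    return max(bucc_deadline(inoperable_since) - now, timedelta(0))


# Hypothetical clock times: the source gives the date but not the time of day.
lost_at = datetime(2016, 6, 28, 10, 0)
print(bucc_deadline(lost_at))
print(time_remaining(lost_at, datetime(2016, 6, 28, 10, 50)))
```

A check like this only tracks the time element. The conditions that start the clock would still need to be defined by the procedure itself.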
July 06, 2016 – Category 1h
After anti-virus updates were applied to the SCADA network servers, the update prevented the servers from communicating, which caused a loss of visibility to the entity’s SCADA system and to ICCP data that was being pushed to neighboring systems.
AI – Regarding the July 06, 2016 – Category 1h event, Tim Reynolds will follow up with the questions below:
- Was there a defined policy to address firewall configuration settings or was this overlooked? Peer check completed?
STATUS: 8/03/2016 –The entity does not have a documented policy or procedure that defines specific settings for the Symantec system. The SCADA vendor disabled the Symantec firewall settings when initially configuring the SCADA system, but the entity is not aware of a procedure/manual from the SCADA vendor that specifically addresses these settings. There was no peer check completed before these settings were implemented.
- Is there any sort of peer review or checklist review that has been implemented to prevent this from happening again?
July 19, 2016 – Category 2f
A 161-kV capacitor bank suffered an external fault and caught fire, tripping two 161-kV buses on differential trip. 485 MW of load and 236 MW of generation were tripped. An island of 95 MW remained for 29 minutes before collapse. 121,964 customers were affected.
- Has it been discovered where the voltage transient came from during the time marker 17:08:07?
STATUS: 9/22/2016 –No. I haven’t been able to discern whether it was a unit issue (boost vars) or an LVE event.
- What was the reason for the 31 MW picked up on a 6 MW line during the time marker 18:22?
STATUS: 9/22/2016 –LVE sectionalizing wasn’t small enough plus some “cold load” effect (the earlier two load pickups were within parameters).
- Was the system configuration setup according to the RC memo, and if so was the system condition study also aligned with the plan? Was the line that was out from BPA in the study with this configuration as well?
STATUS: 9/22/2016 –The Bus Configuration was implemented per our internal operating procedures (PCC-913) which has been posted to the RC; however, there was not an “RC Memo” in place at the time of this event. There may be confusion between this event and the 2013 event (during which there was a Memo in place – however, since that time, the 345kV Bus was reconfigured and the reliability basis for that operating memo no longer exists).
- When the temporary measures were put in place, what was the review and approval process? Was there an electrical SME involved in creating the temporary measures or reviewing them?
STATUS: 9/22/2016 –Yes, when this temporary configuration was put into place, it was extensively reviewed by numerous Planning and Relay Engineers/Technicians. Further it has been reviewed annually by Grid Operations before being put into place. While it was identified that a Close In Fault could potentially be seen by the Differential Scheme, the Electrical-Mechanical Relay coordination issue (see below) was not known – and it was believed that the other Zones of Protection would correctly isolate Faults away from the Bus and the Differential Trip would not initiate.
- When will the final measures be implemented?
STATUS: 9/22/2016 –Changes to the Single Bus Mode of the Differential Scheme were put into place following our investigation and will prevent this misoperation from occurring in the future (See below). In addition, an area RAS Scheme has been approved and is scheduled for installation late Q1 2017. However, this RAS Scheme is intended to prevent subsequent overloads for a loss of the 345kV Sources into Goshen and is NOT intended to provide supplemental protection against a Bus Fault.
- Is there a mitigation plan in place until the final measures are implemented? Will there be training for operators on how to handle a similar situation in a simulation type scenario?
STATUS: 9/22/2016 – It will be presented at the dispatchers’ fall training class. The entity has annually conducted Load Shed drills for the Area to help prepare operators to handle a similar event. Following this event, the Training Department has been directed to develop training scenarios using the new EMS Simulator for future operator training.
- Assuming that the faulted capacitor bank had its own zone of protection, did this protection operate correctly?
STATUS: 9/22/2016 –Yes, it did. The Capacitor Zone of Protection operated correctly and isolated the fault.
- Why did the bus differential scheme trigger? Is the capacitor bank outside the bus differential zone of protection?
STATUS: 9/22/2016 –The Differential Scheme we had in place had an additional contribution current from the Bus Tie Breaker that resulted in the normal flows through the bus not summing to zero. To correct, the relay settings were adjusted to allow for a ‘deadband’ where the normal current through the breaker would not trigger the differential protection – but a Bus Fault would. This has subsequently been corrected by the addition of a CT that removes the Bus Tie Breaker current flow.
However, while this issue is what allowed the Differential to see the Cap Bank Fault – it was not the actual cause of the lockout. The investigation showed that the Electro-Mechanical Device providing Differential Protection had a 0.1-second Disk Overtravel – and this is what caused the Differential to operate after the Cap Protection Zone isolated the fault. Basically, the Cap Bank Fault, being close in, resulted in excessively high momentary current through the Bus Tie Breaker – pushing it outside of the deadband and initiating travel on the Differential Disk. When the Cap Breaker cleared the fault, the high current condition was removed, but the disk continued to travel and tripped out the Bus. This coordination issue has also been addressed.
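The coordination issue described above (a deadband that starts disk travel only on high current, plus a disk that keeps travelling after the current is removed) can be illustrated with a deliberately crude simulation. All names and numbers below are hypothetical except the 0.1-second overtravel. A real induction-disk relay has an inverse-time characteristic and a finite reset time, neither of which is modeled here.

```python
def disk_trips(current_by_ms, deadband_amps, operate_ms, overtravel_ms):
    """Return True if the differential disk reaches its trip contact.

    current_by_ms: differential current samples, one per millisecond.
    The disk advances while the current is outside the deadband, then
    coasts for up to overtravel_ms after the current returns inside it.
    """
    travel = 0  # milliseconds of disk travel accumulated so far
    coast = 0   # milliseconds of overtravel still available
    for amps in current_by_ms:
        if abs(amps) > deadband_amps:
            travel += 1
            coast = overtravel_ms
        elif coast > 0 and travel > 0:
            travel += 1  # current removed, but the disk keeps moving
            coast -= 1
        else:
            travel = 0   # simplification: instantaneous reset
        if travel >= operate_ms:
            return True
    return False


# Hypothetical profile: 50 ms of normal bus-tie current, an 80 ms close-in
# fault cleared by the capacitor breaker, then normal current again.
profile = [200] * 50 + [5000] * 80 + [200] * 300

print(disk_trips(profile, 500, 150, 100))  # True: 80 ms + overtravel reaches 150 ms
print(disk_trips(profile, 500, 150, 0))    # False: no overtravel, disk resets
print(disk_trips(profile, 500, 200, 100))  # False: longer operate time coordinates
```

In this toy model the fault is cleared well before the relay's operate time, yet the coasting disk still completes its travel. That matches the sequence described in the status response.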
Aug 26, 2016 – Category 1h
The state estimator went down for 7.3 hours due to errors in system statuses and system changes outside of the entity’s boundary.
- Is there a plan to bring in more ICCP points from the neighboring entities?
STATUS: 9/28/2016 –For this particular problem, we have requested the ICCP data point that contributed to this issue from our neighbor; once that point is received, we will incorporate it into the model. In general, we are always looking to improve our accuracy, particularly at the boundaries where an equivalence model is used to represent the outside system. The majority of the time, the State Estimator will produce ‘Gross Measurement Errors’ alluding to model weakness well before a failure to solve. When GMEs appear, that is a trigger for additional ICCP information.
- Is there a process in place for neighbors to notify you of changes external to your system, and were you made aware of these changes?
STATUS: 9/28/2016 –Yes, we coordinate element outages with our neighbors. In this particular case, we did not associate this outage as impacting the state estimator; now that we know it can, we have requested this point via ICCP to prevent future issues with this element.
- Are there any concerns to the accuracy of the State Estimator due to changes with the sensitivity?
STATUS: 9/28/2016 –Yes, changing the sensitivity is impactful; however, in this case the sensitivity was changed to allow the state estimator to converge. Once we had a converged model, the sensitivity was changed back to its original value. With the converged model, the state estimator continued to solve once the original values were reinstated.
6.Review of New Events since Last Call – Glenn Rounds
Sept 10, 2016 – Category 1h
Due to a failure in the plant’s cooling tower, 585 MW of generation was lost and 157 MW of load was shed to prevent SOL overloads.
7.Review of New Action Items
8.Review Upcoming Meetings
Nov 16, 2016...... Webinar
Dec 07, 2016...... Webinar
9.Adjourn
Western Electricity Coordinating Council