"Forum on Risks to the Public in Computers and Related Systems" <"Peter G. Neumann" Tue, 8 May 2007 10:58:28 PDT This is another example of a system environment in which components that were supposedly not safety related could compromise safety. The case is of considerable interest to RISKS. On 19 Aug 2006, operators manually scrammed Browns Ferry, Unit 3, following a loss of both the 3A and 3B reactor recirculation pumps, as required after the loss of recirculation flow -- which placed the plant in a high-power, low-flow condition where core thermal hydraulic stability problems may exist at boiling-water reactors (BWRs). Generally, intentional operation is not permitted under this condition. Although some BWRs are authorized for single loop operation, sudden loss of even one pump could present the plant with the same stability problems and could result in the reactor protection system initiating a shutdown of the plant. [Source: Effects of Ethernet-based, Non-safety Related Controls on the Safe and Continued Operation of Nuclear Power Stations, United States Nuclear Regulatory Commission, Office of Nuclear Reactor Regulation, Washington, DC 20555-0001, 17 Apr 2007; PGN-ed, although the following text is abridged but unedited.] The initial investigation into the dual pump trip found that the recirculation pump variable frequency drive (VFD) controllers were nonresponsive. The operators cycled the control power off and on, reset the controllers, and restarted the VFDs. The licensee also determined that the Unit 3 condensate demineralizer controller had failed simultaneously with the Unit 3 VFD controllers. The condensate demineralizer primary controller is a dual redundant programmable logic control (PLC) system connected to the ethernet-based plant integrated computer system (ICS) network. The VFD controllers are also connected to this same plant ICS network. 
Both the VFD and condensate demineralizer controllers are microprocessor-based utilizing proprietary software. The licensee determined that the root cause of the event was the malfunction of the VFD controller because of excessive traffic on the plant ICS network. Testing by site personnel performed on the VFD controllers confirmed that the VFD control system is susceptible to failures induced by excessive network traffic. The threshold levels for failure of the VFD controllers due to excessive network traffic, as determined by the on-site testing, can be achieved on the existing 10-megabit/second network. The NRC staff's review of industry literature and test reports on network device sensitivity, and the threshold levels for such failures, confirmed these testing results.

The licensee could not conclusively establish whether the failure of the PLC caused the VFD controllers to become nonresponsive, or the excessive network traffic, originating from a different source, caused the PLC and the VFD controllers to fail. However, information received from the PLC vendor indicated that the PLC failure was a likely symptom of the excessive network traffic.

To ensure that excessive network traffic will not cause future Unit 3 VFD controller malfunctions, the licensee disconnected these devices from the plant ICS network before restart. The licensee also disconnected the Unit 2 VFD controllers from the plant ICS network. Licensee corrective actions included (1) developing a network firewall device that limits the connections and traffic to any potentially susceptible devices on the plant ICS network and (2) installing a network firewall device on each unit -- VFD controller and condensate demineralizer controller. The Browns Ferry Unit 3 event is discussed in Licensee Event Report 05000296/2006-002, dated October 17, 2006, Agencywide Documents Access and Management System, Accession No. ML062900106.
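The notice does not say how the licensee's firewall device limits traffic. As a rough illustration of the general idea only, here is a minimal token-bucket sketch in Python. The class name, the 100 frames/second rate, and the 10-frame burst allowance are all hypothetical values chosen for the example, not figures from the event report.

```python
class TokenBucket:
    """Toy rate limiter placed in front of a traffic-sensitive device.

    `rate` (frames/second) and `capacity` (burst size in frames) are
    hypothetical settings, not values from the Browns Ferry report.
    """

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full burst allowance
        self.last = 0.0         # timestamp of the previous frame, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a frame arriving at time `now` may be forwarded."""
        elapsed = max(0.0, now - self.last)
        self.last = now
        # Refill in proportion to elapsed time, never beyond the burst cap.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over budget: drop the frame instead of forwarding it


def forwarded(bucket: TokenBucket, arrival_times: list[float]) -> int:
    """Count how many of the arriving frames the bucket lets through."""
    return sum(1 for t in arrival_times if bucket.allow(t))


# Normal polling: 50 frames spread evenly over one second.
normal = forwarded(TokenBucket(rate=100, capacity=10), [i / 50 for i in range(50)])

# Storm: 10,000 frames in the same second.
storm = forwarded(TokenBucket(rate=100, capacity=10), [i / 10_000 for i in range(10_000)])

print(normal, storm)
```

Under these assumed settings the normal traffic passes untouched, while the storm is cut to roughly the configured rate plus the initial burst, so the protected device sees on the order of a hundred frames rather than ten thousand.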
The reason the licensee at Browns Ferry investigated whether the failure of one device, the condensate demineralizer PLC, may have been a factor in causing the malfunction of the VFD controllers is that there is documentation of such failures in commercial process control. For instance, a memory malfunction of one device has been shown to cause a data storm by continually transmitting data that disrupts normal network operations resulting in other network devices becoming locked up or nonresponsive. A network found to be operating outside of normal performance parameters with a device malfunctioning can affect devices on that network, the network as a whole, or interfacing components and systems. The effects could range from a slightly degraded performance to complete failure of the component or system. Major contributors to these network failures can be the addition of devices that are not compatible, network expansion without a procedure and an overall network plan in place, or the failure to maintain the operating environment for legacy devices already on the network.

While only non-safety related network devices became nonresponsive at Browns Ferry Unit 3, it is important to protect both safety-related and non-safety related devices on the plant network to ensure the safe operation of the plant. The 19 Aug 2006 transient unnecessarily challenged the plant safety systems and placed the plant in a potentially unstable high-power, low-flow condition. The potential safety implications for future similar events would depend on the type of devices that are connected to the plant ethernet. Careful design and control of the network architecture can mitigate the risks to plant networks from malfunctioning devices, and improper network performance, and ultimately result in safer plant operations."

The link on the website is wrong; here is the updated link... Great tech article....
EFFECTS OF ETHERNET-BASED, NON-SAFETY RELATED CONTROLS ON THE SAFE AND CONTINUED OPERATION OF NUCLEAR POWER STATIONS (AKA WTF HAPPENED!)

Reminds me of when the "network administrator" just forgot which end to plug into a switch and ended up plugging both ends into an unmanaged switch. (Linksys, I think it was.) It caused so much wild and crazy traffic that everything slowly came to a grinding halt as packets were flying all over the place.... The network really did crap all over itself.
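For anyone wondering why one looped cable does that, here is a toy counting model in Python. It assumes a switch with no loop protection that floods every broadcast out of all ports except the one it arrived on, with a single cable joining two of its own ports. The function name and the one-broadcast-per-tick figure are made up for illustration, and link-speed limits are ignored.

```python
def switch_load(ticks: int, looped: bool, new_broadcasts_per_tick: int = 1) -> list[int]:
    """Frames the toy switch must handle on each tick.

    Assumed model: Ethernet frames carry no hop count, so a broadcast
    that gets into the loop is never aged out and keeps coming back.
    """
    circulating = 0  # copies trapped going round the looped cable
    load = []
    for _ in range(ticks):
        fresh = new_broadcasts_per_tick  # e.g. ordinary ARP requests from hosts
        load.append(fresh + circulating)
        if looped:
            # A fresh broadcast is flooded out BOTH looped ports, so two
            # copies re-enter the switch. Each copy already in the loop is
            # flooded back out the other looped port and returns as one copy.
            circulating += 2 * fresh
    return load


print(switch_load(5, looped=False))  # [1, 1, 1, 1, 1]
print(switch_load(5, looped=True))   # [1, 3, 5, 7, 9]
```

In this simplified model the load never settles: every ordinary broadcast adds two more permanent copies, which fits the "slowly came to a grinding halt" description. Managed switches normally avoid this with spanning tree or storm control.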