The Practice of Risk Analysis and the Safety of Dams

Baecher, G.B.[1] and Christian, J.T.[2]

Abstract

Despite the best efforts of engineers to design conservatively, dams and other geotechnical structures do fail at relatively constant rates. While no engineer or engineering organization designs dams to have a finite probability of failure, the use of risk analysis techniques is growing in popularity as a means of dealing with the uncertainties in geotechnical performance. Risk analysis forces the engineer to confront uncertainties directly and to use best estimates of site conditions and behavior in predicting performance. Uncertainties, rather than being dealt with by conservative assumptions, are themselves treated as quantifiable entities. Methodologies that originated in the aerospace and nuclear industries are now being applied to geotechnical structures, with notable success. Nonetheless, the development of geotechnical risk analysis procedures requires that the unique considerations of geotechnical uncertainties, different in many ways from structural uncertainties, be confronted and dealt with. This is leading to risk analysis procedures specially tailored to geotechnical applications.

Introduction

Engineers, and the organizations for which they work, do not build dams with an intentional probability of failure. Engineers work on one dam at a time, and they design it to be safe. If they are uncertain about site conditions or flood frequencies, they design conservatively. Engineers exercise the public trust. They are not gamblers, and for the most part do not believe that nature is random. They believe that the world behaves according to fixed rules of physics, and their job is to work with those rules to plan, design, and build structures that behave as intended.

But dams do fail, and not just occasionally. Modern, well-designed dams operated by competent authority fail at a rate of about 10⁻⁴ per dam-year. Failure here means loss of pool. Dam incidents, as defined by ICOLD, which are serious events that do not cause loss of pool, happen at a rate more than ten times greater than the rate of failures. The number 10⁻⁴ per dam-year sounds small, but in the United States, to take one example, there are 75,000 dams over 7.7m (25 feet). The 10⁻⁴ rate implies an average of 7.5 dam failures a year. Indeed, in the last ten years, the National Performance of Dams Program (McCann 1999) has recorded 440 dam failures in the U.S. of structures conforming to the Federal Guidelines on Dam Safety, that is, structures more than 7.7m (25 feet) high or impounding more than 32 thousand cubic meters (25 acre-feet) of water. These include many privately owned and many relatively small structures, which have a higher rate of failure than large, government-owned structures. Most, but not all, of these failures occur during major storms. Hurricane Agnes (1972) alone may have caused 200 dam failures in the eastern United States.
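The arithmetic behind these figures is easy to verify; the following sketch, using only the rates and counts quoted above, reproduces it:

```python
# Back-of-envelope check of the dam-failure arithmetic quoted above.
failure_rate = 1e-4      # failures per dam-year, modern well-operated dams
n_dams_us = 75_000       # U.S. dams over 7.7 m (25 ft)

# Expected failures per year across the inventory: about 7.5.
expected_failures_per_year = failure_rate * n_dams_us

# The 440 failures recorded over ten years imply a higher observed rate,
# consistent with the note that the record includes many small, privately
# owned dams with higher failure rates than large, government-owned ones.
observed_rate = 440 / (10 * n_dams_us)   # about 5.9e-4 per dam-year
```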

Regulatory authorities and the public in general have grown ever more aware of the risks posed by chemicals, consumer goods, and other products of industrial society. They have also grown more aware of risks posed by infrastructures, including dams, levees, and other water resource structures. They do not necessarily believe the engineering community’s assurances that a dam is safe. How should designers and the organizations that build and operate dams respond to this challenge? Despite misgivings by some factions of the profession, increasingly the response is to turn to the risk analysis procedures pioneered and proven in the aerospace and nuclear power industries.

Rate of failure of modern dams

Despite the difficulties, the International Commission on Large Dams (ICOLD) and its national affiliates, for example, the United States Committee on Large Dams (USCOLD) and the Australian National Committee on Large Dams (ANCOLD), have devoted a great deal of attention to compiling information on dam failures and their causes. These are voluntary professional societies, but most dam-building agencies and engineers participate in their activities. From the efforts of ICOLD and from the information generated by organizations examining their own operations, engineers have developed a fairly clear picture of the causes of dam failures (International Commission on Large Dams 1973; International Commission on Large Dams 1995).

The foremost cause of failure as cited in the catalogs is overtopping. More water flows into the reservoir than the reservoir can hold or pass through its spillway. The excess water has to go somewhere, and the most likely place is over the top of the dam. This does serious damage to the dam, especially to an embankment dam, which is likely to fail. At some dams, even when the outlet gates are fully open, the spillways are not large enough to carry the water piling up behind the dam. Overtopping and inadequate spillway capacity tend to be lumped together in the catalogues of dam failures.

Actually, the rate of failure by overtopping of modern, well-built dams operated by competent authorities is small. Most of the overtopping failures recorded in the catalogs are of dams built in earlier times, or dams that were poorly maintained, or operated by other than competent authority. Almost all modern, large dams have benefited from significant advances in hydrological science, including hydrological risk analysis, over the past few decades, and are designed to conservative assumptions about the largest flood they must be prepared to store or pass, the so-called “probable maximum flood” (PMF) in US practice. Indeed, Lave and Balvanyos (1998) maintain that no major US dam has ever experienced a PMF, although probable maximum precipitations (PMP) have been approached or exceeded (U.S. Bureau of Reclamation 1986).

Spillway capacity has a major influence on the likelihood of overtopping, but the way the reservoir is operated is equally important. Organizations develop manuals to instruct operators in what to do in various situations, and the organizations assume that operators know the procedures and follow them. Yet this is not always the case. As with failures in many spheres, some failures happen because the operators do not follow the prescribed procedures. An example is the Euclides da Cunha dam in Brazil. In 1977, during a torrential rainstorm, water in the reservoir rose faster than the rate at which the spillway gates were supposed to be opened. Operators were reluctant to open the gates because the resulting flood would affect their families, friends, and property downstream. They waited too long, and the result was a major dam failure.

The next most common cause of failure is internal erosion. This starts when the velocity of the water seeping through an embankment or abutment becomes so large that it starts to move soil particles. Once particles are removed, the channel becomes larger; it attracts more flow, which picks up more particles and enlarges the channel further. The end of this process can be a channel so large that the flow through it destroys the dam or abutment. On June 5, 1976, the Bureau of Reclamation’s 300-foot-high Teton Dam failed. The dam had only recently been completed, and the reservoir had never been filled. Unusually large snow melt in the Grand Teton mountains sent water into the reservoir more rapidly than had been anticipated, filling the reservoir to capacity. The outlet works were not yet operating, so the water could not be diverted. Engineers still debate how the failure occurred, but internal erosion created a full breach near the right abutment that allowed the pool to escape in a wave that engulfed the towns downstream.

Engineers have learned a great deal about internal erosion and the effects of seepage at dam sites. They go to great lengths to control seepage under and around dams. This can involve constructing walls to contain the seepage or pumping cement grout at high pressure into the rock to seal openings. Embankments have multiple layers with different permeabilities and grain sizes, some to prevent seepage, some to channel the flow safely into drains, and some to prevent particles from migrating under seepage pressures and initiating piping. To make sure that all this is working properly, engineers install devices to measure movements and pressures and monitor the readings regularly. A modern dam is a complicated and ever-changing structure with which the operators interact continuously.

People who deal with older dams recognize that they were not built with the same knowledge and experience as a modern dam. This is particularly true of dams that were built and maintained by inexperienced groups without adequate engineering support. The Johnstown flood of 1889, one of the worst public disasters in U.S. history, killed about 2200 people. It happened because a badly designed embankment dam, operated by a private club to retain water for a resort lake, and maintained poorly if at all, collapsed during a heavy rainstorm. In 1977 the Toccoa Falls Dam, built originally with volunteer labor at a religious camp, failed under similar circumstances; 39 people died in the resulting flood.

The risk of dam failure is also not uniform over the life of the dam. Like most engineered products, the chance that a dam will fail is highest during first use, which for a dam is first filling, the first time that the reservoir is filled to capacity. If something was overlooked, or if some adverse geological detail was not found during exploration, then this is usually the time that it will first become apparent. As a result, about half of all dam failures occur during first filling. The other half occur more or less uniformly in time during the remaining life of the dam. So, if the rate of failure averaged over the whole life of a dam is about 1/10,000 per dam-year, the rate during the first, say, five years reaches almost 1/1,000 per dam-year, or ten times higher. This is exactly what the historical record shows.
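The ten-fold figure follows from simple bookkeeping. The sketch below assumes, purely for illustration, a 100-year service life; the service-life number is ours, not the text's:

```python
# If half of a dam's lifetime failure probability is concentrated in the
# first five years (first filling), the early-life rate follows directly.
avg_rate = 1e-4       # failures per dam-year, averaged over the whole life
life_years = 100      # assumed service life (illustrative assumption)
early_years = 5

lifetime_failures = avg_rate * life_years            # expected failures per dam
early_rate = 0.5 * lifetime_failures / early_years   # half occur in first 5 years

# early_rate is about 1/1,000 per dam-year: ten times the lifetime average.
```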

That about half of all dam failures occur during first filling is a troubling observation, for the following reason. In arid areas, which use dams primarily for irrigation and only secondarily for flood control, reservoirs are often kept full. If a heavy storm is forecast, the reservoir is lowered to make room for the larger inflows coming from upstream. But in temperate regions, where dams primarily serve flood control needs and irrigation is not an important benefit, reservoirs are typically kept low. If a flood comes, either its entire flow is caught behind the dam or, if it is a very large storm, at least its peak flow is caught. But since most flood control reservoirs are designed for floods of a size that essentially never comes, the probable maximum flood (PMF), many dams in temperate regions, such as the eastern US, have never experienced design pool levels; they have never seen first filling and thus have never been proof tested. The probability of failure of these dams, should an extreme flood come, could be ten times greater than that of a load-tested dam. Of course, the chance of a PMF is purposely remote.

How risk analysis is carried out

How do we think about the risk of dam failure? Most risk analyses begin with a systematically structured model of the events that could, if they happened in a particular way, lead to failure. This model is called an event tree.[3]

An event tree begins with an initiating event, and graphs the sequences of subsequent hypothetical events that ultimately could lead to failure. Examples of initiating events include earthquakes, floods, and hurricanes. An example of something other than a natural hazard that might be an initiating event is excessive settlement, which may cause equipment failure, say, a spillway gate, and simultaneously disrupt utility services needed to deal with that equipment failure.

The steps in a risk analysis are:

  1. Define what “failure” means.
  2. Identify initiating events.
  3. Build an event tree of the system.
  4. Develop models for individual components.
  5. Identify correlations among component failures or failure modes.
  6. Assess probabilities and correlations for events, parameters, and processes.
  7. Calculate system reliability.

It is often said that a principal benefit of risk analysis lies simply in structuring the problem as an event tree and in trying to identify interactions and correlations—what reliability engineers call “failure modes and effects analysis”—whether or not quantitative reliability calculations are ever carried out or used.

Structuring risk in an event tree

An event tree is nothing more than a graphical device for laying out chains of events that could lead from an initiating event to failure. Each chain in this tree leads to some performance of the system. Some of these chains lead to adverse performance, some do not. For each event in the tree, a probability is assessed presuming the occurrence of all the events preceding it in the tree, that is, a conditional probability. The total probability for a particular chain of events, or path through the tree, is found by multiplying the sequence of conditional probabilities along the path.
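As a concrete sketch of that multiplication, consider one hypothetical path; the branch probabilities below are invented for illustration, not taken from any real analysis:

```python
from math import prod

# Conditional probabilities along one hypothetical path through an event tree:
# P(flood), P(gate fails | flood), P(overtopping | flood and gate failure).
branch_probs = [0.01, 0.10, 0.50]   # illustrative values only

# The total probability of this chain of events is the product of the
# conditional probabilities along the path: 0.0005 here.
path_probability = prod(branch_probs)
```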

Simple random experiments can be used to show how event trees are useful in diagramming outcomes and identifying sample spaces. These event trees for simple experiments are the same in concept as those used to analyze complex system reliability, only much simpler. The event tree shown in Figure 1 represents the experiment comprising two successive tosses of a fair coin. On the first toss, the coin lands either heads-up or tails-up, and similarly on the second toss. If one presumes these tosses to be independent, the branch probabilities in all cases are 0.5. The probability of each of the four possible outcomes is the same, 0.25. If a wager is placed such that the player loses if and only if two tails occur (T,T), then the “probability of failure” from the event tree is 0.25. The use of event trees in analyzing complex system behavior is exactly the same as in this simple example.
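The coin-toss tree can be enumerated directly; a minimal sketch:

```python
from itertools import product

# Event tree for two independent tosses of a fair coin (Figure 1).
# Each branch has conditional probability 0.5; a path's probability is the
# product of the branch probabilities along it.
outcomes = {path: 0.5 * 0.5 for path in product("HT", repeat=2)}

p_failure = outcomes[("T", "T")]   # the wager is lost only on two tails

# p_failure equals 0.25, and the four path probabilities sum to 1,
# confirming that the tree exhausts the sample space.
```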