Stardate 20031120.1238 (On Screen): Life is risk. Everything we do can lead to disaster, and disasters great and small happen all the time. If you plot seriousness versus prevalence on a chart, the result is not a bell curve. It's more like a power curve, with an inverse relationship between seriousness and likelihood. (Which is why engineers use terms like "the ten thousand year flood"; it refers to a flood so big that it's only expected to happen about once every ten thousand years.)
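To make the "ten thousand year flood" idea concrete, here's a quick back-of-the-envelope sketch (my own illustration, not something from the original entry): an event with a one-in-ten-thousand chance of happening in any given year is very unlikely within a human lifetime, but becomes more likely than not if you wait ten thousand years.

```python
# Back-of-the-envelope sketch (illustrative numbers only): a "ten thousand
# year flood" has roughly a 1-in-10,000 chance of occurring in any single
# year. The chance of seeing at least one such flood over N years is
# 1 - (1 - 1/10000)**N.
annual_probability = 1.0 / 10_000

for years in (1, 80, 1_000, 10_000):
    p_at_least_once = 1.0 - (1.0 - annual_probability) ** years
    print(f"{years:>6} years: {p_at_least_once:.2%} chance of at least one such flood")
```

Over 80 years the chance is still well under one percent; over ten thousand years it's about 63%. Even a "rare" event becomes likely if you wait long enough, which is why engineers talk about how rare is rare enough rather than about "never."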
Some kinds of disasters (e.g. weather) are natural events whose probability isn't affected by the choices we make, but all of us consciously choose to take risks as part of our everyday lives. For most people, the most dangerous thing they do routinely is to drive their cars – unless they're smokers. But it's a calculated risk; driving has immediate benefits and in general the risk of a serious car accident is pretty low. In a nation the size of the US there's a steady rate of deaths in car accidents, but that's only because the nation is so big and so many people in it drive so often. Still, how you drive and where you drive changes your risk, and people who are prudent can decrease their risk without giving up their cars.
Risk is an element of engineering, because it's not feasible to design a solution which cannot fail. (In many cases it isn't even possible.) In an article last year I included the following:
So in this case it's not just a question of whether a given system will eventually fail; it's guaranteed to do so. The only debate is when, and how, and how serious the consequences of the failure will be, and whether the system will be robust enough to recover from it.
Not that this is anything to intimidate an engineer, because no competent engineer ever designs a system which cannot fail. What you try to do is to design so that failure is acceptably rare, and that the consequences of failure are acceptably benign.
Any design which is perfect will also be perfectly unbuildable. A perfect aircraft will weigh too much to get into the air. A perfect anything will cost too much and take too long.
And the process of creating a perfect design is itself unacceptable, because in practice it never ends. You don't ever get to implementation because you'll be stuck in an endless redesign loop, chasing ever more obscure and improbable failure modes well beyond the point of diminishing returns.
That's why most engineers roll their eyes in exasperation when laymen ask for perfect safety from anything, such as nuclear power or the use of genetic modifications on food. Asked if he can guarantee that a nuclear power plant won't fail in such a way as to release significant radioactivity into the area around it, the engineer will answer, "Of course I can't. Why would I even want to try?" That makes the layman think the engineer is a reckless fool who doesn't give a damn about that layman's life and health.
But to that engineer, the idea of a system guaranteed not to fail is idiocy. There ain't no such beast. The system in question isn't guaranteed never to fail, but it has an adequate margin of safety along with a benign failure mode, and that's the best anyone can do.
It's a cultural thing; laypeople don't understand that all engineering is a tradeoff. In some cases the effort applied to reliability will be very extensive, as the work involved with the Shuttle's computers demonstrates. In other cases it will not make commercial sense to do that; it will massively raise development costs without any commensurate increase in sales to pay for it.
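To put a rough number on that tradeoff, here's a deliberately oversimplified sketch (the figures and the cost model are mine, purely for illustration): treat risk as the probability of failure times the cost of that failure, and weigh it against what it costs to drive the probability down.

```python
# Toy comparison (invented numbers): is ten times the engineering effort
# worth a tenfold reduction in failure probability? Expected total cost is
# development cost plus (probability of failure x cost of failure).
def expected_total_cost(development_cost, p_failure, failure_cost):
    return development_cost + p_failure * failure_cost

option_a = expected_total_cost(1_000_000, 1e-3, 50_000_000)   # 1,050,000
option_b = expected_total_cost(10_000_000, 1e-4, 50_000_000)  # 10,005,000

print(f"cheap design:     {option_a:,.0f}")
print(f"reliable design:  {option_b:,.0f}")
```

By this pure expected-cost measure the extra reliability isn't worth it; whether it's worth it in real life depends on things the toy model leaves out, such as whether a failure kills people or ends the company.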
In some cases, when you take a risk and roll snake-eyes, it doesn't mean that catastrophe happens; it means that it begins to happen, and if you keep your eyes open you may be able to see it coming and do something about it. In a lot of cases, the risk you're taking isn't so much a risk of failure as a risk that you'll have to augment the design.
In any really large engineering design it's essentially certain that this will be necessary. The challenges the system will face cannot be predicted perfectly, and neither can the ways in which users will use it. New and unforeseen challenges will appear, and you'll have to deal with them. New usage patterns may appear, changing the fundamental problem space.
And what's even more important, new tools and resources may also appear. Not only is it impractical to try to design a perfect solution immediately, you wouldn't want to even if you could, because it's likely that some parts of the solution will be cheaper and easier to build in the future.
The engineers who collectively designed the beginnings of the modern phone system in the 1940's and 1950's only had mechanical technologies to work with. Vacuum tubes were too expensive and too unreliable to use in large numbers, so pretty much everything had to be done with physical switches. Their solution to the problem of "direct dial" with the old rotary phones was quite clever, actually, but by modern standards was also te