By Charles Miller
Continuing with last week’s tale, what is this mysterious thing that can cause a computer malfunction that almost made an airliner fall from the sky, and could also be causing intermittent and unexplainable behavior in your computer or smartphone? The answer is cosmic rays, high-energy radiation from outer space.
The planet Earth and everything on it is being constantly bombarded by these rays. About 90% of these particles are protons, 9% are helium nuclei, and 1% are heavier nuclei. Some come from our sun, while other high-energy cosmic rays moving close to the speed of light come from exploding stars, supernovas in deep space. These invisible speeding bullets are constantly raining down on us.
When one of these particles pierces a transistor or other semiconductor, it can occasionally strike in just the right place to create a large number of free charge carriers. These electrons could cause a “bit flip,” changing a single bit of the computer’s binary language from a one to a zero or vice versa.
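To see how much damage one flipped bit can do, here is a minimal Python sketch. It is only an illustration: the function name flip_bit and the choice of the letter "A" are assumptions made for this example, and software cannot really imitate the physics of a particle strike.

```python
def flip_bit(value: int, position: int) -> int:
    """Return value with the bit at `position` inverted (0 = least significant bit)."""
    # XOR with a one-bit mask turns a 1 into a 0 and a 0 into a 1.
    return value ^ (1 << position)


original = 0b01000001            # 65, the ASCII code for the letter "A"
upset = flip_bit(original, 1)    # invert the second-lowest bit

print(format(original, "08b"), chr(original))
print(format(upset, "08b"), chr(upset))
```

In this sketch a single changed bit turns one letter into a different letter. The same thing happening inside a number or a program instruction could be far more disruptive.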
This is known as a Single Event Upset (SEU), a type of soft error. The error is called “soft” because the hardware was not damaged. “Single Event” means the error is not reproducible and that exact error might never occur again. And I assume “Upset” could be a reference to how you might feel if you spent a small fortune on a forensic investigation only to find out it is still impossible to know for sure what caused a malfunction.
When manned space flights began in the 1960s, NASA became aware that, among other hazards, cosmic rays could affect electronics. Engineers understood that if a malfunction occurred in the computerized guidance and control systems of the Space Shuttle, the consequences could be catastrophic. It would not fly (pun warning) to protect the computers behind several tons of lead shielding, so the solution was simply to install four computers instead of one.
The four computer systems all do the same calculations at the same time, and every few milliseconds they compare their results. If one of the four ever disagrees with the others, that disparate one is ignored. This quadruple redundancy of the Space Shuttle computer systems also provided a way to track the frequency of bit flips. On Space Shuttle mission STS-48 in 1991 there were 161 bit flips recorded in five days, and thanks to its four computers Discovery landed safely.
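The compare-and-ignore idea described above can be sketched in a few lines of Python. This is a simplified illustration, not NASA's actual flight software: the function name vote, the sample values, and the rule for handling a tie are all assumptions made for this example.

```python
from collections import Counter


def vote(results):
    """Return (agreed_value, dissenters) for a list of redundant results.

    `dissenters` holds the positions of computers whose answer differs
    from the majority. A ValueError is raised when no strict majority exists.
    """
    value, count = Counter(results).most_common(1)[0]
    if count <= len(results) / 2:
        raise ValueError("no majority among redundant computers")
    dissenters = [i for i, r in enumerate(results) if r != value]
    return value, dissenters


# Four computers run the same calculation; the third suffers a bit flip.
value, ignored = vote([42, 42, 46, 42])
print("agreed value:", value, "ignored computers:", ignored)
```

Counting the dissenters over time is also how a scheme like this could double as a bit-flip tally, as the mission figures above suggest.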
Here on Earth, the chip maker Intel estimates that in every computer, smartphone, and tablet there could be one bit flip every few months; but who knows, because these SEU soft errors leave no clues. Chip makers have created sophisticated error-correcting code (ECC) to help mitigate the effects of bit flips. After Qantas Flight 72 almost crashed, airplane manufacturers made upgrades to avionics, making already-reliable systems even more reliable. But things sometimes still go bump in the night. If your electronic device does something inexplicable only once, it could have been those cosmic rays.
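For the curious, the idea behind error-correcting code can be shown with the classic textbook Hamming(7,4) scheme, which stores four data bits alongside three check bits and can repair any single flipped bit. This is only a sketch under stated assumptions: the function names encode and correct are made up for this example, and the codes used in real memory chips protect much larger words than this.

```python
def encode(data):
    """Encode 4 data bits as a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def correct(codeword):
    """Repair at most one flipped bit; return (data_bits, flipped_position).

    flipped_position is 0 when no error was detected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # the failed checks spell out the bad position
    if position:
        c[position - 1] ^= 1          # flip the damaged bit back
    return [c[2], c[4], c[5], c[6]], position


stored = encode([1, 0, 1, 1])
stored[4] ^= 1                        # simulate a cosmic-ray bit flip at position 5
data, where = correct(stored)
print("recovered data:", data, "repaired position:", where)
```

The three check bits cost extra storage, which is the trade-off a scheme like this makes in exchange for quietly fixing a single flipped bit.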
Charles Miller is a freelance computer consultant, a frequent visitor to San Miguel since 1981 and now practically a full-time resident. He may be contacted at 415-101-8528 or email FAQ8@SMAguru.com.