USS Clueless - Fault tolerant distributed computing

Stardate 20020821.1402

(Captain's log): The last job I held was in the cell phone industry, working on an advanced form of communications called Code Division Multiple Access, or CDMA. It has all sorts of interesting and cool characteristics, and by far the most interesting aspect of it is the fact that many cell phones in the same area will transmit on the same frequency at the same time. Unlike TDMA systems, which share a frequency but time-share so that only one is transmitting at once, in CDMA they truly are all transmitting simultaneously. And yet, the cell can take the output of a single receiver and divide it into the individual messages that each phone was sending.

It works that way going out, too. The cell figures out what it wants to send to all those phones, and mixes them all together to be broadcast by a single transmitter. Each individual phone receives the exact same thing, but can separate out the part intended for it, discarding all the information intended for other phones. If you're curious about the details of how this miracle works, I wrote about it here in April.

The overall principle, however, leverages two very powerful concepts from information theory: signal and noise. CDMA is designed to encode all the information in such a way that any given phone can treat the information intended for it as signal, while treating what's being sent to all other phones as noise. In most cases there's much more noise than signal, but as long as there's enough signal, a sophisticated noise rejection algorithm is capable of discarding all the noise, leaving only a clean signal. For it to work, the amount of signal that must exist is an absolute quantity, not a proportion. (Technically, in CDMA the measurement of signal availability is known as "Ec/I0" and it's expressed in dB.) As long as it exceeds a certain threshold (with current-generation hardware, somewhere in the range of 18 dB), the total amount of noise doesn't matter.
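
To make the signal-and-noise picture a bit more concrete, here's a toy sketch in Python. It is not real CDMA (the actual spreading codes, chip rates, and channel structure are nothing like this; the two four-chip Walsh codes below are invented for the example), but it shows the basic trick: everything is summed onto one channel, and each receiver correlates against its own code, so its own bit adds up while the other transmission cancels out like noise.

    # Two hypothetical orthogonal spreading codes, one per phone.
    walsh_a = [+1, +1, +1, +1]
    walsh_b = [+1, -1, +1, -1]

    def spread(bit, code):
        """Turn one data bit (+1 or -1) into a run of chips."""
        return [bit * chip for chip in code]

    def despread(channel, code):
        """Correlate the composite channel against one code; the sign of the sum is the bit."""
        total = sum(sample * chip for sample, chip in zip(channel, code))
        return +1 if total > 0 else -1

    # The cell transmits both phones' chips on the same frequency at the same time.
    composite = [a + b for a, b in zip(spread(+1, walsh_a), spread(-1, walsh_b))]

    print(despread(composite, walsh_a))   # +1 -- phone A recovers its own bit
    print(despread(composite, walsh_b))   # -1 -- phone B recovers its own bit

With only two perfectly synchronized codes the cancellation is exact; in a real system the rest of the traffic doesn't cancel perfectly, and what's left over is, roughly speaking, the noise that the Ec/I0 figure is measuring against.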

In fact, for esoteric reasons having to do with what "bandwidth" means in systems like this (which is non-intuitive), if there is too much signal relative to the noise then the system is wasting capacity. If a phone detects that the cell is sending too much signal, it will send a message to request that the cell use less.

CDMA incorporates extraordinarily large amounts of signal processing in order to work properly. At a level above the transmission of chips and reproduction of bits, the system uses forward error correction. This is a cool technique for transmitting data over unreliable links: it can get the data through without retransmission, as long as the unreliability of the link doesn't exceed a certain level.

The forward error correction encoder takes a packet of bits to be transmitted, say 256 of them, and runs it through an algorithm which is actually similar to the way that some kinds of encryption work, and out the back end you get a somewhat larger packet, say 270 bits. The resulting bit pattern doesn't resemble the incoming one in any way.

And it has odd characteristics: if you were to flip one of those incoming 256 bits and then examine the output packet, about half of the output bits would flip. Flip a different input bit, and again about half of the output bits would flip, but not all the same ones. What this means is that every one of the output bits contains information about every one of the input bits.

The longer packet can be transmitted over an unreliable link which munges a few of the bits in it. At the receiving end, the forward error correction decoder can then reproduce the original 256 bits as long as no more than (say) six of the transmitted bits were munged, no matter which ones they were. What it's doing is taking the information each of the transmitted bits carries about each payload bit and tallying it all up, while assuming that some of them will be lying. As long as enough signal gets through, the noise can be completely rejected.
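
Here's the tallying idea boiled down to its crudest possible form, in Python. Real CDMA uses nothing this simple; this is just a made-up five-times repetition code, but the decoder really does decide each payload bit by counting votes and assuming a few of them are lying:

    REPEAT = 5   # every payload bit is transmitted five times (a made-up figure)

    def fec_encode(bits):
        """Spread each payload bit across REPEAT transmitted bits."""
        return [b for b in bits for _ in range(REPEAT)]

    def fec_decode(coded):
        """Recover the payload by majority vote within each group of REPEAT bits."""
        payload = []
        for i in range(0, len(coded), REPEAT):
            votes = sum(coded[i:i + REPEAT])
            payload.append(1 if votes > REPEAT // 2 else 0)
        return payload

    sent = fec_encode([1, 0, 1, 1])
    sent[3] ^= 1                     # the link munges a couple of bits...
    sent[12] ^= 1
    print(fec_decode(sent))          # ...and the payload survives: [1, 0, 1, 1]

A repetition code pays for its protection with a lot of wasted capacity; the whole point of the schemes real systems use is to get far more correction out of far fewer extra bits.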

More interestingly, if it's constructed properly, the decoder will know when there was too much munging. It either provides the right answer or tells you that it cannot provide any answer at all, permitting you to recover from the transmission failure at a higher level in the stack. The particular FEC scheme used in CDMA relies on a decoding algorithm invented by Dr. Andrew Viterbi, which bears his name: the Viterbi algorithm. (And it's covered by a major patent.)
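
That "right answer or no answer at all" behaviour can be shown with a much simpler code than Viterbi's. The sketch below uses an extended Hamming(8,4) code, which is my choice for the illustration and not anything CDMA uses: one damaged bit gets silently repaired, two damaged bits make the decoder refuse to guess.

    def hamming84_encode(data):
        """Encode 4 data bits into an 8-bit extended Hamming codeword."""
        d1, d2, d3, d4 = data
        p1 = d1 ^ d2 ^ d4
        p2 = d1 ^ d3 ^ d4
        p4 = d2 ^ d3 ^ d4
        word = [p1, p2, d1, p4, d2, d3, d4]
        overall = 0
        for bit in word:
            overall ^= bit               # overall parity across the first seven bits
        return word + [overall]

    def hamming84_decode(word):
        """Return the 4 data bits, or None if the word is too damaged to trust."""
        w = list(word)
        syndrome = 0
        for pos in range(1, 8):
            if w[pos - 1]:
                syndrome ^= pos          # XOR of the positions of all set bits
        parity = 0
        for bit in w:
            parity ^= bit                # comes out even on a clean codeword
        if syndrome and parity:
            w[syndrome - 1] ^= 1         # single error: the syndrome points right at it
        elif syndrome and not parity:
            return None                  # two errors: detected, but uncorrectable
        return [w[2], w[4], w[5], w[6]]

    word = hamming84_encode([1, 0, 1, 1])
    word[2] ^= 1
    print(hamming84_decode(word))        # [1, 0, 1, 1] -- one munged bit, corrected
    word[5] ^= 1
    print(hamming84_decode(word))        # None -- two munged bits, decoder refuses to answer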

Remember those cheap walkie-talkies you can buy in toy stores for kids, which the package claims are good for "up to a quarter of a mile"? My memory of those from when I was a kid was that in practice they didn't carry any further than you could shout. A quarter mile required absolutely ideal conditions, which would be damned rare in the real world.

Those are using simple AM transmissions on one of the CB frequencies, and they transmit at 50 milliwatts (above which power the FCC requires a license). AM is just about as bad as it gets in radio transmission for reliability, and it depends entirely on power to get its signal through. Simple FM is slightly better, but not much. (That's what AMPS cell phones use.)

CDMA cell phones are restricted by the FCC to no more than 200 milliwatts, and usually operate at much lower levels than that, yet their signal can carry more than four miles to a cell. It's all that noise rejection and encoding which makes it work. Technically, it's referred to as "coding gain", and it means that sophisticated signal processing can substitute for raw power.
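
Since coding gain is quoted in dB, it converts straight into a power ratio, and that ratio is the whole trade. A back-of-the-envelope sketch, using a made-up gain figure rather than anything from a real link budget:

    def db_to_ratio(db):
        """Convert a gain in dB into a plain power ratio."""
        return 10 ** (db / 10.0)

    coding_gain_db = 6.0                    # hypothetical figure, not a CDMA spec
    print(db_to_ratio(coding_gain_db))      # ~3.98: the same error rate at roughly 1/4 the transmit power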
