dogdidit
Posts: 315 Joined: Mar. 2008
Quote (goalpost @ Aug. 29 2008,15:30) | Quote (dogdidit @ Aug. 27 2008,13:17) | Quote (goalpost @ Aug. 27 2008,12:21) | Ok, so I send two messages:
Both messages contain a human DNA sequence - ACGT etc etc, each letter coded as two bits, ie 00 = A, 01 = C, 10 = G, 11 = T. This message is non-compressible. |
Actually, it is very compressible if it is a DNA sequence, since codons (triplets of base pairs) code for only 22 possible states - start, stop, and twenty amino acids - even though the symbol set could accommodate 64. So the real measure of information in DNA is no more than 4.5 bits (log2 of 22) for every three base pairs, not 6 bits (log2 of 64). |
You're quite correct, I forgot about that bit. Let me rephrase the question, then: I send 2 messages, both DNA sequences as before. Each is coded in such a way that it may not be further compressed. Each is of the same length, and contains the same number of DNA triplets.
One has 'usefulness' - it codes for a protein. The other doesn't. Does the 'useful' message contain any more information by virtue of its 'usefulness'? |
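The arithmetic in the quoted exchange is easy to check. This is a minimal sketch that assumes every codon meaning is equally likely, which real genomes do not satisfy, so the figure is best read as a ceiling under that assumption. The 22-state count is taken from the quote above, and the function name `bits_per_symbol` is just an illustrative choice.

```python
import math

def bits_per_symbol(num_states: int) -> float:
    """Maximum information per symbol, in bits, when all states are equally likely."""
    return math.log2(num_states)

# Three bases at 2 bits each gives 4**3 = 64 distinguishable codons.
raw_codon_bits = bits_per_symbol(4 ** 3)

# Count used in the quoted post: start, stop, and twenty amino acids.
meaning_bits = bits_per_symbol(22)

print(f"raw: {raw_codon_bits:.2f} bits, by meaning: {meaning_bits:.2f} bits")
```

Under the equal-likelihood assumption this gives 6 bits per codon for the raw encoding and roughly 4.46 bits per codon by meaning, which is where the "no more than 4.5 bits" figure comes from.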
The challenge is coming to an acceptable definition of "information". Shannon's definition had to do with the entropy of the source, but this is a bit confusing because "entropy" is a poorly understood concept (at least, for me). So I prefer to think of the measure of information as the reduction in uncertainty at the receiver, which is also consistent with Shannon's interpretation.
If you are sending me a symbol, and I have no idea what it might be, then each bit of information you send represents a reduction of 50% in my uncertainty. (I'm assuming that we are dealing with symbols selected from a finite and discrete set, all equally likely.) Let's assume you are sending a single hexadecimal digit in binary format. I expect to receive xxxx but I have no idea if the x's are 1's or 0's, so my uncertainty at the outset is that the symbol is in one of sixteen possible states, from 0 to 15. Your first bit -- let's assume it's a 1 -- cuts my uncertainty in half, since now I know that the symbol is of the form 1xxx and therefore the set [0, 7] is ruled out and the symbol must lie in [8, 15]. Eight possible states; half as many as before. Half the uncertainty.
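The halving can be spelled out in a few lines of Python. This is only a sketch of the receiver's bookkeeping, assuming the bits arrive most significant first and all sixteen values are equally likely beforehand. The helper `remaining_candidates` and the example symbol `1011` are made up for illustration.

```python
def remaining_candidates(bits_seen: str, width: int = 4) -> list[int]:
    """Values of a width-bit symbol still consistent with the leading bits received so far."""
    return [
        value
        for value in range(2 ** width)
        if format(value, f"0{width}b").startswith(bits_seen)
    ]

message = "1011"  # hypothetical 4-bit symbol (hex B), sent one bit at a time

for n in range(len(message) + 1):
    prefix = message[:n]
    candidates = remaining_candidates(prefix)
    label = prefix if prefix else "(nothing)"
    print(f"received {label:<9} -> {len(candidates):2d} candidates: {candidates}")
```

The candidate count goes 16, 8, 4, 2, 1: each received bit removes exactly half of what was still possible, whatever the bits happen to mean.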
Notice that at no time do I rule on whether your message is cogent or noise. A 1,000-character post on AtBC contains the same quantity of "information" as a 1,000-character post on UD (shudder!! except we also know that the UD post can't be from kairosfocus) if both are written in the same language.
Applying this to DNA (and here I wander out of engineering and into microbiology - ALERT! ALERT!), mRNA (m = messenger!) is a message from the cell nucleus to the ribosome: "Here. Make this protein". The ribosome will read the mRNA three bases at a time and use those triplets (codons) to determine* which amino acid to append to the polymer it is assembling. It does not care whether the polymer is useful or toxic or garbage or a viroid. The semantics of the message are irrelevant to the measure of information.
* I'm overlooking the role of tRNA and the myriad other helper molecules that help make the magic happen.
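To make the "it does not care" point concrete, here is a toy translator. It is a sketch only: the codon table holds just five entries from the standard genetic code, the two input strings are invented, and the function `translate` is a hypothetical stand-in that ignores tRNA and everything else in the footnote above.

```python
# Partial codon table: a handful of entries from the standard genetic code.
CODON_TABLE = {
    "AUG": "Met",   # also serves as the start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "AAA": "Lys",
    "UAA": "STOP",
}

def translate(mrna: str) -> list[str]:
    """Read mRNA three bases at a time until a stop codon or the end of the message."""
    peptide = []
    usable_length = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        residue = CODON_TABLE.get(mrna[i:i + 3], "???")  # "???" = not in this partial table
        if residue == "STOP":
            break
        peptide.append(residue)
    return peptide

# Two invented messages of the same length; nothing here says which one is "useful".
print(translate("AUGUUUGGCAAAUAA"))
print(translate("AUGAAAAAAGGCUAA"))
```

Nothing in the lookup scores the resulting chain. Both messages are fifteen bases long and are processed identically, so any notion of "usefulness" would have to come from outside the decoding step.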
In information theory, there is AFAIK no way to measure the "usefulness" of the message. (Is a viroid useful? To the virus, it is. Are point mutations useful?) The IDers might wish to extend information theory to do just that, but so far they've not come up with the goods. "FCSI" and other concoctions appear to me to be just so much unsubstantiated wishful thinking. I appreciate their desire to distinguish between "useful" and "useless" information, but the science does not help them and they have not extended the science to do so. Invoking "information theory" in defense of their efforts is nothing more than intellectual hijacking.
Claude Shannon was a brilliant man, and Bell Labs was the "Google" of its day. Interestingly enough, his doctoral thesis was "An Algebra for Theoretical Genetics". However, as far as I can tell he earned only one doctorate.
-------------- "Humans carry plants and animals all over the globe, thus introducing them to places they could never have reached on their own. That certainly increases biodiversity." - D'OL