Topic: Complexity vs. Information
goalpost



Posts: 2
Joined: Aug. 2008

(Permalink) Posted: Aug. 27 2008,12:21   

Ok, so I send two messages:

Both messages contain a human DNA sequence - ACGT etc., each letter coded as two bits, i.e. 00 = A, 01 = C, 10 = G, 11 = T. This message is non-compressible. It has maximum complexity as I understand it.
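
To make the encoding concrete, here's a minimal sketch using standard shell tools (the eight-base input string is just an invented example):
Code Sample
# two bits per base: N bases -> 2N bits
echo "ACGTACGT" | sed 's/A/00/g; s/C/01/g; s/G/10/g; s/T/11/g'
# prints 0001101100011011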

Message one's sequence codes for a protein.
Message two's sequence contains junk DNA.

Does message 1 contain more information?

  
dogdidit



Posts: 315
Joined: Mar. 2008

(Permalink) Posted: Aug. 27 2008,13:17   

Quote (goalpost @ Aug. 27 2008,12:21)
Ok, so I send two messages:

Both messages contain a human DNA sequence - ACGT etc., each letter coded as two bits, i.e. 00 = A, 01 = C, 10 = G, 11 = T. This message is non-compressible.

Actually, it is very compressible if it is a DNA sequence, since codons (triplets of base pairs) code for only 22 possible states - start, stop, and twenty amino acids - even though the symbol set could accommodate 64. So the real measure of information in DNA is no more than 4.5 bits (log2 of 22) for every three base pairs, not 6 bits (log2 of 64).
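
(The arithmetic, as a one-liner for anyone who wants to check it:)
Code Sample
awk 'BEGIN { print log(22)/log(2), log(64)/log(2) }'
# prints 4.45943 6 - bits per codon for 22 states vs. 64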

(There may be other functional purposes served by the apparent redundancy in the DNA code. For starters, it confers some point-mutation error immunity to -some- amino acids...but that's an archival storage reliability issue, not a communication issue. The error-immunity redundancy could be restored at the receiver, but the initial information would be lost. That might matter to molecular biologists who want to use the genome data and point mutation distributions for cladistics and synteny and other cabbalistic darwinian materialist evolander corruptions...)

Quote
It has maximum complexity as I understand it.

What is your measure of complexity?

Quote

Message one's sequence codes for a protein.
Message two's sequence contains junk DNA.

Does message 1 contain more information?

Difficult question. What you're asking is how much entropy (uncertainty) there is in the sequence of amino acids (our message set) in the proteins that make up the human proteome. Are some amino acids rarer than others? Are some amino acid sequences more likely than others? If the answer is yes, then the entropy of the source will be less than that of a source whose symbols have equal probability. That would reduce the information content from 4.5 bits per codon to something less.
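
To see the size of the effect, here's a sketch with an invented skew over four symbols (the 0.7/0.1/0.1/0.1 probabilities are made up for illustration; a uniform four-symbol source would give exactly 2 bits):
Code Sample
awk 'BEGIN { split("0.7 0.1 0.1 0.1", p, " "); for (i = 1; i <= 4; i++) h += p[i] * -log(p[i]) / log(2); print h }'
# prints about 1.35678 - well below the 2 bits of the equiprobable case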

Junk DNA, assuming it is not under selection pressure (else why would it be "junk"?), would be likely to accumulate mutations more rapidly than DNA related to the proteome, yes? Those mutations should help to "shuffle the deck," and over time one would expect the symbol set to drift toward equiprobability. (But never quite get there - equally random sequences of base pairs do not code for equally random sequences of amino acids.) So my guess is that yes, the junk DNA has more information (as defined by information theory) than DNA that codes for proteins.

BTW IANAB. I are uh injineer.

--------------
"Humans carry plants and animals all over the globe, thus introducing them to places they could never have reached on their own. That certainly increases biodiversity." - D'OL

  
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Aug. 27 2008,13:24   

Just to mess things up further:

You can have 10 coders write 10 programs to do the same thing. They are all different and of varying lengths. Then you can compress them. The shortest initial program may not still be the shortest after compression - possibly due to luck, or perhaps the structure of reusable common elements within the code. So which program has the most information, given that bigger can be smaller? And should we be measuring the code, or the inputs and outputs?
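
You can watch "bigger compresses smaller" happen, assuming a Unix shell with bzip2 on hand (file names invented):
Code Sample
awk 'BEGIN { while (n++ < 1000) printf "ab" }' > long_but_regular.txt   # 2000 bytes of repetition
head -c 500 /dev/urandom > short_but_random.bin                         # 500 bytes of noise
bzip2 -k long_but_regular.txt short_but_random.bin
wc -c long_but_regular.txt.bz2 short_but_random.bin.bz2
# the 2000-byte file typically shrinks to ~50 bytes; the 500-byte file doesn't shrink at all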

--------------
"Richardthughes, you magnificent bastard, I stand in awe of you..." : Arden Chatfield
"You magnificent bastard! " : Louis
"ATBC poster child", "I have to agree with Rich.." : DaveTard
"I bow to your superior skills" : deadman_932
"...it was Richardthughes making me lie in bed.." : Kristine

  
slpage



Posts: 349
Joined: June 2004

(Permalink) Posted: Aug. 27 2008,13:57   

Quote (dogdidit @ Aug. 27 2008,13:17)

Actually, it is very compressible if it is a DNA sequence, since codons (triplets of base pairs) code for only 22 possible states - start, stop, and twenty amino acids - even though the symbol set could accommodate 64. So the real measure of information in DNA is no more than 4.5 bits (log2 of 22) for every three base pairs, not 6 bits (log2 of 64).


Yes, but isn't the compression you write of 'conceptual' (I can't think of a better word)?
Sure, you can run a computer file through a compression algorithm and all that, but DNA is physical - more akin to trying to 'compress' a CD as opposed to the 'information' ON the CD, if my point is making any sense.

Quote
Quote

Message one's sequence codes for a protein.
Message two's sequence contains junk DNA.

Does message 1 contain more information?

Difficult question. What you're asking is how much entropy (uncertainty) there is in the sequence of amino acids (our message set) in the proteins that make up the human proteome. Are some amino acids rarer than others? Are some amino acid sequences more likely than others? If the answer is yes, then the entropy of the source will be less than that of a source whose symbols have equal probability. That would reduce the information content from 4.5 bits per codon to something less.

Junk DNA, assuming it is not under selection pressure (else why would it be "junk"?), would be likely to accumulate mutations more rapidly than DNA related to the proteome, yes? Those mutations should help to "shuffle the deck," and over time one would expect the symbol set to drift toward equiprobability. (But never quite get there - equally random sequences of base pairs do not code for equally random sequences of amino acids.) So my guess is that yes, the junk DNA has more information (as defined by information theory) than DNA that codes for proteins.


OK, so while we are discussing hypotheticals, how about this one.

Two DNA sequences, both 1000 bps long, both identical with one exception - one sequence starts with TAA instead of TAC.
The 'functional sequence' has a measured information content of (just tossing out a number here to make it simple) 1000.
Would the non-functional sequence have a content of 999 or 0?

  
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Aug. 27 2008,14:03   

Food for thought - the compressibility of pi:

http://en.wikipedia.org/wiki/Leibniz_formula_for_pi

Pi is a non-recurring, non-terminating decimal.
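
Which is the point: the whole non-terminating expansion comes out of a tiny program. A sketch in awk, summing a million terms of the Leibniz series:
Code Sample
awk 'BEGIN { for (k = 0; k < 1000000; k++) s += (k % 2 ? -4 : 4) / (2 * k + 1); print s }'
# slowly converges on 3.14159...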

--------------
"Richardthughes, you magnificent bastard, I stand in awe of you..." : Arden Chatfield
"You magnificent bastard! " : Louis
"ATBC poster child", "I have to agree with Rich.." : DaveTard
"I bow to your superior skills" : deadman_932
"...it was Richardthughes making me lie in bed.." : Kristine

  
stevestory



Posts: 13407
Joined: Oct. 2005

(Permalink) Posted: Aug. 27 2008,14:07   

Quote (goalpost @ Aug. 27 2008,13:21)
Ok, so I send two messages:

Both messages contain a human DNA sequence - ACGT etc., each letter coded as two bits, i.e. 00 = A, 01 = C, 10 = G, 11 = T. This message is non-compressible. It has maximum complexity as I understand it.

Message one's sequence codes for a protein.
Message two's sequence contains junk DNA.

Does message 1 contain more information?

proteins often have repetitive subunits, so that's another reason the DNA for a protein would be compressible.

   
Turncoat



Posts: 129
Joined: Dec. 2007

(Permalink) Posted: Aug. 27 2008,15:43   

Quote (Wesley R. Elsberry @ Aug. 26 2008,18:14)
Antievolutionists want to confuse and conflate meaning and information. Spetner, Gitt, Truman, and Dembski... all of them want meaning to be folded within whatever sort of "information" they propose.

Shannon's discussion of information explicitly excluded meaning. Algorithmic information theory only cares about one aspect of meaning: what is the shortest program and input that can generate a string?

Critique of Dembski's "complex specified information"

Wes, the one exception to what you say is the Kolmogorov structure function of algorithmic information theory, which I have seen Paul Vitanyi relate to "meaning."

For criticism of the morph of CSI that came after the one you and Shallit addressed, see this.

--------------
I never give them hell. I just tell the truth about them, and they think it's hell. — Harry S Truman

  
Turncoat



Posts: 129
Joined: Dec. 2007

(Permalink) Posted: Aug. 27 2008,16:55   

The two leading notions of the quantity of information in an object are the self-information of Shannon and the algorithmic information of Solomonoff, Chaitin, and Kolmogorov (working more or less independently). Algorithmic information is widely known as Kolmogorov complexity, so distinctions between "complexity" and "information" are not clear-cut. What self-information and algorithmic information have in common is the relation of information to description.

For Shannon, there is an objective probability distribution p on a set of outcomes of a random experiment, and the information in outcome x is -log p(x). (Let's say that the logarithm is base-2, and the unit of information is the bit.) A justification for regarding this as the intrinsic information in x is that if you want to transmit messages indicating the outcomes of repeated experiments to a receiver, and you want to minimize the average number of bits you transmit per outcome, then you and the receiver ideally agree on a code that associates a binary description of length -log p(x) with each possible outcome x.
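
For concreteness, here are the self-information values for a few probabilities (halving p adds one bit):
Code Sample
awk 'BEGIN { for (p = 0.5; p >= 0.0625; p /= 2) print p, -log(p)/log(2) }'
# 0.5 -> 1 bit, 0.25 -> 2 bits, 0.125 -> 3 bits, 0.0625 -> 4 bits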

A problem with Shannon's self-information is the issue of how we know the objective (actual) distribution p in terms of which ideal description length is defined. In algorithmic information theory, description is defined in terms of computation instead of probability. That is, the algorithmic information of a string (finite sequence) of symbols is the length of the shortest binary computer program that outputs the string and halts. To be more specific, the computer is a universal computer, equivalent in "computing power" to a universal Turing machine. If we restrict ourselves to "simple" universal computers, the program length varies little from one computer to the next. To make this more concrete, and to relate it to earlier comments, the program is like a self-extracting zip archive. The program is a compact description of the string x it outputs.
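
To make the self-extracting-archive picture concrete, here is a sketch (the file name is invented): a program of roughly fifty characters whose output is a million characters long. The program length, not the output length, bounds the algorithmic information of the string.
Code Sample
awk 'BEGIN { while (n++ < 500000) printf "01" }' > million.txt
wc -c million.txt   # 1000000 characters from a ~50-character description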

From the equality

description_length = -log p(x),

we may obtain

p(x) = 2^-description_length.

In algorithmic information theory, it is common to define probability in terms of description (shortest program) length. What is known as the universal distribution corresponds roughly to the latter of the equalities I just gave you. So, in a sense, information (ideal description length) "comes from" probability in Shannon's information theory, and probability comes from information (shortest-program length) in algorithmic information theory.

I'm sure I've just thrown way too much at some of you. But some of you have been in the ballpark with your remarks, and I hope this helps a bit.

--------------
I never give them hell. I just tell the truth about them, and they think it's hell. — Harry S Truman

  
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Aug. 27 2008,17:00   

Thanks for your posts, turncoat. Now, if you can put your empirical hat on, why don't we see CSI calculations?

--------------
"Richardthughes, you magnificent bastard, I stand in awe of you..." : Arden Chatfield
"You magnificent bastard! " : Louis
"ATBC poster child", "I have to agree with Rich.." : DaveTard
"I bow to your superior skills" : deadman_932
"...it was Richardthughes making me lie in bed.." : Kristine

  
dogdidit



Posts: 315
Joined: Mar. 2008

(Permalink) Posted: Aug. 27 2008,17:24   

Quote (slpage @ Aug. 27 2008,13:57)
       
Quote (dogdidit @ Aug. 27 2008,13:17)
Actually, it is very compressible if it is a DNA sequence, since codons (triplets of base pairs) code for only 22 possible states - start, stop, and twenty amino acids - even though the symbol set could accommodate 64. So the real measure of information in DNA is no more than 4.5 bits (log2 of 22) for every three base pairs, not 6 bits (log2 of 64).

Yes, but isn't the compression you write of 'conceptual' (I can't think of a better word)?
Sure, you can run a computer file through a compression algorithm and all that, but DNA is physical - more akin to trying to 'compress' a CD as opposed to the 'information' ON the CD, if my point is making any sense.

The OP spoke about using bits to encode the nucleotides:
Quote (goalpost @ Aug. 27 2008,12:21)
Both messages contain a human DNA sequence - ACGT etc., each letter coded as two bits, i.e. 00 = A, 01 = C, 10 = G, 11 = T.

...so I was responding to that. I would agree that compressing functional DNA does not seem possible. Perhaps a very large steam press...

Quote
Quote
Quote

Message one's sequence codes for a protein.
Message two's sequence contains junk DNA.

Does message 1 contain more information?

Difficult question. What you're asking is how much entropy (uncertainty) there is in the sequence of amino acids (our message set) in the proteins that make up the human proteome. Are some amino acids rarer than others? Are some amino acid sequences more likely than others? If the answer is yes, then the entropy of the source will be less than that of a source whose symbols have equal probability. That would reduce the information content from 4.5 bits per codon to something less.

Junk DNA, assuming it is not under selection pressure (else why would it be "junk"?), would be likely to accumulate mutations more rapidly than DNA related to the proteome, yes? Those mutations should help to "shuffle the deck," and over time one would expect the symbol set to drift toward equiprobability. (But never quite get there - equally random sequences of base pairs do not code for equally random sequences of amino acids.) So my guess is that yes, the junk DNA has more information (as defined by information theory) than DNA that codes for proteins.


OK, so while we are discussing hypotheticals, how about this one.

Two DNA sequences, both 1000 bps long, both identical with one exception - one sequence starts with TAA instead of TAC.
The 'functional sequence' has a measured information content of (just tossing out a number here to make it simple) 1000.
Would the non-functional sequence have a content of 999 or 0?

1000. That assumes a C is as likely as an A. Functionality ("semantic content") is irrelevant.

@Turncoat: yep, I am using Shannon's definition (and thanks for not mentioning my errors).

--------------
"Humans carry plants and animals all over the globe, thus introducing them to places they could never have reached on their own. That certainly increases biodiversity." - D'OL

  
Turncoat



Posts: 129
Joined: Dec. 2007

(Permalink) Posted: Aug. 27 2008,17:27   

Almost all binary strings x have no program ("self-extracting zip archive") much shorter than themselves. A string that is in this sense incompressible is algorithmically random by definition. The definition makes sense, though, because an algorithmically random string x passes all computable tests of randomness. In essence, there is no effective procedure that allows you to say that the bits in x did not come from a random source.
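
You can see this with any real compressor, assuming /dev/urandom and bzip2 are available (file name invented):
Code Sample
head -c 100000 /dev/urandom > noise.bin
bzip2 -k noise.bin
wc -c noise.bin noise.bin.bz2   # the .bz2 file comes out no smaller - typically slightly larger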

The problem some of you have had in experimenting with compression of texts is that you haven't taken into account the fact that you're working with strings of (presumably) 8-bit characters. If I recall correctly, the bzip utility compresses to bit strings instead of character strings. This is what you need for your experiments.

For large N, a string of N digit-bytes (8 bits per byte) coming from a random source will compress from 8N bits (always) to about N log2(10) ≈ 3.32N bits (almost always). The high compression ratio comes only from the inefficiency of an 8-bit representation of textual digits.
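
A sketch of that experiment (the file name is invented; awk's rand() stands in for a random digit source):
Code Sample
awk 'BEGIN { srand(); for (i = 0; i < 100000; i++) printf "%d", int(10 * rand()) }' > digits.txt
bzip2 -k digits.txt
wc -c digits.txt digits.txt.bz2
# expect roughly 100000 -> low-40000s of bytes, i.e. close to the log2(10)/8 ratio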

There are similar considerations with Macbeth. There are not 2^8 = 256 letters, spaces, and punctuation marks in Shakespeare's text, so you get some compression simply because the initial 8-bit representation of characters is inefficient. At first blush, I would say that the number of distinct characters is no more than 64. Thus to get a better estimate of the compressibility of Macbeth, count 6 bits per byte of the source text, not 8. Count 8 bits per byte of bzip output. (Bzip may tell you the number of bits in the output -- I haven't used it in a long time.)

--------------
I never give them hell. I just tell the truth about them, and they think it's hell. — Harry S Truman

  
Henry J



Posts: 5786
Joined: Mar. 2005

(Permalink) Posted: Aug. 27 2008,17:55   

Turncoat,

That's kind of what I was getting at in my post last night.

Henry

  
Turncoat



Posts: 129
Joined: Dec. 2007

(Permalink) Posted: Aug. 27 2008,18:08   

Quote (Richardthughes @ Aug. 27 2008,17:00)
Thanks for your posts, turncoat. Now, if you can put your empirical hat on, why don't we see CSI calculations?

RIP those who were waiting with bated breath to see Dembski's computation of CSI for the bacterial flagellum.

To apply CSI, you have to come up with an upper bound on the probability that a configuration of matter arose by purely natural processes. In principle, that does not reduce to argument from ignorance. In practice, you run afoul of argument from ignorance whenever you go from something like coin tossing to the  flagellum. We know all we're going to know about outcomes of tossing a fair coin, but we don't know how much we might eventually know about natural processes giving rise to the flagellum. From a Bayesian perspective, scientific learning generally increases the probability of observed phenomena. How is Dembski going to place an upper bound on the probability a future model might assign justifiably to the emergence of the flagellum by natural processes?

I think that assignment of probabilities to explanations of historical events that occurred under largely unknown circumstances is a philosophical quagmire. Dembski complains that evolutionary models are not sufficiently detailed. I would contend that evolutionary models are correct in reflecting our ignorance of the details. We cannot turn the emergence of the flagellum into a repeatable, controlled experiment, and I have no idea how Dembski justifies his frequentism in the development of CSI.

--------------
I never give them hell. I just tell the truth about them, and they think it's hell. — Harry S Truman

  
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Aug. 27 2008,18:35   

They have compressed it to "bac flag", though. Progress!
???

--------------
"Richardthughes, you magnificent bastard, I stand in awe of you..." : Arden Chatfield
"You magnificent bastard! " : Louis
"ATBC poster child", "I have to agree with Rich.." : DaveTard
"I bow to your superior skills" : deadman_932
"...it was Richardthughes making me lie in bed.." : Kristine

  
Turncoat



Posts: 129
Joined: Dec. 2007

(Permalink) Posted: Aug. 27 2008,18:47   

Quote (Henry J @ Aug. 27 2008,17:55)
Turncoat,

That's kind of what I was getting at in my post last night.

Henry

Yep. Hadn't read that far yet.

--------------
I never give them hell. I just tell the truth about them, and they think it's hell. — Harry S Truman

  
Turncoat



Posts: 129
Joined: Dec. 2007

(Permalink) Posted: Aug. 27 2008,18:53   

Quote (Richardthughes @ Aug. 27 2008,18:35)
They have compressed it to "bac flag", though. Progress!
???

I saw "BF" yesterday. You may see that as further progress, but I say that when the information content of the message is zero, the waste of bandwidth is infinite.

--------------
I never give them hell. I just tell the truth about them, and they think it's hell. — Harry S Truman

  
steve_h



Posts: 544
Joined: Jan. 2006

(Permalink) Posted: Aug. 27 2008,20:29   

Quote
There are similar considerations with Macbeth. There are not 2^8 = 256 letters, spaces, and punctuation marks in Shakespeare's text, so you get some compression simply because the initial 8-bit representation of characters is inefficient. At first blush, I would say that the number of distinct characters is no more than 64. Thus to get a better estimate of the compressibility of Macbeth, count 6 bits per byte of the source text, not 8. Count 8 bits per byte of bzip output. (Bzip may tell you the number of bits in the output -- I haven't used it in a long time.)

I just downloaded Macbeth from Project Gutenberg and compressed it with the latest version of WinZip. It reduced from 116 KB to 31.5 KB. Instead of one character being stored in eight bits, it required a little over two bits. That's rather better than what you'd get by storing random six-bit characters.
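
Checking the arithmetic on that:
Code Sample
awk 'BEGIN { print 31.5 / 116 * 8 }'   # about 2.17 bits per character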

  
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Aug. 27 2008,20:52   

All this experimenting is not very congruent with ID, folks. Back to navel-gazing and hand-waving, please.

--------------
"Richardthughes, you magnificent bastard, I stand in awe of you..." : Arden Chatfield
"You magnificent bastard! " : Louis
"ATBC poster child", "I have to agree with Rich.." : DaveTard
"I bow to your superior skills" : deadman_932
"...it was Richardthughes making me lie in bed.." : Kristine

  
Turncoat



Posts: 129
Joined: Dec. 2007

(Permalink) Posted: Aug. 27 2008,21:44   

Quote (Richardthughes @ Aug. 27 2008,20:52)
All this experimenting is not very congruent with ID, folks. Back to navel-gazing and hand-waving, please.

Have you forgotten how amazingly "creative" one can be with MatLab?

--------------
I never give them hell. I just tell the truth about them, and they think it's hell. — Harry S Truman

  
Turncoat



Posts: 129
Joined: Dec. 2007

(Permalink) Posted: Aug. 27 2008,22:26   

Here's a little shell script for getting word counts (case insensitive) from a text:
Code Sample
#!/bin/sh

tr -d "'" |                             # drop apostrophes so "don't" counts as one word
tr -cs "[:alpha:]" "\n" |               # turn each run of non-letters into a newline: one word per line
tr "[:upper:]" "[:lower:]" |            # fold to lowercase
sort |
uniq -c |                               # count occurrences of each distinct word
sort -rn |                              # most frequent first
awk '{ cum += $1; print $1, cum, $2 }'  # frequency, cumulative frequency, word


It says that Macbeth contains 18596 instances of 3379 distinct words. Here are the ten most frequent words:

740 740 the
579 1319 and
385 1704 to
368 2072 of
335 2407 i
284 2691 macbeth
253 2944 a
233 3177 that
207 3384 in
202 3586 you

The first column is word frequency, and the second is cumulative frequency. Applying this command
Code Sample
# Shannon entropy in bits: sum of -p * log2(p), with p = count / 18596 total words
awk '{ h += $1 / 18596 * -log($1 / 18596) / log(2) } END { print h }'

to the output, I get per-word entropy of about 9.357 bits. It would have been nice to treat punctuation marks as words, but I don't happen to have a script on hand that does that.

--------------
I never give them hell. I just tell the truth about them, and they think it's hell. — Harry S Truman

  
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Aug. 27 2008,22:32   

Quote
It would have been nice to treat punctuation marks as words, but I don't happen to have a script on hand that does that.


I'm guessing they have low compressibility given their single-character nature? You could perhaps make capitalization at the start of a sentence a rule and then compress that way?

--------------
"Richardthughes, you magnificent bastard, I stand in awe of you..." : Arden Chatfield
"You magnificent bastard! " : Louis
"ATBC poster child", "I have to agree with Rich.." : DaveTard
"I bow to your superior skills" : deadman_932
"...it was Richardthughes making me lie in bed.." : Kristine

  
Turncoat



Posts: 129
Joined: Dec. 2007

(Permalink) Posted: Aug. 28 2008,01:05   

Quote (Richardthughes @ Aug. 27 2008,22:32)
 
Quote
It would have been nice to treat punctuation marks as words, but I don't happen to have a script on hand that does that.


I'm guessing they have low compressibility given their single-character nature? You could perhaps make capitalization at the start of a sentence a rule and then compress that way?

Periods and commas have low self-information (are given short encodings) because they are of high probability. Most capitalization can be recovered from punctuation. My transformation to lowercase of words that are always capitalized (e.g., "Macbeth") has no effect on the entropy calculation.

My brain is fried, so I'm not good for any real work. Perhaps I'll diddle a bit with punctuation.

--------------
I never give them hell. I just tell the truth about them, and they think it's hell. — Harry S Truman

  
Turncoat



Posts: 129
Joined: Dec. 2007

(Permalink) Posted: Aug. 28 2008,01:59   

Here is a preprocessing script. Run Macbeth through this first, and then run the output through the filter (script) I gave above.
Code Sample
#!/bin/sh

# Spell out punctuation marks. Leave apostrophe as part of word.
# Treat end-of-line as punctuation for processing of verse.
# This works well for Shakespeare from Gutenberg Project.
# Manually remove copyright notice prior to each act.

sed '/^ *$/d'   |
sed "
s/'/papostrophe/g
s/,/ pcomma /g
s/\./ pperiod /g
s/\!/ pbang /g
s/\?/ pquestion /g
s/-/ phyphen /g
s/\[/ plbracket /g
s/\]/ prbracket /g
s/;/ psemicolon /g
s/:/ pcolon /g
s/(/ plparenthesis /g
s/)/ prparenthesis /g
s/$/ pendofline/"

I decided that end-of-line should be treated as punctuation in verse. I also noticed that there were copyright notifications prior to Acts II-V that I had not removed in my previous analysis. Now there are 25,307 "words," and the per-word entropy is 8.228 bits. Yes, the per-word entropy went down because the empirical distribution concentrates  a great deal of probability mass on punctuation pseudo-words. Here are the twenty most frequent words and pseudo-words:

2625 2625 pendofline
1873 4498 pperiod
1646 6144 pcomma
736 6880 the
567 7447 and
385 7832 to
356 8188 of
318 8506 i
284 8790 macbeth
253 9043 a
238 9281 pquestion
229 9510 that
208 9718 psemicolon
208 9926 pbang
207 10133 in
202 10335 you
192 10527 my
183 10710 is
164 10874 not
155 11029 with
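
From those counts the self-information of individual tokens follows directly; for instance, pperiod (1873 of 25307) versus a hypothetical word that occurs only once:
Code Sample
awk 'BEGIN { n = 25307; print -log(1873/n)/log(2), -log(1/n)/log(2) }'
# about 3.76 bits for pperiod, 14.63 bits for a once-only word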

--------------
I never give them hell. I just tell the truth about them, and they think it's hell. — Harry S Truman

  
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Aug. 28 2008,08:58   

WRT punctuation rules, which become part of the codec in our examples - I guess this sort of thing works if:

1) The rule is lossless
2) The cost of storing the rule < the savings from compressing the examples?

--------------
"Richardthughes, you magnificent bastard, I stand in awe of you..." : Arden Chatfield
"You magnificent bastard! " : Louis
"ATBC poster child", "I have to agree with Rich.." : DaveTard
"I bow to your superior skills" : deadman_932
"...it was Richardthughes making me lie in bed.." : Kristine

  
Wesley R. Elsberry



Posts: 4991
Joined: May 2002

(Permalink) Posted: Aug. 28 2008,09:42   

Quote (Turncoat @ Aug. 27 2008,21:44)
Quote (Richardthughes @ Aug. 27 2008,20:52)
All this experimenting is not very congruent with ID, folks. Back to navel-gazing and hand-waving, please.

Have you forgotten how amazingly "creative" one can be with MatLab?

That was fun, though the creative folks just turned around and re-asserted everything on their say-so rather than rely on the authority of their script.

I once actually had a program that worked properly (that is, the results were computed accurately), but I had failed to initialize pointers. Everything worked fine up until the program ended, at which point my computer would reboot itself. The uninitialized pointers apparently happily pointed into memory regions used by MS-DOS... worked fine while my program was doing its thing, but it hammered the in-memory parts of some system stuff, COMMAND.COM and MSDOS.SYS or similar bits. That, though, was enough of an inducement to track down the problem.

I would have thought that coming up with results so starkly inconsistent with decades of peer-reviewed research would have given the MATLAB programmer(s) pause, but apparently since the error went in a direction parallel to their prejudices, it seemed not to raise any alarm bells.

--------------
"You can't teach an old dogma new tricks." - Dorothy Parker

    
Wesley R. Elsberry



Posts: 4991
Joined: May 2002

(Permalink) Posted: Aug. 28 2008,09:46   

That reminds me... has anyone seen the Marks/Dembski collaborations appear in print anywhere yet?

Though I suppose that if they do, notice is likely to be given the IDC equivalent of a ticker-tape parade, appearing on the DI blog, the ID-the-Future blog, UD, TT, and however many DO'L blogs there are at the time.

--------------
"You can't teach an old dogma new tricks." - Dorothy Parker

    
slpage



Posts: 349
Joined: June 2004

(Permalink) Posted: Aug. 28 2008,11:06   

Quote (dogdidit @ Aug. 27 2008,17:24)

The OP spoke about using bits to encode the nucleotides:
 
Quote (goalpost @ Aug. 27 2008,12:21)
Both messages contain a human DNA sequence - ACGT etc., each letter coded as two bits, i.e. 00 = A, 01 = C, 10 = G, 11 = T.

...so I was responding to that. I would agree that compressing functional DNA does not seem possible. Perhaps a very large steam press...


Indeed.

This has always sort of bugged me in these discussions - talk of compressibility and information and DNA.
Quote
Quote
OK, so while we are discussing hypotheticals, how about this one.

Two DNA sequences, both 1000 bps long, both identical with one exception - one sequence starts with TAA instead of TAC.
The 'functional sequence' has a measured information content of (just tossing out a number here to make it simple) 1000.
Would the non-functional sequence have a content of 999 or 0?

1000. That assumes a C is as likely as an A. Functionality ("semantic content") is irrelevant.

@Turncoat: yep, I am using Shannon's definition (and thanks for not mentioning my errors).


Interesting. Funny - when I present IDcretos with similar scenarios, they get themselves into a tizzy and can never seem to even try to address the question.

  
stevestory



Posts: 13407
Joined: Oct. 2005

(Permalink) Posted: Aug. 28 2008,13:33   

Quote (Wesley R. Elsberry @ Aug. 28 2008,10:46)
however many DO'L blogs there are at the time.

That's a subject best left to the guy who managed to calculate Disaster Area's tax returns.

   
Turncoat



Posts: 129
Joined: Dec. 2007

(Permalink) Posted: Aug. 28 2008,15:55   

Quote (Wesley R. Elsberry @ Aug. 28 2008,09:46)
That reminds me... has anyone seen the Marks/Dembski collaborations appear in print anywhere yet?

Though I suppose that if they do, notice is likely to be given the IDC equivalent of a ticker-tape parade, appearing on the DI blog, the ID-the-Future blog, UD, TT, and however many DO'L blogs there are at the time.

Nothing in print that I know of. The fact is that I'll be obsoleting anything they publish, anyway.

--------------
I never give them hell. I just tell the truth about them, and they think it's hell. — Harry S Truman

  
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Aug. 28 2008,15:58   

Quote (Turncoat @ Aug. 28 2008,15:55)
The fact is that I'll be obsoleting anything they publish, anyway.

*Proffers High-Five*  :p

--------------
"Richardthughes, you magnificent bastard, I stand in awe of you..." : Arden Chatfield
"You magnificent bastard! " : Louis
"ATBC poster child", "I have to agree with Rich.." : DaveTard
"I bow to your superior skills" : deadman_932
"...it was Richardthughes making me lie in bed.." : Kristine

  