One of the important functions fulfilled by governments is to provide enforcement of standards, like weights and measures. Having clear and objective measures is essential for trade. History provides many instances of people putting rather a lot of effort into making sure that measurements are made fairly. I think that those who wish to make arguments whose claims are couched in terms of quantities must provide the scale by which those quantities are determined. If they do not do so, and it appears that the arguments presented otherwise show certain flaws or inconsistencies, I think it is perfectly appropriate to classify the original claims as mistaken apologetics or polemics.

From Lee Spetner:

LS>Thank you for forwarding me the questions that have arisen
LS>about how I define and measure genetic information. The
LS>presumption is that unless I quantify the information in a
LS>gene, I am not entitled to say, for any mutation, whether
LS>the gene gains or loses information. I reject that
LS>presumption.

The presumption is that anyone who asserts that certain quantities differ can provide an objective measure by which those quantities are derived. This seems a generally reasonable sort of presumption. I might have guessed that Spetner would reject the presumption. What remained was to see how good an argument Spetner could muster for avoiding quantifying information change such that everyone could objectively do the job. I remain unconvinced that Spetner's argument does more than show that certain classes of change must necessarily show a decrease in information under some relevant information measure. Spetner does not show that the sub-class of genetic change mentioned comprises all possible genetic change.
LS>Before addressing the issue of quantifying the information
LS>in a gene, let me point out that all the random mutations I
LS>discussed in my book (and by extension, all known mutations
LS>whose molecular structures have been examined) cannot serve
LS>as prototypes for the mutations that are supposed to make
LS>up the long series of evolutionary steps claimed by
LS>neo-Darwinists to have led to major evolutionary advances.

Spetner's opinion is noted.

LS>They cannot serve as prototypes of the mutations in the
LS>steps that are supposed to have led from a single cell to
LS>an insect, from a fish to a mammal, and so on. Most of
LS>these mutations are single-nucleotide substitutions that
LS>disable a control gene. Disabling a gene cannot be a recipe
LS>for evolutionary advance. Although sometimes, perhaps, a
LS>gene would have to be disabled in the course of evolving a
LS>new enzyme, such disabling cannot represent a major portion
LS>of what has to occur to achieve a new function. It cannot
LS>even represent a small fraction of what must occur. Most
LS>mutations in a putative series of evolutionary steps
LS>leading to a new species or a new order, class, or phylum,
LS>must add to the genome the information necessary to achieve
LS>that advance. It should be clear that information must be
LS>added to the genome to evolve a bacterium into a human, or
LS>even into a fruit fly. One who insists that it is not
LS>obvious that a human genome contains vastly more
LS>information than that of a bacterium is a sophist.

On the other hand, those who would argue that the human genome contains more information than the genomes of certain amoebae or amphibians simply are not familiar with the data. But already Spetner has introduced *meaning* into the discussion. Information does not have any simple relationship with meaning.
Whether a point mutation disables, enables, or re-enables a protein product makes no difference whatever under a Shannon measure of information applied to the sequence of base pairs. And so one possibility, that Spetner simply utilizes a Shannon information measure as his scale, is shown to be false. LS>If no mutation that has been studied is of the type needed LS>for neo-Darwinian macroevolution, then there is no LS>molecular evidence that random mutations and natural LS>selection can achieve that evolution. Even if one credulously assumes the premise is true, this is a non sequitur. Research can (and does) uncover the condition of linkage disequilibrium, which is indicative of the action of natural selection, although identifying the mutation or type of mutation which led to that condition is a separate and tougher problem. The molecular evidence for natural selection, though, exists with or without that identification. Nor do I accept that no such mutations have been studied. LS>Sure, we know many single-nucleotide substitutions that can LS>lead to microevolution. But there is no argument about LS>microevolution. My argument is against the premise that LS>random mutation, even with the help of natural selection, LS>is the driving force behind an evolutionary advance from a LS>primitive cell to human beings. There is no genetic LS>evidence for such a premise. A relevant question is whether Spetner can admit that any such evidence is possible. If no amount of evidence taken from modern organisms can be accepted by Spetner as having relevance to the issue, then it would seem that the claim, while sounding portentous, means nothing. But I'm looking at a claim at a different level. It has been said here, with the invocation of Spetner as an authority, that no mutation can in principle increase genetic information. 
It is this claim that most clearly needs the quantification method specified so that we can resolve whether to consider it true (supported by the available evidence), false (contradicted by evidence), or rhetorical (evidence is superfluous to acceptance of the statement). [Quote] "Information theory, which was introduced as a discipline about half a century ago by Claude Shannon, has thrown new light on this problem. It turns out that random variation cannot lead to large evolutionary changes. The information required for large-scale evolution cannot come from random variations. There are cases in which random mutations do lead to evolution on a small scale. But it turns out that, in these instances, no information is added to the organism. Most often, information is lost. A process that adds no heritable information into the organism cannot lead to the grand evolutionary advances envisioned by the neo-Darwinians." (Spetner, 1997 p.vii) [End Quote - SE Jones quoting Lee Spetner, post on 1998/06/18] I find the invocation of Shannon in this manner amusing, since his analysis cited all sorts of stochastic sources as producing information. LS>I submit that one need not measure the information in a LS>gene to know if a particular mutation has added or LS>subtracted information. I submit that if one wants to make a claim about comparing quantities, one had better provide a means of measurement. LS>There is no general way of measuring the information in a LS>single message without relating it to the ensemble of LS>messages from which it was chosen. If one's task is to precisely determine the absolute number of bits of information in a message such that this number does not change, then Spetner's comment is both relevant and correct. But our task is to determine whether a change in some message causes an information increase, decrease, or no change. 
For this purpose, one does not need to have an absolute measure of information; a relative measure may serve quite well, even if we have to specify a number of assumptions in order to apply it. Since both our original analyzand and the changed version represent messages taken from the same ensemble of messages (or ergodic source to use Shannon's terminology), we could specify a minimal ensemble that covers production of both messages as a basis for measuring their information content. We then find the information content of each message relative to this minimal ensemble, and can then compare the numbers, which tell us which of the two messages has the greater information content. Of course, this does not tell us in some absolute sense how many bits of information each message contains, but that data was not necessary to resolve our problem of interest. If we have knowledge of the ensembles of possible messages, then H is determined with respect to the known statistical structure of the ensemble. That statistical structure concerns the probability of each symbol, and sequence probabilities where the prior symbols in the sequence are correlated with following symbols with characteristic probabilities. But this statistical structure is a finding of empirical research in the biological cases that we are discussing, and thus results from a process of discovery. Our initial state is ignorance of the statistical properties of possible ensembles. But by Shannon's definition of an ergodic source, we know that the statistical structure will be reflected to some degree in every message generated by it. We can characterize the ensemble by the properties of the messages taken from it. If Spetner's measure of information does not retain this property, that is an important and telling datum. 
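As a rough illustration of the point that messages reflect the statistics of their source, the following Python sketch draws messages of increasing length from a hypothetical source and compares each message's per-symbol entropy with that of the source. The probabilities in `SOURCE` are invented for the example, not measured from any genome, and the source is assumed to be memoryless, which is a simplification of the ergodic sources Shannon treats (those may have correlations between successive symbols).

```python
import math
import random
from collections import Counter

# Hypothetical memoryless source over the four bases; these
# probabilities are illustrative assumptions, not measured values.
SOURCE = {"A": 0.4, "C": 0.3, "G": 0.2, "T": 0.1}

def entropy(probs):
    """Shannon entropy, in bits per symbol, of a probability distribution."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

def empirical_entropy(message):
    """Entropy of the symbol frequencies observed in a single message."""
    n = len(message)
    return entropy(c / n for c in Counter(message).values())

def sample_message(length, rng):
    """Draw a message of the given length from SOURCE."""
    symbols = list(SOURCE)
    weights = list(SOURCE.values())
    return "".join(rng.choices(symbols, weights=weights, k=length))

def entropy_gap(length, seed=0):
    """Absolute difference between message entropy and source entropy."""
    rng = random.Random(seed)
    message = sample_message(length, rng)
    return abs(empirical_entropy(message) - entropy(SOURCE.values()))

if __name__ == "__main__":
    for length in (10, 100, 10_000, 100_000):
        print(length, round(entropy_gap(length), 4))
```

With longer messages the gap should tend to shrink, which is the sense in which one can characterize an unknown source from the messages it emits.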
If despite what I pose above, Spetner finds that even a relative information measure is inappropriate, then I submit that broad claims about whether mutational processes increase or decrease information are also inappropriate, as no suitable metric can be applied in general as a basis for making the claim. LS>Similarly, there is no general way of measuring the entropy LS>in a single message without relating it to the ensemble of LS>messages of which it is a member. So, under a Spetnerian information measure, no analysis can occur without the complete knowledge of the ensemble of messages from which some particular message of interest is taken. This is just about completely inverted from Shannon's analysis, where the properties of the message approach the properties of the ensemble in the limit as the message length approaches infinity. The longer the message, the less likely it is that the statistical distribution of symbols within it might differ significantly from those of the ensemble of possible messages. (And for an infinite-length message, that likelihood is demonstrably *zero* under Shannon's analysis.) If Spetner's measure of information does not retain this property whereby sources can be characterized mathematically from the contents of messages, then it is highly questionable whether it ought to be applied to any real-world situation where our knowledge of such sources is the result of a process of discovery. Specifically, it should not be applied to genetic information if it does not have this property of Shannon's measure. LS>Shannon was careful to avoid relating the information LS>measure he was defining to the meaning contained in a LS>message. The communication engineer must build a LS>communication channel that will faithfully transmit a LS>message regardless of how much meaning the customer LS>attaches to that message. 
Yes, the distinction that Shannon made between information and meaning is one clue that what Spetner means by "information" is not what Shannon meant by "information". [Quote] "You can easily add symbols to a message and not add information: just add random symbols. Then you won't be adding information - you'll be adding only nonsense. Similarly, if you add random nucleotides to the genome you add no information. Symbols without meaning carry no information." (Spetner, 1997 p83) [End Quote - SE Jones quoting Lee Spetner, post on 1998/06/18] LS>There is no adequate definition of the information in a LS>message without relating it to the ensemble of messages LS>that could have been sent. This is a statement that there is no adequate definition of information that would yield an *absolute* measure that was not subject to change if our knowledge of the ensemble changed. But the lack of an *absolute* information measure may not be a reason to avoid using some information measure to *compare* two messages taken from the same ensemble, as described above. If lack of this absolute information measure is asserted to apply regardless, then claims about quantities of information before and after mutation should be modified to reflect the subjective and speculative nature of the claim. LS>Thus I cannot expect to measure the information in an LS>arbitrary paragraph of English text. Nor can I expect to LS>measure the information in a section of a genome. Spetner is overlooking the possibility of quantifying information for comparative purposes. We can get a relative information measure and thus objectively determine whether Paragraph A contains more, less, or the same amount of information as Paragraph A' which is Paragraph A after alteration under a specified set of assumptions. And we can do the same for two sections of a genome, one of which represents a change from the other. See for a brief discussion of information entropy as applied to sections of genomes. 
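The relative comparison described here can be sketched in a few lines of Python. The sketch assumes the simplest possible "minimal ensemble": a memoryless model whose symbol probabilities are estimated from the two messages pooled together. The function names and the toy sequences are hypothetical, and a model of this kind ignores correlations between neighboring symbols.

```python
import math
from collections import Counter

def pooled_model(*messages):
    """Symbol probabilities estimated from all messages pooled together.

    This stands in for the shared minimal ensemble; it is an assumed,
    deliberately simple model with no sequence correlations.
    """
    counts = Counter()
    for message in messages:
        counts.update(message)
    total = sum(counts.values())
    return {symbol: c / total for symbol, c in counts.items()}

def information_bits(message, model):
    """Total surprisal of a message, in bits, under the given model."""
    return sum(-math.log2(model[symbol]) for symbol in message)

def compare(original, changed):
    """Information of each message, in bits, relative to the pooled model."""
    model = pooled_model(original, changed)
    return information_bits(original, model), information_bits(changed, model)

# Toy example: one A is replaced by a C, which is rare in the pooled ensemble.
before, after = compare("AAAAAAGT", "AAAAACGT")
print(f"before={before:.3f} bits, after={after:.3f} bits, "
      f"change={after - before:+.3f} bits")
```

The two numbers have no absolute significance; they depend on the assumed model. What they do give is an objective answer, under stated assumptions, to the question of which message carries more information.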
LS>But whatever the information in a paragraph of text, if I LS>struck out one or more sentences, I can be sure that I have LS>not increased the information. Rather, I can confidently LS>say that I have decreased the information. This is true under a Shannon measure. The number of bits in a message increases with the length of the message, and thus a deletion reduces the number of bits. But without seeing Spetner's information measure, it is impossible to know whether information goes up, down, or sideways with deletions when it is employed. It is always possible that Spetner intended and believed his equation for information to make information decrease with decreasing message length, but that it might not actually do so. Until Spetner produces the equation, this point is in doubt. LS>(I exclude the case in which the paragraph was nonsense and LS>didn't contain any information to begin with. In such a LS>case the information was zero both before and after I LS>struck out the sentences.) Again, I have to wonder what information measure Spetner uses. Under Shannon, information and meaning are separate concerns. But Spetner's equation of "nonsense" and lack of information content conflates and confuses meaning with information. Under Shannon, those messages with zero information content all share the property of being a message comprised of one symbol repeated over and over again. Those messages with maximal information content are those with the property of equal proportions of each symbol within the message. This has nothing whatever to do with the meaning that is taken from the message. Once one introduces meaning into the discussion (other than to exclude it from further discussion), it is not clear that information theory is still being discussed. LS>This example shows that indeed one can sometimes determine LS>whether a change in a message has decreased the information LS>without having quantified the information of the original LS>message. 
Only by knowing the properties of a relevant information measure can this assertion be made. The assertion is consistent with the properties of a Shannon information measure. I don't know what the Spetner information measure looks like, except that it cannot be the Shannon measure. By symmetry, though, one can by Spetner's argument also determine that certain changes necessarily increase information without having a quantification in hand. LS>I hold that the disabling of a genetic function is a LS>decrease in information. This is not necessarily so. Even if we allow meaning to be conflated with information, it would depend upon the manner in which disabling occurs. Spetner has discussed the case in which a point mutation causes a disabling of the function of the protein product. This introduces an argument which is premised upon meaning rather than information. A change of base due to a point mutation does not necessarily change message length, and may only alter the information content by some small fraction of the total number of bits needed to represent the entire allele. A point mutation is only occasionally going to alter the information content of any biological allele by more than just a few bits (and in those cases it will be due to changing the allele length by altering some codon such that it either ceases to be a stop codon or becomes a stop codon), even if it rather drastically alters the function one obtains from its protein product. 
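To make the bookkeeping concrete, here is a small Python sketch that applies single-base substitutions to a toy sequence and reports the change in a Shannon estimate. It assumes the simplest estimate, message length times the per-symbol entropy of the message's own base frequencies. The sequence and helper names are invented for illustration, and an eight-base toy exaggerates the relative size of the change compared with a real allele of hundreds or thousands of bases.

```python
import math
from collections import Counter

def bits_per_symbol(seq):
    """Per-symbol Shannon entropy of the base frequencies within seq."""
    n = len(seq)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(seq).values())

def total_bits(seq):
    """Message length times per-symbol entropy."""
    return len(seq) * bits_per_symbol(seq)

def substitute(seq, pos, base):
    """Return seq with the base at index pos replaced (length unchanged)."""
    return seq[:pos] + base + seq[pos + 1:]

original = "AAACCGGT"  # counts: A=3, C=2, G=2, T=1
mutants = {
    "A->C at 0 (frequencies swap)": substitute(original, 0, "C"),
    "T->A at 7 (less heterogeneous)": substitute(original, 7, "A"),
    "A->T at 0 (more heterogeneous)": substitute(original, 0, "T"),
}

for label, seq in mutants.items():
    change = total_bits(seq) - total_bits(original)
    print(f"{label}: {seq} length={len(seq)} change={change:+.3f} bits")
```

None of the three substitutions changes the message length, and the sign of the change depends only on what happens to the base frequencies, not on what the substitution does to the function of any protein product.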
Disabling a genetic function may thus, even in the case of a point mutation, be associated with a decrease in information (e.g., truncation of allele due to change to an early stop codon or decrease in heterogeneity of bases within the allele), no change in information (e.g., change of codon in a critical region to code for an amino acid that reduces or eliminates function but where the base changed to has the same frequency of occurrence as the original base), or an increase in information (e.g., change of the stop codon to another codon effectively lengthening the allele or an increase in heterogeneity of bases within the allele). Another class of cases can be brought up, where a change in another gene causes the production of a protein product which acts to inhibit the function of another protein product, or which causes another gene to be repressed. In these cases, the inhibition of function of one protein or the repression of a gene is actually the function of another protein, and thus represents an increase in information. LS>Disabling a repressor gene is a decrease of LS>information. It depends upon how it is done, as explained above. Take an analogy from neurobiological modeling. A stable neural circuit is much easier to implement when one has both excitatory and inhibitory synapses to work with. The addition of inhibitory synapses to an unstable circuit can stabilize it. Function improves, and the underlying complexity of the neural structure also increases. This kind of inhibition represents an information increase, not a decrease. This can be contrasted to increasing the effect of already existing inhibition in such a circuit by the expedient of removing excitatory synapses, which would correspond to the "striking out" mentioned by Spetner before. LS>It's like striking out a sentence in a paragraph. OK, let's see how this goes. Paragraph 1: "Collect eggs. Warm skillet. Add pat of butter to skillet. Break eggs into skillet. Season with salt and pepper. 
Turn eggs once. Throw skillet at ceiling, swallow eggshells. Cook until whites have light brown color at edges. Serve."

Paragraph 2: "Collect eggs. Warm skillet. Add pat of butter to skillet. Break eggs into skillet. Season with salt and pepper. Turn eggs once. Cook until whites have light brown color at edges. Serve."

Paragraph 3: "Collect eggs. Warm skillet. Add pat of butter to skillet. Break eggs into skillet. Season with salt and pepper. Turn eggs once. Ignore the next sentence. Throw skillet at ceiling, swallow eggshells. Cook until whites have light brown color at edges. Serve."

Which paragraph has the most information? By Shannon, it is number three. Paragraph number two has the least information, and paragraph one has an intermediate amount of information. Spetner's analysis is like looking at the change from P1 to P2 while ignoring the fact that changes like going from P1 to P3 can happen as well.

LS>The strikeout might improve the readability of the text,
LS>but it is not an addition of information. Certainly, one
LS>cannot write a book by starting with a few paragraphs and
LS>blue-penciling them. One might improve those paragraphs
LS>(analogous to microevolution), but one could never produce
LS>a book that way (analogous to macroevolution).

Fortunately, "strikeouts" are not the only sorts of genetic changes that have been documented. Deletions of genomic content do happen, to be sure, but this is not the same thing as a point mutation altering an allele. A point mutation is conceptually quite different from a deletion. It is a substitution of one base with a different base. The length of the genetic segment in question quite commonly remains the same after the point mutation as it was before (where the genetic segment is defined as continuing until a stop or nonsense codon is reached). The *effect* of a point mutation can sometimes be the loss of function of the protein product, but that is a question of meaning, not information.
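The ranking of the three recipe paragraphs can be checked with a short Python sketch. It assumes a character-level Shannon estimate (message length times the per-character entropy of each paragraph's own character frequencies), which is only one of several reasonable ways to apply the measure. The variable and function names are mine.

```python
import math
from collections import Counter

def total_bits(text):
    """Length of text times the per-character entropy of its own frequencies."""
    n = len(text)
    per_char = sum(-(c / n) * math.log2(c / n) for c in Counter(text).values())
    return n * per_char

# Sentences shared by all three paragraphs.
steps = [
    "Collect eggs.",
    "Warm skillet.",
    "Add pat of butter to skillet.",
    "Break eggs into skillet.",
    "Season with salt and pepper.",
    "Turn eggs once.",
]
ending = ["Cook until whites have light brown color at edges.", "Serve."]
odd = "Throw skillet at ceiling, swallow eggshells."

paragraphs = {
    "P1": " ".join(steps + [odd] + ending),
    "P2": " ".join(steps + ending),
    "P3": " ".join(steps + ["Ignore the next sentence.", odd] + ending),
}

# Print the paragraphs from most to least information under this estimate.
for name in sorted(paragraphs, key=lambda k: total_bits(paragraphs[k]), reverse=True):
    print(f"{name}: {len(paragraphs[name])} characters, "
          f"{total_bits(paragraphs[name]):.1f} bits")
```

Under this estimate, adding characters to a message that already uses more than one symbol can only raise the total, so the strike-out (P1 to P2) lowers the value and the insertion (P1 to P3) raises it. The measure is indifferent to whether the recipe still makes sense.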
LS>This analogy applies to mutations like the disabling of a LS>repressor gene (which can cause the overproduction of an LS>enzyme) or degrading the specificity of an enzyme (which could LS>increase the enzyme's activity on some other substrate), even LS>though such mutations might be beneficial under special LS>circumstances. And again this objection has more to do with meaning than with information. LS>Neo-Darwinian macroevolution is supposed to proceed by LS>getting rare lucky mutations, one after another, each LS>installed in the population by natural selection. Simple models look at serial changes, but there is nothing in the theory to exclude alleles being acted upon in parallel. I would suggest reading Sewall Wright with reference to "interaction systems". LS>Single isolated adaptive mutations of the types that have LS>been found are not sufficient. Eventually some real LS>information has to be added to achieve macroevolution. The LS>classic scenario of the neo-Darwinist is to duplicate a LS>gene and then have it evolve without losing the function of LS>the original gene. I think it fairer to say that the duplicate originally codes for a protein product which has the same function as the original. Once duplicated, there is no necessity that it continue to retain the original function. LS>The duplicate might first lose some of its function, but LS>then it has to build up something new. Or, given that the original copy provides sufficient function, it could simply drift. LS>To use our example of reducing the specificity of the gene, LS>it might be beneficial first to reduce specificity so as to LS>grant the enzyme some activity on a new substrate. But that LS>can be only the beginning. The second job is to have random LS>mutations increase the specificity of the enzyme for the LS>new substrate. The first is easy and can be done LS>quickly. 
The second is much harder, and we have no evidence LS>that it has ever occurred, in spite of the necessity for LS>orders of magnitude more of this kind of mutation than for LS>those of the type that disable a gene. Really? What about nylon-eating bacteria? On what grounds is this example excluded as evidence of the sort Spetner would want to see? There are known mutations that definitely result in increases in information under a Shannon measure. Autotetraploid speciation in orchids is pretty common. (One can browse the orchid society lists of species and note how often the attribute "tetraploid" is applied.) Tetraploidy is the condition in which the chromosome count doubles, usually by a failure to divide somewhere in gametogenesis. Shannon's information measure gives lower values for shorter messages, but higher values for longer messages. The doubling of genomic content yields a larger Shannon value, and one can see this occurs by Spetner's own argument that points this out for the purpose of discussing deletions leading to lower values. That's on the information side of things. What about meaning, since Spetner seems to care about that? Well, tetraploid orchid daughter species are typically larger and have more robust structures than those of the parent species. This means that the argument that doubling the same material doesn't change the information content is simply (and visibly) counterfactual. The fact is that morphology of the daughter species changes, and it changes because of the change in genome. The change in genome is well-characterized. The change is that there are now two copies of every locus where before there was one. By both a technical measure of information (Shannon's) and by a more casual and common-sense measure that incorporates discussion of meaning, autotetraploid speciation in orchids where the parent and daughter species differ in morphology represents an increase in information. Wesley