On overlooking the obvious
This is an outline of an article I am working on. I am sending/posting it to get feedback on the ideas and suggestions for alternative ways of making the proposed measurement.
As a gross simplification the process of evolution consists of an organism acquiring a change in its inheritance or genes and in consequence incurring changes in its characteristics and subsequently a variation in its prospects for flourishing in the environment.
A wide variety of things can produce changes in an organism’s genes and for many, if not most, we have a detailed theory of the process in neo-Darwinism, which covers all of the common changes that can arise from within the organism’s own genes. However, an organism does not have only its own genes to draw on: it exists in an ocean of alternative genetic material, from fragments of dead organisms through pollens, viruses and bacterial plasmids to complete consumable organisms of different species. We know that there are some instances where an organism has acquired a gene or group of genes from another. The process is referred to as horizontal gene transfer. Two classic instances are the transmission of resistance to antibiotics and mitochondrial DNA. Concern over genetic engineering has brought further examples to light.
The processes of horizontal gene transfer and neo-Darwinism operate in very different ways and therefore each should leave its own distinctive signature on the pattern of new species found in the evolutionary landscape. So we will compare the two processes and look to see if there are any ways in which horizontal gene transfer could leave an identifiable signature on the new species and use this to see if there is a detectable component of horizontal gene transfer in the evolutionary record and, if so, estimate the proportion of evolution that could be attributable to it.
There is a problem that you encounter as soon as you attempt the calculation. This can be seen in the obvious case of the fossil record. A single gene contains a truly extraordinary amount of accumulated evolution. It takes hundreds of individual changes to make a gene sequence in a piece of DNA. Of greater importance are the innumerable alternatives and byways that produce DNA sequences that are not genes or are genes that kill the organism. The possibilities are so numerous that it is at present unknown just how any significant number of useful genes can be produced within the time life has been present on earth. Remember that until a gene is complete and activated there is no evolutionary pressure that can be brought to bear on it. There is an astronomical ratio between the number of neo-Darwinian changes required to bring about a given level of change in a species and the number of horizontal gene transfers to achieve a comparable effect. Neo-Darwinism could be expected to contribute an essentially continuous gradation between species whereas horizontal gene transfer would leave far fewer but much larger jumps in the fossil record. As Darwin himself noted, the fossil record does not match the neo-Darwinian prediction and continuous gradation from one species to another is not seen at all. A simple analysis of the fossil record would merely conclude that 100% of evolution was attributable to horizontal gene transfer. In this case we have instead a problem in identifying a signature of neo-Darwinism in the evolving species. Neo-Darwinian evolution occurs, there is ample evidence for that. Some unexplained factor is suppressing the expected pattern in the evolved species.
There is an alternative analysis that can be made that depends on a different characteristic of horizontal gene transfer. In neo-Darwinian evolution each organism experiences its own changes, good or bad. If one population has ten times as many members as another it will experience ten times as many mutations and have ten times as many new genetic variants on which natural selection can operate. A large population has many potentially useful variants; a small population can run out of genetic diversity, leaving it vulnerable to extinction. Conversely, in the case of horizontal gene transfer the organism fragments, pollens, viruses and plasmids occur in large numbers so that there is a huge pool of genetic material available to any taker, irrespective of the size of the population of the mutating species. A change large enough to lead to a new species will be far more likely for a very large population than a very small one for neo-Darwinism, but essentially equally likely for all population sizes for horizontal gene transfer. So an estimate of the significance of horizontal gene transfer can be made by looking at the evolutionary tree and computing the correlation between estimated population size at a point on the tree and the probability of there being one or more descendant species. However, simple observation of the published trees indicates that the problem we had before is again apparent. The trees are replete with extensive outgrowths arising from source species originally present in very small numbers. The sum total of the entire populations of every single species that has given rise to a vertebrate descendant species is far less than that of many single celled species that over the same period and with the same evolutionary pressures have given rise to few if any descendant species.
We find again that the problem in estimating our ratio is an almost total lack of the expected signature of the neo-Darwinian descendant species and are left with the same, peculiar, estimate that 100% of new species arise from horizontal gene transfer.
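The correlation test proposed above can be sketched in a few lines of Python. This is only an illustration of the two expected signatures, not a fit to any real tree: the two probability models, their rate constants and the range of population sizes are all assumptions made up for the sketch.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    if max(xs) == min(xs) or max(ys) == min(ys):
        return 0.0
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def p_descendant_mutation_limited(pop_size, rate=1e-6):
    """Hypothetical model: chance of a descendant species rises with population size."""
    return 1.0 - math.exp(-rate * pop_size)

def p_descendant_pool_limited(pop_size, p=0.05):
    """Hypothetical model: chance of a descendant species ignores population size."""
    return p

# Assumed population sizes, from a thousand to a billion individuals.
exponents = list(range(3, 10))
pop_sizes = [10 ** k for k in exponents]
log_sizes = [float(k) for k in exponents]

r_mutation = pearson(log_sizes, [p_descendant_mutation_limited(n) for n in pop_sizes])
r_pool = pearson(log_sizes, [p_descendant_pool_limited(n) for n in pop_sizes])

if __name__ == "__main__":
    print("mutation-limited model, correlation with log population:", round(r_mutation, 3))
    print("pool-limited model, correlation with log population:", round(r_pool, 3))
```

With real data the two model functions would be replaced by observed frequencies of descendant species at each estimated population size; a correlation near zero would point towards the pool-limited case.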
An alternative way of determining the effect of numbers on the probability of evolving, one that eliminates many variables, is to take a well studied organism, like Homo sapiens, and note that for every individual human there are numerous accompanying parasites and symbiotic bacteria. Since these have all gone through the same evolutionary history, the history can be largely discounted and the question becomes one of counting the number of new species of human flea, skin mite, athlete’s foot yeast, gut bacteria etc. that have appeared in the last couple of million years while humans evolved from a Miocene ape. To the extent that the numbers of new species reflect their population sizes the process is neo-Darwinian. To the extent that the numbers of new species match the number of human species in the interval the process is horizontal gene transfer. The actual numbers are, again and oddly, essentially identical. The problem with this analysis is that the numbers clearly indicate that there is no tendency whatever for larger populations to be more likely to leave descendant species, and it again leads to the conclusion that virtually all change is horizontal gene transfer.
There is a third way of estimating the ratio of neo-Darwinian evolution to horizontal gene transfer. We can look at the whole issue from a global perspective. Basically life involves taking energy from the sun and using it to create living organisms and support other organisms living on the first group and each other. The amount of sunlight does not vary by much. The proportion of the surface of the earth where life can occur does not vary by much. So the total amount of life should not vary by much. Since more than 99.9% of living organisms are single celled the total number of living organisms should also not vary by much. There should be roughly the same number of organisms on the planet now as there were three billion years ago, give or take a factor of ten or so. The number of genes per organism also does not vary by much. There are organisms with a lot of genes but they appear in such small numbers compared with plankton, bacteria, yeasts and other single celled life that their contribution may be ignored in the global scheme of things. So although the number of species varies, the total number of copies of all genes on the planet should be roughly constant, and if they all change at a roughly constant rate the total number of new gene variants per year should be roughly constant over geological time. This leads to an expectation that the number of cases per year where enough variation is accumulated in a population to constitute a new species should be roughly constant for neo-Darwinian evolution and not dependent on the number of species forming populations at that time. If there are ten times as many species each gets, on average, a tenth of the available variation. Some success at last: this was true for the first three billion years of biological history and we have a neo-Darwinian signature.
The process of horizontal gene transfer follows a different rule, as new species creation depends on the number of genes available in the global pool and the number of species absorbing genes from the pool. Its rate therefore grows as the number of species grows and must, inevitably, eventually dominate over any constant process. This applies no matter how insignificant a proportion of genetic change is initially attributable to horizontal gene transfer. The observed distribution of the number of species does in fact follow the expected curve where some species arise from an essentially constant process and some from an essentially exponential process starting from a lower base level. (Plot a graph of the return from 5% simple interest on $1000 plus 0.5% compound interest on $1 over 3000 years to see the resulting curve.) The problem with the analysis is that almost all species arose after the exponential component became dominant and would have to be assumed to have been derived from horizontal gene transfer. We are still back to nearly 100% of species arising from horizontal gene transfer.
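The interest analogy is easy to tabulate rather than plot. The following Python sketch uses the figures quoted above (5% simple interest on $1000, 0.5% compound interest on $1, 3000 years); the function names and the choice of annual compounding are assumptions made for the illustration.

```python
def simple_interest_return(years, principal=1000.0, rate=0.05):
    """Interest earned under simple interest: grows linearly with time."""
    return principal * rate * years

def compound_interest_return(years, principal=1.0, rate=0.005):
    """Interest earned under annual compounding: grows exponentially with time."""
    return principal * ((1.0 + rate) ** years - 1.0)

def crossover_year(max_years=3000):
    """First whole year in which the compound return exceeds the simple return."""
    for year in range(1, max_years + 1):
        if compound_interest_return(year) > simple_interest_return(year):
            return year
    return None  # no crossover within max_years

if __name__ == "__main__":
    for year in range(0, 3001, 500):
        simple = simple_interest_return(year)
        compound = compound_interest_return(year)
        print(f"{year:5d}  simple={simple:12.1f}  compound={compound:12.1f}  "
              f"total={simple + compound:12.1f}")
    print("compound return overtakes simple return in year", crossover_year())
```

For most of the period the linear term dominates and the compound term is invisible; the crossover falls somewhere between year 2000 and year 2500, after which the exponential term supplies most of the total.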
To hark back to the second mode of estimate: there is a simple calculation that says that if 99.99% of living organisms are single celled then 99.99% of copies of genes are in single celled organisms and 99.99% of the genes changed by neo-Darwinian processes are in single celled organisms, so that 99.99% of populations that accumulate enough change to produce a new species will be single celled and 99.99% of new species arising under neo-Darwinism should be single celled. Put this down as another odd calculation that suggests that most multicellular life originated via horizontal gene transfer.
Let us try a fourth, and more direct, approach. The human genome has been decoded, as have those of many closely related primate species. Thus we can look at the actual genes that have changed in the process of evolving. For neo-Darwinism the process of obtaining a new gene is clear. A stretch of DNA acquires random changes until eventually it meets the criteria used by the cell chemistry for identifying it as a gene, at which point it can produce a protein that will have an effect on the cell and the whole process of natural selection and gene-tuning can commence. Thus for any new gene the presence in several related species of a string of DNA that is not a gene and never has been a gene but has a 99% match to the sequence of the new gene will be a reliable indicator that the gene has originated by neo-Darwinian processes. Conversely the absence of such a DNA sequence, or alternatively anything that positively identifies the gene as having come from somewhere external to the organism, will identify the gene as having arrived through horizontal gene transfer. Once again, however, we get the “wrong” answer. The genes identifiable as new to Homo sapiens which have a corresponding non-gene found in any related species are, to put it mildly, rather thin on the ground. Worse still, over 200 new genes can be almost certainly identified as having originated in bacterial sources. We are still deriving values of the ratio of neo-Darwinism to horizontal gene transfer with almost 100% horizontal gene transfer, at least for the case of new genes.
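The "99% match" criterion above can be made concrete with a small Python sketch. It assumes the two sequences have already been aligned to the same length, which real comparisons would need an alignment tool to achieve; the function names and the example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two pre-aligned sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)

def looks_like_precursor(gene, candidate, threshold=99.0):
    """True if a non-gene stretch matches the new gene at or above the threshold."""
    return percent_identity(gene, candidate) >= threshold

if __name__ == "__main__":
    gene = "ATG" + "GCT" * 32 + "A"        # made-up 100-base sequence
    candidate = gene[:50] + "T" + gene[51:]  # same sequence with one base changed
    print(percent_identity(gene, candidate))
    print(looks_like_precursor(gene, candidate))
```

The test described in the text would run such a comparison between each new gene and the non-coding DNA of related species, counting how many new genes have a candidate precursor.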
Complete information on genetic sequences would allow the proportion of horizontal gene transfer to be estimated by checking which genes were duplicated in what species. In the case of neo-Darwinian evolution a gene can only appear in an initial species and species linearly descended from the initial species. In contrast a generally useful gene is likely to appear in a substantial number of unrelated species, especially following a widespread ecological crisis. All that is necessary is for the mutating organisms to be exposed to a donor species for the gene. There need be neither physical nor temporal proximity. Too few organisms have been sequenced for this approach to be applied with any accuracy although there are a few regulatory genes that have a suspicious distribution. In the case of the human genome the information is definite. It is impossible to reconcile the human gene pattern with a model of divergent human groups forming separately evolving groups. The discrepancy is so marked that some writers, whose commitment to neo-Darwinian orthodoxy appears almost religious, appear ready to claim that racial differences are merely a culturally determined illusion. The approach demonstrates only that there is a significant amount of horizontal gene transfer as a lot of data is required to distinguish the two cases of a gene being passed to all or most descendent species and all or most descendent species acquiring the same gene subsequent to speciation.
On to approach five. Let us look at what sort of genes are being created. Are there any distinctive characteristics that would identify a gene or a group of genes as having almost certainly either originated through neo-Darwinian processes or having been imported from an external source? In both cases the answer is "yes" although the characteristics are quite different. For genes produced through random permutations of DNA there is a very simple pattern to the length of the gene. If you step through a randomly produced gene, codon by codon, there is at each step a probability of encountering a “stop” marker that terminates the gene. The twenty-fifth codon of a gene can only exist if none of the preceding twenty-four codons was a terminator. This leads to an expectation that if you plot the numbers of new genes against the length you should get an exponentially decreasing curve with a gene of length one being the most common and anything above one hundred being extremely unlikely. This is, as one might by now have come to expect, never observed.
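The expected length distribution can be sketched in Python. The sketch assumes that all 64 codons are equally likely at every step and uses the three stop codons of the standard genetic code; length is counted as the number of codons read before the first stop, so a frame that opens with a stop codon has length zero. The function names are made up for the illustration.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}   # stop codons of the standard genetic code
P_STOP = len(STOP_CODONS) / 64        # assumes all 64 codons are equally likely

def prob_length_at_least(n_codons, p_stop=P_STOP):
    """Probability that a random reading frame runs for at least n_codons
    codons before the first stop codon (a geometric distribution)."""
    return (1.0 - p_stop) ** n_codons

def random_frame_length(rng, max_codons=10_000):
    """Simulate one random frame: count codons drawn before the first stop."""
    length = 0
    while length < max_codons:
        codon = "".join(rng.choice("ACGT") for _ in range(3))
        if codon in STOP_CODONS:
            break
        length += 1
    return length

if __name__ == "__main__":
    for n in (1, 25, 50, 100, 200):
        print(f"P(length >= {n:3d} codons) = {prob_length_at_least(n):.6f}")
    rng = random.Random(0)
    lengths = [random_frame_length(rng) for _ in range(5000)]
    print("simulated mean length:", sum(lengths) / len(lengths))
```

Under these assumptions fewer than 1% of random frames reach one hundred codons, which is the exponentially decreasing curve described above.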
For genes acquired by horizontal gene transfer the identifying characteristic relates to the fact that most, if not all, new transferable genes originate in bacteria or other single celled organisms (largely a question of numbers). There is a problem in explaining any gene that has a function related to, say, bone formation, as bacteria do not have bones and the gene cannot have that function in the original organism. For horizontal gene transfer to be a workable form of evolution there would have to be genes that are dual function and serve one purpose in the original bacterium but some other, quite different, function in some other organism. Since it is highly improbable that a randomly generated string of DNA that happened to meet the criterion of being a gene would have any useful function at all, it is grossly improbable that such a gene would have two totally different useful functions. However there is at least one class of genes and their proteins which have exactly this peculiar property. Many bacteria respond to toxic levels of an element by chelating the offending atoms and excreting the chelate. Unlike bacteria, all higher level organisms have a significant trace element requirement. These trace elements are used in chelates and in some cases it is possible to work out how the whole process works. For example a bacterium occupying a niche at the edge of an underground clay pan will be exposed to very high levels of iron atoms with the proportion of ferrous to ferric varying with water flow. Such a bacterium will find considerable use for a chelate that will allow several atoms of either ferrous or ferric iron to be excreted. If, later, a species with internal fluids acquires this gene it has just obtained haemoglobin.
A similar analysis can be done on other trace elements – the elements naturally occur somewhere in toxic quantities and the protein involved serves both to excrete the toxic element and also for some purely serendipitous function in the secondary organism. All this analysis takes us no further forward. We are left, again, with no clearly identifiable genes that look neo-Darwinian but at least some that look as if they arrived via horizontal gene transfer.
There is one other characteristic of horizontal gene transfer that might allow an estimate to be made of the extent to which it contributes to evolution. For this we need a little bit of theory that is not widely circulated. Neo-Darwinism contains a couple of principles stating that natural selection operates only on existing genetic variation and that an organism cannot affect its own evolution. These principles do not apply to evolution in general and serve only to delimit the forms of evolution that can be correctly described by neo-Darwinism. Natural selection can be a lengthy process and a genetic change can easily occur within or as a result of the selection process. Organisms can and do exhibit behaviour that affects the probability of genetic change. Indeed, in the two common cases of cancer and HIV, you personally may be aware of actions that you have taken or not taken with the specific intent of reducing the probability of that genetic change. With horizontal gene transfer the degree to which cellular mechanisms are involved makes it impossible to ignore this factor in the way that it can be ignored, for example, in the case of the radioactive decay of carbon-14 to nitrogen. To understand the implications we need only look at the simplest version of what I term the “abandon ship” effect.
Consider, for a moment, a hypothetical single celled organism living within some structure such as a stromatolite or algal slime. Maintenance of the structure requires each organism to react in some way to the presence or absence, or for that matter health, of the surrounding organisms. For such an organism it is not stretching the bounds of possibility to imagine that the probability of acquiring a gene from the environment might increase under the stress arising from the death of several surrounding organisms. This condition is a very strongly self-reinforcing evolutionary adaptation and once established will persist. For an organism suffering very high losses and headed for extinction the normal rules about mutation do not apply. For normal mutation the majority of mutations are deleterious and attract a penalty derived from the large cost of death compared with non-mutation and survival, and the benefits attract only a small advantage in being possibly slightly better at surviving compared with non-mutation. In the case of incipient extinction non-mutation leads to extinction and deleterious mutations attract only the negligible disadvantage of death slightly earlier than for non-mutation, whereas the benefits can include the large one of survival. This resembles the situation where jumping off a ship in a storm is normally a suicidal idea, but following a call to abandon ship because the ship is sinking becomes an excellent stratagem, as the small chance of survival is far greater than the zero chance of survival associated with going down with the ship. If all unmutated organisms of a species will die, and some mutated ones survive, then mutation, whatever the risks, becomes advantageous. Our hypothetical organism can thus be seen to have achieved the status where it detects and responds to the condition that mutation is advantageous.
Since it is advantageous the organism is more likely to produce a successor species which will also, since this is a heritable characteristic, behave similarly in the face of the population decline that heralds extinction. For the individual organism mutation is a hit-or-miss affair. However for the species the effect is very powerful. If there is an advantageous gene available anywhere in the environment, and this is very probable, then having every member of the population acquire a gene will inevitably result in some individuals acquiring the advantageous gene or genes and after the dust has settled the variant with the most useful of the advantageous genes will go on to become the preponderant representative of the successor species. Thus the innocuous individual characteristic “if surrounded by dead, acquire a gene” becomes scaled up in the broader case of the population to “in a crisis acquire the most useful available gene”. This has so great an evolutionary advantage that it does not take very many brushes with extinction for it to become a preponderant trait in all evolved species.
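The "abandon ship" argument is a comparison of expected survival with and without mutation, followed by a scaling-up from the individual to the population. A minimal Python sketch follows; every probability in it is an invented placeholder chosen only to show the shape of the argument, and the function names are hypothetical.

```python
def expected_survival_if_mutating(p_beneficial, survival_beneficial, survival_deleterious):
    """Expected survival of an individual that acquires a gene at random."""
    return (p_beneficial * survival_beneficial
            + (1.0 - p_beneficial) * survival_deleterious)

def mutation_is_advantageous(survival_unmutated, p_beneficial,
                             survival_beneficial, survival_deleterious):
    """True if acquiring a gene gives better expected survival than staying unchanged."""
    mutated = expected_survival_if_mutating(
        p_beneficial, survival_beneficial, survival_deleterious)
    return mutated > survival_unmutated

def prob_some_member_succeeds(population_size, p_beneficial):
    """Chance that at least one member of the population acquires a beneficial gene."""
    return 1.0 - (1.0 - p_beneficial) ** population_size

if __name__ == "__main__":
    # Normal conditions (placeholder values): staying unchanged is safe.
    print("normal:", mutation_is_advantageous(0.9, 0.01, 0.95, 0.1))
    # Incipient extinction (placeholder values): staying unchanged means death.
    print("crisis:", mutation_is_advantageous(0.0, 0.01, 0.5, 0.0))
    # Rare individual success becomes near-certain success for a large population.
    print("population:", prob_some_member_succeeds(1_000_000, 1e-4))
```

The individual gamble pays off only when the unmutated survival chance collapses, while the population-level chance of at least one success approaches certainty as numbers grow.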
With this theory under our belt we can identify two more characteristics of horizontal gene transfer that should be absent when neo-Darwinian evolution is responsible for new species. The first of these is a strong correlation between the appearance of new species, especially radically new species, and the incidence of mass extinctions. Since it is the actual mass extinction that initiates the new species creation, new and novel species will appear in the fossil record almost immediately following the extinction, whereas for neo-Darwinism the only increase in the rate of species formation arises from the emptying out of ecological niches, which will then be refilled at the same rate as the original colonisation of the niche and by a similar species to the original colonist of the niche. The neo-Darwinian effect is masked in this case but the plots of numbers of species against time do show a very rapid recovery in numbers following mass extinctions, suggesting a high proportion of new species arising through horizontal gene transfer.
The second characteristic derives from the basic "shape" of the evolutionary process. What is possible in the way of life forms is dictated by the laws of physics, chemistry and information. Evolution is a process of exploring the possibilities. With neo-Darwinism the current population of species has an envelope of variants which inexorably moves slowly out over the range of possible life forms. With horizontal gene transfer the process divides into two quite distinct phases: the static, where there is no advantage in evolving and very few, if any, significant changes occur, and "explosions", where there is an advantage to evolving and a species mutates rapidly, creating a shell of new life forms all radiating away from the population that has encountered the "abandon ship" scenario. The two mechanisms differ considerably in the pattern of appearance of species when a new zone of possible life, such as land or flight, is encountered. In neo-Darwinism the first land animal, for example, must derive from a population of "almost land animals" as only small changes can occur. New species of land animals will drift slowly in, predominantly from the "almost land animals", until such time as the population of land animals reaches large numbers and can exhibit sufficient variation in its own right. In horizontal gene transfer the first land animal will arise as a result of a substantial random jump from a previous species and the "almost land animals" will not exist. The exigencies of the new environment will result in very rapid radiation out from the initial species until stable forms are found. Thus the ratio of neo-Darwinism to horizontal gene transfer can be estimated from the extent to which new classes of organisms appear as a slowly increasing number deriving from a similar precursor (neo-Darwinism) or as a significant number of species appearing almost simultaneously without any obvious precursor (horizontal gene transfer).
The latter form would appear to be the dominant form.
Thus all attempts at estimating the ratio of neo-Darwinian species formation to horizontal gene transfer species formation have foundered, not on the difficulty of finding a signature of horizontal gene transfer but on a difficulty in finding a signature of neo-Darwinian species formation.
Which leads me to the title of the paper. Somebody somewhere must be overlooking something obvious. Maybe it is me, in which case I would be most interested in an analysis that shows why all these different methods of calculation should fail and give the same erroneous answer.
Alternatively, just possibly, everyone else is overlooking something obvious.
The logic for assuming that neo-Darwinism is responsible for all evolution of species was always of the smoking gun variety and might, just possibly, not be true.