Antievolution.org - Antievolution.org Discussion Board -Topic::Incongruence in phylogenetic trees

niiicholas

Posts: 319
Joined: May 2002

Posted: Dec. 23 2002,23:39

I think that several considerations have to be added to Hunter's post before serious discussion can be had.

1. "Congruence" and "noncongruence" are not either/or entities; they are a matter of degree. Given n species being analyzed, there are something like (2n-3)! / (2^(n-2) (n-2)!) hypothetically possible ways of arranging them into a tree (Theobald 2002), and the (dis)similarity between two trees can be rigorously quantified.

This equation (which counts rooted, strictly bifurcating trees) will differ slightly depending on whether the trees are rooted vs. unrooted, binary splits only, etc. Regardless, the number of possible trees gets very big very fast: 4 species = 15 possible trees, 8 species = 135,135 possible trees.
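
As a quick check on those figures, here is a minimal Python sketch of the count for rooted, strictly bifurcating trees, using the equivalent double-factorial form (2n-3)!! = 3 x 5 x ... x (2n-3). The function names are mine, not Theobald's, and the unrooted count (2n-5)!! is included only to show how much the rooted/unrooted choice changes the numbers.

```python
def rooted_binary_trees(n):
    """Number of distinct rooted, bifurcating trees on n labelled species."""
    if n < 2:
        raise ValueError("need at least two species")
    count = 1
    # (2n-3)!! is the product of the odd numbers 3, 5, ..., 2n-3.
    for odd in range(3, 2 * n - 2, 2):
        count *= odd
    return count


def unrooted_binary_trees(n):
    """Number of distinct unrooted, bifurcating trees on n species: (2n-5)!!."""
    if n < 3:
        raise ValueError("need at least three species")
    # An unrooted tree on n species corresponds to a rooted tree on n-1 species.
    return rooted_binary_trees(n - 1)


for n in (4, 8):
    print(n, rooted_binary_trees(n), unrooted_binary_trees(n))
# prints "4 15 3" and then "8 135135 10395"
```

The rooted counts match the 15 and 135,135 quoted above.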

You can randomly generate tree diagrams at this cool page (Phylogeny and Reconstructing Phylogenetic Trees) and very quickly get an idea of the odds of getting the same tree twice by random chance.

So the question is not whether two phylogenies from different data sources/research labs are congruent or incongruent, full stop; the question is how congruent or incongruent they are. Most of the examples touted as showing "incongruence" are actually quite minor phylogenetic disagreements. E.g., the interrelationships of different groups of bats are a pretty trivial issue in the context of vertebrates or Animalia. If the microbats grouped most closely with anthropoid apes, and the macrobats with giraffes, then we'd have a significant disagreement. This kind of thing does not happen in multicellular organisms with protected germ line cells; rather, different datasets keep returning highly congruent phylogenies.
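
To make "how congruent" concrete, here is a minimal sketch of one common way to score the disagreement between two rooted trees: count the clades that appear in one tree but not the other (a rooted version of the Robinson-Foulds distance). This is my choice of metric for illustration, not necessarily the one any cited study used, and the five-taxon trees are toy examples written as nested tuples.

```python
def clades(tree):
    """Return every clade of a rooted tree as a frozenset of leaf names.

    A tree is a nested tuple; a leaf is a plain string.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def clade_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Toy trees (hypothetical, for illustration only).
reference = ((("microbat_A", "microbat_B"), "megabat"), ("giraffe", "ape"))
minor = (("microbat_A", ("microbat_B", "megabat")), ("giraffe", "ape"))
major = ((("microbat_A", "microbat_B"), "ape"), ("megabat", "giraffe"))

print(clade_distance(reference, minor))  # 2: bats still group together
print(clade_distance(reference, major))  # 4: bats scattered among outgroups
```

In the "minor" tree only the arrangement inside the bats changes; in the "major" tree every grouping except the microbat pair is lost.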

So, just like any scientific measurement, there will be noise in input data. The analogy here is to radiometric dating: if two measurement dates of a moon rock return ages of 4.6 and 4.5 billion years, this is very minor disagreement relative to the result (100 million years sounds like a lot but is only about a 2% disagreement). If someone were to go around saying "geological measurements disagree by 100 million years and this is evidence against an old earth" they would be wrong. Similar minor disagreements, such as Teeling et al.'s 2002 bat study, should not be cited as evidence for Hunter's proposition "there are also plenty of character/species sets that do not produce congruent phylogenies". A real disagreement would occur if all of these different bat species did not group together and instead were randomly associated with the outgroup taxa, but as we can see this did not occur:

[img]http://www.pnas.org/content/vol99/issue3/images/medium/pq0224771001.gif[/img]

The odds of all these bat species grouping together by chance are astronomical.
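
A rough way to put a number on that, under the simplifying assumption that every rooted bifurcating tree is equally likely (a null model I am assuming for illustration, not something Teeling et al. computed): the number of trees in which a chosen set of k species forms a clade is T(k) x T(n-k+1), where T(m) = (2m-3)!! is the number of rooted trees on m species. The 20 bats and 9 outgroups are taken from the Teeling et al. abstract quoted further down in this post.

```python
from fractions import Fraction


def rooted_tree_count(m):
    """(2m-3)!! rooted bifurcating trees on m labelled tips (1 when m is 1 or 2)."""
    count = 1
    for odd in range(3, 2 * m - 2, 2):
        count *= odd
    return count


def clade_probability(n_taxa, clade_size):
    """Chance that a given set of clade_size taxa forms a clade in a
    uniformly random rooted bifurcating tree on n_taxa taxa."""
    # Build a tree on the clade, then treat the clade as one tip among the rest.
    favourable = rooted_tree_count(clade_size) * rooted_tree_count(n_taxa - clade_size + 1)
    return Fraction(favourable, rooted_tree_count(n_taxa))


p = clade_probability(29, 20)  # 20 bats among 29 taxa
print(float(p))
```

Under this null model the chance comes out well under one in ten million for a single dataset, before even asking whether independent datasets agree with each other.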

2. Scale of the study and range of dataset

As the age-of-the-moon example points out, what is important in considering disagreement in results is not the absolute measurement, but the size of the disagreement relative to the scale of the study. 100 million years sounds like a lot but is peanuts in terms of the age of the earth. Such a disagreement would be major, however, in a radiometric dating of dinosaur bones, and a data source with a smaller error would have to be used.
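
The arithmetic behind that comparison, as a minimal sketch. The 150-million-year fossil age is a round number I have assumed purely for illustration; the post does not give one.

```python
def relative_disagreement(difference_years, scale_years):
    """Size of a disagreement as a fraction of the quantity being measured."""
    return difference_years / scale_years


difference = 100e6  # the 100-million-year gap between the two moon rock dates

moon = relative_disagreement(difference, 4.6e9)
fossil = relative_disagreement(difference, 150e6)  # hypothetical dinosaur fossil age

print(f"moon rock: {moon:.1%}")
print(f"dinosaur fossil (assumed 150 Myr): {fossil:.1%}")
```

The same absolute gap is roughly 2% of the moon rock's age but about two thirds of the assumed fossil age.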

Radiometric datasets have ranges and scales over which they are useful, due essentially to their rate of decay. You use uranium-lead to date the age of the moon, because the parent isotopes have half-lives of hundreds of millions to billions of years, but it would be ridiculous to use it for dating an archeological artifact because the answer you would get (assuming the artifact was, say, something that had been forged by remelting the ore) would be "0 +/- millions of years". Similarly, the half-life of C-14 is only ~5,700 years, so it is excellent for archeology but for anything older than 50,000 years it is useless (a result of "50,000 years old" for a carbon date essentially means "this sample is between 50,000 and infinite years old"). In the first case, the noise is much larger than the signal, and in the second case the signal is much smaller than the noise (these are slightly different, think about it for a sec.).
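
To see why carbon dating runs out of signal around 50,000 years, here is a minimal sketch of plain exponential decay, assuming the commonly cited C-14 half-life of about 5,730 years. It ignores calibration and contamination, which matter a great deal in real labs.

```python
C14_HALF_LIFE_YEARS = 5730.0  # commonly cited value; an assumption of this sketch


def fraction_remaining(age_years, half_life_years=C14_HALF_LIFE_YEARS):
    """Fraction of the original parent isotope left after age_years."""
    return 0.5 ** (age_years / half_life_years)


for age in (5_730, 20_000, 50_000):
    print(age, f"{fraction_remaining(age):.4%}")
```

After 50,000 years less than a third of a percent of the original C-14 is left, so small measurement errors or traces of modern carbon swamp whatever signal remains.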

With molecular sequences the same factors must be taken into account. I don't currently have access to Hunter's cited Balter (1997), "Morphologists learn to live with molecular upstarts", but I would note that there is apparently a contrasting commentary (Mindell 1997) on that very article from the next month of Science, entitled "'Misleading' molecules?". Probably the basic point is that the particular mtDNA sequences being used evolve too quickly (certain mtDNA sequences are, after all, used for tracing migration patterns within the human species), such that sequence similarity is low and therefore "noise" in the form of mutational biases is larger than the signal. Certainly comparing chickens, amphibians, and fish is a long way from what one normally sees mtDNA used for, e.g. species within a genus.

(Note in passing: not all mtDNA within a mitochondrion is the same. It's possible that the above study used a very slowly evolving mtDNA sequence and similarity between e.g. birds and fish was high, e.g. >75%. But I doubt it. Let's get the Balter and Mindell articles and see what they say, shall we?)

In summary, anytime one sees a cited "incongruence", one must consider whether the dataset is appropriate for the scale of the analysis. As sequence similarity approaches what random sequences would show, mutational biases become increasingly important to consider.
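
One textbook way to see what "approaching randomness" does to the signal is the Jukes-Cantor correction, which I am bringing in for illustration (the post does not name a model). It assumes all four bases are equally frequent and all substitutions equally likely, so two unrelated DNA sequences are expected to match at about 25% of sites. The estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3), blows up as the observed fraction of differing sites p approaches 0.75.

```python
import math


def jukes_cantor_distance(p):
    """Estimated substitutions per site from the observed fraction p of
    differing sites, under the Jukes-Cantor (equal rates) model."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75); at 0.75 the sequences look random")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


for p in (0.05, 0.25, 0.50, 0.70, 0.74):
    print(f"observed difference {p:.2f} -> estimated distance {jukes_cantor_distance(p):.2f}")
```

Near the top of that range a tiny change in observed similarity implies a huge change in estimated distance, which is exactly the regime where noise and mutational bias dominate.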

3. Actual violation of lineal descent. This is commonly the case for single-celled prokaryotes without protected germline DNA. If you like, the tree hypothesis has been falsified for them, because it is known and has been observed in the lab that they can trade DNA laterally (lateral gene transfer, LGT). But this leaves the evidence for the common descent of e.g. all animals unquestioned. Much more can be said here because LGT is itself a nonrandom process and certainly some things are harder to LGT than others, but this is another topic. If we saw the kinds of disagreements in animals that we have in prokaryotes, then, as we have no mechanism for significant LGT in animals (viral transfer is about it I think), this would be a significant problem for the common descent theory. But we don't. "Disagreements" that I have seen cited for multicellular critters basically fall into the above categories.

In summary, in answer to Hunter's question,

Quote

My point is not to say explanatory mechanisms are out of bounds or that complicating factors should not be expected, but merely to raise the question: At what point does the use of these explanatory mechanisms become ad hoc and do we consider the Step 1 in the syllogism falsified?

...basically, these explanatory mechanisms are allowed when they themselves are well-supported by available data. We can measure mtDNA rates of change and mutational biases. We can observe and explain why LGT occurs in prokaryotes but not in mammals. We can measure the degree of disagreement between trees and determine if the error is equivalent to 100 million years/4.6 billion years or not.

There is a massive literature on all of this, which is why I'm surprised that Hunter thinks that biologists haven't thought about it. The best introduction to it all is Theobald's FAQ at the talkorigins archive, referenced below. It references a lot of articles with titles like "Testing Common Descent" about the probabilities of hitting on congruent trees by chance.

Refs:

Theobald, Doug. 2002. 29 Evidences for Macroevolution. talkorigins archive.

Teeling, Emma C. et al. 2002. Microbat paraphyly and the convergent evolution of a key innovation in Old World rhinolophoid microbats. Proc. Natl. Acad. Sci. USA 99(3): 1431-1436.

(bold added below)

Quote

Molecular phylogenies challenge the view that bats belong to the superordinal group Archonta, which also includes primates, tree shrews, and flying lemurs. Some molecular studies also challenge microbat monophyly and instead support an alliance between megabats and representative rhinolophoid microbats from the families Rhinolophidae (horseshoe bats, Old World leaf-nosed bats) and Megadermatidae (false vampire bats). Another molecular study ostensibly contradicts these results and supports traditional microbat monophyly, inclusive of representative rhinolophoids from the family Nycteridae (slit-faced bats). Resolution of the microbat paraphyly/monophyly issue is essential for reconstructing the temporal sequence and deployment of morphological character state changes associated with flight and echolocation in bats. If microbats are paraphyletic, then laryngeal echolocation either evolved more than once in different microbats or was lost in megabats after evolving in the ancestor of all living bats. To examine these issues, we used a 7.1-kb nuclear data set for nine outgroups and twenty bats, including representatives of all rhinolophoid families. Phylogenetic analyses and statistical tests rejected both Archonta and microbat monophyly. Instead, bats are in the superorder Laurasiatheria and microbats are paraphyletic. Further, the superfamily Rhinolophoidea is polyphyletic. The rhinolophoid families Rhinolophidae and Megadermatidae belong to the suborder Yinpterochiroptera along with rhinopomatids and megabats. The rhinolophoid family Nycteridae belongs to the suborder Yangochiroptera along with vespertilionoids, noctilionoids, and emballonuroids. These results resolve the apparent conflict between previous molecular studies that sampled different rhinolophoid families. An important implication of rhinolophoid polyphyly is independent evolution of key anatomical innovations associated with the nasal-emission of echolocation pulses.

Originally posted here:

ICSID thread

Edited by niiicholas on Dec. 23 2002,23:41

theyeti

Posts: 97
Joined: May 2002

Posted: Jan. 17 2003,13:56

Trends Genet 2002 Sep;18(9):472-9

Genome trees and the tree of life.

Wolf YI, Rogozin IB, Grishin NV, Koonin EV.

Quote

Genome comparisons indicate that horizontal gene transfer and differential gene loss are major evolutionary phenomena that, at least in prokaryotes, involve a large fraction, if not the majority, of genes. The extent of these events casts doubt on the feasibility of constructing a 'Tree of Life', because the trees for different genes often tell different stories. However, alternative approaches to tree construction that attempt to determine tree topology on the basis of comparisons of complete gene sets seem to reveal a phylogenetic signal that supports the three-domain evolutionary scenario and suggests the possibility of delineation of previously undetected major clades of prokaryotes. If the validity of these whole-genome approaches to tree building is confirmed by analyses of numerous new genomes, which are currently being sequenced at an increasing rate, it would seem that the concept of a universal 'species' tree is still appropriate. However, this tree should be reinterpreted as a prevailing trend in the evolution of genome-scale gene sets rather than as a complete picture of evolution.

theyeti

charlie d

Posts: 56
Joined: Oct. 2002

Posted: Jan. 17 2003,14:23

A good short commentary by Charlebois et al. on the issue of microbial phylogenetic trees is found in this week's Nature.
Notable passages:

Quote

We and others have been exploring 'whole-genome trees' as a means of overcoming the noise and bias of single-protein analyses, to extract the bulk phylogenetic signals that are inherent in genomes. ...... Despite some early indications to the contrary, whole-genome trees have now largely converged on the rRNA-sequence tree.

For us .... this convergence means that lateral gene transfer has not undermined descent with modification as the default explanation for microbial biodiversity, nor (as recently suggested by Ford Doolittle) has it thrown microbial classification into disarray. ......

The most enthusiastic lateralists reply, however, that convergence between whole-genome and rRNA trees merely demonstrates that rRNA genes — unlike most individual protein-coding genes, but like the genome as a whole — are but pastiches that are produced by lateral gene transfer.

Fascinating as these conflicts are, the important point is not whether a given tree is right or wrong. Rather, we should use these trees as frameworks upon which to construct and test hypotheses about the rate and mode of microbial evolution, and to improve our analytical methods. Without conflicts, we might all be far more complacent about evolutionary theory. In microbial phylogenomics, the scientific process is alive and well!

niiicholas

Posts: 319
Joined: May 2002

Posted: Jan. 18 2003,23:54

I noticed the Charlebois article also; comments are here:

Re-evolution of complex characters

IMO this quote is the key one for putting some balance into discussions where Woese, Doolittle, etc. are cited:

Quote

We and others have been exploring 'whole-genome trees' as a means of overcoming the noise and bias of single-protein analyses, to extract the bulk phylogenetic signals that are inherent in genomes. The input data for genome trees can be the proportions of genes or proteins that genomes hold in common, or (as we prefer) the mean pairwise similarities between shared proteins. Despite some early indications to the contrary, whole-genome trees have now largely converged on the rRNA-sequence tree.

For us — as, presumably, for the verticalists — this convergence means that lateral gene transfer has not undermined descent with modification as the default explanation for microbial biodiversity, nor (as recently suggested by Ford Doolittle) has it thrown microbial classification into disarray. Lateral transfer is not both quantitatively important and directional. One of the few widely accepted instances of lateral gene transfer — the origin of chloroplasts from relatives of cyanobacteria — is clearly visible in our whole-genome trees, and even more so in 'sub-genome trees' based on functional subsets of genomes.
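
For anyone wondering what the first kind of input data in that quote might look like in practice, here is a minimal sketch of a gene-content comparison: the proportion of gene families two genomes hold in common, scored here as shared families over all families found in either genome. That particular ratio is my assumption for illustration, not necessarily what Charlebois et al. used, and the genomes and gene names are made up.

```python
from itertools import combinations


def shared_gene_proportion(genes_a, genes_b):
    """Fraction of gene families, out of all those seen in either genome,
    that both genomes hold in common."""
    union = genes_a | genes_b
    if not union:
        raise ValueError("both gene sets are empty")
    return len(genes_a & genes_b) / len(union)


# Hypothetical toy genomes, each reduced to a set of gene family names.
genomes = {
    "bacterium_1": {"rpoB", "gyrA", "recA", "lacZ", "nifH"},
    "bacterium_2": {"rpoB", "gyrA", "recA", "lacZ", "tetA"},
    "archaeon_1": {"rpoB", "recA", "mcrA", "hdrA", "tfb"},
}

for a, b in combinations(genomes, 2):
    distance = 1.0 - shared_gene_proportion(genomes[a], genomes[b])
    print(f"{a} vs {b}: distance {distance:.2f}")
```

A matrix of such pairwise distances is the sort of thing a standard distance-based tree-building method could take as input.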

nic

Edited by niiicholas on Jan. 18 2003,23:54
