|Quote (midwifetoad @ Feb. 29 2012,21:14)|
|I know that Doug Axe dabbled a bit in modifying coding sequences, but has anyone ever proven that any protein coding sequence cannot tolerate change?|
There are 64 different codons (AAA, AAC, AAG, … UUC, UUG, UUU; I'm writing them as they appear in mRNA, with U where DNA has T), which code for 20 different amino acids (alanine through valine). Therefore, one would expect that at least a few amino acids are coded for by more than one of the available 64 codons. And that expectation is correct.
Coded for by 6 (six) distinct codons: Arginine, Leucine, Serine
Coded for by 4 (four) distinct codons: Alanine, Glycine, Proline, Threonine, Valine
Coded for by 3 (three) distinct codons: Isoleucine
Coded for by 2 (two) distinct codons: Asparagine, Aspartic acid, Cysteine, Glutamic acid, Glutamine, Histidine, Lysine, Phenylalanine, Tyrosine
Coded for by exactly 1 (one) distinct codon: Methionine, Tryptophan
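There are also the three 'stop' codons (UAA, UAG, and UGA), which mark the end of a coding sequence; I mention them so nobody will get the idea that I can't add: the lists above account for 18 + 20 + 3 + 18 + 2 = 61 codons, and the three stops bring the total to 64.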
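For anyone who wants to check that tally themselves, here is a minimal Python sketch. It assumes the standard genetic code, written with the bases in U, C, A, G order and one-letter amino acid codes ('*' marks a stop); the names BASES, AMINO_ACIDS and CODON_TABLE are just ones I picked for illustration.

```python
from collections import Counter
from itertools import product

# Standard genetic code, bases ordered U, C, A, G at each position
# (third position varies fastest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base U
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)

# Map each of the 64 codons to its amino acid (or stop).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons code for each amino acid?
degeneracy = Counter(CODON_TABLE.values())

# Group the one-letter codes by codon count, largest first.
by_count = {}
for aa, n in degeneracy.items():
    by_count.setdefault(n, []).append(aa)
for n in sorted(by_count, reverse=True):
    print(n, "".join(sorted(by_count[n])))
```

The stop codons land in the 3-codon group alongside Isoleucine, which is where the 61 + 3 = 64 comes from.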
But I digress.
If a protein contains exactly one Leucine, there must be a minimum of six distinct coding sequences which will yield exactly & precisely that specific protein. I say "must" because, even if you assume every other position in the sequence can be coded in only one way, the bit of the sequence which yields the Leucine could be any of 6 (six) distinct codons, namely, UUA; UUG; CUU; CUC; CUA; and CUG. And of course, all save two of the other amino acids (that pair being Methionine and Tryptophan) are coded for by 2+ distinct codons, so in practice the count is usually far higher than six.
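To make the one-Leucine case concrete, here is a small Python sketch that lists every coding sequence for a made-up three-residue peptide, Met-Leu-Trp. The peptide and the names SYNONYMS and coding_sequences are my own illustration, not anything from a real gene; the codons are the six Leucine codons listed above plus AUG (Methionine) and UGG (Tryptophan).

```python
from itertools import product

# Synonymous codons for just the residues used in this toy example.
SYNONYMS = {
    "M": ["AUG"],
    "L": ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
    "W": ["UGG"],
}

def coding_sequences(peptide):
    """Return every mRNA sequence that translates to `peptide`."""
    choices = [SYNONYMS[aa] for aa in peptide]
    # One codon is picked per position; product() walks all combinations.
    return ["".join(codons) for codons in product(*choices)]

seqs = coding_sequences("MLW")
for s in seqs:
    print(s)
```

Since Methionine and Tryptophan each have only one codon, the Leucine alone accounts for all six sequences; a peptide with two Leucines would already have 36.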
So if you want to know how many different nucleotide sequences are physically capable of yielding a given protein, count up each of the individual amino acids in that protein. Let A = the number of Alanines in the protein; C, the number of Cysteines; D, the number of Aspartic acids; E, Glutamic acids; F, Phenylalanines; G, Glycines; H, Histidines; I, Isoleucines; K, Lysines; L, Leucines; M, Methionines; N, Asparagines; P, Prolines; Q, Glutamines; R, Arginines; S, Serines; T, Threonines; V, Valines; W, Tryptophans; and Y, Tyrosines… and plug those values into this equation:
Number of sequences = 6^(L+R+S) * 4^(A+G+P+T+V) * 3^(I) * 2^(C+D+E+F+H+K+N+Q+Y)
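For any protein which does not consist entirely of Methionine and Tryptophan, this calculation will yield a number of possible sequences which is greater than 1 (one), and for any protein of realistic length, astronomically greater. Hence, the vast majority of protein coding sequences must be able to tolerate some degree of change: every synonymous substitution alters the nucleotide sequence without altering the protein at all.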
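If you'd rather not count residues by hand, the equation above can be sketched in a few lines of Python. It assumes the protein is given as a string of the same one-letter codes used in the equation; CODONS_PER_RESIDUE and count_coding_sequences are names I made up for this sketch. Like the equation, it ignores the stop codon at the end of the coding sequence.

```python
# Codons per amino acid, grouped exactly as in the equation above.
CODONS_PER_RESIDUE = {}
for letters, n in (
    ("LRS", 6),
    ("AGPTV", 4),
    ("I", 3),
    ("CDEFHKNQY", 2),
    ("MW", 1),
):
    for aa in letters:
        CODONS_PER_RESIDUE[aa] = n

def count_coding_sequences(protein):
    """Number of distinct nucleotide sequences that encode `protein`.

    `protein` is a string of one-letter amino acid codes. Multiplying the
    per-residue codon counts is the same as evaluating
    6^(L+R+S) * 4^(A+G+P+T+V) * 3^(I) * 2^(C+D+E+F+H+K+N+Q+Y).
    """
    total = 1
    for aa in protein.upper():
        total *= CODONS_PER_RESIDUE[aa]  # KeyError on an unknown letter
    return total

print(count_coding_sequences("MW"))    # only Met and Trp: exactly 1
print(count_coding_sequences("MLW"))   # one Leucine: 6
print(count_coding_sequences("MAIL"))  # 1 * 4 * 3 * 6 = 72
```

Python integers don't overflow, so this works just as well on a protein hundreds of residues long, where the result runs to dozens or hundreds of digits.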