| Quote |
| Despite the practically unlimited number of possible protein sequences, the number of basic shapes in which proteins fold seems not only to be finite, but also to be relatively small, with probably no more than 10,000 folds in existence. Moreover, the distribution of proteins among these folds is highly non-homogeneous -- some folds and superfamilies are extremely abundant, but most are rare. Protein folds and families encoded in diverse genomes show similar size distributions with notable mathematical properties, which also extend to the number of connections between domains in multidomain proteins. All these distributions follow asymptotic power laws, such as have been identified in a wide variety of biological and physical systems, and which are typically associated with scale-free networks. These findings suggest that genome evolution is driven by extremely general mechanisms based on the preferential attachment principle. |
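As a rough illustration of the preferential attachment idea mentioned in the quote above, the toy simulation below grows a set of families one domain at a time. This is only a sketch under stated assumptions, not the model used in the quoted paper: the function name `simulate_family_sizes`, the innovation probability `p_new`, and the seed are all hypothetical choices made for the example. Each new domain either founds a new family or joins an existing one with probability proportional to that family's current size.

```python
import random


def simulate_family_sizes(n_domains: int, p_new: float = 0.1, seed: int = 0) -> list[int]:
    """Toy preferential-attachment process (illustrative assumption only).

    Returns a list of family sizes after n_domains domains have been added.
    """
    if n_domains < 1:
        raise ValueError("n_domains must be at least 1")
    rng = random.Random(seed)
    sizes = [1]        # sizes[i] is the number of domains in family i
    family_of = [0]    # family index of every domain added so far
    for _ in range(n_domains - 1):
        if rng.random() < p_new:
            # Innovation: the new domain founds a new family.
            sizes.append(1)
            family_of.append(len(sizes) - 1)
        else:
            # Picking a random existing domain selects its family with
            # probability proportional to the family's size.
            fam = rng.choice(family_of)
            sizes[fam] += 1
            family_of.append(fam)
    return sizes


if __name__ == "__main__":
    sizes = sorted(simulate_family_sizes(10_000), reverse=True)
    print("families:", len(sizes))
    print("largest five:", sizes[:5])
    print("families of size 1:", sum(1 for s in sizes if s == 1))
```

Running it typically shows a few very large families and many small ones, the kind of skewed distribution the quote describes. Whether the tail is a true power law would need a proper fit, which this sketch does not attempt.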
| Quote |
| Intrinsic disorder refers to segments or to whole proteins that fail to self-fold into fixed 3D structure, with such disorder sometimes existing in the native state. Here we report data on the relationships among intrinsic disorder, sequence complexity as measured by Shannon's entropy, and amino acid composition. Intrinsic disorder identified in protein crystal structures, and by nuclear magnetic resonance, circular dichroism, and prediction from amino acid sequence, all exhibit similar complexity distributions that are shifted to lower values compared to, but significantly overlapping with, the distribution for ordered proteins. |
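The sequence complexity measure named in the quote above, Shannon's entropy, can be sketched in a few lines. This is a minimal illustration, assuming entropy is computed in bits from residue frequencies; the sliding-window helper and its default `window=45` are hypothetical choices for the example, not parameters taken from the quoted paper.

```python
import math
from collections import Counter


def shannon_entropy(sequence: str) -> float:
    """Shannon entropy (bits per residue) of the residue composition."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    total = len(sequence)
    counts = Counter(sequence)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def windowed_entropy(sequence: str, window: int = 45) -> list[float]:
    """Entropy of every full-length sliding window (window size is an assumption)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        shannon_entropy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


if __name__ == "__main__":
    # A single repeated residue has the lowest possible complexity;
    # four equally frequent residues give 2 bits.
    print(shannon_entropy("AAAAAAAA"))
    print(shannon_entropy("ACDEACDE"))
```

With the 20 standard amino acids the maximum is log2(20), about 4.32 bits, so lower values indicate a more repetitive or compositionally biased segment.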
| Quote |
| We describe the creation of folded chimaeric proteins by combining a designed polypeptide segment (bait) derived from a beta-sheet of a human antibody variable domain with random polypeptide segments encoded by human cDNA fragments. The repertoire of polypeptides was displayed on the surface of filamentous bacteriophage and folded polypeptides were selected by proteolysis. One of these, 2a6, was readily expressed in the Escherichia coli cytoplasm as a soluble and protease-resistant protein and could be purified after heating the bacterial lysate to 90 °C. Soluble 2a6 is dimeric and its CD spectrum is consistent with components of both alpha and beta structure. 2a6 cooperatively and reversibly unfolds by heat or urea with a folding energy of 11.4 kcal mol⁻¹ for the transition between folded dimer and unfolded monomer and its refolding steps proceed without the formation of detectable aggregates. Its stability and folding properties are therefore typical of native proteins. Sequence analysis revealed that the cDNA segment in 2a6 was recruited from the antisense strand of a human gene, suggesting that antisense sequences can provide a reservoir for the evolution of soluble and stable proteins. |
| Quote |
| A great challenge to biologists is to create proteins with novel folds and tailored functions. As an alternative to de novo protein design, we investigated the structure of a randomly generated protein targeted to bind ATP. The crystal structure reveals a novel alpha/beta fold bound to its ligand, representing both the first protein structure derived from in vitro evolution and the first nucleotide-binding protein stabilized by a zinc ion. |