The 'combinatorial explosion' problem generally refers to the difficulty of locating a unique solution to a problem when the space of potential solutions to be searched is astronomically large. It crops up in many areas of science, but most notably in protein folding, where it goes by the name of "Levinthal's Paradox". The biochemist Cyrus Levinthal pointed out in the late 1960s that if a chain of amino acids had to explore every possible combination of conformations of its residues, even a small protein of 100 residues or so would take longer than the age of the universe to find its correct folded structure.
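To get a feel for the numbers behind Levinthal's argument, here is a back-of-the-envelope sketch. Every input is an assumption chosen for illustration and none comes from a measurement: three conformations per residue, a very generous sampling rate of 10^13 conformations per second, and an age of the universe of about 13.8 billion years. The function name exhaustive_search_years is made up for this example.

```python
# Back-of-the-envelope Levinthal estimate.
# All constants below are assumed, illustrative values.
RESIDUES = 100                # "a small protein of 100 residues or so"
STATES_PER_RESIDUE = 3        # assumed conformations available to each residue
SAMPLES_PER_SECOND = 1e13     # assumed (very generous) sampling rate
SECONDS_PER_YEAR = 3.156e7
AGE_OF_UNIVERSE_YEARS = 1.38e10


def exhaustive_search_years(residues, states, rate):
    """Years needed to visit every conformation once at the given rate."""
    conformations = states ** residues   # the combinatorial explosion
    return conformations / rate / SECONDS_PER_YEAR


years = exhaustive_search_years(RESIDUES, STATES_PER_RESIDUE, SAMPLES_PER_SECOND)
ratio = years / AGE_OF_UNIVERSE_YEARS

print(f"Exhaustive search: about {years:.1e} years")
print(f"That is about {ratio:.1e} times the age of the universe")
```

Under these assumptions the search takes on the order of 10^27 years. Changing the assumed number of states or the sampling rate by a few orders of magnitude barely dents the conclusion, which is the whole point of the paradox.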
The paradox is clearly not a paradox, since nature has solved the problem of protein folding countless times since life began on this planet (this is the protein-centric version of the anthropic principle). Thus, the combinatorial 'problem' is not a problem insofar as we know that a robust, tried-and-tested solution exists and has in fact been used by nature to stunning effect. The real problem is to figure out the devilish details of this solution. In the past 30 years or so, scientists have employed a battery of experimental and theoretical techniques to tackle the issue. What has emerged is that the way forward lies in understanding the factors that dictate the self-assembly of proteins. Probably the foremost among these factors is the hydrophobic effect, which productively buries greasy chemical functionalities in the interior of proteins, utilizing the multiple driving engines of favorable desolvation, entropic expulsion of water and weak packing-induced interactions. Other important factors ubiquitously used by nature include hydrogen bonds and salt bridges.
The key insight in tackling the problem has been to realize that protein folding, protein-protein interactions and indeed all the myriad biomolecular interactions that occur in the cellular milieu do not arise 'by chance'. Once we get past this stumbling block, things make a lot of sense. Chance events undoubtedly keep on happening, but nature preferentially preserves the consequences of certain events. Thus, motifs that have been used successfully in certain proteins are used again in others. Nature does not need to search all of conformational space again and again to generate new structures. The analogy would be designing a new house from existing elements like bricks, arches and beams rather than designing it from scratch. A hundred and fifty years ago, a Victorian Englishman coined a term for this process, in which the preservation of favorable elements leads to new biological entities: natural selection. Thus, the protein folding problem is immediately demystified when one realizes that natural selection keeps on using recurring motifs to build new structures. Far from being a chance event, the complexity of life can be explained by the re-use of pre-existing structures to build complexity. This complexity may seem highly improbable and miraculous, but Darwin's genius was to provide a mechanism that explains precisely this illusion of 'design', on both macro and molecular scales. Seen this way, it no longer seems improbable; instead, it offers us a tool of incomparable power to peek into the heart of complex biological phenomena.
From a chemist's point of view, natural selection at the molecular level takes the form of the preservation of low-energy conformations of biomolecules that may also possess other qualities such as stability, catalytic proficiency and rapid replication. Such chemical entities (think 'DNA') will persist and proliferate, and they will be used in multiple designs. Consider coiled-coil structures with their typical seven-residue repeats, or the catalytic triad that cleaves peptide bonds in proteases. Or think of something that's blindingly simple: the phosphate group which, by virtue of its remarkable quality of 'transient stability' to hydrolysis, proves to be the perfect connector for life's Lego pieces. Once nature hit upon such designs, they could be easily employed in many different structures, dramatically reducing the amount of functional space to be searched. From a chemical perspective, the key property of these favored motifs is self-assembly, which is driven by many well-understood physicochemical factors such as the aforementioned hydrophobic effect. Self-assembly, surely one of the greatest inventions of the laws of physics and chemistry, took the problem of the origin of life from miraculous impossibility to tantalizing possibility.
If nature can use pre-existing functionalities to solve the protein folding problem, why can't we do the same? Indeed, many theoretical approaches to protein folding have adopted exactly this strategy. Probably the foremost tool for predicting protein structure today is a suite of programs called Rosetta, which was originally developed by David Baker's group at the University of Washington. In a competition to predict protein structures in 2001, the program did so well that the very famous computational chemist Peter Kollman compared its performance to Babe Ruth's record-breaking feats, with even the second-best competitor lagging woefully behind.
In the next post we will take a look at this program and why it works so successfully.