There are three protein-folding problems in contemporary computational biology. The first is to predict protein structure from sequence; the second is to predict protein folding pathways; the third is the biomatic problem of determining the computational-logical consequences of folding pathways.
"Biological protein-protein interactions differ from the more general class of physical interactions; in a biological interaction, both proteins must be in their proper states (e.g. covalently modified state, conformational state, cellular location state, etc.). Also in every biological interaction, one or both interacting molecules undergo a transition to a new state. This regulation of protein states through protein-protein interactions underlies many dynamic biological processes inside cells.
Therefore, understanding biological interactions requires information on protein states."
Protein folded states are kinetic hubs.
G. Bowman and V. S. Pande.
Proceedings of the National Academy of Sciences, USA 107 10890-10895 (2010)
ABSTRACT. Understanding molecular kinetics, and particularly protein folding, is a classic grand challenge in molecular biophysics. Network models, such as Markov state models (MSMs), are one potential solution to this problem. MSMs have recently yielded quantitative agreement with experimentally derived structures and folding rates for specific systems, leaving them positioned to potentially provide a deeper understanding of molecular kinetics that can lead to experimentally testable hypotheses. Here we use existing MSMs for the villin headpiece and NTL9, which were constructed from atomistic simulations, to accomplish this goal. In addition, we provide simpler, humanly comprehensible networks that capture the essence of molecular kinetics and reproduce qualitative phenomena like the apparent two-state folding often seen in experiments. Together, these models show that protein dynamics are dominated by stochastic jumps between numerous metastable states and that proteins have heterogeneous unfolded states (many unfolded basins that interconvert more rapidly with the native state than with one another) yet often still appear two-state. Most importantly, we find that protein native states are hubs that can be reached quickly from any other state. However, metastability and a web of nonnative states slow the average folding rate. Experimental tests for these findings and their implications for other fields, like protein design, are also discussed.
The basic construction of the model protein is shown, where C-alpha atoms are numbered 1, 2, 3, etc., and the side residues are labeled 1', 2', 3', etc. Note the varying size of the side residues.
Hydrophobic-polar protein folding model
The hydrophobic-polar protein folding model is a highly simplified model for examining protein folds in space. First proposed by Dill in 1985, it is motivated by the observation that hydrophobic interactions between amino acid residues are the driving force for proteins folding into their native state. All amino acid types are classified as either hydrophobic (H) or polar (P), and the folding of a protein sequence is defined as a self-avoiding walk in a 2D or 3D lattice. The HP model imitates the hydrophobic effect by assigning a negative (favorable) weight to interactions between adjacent, non-covalently bound H residues. Proteins that have minimum energy are assumed to be in their native state.
The HP model can be expressed in both two and three dimensions, generally with square lattices, although triangular lattices have been used as well.
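The energy rule described above can be illustrated with a minimal Python sketch for the 2D square lattice. It assumes a contact weight of -1 per pair of adjacent, non-covalently bound H residues, which is a common choice but an assumption here; the function names hp_energy and is_self_avoiding are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def is_self_avoiding(path: List[Coord]) -> bool:
    """Return True if no site is visited twice and consecutive sites are lattice neighbors."""
    if len(set(path)) != len(path):
        return False
    return all(
        abs(x1 - x2) + abs(y1 - y2) == 1
        for (x1, y1), (x2, y2) in zip(path, path[1:])
    )


def hp_energy(sequence: str, path: List[Coord], contact_weight: int = -1) -> int:
    """Energy of an HP sequence placed on a 2D square lattice.

    sequence: string of 'H' and 'P', one character per residue.
    path: lattice coordinates of the residues, in chain order.
    contact_weight: assumed weight per H-H contact (illustrative default of -1).
    """
    if len(sequence) != len(path):
        raise ValueError("sequence and path must have the same length")
    if not is_self_avoiding(path):
        raise ValueError("path is not a self-avoiding walk on the square lattice")

    site_to_index = {site: i for i, site in enumerate(path)}
    energy = 0
    for i, (x, y) in enumerate(path):
        if sequence[i] != "H":
            continue
        # Look only to the right and upward so each pair is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = site_to_index.get(neighbor)
            # abs(i - j) > 1 excludes residues that are covalently bound in the chain.
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy += contact_weight
    return energy


# A four-residue chain folded into a unit square brings residues 0 and 3 into contact.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
line = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(hp_energy("HPPH", square))  # -1
print(hp_energy("HPPH", line))    # 0
```

Under this scoring, the square conformation is preferred over the extended one for the sequence HPPH, because it is the only one of the two with an H-H contact.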
Randomized search algorithms are often used to tackle the HP folding problem. These include stochastic methods such as the Monte Carlo method, as well as evolutionary and swarm-inspired methods such as genetic algorithms and ant colony optimization. While no method has been able to calculate the experimentally determined minimum energetic state for long protein sequences, the most advanced methods today are able to come close.
Recently, a Monte Carlo method, named FRESS, was developed and appears to perform well on HP models.
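To make the idea of a randomized search concrete, here is a minimal sketch of a plain Metropolis Monte Carlo search on the 2D square-lattice HP model. It is not FRESS and not any published method. The pivot move set, the temperature, the step count, the contact weight of -1, and all function names are assumptions chosen for illustration.

```python
import math
import random


def hp_energy(seq, path):
    """Count -1 for every pair of lattice-adjacent H residues that are not chain neighbors."""
    index = {site: i for i, site in enumerate(path)}
    energy = 0
    for i, (x, y) in enumerate(path):
        if seq[i] != "H":
            continue
        for neighbor in ((x + 1, y), (x, y + 1)):  # each pair counted once
            j = index.get(neighbor)
            if j is not None and seq[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


def pivot(path, k, quarter_turns):
    """Rotate every residue after index k about residue k by quarter_turns * 90 degrees."""
    px, py = path[k]
    new_path = list(path[: k + 1])
    for x, y in path[k + 1:]:
        dx, dy = x - px, y - py
        for _ in range(quarter_turns):
            dx, dy = -dy, dx  # one counterclockwise quarter turn
        new_path.append((px + dx, py + dy))
    return new_path


def metropolis_fold(seq, steps=2000, temperature=1.0, seed=0):
    """Return the lowest-energy conformation seen and its energy."""
    if len(seq) < 3:
        raise ValueError("need at least three residues to apply a pivot move")
    rng = random.Random(seed)
    path = [(i, 0) for i in range(len(seq))]  # start from a fully extended chain
    energy = hp_energy(seq, path)
    best_path, best_energy = path, energy
    for _ in range(steps):
        k = rng.randrange(1, len(seq) - 1)
        candidate = pivot(path, k, rng.choice((1, 2, 3)))
        if len(set(candidate)) != len(candidate):
            continue  # reject moves that make the chain overlap itself
        candidate_energy = hp_energy(seq, candidate)
        delta = candidate_energy - energy
        # Metropolis rule: always accept downhill moves, sometimes accept uphill ones.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            path, energy = candidate, candidate_energy
            if energy < best_energy:
                best_path, best_energy = path, energy
    return best_path, best_energy


best_path, best_energy = metropolis_fold("HPHPPHHPHPPHPHHPPHPH", steps=5000, seed=1)
print(best_energy)
print(best_path)
```

A search like this gives no guarantee of reaching the true minimum; it only reports the best conformation it happened to visit, which is why more elaborate move sets and sampling schemes are used in practice.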
Structured Pathway across the Transition State for Peptide Folding Revealed by Molecular Dynamics Simulations
The folding dynamics of many small proteins and peptides investigated recently have been described in terms of a simple two-state model in which only two populations exist (folded and unfolded), separated by a single free-energy barrier with only one kinetically important transition state (TS). However, dynamical characterization of the folding TS is challenging. We have used independent unbiased atomistic molecular dynamics simulations with clear folding-unfolding transitions to characterize structural and dynamical features of the transition state ensemble of Peptide 1. A common loop-like topology is observed in all TS structures extracted from multiple simulations. The trajectories were used to examine the mechanism by which the TS is reached and the subsequent events in the folding pathways. The folding TS is reached and crossed in a directed stagewise process rather than through random fluctuations. Specific structures are formed before, during, and after the transition state, indicating a clear structured folding pathway.
Finite state model
The folding of a protein can be described in terms of Finite State Machine (FSM) theory, one of the mathematical underpinnings of computer science.
The Amino Acid Code
The Histone Code
Chaperone proteins within the endoplasmic reticulum play an essential role in facilitating the folding of newly synthesized proteins and in recognizing and segregating misfolded proteins, thereby preventing their transit to the Golgi.
Protein Folding at Stanford
Folding by Rotation about Single Bonds:
Since bond length and angles are fairly invariant in the known protein structures, the key to protein folding lies in the torsion angles of the backbone.
A torsion angle is defined by four atoms, A, B, C and D: it is the angle between the plane containing A, B and C and the plane containing B, C and D.
When atoms A, B, C and D are main-chain atoms (i.e. the carboxylic carbon, C1; the alpha carbon, C2 or C-alpha; and the amide group nitrogen, N), there are three repeating torsion angles along the backbone chain, called phi, psi and omega.
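As an illustration of how a torsion angle follows from four atomic positions, the sketch below computes the signed angle from Cartesian coordinates using the standard atan2 formulation. The function name torsion_angle and the example coordinates are hypothetical. The sign convention assumed here is the usual one: 0 degrees for the cis (eclipsed) arrangement and 180 degrees for trans.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def torsion_angle(a, b, c, d):
    """Signed torsion angle in degrees for four points a, b, c, d (each an (x, y, z) tuple)."""
    b1 = _sub(b, a)  # bond A -> B
    b2 = _sub(c, b)  # central bond B -> C, the rotation axis
    b3 = _sub(d, c)  # bond C -> D
    n1 = _cross(b1, b2)  # normal to the plane through A, B, C
    n2 = _cross(b2, b3)  # normal to the plane through B, C, D
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: A and D on the same side of the B-C bond (cis),
# then D rotated a quarter turn out of the plane.
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0)))   # 0.0
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, -1)))  # 90.0
```

For a protein backbone, the same function would be applied to consecutive main-chain atoms: phi uses C(i-1), N(i), C-alpha(i), C(i); psi uses N(i), C-alpha(i), C(i), N(i+1); and omega uses C-alpha(i), C(i), N(i+1), C-alpha(i+1).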
Prion diseases or transmissible spongiform encephalopathies (TSEs) are a family of rare progressive neurodegenerative disorders that affect both humans and animals. They are distinguished by long incubation periods, characteristic spongiform changes associated with neuronal loss, and a failure to induce inflammatory response.
The causative agent of TSEs is believed to be a prion. A prion is an abnormal, transmissible agent that is able to induce abnormal folding of normal cellular prion proteins in the brain, leading to brain damage and the characteristic signs and symptoms of the disease. Prion diseases are usually rapidly progressive and always fatal.
Amyloids are insoluble fibrous protein aggregates sharing specific structural traits. They arise from at least 18 inappropriately folded versions of proteins and polypeptides present naturally in the body. These misfolded structures alter their proper configuration such that they erroneously interact with one another or with other cell components, forming insoluble fibrils. They have been associated with the pathology of more than 20 serious human diseases: abnormal accumulation of amyloid fibrils in organs may lead to amyloidosis and may play a role in various neurodegenerative disorders.
Our cells contain tiny molecular clocks that measure out a 24-hour circadian rhythm. This clock decides when we get hungry and when we get sleepy. It can sense when the days are getting longer and shorter, and then trigger seasonal changes. Our major clock is housed in a small region of the brain called the suprachiasmatic nuclei. It acts as our central pacemaker, checking the cycles of light and dark outside, and then sending signals to synchronize clocks throughout the rest of the body.
Animal cells use a complex collection of proteins (with fanciful names like Clock, Cryptochrome, and Period) that are rhythmically synthesized and degraded each day. The 24-hour oscillation of the levels of these proteins is controlled by a series of interconnected feedback loops, where the levels of the proteins precisely regulate their own production. A much simpler system has been discovered in cyanobacteria. It is composed of three proteins, KaiA, KaiB and KaiC, that together form a circadian clock. At the beginning of the cycle, KaiA (PDB entry 1r8j) stimulates the large KaiC hexamer (PDB entry 2gbl), which then adds phosphate groups to itself. Then, as KaiC fills itself up with phosphates, it binds to KaiB (PDB entry 1r5p), which inactivates KaiA and allows the phosphates to be slowly removed. As the number of phosphates drops, KaiB falls off and KaiA can start the cycle again.