The Amino Acid Code - Biomatics.org

# The Amino Acid Code

Posttranslational modification (PTM) is the chemical modification of a protein after its translation. It is one of the later steps in protein biosynthesis for many proteins.  A protein (also called a polypeptide) is a chain of amino acids. During protein synthesis, 20 different amino acids can be incorporated in proteins. After translation, the posttranslational modification of amino acids extends the range of functions of the protein by attaching to it other biochemical functional groups such as acetate, phosphate, various lipids and carbohydrates, by changing the chemical nature of an amino acid (e.g. citrullination) or by making structural changes, like the formation of disulfide bridges. This is one category of input code into a protein.  "Amino Acid Code" may have other meanings as well.

Atoms in Amino Acids

## The Twenty Elementary Circuits

The growing similarities between modern microelectronic circuits and living systems increasingly suggest the electronic nature of the latter. Indeed, the ideas of quantum-mechanical electron tunnelling or hopping between localised states developed in physics to explain conduction in non-crystalline solids are readily applicable to biology. Electron tunnelling transitions between energy states localized in biomolecules can explain such vital processes as photosynthesis, respiration and visual reception, and also form a logical framework to explain reactions induced by high energy radiolysis. It is suggested that there is sufficient evidence to support an argument that the same quantum-mechanical tunnelling process will occur through the localised states set up in biopolymers such as the proteins. The consequence would be highly vectorial electron transport along biopolymer pathways, the results of which are shown to have far-reaching implications. There is considerable difficulty in verifying these ideas experimentally and progress has been slow, but it is very probable that electronic physics is now poised to provide a framework for a new understanding of biology.

Electron density in the amino acid cysteine, calculated using a quantum-chemistry computer program. The picture shows the surface where the electron density is 0.002 electrons/Å³ (meaning that nearly all electrons are inside the surface). The grey scale shows the electrostatic potential at this surface, with darker portions representing negative potential.

### Analog Computation

An analog computer (spelled analogue in British English) is a form of computer that uses continuous physical phenomena such as electrical, mechanical, or hydraulic quantities to model the problem being solved.

The similarity between linear mechanical components, such as springs and dashpots, and electrical components, such as capacitors, inductors, and resistors is striking in terms of mathematics. They can be modeled using equations that are of essentially the same form.
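The shared form can be written out explicitly. As a sketch, assume a lumped mass $m$ (not mentioned above) attached to a spring of stiffness $k$ and a dashpot of damping coefficient $c$, driven by a force $F(t)$, and a series loop of inductance $L$, resistance $R$ and capacitance $C$, driven by a voltage $V(t)$ and carrying charge $q$:

$$m\,\ddot{x} + c\,\dot{x} + k\,x = F(t)$$

$$L\,\ddot{q} + R\,\dot{q} + \frac{1}{C}\,q = V(t)$$

Under this (force-voltage) pairing, displacement corresponds to charge, mass to inductance, damping to resistance, and stiffness to inverse capacitance, so a solution of one equation is a solution of the other after renaming the symbols.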

In analog computers, computations are often performed using properties of electrical quantities such as resistance and voltage. For example, a simple two-variable adder can be built from two current sources in parallel. The first value is set by adjusting the first current source (to, say, x milliamperes), and the second value is set by adjusting the second current source (say, y milliamperes). The two currents combine at their junction and flow through a resistance to signal ground, so measuring the current through that resistance gives the sum, x + y milliamperes (see Kirchhoff's current law). Other calculations are performed similarly, using operational amplifiers and specially designed circuits for other tasks.
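The adder can be mimicked numerically. This is only a sketch: the function name `current_adder` and the 1 kΩ sense resistor are illustrative assumptions, not part of any circuit described here.

```python
def current_adder(x_ma: float, y_ma: float, r_ohm: float = 1000.0):
    """Return (summed current in mA, voltage across the sense resistor in V)."""
    # Kirchhoff's current law: currents entering the junction add.
    total_ma = x_ma + y_ma
    # Ohm's law across the (assumed) sense resistor, converting mA to A.
    volts = (total_ma / 1000.0) * r_ohm
    return total_ma, volts


total, v = current_adder(2.0, 3.0)
print(total, v)
```

Reading the voltage across the resistor is how the summed current would be observed in practice; the resistor value only sets the scale.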

Nonlinear functions and calculations can be constructed to a limited precision (three or four digits) by designing function generators: special circuits built from various combinations of capacitance, inductance, and resistance, together with diodes (e.g., Zener diodes) to provide the nonlinearity. Generally, a nonlinear function is simulated by a nonlinear waveform whose shape varies with voltage (or current). For example, as voltage increases, the total impedance may change as the diodes successively permit current to flow.

Any physical process which models some computation can be interpreted as an analog computer.

Analog circuits are circuits dealing with signals free to vary from zero to full power supply voltage. This stands in contrast to digital circuits, which almost exclusively employ "all or nothing" signals: voltages restricted to values of zero and full supply voltage, with no valid state in between those extreme limits. Analog circuits are often referred to as linear circuits to emphasize the valid continuity of signal range forbidden in digital circuits, but this label is unfortunately misleading. Just because a voltage or current signal is allowed to vary smoothly between the extremes of zero and full power supply limits does not necessarily mean that all mathematical relationships between these signals are linear in the "straight-line" or "proportional" sense of the word.

Perhaps the most versatile and important analog integrated circuit is the operational amplifier, or op-amp. Essentially nothing more than a differential amplifier with very high voltage gain, op-amps are the workhorse of the analog design world. By cleverly applying feedback from the output of an op-amp to one or more of its inputs, a wide variety of behaviors may be obtained from this single device. Many different models of op-amp are available.

Components

Analog computers often have a complicated framework, but they have, at their core, a set of key components which perform the calculations, which the operator manipulates through the computer's framework.

Key hydraulic components might include pipes, valves or towers; mechanical components might include gears and levers; key electrical components might include:

The core mathematical operations used in an electric analog computer are:

Differentiation with respect to time is not frequently used. It corresponds in the frequency domain to a high-pass filter, which means that high-frequency noise is amplified.
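The noise amplification can be illustrated numerically. The sketch below is not a circuit model; it applies a central-difference derivative to two sine waves of equal amplitude and compares the peak of the result. The frequencies (10 Hz and 1000 Hz), time step and function name are arbitrary assumptions chosen for illustration.

```python
import math


def derivative_amplitude(freq_hz: float, amplitude: float = 1.0,
                         dt: float = 1e-5, n: int = 20000) -> float:
    """Peak magnitude of the numerical time derivative of a sine wave."""
    w = 2.0 * math.pi * freq_hz
    samples = [amplitude * math.sin(w * k * dt) for k in range(n)]
    # Central difference approximates d/dt at the interior samples.
    deriv = [(samples[k + 1] - samples[k - 1]) / (2.0 * dt)
             for k in range(1, n - 1)]
    return max(abs(d) for d in deriv)


signal_peak = derivative_amplitude(10.0)    # slow "signal"
noise_peak = derivative_amplitude(1000.0)   # fast "noise", same amplitude
ratio = noise_peak / signal_peak
print(ratio)  # close to 100, the ratio of the two frequencies
```

Because the derivative of sin(ωt) is ω cos(ωt), the gain grows in proportion to frequency, which is the high-pass behaviour described above.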

## The Twenty Elemental Algebraic Structures

Representative electron density for amino acid side chains arranged in order of increasing size.
From an experimental electron density map calculated at 1.5 Angstrom resolution.

Stick model of the cyclol C1 protein structure proposed by Dorothy Wrinch. The molecule is a truncated tetrahedron composed of four planar cyclol fabrics, each surrounding one lacuna (48 residues), and joined together pairwise by four residues along each edge (two residues at each corner). Thus, this molecule has 72 amino-acid residues altogether. It is viewed here "face-on", i.e., looking into the lacuna of one cyclol fabric. The side chains (taken here as alanine) all point into the interior of this "cage-like" structure. This hypothetical structure has not been observed in nature.

Model Building

This is the process in which the electron density map is interpreted in terms of a set of atomic coordinates. This is more straightforward in the molecular replacement case because we already have a coordinate set to work with. In the case of isomorphous replacement we simply have the map. It is essentially a 3-dimensional jigsaw puzzle with the pieces being the amino acid residues. The normal procedure is to fit a protein backbone first and then, if the resolution permits, to insert the sequence. The amount of detail that is visible depends on the resolution and the quality of the phases. Shown below is a high-resolution electron density map with atomic coordinates superposed. Often regions of high flexibility are not visible at all due to static disorder, where the structure varies from one molecule to the next within the crystal, or dynamic disorder, where the region is mobile within the crystal. The latter type of disorder is eradicated in cryogenic data collection.

http://www.jic.ac.uk/staff/david-lawson/xtallog/summary.htm#wdwnac

A Boolean algebra is a mathematical structure that is similar to a Boolean ring, but that is defined using the meet and join operators instead of the usual addition and multiplication operators. Explicitly, a Boolean algebra is the partial order on subsets defined by inclusion (Skiena 1990, p. 207), i.e., the Boolean algebra $b(A)$ of a set $A$ is the set of subsets of $A$ that can be obtained by means of a finite number of the set operations union (OR), intersection (AND), and complementation (NOT) (Comtet 1974, p. 185). A Boolean algebra also forms a lattice (Skiena 1990, p. 170), and each of the elements of $b(A)$ is called a Boolean function. There are $2^{2^n}$ Boolean functions in a Boolean algebra of order $n$ (Comtet 1974, p. 186).

In 1938, Shannon proved that a two-valued Boolean algebra (whose members are most commonly denoted 0 and 1, or false and true) can describe the operation of two-valued electrical switching circuits. In modern times, Boolean algebra and Boolean functions are therefore indispensable in the design of computer chips and integrated circuits.
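The correspondence between switches and Boolean operations can be sketched in a few lines. The network below (switch `a` in series with the parallel pair `b`, `c`) is a hypothetical example chosen for illustration, not one taken from Shannon's work.

```python
from itertools import product


def series(a: bool, b: bool) -> bool:
    """Two switches in series conduct only if both are closed (AND)."""
    return a and b


def parallel(a: bool, b: bool) -> bool:
    """Two switches in parallel conduct if either is closed (OR)."""
    return a or b


def circuit(a: bool, b: bool, c: bool) -> bool:
    """Hypothetical network: a in series with (b in parallel with c)."""
    return series(a, parallel(b, c))


# Truth table over all 2**3 switch settings.
truth_table = {bits: circuit(*bits)
               for bits in product([False, True], repeat=3)}
for bits, conducts in truth_table.items():
    print(bits, conducts)
```

The network conducts exactly when the Boolean expression `a AND (b OR c)` is true, so analysing the circuit reduces to evaluating the expression.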

Boolean algebras have a recursive structure that is apparent in the Hasse diagrams for Boolean algebras of orders $n = 2$, 3, 4, and 5. These diagrams show the partition between the left and right halves of the lattice, each of which is the Boolean algebra on $n-1$ elements (Skiena 1990, pp. 169-170). The Hasse diagram for the Boolean algebra of order $n$ is implemented as BooleanAlgebra[n] in the *Mathematica* package Combinatorica`. It is isomorphic to the $n$-hypercube graph.
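The hypercube correspondence can be checked for small cases. The sketch below assumes the Boolean algebra is realised as the subsets of {0, ..., n-1} ordered by inclusion, and builds the covering pairs (edges of the Hasse diagram) directly; the function name `hasse_edges` is illustrative and unrelated to the Combinatorica implementation.

```python
from itertools import combinations


def hasse_edges(n: int):
    """Subsets of {0..n-1} and the covering pairs (S, S plus one element)."""
    elements = range(n)
    subsets = [frozenset(c)
               for r in range(n + 1)
               for c in combinations(elements, r)]
    # An edge joins S to each set obtained by adding one missing element.
    edges = [(s, s | {e}) for s in subsets for e in elements if e not in s]
    return subsets, edges


subsets, edges = hasse_edges(3)
print(len(subsets), len(edges))
```

For n = 3 this gives 8 vertices and 12 edges, matching the cube: an n-hypercube has 2^n vertices and n·2^(n-1) edges.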

The name "lattice" is suggested by the form of the Hasse diagram depicting it. Shown here is the lattice of partitions of a four-element set, ordered by the relation "is a refinement of".

In mathematics, a lattice is a partially ordered set (also called a poset) in which every pair of elements has a unique supremum (the elements' least upper bound, called their join) and a unique infimum (their greatest lower bound, called their meet). Lattices can also be characterized as algebraic structures satisfying certain axiomatic identities. Since the two definitions are equivalent, lattice theory draws on both order theory and universal algebra. Semilattices include lattices, which in turn include Heyting and Boolean algebras. These "lattice-like" structures all admit order-theoretic as well as algebraic descriptions.

Various representations of Boolean operations

In electronics, a three-dimensional integrated circuit (3D IC, 3D-IC, or 3-D IC) is a chip with two or more layers of active electronic components, integrated both vertically and horizontally into a single circuit.

3D ICs offer many significant benefits, including:

Footprint – More functionality fits into a small space. This extends Moore’s Law and enables a new generation of tiny but powerful devices.

Speed – The average wire length becomes much shorter. Because propagation delay is proportional to the square of the wire length, overall performance increases.

Power – Keeping a signal on-chip reduces its power consumption by ten to a hundred times. Shorter wires also reduce power consumption by producing less parasitic capacitance. Reducing the power budget leads to less heat generation.

Design – The vertical dimension adds a higher order of connectivity and opens a world of new design possibilities.

Heterogeneous integration – Circuit layers can be built with different processes, or even on different types of wafers. This means that components can be optimized to a much greater degree than if they were built together on a single wafer. Even more interesting, components with completely incompatible manufacturing could be combined in a single device.

Circuit security – The stacked structure hinders attempts to reverse engineer the circuitry. Sensitive circuits may also be divided among the layers in such a way as to obscure the function of each layer.

Bandwidth – 3D integration allows large numbers of vertical vias between the layers. This allows construction of wide-bandwidth buses between functional blocks in different layers. A typical example would be a processor+memory 3D stack, with the cache memory stacked on top of the processor. This arrangement allows a bus much wider than the typical 128 or 256 bits between the cache and processor. Wide buses in turn alleviate the memory wall problem.

## Dihedral angles of biological molecules

The backbone dihedral angles of proteins are called φ (phi, involving the backbone atoms C'-N-Cα-C'), ψ (psi, involving the backbone atoms N-Cα-C'-N) and ω (omega, involving the backbone atoms Cα-C'-N-Cα). Thus, φ controls the C'-C' distance, ψ controls the N-N distance and ω controls the Cα-Cα distance.
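A dihedral angle is computed from the positions of the four atoms that define it. The sketch below uses the common atan2 formulation with plain tuples for coordinates; the function name and the test coordinates are illustrative assumptions, not data from a real structure.

```python
import math


def dihedral(p0, p1, p2, p3) -> float:
    """Signed dihedral angle in degrees defined by four 3D points."""
    def sub(a, b):
        return [a[i] - b[i] for i in range(3)]

    def dot(a, b):
        return sum(a[i] * b[i] for i in range(3))

    def cross(a, b):
        return [a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]]

    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)  # normals of the two planes
    b1_len = math.sqrt(dot(b1, b1))
    # atan2 keeps the sign of the rotation as well as its magnitude.
    y = dot(cross(n1, n2), b1) / b1_len
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Illustrative geometry: central bond along z, first atom along +x.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(cis, trans, quarter)
```

For φ the four points would be the C', N, Cα and C' positions listed above, and likewise for ψ and ω with their respective atoms.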

The planarity of the peptide bond usually restricts ω to be 180° (the typical trans case) or 0° (the rare cis case). The distance between the Cα atoms in the trans and cis isomers is approximately 3.8 and 2.9 Å, respectively. The cis isomer is mainly observed in Xaa-Pro peptide bonds (where Xaa is any amino acid).

The sidechain dihedral angles of proteins are denoted as χ1 to χ5, depending on the distance up the sidechain. The χ1 dihedral angle is defined by atoms N-Cα-Cβ-Cγ, the χ2 dihedral angle is defined by atoms Cα-Cβ-Cγ-Cδ, and so on.

The sidechain dihedral angles tend to cluster near 180°, 60°, and -60°, which are called the trans, gauche+, and gauche- conformations. The choice of sidechain dihedral angles is affected by the neighbouring backbone and sidechain dihedrals; for example, the gauche+ conformation is rarely followed by the gauche+ conformation (and vice versa) because of the increased likelihood of atomic collisions.
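The clustering suggests a simple labelling rule: assign each angle to the nearest of the three ideal values. The sketch below follows the naming used in the paragraph above (trans at 180°, gauche+ at 60°, gauche- at -60°); the 30° tolerance and the function name are assumptions made for illustration.

```python
def classify_rotamer(chi_deg: float, tolerance: float = 30.0) -> str:
    """Label a side-chain dihedral by the nearest ideal staggered value."""
    ideal = {"trans": 180.0, "gauche+": 60.0, "gauche-": -60.0}

    def angular_distance(a: float, b: float) -> float:
        # Wrap the difference into [-180, 180) so -175 counts as near 180.
        return abs((a - b + 180.0) % 360.0 - 180.0)

    name, centre = min(ideal.items(),
                       key=lambda kv: angular_distance(chi_deg, kv[1]))
    if angular_distance(chi_deg, centre) <= tolerance:
        return name
    return "unassigned"


for angle in (-175.0, 65.0, -58.0, 0.0):
    print(angle, classify_rotamer(angle))
```

Angles that fall outside the tolerance, such as an eclipsed 0°, are left unassigned rather than forced into a class.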

Dihedral angles have also been defined by the IUPAC for other molecules, such as the nucleic acids (DNA and RNA) and for polysaccharides.

The interesting discovery of Baranov and Schlag was that the energy difference is not constant but depends strongly on the angle between the adjacent carbamide groups. The N and O orbitals appear to switch. For the ionized species in a small range of ψ and φ, when the carbonyl groups of the neighboring amino acids are only about 2.87 Å apart, the electronic energy difference reaches a minimum. For the symmetric approach of carbonyls from the other side and the other ion, a similar state exists. These two states at this point are isoenergetic, with little or no energy barrier between them. At this angle they are also strongly correlated and form one hybridized state. We refer to this as the firing state for charge hopping.

Charge transport in a polypeptide. (a) Polypeptide chain. Charge is first created on the donor and then hops through the amino acid chain until reaching the acceptor. (b) On each amino acid, the motion of the rotors is mapped into a Ramachandran plot. Here a simple two-dimensional (2D) area in phase space represents the hinge, i.e. the junction of two amino acids. The exit or gate part (orange) is the charge ratchet position. After the motion of rotors reaches the gate part, charge jumps to the next amino acid. The iteration of the previous procedure makes the charge hop to the final site.

## Conformational Transition Algebra

The above image is a Moncznik (Perry Moncznik) multiplication table of an amino acid with 4 dynamic binary sites. According to Mealy-Moore models, there are 16 states with 256 possible state transitions (16 x 16), which could represent 16 processes as well as switches for 256 possible processes. This is, of course, a simplification, because the inputs are not all-or-none but graded, owing to the differing sizes of the molecules able to bind to receptor sites.
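The counts quoted above follow from simple enumeration. The sketch below treats each of the four sites as a bit and lists every ordered pair of states; the additional single-flip count is an illustration added here, not a claim from the text.

```python
from itertools import product

N_SITES = 4  # four binary sites, as in the text

# Every assignment of 0/1 to the four sites is one state.
states = list(product((0, 1), repeat=N_SITES))

# Every ordered pair (from_state, to_state) is one possible transition.
transitions = [(s, t) for s in states for t in states]

# Illustrative subset: transitions that change exactly one site.
single_flips = [(s, t) for s, t in transitions
                if sum(a != b for a, b in zip(s, t)) == 1]

print(len(states), len(transitions), len(single_flips))
```

This gives 2^4 = 16 states and 16 x 16 = 256 ordered pairs, including the 16 self-transitions in which no site changes.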

In the theory of computation, a Mealy machine is a finite state machine (and more accurately, a finite state transducer) that generates an output based on its current state and an input. This means that the state diagram will include both an input and output signal for each transition edge. In contrast, the output of a Moore finite state machine depends only on the machine's current state; transitions have no output attached. However, for each Mealy machine there is an equivalent Moore machine.
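The difference can be made concrete with a toy example. The sketch below implements a hypothetical "bit changed" detector both ways: the Mealy version computes its output from the state and the input, while the Moore version stores the result in an enlarged state and reads the output from the state alone. Both function names and the detector itself are illustrative assumptions.

```python
def run_mealy(inputs, start=0):
    """Mealy: output depends on (state, input). State = previous bit."""
    state, outputs = start, []
    for bit in inputs:
        outputs.append(state ^ bit)  # output attached to the transition
        state = bit
    return outputs


def run_moore(inputs, start=(0, 0)):
    """Moore: output depends on state only. State = (previous bit, changed)."""
    state, outputs = start, []
    for bit in inputs:
        state = (bit, state[0] ^ bit)  # the transition carries no output
        outputs.append(state[1])       # output read from the new state
    return outputs


bits = [0, 1, 1, 0, 1]
print(run_mealy(bits))
print(run_moore(bits))
```

Both runs produce the same output sequence, but the Moore version needs four states where the Mealy version needs two, which is the usual cost of the conversion.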

Side Chain Conformation

The side chain atoms of amino acids are named in the Greek alphabet according to this scheme.

The side chain torsion angles are named χ1 (chi1), χ2 (chi2), χ3 (chi3), etc., as shown below for lysine.

The χ1 angle is subject to certain restrictions which arise from steric hindrance between the γ side chain atom(s) and the main chain. The different conformations of the side chain as a function of χ1 are referred to as gauche(+), trans and gauche(-). These are indicated in the diagrams below in which the amino acid is viewed along the Cβ-Cα bond.

The most abundant conformation is gauche(+), in which the γ side chain atom is opposite to the residue's main chain carbonyl group when viewed along the Cβ-Cα bond.

The second most abundant conformation is trans, in which the side chain γ atom is opposite the main chain nitrogen.

The least abundant conformation is gauche(-), which occurs when the side chain is opposite the hydrogen substituent on the Cα atom. This conformation is unstable because the γ atom is in close contact with the main chain CO and NH groups. The gauche(-) conformation is occasionally adopted by serine or threonine residues in a helix where the steric hindrance is offset by a hydrogen bond between the γ oxygen atom and the main chain.

With most amino acids the gauche(+) and trans conformations are adopted with similar abundances although the gauche(+) conformation tends to dominate.

Aliphatic amino acids which are bifurcated at Cβ, i.e. valine and isoleucine, do not adopt the trans conformation very often as this involves one of the Cγ atoms being in the unfavourable gauche(-) 'position'.

In general, side chains tend to adopt the same three torsion angles (+/-60 and 180 degrees) about χ2 since these correspond to staggered conformations. However, for residues with an sp2 hybridised γ atom such as phenylalanine, tyrosine, etc., χ2 rarely equals 180 degrees because this would involve an eclipsed conformation. For these side chains the χ2 angle is usually close to +/-90 degrees as this minimises close contacts. For residues such as aspartate and asparagine the χ2 angles are strongly influenced by the hydrogen bonding capacity of the side chain and its environment. Consequently, these residues adopt a wide range of χ2 angles.

Here are some conformations that can be adopted by arginine:

## Fundamental Frequency

The fundamental frequency, often referred to simply as the fundamental and abbreviated f0 or F0, is defined as the lowest frequency of a periodic waveform. In terms of a superposition of sinusoids (e.g. Fourier series), the fundamental frequency is the lowest frequency sinusoidal in the sum.

All sinusoidal and many non-sinusoidal waveforms are periodic, which is to say they repeat exactly over time. A single period is thus the smallest repeating unit of a signal, and one period describes the signal completely. We can show a waveform is periodic by finding some period T for which the following equation is true:

$x(t) = x(t + T) = x(t + 2T) = x(t + 3T) = \dots$

Where x(t) is the function of the waveform.

This means that shifting the signal by any whole multiple of the period T leaves its value unchanged. The smallest positive value of T for which this is true is called the fundamental period (T0), and thus the fundamental frequency (F0) is given by the following equation:

$F_0=\frac{1}{T_0}$

Where F0 is the fundamental frequency and T0 is the fundamental period.
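The definition can be applied directly to sampled data. The sketch below searches for the smallest shift that leaves a sampled waveform unchanged and converts it to a frequency. The 1000 Hz sample rate, the test signal (50 Hz plus 150 Hz) and the function name are assumptions for illustration, and the brute-force search is only suitable for clean, exactly periodic samples.

```python
import math


def fundamental_period_samples(x, tol=1e-9):
    """Smallest shift T (in samples) with x[k] == x[k + T] for all valid k."""
    n = len(x)
    for shift in range(1, n // 2 + 1):
        if all(abs(x[k] - x[k + shift]) <= tol for k in range(n - shift)):
            return shift
    return None  # no period found within half the record


sample_rate = 1000.0  # Hz, assumed
# Sum of 50 Hz and 150 Hz sinusoids; the lowest component is 50 Hz.
x = [math.sin(2 * math.pi * 50 * k / sample_rate)
     + 0.5 * math.sin(2 * math.pi * 150 * k / sample_rate)
     for k in range(200)]

t0 = fundamental_period_samples(x) / sample_rate  # fundamental period, s
f0 = 1.0 / t0                                     # F0 = 1 / T0
print(t0, f0)
```

The 150 Hz component repeats more often than the sum does, but the waveform as a whole only repeats every 20 samples (0.02 s), so the fundamental frequency is that of the lowest component.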

If a protein is viewed as being fixed in position at one end, the amino end, then rotation about the first Ramachandran angle can be regarded as the fundamental frequency of the protein.

## Alanine

Accurate geometries, relative energies, rotational and quartic centrifugal distortion constants, dipole moments, harmonic vibrational frequencies, and infrared intensities have been determined from ab initio calculations for 13 conformers, corresponding to minima on the potential energy surface, of the neutral form of the amino acid α-alanine.

http://theop11.chem.elte.hu/main_index_files/1996_Csaszar_Alanine_JPC_100_3541.pdf

Alanine, like many molecular compounds, exists as a mixture of conformational structures. A full understanding of the properties of this substance requires an accurate description of this mixture. Calculations in the literature indicate that there are ten relatively stable conformers for alanine.

http://www.etown.edu/docs/PhysicsEngineering/projects/projects%202007/ElectronicStructure.pdf

## Lysine

Conformational projection onto a 2D hexagonal grid assuming 2 stable conformations at each bond: