Protein Structure & Conformation

In three dimensions, a molecular structure is uniquely described by the Cartesian coordinates (x, y, z) or the internal coordinates (bond lengths, bond angles and torsion angles) of each atom constituting the molecule. The latter description is preferred by theoretical and computational chemists dealing with small molecules. Proteins and other biomolecular structures are described by 3D coordinates (x, y, z), available from the Protein Data Bank (PDB) repository, which are orthogonal coordinates, i.e. coordinates in Euclidean space. For small molecules, fractional coordinates (i.e. fractions of the unit cell, x/a, y/b and z/c, where a, b, c are the crystallographic unit cell dimensions) are common. These coordinate systems are easily interconvertible. The following are four primary databases of structures of all types of molecules.

Database of molecular structures

# | Database | Data type | Website
1 | PDB (Protein Data Bank) and its variants PDBe, wwPDB, PDBsum etc. | Structures of macromolecules (protein, DNA, RNA, long polypeptides etc.) | www.rcsb.org
2 | CSD (Cambridge Structural Database) | Structures of organic and metal-organic compounds | ccdc.cam.ac.uk
3 | ICSD (Inorganic Crystal Structure Database) | Structures of inorganic compounds | https://icsd.products.fiz-karlsruhe.de/en
4 | NIST ICSD reference database | Structures of inorganics, ceramics, minerals, pure elements, metals, and intermetallic systems | https://www.nist.gov/srd/nist-standard-reference-database-3

Protein structure essentially refers to protein conformation

Since bond lengths and bond angles cannot vary much about their equilibrium values (Engh & Huber, 1991; Allen et al., 1987), protein structures are essentially characterized by conformational parameters only. The energy required to change a bond length, bond angle or torsion angle can be readily evaluated using a molecular mechanics approach (Ramachandran & Sasisekharan, 1968).

Bond length variation

The energy required to deform a bond length from its equilibrium (normal) value is modeled by a harmonic potential:

$$V_l = \frac{1}{2} K_l (\Delta l)^2$$

where K_l is the stretching force constant (in kcal/mol/Å²) and V_l is the energy required to deform the bond length by Δl (in Å). K_l values are of the order of 500-1200 kcal/mol/Å², so changing a bond length by just 0.1 Å requires roughly 2.5 to 6 kcal/mol. Therefore, bond strains greater than 0.05 Å are not expected to occur in nature.
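
The arithmetic behind this estimate can be checked with a few lines of Python. This is only a sketch of the harmonic formula above; the function name `bond_stretch_energy` is an illustrative choice, and the two force constants are simply the ends of the range quoted in the text.

```python
def bond_stretch_energy(k_l, delta_l):
    """Harmonic bond-stretch energy in kcal/mol.

    k_l     : force constant in kcal/mol/A^2
    delta_l : deviation from the equilibrium bond length in A
    """
    return 0.5 * k_l * delta_l ** 2


# Ends of the force-constant range quoted in the text.
for k_l in (500.0, 1200.0):
    energy = bond_stretch_energy(k_l, 0.1)
    print(f"K_l = {k_l:6.0f} kcal/mol/A^2 -> V_l = {energy:.2f} kcal/mol")
```

For a 0.1 Å deformation this gives about 2.5 kcal/mol at the low end of the range and about 6 kcal/mol at the high end.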

Bond Angle Variation

Similarly, the energy required to deform a bond angle by Δτ (in radians) is given by:

$$V_\tau = \frac{1}{2} K_\tau (\Delta\tau)^2$$

where K_τ is the bending force constant (in kcal/mol/rad²) and V_τ is the bond angle deformation energy. Arguing in a similar manner, a bond angle variation of 5° gives rise to an increase in energy of about 0.3 kcal/mol. Hence, deformations beyond 10° are rarely expected to occur unless compensated by the local conformation.
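
As a rough check of the 0.3 kcal/mol figure, the sketch below evaluates the harmonic bending term. The text does not state a force constant, so the value of 80 kcal/mol/rad² used here is purely an assumption, chosen because it reproduces the quoted energy; the function name is likewise illustrative.

```python
import math


def angle_bend_energy(k_tau, delta_tau_deg):
    """Harmonic angle-bend energy in kcal/mol.

    k_tau         : bending force constant in kcal/mol/rad^2
    delta_tau_deg : deviation from the equilibrium bond angle in degrees
    """
    delta_tau = math.radians(delta_tau_deg)  # the formula expects radians
    return 0.5 * k_tau * delta_tau ** 2


K_TAU_ASSUMED = 80.0  # hypothetical value, not taken from the text

for deviation in (5.0, 10.0):
    energy = angle_bend_energy(K_TAU_ASSUMED, deviation)
    print(f"{deviation:4.1f} deg -> {energy:.2f} kcal/mol")
```

Because the energy grows with the square of the deviation, doubling the deformation from 5° to 10° quadruples the energy cost.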

Torsion angle variation

Unlike bond length and bond angle deformations, which are modeled as harmonic, rotation about a single bond is periodic, and the energy function can be described as follows (Lewars, 2011):

$$E_{\text{torsion}} = k_0 + \sum_{r=1}^{n} k_r \left[1 + \cos(r\theta)\right]$$

Such barriers are comparatively low, amounting to only a fraction of a kcal/mol, so these rotations are expected to occur freely in nature. Given this analysis, it is unlikely that any two protein structures would differ dramatically in bond lengths and bond angles; only the torsion angles need to be measured. Hence, for all practical purposes, protein structure is described by protein conformation alone.
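
The periodic form of the torsional term can be illustrated with a short sketch. The coefficients below (a single threefold term of 0.1 kcal/mol) are hypothetical values picked only to show the periodicity; they are not parameters from the text or from any particular force field.

```python
import math


def torsion_energy(theta_deg, k0, k):
    """Evaluate E = k0 + sum_{r=1..n} k_r * [1 + cos(r * theta)].

    theta_deg : torsion angle in degrees
    k0        : constant offset in kcal/mol
    k         : sequence [k_1, ..., k_n] of coefficients in kcal/mol
    """
    theta = math.radians(theta_deg)
    return k0 + sum(
        k_r * (1.0 + math.cos(r * theta)) for r, k_r in enumerate(k, start=1)
    )


# Hypothetical coefficients: only the threefold (r = 3) term is non-zero.
coefficients = [0.0, 0.0, 0.1]

for angle in range(0, 181, 60):
    print(f"theta = {angle:3d} deg -> E = {torsion_energy(angle, 0.0, coefficients):.3f} kcal/mol")
```

With these coefficients the energy repeats every 120°, with maxima at 0° and ±120° and minima at ±60° and 180°.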

Torsion Angle Nomenclature

Main Chain Torsion Angles

Every residue has three main chain torsion angles and 0-4 side chain torsion angles. The main chain torsion angles, which involve the atoms N, CA, and C(=O), are:

  • ϕ (phi): Describes the rotation about the N-CA single bond.
  • ψ (psi): Describes the rotation about the CA-C single bond.
  • ω (omega): Describes the torsion angle about the C-N peptide bond, even though this bond is not freely rotatable because of its partial double-bond character. This angle describes the planarity of the peptide bond and falls into one of two ranges:
    • -10° to 10°, i.e. close to 0° (cis conformation of the peptide)
    • 170° to 180° or -180° to -170°, i.e. within about 10° of 180° (trans conformation)

About 5% of X-Pro peptide bonds occur in the cis conformation, compared with about 0.05% of all other peptide bonds (Stewart et al., 1990).
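
The ω ranges given above translate directly into a simple classifier. The sketch below is illustrative only: the function name, the "distorted" label for angles outside both ranges, and the default 10° tolerance (taken from the ranges in the text) are assumptions, not part of any standard.

```python
def classify_omega(omega_deg, tolerance=10.0):
    """Label a peptide bond as 'cis', 'trans' or 'distorted' from omega.

    omega_deg : omega torsion angle in degrees (any range)
    tolerance : allowed deviation from 0 deg (cis) or 180 deg (trans)
    """
    # Wrap the angle into the interval [-180, 180).
    omega = (omega_deg + 180.0) % 360.0 - 180.0
    if abs(omega) <= tolerance:
        return "cis"
    if abs(omega) >= 180.0 - tolerance:
        return "trans"
    return "distorted"


for value in (5.0, -175.0, 180.0, 90.0):
    print(value, classify_omega(value))
```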

Side Chain Torsion Angles

Side chain torsion angles are described by χ (chi) and are determined based on the branching of the side chain atoms (e.g., CB, CG, etc.). For example, in a tripeptide Ala(i-1) – Leu(i) – Met(i+1), the torsion angle definitions are provided as follows:

  • Torsion angles of the ith residue, Leu(i):
    • Main chain torsion angles:
      • ϕ: C(i-1)-N(i)-CA(i)-C(i)
      • ψ: N(i)-CA(i)-C(i)-N(i+1)
      • ω: CA(i)-C(i)-N(i+1)-CA(i+1)
    • Side chain torsion angles:
      • χ1: N(i)-CA(i)-CB(i)-CG(i)
      • χ21: CA(i)-CB(i)-CG(i)-CD1(i)
      • χ22: CA(i)-CB(i)-CG(i)-CD2(i)
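
The definitions listed above can be written down as a lookup table of atom quadruples, which is a convenient starting point for computing the angles from coordinates. This is only a sketch: the function name `leu_torsion_atoms` and the `(atom_name, residue_index)` tuple layout are illustrative choices, and the table covers the Leu example only.

```python
def leu_torsion_atoms(i):
    """Return the four atoms defining each torsion angle of Leu at position i.

    Each atom is given as (atom_name, residue_index), following the
    definitions listed for the Ala(i-1)-Leu(i)-Met(i+1) example.
    """
    return {
        # Main chain torsion angles
        "phi":   [("C", i - 1), ("N", i), ("CA", i), ("C", i)],
        "psi":   [("N", i), ("CA", i), ("C", i), ("N", i + 1)],
        "omega": [("CA", i), ("C", i), ("N", i + 1), ("CA", i + 1)],
        # Side chain torsion angles
        "chi1":  [("N", i), ("CA", i), ("CB", i), ("CG", i)],
        "chi21": [("CA", i), ("CB", i), ("CG", i), ("CD1", i)],
        "chi22": [("CA", i), ("CB", i), ("CG", i), ("CD2", i)],
    }


for name, atoms in leu_torsion_atoms(2).items():
    print(name, "-".join(f"{atom}({res})" for atom, res in atoms))
```

Note that ϕ borrows one atom from the preceding residue, while ψ and ω borrow atoms from the following residue.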

Apart from ϕ, ψ, and ω, another main chain torsion angle, associated with the C=O bond, is denoted ν. It is not required as an independent parameter because it is related to ω. Any other torsion angle is denoted in general by θ, measured in the positive sense of rotation (right-handed rotation).

Nomenclature of Biopolymers

To define standards for biomolecular usage, IUBMB has recommended several nomenclature schemes covering the names, atom labels, conformations and terms used for amino acids, nucleic acids and other biomolecules. The relevant details are described here:

Residue & atom names, abbreviations & their numbering in polymeric chains

# | Data type | Website
1 | Nomenclature and Symbolism for Amino Acids and Peptides | http://www.chem.qmul.ac.uk/iupac/AminoAcid/index.html
2 | Nucleic Acids, Polynucleotides and their Constituents | http://www.chem.qmul.ac.uk/iupac/misc/naabb.html
3 | Nomenclature of Carbohydrates | http://www.chem.qmul.ac.uk/iupac/2carb/

Definitions of geometric and conformational parameters and their numberings

# | Data type | Website
1 | Conformation of Polypeptide Chains | http://www.chem.qmul.ac.uk/iupac/misc/ppep1.html
2 | Conformations of Polynucleotide Chains | http://www.chem.qmul.ac.uk/iupac/misc/pnuc1.html
3 | Conformation of Polysaccharide Chains | http://www.chem.qmul.ac.uk/iupac/misc/psac.html

Definition of Torsion angle nomenclature: Torsion angles in molecular structures can be described using two primary nomenclature systems:

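  • Syn & Peri Nomenclature: This system uses the Klyne-Prelog definition, in which a torsion angle is described as synperiplanar (about 0°), synclinal (about ±60°), anticlinal (about ±120°) or antiperiplanar (about 180°). In the accompanying figure these descriptions appear inside the innermost circle.
  • g+, g-, and t Nomenclature: Gauche+ (about +60°), gauche- (about -60°) and trans (about 180°); commonly used for describing side chain rotamers (IUPAC Nomenclature of Organic Chemistry, 1976).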
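
Both naming schemes amount to binning a torsion angle, which the following sketch makes explicit. The bin boundaries (30° steps for Klyne-Prelog, 120°-wide wells for rotamers) and the assignment of g+ to positive angles are assumptions made for illustration; conventions for the sign labels vary in the literature, and the function names are hypothetical.

```python
def wrap_angle(angle_deg):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle_deg + 180.0) % 360.0 - 180.0


def klyne_prelog(theta_deg):
    """Return a Klyne-Prelog style descriptor for a torsion angle."""
    theta = wrap_angle(theta_deg)
    magnitude = abs(theta)
    if magnitude <= 30.0:
        return "synperiplanar"
    if magnitude >= 150.0:
        return "antiperiplanar"
    sign = "+" if theta > 0 else "-"
    return sign + ("synclinal" if magnitude < 90.0 else "anticlinal")


def rotamer_label(chi_deg):
    """Return 'g+', 'g-' or 't' using 120-degree-wide wells (assumed bins)."""
    chi = wrap_angle(chi_deg)
    if 0.0 <= chi < 120.0:
        return "g+"
    if -120.0 <= chi < 0.0:
        return "g-"
    return "t"


for angle in (10.0, 60.0, -120.0, 180.0):
    print(angle, klyne_prelog(angle), rotamer_label(angle))
```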
[Figure: 2D plot of torsion angle nomenclature]

Dihedral vs. Torsion Angles

Often these two terms are used synonymously, but they describe different properties. A dihedral angle is an angle between two planes (di-hedra meaning two planes), such as the plane defined by atoms A-B-C and the plane defined by atoms B-C-D in a four-atom system A-B-C-D.

The angle between these two planes varies in the range of 0-360 degrees, and initially, dihedral angles were proposed to vary within this range. Following the IUBMB recommendations for biochemical nomenclature, these angles, now called torsion angles, are defined in the range of -180 to 180 degrees, allowing the relationship between enantiomeric configurations or conformations to be readily appreciated.

Such angles can be obtained by calculating the normals to the planes A-B-C and B-C-D and taking the angle between them. Prior to 1969, protein torsion angles were described in the range of 0–360 degrees. Thus, torsion angle (post-1969) = torsion angle (pre-1969) - 180° (IUPAC-IUB Commission on Biochemical Nomenclature, 1971).
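
A minimal sketch of the plane-normal calculation is given below, assuming plain (x, y, z) tuples as input. It uses the common atan2 formulation so that the result is signed and falls in the -180° to 180° range; the helper names are illustrative, and the conversion function simply encodes the pre-1969 to post-1969 relation stated above.

```python
import math


def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def torsion_angle(a, b, c, d):
    """Signed torsion angle A-B-C-D in degrees, in the range -180 to 180."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1 = _cross(b1, b2)  # normal to the plane A-B-C
    n2 = _cross(b2, b3)  # normal to the plane B-C-D
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


def pre1969_to_current(angle_deg):
    """Convert a pre-1969 angle (0 to 360) to the current convention."""
    return angle_deg - 180.0


# Four points with the central bond B-C along the z axis.
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))  # eclipsed
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))  # quarter turn
print(pre1969_to_current(240.0))
```

Taking only the arccosine of the angle between the two normals would lose the sign of the rotation, which is why the atan2 form is used here.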

References

  1. Ramachandran, G. N., & Sasisekharan, V. (1968). Conformation of polypeptides and proteins. Advances in protein chemistry, 23, 283–438.
  2. IUPAC-IUB Commission on Biochemical Nomenclature. Abbreviations and symbols for the description of the conformation of polypeptide chains. Tentative rules (1969). (1971). The Biochemical journal, 121(4), 577–585.
  3. IUPAC-IUB Joint Commission on Biochemical Nomenclature (JCBN). Nomenclature and symbolism for amino acids and peptides. Recommendations 1983. (1984). European journal of biochemistry, 138(1), 9–37.
  4. Engh, R. A., & Huber, R. (1991). Accurate bond and angle parameters for X-ray protein structure refinement. Acta Crystallographica Section A: Foundations of Crystallography, 47(4), 392-400.
  5. Lewars, E. G. (2011). An Outline of What Computational Chemistry Is All About. Computational Chemistry: Introduction to the Theory and Applications of Molecular and Quantum Mechanics, 1-7.
  6. Allen, F. H., Kennard, O., Watson, D. G., Brammer, L., Orpen, A. G., & Taylor, R. (1987). Tables of bond lengths determined by X-ray and neutron diffraction. Part 1. Bond lengths in organic compounds. Journal of the Chemical Society, Perkin Transactions 2, (12), S1-S19.
  7. Stewart, D. E., Sarkar, A., & Wampler, J. E. (1990). Occurrence and role of cis peptide bonds in protein structures. Journal of molecular biology, 214(1), 253-260.
  8. Glusker, J. P., & Trueblood, K. N. Crystal structure analysis: a primer. IUCr Texts on Crystallography, Vol. 14.
  9. IUPAC Commission on Nomenclature of Organic Chemistry. (1976). Pure and Applied Chemistry, 45, 11-30.