Protein Background

Proteins are biomacromolecules that are complex in structure and function. They are an essential component of a balanced diet, a miracle molecule for food structure and texture, and the workhorses of much of our digestion and metabolism…among many other functions! We will study protein structure in terms of the bonds and different types of intermolecular forces that all contribute to a final, functional protein structure.

The Building Blocks of Proteins

The shape of a protein is essential to its function. For an enzyme, the protein must recognize and bind to a substrate molecule in a way that is favorable to the desired chemical reaction. For a membrane protein, the protein must expose a hydrophobic core that will align with the hydrophobic membrane it resides in. For a structural protein, the dimensions and rigidity of the structure must be suited to its function.

Despite this complexity, proteins are made up of combinations of (usually) only 20 building blocks composed of (usually) only 5 different elements. The building blocks are called amino acids. In the amino acid shown at the right, the amino group is shown in blue and the acid group is shown in red. This structure is shown in the physiologically relevant form, assuming a near-neutral pH aqueous environment. The amine group (–NH₂) is basic and will accept a proton from water, so it is shown here in its protonated, conjugate acid form (–NH₃⁺). The carboxylic acid group (–COOH) is acidic and will have lost its proton to the water, so it is shown here as its conjugate base, the carboxylate (–COO⁻).
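The protonation states described above can be checked with the Henderson-Hasselbalch relationship. The following is a minimal sketch, not part of the original text: the pKa values (about 2.3 for the carboxylic acid and about 9.6 for the amine) are assumed, approximate values for a free amino acid, and the function name is illustrative.

```python
def fraction_protonated(pH: float, pKa: float) -> float:
    """Fraction of a group present in its protonated (acid) form.

    Rearranged Henderson-Hasselbalch: [HA] / ([HA] + [A-]) = 1 / (1 + 10^(pH - pKa)).
    """
    return 1.0 / (1.0 + 10 ** (pH - pKa))


# Assumed, approximate pKa values for a free amino acid (not taken from the text).
PKA_CARBOXYL = 2.3
PKA_AMINO = 9.6

pH = 7.0  # near-neutral aqueous environment

# The amine is mostly protonated (-NH3+) because the pH is well below its pKa.
amine_protonated = fraction_protonated(pH, PKA_AMINO)

# The carboxylic acid is mostly deprotonated (-COO-) because the pH is well above its pKa.
carboxyl_deprotonated = 1.0 - fraction_protonated(pH, PKA_CARBOXYL)

print(f"-NH3+ fraction at pH {pH}: {amine_protonated:.4f}")
print(f"-COO- fraction at pH {pH}: {carboxyl_deprotonated:.6f}")
```

Under these assumed pKa values, more than 99% of the amine groups carry the extra proton and essentially all of the carboxylic acid groups have lost theirs, which is why the structure is drawn with both charges.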

A protein is assembled by forming covalent bonds between amine nitrogens and carbonyl carbons to form amide bonds, aka peptide bonds. This occurs with the loss of water (an –OH from the carboxyl and an –H from the amine) and is known as dehydration synthesis. Proteins are formed from long, linear polymers of covalently bonded amino acids arranged in a particular sequence as dictated by the genetic code for that protein.
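The bookkeeping of dehydration synthesis can be sketched in a few lines: a chain of n amino acids contains n − 1 peptide bonds, and each bond releases one water molecule. The example below is illustrative only. The molar masses are approximate values for three free amino acids, and the function names and the tripeptide are hypothetical choices, not from the text.

```python
WATER_MASS = 18.015  # g/mol, mass lost per peptide bond formed

# Approximate molar masses (g/mol) of three free amino acids, used as sample data.
FREE_AMINO_ACID_MASS = {"Gly": 75.07, "Ala": 89.09, "Ser": 105.09}


def peptide_bonds(n_residues: int) -> int:
    """A linear chain of n amino acids is joined by n - 1 peptide bonds."""
    return max(n_residues - 1, 0)


def peptide_mass(residues: list[str]) -> float:
    """Mass of a linear peptide: sum of the free amino acids minus one water per bond."""
    total = sum(FREE_AMINO_ACID_MASS[r] for r in residues)
    return total - peptide_bonds(len(residues)) * WATER_MASS


tripeptide = ["Gly", "Ala", "Ser"]
print(peptide_bonds(len(tripeptide)))        # number of waters released
print(round(peptide_mass(tripeptide), 2))    # approximate mass in g/mol
```

The same count applies in reverse: fully breaking this tripeptide back into free amino acids would consume two water molecules.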

The Diversity of Amino Acids

Amino acids are often considered in two portions: the backbone and the sidechains.

The backbone consists of the amide bond (or, at the ends of the chain, the free amine and carboxyl groups that would form it) as well as the carbon in between each amide, referred to as the α-carbon. The image below highlights the backbone of a sequence of amino acids.

The sidechains are the “R” groups from the amino acid structure on the previous page and they branch off of the α-carbon. In addition, every α-carbon is attached to one hydrogen, where the R group and the hydrogen always have the same arrangement with respect to the rest of the amino acid.

For example, an amino acid is shown below. From this view (with the nitrogen pointing down to the left and the carbonyl pointing down to the right), the hydrogen points up and into the page and the R-group (here –CH₂OH) is coming up and out of the page.

There are twenty standard amino acids that are encoded in our DNA. Amino acid sidechains add chemical diversity that is essential for both protein structure and function. Sidechains vary in their interactions with water (hydrophilic or hydrophobic), their ability to form interactions with one another (hydrogen bonds, ionic interactions, disulfide bonds, etc.), and in their physical shape (large, small; linear, branched, ring shaped; etc.).

Levels of Protein Structure

Protein structure consists of several levels of organization, each of which assembles through specific bonding and/or intermolecular interactions. A list of the amino acids in the chain, connected one after the other as described above, is called the primary structure. This level of protein structure is constructed from the covalent bonds formed between amino acids along the length of the backbone.

Secondary structure is the folding of this strand into well-characterized alpha helices (α-helices; orange) or beta pleated sheets (β-sheets; green) through the formation of hydrogen bonds between the carbonyl oxygen of one amino acid and the amide proton of another. This means that secondary structure is largely formed through interactions of the protein backbone. This has raised some interesting questions about what causes certain amino acid sequences to form these structures when the amino acid sidechains seem to have little to do with the structures themselves.

The following images of secondary structure highlight the backbone (outlined in black) with carbons in gray, nitrogens in blue, and oxygens in red. The essential hydrogen-bonding interactions between amide protons (attached to the blue nitrogen) and carbonyl oxygens (red) are shown as black dotted lines. The sidechains are semitransparent gray.

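In α-helices (above), each hydrogen bond connects the carbonyl oxygen of one amino acid to the amide proton of the amino acid four positions further along the chain. With about 3.6 amino acids per turn, this pattern produces the characteristic tightly wound coil.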
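The hydrogen-bonding pattern of an α-helix can be enumerated with a short Python sketch. It assumes the standard textbook pattern, in which the carbonyl oxygen of residue i pairs with the amide proton of residue i + 4, spanning roughly one turn of the helix. The function name and the 1-based residue numbering are illustrative choices, not from the text.

```python
def helix_hbond_pairs(n_residues: int, offset: int = 4) -> list[tuple[int, int]]:
    """List (carbonyl residue, amide residue) pairs for an idealized alpha helix.

    Residues are numbered from 1. The default offset of 4 is the assumed
    i -> i + 4 backbone hydrogen-bonding pattern.
    """
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]


# A hypothetical 10-residue helix.
pairs = helix_hbond_pairs(10)
for carbonyl, amide in pairs:
    print(f"C=O of residue {carbonyl}  ...  H-N of residue {amide}")
```

Note that the sidechains never appear in this calculation: the pairing depends only on position along the backbone.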

In β-sheets (below), the hydrogen bonds occur between residues on separate strands that run alongside each other (anti-parallel, as here, or parallel) and result in a two-dimensional, pleated sheet-like structure.

Tertiary structure is the folding together of secondary structural elements and other unstructured protein strands, creating a globular protein with a particular shape. The image at right shows the assembled orange α-helices, green β-sheets, and grey unstructured strands in the SH2 protein. Quaternary structure is subsequently formed for some proteins, where multiple tertiary structures from separate protein molecules assemble together into a higher-order structure. Collagen, catalase, and DNA polymerase are examples.

The hydrophobic effect suggests that it is energetically favorable to bury hydrophobic amino acid sidechains inside a protein in order to increase the entropy of the surrounding hydrophilic solvent molecules (e.g. water). This aids in the formation of the tertiary and quaternary globular protein structures. In the image at the left, the hydrophobic amino acids (alanine, glycine, valine, isoleucine, leucine, phenylalanine, and methionine) are shown as sticks to highlight their arrangement toward the interfaces between the secondary structural elements.
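To make the idea concrete, the hydrophobic residues in a sequence can be located with a short Python sketch. It uses the same seven amino acids listed above as the hydrophobic set, written as one-letter codes (A, G, V, I, L, F, M). The example sequence and the function names are hypothetical and chosen only for illustration.

```python
# One-letter codes for alanine, glycine, valine, isoleucine, leucine,
# phenylalanine, and methionine (the hydrophobic set named in the text).
HYDROPHOBIC = set("AGVILFM")


def hydrophobic_positions(sequence: str) -> list[int]:
    """1-based positions of hydrophobic residues in a one-letter sequence."""
    return [i for i, res in enumerate(sequence.upper(), start=1) if res in HYDROPHOBIC]


def hydrophobic_fraction(sequence: str) -> float:
    """Fraction of residues in the sequence that belong to the hydrophobic set."""
    if not sequence:
        return 0.0
    return len(hydrophobic_positions(sequence)) / len(sequence)


example = "MKVLAAGIFE"  # hypothetical sequence, not a real protein
print(hydrophobic_positions(example))
print(hydrophobic_fraction(example))
```

A count like this says nothing about where the residues end up in three dimensions. It only identifies which sidechains the hydrophobic effect would tend to bury.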

Protein folding is a complex process, and errors can occur that cause proteins to misfold. Diseases associated with misfolded proteins are collectively termed protein folding diseases. These include Alzheimer’s disease (believed to be caused, at least in part, by the accumulation of insoluble misfolded proteins, referred to as plaques) and some cancers (for example, those involving misfolding of p53, which normally functions as a tumor suppressor).

Proteins can also be unfolded, or denatured, through a variety of means. Chemical denaturants include detergents, which disrupt the hydrophobic packing of the protein core; acids, which protonate and thereby neutralize acidic side chains and thus break salt bridge (ionic) interactions; alcohols, which disrupt hydrogen bonding (which is how alcohol-based hand sanitizers work); and heavy metal salts or reducing agents, which disrupt disulfide bonds. Proteins can also be denatured at high temperature, where the increased kinetic energy disrupts intermolecular forces. Finally, proteins can be unfolded using physical force, as is studied by pulling on single protein molecules with optical tweezers, which use a focused laser beam to hold and move microscopic beads attached to the protein.

Examples of Protein Structures

As mentioned in the introduction, protein structures serve a variety of purposes. A few examples of different functions and different familiar proteins are included to demonstrate this diversity.

Catalase is a common protein to study in high school biology classes. It is an enzyme that catalyzes the decomposition of hydrogen peroxide into oxygen and water. Catalase is an example of a protein with a quaternary structure; its four protein subunits are shown in green, red, pink, and blue.

Green fluorescent protein (GFP) has become a famous protein because of its ability to glow green when activated with the appropriate wavelength of light. This is due to a chromophore synthesized in the protein from modified amino acids, shown in the image at the right in ball-and-stick representation. Emission of green light upon stimulation with blue or UV light has made this protein a widely used tool in biology. It can be genetically fused to a protein of interest, and that protein can then be tracked in a cell or organism by looking for the green fluorescence produced.

Protein in our Foods

What does it mean to say that some type of food is a “good source of protein?” In part, it means that a high percentage of the mass of the food is protein. It also means that these proteins can provide us with the amino acids we need to survive.

Nutritionally, we break down proteins into amino acids in the acidic environment of the stomach, with the help of enzymes that function in the stomach and small intestine. The breakdown of proteins occurs through hydrolysis, the reverse of dehydration synthesis: a water molecule is split as the amide bond between two amino acids is broken, with the –H going to the nitrogen to regenerate the free amine and the –OH adding to the carbonyl carbon to produce a carboxyl group. At physiological temperatures, this reaction only occurs at an appreciable rate in the presence of a strong acid or with the help of a catalyst.

Proteins are an essential part of our diet because we use the amino acids to build new proteins of use to our bodies. In particular, humans do not synthesize nine of the amino acids, so we must get these amino acids from our diet. Foods from animals (like eggs) contain all of the amino acids we need to get from our diets. Vegetables, legumes, and grains are more likely to lack one or more of the essential amino acids and must be eaten in combination to fulfill our nutritional requirements. Any food or set of foods that provides the necessary amount of all of our essential amino acids is known as a complete protein. The chart at the right demonstrates how a combination of grains and legumes can form a complete protein source without meat or dairy!
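The idea of combining foods into a complete protein can be sketched with Python sets. The nine essential amino acids are listed by name, but the two food profiles are deliberately simplified assumptions (a grain treated as short on lysine, a legume treated as short on methionine) rather than measured nutritional data, and the function names are hypothetical.

```python
# The nine amino acids humans must obtain from the diet.
ESSENTIAL = frozenset({
    "histidine", "isoleucine", "leucine", "lysine", "methionine",
    "phenylalanine", "threonine", "tryptophan", "valine",
})

# Simplified, illustrative profiles (assumptions, not measured data):
# each food maps to the essential amino acids it supplies in adequate amounts.
FOODS = {
    "grain": ESSENTIAL - {"lysine"},
    "legume": ESSENTIAL - {"methionine"},
}


def missing_essentials(foods: list[str]) -> frozenset:
    """Essential amino acids not adequately supplied by the given set of foods."""
    supplied = set().union(*(FOODS[name] for name in foods))
    return ESSENTIAL - supplied


def is_complete_protein(foods: list[str]) -> bool:
    """True when the combined foods cover every essential amino acid."""
    return not missing_essentials(foods)


print(sorted(missing_essentials(["grain"])))
print(is_complete_protein(["grain", "legume"]))
```

In this simplified model each food fills the gap left by the other, which is the logic behind pairing grains with legumes.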

In addition to their nutritional value, proteins also provide structure to foods, which contributes to their textures. People can be very particular about the texture of their egg whites and yolks, and the cooking temperature affects the final protein structure and rigidity. Soft-cooked egg proteins can be creamy and thick, but if cooked too long they can become rubbery and tough.

In a similar way, the protein in wheat (specifically gluten) provides bread with its chewy, spongy texture. This occurs through a complex process of hydration with water, kneading to stretch out the gluten strands (breaking existing disulfide bonds), resting to allow the gluten strands to crosslink with one another (through formation of new disulfide bonds), and heating to allow the expanding air to swell the space between the gluten strands and set the gluten structure.

Gelatin is a colloid formed from collagen proteins trapping water within their protein network. Meat also derives its chewy texture from proteins: the protein fibers align on a macroscopic scale based on the way that muscle contracts, and these fibers create noticeable texture in the meat. Cooking meat for long periods of time, or at a high temperature and pressure in a pressure cooker, allows the connective collagen proteins in muscles to start to release their connections to one another, yielding the tender meat of stews, carnitas, or shredded beef. It is interesting to notice that the protein we desire to make our gelatin set is the same one we break down when cooking our meat.

Nucleic Acid Background