3.3 Nucleic Acid Structure and Function
Melissa Hardy and Christelle Sabatier
Learning Objectives
By the end of this section, you will be able to do the following:
- Describe nucleic acids’ structure
- Define the two types of nucleic acids (DNA and RNA)
DNA and RNA
Nucleic acids are the most important macromolecules for the continuity of life. They carry the cell’s genetic blueprint and the instructions for its functioning.
The two main types of nucleic acids are deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). DNA is the genetic material in all living organisms, ranging from single-celled bacteria to multicellular mammals. It is in the nucleus of eukaryotes, and in the chloroplasts and mitochondria, two membrane-bound organelles. In prokaryotes, the DNA is not enclosed in a membrane-bound organelle.
The other type of nucleic acid, RNA, has many roles in the cell. One type of RNA, called messenger RNA (mRNA), carries information from DNA to ribosomes. Other types of RNA, such as ribosomal RNA (rRNA), transfer RNA (tRNA), and microRNA (miRNA), are involved in protein synthesis and its regulation.
Nucleotides
DNA and RNA are composed of monomers called nucleotides. The nucleotides combine with each other to form a polynucleotide, DNA or RNA. Each nucleotide has three components: a nitrogenous base, a pentose (five-carbon) sugar, and one or more phosphate groups. Each nitrogenous base in a nucleotide is attached to a sugar molecule, which is attached to one or more phosphate groups.

The nitrogenous bases are organic molecules that contain nitrogen. They are bases because they contain an amino group that can bind an extra hydrogen ion, thereby decreasing the hydrogen ion concentration in the surrounding environment and making it more basic. Each nucleotide in DNA contains one of four possible nitrogenous bases: adenine (A), guanine (G), cytosine (C), and thymine (T).
Scientists classify adenine and guanine as purines. The purine’s primary structure is two carbon-nitrogen rings. Scientists classify cytosine, thymine, and uracil as pyrimidines, which have a single carbon-nitrogen ring as their primary structure. Each of these basic carbon-nitrogen rings has different functional groups attached to it. In molecular biology shorthand, we use the symbols A, T, G, C, and U for the nitrogenous bases. DNA contains A, T, G, and C, whereas RNA contains A, U, G, and C.

The pentose sugar in DNA is deoxyribose, and in RNA, the sugar is ribose. The difference between the sugars is the presence of the hydroxyl group on the ribose’s second carbon and hydrogen on the deoxyribose’s second carbon. The carbon atoms of the sugar molecule are numbered as 1′, 2′, 3′, 4′, and 5′ (1′ is read as “one prime”).
Polynucleotides
A phosphate residue connects the 5′ carbon of one sugar to the 3′ carbon of the sugar of the next nucleotide, which forms a 5′–3′ phosphodiester linkage. A polynucleotide may have thousands or even millions of phosphodiester linkages.
In a polynucleotide, one end of the chain has a free 5′ phosphate, and the other end has a free 3′-OH. These are called the 5′ and 3′ ends of the chain.

The Double Helix
DNA has a double-helix structure. The sugar and phosphate lie on the outside of the helix, forming the DNA’s backbone. The nitrogenous bases are stacked in the interior in pairs, like the steps of a staircase. Hydrogen bonds hold the two bases of each pair together. Every base pair in the double helix is separated from the next base pair by 0.34 nm. There are about 10 base pairs per turn of the double helix. The helix’s two strands are antiparallel, meaning that they run in opposite directions.
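The two numbers given above (0.34 nm between base pairs, about 10 base pairs per turn) are enough to estimate the length and number of turns of a stretch of double helix. The following is a minimal Python sketch of that arithmetic. The function name and the 1,000-base-pair example are illustrative assumptions, and real DNA deviates somewhat from these idealized values.

```python
RISE_PER_BP_NM = 0.34   # distance between adjacent base pairs (from the text)
BP_PER_TURN = 10        # approximate base pairs per helical turn (from the text)

def helix_dimensions(n_base_pairs):
    """Return (length in nm, number of turns) for an idealized double helix."""
    length_nm = n_base_pairs * RISE_PER_BP_NM
    turns = n_base_pairs / BP_PER_TURN
    return length_nm, turns

# Hypothetical 1,000-base-pair fragment, chosen only for illustration.
length_nm, turns = helix_dimensions(1000)
print(f"{length_nm:.0f} nm long, about {turns:.0f} turns")
```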

Base pairing is specific. A can pair with T, and G can pair with C. This is the complementary base-pairing rule. In other words, the DNA strands are complementary to each other. If the sequence of one strand is 5′-AATTGGCC-3′, the complementary strand would have the sequence 3′-TTAACCGG-5′.
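The pairing rule can be expressed as a simple lookup. The sketch below, in Python, builds the complementary strand for the example sequence; the function and variable names are illustrative assumptions. Because the strands are antiparallel, the same strand can also be written in the conventional 5′-to-3′ direction by reversing it.

```python
# Pairing rule for DNA from the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand):
    """Pair each base position by position.

    If the input is written 5'->3', the result reads 3'->5'.
    """
    return "".join(PAIRS[base] for base in strand)

def reverse_complement(strand):
    """Return the complementary strand rewritten in the 5'->3' direction."""
    return complement(strand)[::-1]

top = "AATTGGCC"  # 5'-AATTGGCC-3', the example from the text
print("3'-" + complement(top) + "-5'")          # 3'-TTAACCGG-5'
print("5'-" + reverse_complement(top) + "-3'")  # 5'-GGCCAATT-3'
```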
Media Attributions
- Nucleotides © Melissa Hardy is licensed under a Public Domain license
- DNA chemical structure © Thomas Shafee is licensed under a CC BY (Attribution) license
- DNA animation © Brian0918 is licensed under a Public Domain license
Learning Objectives
By the end of this section, you will be able to do the following:
- Describe the role of cells in organisms
- Compare and contrast prokaryotic and eukaryotic cells
- Describe the relative sizes of different cells
- Explain why cells must be small
A cell is the smallest unit of a living thing. Organisms consist of either one cell (like bacteria) or many cells (like a human). Thus, cells are the basic building blocks of all organisms.
Several cells of one kind that interconnect with each other and perform a shared function form tissues. These tissues combine to form an organ (your stomach, heart, or brain), and several organs comprise an organ system (such as the digestive system, circulatory system, or nervous system). Several systems that function together form an organism (like a human being). Here, we will examine the structure and function of cells.
There are many types of cells, which scientists group into one of two broad categories: prokaryotic and eukaryotic. For example, we classify both animal and plant cells as eukaryotic cells, whereas we classify bacterial cells as prokaryotic.
All cells share four common components: 1) a plasma membrane, an outer covering that separates the cell’s interior from its surrounding environment; 2) cytoplasm, consisting of a jelly-like cytosol within the cell in which there are other cellular components; 3) DNA, the cell's genetic material; and 4) ribosomes, which synthesize proteins.
Prokaryotic cells
A prokaryote is a simple, mostly single-celled (unicellular) organism that lacks a nucleus, or any other membrane-bound organelle. We will shortly come to see that this is significantly different in eukaryotes. Prokaryotic DNA is in the cell's central part: the nucleoid (Figure 4.5).
Figure 4.5 This figure shows the generalized structure of a prokaryotic cell. All prokaryotes have chromosomal DNA localized in a nucleoid, ribosomes, a cell membrane, and a cell wall. The other structures shown are present in some, but not all, bacteria.
Most prokaryotes have a peptidoglycan cell wall and many have a polysaccharide capsule (Figure 4.5). The cell wall acts as an extra layer of protection, helps the cell maintain its shape, and prevents dehydration. The capsule enables the cell to attach to surfaces in its environment. Some prokaryotes have flagella, pili, or fimbriae. Flagella are used for locomotion. Pili exchange genetic material during conjugation, the process by which one bacterium transfers genetic material to another through direct contact. Bacteria use fimbriae to attach to a host cell.
Cell size
At 0.1 to 5.0 μm in diameter, prokaryotic cells are significantly smaller than eukaryotic cells, which have diameters ranging from 10 to 100 μm (Figure 4.6). The prokaryotes' small size allows ions and organic molecules that enter them to quickly diffuse to other parts of the cell. Similarly, any wastes produced within a prokaryotic cell can quickly diffuse. This is not the case in eukaryotic cells, which have developed different structural adaptations to enhance intracellular transport.
Figure 4.6 This figure shows relative sizes of microbes on a logarithmic scale (recall that each unit of increase in a logarithmic scale represents a 10-fold increase in the quantity measured).
Small size, in general, is necessary for all cells, whether prokaryotic or eukaryotic. Let’s examine why that is so. First, we’ll consider the area and volume of a typical cell. Not all cells are spherical in shape, but most tend to approximate a sphere. You may remember from your high school geometry course that the formula for the surface area of a sphere is 4πr², while the formula for its volume is 4πr³/3. Thus, as the radius of a cell increases, its surface area increases as the square of its radius, but its volume increases as the cube of its radius (much more rapidly). Therefore, as a cell increases in size, its surface area-to-volume ratio decreases. This same principle would apply if the cell had a cube shape (Figure 4.7). If the cell grows too large, the plasma membrane will not have sufficient surface area to support the rate of diffusion required for the increased volume. In other words, as a cell grows, it becomes less efficient. One way to become more efficient is to divide. Other ways are to increase surface area by foldings of the cell membrane, become flat or thin and elongated, or develop organelles that perform specific tasks. These adaptations lead to developing more sophisticated cells, which we call eukaryotic cells.
Figure 4.7 Notice that as a cell increases in size, its surface area-to-volume ratio decreases. When there is insufficient surface area to support a cell’s increasing volume, a cell will either divide or die. The cell on the left has a volume of 1 mm³ and a surface area of 6 mm², with a surface area-to-volume ratio of 6 to 1, whereas the cell on the right has a volume of 8 mm³ and a surface area of 24 mm², with a surface area-to-volume ratio of 3 to 1.
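The scaling argument can be checked numerically. The following Python sketch computes the surface area-to-volume ratio for a sphere and for a cube, reproducing the 6 to 1 and 3 to 1 ratios of the two cubes in Figure 4.7. The function names and the sample radii are illustrative assumptions.

```python
import math

def sphere_sa_to_v(radius):
    """Surface area-to-volume ratio of a sphere (simplifies to 3 / radius)."""
    area = 4 * math.pi * radius ** 2
    volume = 4 * math.pi * radius ** 3 / 3
    return area / volume

def cube_sa_to_v(side):
    """Surface area-to-volume ratio of a cube (simplifies to 6 / side)."""
    area = 6 * side ** 2
    volume = side ** 3
    return area / volume

# The two cubes from Figure 4.7: sides of 1 mm and 2 mm.
for side in (1, 2):
    print(f"cube with side {side} mm: ratio {cube_sa_to_v(side):.0f} to 1")

# Hypothetical spherical cells with radii in micrometers.
for radius in (0.5, 5, 50):
    print(f"sphere with radius {radius} um: ratio {sphere_sa_to_v(radius):.2f} per um")
```

In both shapes the ratio is inversely proportional to the linear size, so doubling the size halves the ratio.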
Prokaryotic cells are much smaller than eukaryotic cells. What advantages might small cell size confer on a cell? What advantages might large cell size have?
Eukaryotic cells
Unlike prokaryotic cells, eukaryotic cells have: 1) a membrane-bound nucleus; 2) numerous membrane-bound organelles such as the endoplasmic reticulum, Golgi apparatus, chloroplasts, mitochondria, and others; and 3) several rod-shaped chromosomes. Because a membrane surrounds a eukaryotic cell’s nucleus, it has a “true nucleus.” The word “organelle” means “little organ,” and, as we already mentioned, organelles have specialized cellular functions, just as your body's organs have specialized functions.
At this point, it should be clear to you that eukaryotic cells have a more complex structure than prokaryotic cells. Organelles allow different functions to be compartmentalized in different areas of the cell. Not every eukaryotic cell has the same complement of organelles; the set depends on the cell's function. Even within the same organism, different cells rely on different organelles. For example, in plants, chloroplasts are found primarily in the above-ground parts that conduct photosynthesis, while other plastids, which store pigments and starch, are found in below-ground structures. You will learn many more details about the individual organelles found in eukaryotes as you proceed through the introductory biology courses. For now, let’s first examine two important components of the cell: the plasma membrane and the cytoplasm.
Figure 4.8 These figures show the major organelles and other cell components of (a) a typical animal cell and (b) a typical eukaryotic plant cell. The plant cell has a cell wall, chloroplasts, plastids, and a central vacuole—structures not in animal cells. Most cells do not have lysosomes or centrosomes.
The plasma membrane
Like prokaryotes, eukaryotic cells have a plasma membrane (Figure 4.9), a phospholipid bilayer with embedded proteins that separates the internal contents of the cell from its surrounding environment. A phospholipid is a lipid molecule with two fatty acid chains and a phosphate-containing group. The plasma membrane controls the passage of organic molecules, ions, water, and oxygen into and out of the cell. Wastes (such as carbon dioxide and ammonia) also leave the cell by passing through the plasma membrane.
Figure 4.9 The eukaryotic plasma membrane is a phospholipid bilayer with proteins and cholesterol embedded in it.
The plasma membranes of cells that specialize in absorption fold into fingerlike projections that we call microvilli (singular = microvillus) (Figure 4.10). Such cells typically line the small intestine, the organ that absorbs nutrients from digested food. This is an excellent example of form following function. People with celiac disease have an immune response to gluten, which is a protein in wheat, barley, and rye. The immune response damages microvilli, and thus, afflicted individuals cannot absorb nutrients. This leads to malnutrition, cramping, and diarrhea. Patients suffering from celiac disease must follow a gluten-free diet.
Figure 4.10 Microvilli, as they appear on cells lining the small intestine, increase the surface area available for absorption. These microvilli are only on the area of the plasma membrane that faces the cavity from which substances will be absorbed. (credit "micrograph": modification of work by Louisa Howard)
In addition, many of the organelles in eukaryotic cells are surrounded by a phospholipid bilayer. This allows very different environments to exist inside each organelle compared to the cytoplasm. Each environment is maximally suited to the specific function of that organelle. For example, lysosomes in animal cells and vacuoles in plant cells are the sites where many macromolecules are recycled into their constituent monomers. Proteins are digested into individual amino acids that can then be recycled as new proteins are synthesized in the cytoplasm. The enzymes that are responsible for catalyzing these reactions function at low pH, so the interior of lysosomes and vacuoles is often much more acidic than the cytoplasm. This allows these processes to take place efficiently without interfering with other functions of the cell that might be disrupted by an acidic environment.
The cytoplasm
The cytoplasm is the cell's entire region between the plasma membrane and the nuclear envelope (a structure we will discuss shortly). It is comprised of organelles suspended in the gel-like cytosol, the cytoskeleton, and various chemicals (Figure 4.8). Even though the cytoplasm consists of 70 to 80 percent water, it has a semi-solid consistency, which comes from the proteins within it. However, proteins are not the only organic molecules in the cytoplasm. Glucose and other simple sugars, polysaccharides, amino acids, nucleic acids, fatty acids, and derivatives of glycerol are also there. Ions of sodium, potassium, calcium, and many other elements also dissolve in the cytoplasm. Many metabolic reactions, including protein synthesis, take place in the cytoplasm.
8.3 Using Light Energy to Make Organic Molecules
Learning Objectives
By the end of this section, you will be able to do the following:
- Describe the Calvin cycle
- Define carbon fixation
- Explain how photosynthesis works in the energy cycle of all living organisms
After the energy from the sun is converted into chemical energy and temporarily stored in ATP and NADPH molecules, the cell has the fuel needed to build carbohydrate molecules for long-term energy storage. The products of the light-dependent reactions, ATP and NADPH, have lifespans in the range of millionths of seconds, whereas the products of the light-independent reactions (carbohydrates and other forms of reduced carbon) can survive almost indefinitely. The carbohydrate molecules made will have a backbone of carbon atoms. But where does the carbon come from? It comes from carbon dioxide—the gas that is a waste product of respiration in microbes, fungi, plants, and animals.
The Calvin Cycle
In plants, carbon dioxide (CO2) enters the leaves through stomata, where it diffuses over short distances through intercellular spaces until it reaches the mesophyll cells. Once in the mesophyll cells, CO2 diffuses into the stroma of the chloroplast, the site of the light-independent reactions of photosynthesis. These reactions have several names. One, the Calvin cycle, honors the man who discovered the reactions and reflects the fact that they function as a cycle. Others call it the Calvin-Benson cycle to include the name of another scientist involved in its discovery. The most outdated name is “dark reaction,” because light is not directly required (Figure 8.18). However, the term dark reaction can be misleading because it implies incorrectly that the reaction only occurs at night or is independent of light, which is why most scientists and instructors no longer use it.
Figure 8.18 Light reactions harness energy from the sun to produce chemical bonds, ATP, and NADPH. These energy-carrying molecules are made in the stroma where carbon fixation takes place. Credit: Rao, A., Ryan, K., Tag, A., Fletcher, S. and Hawkins, A. Department of Biology, Texas A&M University.
The light-independent reactions of the Calvin cycle can be organized into three basic stages: fixation, reduction, and regeneration.
Stage 1: Fixation
In the stroma, in addition to CO2, two other components are present to initiate the light-independent reactions: an enzyme called ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO), and three molecules of ribulose bisphosphate (RuBP), as shown in Figure 8.19. RuBP has five atoms of carbon, flanked by two phosphates.
Visual Connection
Figure 8.19 The Calvin cycle has three stages. In stage 1, the enzyme RuBisCO incorporates carbon dioxide into an organic molecule, 3-PGA. In stage 2, the organic molecule is reduced using electrons supplied by NADPH. In stage 3, RuBP, the molecule that starts the cycle, is regenerated so that the cycle can continue. Only one carbon dioxide molecule is incorporated at a time, so the cycle must be completed three times to produce a single three-carbon G3P molecule, and six times to produce a six-carbon glucose molecule. Credit: Rao, A., Ryan, K., Tag, A., Fletcher, S. and Hawkins, A. Department of Biology, Texas A&M University.
Which of the following statements is true?
- In photosynthesis, oxygen, carbon dioxide, ATP, and NADPH are reactants. G3P and water are products.
- In photosynthesis, chlorophyll, water, and carbon dioxide are reactants. G3P and oxygen are products.
- In photosynthesis, water, carbon dioxide, ATP, and NADPH are reactants. RuBP and oxygen are products.
- In photosynthesis, water and carbon dioxide are reactants. G3P and oxygen are products.
RuBisCO catalyzes a reaction between CO2 and RuBP. For each CO2 molecule that reacts with one RuBP, two molecules of another compound, 3-phosphoglyceric acid (3-PGA), form. 3-PGA has three carbons and one phosphate. Each turn of the cycle involves only one RuBP and one carbon dioxide and forms two molecules of 3-PGA. The number of carbon atoms remains the same, as the atoms move to form new bonds during the reactions: over three turns, 3 C atoms from 3 CO2 + 15 C atoms from 3 RuBP = 18 C atoms in 6 molecules of 3-PGA. This process is called carbon fixation, because CO2 is “fixed” from an inorganic form into organic molecules.
Stage 2: Reduction
ATP and NADPH are used to convert the six molecules of 3-PGA into six molecules of a chemical called glyceraldehyde 3-phosphate (G3P). This is a reduction reaction because it involves the gain of electrons by 3-PGA. (Recall that a reduction is the gain of an electron by an atom or molecule.) Six molecules each of ATP and NADPH are used. For ATP, energy is released with the loss of the terminal phosphate group, converting it into ADP; for NADPH, both energy and a hydrogen atom are lost, converting it into NADP+. Both of these molecules return to the nearby light-dependent reactions to be reused and re-energized.
Stage 3: Regeneration
Interestingly, at this point, only one of the G3P molecules leaves the Calvin cycle and is sent to the cytoplasm to contribute to the formation of other compounds needed by the plant. Because the G3P exported from the chloroplast has three carbon atoms, it takes three “turns” of the Calvin cycle to fix enough net carbon to export one G3P. But each turn makes two G3Ps, thus three turns make six G3Ps. One is exported while the remaining five G3P molecules remain in the cycle and are used to regenerate RuBP, which enables the system to prepare for more CO2 to be fixed. Three more molecules of ATP are used in these regeneration reactions.
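The bookkeeping across the three stages can be tallied in a few lines. The Python sketch below uses only the per-turn quantities stated above (one CO2 and one RuBP per turn, two 3-PGA per turn, one ATP and one NADPH per 3-PGA reduced, one ATP per RuBP regenerated). The function name and dictionary keys are illustrative assumptions, and the sketch assumes a multiple of three turns so that whole G3P molecules are exported.

```python
def calvin_cycle_tally(turns):
    """Tally molecules for a number of Calvin cycle turns (one CO2 fixed per turn)."""
    if turns <= 0 or turns % 3 != 0:
        raise ValueError("use a positive multiple of 3 turns")
    pga = 2 * turns                # two 3-PGA form per CO2 + RuBP
    g3p_made = pga                 # each 3-PGA is reduced to one G3P
    g3p_exported = turns // 3      # three fixed carbons leave as one G3P
    atp_reduction = pga            # one ATP per 3-PGA reduced
    atp_regeneration = turns       # one ATP per RuBP regenerated
    return {
        "co2_fixed": turns,
        # 3 carbons per 3-PGA; equals 1*turns (from CO2) + 5*turns (from RuBP)
        "carbon_atoms_in_3pga": 3 * pga,
        "g3p_made": g3p_made,
        "g3p_exported": g3p_exported,
        "g3p_recycled": g3p_made - g3p_exported,
        "atp_used": atp_reduction + atp_regeneration,
        "nadph_used": pga,
    }

for turns in (3, 6):
    print(turns, calvin_cycle_tally(turns))
```

With three turns, the tally gives six G3P made, one exported, five recycled, nine ATP, and six NADPH, matching the counts in the text.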
Structural Chemist
Figure X. A photo of Dr. Nogales with one of the commonly used machines in her lab at the University of California, Berkeley
Dr. Eva Nogales is a Howard Hughes Medical Institute investigator, Senior Faculty Scientist at the Lawrence Berkeley National Laboratory, and Professor of Biochemistry, Molecular Biology, and Structural Biology at the University of California, Berkeley. Dr. Nogales received her PhD in Biophysics at Keele University in the United Kingdom under Dr. Joan Bordas.
Nogales’s current work at the University of California is dedicated to gaining mechanistic insight into eukaryotic biology - the central dogma of replication machinery and cytoskeleton interactions and dynamics in cellular division. Currently, Dr. Nogales and her students are investigating the complex interactions between microtubules, epigenetic regulation, and transcriptional mechanisms. The lab specializes in cryo-electron microscopy and biochemical and biophysical assays in order to explore these subjects.
Within her lab’s investigations into microtubules, Dr. Nogales is exploring their associated proteins, the kinetochore interface, and the structural basis of their instability. Microtubules play a pivotal role in several cell processes - cell division, intracellular transport, and structural integrity of cells. Gaining insight into how microtubules function and change can unveil the intricacies of the fundamental biological mechanisms and development of therapeutic treatments for diseases such as cancer (where microtubules are often the subject of drug targeting).
Care to learn more? Visit the professor's website here to read more about her current explorations, or perhaps be a part of them.
Learning Objectives
By the end of this chapter, you will be able to do the following:
- Predict the functional effects of mutations in β-galactosidase
Proteins are one of the most abundant biological macromolecules in living systems and have the most diverse range of functions of all macromolecules. Proteins may be structural, regulatory, contractile, or protective. They may serve in transport, storage, or membranes; or they may be toxins or enzymes. Each cell in a living system may contain thousands of proteins, each with a unique function. Their structures, like their functions, vary greatly, and by investigating their structures, we can make predictions about their functions.
1. Protein structure
A protein's shape is critical to its function. For example, an enzyme can bind to a specific substrate at an active site. If this active site is altered because of local changes or changes in overall protein structure, the enzyme may be unable to bind to the substrate. To understand how the protein gets its final shape or conformation, we need to understand the four levels of protein structure: primary, secondary, tertiary, and quaternary.
Primary Structure
The amino acid sequence in a polypeptide chain is its primary structure. For example, the primary sequence of the hemoglobin α chain may be found on UniProt, entry P69905. The N-terminal amino acid is methionine (Met, M), and the C-terminal amino acid is arginine (Arg, R) (Figure 1). The amino acid sequence of the α chain is the same every time it is expressed, and it is the only protein that has exactly this sequence of amino acids.
MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR
Figure 1: Primary structure of human hemoglobin α chain. The α chain of human hemoglobin has 142 amino acids, all linked together in sequence with peptide bonds.
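Because a primary structure is just an ordered string of one-letter codes, its basic properties can be read off directly. The Python sketch below takes the sequence exactly as printed in Figure 1 and reports its length, its two termini, and its most frequent residues. The variable names are illustrative assumptions.

```python
from collections import Counter

# Human hemoglobin alpha chain, copied from Figure 1 (written N terminus to C terminus).
hba_sequence = "MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR"

n_terminal = hba_sequence[0]    # first residue written is the N terminus
c_terminal = hba_sequence[-1]   # last residue written is the C terminus
composition = Counter(hba_sequence)

print("length:", len(hba_sequence))
print("N-terminal residue:", n_terminal)
print("C-terminal residue:", c_terminal)
print("three most frequent residues:", composition.most_common(3))
```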
The gene encoding the protein ultimately determines the unique sequence of amino acids for every protein. A change in the nucleotide sequence of the gene’s coding region may lead to adding a different amino acid to the growing polypeptide chain, causing a change in protein structure and, sometimes, therefore in function. In sickle cell anemia, the hemoglobin β chain (a small portion of which is shown in Figure 2) has a single amino acid substitution, causing a change in the protein's structure and function. Specifically, glutamate in the β chain is replaced by the amino acid valine. What is most remarkable to consider is that a hemoglobin molecule is composed of two alpha and two beta chains that each consist of about 150 amino acids. The molecule, therefore, has about 600 amino acids. The structural difference between a normal hemoglobin molecule and a sickle cell molecule, which dramatically decreases life expectancy, is two amino acids of the 600 (one in each β chain).
Figure 2: Structure and function of hemoglobin. Because of one change in the primary amino acid sequence of the beta chain of hemoglobin, hemoglobin molecules form long fibers that distort the biconcave, or disc-shaped, red blood cells and cause them to assume a crescent or “sickle” shape, which clogs blood vessels. In normal hemoglobin, the amino acid at position six is glutamate, but in sickle cell hemoglobin, it is valine. (Credit: Rao, A., Tag, A. Ryan, K. and Fletcher, S. Department of Biology, Texas A&M University) [Image Description]
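The substitution at position six (glutamate in normal hemoglobin, valine in sickle cell hemoglobin) can be illustrated as an edit to a string. The Python sketch below is only an illustration: the eight-residue fragment `VHLTPEEK` is an assumption taken from the commonly cited start of the mature β chain and does not appear in the text above, and the helper name `substitute` is hypothetical.

```python
def substitute(sequence, position, new_residue):
    """Return a copy of sequence with the residue at a 1-based position replaced."""
    index = position - 1
    return sequence[:index] + new_residue + sequence[index + 1:]

# Assumed first eight residues of the mature beta chain (not given in the text).
normal_fragment = "VHLTPEEK"
sickle_fragment = substitute(normal_fragment, 6, "V")  # glutamate (E) -> valine (V)

# Count the positions at which the two fragments differ.
differences = sum(a != b for a, b in zip(normal_fragment, sickle_fragment))

print("normal:", normal_fragment)
print("sickle:", sickle_fragment)
print("positions changed:", differences)
```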
Secondary Structure
The local folding of the polypeptide in some regions gives rise to the secondary structure of the protein. The most common are the α-helix and β-pleated sheet structures (Figure 3). Both structures are held in shape by backbone hydrogen bonds, which form between the oxygen atom of the carbonyl group in one amino acid and the hydrogen atom of the amide (N-H) group in another. In an α-helix, the second amino acid is four positions farther along the chain; in a β-pleated sheet, the bonded amino acids lie on neighboring stretches of the chain.
Figure 3: The α-helix and β-pleated sheet are secondary structures formed in proteins. These structures occur when hydrogen bonds form between the carbonyl oxygen and the amino hydrogen and nitrogen in the peptide backbone of two amino acids in a protein. Black = carbon, White = hydrogen, Blue = nitrogen, and Red = oxygen. Credit: Rao, A., Ryan, K. Fletcher, S. and Tag, A. Department of Biology, Texas A&M University. [Image Description]
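In an α-helix, each backbone hydrogen bond links amino acids that are four positions apart in the sequence. The Python sketch below lists those pairs for a helix of a given length; the function name and the 10-residue example are illustrative assumptions, and the sketch treats the whole stretch as a single ideal helix.

```python
def helix_hbond_pairs(n_residues, offset=4):
    """List (i, i + offset) pairs, numbered from 1.

    The carbonyl oxygen of residue i bonds to the amide hydrogen of
    residue i + offset.
    """
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]

# Hypothetical 10-residue helix.
pairs = helix_hbond_pairs(10)
for donor_acceptor in pairs:
    print(donor_acceptor)
```

The last four residues have no partner four positions ahead, so a helix of n residues has n − 4 such bonds in this idealized picture.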
Tertiary Structure
The polypeptide's unique three-dimensional structure is its tertiary structure (Figure 4). This structure is primarily due to chemical interactions between the side chains of amino acids in the polypeptide chain. The chemical nature of the side chains involved determines which amino acids are energetically favorable to be next to other amino acids. For example, side chains with like charges repel each other and those with unlike charges are attracted to each other (ionic bonds). The sulfur atoms in cysteine side chains can form disulfide linkages in the presence of oxygen, the only covalent bond that forms during protein folding. When protein folding takes place, the nonpolar amino acids' hydrophobic side chains repel water in the protein's environment and pack into the protein's interior, whereas the hydrophilic side chains tend to position themselves on the surface of the protein, interacting with water. In general, whenever a protein is translated, it always folds into the same tertiary structure, as determined by the primary structure of its amino acids.
Figure 4: A variety of chemical interactions determine a protein's three-dimensional (tertiary) structure. These include hydrophobic interactions, ionic bonding, hydrogen bonding, and disulfide linkages. [Image Description]
Quaternary Structure
In nature, some proteins form from several polypeptides, or subunits, and the interaction of these subunits forms the quaternary structure of the protein. Weak interactions between the subunits help to stabilize the overall structure. For example, the α and β chains of human hemoglobin, a globular protein, fold into their tertiary structures, and then two copies of the α chain come to interact with two copies of the β chain to form a tetramer of four chains (Figure 5). Silk, a fibrous protein, however, has a β-pleated sheet structure that is the result of hydrogen bonding between many different chains.
Figure 5: Primary, secondary, tertiary, and quaternary structure of hemoglobin. The primary structure of hemoglobin is its amino acid sequence. Its secondary structure is entirely α helices. Its tertiary structure is globular. Four protein chains come together to form the quaternary structure that is the functional hemoglobin protein. Credit: Rao, A. Ryan, K. and Tag, A. Department of Biology, Texas A&M University. [Image Description]
2. Amino acids
Amino acids are the monomers that comprise the polymeric molecules, proteins. Each amino acid has the same fundamental structure, which consists of a central carbon atom, or the alpha carbon (Cα), bonded to an amino group (NH2), a carboxyl group (COOH), and a hydrogen atom. These atoms are considered the backbone of the amino acid. Every amino acid also has another atom or group of atoms bonded to the central Cα atom known as the R group or side chain (Figure 6).
Figure 6: Structure of an amino acid. Amino acids have a central asymmetric carbon (Cα) to which an amino group, a carboxyl group, a hydrogen atom, and a side chain (R group) are covalently bonded. [Image Description]
Scientists use the name "amino acid" because these acids contain both an amino group and a carboxylic acid group in their basic structure. There are 20 common amino acids present in proteins. For each amino acid, the side chain (or R group) is different (Figure 7). The chemical nature of the side chain determines the amino acid's nature (that is, whether it is acidic, basic, polar, or nonpolar). Each amino acid has both a single-letter and a three-letter abbreviation. For example, valine is abbreviated with the letter V or the three-letter symbol, Val.
Figure 7: The 20 common amino acids. The chemical structure for each amino acid is given, grouped by chemical property. The single- and three-letter abbreviations are also provided. Adapted from "Molecular structures of the 21 proteinogenic amino acids.svg" by Dan Cojocari licensed under CC-BY-SA. [Image Description]
The sequence and the number of amino acids ultimately determine the protein's shape, size, and function. Each amino acid is attached to the next by a covalent bond, called a peptide bond, which forms through a dehydration reaction. One amino acid's carboxyl group and the incoming amino acid's amino group combine, releasing a water molecule. The resulting bond is the peptide bond (Figure 8).
Figure 8: Peptide bond formation. The carboxyl group of one amino acid is linked to the incoming amino acid's amino group. In the process, it releases a water molecule. [Image Description]
The products that such linkages form are peptides. As more amino acids join to this growing chain, the resulting chain is a polypeptide. Each polypeptide has a free amino group at one end. This end is called the N terminus, or the amino terminus, and the other end has a free carboxyl group, also called the C or carboxyl terminus. When a polypeptide is built by the ribosome, amino acids are added from the N terminus to the C terminus. When polypeptide sequences are written out, they are written from N to C terminus. While the terms polypeptide and protein are sometimes used interchangeably, a polypeptide is technically a polymer of amino acids, whereas the term protein is used for a polypeptide that is folded into its functional form.
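Because each peptide bond forms by releasing one water molecule, a linear chain of n amino acids contains n - 1 peptide bonds and releases n - 1 water molecules during its synthesis. The following minimal sketch assumes a linear, unbranched polypeptide written in one-letter code from N to C terminus; the function name `describe_polypeptide` is hypothetical.

```python
def describe_polypeptide(sequence: str) -> dict:
    """Summarize a linear polypeptide written from N terminus to C terminus."""
    if not sequence:
        raise ValueError("sequence must contain at least one amino acid")
    peptide_bonds = len(sequence) - 1  # one bond between each pair of neighbors
    return {
        "n_terminal_residue": sequence[0],   # residue with the free amino group
        "c_terminal_residue": sequence[-1],  # residue with the free carboxyl group
        "peptide_bonds": peptide_bonds,
        "waters_released": peptide_bonds,    # one water per dehydration reaction
    }


print(describe_polypeptide("VHLTPEE"))
```

For the seven-residue example, the sketch reports six peptide bonds, with valine at the N terminus and glutamate at the C terminus.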
Each of the 20 most common amino acids has specific chemical characteristics and a unique role in protein structure and function. Based on the propensity of the side chains to be in contact with water (polar environment), amino acids can be classified into three groups:
- Those with polar side chains.
- Those with hydrophobic side chains.
- Those with charged side chains.
Below we look at each of these classes and briefly discuss their role in protein structure and function.
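As a rough illustration of this classification, the sketch below tallies the side chain classes in a short sequence. It assumes the grouping shown in Figure 7, which places glycine and proline in a separate "special" group, and it ignores the borderline cases discussed in this section. The names `GROUPS` and `side_chain_composition` are hypothetical.

```python
from collections import Counter

# Grouping assumed from Figure 7; some residues (for example His, Tyr, Met)
# can reasonably be placed differently depending on context.
GROUPS = {
    "polar": "STHNQY",
    "hydrophobic": "ACVILMFW",
    "charged": "RKDE",
    "special": "GP",
}
CLASS_OF = {aa: group for group, letters in GROUPS.items() for aa in letters}


def side_chain_composition(sequence: str) -> Counter:
    """Count how many residues of each side chain class a sequence contains."""
    return Counter(CLASS_OF[aa] for aa in sequence.upper())


normal = side_chain_composition("VHLTPEE")  # normal hemoglobin, residues 1-7
sickle = side_chain_composition("VHLTPVE")  # sickle-cell hemoglobin, residues 1-7
print(normal["charged"], sickle["charged"])          # 2 1
print(normal["hydrophobic"], sickle["hydrophobic"])  # 2 3
```

The comparison shows that the sickle-cell substitution described for Figure 2 replaces a charged residue with a hydrophobic one.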
Polar amino acids
When considering polarity, some amino acids are straightforward to define as polar, while in other cases, we may encounter disagreements. For example, serine (Ser, S), threonine (Thr, T), and tyrosine (Tyr, Y) are polar since they carry a hydroxyl (-OH) group (Figure 9). Furthermore, this group can form a hydrogen bond with another polar group by donating or accepting a proton (a table showing donors and acceptors in polar and charged amino acid side chains can be found at the FoldIt site). Tyrosine is also involved in metal binding in many enzymatic sites. Asparagine (Asn, N) and glutamine (Gln, Q) also belong to this group and may donate or accept a hydrogen bond.
Histidine (His, H), on the other hand, depending on the environment and pH, can be polar or carry a charge. It has two –NH groups with a pKa value of around 6. At pHs below 6, when both groups are protonated, the side chain has a charge of +1. Within protein molecules, the pKa may be modulated by the environment so that the side chain may give away a proton and become neutral or accept a proton, becoming charged. This ability makes histidine useful in enzyme active sites when the chemical reaction requires proton extraction.
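The pH dependence of the histidine charge can be estimated with the Henderson-Hasselbalch relation, a standard acid-base equation that this section does not name explicitly. The sketch below assumes a pKa of 6.0, taken from the approximate value given above; inside a real protein the pKa may shift with the local environment. The function name `fraction_protonated` is hypothetical.

```python
def fraction_protonated(ph: float, pka: float = 6.0) -> float:
    """Fraction of side chains in the protonated (+1) form at a given pH.

    Uses the Henderson-Hasselbalch relation:
    fraction = 1 / (1 + 10 ** (pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


for ph in (5.0, 6.0, 7.4):
    print(ph, round(fraction_protonated(ph), 3))
# 5.0 0.909
# 6.0 0.5
# 7.4 0.038
```

Under these assumptions, most histidine side chains are charged at pH 5, half are charged at pH 6, and only a few percent are charged at pH 7.4.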
Figure 9: The polar amino acids. Adapted from "Molecular structures of the 21 proteinogenic amino acids.svg" by Dan Cojocari licensed under CC-BY-SA. [Image Description]
Hydrophobic amino acids
The hydrophobic amino acids include alanine (Ala, A), valine (Val, V), leucine (Leu, L), isoleucine (Ile, I), proline (Pro, P), phenylalanine (Phe, F) and cysteine (Cys, C) (Figure 10). These residues typically form the hydrophobic core of proteins, which is isolated from the polar solvent. The side chains within the core are tightly packed and participate in van der Waals interactions, which are essential for stabilizing the structure. In addition, Cys residues are involved in three-dimensional structure stabilization through the formation of disulfide (S-S) bridges, which sometimes connect different secondary structure elements or different subunits in a complex. Another essential function of Cys is metal binding, sometimes in enzyme active sites and sometimes in structure-stabilizing metal centers.
The aromatic amino acids tryptophan (Trp, W) and Tyr and the non-aromatic methionine (Met, M) are sometimes called amphipathic due to their ability to have both polar and nonpolar character. In protein molecules, these residues are often found close to the interface between a protein and solvent. We should also note here that the side chains of histidine and tyrosine, together with the hydrophobic phenylalanine and tryptophan, can also form weak hydrogen bonds of the types OH−π and CH−O, using electron clouds within their ring structures. A characteristic feature of aromatic residues is that they are often found within the core of a protein structure, with their side chains packed against each other. They are also highly conserved within protein families, with Trp having the highest conservation rate.
Figure 10: The hydrophobic amino acids. Adapted from "Molecular structures of the 21 proteinogenic amino acids.svg" by Dan Cojocari licensed under CC-BY-SA. [Image Description]
Charged amino acids
The charged amino acids at neutral pH (around 7.4) carry a single charge in the side chain. There are four of them: the two basic ones are lysine (Lys, K) and arginine (Arg, R), which carry a positive charge at neutral pH, and the two acidic ones are aspartate (Asp, D) and glutamate (Glu, E), which carry a negative charge at neutral pH (Figure 11). A so-called salt bridge is often formed by the interaction of closely located positively and negatively charged side chains. Such bridges are often involved in stabilizing three-dimensional protein structure, especially in proteins from thermophilic organisms, which live at elevated temperatures of 80-90 °C or even higher. The binding of positively charged metal ions is another function of the negatively charged carboxylic groups of Asp and Glu. Metalloproteins and the role of metal centers in protein function are a fascinating field of structural biology research.
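A first estimate of the side chain charge of a polypeptide at neutral pH follows directly from counting these four residues. The sketch below is a simplification: it assumes every Lys and Arg contributes +1 and every Asp and Glu contributes -1, and it ignores histidine, the N and C termini, and any pKa shifts caused by the protein environment. The function name `approximate_side_chain_charge` is hypothetical.

```python
def approximate_side_chain_charge(sequence: str) -> int:
    """Estimate net side chain charge at neutral pH from K, R, D, and E counts."""
    seq = sequence.upper()
    positive = sum(seq.count(aa) for aa in "KR")  # lysine, arginine: +1 each
    negative = sum(seq.count(aa) for aa in "DE")  # aspartate, glutamate: -1 each
    return positive - negative


print(approximate_side_chain_charge("VHLTPEE"))  # -2
print(approximate_side_chain_charge("VHLTPVE"))  # -1
```

The two example sequences are the normal and sickle-cell hemoglobin fragments listed in the Figure 2 description; replacing Glu with Val removes one negative charge.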
Figure 11: The charged amino acids. Adapted from "Molecular structures of the 21 proteinogenic amino acids.svg" by Dan Cojocari licensed under CC-BY-SA. [Image Description]
Glycine & proline
Glycine (Gly), one of the common amino acids, does not have a side chain – its R group is just a hydrogen atom – and is often found at the surface of proteins within loop or coil regions (regions without defined secondary structure), providing high flexibility to the polypeptide chain. This flexibility is required in sharp polypeptide turns in loop structures. Proline (Pro), although considered hydrophobic, is also found at the surface, presumably due to its presence in turn and loop regions. In contrast to Gly, which provides the polypeptide chain high flexibility, Pro provides rigidity by imposing certain torsion angles on the segment of the structure. The reason for this is that its side chain makes a covalent bond with the main chain, which constrains the backbone shape of the polypeptide in this location. Sometimes Pro is called a helix breaker since it is often found at the end of α-helices.
Figure 12: The special amino acids. Adapted from "Molecular structures of the 21 proteinogenic amino acids.svg" by Dan Cojocari licensed under CC-BY-SA. [Image Description]
Figure Descriptions
Figure 2: The image is a comparative illustration of the structural and functional differences between normal hemoglobin and sickle-cell hemoglobin across various levels of protein structure. The layout is divided into two vertical sections labeled "Normal" and "Sickle-Cell," each with subsections depicting the primary, secondary, tertiary, quaternary structures, and function.
- Primary Structure:
- Normal: Seven circular molecules labeled sequentially from 1 to 7 with the respective amino acids: Val, His, Leu, Thr, Pro, Glu, Glu.
- Sickle-Cell: Same seven circular molecules labeled sequentially with the amino acids: Val, His, Leu, Thr, Pro, Val, Glu. The sixth molecule, Glu, is replaced with Val, highlighted in red.
- Secondary and Tertiary Structures:
- Normal: A blue 3D ellipsoid shape representing the normal β subunit.
- Sickle-Cell: A reddish-brown 3D ellipsoid shape representing the sickle-cell β subunit.
- Quaternary Structure:
- Normal: Combination of blue and purple ellipsoid shapes to form normal hemoglobin.
- Sickle-Cell: Combination of reddish-brown and purple ellipsoid shapes to form sickle-cell hemoglobin.
- Function:
- Normal: Depicts individual globular hemoglobin molecules scattered and unassociated, each capable of carrying oxygen.
- Sickle-Cell: Illustrates abnormal aggregation of hemoglobin molecules into fibers, impairing oxygen-carrying capacity.
Figure 3: The image illustrates two types of secondary protein structures against a light blue background: an alpha-helix and a beta-pleated sheet. The illustration is divided horizontally into two sections.
- Top Section: Alpha Helix
- A right-handed helical structure is shown in orange, twisting in a clockwise direction.
- The helix is depicted with a string of colored spheres (atoms) connected by lines (chemical bonds) representing the molecular structure.
- Hydrogen bonds are represented by dashed lines connecting parts of the helix.
- The labels include "α Helix" and "Hydrogen Bond".
- Bottom Section: Beta Pleated Sheet
- Several strands are aligned next to each other, forming a pleated sheet structure in orange.
- Similar to the helix, the strands are composed of colored spheres (atoms) connected by lines (chemical bonds).
- Hydrogen bonds are depicted as dashed lines running perpendicular to the strands, connecting adjacent strands.
- The labels include "β Pleated Sheet," "β Strand," and "Hydrogen Bond".
Figure 4: The image depicts a simplified diagram of a polypeptide backbone, illustrating various interactions and bonds that occur within a protein structure. The backbone is represented by a red, ribbon-like structure that loops and twists, showing the complex folding of the protein.
- Polypeptide Backbone: The main red ribbon represents the polypeptide backbone which loops around the image.
- Ionic Bond: There is a highlighted section showing a segment with a labeled "Ionic Bond," featuring an NH₃⁺ group connected to an O⁻ group.
- Hydrogen Bond: A light blue segment indicates a "Hydrogen bond" between O-H groups.
- Disulfide Linkage: An adjacent part shows a connection labeled "Disulfide linkage" marked by two sulfur atoms connected by a line (represented by "S-S").
- Hydrophobic Interactions: Another section indicates "Hydrophobic interactions," involving CH₃ groups interacting with one another.
Figure 6: The image is a diagram depicting the structure of an amino acid. The diagram is divided into three sections vertically, from left to right, labeled "Amino group," "Side chain," and "Carboxyl group." The amino group section contains a nitrogen atom (N) colored blue at the center, bonded to two hydrogen atoms (H) represented in white and labeled. Moving rightwards, the central section contains a carbon atom (C) depicted in black, bonded to one hydrogen atom (H) in white and to an "R" group representing the side chain. The carbon is also bonded to another carbon atom (C), also in black, positioned to the right in the carboxyl group section. This carbon is double-bonded to an oxygen atom (O) colored in red, and single-bonded to another oxygen (O) with a single hydrogen (H) attached. An arrow points to the central carbon labeled "α carbon." [Return to Figure]
Figure 7: The image is an educational chart titled "20 Common Amino Acids." It is divided into four main sections by backgrounds of different colors: Polar Uncharged (light blue), Hydrophobic (light green), Charged (light pink), and Special Cases (light yellow).
- Polar Uncharged (light blue background):
- Contains six amino acids: Serine (S), Threonine (T), Histidine (H), Asparagine (N), Glutamine (Q), and Tyrosine (Y).
- Each amino acid is depicted with its chemical structure and a red circle indicating its one-letter code inside the circle.
- Hydrophobic (light green background):
- Contains eight amino acids: Alanine (A), Cysteine (C), Valine (V), Isoleucine (I), Leucine (L), Methionine (M), Phenylalanine (F), and Tryptophan (W).
- Each amino acid is depicted with its chemical structure and a red circle indicating its one-letter code inside the circle.
- Charged (light pink background):
- Divided into Positive and Negative sections.
- The Positive section includes Arginine (R) and Lysine (K).
- The Negative section includes Aspartic Acid (D) and Glutamic Acid (E).
- Each amino acid is depicted with its chemical structure and a red circle indicating its one-letter code inside the circle.
- Special Cases (light yellow background):
- Contains two amino acids: Glycine (G) and Proline (P).
- Each amino acid is depicted with its chemical structure and a red circle indicating its one-letter code inside the circle.
Figure 8: The image illustrates peptide bond formation between two amino acids.
- The top left structure represents an amino acid, featuring an amino group (H2N), a central carbon (C) bonded to a hydrogen atom (H), a variable side chain (R), and a carboxyl group (COOH). The hydroxyl group (OH) in the carboxyl group is highlighted in red.
- The top right structure represents another amino acid with a similar structure but differing variable side chains (R).
- The two structures at the top are separated by a space and linked by an arrow pointing to a single structure at the bottom.
- The bottom structure represents the resulting dipeptide with a peptide bond formed. The peptide bond is highlighted within a blue rectangle, showing the linkage between the carbon (C) of one amino acid and the nitrogen (N) of the other amino acid.
- The term "Peptide Bond" is written below the blue rectangle.
Figure 9: The image categorizes polar uncharged amino acids and visually represents their structures. It displays six amino acids: Serine, Threonine, Histidine, Asparagine, Glutamine, and Tyrosine. Each amino acid shows its backbone and distinct side chain. The background is light blue, with the structures depicted in black. Each amino acid name is followed by its three-letter and one-letter code, represented within a red circle. [Return to Figure]
Figure 10: The image is a diagram depicting the molecular structures of eight hydrophobic amino acids. The background is light green, and each amino acid is illustrated with its chemical structure, the three-letter abbreviation, and the single-letter code. The amino acids are aligned horizontally. From left to right, the amino acids are Alanine (Ala, A), Cysteine (Cys, C), Valine (Val, V), Isoleucine (Ile, I), Leucine (Leu, L), Methionine (Met, M), Phenylalanine (Phe, F), and Tryptophan (Trp, W). Each single-letter code is presented in a red circle. [Return to Figure]
Figure 11: The image is a diagram that categorizes amino acids based on their charge properties and atomic structure. The background is a light pink color, and there is a shaded rectangular area in the center where the chemical structures are displayed. The diagram is divided into two main groups labeled “Positive” and “Negative”. Under the “Positive” group, two amino acids are listed: Arginine (Arg) and Lysine (Lys), each represented with their respective chemical structures and a red circle with the letters "R" and "K". Under the “Negative” group, two amino acids are listed: Aspartic Acid (Asp) and Glutamic Acid (Glu), each represented with their respective chemical structures and a red circle with the letters "D" and "E". [Return to Figure]
Figure 12: The image has a yellow background and is titled "Special Cases" at the top in black font. Below the title, there are two sections dedicated to the amino acids Glycine (Gly) and Proline (Pro).
To the left, under the heading "Glycine (Gly)" in black text, there is a red circle with a white uppercase letter "G" inside. Below this, a structural formula of Glycine is depicted within a beige rectangle. The formula shows a carbon atom bonded to an amine group (NH₂), a carboxyl group (COOH), and two hydrogen atoms.
To the right, under the heading "Proline (Pro)" in black text, there is a red circle with a white uppercase letter "P" inside. Below this, a structural formula of Proline is also shown within the same beige rectangle. The Proline structure shows a carbon atom bonded to a carboxyl group (COOH), an amine group in a five-membered ring structure, and single hydrogen atoms.
Licenses and Attributions
"Protein Structure & Function" by Michelle McCully is adapted from "3.4 Proteins" by Mary Ann Clark, Matthew Douglas, Jung Choi for OpenStax Biology 2e under CC-BY 4.0 and "The 20 Amino Acids and Their Role in Protein Structures" by Salam Al-Karadaghi under CC-BY-SA 4.0. "Protein Structure & Function" is licensed under ???.
Structural Chemist
Figure 1. A photo of Dr. Nogales with one of the commonly used machines in her lab at the University of California, Berkeley.
Dr. Eva Nogales is a Howard Hughes Medical Institute investigator, Senior Faculty Scientist at the Lawrence Berkeley National Laboratory, and Professor of Biochemistry, Molecular Biology, and Structural Biology at the University of California, Berkeley. Dr. Nogales received her PhD in Biophysics at Keele University in the United Kingdom under Dr. Joan Bordas.
Dr. Nogales’s current work at the University of California is dedicated to gaining mechanistic insight into eukaryotic biology: the central dogma of replication machinery, and cytoskeleton interactions and dynamics in cellular division. Currently, Dr. Nogales and her students are investigating the complex interactions between microtubules, epigenetic regulation, and transcriptional mechanisms. The lab specializes in cryo-electron microscopy and in biochemical and biophysical assays in order to explore these subjects.
Within her lab’s investigations into microtubules, Dr. Nogales is exploring their associated proteins, the kinetochore interface, and the structural basis of their instability. Microtubules play a pivotal role in several cell processes: cell division, intracellular transport, and the structural integrity of cells. Gaining insight into how microtubules function and change can unveil the intricacies of fundamental biological mechanisms and inform the development of therapeutic treatments for diseases such as cancer (where microtubules are often the target of drugs).
Care to learn more? Visit the professor's website here to read more about her current explorations, or perhaps, be a part of them.