Unit 6: Gene Expression and Regulation

DNA and RNA: Structure, Function, and Information Flow

Life depends on storing information reliably, copying it accurately, and using it to build molecules that do work. In cells, that information is encoded in nucleic acids (DNA and RNA) using sequences of nucleotides as a kind of chemical “text.” Understanding gene expression and regulation starts with knowing what DNA and RNA are made of and why their structures make them good at information storage and transfer.

Nucleotides, backbone bonds, and directionality

A nucleotide is the building block (monomer) of DNA and RNA. Each nucleotide has a phosphate group, a five-carbon sugar, and a nitrogenous base. In DNA, the sugar is deoxyribose (hence deoxyribonucleic acid); in RNA, the sugar is ribose.

Nitrogenous bases come in two structural categories:

Purines (double-ringed): adenine (A) and guanine (G)
Pyrimidines (single-ringed): cytosine (C), thymine (T), and uracil (U)

Nucleotides link into a strand through covalent bonds in the sugar-phosphate backbone. Specifically, the sugars and phosphates are connected by phosphodiester (phosphate) bonds, creating a scaffold that holds the bases.

This backbone gives each strand directionality. One end exposes a phosphate on the 5′ carbon (5′ end) and the other exposes a hydroxyl group on the 3′ carbon (3′ end). The 5′→3′ direction matters because enzymes that copy or read nucleic acids usually work in only one direction. Direction is chemical, not visual; it’s defined by the sugar’s carbon numbering, not by what side of a page the strand is drawn on.

DNA vs. RNA: what differs and why it matters

DNA is typically double-stranded and optimized for long-term information storage. RNA is usually single-stranded and optimized for flexible roles in gene expression—messenger, structural, catalytic, and regulatory.

Key differences:

Sugar: DNA has deoxyribose; RNA has ribose. The extra 2′ hydroxyl on ribose makes RNA more reactive and less chemically stable, which fits its often short-lived roles.
Bases: DNA uses A, T, C, G; RNA uses A, U, C, G (uracil replaces thymine).
Typical form: DNA forms a double helix; RNA often folds into diverse shapes.

Common RNA types you should recognize:

mRNA (messenger RNA): a temporary RNA “recipe” copied from DNA and delivered to ribosomes
rRNA (ribosomal RNA): a major structural and catalytic component of ribosomes
tRNA (transfer RNA): brings amino acids to the ribosome by matching anticodons to codons

Complementary base pairing, hydrogen bonds, and the double helix

Each DNA molecule consists of two strands wrapped into a double helix (a structure proposed in 1953 by Watson and Crick, drawing on X-ray diffraction data from Rosalind Franklin). In double-stranded DNA, bases pair predictably by hydrogen bonding:

A pairs with T via two hydrogen bonds
C pairs with G via three hydrogen bonds

This predictable matching is base pairing; the strands are complementary. The strands run in opposite directions (antiparallel), meaning the 5′ end of one strand is opposite the 3′ end of the other.

Structure is directly tied to function. Complementarity allows each strand to serve as a template for copying, and having two strands improves fidelity because damage on one strand can often be repaired using the other as a template.
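
The pairing rules and antiparallel orientation described above can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the names `DNA_PAIRS`, `complementary_strand`, and `hydrogen_bonds` are made up for this example, and the input is assumed to contain only A, T, C, and G.

```python
# Watson-Crick pairing rules for DNA: A-T and C-G.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complementary_strand(strand_5_to_3: str) -> str:
    """Return the complementary DNA strand, also written 5'->3'."""
    # Pair each base, then reverse because the two strands are antiparallel.
    paired = [DNA_PAIRS[base] for base in strand_5_to_3.upper()]
    return "".join(reversed(paired))

def hydrogen_bonds(strand: str) -> int:
    """Count hydrogen bonds in the duplex (A-T pairs = 2, C-G pairs = 3)."""
    return sum(2 if base in "AT" else 3 for base in strand.upper())

print(complementary_strand("ATGCCG"))  # CGGCAT
print(hydrogen_bonds("ATGCCG"))        # 16
```

Applying `complementary_strand` twice returns the original sequence, which is the same property that lets either strand act as a template for rebuilding the other.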

Genes, genomes, chromosomes, and plasmids

A gene is a segment of nucleic acid whose sequence is used to produce a functional product (often a protein, sometimes a functional RNA). A genome is all genetic material in an organism. Each separate DNA “unit” in a genome is typically organized as a chromosome.

In addition to chromosomes, prokaryotes and eukaryotes can also contain plasmids, which are small, double-stranded, circular DNA molecules. Plasmids are especially important in biotechnology and in gene transfer in bacteria.

DNA packaging and chromatin states

Eukaryotic DNA is wrapped around proteins called histones. DNA plus histones form repeating units called nucleosomes, which help package DNA and also influence gene accessibility.

Two chromatin states are commonly contrasted:

Euchromatin: looser DNA packaging; genes are generally active/available for transcription
Heterochromatin: highly condensed; genes are generally inactive

This connection between packaging and gene activity becomes central when studying epigenetic regulation.

Exam Focus

Typical question patterns
- Interpret a diagram of DNA/RNA structure (label 5′/3′ ends, identify bonds, compare nucleotides).
- Predict base pairing results (complementary DNA strand; RNA complement to a DNA template).
- Explain why DNA is suited for storage and RNA for messaging/regulation.
- Distinguish purines vs. pyrimidines and relate bonding (2 vs. 3 hydrogen bonds) to base pairing.
- Connect euchromatin/heterochromatin to transcriptional activity.
Common mistakes
- Mixing up 5′/3′ directionality or assuming both DNA strands run the same direction.
- Using T in RNA sequences (RNA uses U).
- Describing base pairing as covalent bonds (bases are held by hydrogen bonds; covalent bonds are in the backbone).
- Treating “genome,” “chromosome,” and “gene” as interchangeable terms.

DNA Replication: Mechanism, Fidelity, and Continuity

DNA replication is how cells pass genetic information to daughter cells. The core challenge is copying a huge molecule accurately while working with the constraints of enzyme chemistry.

Semi-conservative replication: the big idea

Replication is semi-conservative: each new DNA molecule contains one original (parental) strand and one newly synthesized strand. This explains how heredity can be both stable and copyable.
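
One way to picture semi-conservative replication is to label each strand and follow the labels through one round of copying. The sketch below assumes a toy representation (a duplex is a pair of `(sequence, label)` strands, with the second strand written as the base-by-base complement of the first); the function names are illustrative only.

```python
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Base-by-base complement, in the same left-to-right order as the template."""
    return "".join(DNA_PAIRS[base] for base in strand)

def replicate(duplex):
    """Separate a duplex and pair each old strand with a newly built partner."""
    (top_seq, top_label), (bottom_seq, bottom_label) = duplex
    daughter_1 = ((top_seq, top_label), (complement(top_seq), "new"))
    daughter_2 = ((complement(bottom_seq), "new"), (bottom_seq, bottom_label))
    return daughter_1, daughter_2

parent = (("ATGC", "parental"), ("TACG", "parental"))
for daughter in replicate(parent):
    print(daughter)
```

Each daughter duplex has the same sequence as the parent but carries exactly one strand labeled "parental" and one labeled "new".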

Origins, replication forks, and bidirectional replication

Replication begins at specific sequences called origins of replication. The double helix opens to form replication bubbles with Y-shaped replication forks at each end. Forks move in opposite directions (bidirectional replication). In eukaryotes, multiple origins allow large genomes to be copied efficiently.

As the helix is unwound, the DNA ahead of the fork experiences twisting strain; replication requires managing both strand separation and torsion.

Key enzymes and why they’re needed

Replication uses a coordinated team:

Helicase unwinds the double helix by breaking hydrogen bonds.
Single-strand binding proteins stabilize separated strands so they don’t re-anneal.
Topoisomerase relieves twisting strain by cutting and rejoining the helix to prevent tangling.
Primase (RNA primase) makes a short RNA primer.
DNA polymerase adds nucleotides to the 3′ end of a growing strand (so synthesis proceeds 5′→3′).
DNA ligase seals the remaining breaks (nicks) between adjacent DNA fragments by forming phosphodiester bonds.

A classic test point is primer logic: DNA polymerase cannot start from scratch; it requires an existing 3′ hydroxyl. Primase provides that starting point.

RNA primers are later removed by enzymes and replaced with DNA, so the final product contains only DNA.

Leading vs. lagging strand and Okazaki fragments

Because DNA strands are antiparallel and polymerase synthesizes only 5′→3′:

The leading strand is synthesized continuously toward the replication fork.
The lagging strand is synthesized discontinuously away from the fork as Okazaki fragments.

The lagging strand is built in pieces not because polymerase is “slow,” but because polymerase cannot build 3′→5′. As the helix opens further, new primers are laid down and additional fragments are made; ligase later joins fragments into a continuous strand.

Finally, hydrogen bonds form between complementary bases in the new double helices, producing two DNA molecules with sequences identical to the original (assuming no uncorrected errors).

Telomeres (ends of linear DNA)

In eukaryotic chromosomes, the ends of the linear DNA molecules are capped by telomeres: repetitive, noncoding sequences. Because linear DNA shortens slightly with each round of replication, telomeres act as a buffer that protects coding regions from being lost, and they are important for chromosome stability.

Proofreading and repair: why replication is usually accurate

Many DNA polymerases proofread newly added nucleotides and can remove mismatches (often via exonuclease activity). Additional repair systems act after replication. Errors that escape repair become mutations, which can contribute to evolution but also to disease.

Example: identifying leading vs. lagging from a diagram

If you are shown a replication fork:

Identify fork direction (where the DNA is being unwound).
Find which new strand is being synthesized toward the fork (continuous): the leading strand.
The other new strand will show multiple primers/fragments: the lagging strand.
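
The same reasoning can be written as a small decision rule. This is a sketch under one stated convention: the input is the template strand's polarity when read in the direction the fork is moving. The function name and the string labels are assumptions made for this example.

```python
def classify_new_strand(template_polarity_toward_fork: str) -> str:
    """Classify the new strand built on a template at a replication fork.

    `template_polarity_toward_fork` is the template's polarity read in the
    direction of fork movement: "3'->5'" or "5'->3'".
    """
    if template_polarity_toward_fork == "3'->5'":
        # The new strand runs 5'->3' toward the fork, so its 3' end follows
        # the fork and synthesis is continuous.
        return "leading (continuous)"
    if template_polarity_toward_fork == "5'->3'":
        # The new strand's 3' end points away from the fork, so synthesis
        # restarts from a new primer each time more template is exposed.
        return "lagging (Okazaki fragments)"
    raise ValueError("polarity must be 3'->5' or 5'->3'")

print(classify_new_strand("3'->5'"))  # leading (continuous)
print(classify_new_strand("5'->3'"))  # lagging (Okazaki fragments)
```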

Exam Focus

Typical question patterns
- Label enzymes at a replication fork and explain their roles.
- Explain why Okazaki fragments form and how they are joined.
- Predict consequences of a malfunction (e.g., defective ligase or primase).
- Explain why synthesis is 5′→3′ and what that implies about template orientation.
Common mistakes
- Saying DNA polymerase synthesizes in the 3′→5′ direction (it reads the template 3′→5′ but synthesizes 5′→3′).
- Confusing helicase with topoisomerase (helicase separates strands; topoisomerase relieves strain).
- Forgetting primers are RNA and are later replaced.

Central Dogma: Connecting Genes to Proteins

Gene expression is often summarized by the central dogma: DNA information is transcribed into RNA, and RNA information is translated into protein. Those proteins then regulate much of what happens in the cell, including additional gene expression.

A simple flow to keep in mind is:

DNA → mRNA via transcription
mRNA → protein via translation

In replication you produce a complete copy of the genome; in transcription you copy only a specific gene (or genes) into RNA.

Exam Focus

Typical question patterns
- Identify which process (replication, transcription, translation) is occurring in a scenario.
- Explain how changes in DNA can affect RNA and protein (and therefore phenotype).
Common mistakes
- Treating “gene expression” as a single step rather than a multi-step pathway.
- Confusing replication with transcription (replication copies all DNA; transcription copies a specific region).

Transcription and RNA Processing: From Gene to Mature RNA

Transcription makes an RNA copy of genetic information. For protein-coding genes, the RNA is typically processed (in eukaryotes) and then used in translation.

The core idea: base pairing transfers information

During transcription, RNA nucleotides base-pair with a DNA template strand. The RNA sequence is complementary to the template and matches the coding strand except that U replaces T.

Useful strand vocabulary (different courses/resources use different terms):

Template strand = antisense strand (the strand RNA polymerase reads)
Coding strand = sense strand (the strand that matches the RNA sequence, with T in place of U)

Initiation, elongation, termination

RNA polymerase binds near a gene’s start and synthesizes RNA 5′→3′ by adding nucleotides to the 3′ end.

Initiation: RNA polymerase binds a promoter (DNA sequence signaling “start transcription”). The first nucleotide transcribed into RNA marks the transcription start site.
- In bacteria, sigma factors help RNA polymerase recognize promoters.
- In eukaryotes, transcription factors help recruit RNA polymerase to promoters.
Elongation: polymerase moves along the template strand, building RNA by complementary base pairing.
Termination: transcription ends at termination signals (mechanisms vary).

Promoter recognition and polymerase recruitment are major regulation points: controlling initiation strongly controls whether a gene is expressed at all.

RNA processing (eukaryotes) and transcript organization

Eukaryotic primary transcripts (pre-mRNA) are processed in the nucleus before export.

Key processing steps:

5′ cap: a modified guanine nucleotide (often described as a GTP cap) is added to the 5′ end, helping protect the RNA and helping ribosome recognition later.
Poly(A) tail: added to the 3′ end, increasing stability and influencing export and translation.
Splicing: introns (noncoding regions) are removed and exons (expressed regions) are joined by the spliceosome (an RNA-protein complex).

Alternative splicing can produce different mRNA isoforms from the same gene by including/excluding certain exons. Introns are not simply “junk”; they can contain regulatory sequences, and splicing patterns are biologically controlled.
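
Alternative splicing can be pictured as choosing which exons to join. The sketch below uses hypothetical exon names and sequences, invented purely for illustration; `splice` is not a real library function.

```python
def splice(exons: dict[str, str], included: list[str]) -> str:
    """Join the chosen exons, in the order given, into a mature mRNA sequence."""
    return "".join(exons[name] for name in included)

# Hypothetical exon sequences, for illustration only.
exons = {"E1": "AUGGCU", "E2": "GAUCCA", "E3": "UUCUAA"}

isoform_full = splice(exons, ["E1", "E2", "E3"])
isoform_skip = splice(exons, ["E1", "E3"])  # exon 2 is skipped

print(isoform_full)  # AUGGCUGAUCCAUUCUAA
print(isoform_skip)  # AUGGCUUUCUAA
```

Both isoforms come from the same gene, yet they differ in length and sequence, so they can encode different polypeptides.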

A useful comparison of transcript organization:

Prokaryotic transcripts can be polycistronic: one transcript encodes information to make several proteins (common when genes are arranged in operons).
Eukaryotic transcripts are usually monocistronic: one gene → one mRNA → one protein (as a general model emphasized at the AP level).

Example: template DNA to mRNA

Suppose the template DNA strand includes:

Template: 3′-T A C G G A T T A C-5′

The mRNA made (5′→3′) is:

mRNA: 5′-A U G C C U A A U G-3′

Notice that each A in the template pairs with U (not T) in the RNA, and that the RNA is written 5′→3′.
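
The conversion in this example can be checked with a short Python sketch. It assumes the template is written 3′→5′ (as above), so the mRNA comes out already in 5′→3′ order; the names `TEMPLATE_TO_RNA`, `transcribe`, and `coding_strand` are illustrative.

```python
# DNA template base -> RNA base added opposite it.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template_3_to_5)

def coding_strand(template_3_to_5: str) -> str:
    """Return the coding strand (5'->3'): the mRNA sequence with T instead of U."""
    return transcribe(template_3_to_5).replace("U", "T")

template = "TACGGATTAC"
print(transcribe(template))     # AUGCCUAAUG
print(coding_strand(template))  # ATGCCTAATG
```

If a problem gives the template written 5′→3′ instead, reverse it first; skipping that step is a common source of wrong-direction answers.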

Exam Focus

Typical question patterns
- Convert between template DNA (antisense), coding DNA (sense), and mRNA sequences.
- Interpret diagrams of transcription initiation (promoters, start site, transcription factors).
- Explain how splicing and alternative splicing change protein products.
- Distinguish polycistronic vs. monocistronic transcripts and connect polycistronic transcription to operons.
Common mistakes
- Treating the coding strand as the template strand.
- Forgetting that RNA uses U, not T.
- Claiming prokaryotic mRNA is processed like eukaryotic mRNA (prokaryotes generally lack spliceosomal introns and typical capping/poly-A processing).

Translation: From mRNA Codons to Polypeptides

Translation builds a polypeptide (protein) using instructions in mRNA. Translation occurs on ribosomes in the cytoplasm and on ribosomes attached to the rough endoplasmic reticulum.

Codons and the genetic code

A codon is a three-nucleotide sequence on mRNA. Each codon corresponds to an amino acid or a stop signal.

Key properties:

Unambiguous: each codon specifies only one amino acid.
Redundant (degenerate): many amino acids have multiple codons.
Nearly universal: shared across most organisms, supporting common ancestry.

Redundancy explains why some base changes are silent.

tRNA: anticodons and accurate amino acid delivery

tRNA is the adaptor that links nucleic acid language to protein language. One end carries a specific amino acid; the other end contains an anticodon that base-pairs with an mRNA codon.

Each tRNA must be correctly “charged” with its amino acid by an aminoacyl-tRNA synthetase. Translation accuracy depends heavily on correct tRNA charging.

The third base of a codon is often less strict, leading to wobble pairing. For example, unusual pairings like guanine–uracil (G–U) can occur at the third position, which helps explain redundancy in the genetic code.

Ribosomes, binding sites, and the stages of translation

Ribosomes contain rRNA and proteins; rRNA contributes to catalyzing peptide bond formation.

Ribosomes have three important binding sites:

A site (aminoacyl-tRNA site)
P site (peptidyl-tRNA site)
E site (exit site)

Translation has three phases:

Initiation: the ribosome attaches to the mRNA and assembles at the start codon, usually AUG, which codes for methionine. The initiator tRNA carries methionine and has the anticodon 3′-UAC-5′, which pairs with AUG.
Elongation: amino acids are added one at a time, each joined to the growing chain by a peptide bond, so a polypeptide forms. The ribosome moves codon-by-codon along the mRNA (5′→3′), and tRNAs cycle through A → P → E.
Termination: translation ends when the ribosome reaches a stop codon (UAA, UAG, or UGA). Stop codons do not code for an amino acid; they recruit release factors that end translation.

Translation is also a major control point: regulating translation can rapidly change protein levels without requiring new transcription.

Example: translating an mRNA segment

Given mRNA: 5′-AUG GGC UUU UGA-3′

AUG = start (Met)
GGC = Gly
UUU = Phe
UGA = stop

Polypeptide: Met–Gly–Phe

If a mutation changes UUU to UUC, the amino acid is still Phe (a silent change due to redundancy).
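
The example above can be reproduced with a compact codon table. This is a minimal sketch: it builds the standard genetic code from a 64-letter string in U, C, A, G order, reports amino acids by one-letter code (M = Met, G = Gly, F = Phe), and assumes the input contains only A, U, C, and G. The function name `translate` is chosen for this example.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    mrna = mrna.replace(" ", "").upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the polypeptide
            break
        peptide.append(amino_acid)
    return "".join(peptide)

print(translate("AUG GGC UUU UGA"))  # MGF
print(translate("AUG GGC UUC UGA"))  # MGF (silent change: UUU -> UUC)
```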

Exam Focus

Typical question patterns
- Use a codon chart to translate an mRNA sequence.
- Predict how a point mutation changes an amino acid sequence (silent, missense, nonsense).
- Explain the roles of ribosome, tRNA, codons/anticodons, and ribosomal A/P/E sites.
- Explain how wobble pairing relates to redundancy.
Common mistakes
- Reading codons on DNA instead of mRNA (codons are on mRNA).
- Forgetting translation proceeds in the 5′→3′ direction on mRNA.
- Treating stop codons as amino acids (they are signals recognized by release factors).
- Treating AUG as the only methionine in a protein (internal AUG codons can appear too).

Regulation of Gene Expression: How Cells Control When Genes Are Used

Gene regulation explains how cells conserve resources, respond to the environment, and (in multicellular organisms) become specialized. Regulation can occur at many points: DNA accessibility, transcription initiation, RNA processing, mRNA stability, translation, and post-translation.

Why regulate?

Regulation matters for two big reasons. First, resource management: producing proteins costs energy and materials. Second, function and identity: different expression patterns create different cell types.

Timing of regulation: pre-transcriptional, post-transcriptional, and post-translational control

A major control point is before transcription begins (often called pre-transcriptional regulation): if transcription never starts, the cell saves the most resources.

Regulation can also occur:

Post-transcriptionally: the cell makes an RNA but then prevents it from being translated or shortens its lifespan. RNA interference (RNAi) is a key mechanism here.
Post-translationally: the cell has already made a protein but activates/inactivates it via modifications (like phosphorylation or cleavage) or controls its lifespan (for example, tagging proteins for destruction via ubiquitin-proteasome pathways).

Prokaryotic regulation: the operon model

In bacteria, genes with related functions are often organized into an operon, a cluster of genes under control of a single promoter and regulatory sequences.

An operon typically includes:

Promoter: where RNA polymerase binds
Operator: a DNA “switch” where a regulatory protein (often a repressor) binds
Structural genes: code for enzymes/proteins needed for a pathway and are transcribed together
Regulatory gene: encodes a regulatory protein such as a repressor (often nearby, not always part of the operon)

Repressible vs. inducible operons

Two classic logic patterns:

Repressible operon: usually on; turned off when end product is abundant (often anabolic pathways). Classic example: trp operon.
Inducible operon: usually off; turned on when a substrate is present (often catabolic pathways). Classic example: lac operon.

Example: reasoning through an inducible system

If a repressor normally binds the operator and blocks transcription:

Without inducer: repressor binds operator → transcription off
With inducer: inducer binds repressor, changes its shape → repressor cannot bind operator → transcription on

A common trap is claiming the inducer binds DNA; in the standard model, the inducer binds the regulatory protein.
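
The on/off logic of both operon types reduces to a pair of Boolean rules. The sketch below is deliberately simplified: it models only the repressor and operator, ignores any additional activator control, and uses made-up function names. In the repressible case, the end product is assumed to act as a corepressor that lets the repressor bind the operator.

```python
def inducible_operon_on(inducer_present: bool, repressor_functional: bool = True) -> bool:
    """lac-style logic: the repressor blocks transcription unless an inducer inactivates it."""
    repressor_on_operator = repressor_functional and not inducer_present
    return not repressor_on_operator

def repressible_operon_on(end_product_abundant: bool, repressor_functional: bool = True) -> bool:
    """trp-style logic: the repressor binds the operator only when the end product is bound to it."""
    repressor_on_operator = repressor_functional and end_product_abundant
    return not repressor_on_operator

print(inducible_operon_on(inducer_present=False))         # False (off)
print(inducible_operon_on(inducer_present=True))          # True (on)
print(repressible_operon_on(end_product_abundant=False))  # True (on)
print(repressible_operon_on(end_product_abundant=True))   # False (off)
```

Setting `repressor_functional=False` models a mutation that disables the repressor: both functions then return `True` regardless of conditions, which matches the expected constitutive expression.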

Eukaryotic regulation: chromatin, transcription factors, and RNA-based control

Eukaryotes regulate gene expression extensively because DNA is packaged in chromatin, transcription requires many factors, RNA is processed/exported, and organisms must maintain stable patterns during development.

Epigenetic regulation (chromatin modification)

DNA wraps around histones in nucleosomes. Tightly packed chromatin is less accessible.

Histone acetylation is often associated with more open chromatin and increased transcription.
DNA methylation is often associated with reduced transcription.

Epigenetic changes alter gene expression without changing the nucleotide sequence (they are not mutations).

Transcriptional control: enhancers and silencers

Transcription factors bind promoters and regulatory DNA to influence RNA polymerase recruitment.
Enhancers increase transcription when activators bind.
Silencers decrease transcription when repressors bind.

Enhancers can be far from the gene; DNA looping brings bound proteins into contact with the promoter.

RNA interference (RNAi)

RNAi molecules (including miRNAs and siRNAs) can bind complementary RNA sequences. This complementary binding can create double-stranded RNA regions and typically results in blocking translation and/or promoting mRNA degradation.

Translational control

Even if mRNA exists, cells can regulate translation initiation to quickly change protein output.

Exam Focus

Typical question patterns
- Predict operon expression states given environmental conditions or mutations in operator/repressor.
- Explain how histone acetylation or DNA methylation affects transcription.
- Interpret scenarios involving transcription factors, enhancers/silencers, RNAi, or control points (pre-transcriptional, post-transcriptional, post-translational).
Common mistakes
- Treating operons as a eukaryotic mechanism (operons are primarily prokaryotic in AP Biology).
- Saying methylation “activates” genes as a universal rule (it is often associated with repression).
- Confusing transcriptional regulation with translational regulation (questions may ask explicitly which step is being regulated).

Cell Specialization and Gene Regulation in Development

Multicellular organisms develop many cell types—skin, liver, neurons—despite (usually) sharing the same genome. The key is differential gene expression: different genes are turned on/off in different cells, producing different proteins and therefore different structures and functions.

Genomic equivalence and differential expression

Most somatic cells contain essentially the same DNA. What differs is which genes are transcribed, which mRNAs are translated, and which proteins are active and how long they persist. Differentiation usually does not require changing the DNA sequence; it requires changing expression patterns.

Developmental pathways, signals, and morphogenesis

During embryonic development, fertilization triggers the zygote to undergo a series of cell divisions. Development involves changing cell shape and organization across stages, a process called morphogenesis.

Cells respond to signals (chemical gradients, cell-cell contact) that activate gene regulatory networks. A signal can activate a transcription factor that turns on sets of genes, including other regulatory genes, creating cascades that push cells down branching fate pathways.

Homeotic genes, Hox genes, and master regulators

Homeotic genes help determine body plan and segment identity, often by coding for transcription factors that regulate many downstream genes. A subset of homeotic genes are Hox genes. Mutations in these master regulators can cause dramatic phenotypic changes because they alter entire developmental programs.

Epigenetic memory in differentiation

Differentiated cells can “remember” which genes should stay on or off through epigenetic marks and chromatin organization. This stabilizes cell identity through many rounds of cell division.

Example: why neurons and muscle cells differ

Neurons express genes for ion channels, neurotransmitter receptors, and synaptic proteins, while muscle cells express genes for actin, myosin, and contraction-control proteins. Both cell types carry both sets of genes, but each cell type activates only the set that matches its function.

Exam Focus

Typical question patterns
- Explain how the same genome can produce different cell types.
- Interpret data showing different mRNA/protein levels across tissues.
- Connect signaling molecules to transcription factor activation and downstream expression changes.
- Describe roles of homeotic/Hox genes in developmental patterning.
Common mistakes
- Claiming differentiation requires mutation or DNA sequence loss (usually false).
- Mixing up “gene present” with “gene expressed.”
- Ignoring transcription factors and regulatory networks (answers that only say “genes turn on” are usually too vague).

Mutations: Sources, Types, and Effects on Gene Products

A mutation is a change in the nucleotide sequence of DNA (often loosely described as an “error” in the genetic sequence). Mutations can be harmful, neutral, or beneficial depending on context, and they are essential for evolution because they create variation.

Where mutations come from

Mutations arise primarily from:

DNA replication errors that escape proofreading and repair
DNA damage from environmental mutagens (chemicals, radiation such as UV)

Damage is not the same as a mutation: damage is a lesion; it becomes a mutation if misrepaired or miscopied.

Point mutations (base substitutions)

A base substitution (point mutation) replaces one nucleotide with another. In protein-coding regions, outcomes include:

Silent mutation: codon changes but amino acid stays the same
Missense mutation: amino acid changes
Nonsense mutation: codon becomes a stop codon, truncating the protein

Silent mutations can still matter by altering splicing signals or mRNA stability. Missense impacts depend on location and chemical differences. Nonsense mutations often have large effects due to early termination.

Insertions, deletions, and frameshifts

Insertions and deletions (indels) add or remove nucleotides. If not in multiples of three, they cause a frameshift mutation, changing how codons are grouped and often altering every downstream amino acid and introducing early stop codons.
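
A quick way to see a frameshift is to regroup the codons after removing one base. The mRNA below is hypothetical, and the helper names `codons` and `causes_frameshift` are chosen for this sketch.

```python
def codons(seq: str) -> list[str]:
    """Group a sequence into complete codons, reading from the first base."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def causes_frameshift(indel_length: int) -> bool:
    """An indel shifts the reading frame unless its length is a multiple of three."""
    return indel_length % 3 != 0

original = "AUGGCUGAUCCAUUC"                    # hypothetical mRNA
one_base_deleted = original[:4] + original[5:]  # delete the base at index 4

print(codons(original))          # ['AUG', 'GCU', 'GAU', 'CCA', 'UUC']
print(codons(one_base_deleted))  # ['AUG', 'GUG', 'AUC', 'CAU']
print(causes_frameshift(1), causes_frameshift(3))  # True False
```

Every codon after the deletion point changes, which is why a single-base indel usually alters far more of the protein than a single-base substitution.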

Mutations outside coding regions

Mutations in promoters, enhancers, splice sites, or miRNA binding sites can strongly change gene expression or RNA processing without changing the protein’s amino acid sequence.

Larger-scale gene rearrangements

Mutations can also involve larger chromosome/gene changes:

Duplications: extra copies of genes, often from unequal crossing-over during meiosis or chromosome rearrangements; may contribute to new traits.
Inversions: a chromosomal region flips orientation.
Translocations: segments from different chromosomes break and rejoin; can cause genes to be lost, repeated, interrupted, or moved to new regulatory contexts.
Transposons: mobile DNA segments that can move (cut-and-paste) or copy themselves to new positions in the genome; insertions can interrupt genes and disrupt gene expression.

Example: predicting effects from a sequence change

If an original mRNA segment is:

5′-AUG UAU GGC-3′ (Met–Tyr–Gly)

If a mutation changes UAU to UAA:

UAA is a stop codon → translation stops early → truncated protein

That is a nonsense mutation and is typically high impact.
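
Classifying a single-codon substitution follows directly from the codon table. The sketch below rebuilds the standard genetic code from a 64-letter string (U, C, A, G order) and assumes the original codon encodes an amino acid rather than a stop; `classify_substitution` is a name chosen for this example.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_substitution(original_codon: str, mutant_codon: str) -> str:
    """Label a codon change as silent, missense, or nonsense.

    Assumes `original_codon` codes for an amino acid (not a stop).
    """
    before = CODON_TABLE[original_codon]
    after = CODON_TABLE[mutant_codon]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"

print(classify_substitution("UAU", "UAA"))  # nonsense (Tyr -> stop)
print(classify_substitution("UUU", "UUC"))  # silent (Phe -> Phe)
print(classify_substitution("UAU", "UGU"))  # missense (Tyr -> Cys)
```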

Exam Focus

Typical question patterns
- Classify mutations (silent/missense/nonsense/frameshift) from sequence changes.
- Predict how a mutation affects protein structure/function or gene expression.
- Explain how mutations provide variation for natural selection.
- Recognize examples of duplications, inversions, translocations, and transposons and infer likely consequences.
Common mistakes
- Assuming all mutations are harmful (many are neutral; effects depend on context).
- Forgetting that mutations in regulatory DNA can matter greatly.
- Translating from the wrong strand or wrong direction when analyzing sequence problems.

Horizontal Gene Transfer and Viruses

Genetic change doesn’t only come from mutation. Especially in microbes, genes can move between organisms, and viruses can introduce new genetic material into cells.

Bacteria: fission and gene exchange

Bacteria are prokaryotes with many shapes and sizes. They divide by binary fission; fission increases cell number but does not inherently increase genetic diversity the way sexual reproduction does.

Bacteria can increase genetic diversity through gene exchange such as conjugation, where cells connect and swap some DNA (often involving plasmids).

Viruses: structure and replication strategies

Viruses are nonliving agents that require a host cell’s machinery to replicate. A virus has two main components:

A protein shell called a capsid
Genetic material made of DNA or RNA

The infected organism/cell is the host.

Bacteriophages (viruses that infect bacteria) can follow:

The lytic cycle, where the virus immediately uses the host machinery to replicate viral genomes and capsid proteins, typically leading to host cell lysis.
The lysogenic cycle, where viral genetic material integrates into the host genome and can remain dormant before later entering lytic replication.

The transfer of DNA between bacterial cells by a virus (a bacteriophage, such as one emerging from the lysogenic cycle) is called transduction.

Some viruses are enveloped viruses, meaning they have a lipid envelope around the capsid.

Retroviruses (e.g., HIV) are RNA viruses that use reverse transcriptase to convert their RNA genome into DNA, which can then be inserted into the host genome.

Exam Focus

Typical question patterns
- Distinguish binary fission from mechanisms that increase bacterial genetic diversity (e.g., conjugation).
- Compare lytic vs. lysogenic cycles and predict outcomes for host cells.
- Explain how transduction can move genes between bacteria.
- Describe how reverse transcriptase changes information flow in retroviruses.
Common mistakes
- Treating viral replication as independent of host machinery.
- Mixing up conjugation (cell-to-cell transfer) with transduction (virus-mediated transfer).
- Assuming all viruses have envelopes (only some do).

Biotechnology: Tools to Analyze and Modify DNA

Biotechnology applies molecular biology techniques to study, compare, and edit genetic material. In AP Biology, the focus is on how methods work and how to interpret the data they produce.

Restriction enzymes, recombinant DNA, and genetic engineering

Restriction enzymes cut DNA at specific recognition sequences (often palindromic), producing sticky ends (overhangs) or blunt ends. DNA ligase can join fragments, enabling recombinant DNA: DNA assembled from different sources into a single molecule not found in nature.
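
A restriction digest can be sketched as cutting a string at every occurrence of a recognition site. The defaults below model EcoRI, which recognizes GAATTC and cuts after the G; the test sequence is hypothetical, only the top strand of a linear molecule is modeled (so overhangs are not shown), and the function names are illustrative.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear top strand wherever `site` occurs.

    `cut_offset` is how many bases into the site the cut falls;
    the defaults model EcoRI (G^AATTC).
    """
    fragments = []
    start = 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[start:])
    return fragments

def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5'->3' on both strands."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return site == "".join(pairs[base] for base in reversed(site))

sequence = "TTGAATTCAAGGCCGAATTCTT"  # hypothetical sequence with two EcoRI sites
print(digest(sequence))              # ['TTG', 'AATTCAAGGCCG', 'AATTCTT']
print(is_palindromic("GAATTC"))      # True
```

Two sites in a linear molecule give three fragments; the fragment lengths (3, 12, and 7 bases here) are what a gel would later separate.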

A common application is inserting a eukaryotic gene of interest into bacteria so bacteria can produce the eukaryotic protein for research or therapy.

The broader practice of creating new organisms or products by transferring genes between cells is genetic engineering.

Plasmids, vectors, transformation, and transfection

A plasmid is a small circular DNA molecule commonly used as a vector. Engineered plasmids often contain:

An origin of replication
A selectable marker (often antibiotic resistance)
A multiple cloning site (many restriction sites)

Transformation is uptake of DNA by a cell (commonly bacteria in lab settings). A related term, transfection, refers to introducing plasmids/DNA into eukaryotic cells.

Gel electrophoresis and DNA movement

Gel electrophoresis separates nucleic acid fragments by size in an electric field.

DNA and RNA are negatively charged (phosphate backbone), so they migrate toward the positive pole.
Smaller fragments travel farther through the gel matrix.

A common mistake is thinking “heavier goes farther”; under the same conditions, smaller goes farther.

Example: interpreting a gel

If Lane A has a band farther down than Lane B, Lane A’s fragment is smaller. You may be asked to compare fragment sizes, infer sample matches, or evaluate whether a restriction digest produced expected fragment lengths.
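
The "smaller goes farther" rule is simply a sort by fragment length. The lane names and sizes below are hypothetical, and the helpers `gel_order` and `same_pattern` are names chosen for this sketch.

```python
def gel_order(fragment_lengths: dict[str, int]) -> list[str]:
    """Order samples from farthest-migrated (smallest) to least-migrated (largest)."""
    return sorted(fragment_lengths, key=fragment_lengths.get)

def same_pattern(bands_a: list[int], bands_b: list[int]) -> bool:
    """Two lanes match if they contain the same set of fragment sizes."""
    return sorted(bands_a) == sorted(bands_b)

lanes = {"Lane A": 400, "Lane B": 1200, "Lane C": 750}  # hypothetical sizes in base pairs
print(gel_order(lanes))                                 # ['Lane A', 'Lane C', 'Lane B']
print(same_pattern([400, 750], [750, 400]))             # True
```

Real gels are read against a size ladder, so exact equality is an idealization; in practice bands are compared by approximate position.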

PCR (polymerase chain reaction)

PCR creates billions of identical copies of a specific DNA region in hours; this is amplification.

PCR cycles:

Denaturation: strands separate (high temperature)
Annealing: primers bind target sequences (cooler temperature)
Extension: heat-stable DNA polymerase extends from primers

Each cycle roughly doubles target DNA, leading to exponential amplification. PCR does not typically copy the whole genome; it amplifies the region bracketed by primers.
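
The doubling arithmetic is easy to check numerically. This sketch is idealized: the `efficiency` parameter is an assumption added for illustration (1.0 means every template is copied in every cycle), and real reactions eventually plateau.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after `cycles`; efficiency=1.0 means perfect doubling."""
    return initial_copies * (1 + efficiency) ** cycles

print(pcr_copies(1, 10))  # 1024.0
print(pcr_copies(1, 30))  # 1073741824.0
```

Starting from a single template, about 30 cycles of perfect doubling already give roughly a billion copies, which is why PCR can work from very small samples.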

DNA profiling: polymorphisms, RFLPs, and STRs

Individuals differ at many DNA sites; these differences are polymorphisms.

When restriction fragments from individuals of the same species are compared, they may differ in length because polymorphisms can create/remove restriction sites. These differences are restriction fragment length polymorphisms (RFLPs).

Another major profiling approach uses short tandem repeats (STRs). STR lengths vary widely among individuals; by analyzing multiple STR loci (often using PCR and electrophoresis), scientists can generate DNA profiles for forensics and paternity testing. The more loci analyzed, the lower the probability that two unrelated individuals share the same overall pattern.
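
The effect of adding loci can be illustrated with a simple product calculation. This is a sketch under a strong assumption, namely that the loci are inherited independently so per-locus frequencies can be multiplied; the frequencies are invented for illustration and the function name is hypothetical.

```python
from math import prod

def random_match_probability(genotype_frequencies: list[float]) -> float:
    """Multiply per-locus genotype frequencies (assumes independent loci)."""
    return prod(genotype_frequencies)

# Hypothetical genotype frequencies at successive STR loci, for illustration only.
frequencies = [0.1, 0.05, 0.2, 0.08]
for n in range(1, len(frequencies) + 1):
    print(n, "loci:", random_match_probability(frequencies[:n]))
```

Because every frequency is below 1, each additional locus can only shrink the product, so the chance of a coincidental match between unrelated individuals drops as more loci are analyzed.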

DNA sequencing (conceptual overview)

DNA sequencing determines the nucleotide order in DNA. Sequencing can identify mutations, compare species, diagnose genetic conditions, and support plasmid design work, in which scientists design plasmids to study genes of interest.

CRISPR-Cas9 genome editing

CRISPR-Cas9 (adapted from bacterial defenses) enables targeted editing:

A guide RNA targets a specific DNA sequence
Cas9 makes a double-stranded cut
Cellular repair can disrupt a gene or incorporate a new sequence if a donor template is provided

This has transformed research and raises ethical questions, especially for editing embryos and ecosystems.

Measuring gene expression: microarrays and RNA sequencing

Different cells express different sets of genes. Techniques measure expression by quantifying RNA levels:

Microarrays: hybridization of cDNA to probes on a chip
RNA sequencing (RNA-seq): reads and counts RNA-derived sequences

AP questions may provide expression data and ask which genes are upregulated, or which tissue samples cluster together.
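
Deciding which genes are upregulated usually comes down to comparing expression between two conditions. The sketch below uses a log2 fold-change cutoff, which is one common convention and an assumption here; the gene names and values are hypothetical, and control values are assumed to be nonzero.

```python
from math import log2

def upregulated(control: dict[str, float], treated: dict[str, float],
                min_log2_fc: float = 1.0) -> list[str]:
    """Return genes whose log2(treated / control) is at least `min_log2_fc`."""
    return [
        gene for gene in control
        if log2(treated[gene] / control[gene]) >= min_log2_fc
    ]

# Hypothetical expression values (arbitrary units), for illustration only.
control = {"geneA": 10.0, "geneB": 50.0, "geneC": 8.0}
treated = {"geneA": 40.0, "geneB": 45.0, "geneC": 16.0}
print(upregulated(control, treated))  # ['geneA', 'geneC']
```

A cutoff of 1.0 on the log2 scale corresponds to at least a doubling of expression; geneB drops slightly, so it is not reported.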

Exam Focus

Typical question patterns
- Interpret gel electrophoresis results (size comparisons, matching patterns, restriction digest outcomes).
- Explain PCR steps and predict which region is amplified from primer placement.
- Describe how plasmids/restriction enzymes/ligase enable cloning or gene insertion.
- Explain DNA profiling logic using STRs and/or RFLPs.
- Distinguish transformation (often bacterial) from transfection (eukaryotic).
Common mistakes
- Saying DNA moves toward the negative electrode (it moves toward positive).
- Claiming PCR works without primers or that it amplifies the whole genome (primers define endpoints).
- Confusing what a gel shows (fragment size) with what it directly identifies (gels show lengths, not gene names).