Chapter 11 - Transcriptional Control of Gene Expression

  • Synthesis of mRNA requires that an RNA polymerase initiate transcription, polymerize ribonucleoside triphosphates complementary to the DNA template strand, and then terminate transcription

11.1 - Overview of Eukaryotic Gene Control and RNA Polymerases

  • In bacteria, gene control serves mainly to allow a single cell to adjust to changes in its environment so that its growth and division can be optimized

  • In multicellular organisms, environmental changes also induce changes in gene expression

  • In most cases, once a developmental step has been taken by a cell, it is not reversed

  • So these decisions are fundamentally different from the reversible activation and repression of bacterial genes in response to environmental conditions

Most Genes in Higher Eukaryotes Are Regulated by Controlling Their Transcription

  • Direct measurements of the transcription rates of multiple genes in different cell types have shown that regulation of transcription initiation is the most widespread form of gene control in eukaryotes, as it is in bacteria

  • Nascent-chain analysis is a common method for determining the relative rates of transcription of different genes in cultured cells

  • The total radioactive label incorporated into RNA is a measure of the overall transcription rate

  • The fraction of the total labeled RNA produced by transcription of a particular gene (that is, its relative transcription rate) is determined by hybridizing the labeled RNA to the cloned DNA of that gene attached to a membrane
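
The calculation behind nascent-chain analysis can be sketched in a few lines. This is a minimal illustration only: the function name, the gene labels, and the counts are hypothetical and do not come from any experiment described in these notes.

```python
def relative_transcription_rates(hybridized_cpm, total_cpm):
    """Return each gene's share of the total labeled RNA.

    hybridized_cpm: dict mapping a gene label to the counts per minute of
                    labeled RNA that hybridized to that gene's cloned DNA.
    total_cpm:      counts per minute incorporated into all RNA
                    (a measure of the overall transcription rate).
    """
    if total_cpm <= 0:
        raise ValueError("total_cpm must be positive")
    return {gene: cpm / total_cpm for gene, cpm in hybridized_cpm.items()}


# Hypothetical counts, invented for illustration only.
rates = relative_transcription_rates(
    {"gene_A": 500.0, "gene_B": 50.0}, total_cpm=1_000_000.0
)
for gene, rate in rates.items():
    print(gene, rate)
```

Comparing the fractions for the same gene across cell types (rather than raw counts) is what allows relative transcription rates to be compared.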

Regulatory Elements in Eukaryotic DNA Often Are Many Kilobases from Start Sites

  • In eukaryotes, as in bacteria, a DNA sequence that specifies where RNA polymerase binds and initiates transcription of a gene is called a promoter

  • Transcription from a particular promoter is controlled by DNA-binding proteins, termed transcription factors, that are equivalent to bacterial repressors and activators

  • By constructing and analyzing a 5′-deletion series upstream of the TTR gene, researchers identified two control elements that stimulate reporter-gene expression in hepatocytes, but not in other cell types

  • One region mapped between ≈2.01 and 1.85 kb upstream of the TTR gene start site; the other mapped between ≈200 base pairs upstream and the start site

Three Eukaryotic Polymerases Catalyze Formation of Different RNAs

  • The nuclei of all eukaryotic cells examined so far (e.g., vertebrate, Drosophila, yeast, and plant cells) contain three different RNA polymerases, designated I, II, and III

    • These enzymes are eluted at different salt concentrations during ion-exchange chromatography and also differ in their sensitivity to α-amanitin, a poisonous cyclic octapeptide produced by some mushrooms

  • Each eukaryotic RNA polymerase catalyzes the transcription of genes encoding different classes of RNA. RNA polymerase I, located in the nucleolus, transcribes genes encoding precursor rRNA (pre-rRNA), which is processed into 28S, 5.8S, and 18S rRNAs

  • RNA polymerase III transcribes genes encoding tRNAs, 5S rRNA, and an array of small, stable RNAs, including one involved in RNA splicing (U6) and the RNA component of the signal-recognition particle (SRP) involved in directing nascent proteins to the endoplasmic reticulum

  • The two large subunits (RPB1 and RPB2) of all three eukaryotic RNA polymerases are related to each other and are similar to the E. coli β′ and β subunits, respectively

    • It is likely that all the subunits are necessary for eukaryotic RNA polymerases to function normally

The Largest Subunit in RNA Polymerase II Has an Essential Carboxyl-Terminal Repeat

  • The carboxyl end of the largest subunit of RNA polymerase II (RPB1) contains a stretch of seven amino acids that is nearly precisely repeated multiple times

  • Neither RNA polymerase I nor III contains these repeating units

  • This heptapeptide repeat, with a consensus sequence of Tyr-Ser-Pro-Thr-Ser-Pro-Ser, is known as the carboxyl-terminal domain (CTD)

  • In vitro experiments with model promoters first showed that RNA polymerase II molecules that initiate transcription have an unphosphorylated CTD

  • Once the polymerase initiates transcription and begins to move away from the promoter, many of the serine and some tyrosine residues in the CTD are phosphorylated
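
To make the idea of a "nearly precisely repeated" heptapeptide concrete, the sketch below splits a sequence into consecutive seven-residue blocks and counts how many match the consensus exactly or with a single substitution. The function name, the one-mismatch cutoff, and the toy sequence are assumptions made for illustration; it is not a real RPB1 sequence.

```python
# Tyr-Ser-Pro-Thr-Ser-Pro-Ser in one-letter amino acid codes
CONSENSUS = "YSPTSPS"


def count_heptads(ctd, max_mismatches=1):
    """Count consecutive 7-residue blocks that match the consensus.

    Returns (exact, near): blocks identical to the consensus, and blocks
    differing from it at 1..max_mismatches positions.
    Trailing residues that do not fill a whole block are ignored.
    """
    exact = near = 0
    usable = len(ctd) - len(ctd) % 7
    for i in range(0, usable, 7):
        block = ctd[i:i + 7]
        mismatches = sum(a != b for a, b in zip(block, CONSENSUS))
        if mismatches == 0:
            exact += 1
        elif mismatches <= max_mismatches:
            near += 1
    return exact, near


# Toy sequence (hypothetical): four exact repeats and one with a Ser->Ala change.
toy_ctd = "YSPTSPS" * 3 + "YSPTSPA" + "YSPTSPS"
print(count_heptads(toy_ctd))  # → (4, 1)
```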

RNA Polymerase II Initiates Transcription at DNA Sequences Corresponding to the 5’ Cap of mRNAs

  • Several experimental approaches have been used to identify DNA sequences at which RNA polymerase II initiates transcription

  • Approximate mapping of the transcription start site is possible by exposing cultured cells or isolated nuclei to 32P-labeled ribonucleotides for very brief times

  • The precise base pair where RNA polymerase II initiates transcription in the adenovirus late transcription unit was determined by analyzing the RNAs synthesized during in vitro transcription of adenovirus DNA restriction fragments that extended somewhat upstream and downstream of the approximate initiation region determined by nascent-transcript analysis

  • Similar in vitro transcription assays with other cloned eukaryotic genes have produced similar results

  • In each case, the start site was found to be equivalent to the capped 5’ sequence of the corresponding mRNA

11.2 - Regulatory Sequences in Protein-Coding Genes

  • Expression of eukaryotic protein-coding genes is regulated by multiple protein-binding DNA sequences, generically referred to as transcription-control regions

The TATA Box, Initiators, and CpG Islands Function as Promoters in Eukaryotic DNA

  • The first genes to be sequenced and studied in in vitro transcription systems were viral genes and cellular protein-coding genes that are very actively transcribed either at particular times of the cell cycle or in specific differentiated cell types

  • In all these rapidly transcribed genes, a conserved sequence called the TATA box was found ≈25–35 base pairs upstream of the start site

  • Instead of a TATA box, some eukaryotic genes contain an alternative promoter element called an initiator

  • Most naturally occurring initiator elements have a cytosine (C) at the -1 position and an adenine (A) residue at the transcription start site (+1)
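
The positional rules above (a TATA box ≈25–35 base pairs upstream, or an initiator with C at -1 and A at +1) can be checked on a sequence with a short script. This is a sketch under stated assumptions: the pattern TATA(A/T)A(A/T) is a commonly cited TATA-box consensus that is not given in these notes, and the function names and toy sequence are hypothetical.

```python
import re

# Assumed TATA-box consensus, TATA(A/T)A(A/T); not taken from these notes.
TATA_PATTERN = re.compile(r"TATA[AT]A[AT]")


def find_tata_box(seq, tss_index, window=(25, 35)):
    """Return the position (e.g. -31) of a TATA-like motif whose first base
    lies within `window` bp upstream of the start site, or None.

    tss_index is the 0-based index of the +1 base; position -n is index
    tss_index - n (there is no position 0).
    """
    near, far = window
    for upstream in range(far, near - 1, -1):
        start = tss_index - upstream
        if start >= 0 and TATA_PATTERN.match(seq, start):
            return -upstream
    return None


def has_initiator_core(seq, tss_index):
    """True if the sequence has C at -1 and A at the start site (+1)."""
    return tss_index >= 1 and seq[tss_index - 1] == "C" and seq[tss_index] == "A"


# Toy promoter (hypothetical): TATAAAA begins 31 bp upstream of the +1 A.
toy = "G" * 10 + "TATAAAA" + "G" * 23 + "CA" + "G" * 10
tss = 41
print(find_tata_box(toy, tss), has_initiator_core(toy, tss))  # → -31 True
```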

Promoter-Proximal Elements Help Regulate Eukaryotic Genes

  • Recombinant DNA techniques have been used to systematically mutate the nucleotide sequences upstream of the start sites of various eukaryotic genes in order to identify transcription-control regions

  • By now, hundreds of eukaryotic genes have been analyzed, and scores of transcription-control regions have been identified

  • One approach frequently taken to determine the upstream border of a transcription-control region for a mammalian gene involves constructing a set of 5′ deletions

  • Once the 5′ border of a transcription-control region is determined, analysis of linker scanning mutations can pinpoint the sequences with regulatory functions that lie between the border and the transcription start site

  • Changes in spacing between the promoter and promoter-proximal control elements of 20 nucleotides or fewer had little effect

  • Insertions of 30 to 50 base pairs between a promoter-proximal element and the TATA box were equivalent to deleting the element

  • Similar analyses of other eukaryotic promoters have also indicated that considerable flexibility in the spacing between promoter-proximal elements is generally tolerated, but separations of several tens of base pairs may decrease transcription

Distant Enhancers Often Stimulate Transcription by RNA Polymerase II

  • Transcription from many eukaryotic promoters can be stimulated by control elements located thousands of base pairs away from the start site

  • Such long-distance transcription-control elements, referred to as enhancers, are common in eukaryotic genomes but fairly rare in bacterial genomes

  • Soon after the discovery of the SV40 enhancer, enhancers were identified in other viral genomes and in eukaryotic cellular DNA

  • Some of these control elements are located 50 or more kilobases from the promoter they control

Most Eukaryotic Genes Are Regulated by Multiple Transcription-Control Elements

  • Initially, enhancers and promoter-proximal elements were thought to be distinct types of transcription-control elements

  • As more enhancers and promoter-proximal elements were analyzed, the distinctions between them became less clear

  • The S. cerevisiae genome contains regulatory elements called upstream activating sequences (UASs), which function similarly to enhancers and promoter-proximal elements in higher eukaryotes

11.3 - Activators and Repressors of Transcription

  • The various transcription-control elements found in eukaryotic DNA are binding sites for regulatory proteins

Footprinting and Gel-Shift Assays Detect Protein-DNA Interactions

  • In yeast, Drosophila, and other genetically tractable eukaryotes, numerous genes encoding transcriptional activators and repressors have been identified by classical genetic analyses

  • Two common techniques for detecting such cognate proteins are DNase I footprinting and the electrophoretic mobility shift assay

  • DNase I footprinting takes advantage of the fact that when a protein is bound to a region of DNA, it protects that DNA sequence from digestion by nucleases

    • Footprinting also identifies the specific DNA sequence to which the transcription factor binds

  • The electrophoretic mobility shift assay (EMSA), also called the gel-shift or band-shift assay, is more useful than the footprinting assay for quantitative analysis of DNA-binding proteins

    • Generally, the electrophoretic mobility of a DNA fragment is reduced when it is complexed to protein, causing a shift in the location of the fragment band

  • In the biochemical isolation of a transcription factor, an extract of cell nuclei commonly is subjected sequentially to several types of column chromatography

  • Once a transcription factor is isolated and purified, its partial amino acid sequence can be determined and used to clone the gene or cDNA encoding it

Activators Are Modular Proteins Composed of Distinct Functional Domains

  • Studies with a yeast transcription activator called GAL4 provided early insight into the domain structure of transcription factors

  • The gene encoding the GAL4 protein, which promotes the expression of enzymes needed to metabolize galactose, was identified by complementation analysis of gal4 mutants

  • A remarkable set of experiments with gal4 deletion mutants demonstrated that the GAL4 transcription factor is composed of separable functional domains: an N-terminal DNA-binding domain, which binds to specific DNA sequences, and a C-terminal activation domain, which interacts with other proteins to stimulate transcription from a nearby promoter

  • The presence of flexible domains connecting the DNA-binding domains to activation domains may explain why alterations in the spacing between control elements are so well-tolerated in eukaryotic control regions

Repressors Are the Functional Converse of Activators

  • Eukaryotic transcription is regulated by repressors as well as activators

  • A type of unregulated, abnormally high expression is called constitutive expression and results from the inactivation of a repressor that normally inhibits the transcription of these genes

  • Repressor-binding sites in DNA have been identified by systematic linker scanning mutation

  • In this type of analysis, mutation of an activator-binding site leads to decreased expression of the linked reporter gene, whereas mutation of a repressor-binding site leads to increased expression of a reporter gene

  • Eukaryotic transcription repressors are the functional converse of activators

  • They can inhibit transcription from a gene they do not normally regulate when their cognate binding sites are placed within a few hundred base pairs of the gene’s start site
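
The logic for reading a linker scanning experiment (expression falls when an activator-binding site is mutated, rises when a repressor-binding site is mutated) can be written out as a small classifier. This is only a sketch: the function name, the 20% tolerance, the window labels, and the expression values are all hypothetical.

```python
def classify_windows(wild_type, mutants, tolerance=0.2):
    """Label each mutated window by its effect on reporter expression.

    wild_type: reporter expression from the unmutated control region.
    mutants:   dict mapping a window label to the reporter expression
               measured when that window is mutated.
    tolerance: fractional change treated as no effect (assumed cutoff).
    """
    if wild_type <= 0:
        raise ValueError("wild_type expression must be positive")
    calls = {}
    for window, expression in mutants.items():
        ratio = expression / wild_type
        if ratio < 1 - tolerance:
            calls[window] = "activator site"   # expression decreased
        elif ratio > 1 + tolerance:
            calls[window] = "repressor site"   # expression increased
        else:
            calls[window] = "no element detected"
    return calls


# Hypothetical reporter measurements, invented for illustration only.
calls = classify_windows(
    wild_type=100.0,
    mutants={"window_1": 20.0, "window_2": 105.0, "window_3": 300.0},
)
for window, call in calls.items():
    print(window, call)
```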

DNA-Binding Domains Can Be Classified into Numerous Structural Types

  • The DNA-binding domains of eukaryotic activators and repressors contain a variety of structural motifs that bind specific DNA sequences

  • The ability of DNA-binding proteins to bind to specific DNA sequences commonly results from noncovalent interactions between atoms in an α helix in the DNA-binding domain and atoms on the edges of the bases within a major groove in the DNA

  • A structural element, which is present in many bacterial repressors, is called a helix-turn-helix motif

  • There are several common classes of DNA-binding proteins whose three-dimensional structures have been determined

  • In all these examples and many other transcription factors, at least one α helix is inserted into a major groove of DNA

  • Homeodomain Proteins: Many eukaryotic transcription factors that function during development contain a conserved 60-residue DNA-binding motif that is similar to the helix-turn-helix motif of bacterial repressors

  • Zinc-Finger Proteins: A number of different eukaryotic proteins have regions that fold around a central Zn²⁺ ion, producing a compact domain from a relatively short length of the polypeptide chain

  • The C2H2 zinc finger is the most common DNA-binding motif encoded in the human genome and the genomes of most other multicellular animals

  • It is also common in multicellular plants but is not the dominant type of DNA-binding domain in plants as it is in animals

  • The second type of zinc-finger structure, designated the C4 zinc finger (because it has four conserved cysteines in contact with the Zn²⁺), is found in ≈50 human transcription factors

  • A characteristic feature of C4 zinc fingers is the presence of two groups of four critical cysteines, one toward each end of the 55- or 56-residue domain

  • Leucine-Zipper Proteins: Another structural motif present in the DNA-binding domains of a large class of transcription factors contains the hydrophobic amino acid leucine at every seventh position in the sequence

  • These proteins bind to DNA as dimers, and mutagenesis of the leucines showed that they were required for dimerization

  • GCN4 forms dimers via hydrophobic interactions between the C-terminal regions of the α helices, forming a coiled-coil structure

  • This structure is common in proteins containing amphipathic α helices in which hydrophobic amino acid residues are regularly spaced alternately three or four positions apart in the sequence, forming a stripe down one side of the α helix

  • The first leucine-zipper transcription factors to be analyzed contained leucine residues at every seventh position in the dimerization region

    • Additional DNA-binding proteins containing other hydrophobic amino acids in these positions subsequently were identified

  • Basic Helix-Loop-Helix (bHLH) Proteins: The DNA-binding domain of another class of dimeric transcription factors contains a structural motif very similar to the basic-zipper motif except that a non-helical loop of the polypeptide chain separates two α-helical regions in each monomer

Transcription-Factor Interactions Increase Gene-Control Options

  • Two types of DNA-binding proteins discussed in the previous section—basic-zipper proteins and bHLH proteins—often exist in alternative heterodimeric combinations of monomers

  • In some heterodimeric transcription factors, each monomer has a different DNA-binding specificity

  • The resulting combinatorial possibilities increase the number of potential DNA sequences that a family of transcription factors can bind

  • Three different factor monomers theoretically could combine to form six homo- and heterodimeric factors

  • Four different factor monomers could form a total of 10 dimeric factors; five monomers, 15 dimeric factors; and so forth

  • Similar combinatorial transcriptional regulation is achieved through the interaction of structurally unrelated transcription factors, such as NFAT and AP1

  • Neither NFAT nor AP1 binds to its site in the IL-2 control region in the absence of the other

  • The affinities of the factors for these particular DNA sequences are too low for the individual factors to form a stable complex with DNA

  • But, when both NFAT and AP1 are present, protein-protein interactions between them stabilize the DNA ternary complex composed of NFAT, AP1, and DNA

  • Cooperative binding by NFAT and AP1 occurs only when their weak binding sites are located at a precise distance, quite close to each other in DNA

  • Recent studies have shown that the requirements for cooperative binding are not so stringent in the case of some other transcription factors and control regions
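
The dimer counts quoted earlier in this section follow from simple combinatorics: n monomers can pair into n(n+1)/2 homo- and heterodimers, which gives 6, 10, and 15 for three, four, and five monomers. The sketch below enumerates the pairs directly; the monomer names M1, M2, and so on are placeholders, and the count is a theoretical maximum that assumes every pair can actually dimerize.

```python
from itertools import combinations_with_replacement


def possible_dimers(monomers):
    """List every unordered pair of monomers, including homodimers."""
    return list(combinations_with_replacement(monomers, 2))


for n in (3, 4, 5):
    names = [f"M{i}" for i in range(1, n + 1)]  # placeholder monomer names
    dimers = possible_dimers(names)
    # The enumeration agrees with the closed form n(n+1)/2.
    assert len(dimers) == n * (n + 1) // 2
    print(n, "monomers ->", len(dimers), "dimers")
```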

Structurally Diverse Activation and Repression Domains Regulate Transcription

  • Experiments with fusion proteins composed of the GAL4 DNA-binding domain and random segments of E. coli proteins demonstrated that a diverse group of amino acid sequences (≈1% of all E. coli sequences) can function as activation domains, even though they evolved to perform other functions

  • Biophysical studies indicate that acidic activation domains have an unstructured, random-coil conformation

  • These domains stimulate transcription when they are bound to a protein co-activator

  • The interaction with a co-activator causes the activation domain to assume a more structured α-helical conformation in the activation domain–co-activator complex

  • Some activation domains are larger and more highly structured than acidic activation domains

Multiprotein Complexes Form on Enhancers

  • As noted previously, enhancers generally range in length from about 50 to 200 base pairs and include binding sites for several transcription factors

  • The multiple transcription factors that bind to a single enhancer are thought to interact

  • The term enhanceosome has been coined to describe such large nucleoprotein complexes that assemble from transcription factors as they bind cooperatively to their multiple binding sites in an enhancer

  • HMGI binds to the minor groove of DNA regardless of the sequence and, as a result, bends the DNA molecule sharply

11.4 - Transcription Initiation by RNA Polymerase II

General Transcription Factors Position RNA Polymerase II at Start Sites and Assist in Initiation

  • In vitro transcription by purified RNA polymerase II requires the addition of several initiation factors that are separated from the polymerase during purification

  • These initiation factors, which position polymerase molecules at transcription start sites and help to melt the DNA strands so that the template strand can enter the active site of the enzyme, are called general transcription factors

  • The general transcription factors that assist Pol II in the initiation of transcription from most TATA-box promoters in vitro have been isolated and characterized

Sequential Assembly of Proteins Forms the Pol II Transcription Preinitiation Complex in Vitro

  • Detailed biochemical studies revealed how the Pol II preinitiation complex, comprising a Pol II molecule and general transcription factors bound to a promoter region of DNA, is assembled

  • Once TBP has bound to the TATA box, TFIIB can bind

  • TFIIB is a monomeric protein, slightly smaller than TBP

  • The C-terminal domain of TFIIB makes contact with both TBP and DNA on either side of the TATA-box, while its N-terminal domain extends toward the transcription start site

  • The helicase activity of one of the TFIIH subunits uses energy from ATP hydrolysis to unwind the DNA duplex at the start site, allowing Pol II to form an open complex in which the DNA duplex surrounding the start site is melted and the template strand is bound at the polymerase active site

In Vivo Transcription Initiation by Pol II Requires Additional Proteins

  • Although the general transcription factors discussed above allow Pol II to initiate transcription in vitro, another general transcription factor, TFIIA, is required for initiation by Pol II in vivo

  • Purified TFIIA forms a complex with TBP and TATA-box DNA. X-ray crystallography of this complex shows that TFIIA interacts with the side of TBP that is upstream from the direction of transcription

  • The TAF subunits of TFIID appear to play a role in initiating transcription from promoters that lack a TATA box

11.5 - Molecular Mechanisms of Transcription Activation and Repression

Formation of Heterochromatin Silences Gene Expression at Telomeres, near Centromeres, and in Other Regions

  • For many years it has been clear that inactive genes in eukaryotic cells are often associated with heterochromatin, regions of chromatin that are more highly condensed and stain more darkly with DNA dyes than euchromatin, where most transcribed genes are located

  • Regions of chromosomes near the centromeres and telomeres and additional specific regions that vary in different cell types are organized into heterochromatin

  • The promoters and UASs controlling transcription of the a and α genes lie near the center of the DNA sequence that is transferred and are identical whether the sequences are at the MAT locus or at one of the silent loci

  • Consequently, the function of the transcription factors that interact with these sequences is somehow blocked at HML and HMR

  • When an E. coli methylase that acts on GATC sequences was expressed in yeast cells, researchers found that GATC sequences within the MAT locus and most other regions of the genome were methylated, but not those within the HML and HMR loci

  • These results indicate that the DNA of the silent loci is inaccessible to the E. coli methylase and presumably to proteins in general, including transcription factors and RNA polymerase

  • Genetic studies led to the identification of several proteins (RAP1 and three SIR proteins) that are required for repression of the silent mating-type loci and the telomeres in yeast

  • RAP1 was found to bind within the DNA silencer sequences associated with HML and HMR and to a sequence that is repeated multiple times at each yeast chromosome telomere

Repressors Can Direct Histone Deacetylation at Specific Genes

  • The importance of histone deacetylation in chromatin-mediated gene repression has been further supported by studies of eukaryotic repressors that regulate genes at internal chromosomal positions

  • These proteins are now known to act in part by causing deacetylation of histone tails in nucleosomes that bind to the TATA box and promoter-proximal region of the genes they repress

  • The SIN3-RPD3 complex functions as a co-repressor

  • Co-repressor complexes containing histone deacetylases also have been found associated with many repressors from mammalian cells

  • Some of these complexes contain the mammalian homolog of SIN3 (mSin3), which interacts with the repressor protein

  • Other histone deacetylase complexes identified in mammalian cells appear to contain additional or different repressor-binding proteins

  • The discovery of mSin3-containing histone deacetylase complexes provides an explanation for earlier observations

    • In vertebrates, transcriptionally inactive DNA regions often contain the modified cytidine residue 5-methylcytidine (mC) followed immediately by a G, whereas transcriptionally active DNA regions lack mC residues
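
Because the modified residue occurs at a C followed immediately by a G, the candidate sites in a sequence can be located by scanning for CG dinucleotides. The sketch below only finds where mC could occur; it says nothing about whether a given site is actually methylated. The function name and the toy sequence are hypothetical.

```python
def cpg_positions(seq):
    """Return 0-based indices of every C that is immediately followed by G."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


# Toy sequence, invented for illustration only.
toy_dna = "ATCGCGTTACGGC"
print(cpg_positions(toy_dna))  # → [2, 4, 9]
```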

Activators Can Direct Histone Acetylation at Specific Genes

  • Genetic and biochemical studies in yeast led to the discovery of a large multiprotein complex containing the protein GCN5, which has histone acetylase activity

  • Maximal transcription activation by GCN4 depends on these histone acetylase complexes, which thus function as co-activators

  • A similar activation mechanism operates in higher eukaryotes

  • One domain of CBP binds the phosphorylated acidic activation domain in the CREB transcription factor

  • Other domains of CBP interact with different activation domains in other transcription factors

  • Yet another domain of CBP has histone acetylase activity, and another CBP domain associates with a multiprotein histone acetylase complex that is homologous to the yeast GCN5-containing complex

Modifications of Specific Residues in Histone Tails Control Chromatin Condensation

  • Histone tails in chromatin can undergo reversible phosphorylation of serine and threonine residues, reversible monoubiquitination of a lysine residue in the H2A C-terminal tail, and irreversible methylation of lysine residues
