A common problem
in biological research is being faced with a protein that is difficult
to subdivide into functional domains. The problem is intensified
if the essential region of the protein is large and/or sequence
homology to known polypeptides is lacking. Targeted mutagenesis
strategies may be cumbersome or prohibitive under these conditions.
A few years ago, we confronted all of the above difficulties while
attempting to separate functions within the essential ~300 amino
acid N-terminus of Gcr1p. We reasoned that random mutagenesis (we
used error-prone PCR), followed by selection for fully functional
variants, would identify the equivalent of conserved and non-conserved
subsections of the N-terminal region. We recovered 24 functional
alleles of the GCR1 gene, based on the ability of each to
complement the severe glucose non-responsive phenotype resulting from the removal of Gcr1p.
An alignment of these variants resembles one obtained for a
group of homologous proteins from different organisms. Because in
our approach the variants are selected in the same organism (i.e., in an
isogenic background), we call the method "unigenic evolution"
(Deminoff et al. 1995).

The 24 variants contained
a total of 315 nucleotide substitutions, 200 of which were missense
mutations. As expected, certain subregions were either over- or
under-represented for missense substitutions. However, the distribution
could have been due to hot or cold spots, respectively, in the template
for mutagenesis. We therefore normalized to the background of silent
mutations, which contribute to the total density of nucleotide substitutions
but have no effect on the protein. We fine-tuned the normalization
to account for the fact that different codons produce a different
spectrum of missense and silent mutations, given all possible single-nucleotide
changes; ordinarily, stop codons would not occur in a functional
allele. We then compared the expected and the observed ratio of missense
to total mutations, averaged over a moving 20-codon window. The
resulting bar graph (see Figure below) displays regions that are
either hypo- or hypermutable, depending on the normalized frequency
of missense mutations. We used the chi-square test to assess the
statistical significance of these data. We also found good agreement
between unigenic evolution analysis and independent experiments
that used deletions and site-directed point mutations to probe Gcr1p
function.
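
The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used in the study: it assumes the standard genetic code, treats all single-nucleotide changes as equally likely, ignores changes that create stop codons, takes the simple mean of per-codon expectations within a window, and uses a one-degree-of-freedom chi-square test. The function names and the toy input are hypothetical.

```python
from itertools import product
from math import erfc, sqrt

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the standard nuclear code applies to the gene being analyzed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def expected_missense_fraction(codon):
    """Fraction of the single-nucleotide changes to `codon` that are missense.

    Changes that create a stop codon are skipped, since they should not
    appear in a functional allele. All changes are weighted equally,
    which is a simplifying assumption.
    """
    missense = silent = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == "*":
                continue
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                silent += 1
            else:
                missense += 1
    return missense / (missense + silent)


def scan_windows(codons, missense_counts, silent_counts, window=20):
    """Compare expected and observed missense fractions in a moving window.

    Returns one tuple per window: (start, expected, observed, p_value).
    observed and p_value are None when the window has no mutations or
    the expectation is degenerate (exactly 0 or 1).
    """
    expected = [expected_missense_fraction(c) for c in codons]
    results = []
    for start in range(len(codons) - window + 1):
        end = start + window
        m = sum(missense_counts[start:end])
        s = sum(silent_counts[start:end])
        total = m + s
        p_exp = sum(expected[start:end]) / window
        if total == 0 or p_exp in (0.0, 1.0):
            results.append((start, p_exp, None, None))
            continue
        # Chi-square goodness of fit with two categories (missense, silent).
        chi2 = ((m - total * p_exp) ** 2 / (total * p_exp)
                + (s - total * (1 - p_exp)) ** 2 / (total * (1 - p_exp)))
        # Survival function of the chi-square distribution with 1 d.f.
        p_value = erfc(sqrt(chi2 / 2))
        results.append((start, p_exp, m / total, p_value))
    return results


# Toy input: 20 glycine codons carrying 30 silent and no missense changes.
toy = scan_windows(["GGG"] * 20, [0] * 20, [1, 2] * 10)
```

In this toy window the observed missense fraction falls far below the expectation, which is the signature of a hypomutable region; a window whose observed fraction matches the expectation would give a p-value near 1.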

Together, the four
hypomutable regions we identified (A, B, C, and D) occupy less than
half of the region we set out to analyze. Further work confirmed
that each contains individual amino acids that are required for
Gcr1p function. Analysis of individual hypomutable regions has yielded
intriguing results. (1) A LexAp fusion to region A, the smallest
of the four hypomutable domains, is able to coimmunoprecipitate
with Gcr2p. (2) Although many inactivating point mutations in region
C do not destabilize the protein, its deletion does, which suggests
that some or many of the hydrophobic residues that predominate in
region C could contribute to the formation of important internal
regions of Gcr1p. (3) Region B can be subdivided into two smaller
hypomutable domains, and has not yet been correlated with known
Gcr1p functions (i.e., activation and Rap1p contact). (4) Further
inspection of the primary sequence of region D revealed an excellent
match to leucine zipper motifs; we showed that this motif is essential for
Gcr1p homodimer formation. We have since gone on to characterize
the role of this dimerization domain in Gcr1p function (Deminoff and
Santangelo 2001).

Others have adapted
the unigenic evolution technique to their studies (Friedman et
al. 2003; Zeng et al. 2003; San Filippo and Lambowitz
2002; Guo et al. 2000; Friedman and Cech 1999). We feel that
the method should be widely applicable, particularly now, as genomic
studies provide previously uncharacterized genes as candidates for
involvement in various processes. The requirements of the method
are modest: (1) mutation of the gene of interest must produce an
observable phenotype, (2) the organism must be transformable with a
library of randomly mutagenized alleles, and (3) the mutagenized
alleles must be recoverable for sequencing. Thus, although it is ideally
suited to genetic studies in Saccharomyces or other easily
manipulated microorganisms, unigenic evolution in higher eukaryotes
can be envisioned.

References:
Friedman KL, Heit
JJ, Long DM, and TR Cech, 2003. N-terminal domain of yeast telomerase
reverse transcriptase: Recruitment of Est3p to the telomerase complex.
Molecular Biology of the Cell 14:1-13.
Zeng X, Zhang D, Dorsey
M, and J Ma, 2003. Hypomutable regions of yeast TFIIB in a unigenic
evolution test represent structural domains. Gene 309:29-56.
San Filippo J and
AM Lambowitz, 2002. Characterization of the C-terminal DNA-binding/DNA
endonuclease region of a group II intron-encoded protein. Journal
of Molecular Biology 324:933-951.
Deminoff SJ, and GM
Santangelo, 2001. Rap1p requires Gcr1p and Gcr2p homodimers to activate
ribosomal protein and glycolytic genes, respectively. Genetics 158:133-43.
Guo H, Karberg M,
Long M, Jones JP 3rd, Sullenger B, and AM Lambowitz, 2000. Group
II introns designed to insert into therapeutically relevant DNA
target sites in human cells. Science 289:452-457.
Friedman KL, and TR
Cech, 1999. Essential functions of amino-terminal domains in the
yeast telomerase catalytic subunit revealed by selection for viable
mutants. Genes Dev. 13:2863-2874.
Deminoff SJ, Tornow J,
and GM Santangelo, 1995. Unigenic evolution: a novel genetic
method localizes a putative leucine zipper that mediates dimerization
of the Saccharomyces cerevisiae regulator Gcr1p. Genetics 141:1263-74.