Two Hybrid Proteins to Many Hybrid Proteins
Initially rejected, this powerful method for detecting hybrid
protein pairs is being modified for still wider uses in the age of
By now, the "two-hybrid" approach for looking at
interactions between pairs of proteins by expressing them in yeast is
well established and widely used. But not so long ago, it took shape as
a scheme to obtain a grant. Once the idea was in hand, however, we had
to get organized for a new line of experiments to prove that the concept
could be realized.
The idea underlying the two-hybrid approach arrived fully formed late
one afternoon in early 1987 as the answer to the question: What can I
propose for a seed grant in biotechnology that the State University of
New York at Stony Brook would sponsor to bring additional funding to my
laboratory? My plan was to develop a patentable product based on our
studies of transcriptional activation in the yeast Saccharomyces cerevisiae.
Eukaryotic Transcription Factors Seemed Simpler in the mid-1980s
At that time, Mark Ptashne and Kevin Struhl and their collaborators
at Harvard University in Cambridge, Mass., were also studying
transcriptional activators in yeast. Such activators were shown by these
laboratories to contain two essential domains: a DNA-binding domain to
contact the specific genes to be activated and a transcriptional
activation domain to recruit and bind the transcriptional machinery.
Roger Brent and Ptashne remarkably demonstrated that the DNA-binding
domain from the yeast Gal4 protein could be swapped with that from the
bacterial repressor LexA to create a hybrid protein that activates
transcription in yeast of a reporter gene regulated by lexA operators, thus
highlighting the modularity of these factors. Jun Ma and Ptashne soon
identified the critical Gal4 domains, including a potent carboxy-terminal
activation domain of about 100 residues.
Others were gaining additional insights into DNA-binding factors. For
instance, Steven McKnight at the Carnegie Institution of Washington
found that VP16, a herpesvirus protein, binds to DNA and also to a
cellular factor to activate transcription. And Ma and Ptashne later
directly demonstrated that DNA-binding and transcriptional activation
functions could reside on separate molecules that noncovalently
Not until years later would biologists come to appreciate how many
other factors are involved, including mediators, adaptors, histone
acetylases and deacetylases, chromatin remodeling factors, and TATA-binding
protein-associated factors. In retrospect, we were lucky that the
picture then seemed so uncomplicated. Knowledge of that complexity might
have dissuaded us from pursuing so simple an idea as two-hybrid.
Despite a Rejected Proposal, the Two-Hybrid Research Moved Ahead
Based on these transcription results and our desire for fresh funding
support, we proposed exploiting the genetics underlying transcription
factor modularity as an approach to addressing a biochemical problem. In
our 1987 grant proposal, we described plans "to develop a novel
genetic selection in yeast that can detect protein-protein interactions
The goal is to use yeast genetics to identify from a clone bank any
genes encoding proteins that are capable of forming a complex with a
given protein." The major potential of the assay, to discover
interactions based on screening thousands of genes, was already
implicit in our concept of the assay. However, the intended scheme (Fig.
1) for reaching this goal was not based on results from other
experiments we had carried out in the laboratory.
The seed grant review panel promptly rejected our proposal. In
fairness, it is difficult to know whether any untried technology will
succeed, and there was then no strong reason to believe that our plan
for a two-hybrid assay would work. But this experience illustrates
contradictory features of the granting process that apply universally:
the need for funding can drive the generation of good ideas, but good
ideas may fail to be recognized as worthy of funding.
I was fortunate in 1987 to have a wonderful department chairman,
Eckard Wimmer, who told me not to worry about funding, but to continue
doing interesting science. So we began to gather reagents to test the
proposed method. Moreover, we repackaged the rejected proposal for
Procter and Gamble's University Exploratory Research Program, and the
company's positive response to our ideas provided a big boost for our
subsequent experiments. Meanwhile, when Marian Carlson from Columbia
University visited our campus, she described how a yeast protein kinase,
Snf1, associates with another yeast protein, Snf4. With so few
well-characterized protein interactions to examine, Carlson and her
colleagues kindly agreed to let us test these proteins in our system.
In July 1988, we used the Snf1-Snf4 pair of proteins in our first
attempt at a two-hybrid assay. That first assay was set up with the lacZ
gene, encoding β-galactosidase, as the reporter gene in our yeast
strain. The idea was that when the Gal4 DNA-binding domain fused to Snf1
was bound upstream of the lacZ gene, it would not turn on
transcription because Snf1 does not possess an activation domain;
similarly, when the Snf4 protein fused to the Gal4 activation domain was
expressed, it would not turn on transcription because it does not
localize to the lacZ gene. Both of these predictions proved true,
as either hybrid alone in the strain produced essentially 0 units of β-galactosidase
activity. The key test, then, was to express the two hybrids in the same
cells and see if the interaction of Snf1 with Snf4 resulted in lacZ
expression via the two-hybrid scheme. Unfortunately, this combination of
hybrids produced a measly 1 unit of activity, barely above the
These first results could be viewed in two lights: the two-hybrid
assay worked, which was terrific, or the assay yielded such a marginal
signal that no one apart from us would be convinced, which was
disappointing. Believing that the assay was working, albeit poorly, we
thought that the low activity was due to expression problems.
Hence, we spent about six months reengineering the plasmids before we
tried the assay again. In that next try, the result was identical: 1
β-galactosidase unit. Thinking then that the yeast strain might be the
source of this poor signal problem, we asked Ptashne's collaborator
Grace Gill to send us another strain, GGY1::171, which proved crucial.
In early 1989, using the same combination of Snf1 and Snf4 plasmids in
this yeast strain, the assay yielded an unequivocal result: 180 β-galactosidase
units. Shortly thereafter, we published our first description of the
Two-Hybrid Assay Becomes an Established Procedure
Soon Paul Bartel, who joined my group as a postdoctoral fellow, began
working with Cheng-ting Chien, a graduate student with Rolf Sternglanz
at the State University of New York, Stony Brook, to adapt this assay to
conduct two-hybrid searches. Their efforts led them to construct new
vectors, to generate the first library of activation domain clones, and
to work out protocols for screening such a library. One of the next
tests of this approach was to look for interacting partners of the yeast
Sir4 protein, which was implicated in transcriptional silencing, by
beginning with a DNA-binding domain hybrid of Sir4 and a yeast
activation domain library. Those tests revealed an activation domain
insert that also contained Sir4. These experiments demonstrated the
feasibility of isolating interacting proteins by the two-hybrid
In the early 1990s, the pace of development picked up significantly
as other researchers began using the assay. In particular, Stephen
Elledge at Baylor College of Medicine, Houston, Tex., Harold Weintraub
at the Fred Hutchinson Cancer Research Center, Seattle, Wash., Richard
Treisman at the Imperial Cancer Research Fund, London, Daniel Nathans
at Johns Hopkins University, Baltimore, Md., Roger Brent at
Massachusetts General Hospital, Boston, and their respective
collaborators built a series of new reporter strains, vectors, and
libraries. With multiple laboratories distributing reagents, protocols,
and advice, many other members of the biological community began to use
the two-hybrid assay. In addition, successful early two-hybrid searches
that used key proteins such as human Ras and human immunodeficiency
virus (HIV) gag as the "bait" lent credibility and visibility
to the method. During that period, many of the key reagents needed for
conducting the assay were made available commercially, another
development that encouraged wider use of the technology.
Meanwhile, in mid-1991, Paul Bartel and I began discussing how to
conduct two-hybrid searches in parallel as a way of probing genomes. Our
goal was to determine which proteins encoded by an organism's complete
genome might interact with which other proteins in that organism.
Compiling this information could help toward achieving two complementary
goals. First, proteins with known functions could be further
characterized by finding additional proteins with which they interact,
either as part of defined cellular structures or in particular metabolic
pathways. Second, this approach could help to determine functions of
previously unrecognized proteins that are identified during the course
of DNA sequencing projects, by associating these proteins with other,
To put this parallel-probing scheme to a test, we focused on the Escherichia
coli bacteriophage T7, whose genome encodes about 55 proteins. We
then identified multiple interactions after conducting concurrent
two-hybrid searches using libraries of both DNA-binding domain hybrids
and activation domain hybrids. This approach also proved effective for
identifying interactions between adjacent domains within the same
polypeptide, with the two-hybrid signal sometimes reflecting the
three-dimensional structure of a protein. These experiments also
uncovered an unusual interaction between two proteins encoded in
overlapping reading frames along the same stretch of T7 DNA, suggesting
that adjacent interacting domains of what once encoded a single
polypeptide became separated by frameshift mutations to encode a pair of
Challenge of Studying Large Numbers of Yeast Hybrids Necessitated
In 1995, after moving to the University of Washington (UW), my
colleagues and I began planning a large-scale analysis of yeast
protein-protein interactions. With its estimated complement of about
6,000 proteins, 100 times more than that of T7, Saccharomyces
cerevisiae presents combinatorial possibilities 10,000-fold more
complex than what we faced when analyzing the bacteriophage. However,
the yeast genome sequence was about to be completed; we routinely used
yeast as the host organism for this assay, meaning the analyses would not
depend on expressing any foreign proteins as hybrids; working with yeast
cells meant that sophisticated genetic strategies were at hand for
interpreting interaction data; and at that time, researchers already had
described more than 300 two-hybrid interactions involving yeast
Although we first planned to scale up the approach we used for T7,
David Botstein of Stanford University among others persuaded us to take
advantage of genomic sequence data in a more directed approach.
Meanwhile, Lee Hood in the UW Department of Molecular Biotechnology made
us aware of the usefulness of an array format from his work with DNA
arrays. However, it was not immediately obvious how to clone the 6,000
genes of yeast into a two-hybrid vector to create a new type of array in
a timely fashion. In addition, we needed to generate inserts in the
appropriate orientation and reading frame if the hybrids were to prove
These challenges helped push us to develop new methods. Working with
Research Genetics, Inc. and with generous support from Amgen, Inc., we
sought to take advantage of a cloning strategy described in the
mid-1980s by Botstein, who was then at the Massachusetts Institute of
Technology in Cambridge, Mass., and his collaborators. To do so, we
designed a set of 6,000 primer pairs that could be used to amplify by
means of PCR each yeast open reading frame (ORF).
A key feature of this approach is that it uses a common 5′ flanking
sequence of about 20 nucleotides in each of the forward primers, and
another common 5′ flanking sequence of similar length in each of the
back primers (Fig. 2). In a second step, these common sequences enable
us to reamplify the entire set of PCR products with a single pair of
oligonucleotide primers that contain 70 nucleotides (70-mer).
Subsequently, by having these 70-mer sequences in the products from the
second round of amplifications, we can generate separate clones in yeast
simply by introducing these PCR products along with a linearized vector,
whose ends match the 70-mer sequences on all of the PCR products.
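As an illustration only, the two rounds of amplification can be modeled as string operations in Python. The tail and vector sequences below are made up, since the actual sequences are not given here, and the split of each 70-mer into about 50 nucleotides of vector homology plus the 20-nucleotide common tail is an assumption of this sketch. The point it shows is that every ORF, whatever its length, ends up flanked by the same two vector-matching ends.

```python
# Minimal sketch of two-round PCR with common flanking tails.
# All sequences here are hypothetical placeholders.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


FWD_TAIL = "ACGTTGCAAGCTTGGATCCA"   # hypothetical 20-nt common forward tail
REV_TAIL = "TGCATCGAGGTACCGAATTC"   # hypothetical 20-nt common back tail
VEC_LEFT = "ACCGT" * 10             # hypothetical 50 nt matching one vector end
VEC_RIGHT = "GGTCA" * 10            # hypothetical 50 nt matching the other end

# Assumed layout of the single second-round primer pair (70 nt each).
SECOND_FWD = VEC_LEFT + FWD_TAIL
SECOND_REV = VEC_RIGHT + REV_TAIL


def first_round_primers(orf: str, anneal_len: int = 20) -> tuple[str, str]:
    """ORF-specific primers, each carrying a common 5' tail."""
    forward = FWD_TAIL + orf[:anneal_len]
    back = REV_TAIL + reverse_complement(orf[-anneal_len:])
    return forward, back


def first_round_product(orf: str) -> str:
    """Top strand after round one: tail + ORF + complement of back tail."""
    return FWD_TAIL + orf + reverse_complement(REV_TAIL)


def second_round_product(product: str) -> str:
    """Top strand after reamplifying with the common 70-mer pair."""
    if not (product.startswith(FWD_TAIL)
            and product.endswith(reverse_complement(REV_TAIL))):
        raise ValueError("product lacks the common flanking tails")
    return VEC_LEFT + product + reverse_complement(VEC_RIGHT)
```

Because the second-round ends depend only on the common tails, the same set of products could in principle be recombined into any linearized vector whose ends carry the matching sequences.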
This recombinational cloning procedure is efficient and independent
of insert size, while also orientation- and reading frame-specific. It
also lends itself readily to a microtiter format, meaning that about 400
clones can be generated in less than a day. And most importantly, in
light of our thinking about other genomic approaches besides two-hybrid,
the set of PCR products can be cloned into any vector that contains the
short sequences that match the 70-mer flanking sequences. Thus,
additional arrays can be built to create virtually any type of fusion
Initial Applications of New Methods for Studying Yeast Protein
Our first efforts to apply this new strategy to the study of yeast
protein hybrids led us to construct an activation domain hybrid protein
array in one mating type of yeast. We then assayed each of these hybrids
against a single DNA-binding domain hybrid protein by mating the array
to a strain of opposite mating type (Fig. 3). Diploids that grow on
media selective for the two-hybrid reporter gene HIS3 represent
putative protein interactions, and the identity of these positive ORFs
is immediately apparent by their positions in the array. Peter Uetz and
Gerard Cagney, postdoctoral fellows in my laboratory, screened more than
500 yeast proteins against this array and detected hundreds of potential
hybrid protein pairs.
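The way array position reveals identity can be shown with a short Python sketch. It assumes 96-position plates (8 rows by 12 columns) filled in order, and the ORF names are hypothetical placeholders; the actual layout of the array may differ.

```python
# Minimal sketch: look up which ORF sits at each position where diploids grow.
# Plate geometry and ORF names are assumptions for illustration.

ROWS = "ABCDEFGH"
N_COLS = 12


def build_array(orf_names):
    """Assign ORFs, in order, to (plate, row, column) positions."""
    layout = {}
    per_plate = len(ROWS) * N_COLS
    for index, name in enumerate(orf_names):
        plate, within = divmod(index, per_plate)
        row, col = divmod(within, N_COLS)
        layout[(plate + 1, ROWS[row], col + 1)] = name
    return layout


def identify_positives(layout, growing_positions):
    """Translate positions that grow on selective medium into ORF names."""
    return [layout[position] for position in growing_positions]


# Hypothetical example: 200 placeholder ORFs, two positions showing growth.
layout = build_array([f"ORF{i:04d}" for i in range(200)])
positives = identify_positives(layout, [(1, "A", 2), (2, "H", 12)])
```

No sequencing step is needed in this scheme, because the layout table alone maps each positive colony to its ORF.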
At the same time that we assembled and began to test that array, we began
collaborating with researchers at CuraGen Corporation in New Haven,
Conn. They followed a similar strategy, but used two arrays of yeast
ORFs instead of a single array: activation domain hybrids and
DNA-binding domain hybrids. They pooled all of their activation domain
transformants to create a normalized library of full-length ORFs, and
then screened nearly the entire set of DNA-binding domain hybrids one by
one against this library, identifying positives by sequencing the
inserts. This approach also yielded hundreds of putative protein
To visualize and analyze these many interactions involving pairs of S.
cerevisiae proteins, postdoctoral fellow Uetz and computer scientist
Benno Schwikowski used the complete dataset of yeast protein-protein
interactions to assemble simulated networks, each one containing a group
of interacting proteins depicted by protein-protein links. The largest
network contains over 1,500 proteins, and its member proteins are
involved in some 2,300 interactions. Proteins can be highlighted to show
functional annotations for characterized proteins, demonstrating that
proteins with similar annotations tend to cluster in discrete regions of
the large network. The network analysis also allows us to visualize
interactions of proteins within and between different cellular
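One simple way to group pairwise interactions into networks is to find the connected components of the interaction graph. The Python sketch below does this with a breadth-first traversal. It is a generic illustration, not a description of the software used for the published analysis, and apart from Snf1, Snf4, and Sir4 the protein names are hypothetical.

```python
# Minimal sketch: group interacting proteins into networks
# (connected components of an undirected graph).
from collections import defaultdict, deque


def connected_components(interactions):
    """Return sets of proteins linked by interactions, largest first."""
    adjacency = defaultdict(set)
    for a, b in interactions:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            protein = queue.popleft()
            component.add(protein)
            for partner in adjacency[protein]:
                if partner not in seen:
                    seen.add(partner)
                    queue.append(partner)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Hypothetical input: ProtA and ProtB are placeholders.
pairs = [("Snf1", "Snf4"), ("Snf4", "ProtA"), ("Sir4", "Sir4"), ("ProtA", "ProtB")]
networks = connected_components(pairs)
```

With real data, the first element of the result would correspond to the largest network, and functional annotations could then be overlaid on its members.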
Other Analytic Possibilities Involving Arrays Abound
Recognizing that the array format is especially useful for
genome-wide analysis of protein functions, we began to consider other
assays that take advantage of this strategy. For instance, the
activation domain array can be used to search for RNA-binding activities
by a three-hybrid assay, and for DNA-binding activities by a one-hybrid
Arrays also allow new genomic selections. For example, we have been
collaborating with Marvin Wickens and his colleagues at the University
of Wisconsin to develop screens to detect proteins involved in the
biogenesis, processing, translation, or decay of RNA.
In its initial format, the activation domain array depends on each
element being presented in the context of a living yeast colony. However, we
realized that this array could be converted into an array of purified
proteins. In collaboration with Eric Phizicky and Elizabeth Grayhack and
their colleagues at the University of Rochester, Rochester, N.Y., we
developed a method to rapidly associate biochemical activities with the
genes that encode them. First, an array of glutathione S-transferase
fusions of yeast ORFs is generated by recombinational cloning. Next, the
fusions are purified by glutathione chromatography in 64 sets of 96
fusions each, corresponding to the 64 microtiter plates of PCR products
of the entire set of yeast ORFs. Finally, the resulting 64 pools of
proteins are assayed for various biochemical activities, and the source
of the signal within a positive pool is pinpointed by assaying new pools
corresponding to each of the eight rows and twelve columns of a
microtiter plate. In this way, biochemical activities can be associated
with distinct yeast proteins and the genes that encode them with only a
few days of work, beginning with the purified protein pools.
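The pooling logic can be sketched in Python as follows. The sketch assumes a hypothetical `assay_pool` function that returns True when a pool shows activity, and it uses placeholder protein names. With one active protein, 64 plate-pool assays plus 8 row and 12 column assays (84 in all) replace 6,144 individual assays.

```python
# Minimal sketch of pooled screening with row/column deconvolution.
# `assay_pool` and the protein names are hypothetical stand-ins.

ROWS = "ABCDEFGH"
COLS = range(1, 13)


def make_plates(n_plates=64):
    """Build placeholder plates: one dict per plate, (row, col) -> protein."""
    return [
        {(r, c): f"plate{p + 1}_{r}{c}" for r in ROWS for c in COLS}
        for p in range(n_plates)
    ]


def deconvolve_plate(plate, assay_pool):
    """Assay the 8 row pools and 12 column pools; return intersections."""
    active_rows = [r for r in ROWS
                   if assay_pool([plate[(r, c)] for c in COLS])]
    active_cols = [c for c in COLS
                   if assay_pool([plate[(r, c)] for r in ROWS])]
    return [plate[(r, c)] for r in active_rows for c in active_cols]


def find_active(plates, assay_pool):
    """Assay each whole-plate pool, then deconvolve only the positive ones."""
    candidates = []
    for plate in plates:
        if assay_pool(list(plate.values())):
            candidates.extend(deconvolve_plate(plate, assay_pool))
    return candidates
```

If a plate holds more than one active protein, the row and column intersections are only candidates, and each would need to be assayed individually to confirm it.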
We envision that biologists will soon have access to complete sets of
S. cerevisiae genes to study with different approaches and in
different formats (Fig. 4). The genes can be used directly in DNA
arrays, for example, to determine genome-wide profiles of transcription.
The genes also can be cloned into expression vectors to allow
overproduction for phenotypic screens, or fusion either to epitope tags
for immunochemical approaches, to purification tags for biochemical
assays, or to green fluorescent protein for localization studies.
These and other arrays will complement the set of gene deletions
produced by a consortium of laboratories. In this way, new genomic
analyses can be conceived and executed by any average-sized yeast
laboratory, not simply those focused on genomics. And what is soon the
case for yeast may prove to be true for other more complex organisms,
including the fruit fly Drosophila with its 14,000 genes, the
roundworm Caenorhabditis elegans with its 18,000 genes, and
humans with as few as 35,000 but possibly many more genes, as cloned
sets of these genes become available.
I thank Mark Johnston and Eric Phizicky for comments
on the manuscript and the former and present members of the laboratory
who participated in these studies. Current support is from the National
Center for Research Resources of the NIH, and I am an investigator of
the Howard Hughes Medical Institute.
Bartel, P. L., J. A. Roecklein, D. SenGupta, and S.
Fields. 1996. A protein linkage map of
Escherichia coli bacteriophage T7. Nature Genet. 12:72-77.
Brachmann, R. K., and J. D. Boeke.
1997. Tag games in yeast: the two-hybrid system and beyond. Curr. Opin.
Brent, R., and R. L. Finley, Jr.
1997. Understanding gene and allele function with two-hybrid methods.
Annu. Rev. Genet. 31:663-704.
Fields, S., and O.-K. Song. 1989.
A novel genetic system to detect protein-protein interactions. Nature 340:245-246.
Frederickson, R. M. 1998.
Macromolecular matchmaking: advances in two-hybrid and related
technologies. Curr. Opin. Biotechnol. 9:90-96.
Hudson, J. R., Jr., E. P. Dawson, K. L. Rushing, C. H.
Jackson, D. Lockshon, D. Conover, C. Lanciault, J. R. Harris, S. J.
Simmons, R. Rothstein, and S. Fields. 1997.
The complete set of predicted genes from Saccharomyces cerevisiae in
a readily usable form. Genome Res. 7:1169-1173.
Martzen, M. R., S. M. McCraith, S. L. Spinelli, F. M.
Torres, S. Fields, E. J. Grayhack, and E. M. Phizicky. 1999.
A biochemical genomics approach for identifying genes by the activity of
their products. Science 286:1153-1155.
Schwikowski, B., P. Uetz, and S. Fields.
2000. A network of protein-protein interactions in yeast. Nature
Uetz, P., L. Giot, G. Cagney, T. A. Mansfield, R. S.
Judson, J. R. Knight, D. Lockshon, V. Narayan, M. Srinivasan, P. Pochart,
A. Qureshi-Emili, Y. Li, B. Godwin, D. Conover, T. Kalbfleisch, G.
Vijayadamodar, M. Yang, M. Johnston, S. Fields, and J. M. Rothberg.
2000. A comprehensive analysis of protein-protein interactions in Saccharomyces
cerevisiae. Nature 403:623-627.
Vidal, M., and P. Legrain.
1999. Yeast forward and reverse 'n'-hybrid systems. Nucleic Acids Res. 27:919-929.