Microbial Genomics Grows in Maturity
and Status
Research efforts across broad fronts, including
forensics, are making genomics a powerhouse approach for microbiologists
Carol Potera
"What a difference a year makes," says Claire
Fraser, referring to the expanded status of microbial genomics since the
first joint meeting of The Institute for Genomic Research (TIGR) and ASM
early in 2001 (see ASM News, June 2001, p. 247). The anthrax
attacks last fall demonstrated the nation's vulnerability to
bioterrorism, highlighting shortcomings in understanding not only Bacillus
anthracis but also other infectious agents that terrorists might
use (as well as a range of other microorganisms).
"The deficiencies that came to light surrounding
anthrax are just as acute as for any other biowarfare agent," says
Fraser, director of TIGR in Rockville, Md., who cochaired the 2nd ASM
& TIGR Conference on Microbial Genomics, which was held in Las
Vegas, Nev., in February. "So the work we do in microbial genomics
takes on more significance than we ever imagined." That
significance is reflected in President Bush's latest federal budget
proposal, which seeks a whopping $1.3 billion to fund research on
bioterrorism countermeasures at the National Institutes of Health (NIH),
including a substantial scaling up of genomic sequencing projects
focused on microorganisms that might be used for bioterrorism.
To conduct such studies, people will need to be trained,
databases constructed, and improved analytic tools invented to handle
the flood of information that will be generated, Fraser and others point
out. However, she adds, "We have to make sure that research into
biowarfare microbes does not dwarf research into environmental
organisms, which are as important to the health of the planet as
pathogens are to the health of humans."
Judging from the diversity of topics presented during
the conference in Las Vegas, microbial genomics experts are eager to
meet these broad challenges. As the fledgling field matures and genomic
data expand, researchers are finding that breakthroughs from focusing on
one microbe diffuse and generate insights about others. Yet, if sequence
conservation across microbial genera helps to accelerate
microbiologists' understanding of many species, the recent flood of
genomic information is leading experts to reflect on this rush of
progress and to warn against arrogance and gullibility. In other
words, even with a good deal of sequencing work completed, plenty else
is left to be done.
The field of microbial genomics is no longer in its
infancy, and "the high quality of presentations at this meeting
reflects the maturing of this once-obscure field," says George
Weinstock of the Baylor College of Medicine in Houston, Tex., who with
Fraser cochaired the conference and organized its scientific program.
Moreover, he adds, the field has become more vital than ever
"because of the bioterrorism issues."
Genomics Experts Tackle Malaria, Meet Some
Challenges, but Face Many Others
An international consortium of researchers who agreed to
cooperate in determining the genomic sequence of the malaria parasite, Plasmodium
falciparum, serves as a model for what genomics is accomplishing and
what remains to do to eradicate this disease. "After six years of
very hard work, the genome of P. falciparum is nearing
completion," says Malcolm Gardner of TIGR, who coordinated efforts
of consortium researchers working at TIGR, the Sanger Institute in
Cambridge, England, Stanford University in Stanford, Calif., and the
Naval Medical Research Center in Bethesda, Md.
P. falciparum protozoan parasites are transmitted
by mosquitoes and cause 300-500 million new cases of malaria and as many
as 2.7 million deaths per year, particularly affecting children in
Africa. New treatments and a vaccine are needed because the parasite
often proves resistant to chloroquine, a relatively cheap and widely
used drug for treating this disease. Reports describing the sequence of
the parasite's genome, which contains 5,000-6,000 genes assigned to 14
chromosomes, are expected to be published this year. Throughout the
course of this sequencing project, preliminary information was posted
regularly on publicly available websites, providing useful data to other
researchers working on malaria. "There have been dozens of papers
in the last few years that depended on preliminary genome data,"
Gardner says, including the identification of potential new targets for
drugs.
Meanwhile, John Yates and his colleagues at the Scripps
Research Institute in La Jolla, Calif., are studying protein expression
in Plasmodium, using detailed information from the genomic
analysis as a springboard for dealing with this higher-order challenge.
As they began to address this challenge, they developed an automated
method for doing shotgun proteomics that exploits multidimensional
liquid chromatography and mass spectrometry. So far, 1,600 proteins have
been identified in the four life stages of the parasite. The functional
profiles of many of these proteins agree well with what occurs during
these distinct physiological stages. For instance, cell surface antigens
are expressed when the organism is exposed to the human immune system.
"We have a fairly good picture of what's going on, and it will get
more comprehensive as time goes on," Yates says.
Other researchers are mining the genomic database in
search of prospective new drug targets. "When I look at the genomic
data, my excitement comes from knowing that we have a really good chance
of understanding what it takes to make a good drug," says Pradip
Rathod of the University of Washington in Seattle. He debunks the
traditional strategy that calls for starving the parasite to death with
a drug that inhibits an essential enzyme. Based on in-vitro experiments,
Rathod finds that inhibiting an essential enzyme can cause substrates to
accumulate and induce cellular changes, leading to rapid development of
drug resistance in some strains. "It's a much more complicated
picture than is generally painted for drug targets," he says.
The apicoplast, a mysterious cellular organelle within
the parasite that apparently arose from an algal chloroplast, offers
another possible drug target. "This organelle is not present in the
human host, therefore, it offers a wealth of opportunity for potential
drug targets," says David Roos of the University of Pennsylvania in
Philadelphia. Toxoplasma gondii, which causes toxoplasmosis, and
several other parasites also contain apicoplasts. He and his colleagues
exhaustively analyzed Plasmodium genome databases to identify
proteins encoded within the 35-kilobase genome of the apicoplast and
then constructed a hypothetical metabolic pathway "to test
experimentally," he says.
While some prospect for better drug targets, Stephen
Hoffman of Celera in Rockville, Md., believes that genomics will lead to
the development of an effective vaccine for preventing malaria. New
drugs are impractical, he argues, because they are too expensive and
difficult to distribute to those who need them most. The sporozoite, the
stage of the parasite transmitted to humans through the bite of an infected
mosquito, seems a likely point at which a vaccine could block
the infectious cycle. However, unlike other infectious agents that are
cultured and attenuated for incorporation into a vaccine, sporozoites
defy culturing. "This isn't anything like making a vaccine for
measles," he says.
Even with information about the P. falciparum genomic
sequence in hand, "we're not ready to utilize the information yet
to get the best vaccine," Hoffman says. The proteins and epitopes
that might be appropriate targets for vaccines that confer protective
immunity still need to be identified. An effort to develop a database
describing stage-specific expressions of proteins, RNA, DNA, and
single-nucleotide polymorphisms of this parasite needs to be organized,
according to Hoffman. "Everything's left to do for
biologists," he says. He points out that, in 1962, President John
Kennedy promised to eradicate malaria by the end of that decade. Yet
there are more cases of malaria in some parts of the world today than
there were 40 years ago. To bridge the gap from genomics to a vaccine
will take "a lot of hard work and reality checking of where we
stand in our understanding of host and parasite interactions," he
says.
Deinococcus radiodurans is an organism some genomics researchers are
focusing on because of its remarkable DNA repair abilities. (Photo
courtesy of Michael Daly, Uniformed Services University of the Health
Sciences, Bethesda, Md.)
Making Sense of Genomic Data
Reality checking also applies to monitoring and
interpreting other channels through which the flood of genomic
information now pours. As databases cascade, it is time to step back and
question whether everything being said about them is true, says Jonathan
Eisen of TIGR. For instance, he questions numerous claims based on
genomic analyses of frequent horizontal gene transfers, in which a gene
or genes from one species supposedly moved into another. According to
the International Human Genome Sequencing Consortium, hundreds of human genes
appear to have been transferred horizontally from bacteria throughout
evolution. "I do not believe there is any good evidence that it has
occurred," Eisen says.
The methods used to infer horizontal gene transfer are
ambiguous and lack sound scientific proof, according to Eisen. Those
methods include analyses for detecting unusual distribution patterns of
genes, high sequence similarities in genomic segments from distantly
related species, and other unusual patterns of evolution. "These
methods should not be used to infer horizontal gene transfer," he
argues, noting that the same results could be obtained through gene
losses, variation in evolutionary rates, plasmid contamination,
limitations inherent in BLAST analysis, or by using incomplete genome
data. In samples as large as entire genomes, unusual patterns are
expected to occur by chance alone. "But we have no idea how
frequently," he says. Genomic researchers need to design computer
simulations of model systems to determine the probability of horizontal
gene transfer, then devise unequivocal methods for detecting such transfers.
"I hope people in the future think much more carefully about
inferring horizontal gene transfer," he says.
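Eisen's point that unusual patterns turn up by chance in genome-sized samples can be illustrated with a very small simulation. The sketch below is not a method described at the conference. The gene count, the per-gene false-positive rate, and the function name are all assumptions chosen only for illustration. It models a genome in which no gene has been transferred and counts how many genes a detection test would still flag.

```python
import random


def simulate_chance_hits(n_genes, false_positive_rate, n_trials, seed=0):
    """Count genes flagged as 'unusual' per trial under a null model.

    In this null model no gene has been transferred, so every flagged
    gene is a chance event (a false positive).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    counts = []
    for _ in range(n_trials):
        flagged = sum(
            1 for _ in range(n_genes) if rng.random() < false_positive_rate
        )
        counts.append(flagged)
    return counts


# Hypothetical inputs: a 4,000-gene genome and a test that wrongly flags
# 1% of ordinary genes. Neither number comes from the article.
N_GENES = 4000
FALSE_POSITIVE_RATE = 0.01

counts = simulate_chance_hits(N_GENES, FALSE_POSITIVE_RATE, n_trials=200)
mean_flagged = sum(counts) / len(counts)
expected_flagged = N_GENES * FALSE_POSITIVE_RATE

print(f"expected by chance: {expected_flagged:.0f} genes")
print(f"simulated mean:     {mean_flagged:.1f} genes")
print(f"simulated range:    {min(counts)} to {max(counts)} genes")
```

Under these assumed numbers, dozens of genes look "unusual" in every simulated genome even though none was transferred, which is why a raw count of unusual genes says little without a null model to compare it against.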
Another challenging hurdle for those interpreting
genomic information is to relate DNA sequence data to protein function
(also, see box, p. 272). For example, Barbara Methe of TIGR is studying
psychrophilic (cold-loving) Colwellia sp. 348, which was
collected in the ocean off Greenland. Although much of the earth is
colonized by such psychrophiles, which grow below 20°C, little is known
about how their enzymes function at near-freezing temperatures.
Psychrophiles hold potential applications for the manufacturing of
detergents, food processing, and drug manufacturing.
Theoretically, proteins from such cold-loving organisms
should be flexible and contain amino acids that interfere with hydrogen
bonds and other interactions that tend to stiffen proteins, according to
Methe. Hence, she looked at amino acid changes in sequences of 70
proteins that partake either in metabolic or in repair and replication
processes in Colwellia, then compared them to sequences for
proteins from two close relatives, Vibrio cholerae and Shewanella
oneidensis, which grow at higher temperatures.
In Colwellia, levels of the amino acids arginine,
proline, and alanine are lower, whereas lysine, asparagine, and serine
are higher, relative to Vibrio and Shewanella, according
to Methe. "These amino acid changes make sense, based on what we
know about cold adaptation," she says. For instance, both lysine
and arginine are positively charged, but arginine stabilizes proteins
more by forming multivalent hydrogen bonds. Since cold-adapted enzymes
need flexibility, "it makes sense that there's more lysine than
arginine," says Methe. Likewise, the amino acid asparagine is easily
disrupted at high temperatures. Thus, it seems logical that asparagine
is abundant in Colwellia, yet is relatively scarce in
thermophiles. From a teleological viewpoint, the amino acid pattern
"fits with what we know about what cold-loving enzymes need,"
she says.
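The kind of comparison Methe describes, tallying amino acid usage across sets of proteins from different organisms, can be sketched in a few lines. This is only an illustration of the bookkeeping, not her analysis pipeline. The function names are hypothetical and the short sequences are invented toy data, not real Colwellia or Vibrio proteins.

```python
from collections import Counter


def amino_acid_fractions(proteins):
    """Return the fraction of each one-letter amino acid across all proteins."""
    counts = Counter()
    for sequence in proteins:
        counts.update(sequence.upper())
    total = sum(counts.values())
    return {residue: n / total for residue, n in counts.items()}


def lysine_to_arginine_ratio(proteins):
    """Ratio of lysine (K) to arginine (R) usage in a set of proteins."""
    fractions = amino_acid_fractions(proteins)
    if fractions.get("R", 0.0) == 0.0:
        raise ValueError("no arginine found; ratio is undefined")
    return fractions.get("K", 0.0) / fractions["R"]


# Invented toy sequences standing in for a cold-adapted organism and a
# relative that grows at higher temperatures (illustration only).
cold_adapted = ["MKKSNLKAKSNG", "MSKNKLSKRG"]
warm_adapted = ["MRRPALRARPAG", "MARPRLAKRG"]

cold_ratio = lysine_to_arginine_ratio(cold_adapted)
warm_ratio = lysine_to_arginine_ratio(warm_adapted)

print(f"K/R ratio, cold-adapted set: {cold_ratio:.2f}")
print(f"K/R ratio, warm-adapted set: {warm_ratio:.2f}")
```

In a real study the same tally would run over matched proteins (the article mentions 70) from each organism, and the differences would need a statistical test before being read as adaptation.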
Sleuths Use Genomics To Investigate Origins of
Anthrax Bioweapons
The incidents last year involving letters laced with B.
anthracis spores that led to a series of deadly anthrax cases
delivered a message that ancient scourges can be transformed into modern
weapons. And, to find out who might have prepared and delivered these
deadly materials, some researchers are adapting sensitive genomics-based
analyses to forensic and epidemiologic investigations (see box, p. 274).
"To do epidemiology, you need to identify regions
of the genome that mutate much more quickly than the standard rate to
distinguish between strains," says Paul Keim of Northern Arizona
University in Flagstaff, who is studying Yersinia pestis,
which causes bubonic plague, as well as B. anthracis. "So we
glommed onto a method used in eukaryotic genomes," he says. That
method focuses on DNA segments that contain variable numbers of tandem
repeats (VNTR). VNTRs within pathogens tend to be "more
subtle" than those found within eukaryotes, according to Keim.
Keim and his colleagues pass the pathogens through 100,000 generations in
about two months and measure mutations with PCR primers based on VNTRs.
Over that period, Y. pestis mutates about
10-fold faster than B. anthracis. When this method was applied to
bubonic plague specimens from 220 natural human cases in Madagascar, 201
genotypes with distinctive geographic clustering were identified. When
this same method was applied to 270 specimens from 270 anthrax cases in
U.S. cattle, Keim and his collaborators found 63 distinctive strains
whose patterns and sites of origin correlate with old cattle trails in
the western United States. "Anthrax appears to have established
itself and spread via cattle trails," he says.
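The idea behind VNTR typing, counting repeat copies at several loci and grouping isolates that share the same counts, can be sketched as follows. This is a simplified illustration that works on sequence strings rather than PCR fragment sizes, and it is not the procedure used by Keim's laboratory. The locus names, repeat units, and isolate sequences are all invented.

```python
from collections import defaultdict


def count_tandem_repeats(sequence, unit):
    """Return the longest run of back-to-back copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    for start in range(len(sequence)):
        run = 0
        position = start
        # Walk forward one repeat unit at a time while copies keep matching.
        while sequence.startswith(unit, position):
            run += 1
            position += len(unit)
        best = max(best, run)
    return best


def vntr_profile(amplicons, repeat_units):
    """Copy number at each locus, in sorted locus order, as a tuple."""
    return tuple(
        count_tandem_repeats(amplicons[locus], repeat_units[locus])
        for locus in sorted(repeat_units)
    )


# Invented loci and isolates (illustration only).
REPEAT_UNITS = {"locus_a": "CAAT", "locus_b": "GGT"}
isolates = {
    "isolate_1": {"locus_a": "TTG" + "CAAT" * 3 + "ACG",
                  "locus_b": "AC" + "GGT" * 5 + "TA"},
    "isolate_2": {"locus_a": "TTG" + "CAAT" * 4 + "ACG",
                  "locus_b": "AC" + "GGT" * 5 + "TA"},
    "isolate_3": {"locus_a": "TTG" + "CAAT" * 3 + "ACG",
                  "locus_b": "AC" + "GGT" * 5 + "TA"},
}

profiles = {name: vntr_profile(seqs, REPEAT_UNITS)
            for name, seqs in isolates.items()}

# Isolates with identical copy-number profiles share a genotype.
genotypes = defaultdict(list)
for name, profile in profiles.items():
    genotypes[profile].append(name)

for profile, names in sorted(genotypes.items()):
    print(profile, names)
```

With more loci, each added locus further subdivides the isolates, which is how a modest panel of fast-mutating regions can separate closely related strains.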
The Ames strain of B. anthracis, which is the one
that was used in the bioterrorist mailings late in 2001, is considered
rare in nature. It was isolated from a cow in Texas in 1980 and later
distributed to researchers worldwide for study. The VNTR method offers a
sensitive approach for differentiating among slightly differing Ames
strains and, thus, is a potential means for tracing the source of the
tainted letters. "If you look deeply into the genome, you can find
regions that mutate faster and use them to solve particular
problems," Keim says. This approach to using VNTR to study Y.
pestis and B. anthracis could also be applied to other
pathogens.
Genomic information can solve lingering mysteries in
clinical medicine, too. Group A streptococcus (GAS) infections cause
sore throats that cost $2 billion yearly to treat in the U.S., and are
also the leading cause of pediatric rheumatic fever globally. The M1
serotype of GAS is the most common cause of disease. James Musser and
his colleagues at the National Institute of Allergy and Infectious
Diseases Rocky Mountain Laboratories in Hamilton, Mont., isolated and
sequenced a close relative, M18, that produces virulence factors, such
as scarlet fever toxins, and is associated with episodes of rheumatic
fever.
Rheumatic fever occurs at a low frequency in the United
States, with occasional epidemics such as the ones centered in Salt Lake
City during the early 1980s and again in 1997. Musser analyzed samples
collected during those epidemics and used DNA microarrays to determine
if the epidemics were related. "The recurrence, or recycling, of
the same M18 clone caused both epidemics," he says. Identifying M18
as the sole suspect "permits us to focus our efforts on the
long-term search for new therapies."
Sometimes in Genomics, Less Is More
Published reports describing efforts to sequence a
particular organism's genome typically list many authors from several
laboratories. For example, the list of authors for a 1997 report in Nature
that details the sequencing of the 4.2-megabase genome of Bacillus
subtilis recognizes 151 researchers from 46 laboratories. In
contrast, a more recently published report in the Proceedings of the
National Academy of Sciences (99:984-989, January 22, 2002)
describing the genomic sequence of the hyperthermophilic archaeon Pyrobaculum
aerophilum lists only six scientists. "This has to be the
smallest number of authors ever on a sequencing paper," says senior
author Jeffrey H. Miller of the Molecular Biology Institute at the
University of California, Los Angeles (UCLA).
The task of deciphering the 2.2-megabase genome of P.
aerophilum, an extremophile harvested from a boiling marine water
hole in Italy, was spearheaded by Miller's graduate student Sorel Fitz-Gibbon.
Although these efforts involved collaborators in the laboratories of
Karl Stetter of the University of Regensburg, Germany, and Melvin Simon
at the California Institute of Technology in Pasadena, "I was the
only person working full time on the project," Fitz-Gibbon says.
She had access to an automated DNA sequencer instrument on a part-time
basis, and completed the sequencing and annotation within five years.
Miller and Stetter chose to analyze the genomic sequence
of this microorganism, which grows at 104°C, because they consider it a
model for studying archaeal and thermophilic microbiology. Unlike most
other archaeal thermophiles, P. aerophilum tolerates oxygen,
making it amenable to a range of standard laboratory manipulations.
In addition to the genomic sequence, Fitz-Gibbon, now a
postdoctoral fellow at UCLA's Center for Astrobiology, Miller, and their
colleagues carried out other genetic studies to add perspective to the
sequence data. "This is one of the first really well-documented
examples of an organism living as a mutator as a permanent life
style," Miller says, explaining that this microbe has a higher
mutation rate than normal, probably because it lacks a repair system to
correct DNA replication errors. The organism thrives at temperatures
that denature many proteins, raising many questions about how its
enzymes and structural proteins function, and how it avoids lethal
damage to its DNA.
For biologists, completing a genomic sequence is really
the beginning of broader efforts to decipher the information that is
encoded in that DNA sequence, not the end of a project. "There are
a million things left to do," Fitz-Gibbon says. Archaea are an understudied
branch of life, and studying them could explain phylogenetic
relationships among all life forms. "We're just getting
started," she says.