ASM News

Microbial Genomics Grows in Maturity and Status

Research efforts across broad fronts, including forensics, are making genomics a powerhouse approach for microbiologists

Carol Potera

Microbial Genome Sequencing: a Window into Evolution and Physiology

"What a difference a year makes," says Claire Fraser, referring to the expanded status of microbial genomics since the first joint meeting of The Institute for Genomic Research (TIGR) and ASM early in 2001 (see ASM News, June 2001, p. 247). The anthrax attacks last fall demonstrated the nation's vulnerability to bioterrorism, highlighting not only shortcomings in understanding Bacillus anthracis, but also other infectious agents that terrorists might use (as well as a range of other microorganisms).

"The deficiencies that came to light surrounding anthrax are just as acute as for any other biowarfare agent," says Fraser, director of TIGR in Rockville, Md., who cochaired the 2nd ASM & TIGR Conference on Microbial Genomics, which was held in Las Vegas, Nev., in February. "So the work we do in microbial genomics takes on more significance than we ever imagined." That significance is reflected in President Bush's latest federal budget proposal, which seeks a whopping $1.3 billion to fund research on bioterrorism countermeasures at the National Institutes of Health (NIH), including a substantial scaling up of genomic sequencing projects focused on microorganisms that might be used for bioterrorism.

To conduct such studies, people will need to be trained, databases constructed, and improved analytic tools invented to handle the flood of information that will be generated, Fraser and others point out. However, she adds, "We have to make sure that research into biowarfare microbes does not dwarf research into environmental organisms, which are as important to the health of the planet as pathogens are to the health of humans."

Judging from the diversity of topics presented during the conference in Las Vegas, microbial genomics experts are eager to meet these broad challenges. As the fledgling field matures and genomic data expand, researchers are finding that breakthroughs from focusing on one microbe diffuse and generate insights about others. Yet, even as sequence conservation across microbial genera helps to accelerate microbiologists' understanding of many species, the recent flood of genomic information is leading experts to reflect on this rush of progress and to warn against both arrogance and gullibility. In other words, even with a good deal of sequencing work completed, plenty remains to be done.

The field of microbial genomics is no longer in its infancy, and "the high quality of presentations at this meeting reflects the maturing of this once-obscure field," says George Weinstock of the Baylor College of Medicine in Houston, Tex., who with Fraser cochaired the conference and organized its scientific program. Moreover, he adds, the field has become more vital than ever "because of the bioterrorism issues."

Genomics Experts Tackle Malaria, Meet Some Challenges, but Face Many Others

An international consortium of researchers who agreed to cooperate in determining the genomic sequence of the malaria parasite, Plasmodium falciparum, serves as a model for what genomics is accomplishing and what remains to do to eradicate this disease. "After six years of very hard work, the genome of P. falciparum is nearing completion," says Malcolm Gardner of TIGR, who coordinated efforts of consortium researchers working at TIGR, the Sanger Institute in Cambridge, England, Stanford University in Stanford, Calif., and the Naval Medical Research Center in Bethesda, Md.

P. falciparum protozoan parasites are transmitted by mosquitoes and cause 300-500 million new cases of malaria and as many as 2.7 million deaths per year, particularly affecting children in Africa. New treatments and a vaccine are needed because the parasite often proves resistant to chloroquine, a relatively cheap and widely used drug for treating this disease. Reports describing the sequence of the parasite's genome, which contains 5,000-6,000 genes assigned to 14 chromosomes, are expected to be published this year. Throughout the course of this sequencing project, preliminary information was posted regularly on publicly available websites, providing useful data to other researchers working on malaria. "There have been dozens of papers in the last few years that depended on preliminary genome data," Gardner says, including the identification of potential new targets for drugs.

Meanwhile, John Yates and his colleagues at the Scripps Research Institute in La Jolla, Calif., are studying protein expression in Plasmodium, using detailed information from the genomic analysis as a springboard for dealing with this higher-order challenge. As they began to address this challenge, they developed an automated method for doing shotgun proteomics that exploits multidimensional liquid chromatography and mass spectrometry. So far, 1,600 proteins have been identified in the four life stages of the parasite. The functional profiles of many of these proteins agree well with what occurs during these distinct physiological stages. For instance, cell surface antigens are expressed when the organism is exposed to the human immune system. "We have a fairly good picture of what's going on, and it will get more comprehensive as time goes on," Yates says.

Other researchers are mining the genomic database in search of prospective new drug targets. "When I look at the genomic data, my excitement comes from knowing that we have a really good chance of understanding what it takes to make a good drug," says Pradip Rathod of the University of Washington in Seattle. He debunks the traditional strategy that calls for starving the parasite to death with a drug that inhibits an essential enzyme. Based on in-vitro experiments, Rathod finds that inhibiting an essential enzyme can cause substrates to accumulate and induce cellular changes, leading to rapid development of drug resistance in some strains. "It's a much more complicated picture than is generally painted for drug targets," he says.

The apicoplast, a mysterious cellular organelle within the parasite that apparently arose from an algal chloroplast, offers another possible drug target. "This organelle is not present in the human host, therefore, it offers a wealth of opportunity for potential drug targets," says David Roos of the University of Pennsylvania in Philadelphia. Toxoplasma gondii, which causes toxoplasmosis, and several other parasites also contain apicoplasts. He and his colleagues exhaustively analyzed Plasmodium genome databases to identify proteins encoded within the 35-kilobase genome of the apicoplast and then constructed a hypothetical metabolic pathway "to test experimentally," he says.

While some prospect for better drug targets, Stephen Hoffman of Celera in Rockville, Md., believes that genomics will lead to the development of an effective vaccine for preventing malaria. New drugs are impractical, he argues, because they are too expensive and difficult to distribute to those who need them most. The sporozoite, the stage of the parasite transmitted to humans through the bite of an infected mosquito, seems a likely point at which a vaccine could block the infectious cycle. However, unlike other infectious agents that are cultured and attenuated for incorporation into a vaccine, sporozoites defy culturing. "This isn't anything like making a vaccine for measles," he says.

Even with information about the P. falciparum genomic sequence in hand, "we're not ready to utilize the information yet to get the best vaccine," Hoffman says. The proteins and epitopes that might be appropriate targets for vaccines that confer protective immunity still need to be identified. An effort to develop a database describing stage-specific expressions of proteins, RNA, DNA, and single-nucleotide polymorphisms of this parasite needs to be organized, according to Hoffman. "Everything's left to do for biologists," he says. He points out that, in 1962, President John Kennedy promised to eradicate malaria by the end of that decade. Yet there are more cases of malaria in some parts of the world today than there were 40 years ago. To bridge the gap from genomics to a vaccine will take "a lot of hard work and reality checking of where we stand in our understanding of host and parasite interactions," he says.

Deinococcus radiodurans is an organism some genomics researchers are focusing on because of its remarkable DNA repair abilities. (Photo courtesy of Michael Daly, Uniformed Services University of the Health Sciences, Bethesda, Md.)

Making Sense of Genomic Data

Reality checking also applies to monitoring—and interpreting—other channels through which the flood of genomic information now pours. As databases cascade, it is time to step back and question whether everything being said about them is true, says Jonathan Eisen of TIGR. For instance, he questions numerous claims based on genomic analyses of frequent horizontal gene transfers, in which a gene or genes from one species supposedly moved into another. According to the International Genome Sequencing Consortium, hundreds of human genes appear to have been transferred horizontally from bacteria throughout evolution. "I do not believe there is any good evidence that it has occurred," Eisen says.

The methods used to infer horizontal gene transfer are ambiguous and lack sound scientific proof, according to Eisen. Those methods include analyses for detecting unusual distribution patterns of genes, high sequence similarities in genomic segments from distantly related species, and other unusual patterns of evolution. "These methods should not be used to infer horizontal gene transfer," he argues, noting that the same results could be obtained through gene losses, variation in evolutionary rates, plasmid contamination, limitations inherent in BLAST analysis, or the use of incomplete genome data. In samples as large as entire genomes, unusual patterns are expected to occur by chance alone. "But we have no idea how frequently," he says. Genomic researchers need to design computer simulations of model systems to determine the probability of horizontal gene transfer, then devise unequivocal methods for detecting such transfers. "I hope people in the future think much more carefully about inferring horizontal gene transfer," he says.
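
The point about chance patterns in genome-sized samples can be illustrated with a toy simulation. The sketch below is not Eisen's method; it is a minimal null model, assuming every base in every gene is drawn independently with the same GC probability, so that no gene was transferred from anywhere. The function names, gene counts, and the two-standard-deviation cutoff are all hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_null_genome(n_genes, gene_length, gc_prob, rng):
    """Return the GC fraction of each gene under a no-transfer null model.

    Every base is drawn independently with the same GC probability, so any
    gene that looks compositionally odd is odd by chance alone.
    """
    fractions = []
    for _ in range(n_genes):
        gc_count = sum(rng.random() < gc_prob for _ in range(gene_length))
        fractions.append(gc_count / gene_length)
    return fractions


def count_outliers(values, z_cutoff):
    """Count values lying more than z_cutoff standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(abs(v - mean) > z_cutoff * sd for v in values)


# Hypothetical genome: 2,000 genes of 300 bases each, 50% GC overall.
rng = random.Random(0)
gc_fractions = simulate_null_genome(2000, 300, 0.5, rng)
flagged = count_outliers(gc_fractions, z_cutoff=2.0)
print(f"{flagged} of {len(gc_fractions)} genes look atypical with no transfer at all")
```

Even in this idealized genome, a simple "atypical composition" screen flags a nonzero number of genes, which is the kind of baseline a real detection method would need to account for.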

Another challenging hurdle for those interpreting genomic information is to relate DNA sequence data to protein function (also, see box, p. 272). For example, Barbara Methe of TIGR is studying psychrophilic (cold-loving) Colwellia sp. 348, which was collected in the ocean off Greenland. Although much of the earth is colonized by such psychrophiles, which grow below 20°C, little is known about how their enzymes function at near-freezing temperatures. Psychrophiles hold potential applications for the manufacturing of detergents, food processing, and drug manufacturing.

Theoretically, proteins from such cold-loving organisms should be flexible and contain amino acids that interfere with hydrogen bonds and other interactions that tend to stiffen proteins, according to Methe. Hence, she looked at amino acid changes in sequences of 70 proteins that take part either in metabolic or in repair and replication processes in Colwellia, then compared them to sequences for proteins from two close relatives, Vibrio cholerae and Shewanella oneidensis, which grow at higher temperatures.

In Colwellia, levels of the amino acids arginine, proline, and alanine are lower, whereas lysine, asparagine, and serine are higher, relative to Vibrio and Shewanella, according to Methe. "These amino acid changes make sense, based on what we know about cold adaptation," she says. For instance, both lysine and arginine are positively charged, but arginine stabilizes proteins more by forming multivalent hydrogen bonds. Since cold-adapted enzymes need flexibility, "it makes sense that there's more lysine than arginine," says Methe. Similarly, the amino acid asparagine is easily disrupted at high temperatures. Thus, it seems logical that asparagine is abundant in Colwellia, yet is relatively scarce in thermophiles. From a teleological viewpoint, the amino acid pattern "fits with what we know about what cold-loving enzymes need," she says.
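
A comparison of this kind comes down to tallying residue frequencies in two sets of proteins and looking at the differences. The following is a minimal sketch of that bookkeeping, not Methe's actual pipeline; the function names are hypothetical and the sequences are made-up toy strings, not real Colwellia, Vibrio, or Shewanella proteins.

```python
from collections import Counter


def amino_acid_fractions(proteins):
    """Return the fraction of each residue across a list of protein sequences."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def composition_shift(cold, warm, residues="RPAKNS"):
    """Difference in residue fraction (cold minus warm) for selected residues.

    Default residues are one-letter codes for arginine, proline, alanine,
    lysine, asparagine, and serine, the six named in the text.
    """
    cold_f = amino_acid_fractions(cold)
    warm_f = amino_acid_fractions(warm)
    return {aa: cold_f.get(aa, 0.0) - warm_f.get(aa, 0.0) for aa in residues}


# Toy sequences invented for illustration only.
cold_proteins = ["MKKNSKLNSK", "MSNKKSALNK"]
warm_proteins = ["MRRPARLAAR", "MAPRRAALPR"]

shift = composition_shift(cold_proteins, warm_proteins)
for aa, delta in shift.items():
    print(f"{aa}: {delta:+.2f}")
```

A positive value means the residue is more common in the cold-adapted set; with real data one would also want a statistical test before reading meaning into small differences.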

Sleuths Use Genomics To Investigate Origins of Anthrax Bioweapons

The incidents last year involving letters laced with B. anthracis spores that led to a series of deadly anthrax cases delivered a message that ancient scourges can be transformed into modern weapons. To find out who might have prepared and delivered these deadly materials, some researchers are adapting sensitive genomics-based analyses to forensic and epidemiologic investigations (see box, p. 274).

"To do epidemiology, you need to identify regions of the genome that mutate much more quickly than the standard rate to distinguish between strains," says Paul Keim of Northern Arizona University in Flagstaff, who is studying Yersinia pestis, which causes bubonic plague, as well as B. anthracis. "So we glommed onto a method used in eukaryotic genomes," he says. That method focuses on DNA segments that contain variable numbers of tandem repeats (VNTR). VNTRs within pathogens tend to be "more subtle" than those found within eukaryotes, according to Keim.

Measuring mutations with PCR primers based on VNTR, Keim and his colleagues pass the pathogens through 100,000 generations in about two months, during which period Y. pestis mutates about 10-fold faster than B. anthracis. When this method was applied to bubonic plague specimens from 220 natural human cases in Madagascar, 201 genotypes with distinctive geographic clustering were identified. When this same method was applied to 270 specimens from 270 anthrax cases in U.S. cattle, Keim and his collaborators found 63 distinctive strains whose patterns and sites of origin correlate with old cattle trails in the western United States. "Anthrax appears to have established itself and spread via cattle trails," he says.
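
The idea behind VNTR typing is that isolates differ in how many copies of a short motif sit back to back at a given locus, so a genotype can be written as a list of copy numbers. The sketch below shows that idea on plain sequence strings. It is an illustration under stated assumptions, not Keim's laboratory protocol: in practice copy number is typically inferred from the size of a PCR product rather than by reading the sequence, and the locus names, motifs, and sequences here are invented.

```python
def max_tandem_copies(sequence, motif):
    """Return the longest run of back-to-back copies of motif in sequence."""
    best = 0
    step = len(motif)
    for start in range(len(sequence) - step + 1):
        copies = 0
        pos = start
        while sequence[pos:pos + step] == motif:
            copies += 1
            pos += step
        best = max(best, copies)
    return best


def vntr_genotype(loci, motifs):
    """Map each locus name to its tandem-repeat copy number."""
    return {name: max_tandem_copies(seq, motifs[name]) for name, seq in loci.items()}


# Hypothetical loci and motifs for two made-up isolates.
motifs = {"locusA": "CAT", "locusB": "GA"}
isolate_1 = {"locusA": "GGCATCATCATTT", "locusB": "TTGAGAGAGACC"}
isolate_2 = {"locusA": "GGCATCATCATCATTT", "locusB": "TTGAGAGAGACC"}

genotype_1 = vntr_genotype(isolate_1, motifs)
genotype_2 = vntr_genotype(isolate_2, motifs)
differing = [name for name in motifs if genotype_1[name] != genotype_2[name]]
print(genotype_1, genotype_2, "differ at:", differing)
```

Two isolates that share copy numbers at every locus are indistinguishable by this measure, while a difference at even one fast-mutating locus separates them.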

The Ames strain of B. anthracis, which is the one that was used in the bioterrorist mailings late in 2001, is considered rare in nature. It was isolated from a cow in Texas in 1980 and later distributed to researchers worldwide for study. The VNTR method offers a sensitive approach for differentiating among slightly differing Ames strains and, thus, is a potential means for tracing the source of the tainted letters. "If you look deeply into the genome, you can find regions that mutate faster and use them to solve particular problems," Keim says. This approach to using VNTR to study Y. pestis and B. anthracis could also be applied to other pathogens.

Genomic information can solve lingering mysteries in clinical medicine, too. Group A streptococcus (GAS) infections cause sore throats that cost $2 billion yearly to treat in the U.S., and are also the leading cause of pediatric rheumatic fever globally. The M1 serotype of GAS is the most common cause of disease. James Musser and his colleagues at the National Institute of Allergy and Infectious Diseases Rocky Mountain Laboratories in Hamilton, Mont., isolated and sequenced a close relative, M18, that produces virulence factors, such as scarlet fever toxins, and is associated with episodes of rheumatic fever.

Rheumatic fever occurs at a low frequency in the United States, with occasional epidemics such as the ones centered in Salt Lake City during the early 1980s and again in 1997. Musser analyzed samples collected during those epidemics and used DNA microarrays to determine if the epidemics were related. "The recurrence, or recycling, of the same M18 clone caused both epidemics," he says. Identifying M18 as the sole suspect "permits us to focus our efforts on the long-term search for new therapies."

Sometimes in Genomics, Less Is More

Published reports describing efforts to sequence a particular organism's genome typically list many authors from several laboratories. For example, the list of authors for a 1997 report in Nature that details the sequencing of the 4.2-megabase genome of Bacillus subtilis recognizes 151 researchers from 46 laboratories. In contrast, a more recently published report in the Proceedings of the National Academy of Sciences (99:984-989, January 22, 2002) describing the genomic sequence of the hyperthermophilic archaeon Pyrobaculum aerophilum lists only six scientists. "This has to be the smallest number of authors ever on a sequencing paper," says senior author Jeffrey H. Miller of the Molecular Biology Institute at the University of California, Los Angeles (UCLA).

The task of deciphering the 2.2-megabase genome of P. aerophilum, an extremophile harvested from a boiling marine water hole in Italy, was spearheaded by Miller's graduate student Sorel Fitz-Gibbon. Although these efforts involved collaborators in the laboratories of Karl Stetter of the University of Regensburg, Germany, and Melvin Simon at the California Institute of Technology in Pasadena, "I was the only person working full time on the project," Fitz-Gibbon says. She had access to an automated DNA sequencer on a part-time basis, and completed the sequencing and annotation within five years.

Miller and Stetter chose to analyze the genomic sequence of this microorganism, which grows at 104°C, because they consider it a model for studying archaeal and thermophilic microbiology. Unlike most other archaeal thermophiles, P. aerophilum tolerates oxygen, making it amenable to a range of standard laboratory manipulations.

In addition to the genomic sequence, Fitz-Gibbon, now a postdoctoral fellow at UCLA's Center for Astrobiology, Miller, and their colleagues carried out other genetic studies to add perspective to the sequence data. "This is one of the first really well-documented examples of an organism living as a mutator as a permanent life style," Miller says, explaining that this microbe has a higher mutation rate than normal, probably because it lacks a repair system to correct DNA replication errors. The organism thrives at temperatures that denature many proteins, raising many questions about how its enzymes and structural proteins function, and how it avoids lethal damage to its DNA.

For biologists, completing a genomic sequence is really the beginning of broader efforts to decipher the information that is encoded in that DNA sequence, not the end of a project. "There are a million things left to do," Fitz-Gibbon says. Archaea are an understudied branch of life, and studying them could explain phylogenetic relationships among all life forms. "We're just getting started," she adds.

Last Modified: June 18, 2002
Copyright © 2002 American Society for Microbiology. All rights reserved.