Despite advances in sequencing technologies and computational methods over the past decade, researchers have uncovered genomes for only a small fraction of Earth’s microbial diversity. Because most microbes cannot be cultivated in the laboratory, their genomes cannot be sequenced using traditional approaches. Identifying and characterizing the planet’s microbial diversity is key to understanding the roles of microorganisms in regulating nutrient cycles, as well as gaining insight into potential applications they may have in a wide range of research areas.
A public repository of 52,515 microbial draft genomes generated from environmental samples around the world, expanding the known diversity of bacteria and archaea by 44%, is now available and was described November 9, 2020, in Nature Biotechnology. Known as the GEM (Genomes from Earth’s Microbiomes) catalog, this work results from a collaboration involving more than 200 scientists, including researchers at the US Department of Energy (DOE) Joint Genome Institute (JGI), a DOE Office of Science User Facility located at Lawrence Berkeley National Laboratory, and the DOE Systems Biology Knowledgebase (KBase).
Metagenomics is the study of the microbial communities in environmental samples without the need to isolate individual organisms, using various methods for processing, sequencing, and analysis. “Using a technique called metagenome binning, we were able to reconstruct thousands of metagenome-assembled genomes (MAGs) directly from sequenced environmental samples without needing to cultivate the microbes in the laboratory,” said Stephen Nayfach, lead author of the study. “What really sets this study apart from previous efforts is the remarkable environmental diversity of the samples we analyzed.”
Emiley Eloe-Fadrosh, head of the JGI Metagenome Program and senior author of the study, elaborated on Nayfach’s comments. “This study was designed to cover the widest range of samples and environments, including natural and agricultural soils, human and animal hosts, and ocean and other aquatic environments. That’s quite remarkable.”
Adding value beyond the genome sequences
Most of the data had been generated from environmental samples sequenced by the JGI through the Community Science Program and was already available on the JGI’s Integrated Microbial Genomes & Microbiomes (IMG/M) platform. Eloe-Fadrosh noted that it was a good example of “big data” mining to gain a deeper understanding of the data, and of increasing its value by making the data publicly available.
To acknowledge the efforts of the investigators who had done the sampling, Eloe-Fadrosh contacted more than 200 researchers around the world in accordance with the JGI data usage policy. “I felt it was important to acknowledge the significant efforts to collect and extract DNA from these samples, many of which come from unique, hard-to-reach environments, and invited these researchers to be co-authors as part of the IMG data consortium,” she said.
Using this data set, Nayfach clustered the MAGs into 18,000 candidate species groups, 70% of which were novel compared with the more than 500,000 existing genomes available at the time. “Looking across the tree of life, it’s striking how many uncultivated lineages are represented only by MAGs,” he said. “Even though these draft genomes are imperfect, they can still reveal a lot about the biology and diversity of uncultured microbes.”
Research teams worked on a number of analyses using the genome repository, and the IMG/M team developed several updates and features to mine the GEM catalog. (See this IMG webinar on Metagenome Bins to learn more.) One group mined the data set for novel secondary metabolite biosynthetic gene clusters (BGCs), increasing the number of these BGCs in IMG/ABC (Atlas of Biosynthetic Gene Clusters) by 31%. (Listen to this episode of JGI’s Natural Prodcast on genome mining.) Nayfach also worked with another team on predicting host-virus links between all viruses in IMG/VR (Virus) and the GEM catalog, associating 81,000 viruses, 70% of which had not previously been associated with a host, with 23,000 MAGs.
Modeling a new path for metagenomic researchers
Building on these resources, KBase, a multi-institutional collaborative knowledge creation and discovery environment designed for biologists and bioinformaticians, developed metabolic models for thousands of MAGs. The models are now available in a public Narrative, which provides shareable, reproducible workflows. “Metabolic modeling is a routine analysis for isolate genomes, but it has not been done at scale for uncultivated microbes,” said Eloe-Fadrosh.
“Just bringing this data set into KBase is of immediate value, as people can find the high-quality MAGs and use them to inform future analyses,” said José P. Faria, a computational biologist at Argonne National Laboratory. “The process of building a metabolic model is simple. You simply select a genome or MAG and click a button to build a model from our database of mappings between biochemical reactions and annotations. We look at what was annotated in the genome and at the resulting model to assess the organism’s metabolic capabilities.” (See this KBase webinar on metabolic modeling.)
Elisha Wood-Charlson, KBase User Engagement lead, added that by demonstrating the ease of generating metabolic models from the GEM data set, metagenomics researchers might consider branching out into this area. “Most metagenomics researchers may not want to dive into a whole new field of research [metabolic modeling], but they may be interested in how biochemistry affects what they work on. The genomics community can now explore metabolism using KBase’s easy path from genomes or MAGs to modeling, which may not have been considered,” she said.
A community-driven research resource
Kostas Konstantinidis of the Georgia Institute of Technology is one of the co-authors whose data are part of the catalog. “I do not think there are many institutions that can do this kind of large-scale metagenomics and that have the capacity for large-scale analyses,” he said. “The beauty of this study is that it was done on a scale that individual laboratories cannot match. It gives us new insights into microbial diversity and function.”
He is already finding ways to use the catalog in his own research on how microbes respond to climate change. “With this data set, I can see where each microbe is found and how abundant it is. It’s very useful for my work and for others who do similar research.” In addition, he is interested in expanding the diversity of the reference database he is developing, called the Microbial Genomes Atlas, by adding the MAGs to allow for more robust analyses.
“This is a great resource for the community,” Konstantinidis added. “It’s a data set that will facilitate many more studies in the future. And I hope that JGI and other institutions will continue to carry out such projects.”
Reference: Stephen Nayfach, Simon Roux, Rekha Seshadri, Daniel Udwary, Neha Varghese, Frederik Schulz, Dongying Wu, David Paez-Espino, I-Min Chen, Marcel Huntemann, Krishna Palaniappan, Torben Nielsen, Edward Kirton, José P. Faria, Janaka N. Edirisinghe, Christopher S. Henry, Sean P. Jungbluth, Dylan Chivian, Paramvir Dehal, Elisha M. Wood-Charlson, Adam P. Arkin, Susannah G. Tringe, Axel Visel, IMG/M Data Consortium, Tanja Woyke, Nigel J. Mouncey, Natalia N. Ivanova, Nikos C. Kyrpides and Emiley A. Eloe-Fadrosh, 9 November 2020, Nature Biotechnology.
DOI: 10.1038/s41587-020-0718-6
The work also used resources of the National Energy Research Scientific Computing Center (NERSC), another DOE Office of Science user facility in Berkeley.