Speaker
Description
Data collection, management, and reuse are increasingly important in the life sciences. In microbial ecology, sequencing has altered our relationship to the invisible microbial world and created massive amounts of reusable microbiome data, but these data are hard to integrate and remain underused. To foster sequence data reuse, we created the Microbial Community Database (MiCoDa), an open database of 16S rRNA amplicon sequencing data. Through manual curation and dedicated bioinformatics processing, MiCoDa unifies over 35,000 microbiome samples under consistent species definitions, allowing users to examine microbiomes across studies easily, without advanced computational resources or bioinformatics know-how. The MiCoDa web interface allows users to explore, filter, and download ready-to-use species tables, and it fosters interoperability among INSDC databases, publication DOIs, and, soon, the Global Biodiversity Information Facility (GBIF). MiCoDa data were collected from sequence archives as well as through community-oriented data collection and reuse events organized yearly across Latin America, Africa, and, soon, Asia. Currently, we are validating AI-driven methods for dataset identification and metadata extraction and working towards an automated update protocol that can keep pace with the accelerating rate of nucleotide data production. Fostering microbiome data reuse globally may not only advance science but also serve to invert knowledge flows in microbial ecology.
Status Group | Senior Scientist |
---|---|
Poster Presentation Option | Yes, I’m willing to present as a poster. |