The goal of this project is to improve the accessibility of marine biodiversity data by more than doubling the number of digitally accessible marine invertebrate records (excluding molluscs) from non-federal collections. Specifically, our consortium will:
Database and make available online 835K specimen lots, representing 7.5 million specimens.
Mobilize an additional 210K specimen lots that are databased but not yet available online.
Georeference 175K existing and newly digitized station records.
Improve nomenclature in collection databases and synchronize them with WoRMS.
Mobilize or create 464K images of specimens and type specimens.
Improve links between collection database records and sequence data in GenBank and BOLD.
Make extended specimen data broadly accessible online through diverse aggregators.
Unite the marine collections community to develop diverse resources on marine biodiversity based on these data.
Establish and propagate best practices for field-to-digitization workflows.
Collaborate with K-16 educators to develop open and modifiable lessons that build from and add to our digital products.
Develop tools and resources for undergraduate student training, public engagement, and online participation via Zooniverse and other participatory projects.
Why marine invertebrates?
The marine biome is the largest yet least well-known biome on Earth. About 75% of described marine species are invertebrates (Appeltans et al., 2012; WoRMS, 2019). Marine invertebrates dominate multicellular life and are fundamentally important to the ecology of the ocean and, hence, Earth. Yet we cannot even place a robust estimate on the number of marine animal species, although most agree that it is greater than 1 million (Bouchet, 2006; Appeltans et al., 2012).
There is no compilation of the US marine fauna; we do not know how many species live in our coastal waters. Much of what is known was gathered for oceanography and fisheries and is focused on selected taxa and areas. This bias is reflected online. OBIS (the Ocean Biodiversity Information System), the primary aggregator of marine occurrence records, is based mostly on observational rather than specimen data. Its 57 million records of 125,000 species cover only half of known marine species, and 98% of the records pertain to the most common 20% of these species, mostly vertebrates and plankton.
About 30 North American collections hold 4.3 million specimen lots of non-molluscan marine invertebrates, yet fewer than 2 million are in iDigBio (1.5% of records). Marine biodiversity has a long way to go in the age of digital big data. No digitization network has yet focused on living marine organisms. This project will change that with a massive effort to digitize data on 835K lots, representing 7.5 million specimens, from 19 collections, more than doubling what is available online from non-federal collections. Molluscs are excluded because they are targeted by another proposal, encompass twice as many lots as other marine invertebrates, and are held in predominantly dry shell collections that use different curatorial workflows.
Our core objective is to assemble a robust, extended, vouchered resource on marine biodiversity to power research and education in systematics, ecology, oceanography, and other disciplines (Ball-Damerow et al., 2019; Nelson & Ellis, 2018). We will digitize broadly across taxa and geography because a comprehensive approach will: 1) provide foundational data across these dimensions; 2) allow more efficient workflows; 3) be achievable under a single, if large, Thematic Collections Network (TCN); and 4) contribute to the democratization of biodiversity data and to the diversity of people with access to them (Drew et al., 2017). These data will allow researchers to address diverse problems concerning global patterns of diversity and distribution, climate-driven range shifts, reconstruction of historical community structure, and biodiversity synthesis.