European Ocean Biodiversity Information System

Data standardisation

Compiling data from different sources, collected under different circumstances and for various purposes, requires a minimum of standardisation and quality control before sound and useful integration becomes possible.

EurOBIS continuously strives to keep up with internationally accepted guidance and developments in standards and quality control procedures. Since 2017 it has also been expanding the implementation of the FAIR principles in its database and data systems, allowing for more interoperability between different repositories.

The sections below summarise how EurOBIS data and metadata comply with FAIR principles.

Metadata: IMIS

All EurOBIS datasets are described in the Integrated Marine Information System (IMIS), developed and hosted by the Flanders Marine Institute (VLIZ). IMIS not only stores dataset metadata, but also captures and links information on persons, institutes, projects and publications, so that the information in these different modules is interlinked. When looking at a dataset's metadata, one can click on the listed person or institute to get to the contact information. When a dataset has been collected in the framework of a project, a link to the project is made so that more details can be retrieved. And when a dataset has led to one or more publications, these are also listed on the metadata page and, if available, a copy can be requested from the VLIZ library.

IMIS metadata adheres to international standards, such as ISO 19115 - the international standard for geographic information and services. It is also compliant with the European Directory of Marine Environmental Data (EDMED) developed under SeaDataNet. Both ISO and EDMED present standardised lists of items required to provide a comprehensive dataset description.

In addition, IMIS uses the thesaurus of the Aquatic Sciences and Fisheries Abstracts (in short: ASFA thesaurus) to assign searchable keywords to datasets.

Data: OBIS-ENV Schema

The OBIS-ENV Schema is a content specification designed to capture data about the geographical occurrences of species, e.g. the collection or observation of a particular species or other taxonomic group at a particular location. It is the biogeographic data standard used by OBIS: an extension of Darwin Core Version 2 designed for marine biodiversity data. It can also be applied to museum specimen data, which can contribute to EurOBIS.

Following this standard allows for seamless data exchange between online databases. The full OBIS Schema can be consulted via the data formats page.

The OBIS-ENV Schema used by EurOBIS also includes standardisation at the level of taxa, geography, and measurements and facts. This not only standardises the datasets themselves but also improves interoperability with other types of data. Below is a summary of the different thesauri used.

Taxonomy: World Register of Marine Species (WoRMS)

EurOBIS uses the World Register of Marine Species (WoRMS) as an authoritative taxonomic list of species occurring worldwide in the marine environment.

All taxon names are matched with WoRMS, where each name carries a unique, persistent LSID, to trace and rule out spelling variations and resolve frequently used synonyms. This way, all taxon names are linked to the currently accepted name, which avoids, for example, counting the same species several times under different names in diversity calculations.
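The effect of resolving names to the accepted name can be illustrated with a minimal sketch. The lookup table, the taxon names and the numeric IDs below are hypothetical placeholders, not WoRMS records; only the LSID pattern `urn:lsid:marinespecies.org:taxname:<AphiaID>` follows the form used by WoRMS.

```python
# Hypothetical lookup: reported name -> LSID of the currently accepted name.
# Names and numeric IDs are placeholders, not real WoRMS records.
ACCEPTED_LSID = {
    "Examplea alba": "urn:lsid:marinespecies.org:taxname:100001",
    "Examplea albus": "urn:lsid:marinespecies.org:taxname:100001",  # spelling variant
    "Oldgenus alba": "urn:lsid:marinespecies.org:taxname:100001",   # synonym
    "Examplea nigra": "urn:lsid:marinespecies.org:taxname:100002",
}

LSID_PREFIX = "urn:lsid:marinespecies.org:taxname:"


def aphia_id(lsid: str) -> int:
    """Extract the numeric identifier from a WoRMS-style taxon LSID."""
    if not lsid.startswith(LSID_PREFIX):
        raise ValueError(f"not a WoRMS taxon LSID: {lsid!r}")
    return int(lsid[len(LSID_PREFIX):])


def species_richness(reported_names):
    """Count distinct accepted taxa among the reported names."""
    return len({aphia_id(ACCEPTED_LSID[name]) for name in reported_names})


reported = ["Examplea alba", "Examplea albus", "Oldgenus alba", "Examplea nigra"]
naive_count = len(set(reported))            # every spelling counted separately
resolved_count = species_richness(reported)  # variants collapsed onto accepted names
print(naive_count, resolved_count)
```

Counting raw name strings treats each spelling variant and synonym as a separate taxon, whereas counting by the identifier of the accepted name does not.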

If a taxon does not have a match in WoRMS, it is matched against other authoritative taxonomic databases, such as the Interim Register of Marine and Non-marine Genera (IRMNG), the Integrated Taxonomic Information System (ITIS) or the Catalogue of Life (CoL). If a match is found and the taxon is considered marine or brackish, the WoRMS taxonomic expert is contacted to add the taxon to the Register. If the taxon is not marine, it is added to an annotated list.

When a taxon cannot be matched to any of the aforementioned authoritative taxonomic databases, it is sent back to the data provider for a second check. If the data provider can supply a source containing the given taxon name, this information is sent to the taxonomic experts, who can then decide whether the taxon can be added to the World Register.

The annotated taxon list helps EurOBIS in its taxonomic quality control and prevents the same taxon names from being sent to taxonomic editors twice for clarification.
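The decision steps described above can be summarised as a small triage function. This is only a sketch of the described workflow, not the actual EurOBIS implementation: the function name, the `Outcome` labels and the in-memory lookups standing in for WoRMS and the other registers are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    MATCHED_WORMS = "matched in WoRMS"
    REQUEST_ADDITION = "WoRMS taxonomic expert asked to add the taxon"
    ANNOTATED_NON_MARINE = "non-marine, added to the annotated list"
    RETURN_TO_PROVIDER = "sent back to the data provider"


@dataclass(frozen=True)
class ExternalMatch:
    source: str                   # e.g. "IRMNG", "ITIS" or "CoL"
    is_marine_or_brackish: bool


def triage_taxon(name, worms_names, external_matches):
    """Decide what happens to one taxon name.

    worms_names      -- set of names found in WoRMS (hypothetical local copy)
    external_matches -- dict: name -> ExternalMatch from another register
    """
    if name in worms_names:
        return Outcome.MATCHED_WORMS
    match = external_matches.get(name)
    if match is None:
        # No authoritative register knows the name: ask the provider to check.
        return Outcome.RETURN_TO_PROVIDER
    if match.is_marine_or_brackish:
        return Outcome.REQUEST_ADDITION
    return Outcome.ANNOTATED_NON_MARINE


# Placeholder names, used only to exercise each branch.
worms = {"Taxon one"}
external = {
    "Taxon two": ExternalMatch("IRMNG", is_marine_or_brackish=True),
    "Taxon three": ExternalMatch("ITIS", is_marine_or_brackish=False),
}
for taxon in ["Taxon one", "Taxon two", "Taxon three", "Taxon four"]:
    print(taxon, "->", triage_taxon(taxon, worms, external).value)
```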

Geography: Marine Regions

Geographic regions are standardised using the Marine Regions gazetteer, a standard, relational list of geographic names that improves access to, and the clarity of, marine geographic names such as seas, sandbanks, ridges and bays.

Each region is identified by a Marine Regions Geographic IDentifier (MRGID), a unique, persistent and resolvable URI, which standardises the region where the data were collected.
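As a small illustration, the sketch below builds and validates MRGID URIs. It assumes the URI form `http://marineregions.org/mrgid/<number>`; the helper names are hypothetical, and the number used is arbitrary rather than a reference to a particular region.

```python
import re

MRGID_BASE = "http://marineregions.org/mrgid/"
# Accept http or https and an optional "www." so minor variants still parse.
_MRGID_PATTERN = re.compile(r"^https?://(?:www\.)?marineregions\.org/mrgid/(\d+)$")


def mrgid_uri(mrgid: int) -> str:
    """Build the URI for a numeric MRGID."""
    if mrgid <= 0:
        raise ValueError("MRGID must be a positive integer")
    return f"{MRGID_BASE}{mrgid}"


def parse_mrgid(uri: str) -> int:
    """Return the numeric MRGID from a URI, or raise ValueError."""
    match = _MRGID_PATTERN.match(uri.strip())
    if match is None:
        raise ValueError(f"not an MRGID URI: {uri!r}")
    return int(match.group(1))


uri = mrgid_uri(1234)  # arbitrary number, for illustration only
print(uri, parse_mrgid(uri))
```

Storing the URI rather than a free-text place name means two datasets that spell a region differently still point to the same gazetteer entry.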

Vocabularies: BODC NVS

The expansion of the OBIS Schema to OBIS-ENV was accomplished by the use of two new core types, Event and extended Measurement or Fact (eMoF). The Event core allows for the association of measurements with nested events, whereas the eMoF allows for the association of the occurrence records with other biotic or abiotic measurements and facts. The terms in the eMoF core can be populated with free-text annotation and with controlled vocabularies. For the latter, EurOBIS, following OBIS' guidelines, has adopted the British Oceanographic Data Centre (BODC) NERC Vocabulary Server (NVS).
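The following sketch shows one plausible way nested events and eMoF rows fit together. The field names (`eventID`, `parentEventID`, `occurrenceID`, `measurementType`, `measurementTypeID`, `measurementValue`) are Darwin Core terms; the records, the helper functions and the vocabulary code `XXXXXXXX` are hypothetical placeholders, and treating a measurement on a parent event as applying to its child events is an assumption of this example.

```python
NVS_BASE = "http://vocab.nerc.ac.uk/collection/"

# Hypothetical Event records: a cruise contains a station, which contains a sample.
events = {
    "cruise1": {"parentEventID": None},
    "cruise1:station1": {"parentEventID": "cruise1"},
    "cruise1:station1:sample1": {"parentEventID": "cruise1:station1"},
}

# Hypothetical eMoF rows. "XXXXXXXX" is a placeholder, not a real vocabulary code.
emof = [
    {   # abiotic measurement on the station, typed with a controlled vocabulary
        "eventID": "cruise1:station1", "occurrenceID": None,
        "measurementType": "water temperature",
        "measurementTypeID": NVS_BASE + "P01/current/XXXXXXXX/",
        "measurementValue": "12.5",
    },
    {   # biotic measurement on one occurrence, free text only
        "eventID": "cruise1:station1:sample1", "occurrenceID": "occ1",
        "measurementType": "body length",
        "measurementTypeID": None,
        "measurementValue": "3.2",
    },
]


def event_lineage(event_id, events):
    """Return the event followed by its ancestors, innermost first."""
    lineage = []
    current = event_id
    while current is not None:
        lineage.append(current)
        current = events[current]["parentEventID"]
    return lineage


def measurements_for_event(event_id, events, emof):
    """eMoF rows recorded on the event itself or on any of its ancestors."""
    lineage = set(event_lineage(event_id, events))
    return [row for row in emof if row["eventID"] in lineage]


def uses_nvs(row):
    """True if the measurement type is identified by an NVS URI."""
    type_id = row.get("measurementTypeID")
    return bool(type_id) and type_id.startswith(NVS_BASE)


rows = measurements_for_event("cruise1:station1:sample1", events, emof)
print([(r["measurementType"], uses_nvs(r)) for r in rows])
```

Rows whose `measurementTypeID` points to a controlled vocabulary can be compared across datasets by identifier, while free-text rows rely on the wording of `measurementType` alone.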