The New Zealand Biodata Services Stack (BSS) project aims to develop a New Zealand "network for connecting and mobilising primary biodiversity data".
We realise that a lot of work has been done in this area around the globe. We note, however, that most of these initiatives lack an emphasis on standards development and have resulted in separate solutions (for example, service-specific APIs). As a result, data providers and users face a multitude of separate, non-integrated services for biodata management. The BSS project therefore focuses on developing a national standard for mobilising biodata which
- is based on existing standards and re-uses existing tools,
- is based on a comprehensive use case analysis of New Zealand stakeholders,
- allows unambiguous, automatic machine-to-machine communication.
The purpose of this ‘strawman’ document is to present our current thinking (based on feedback received during BSS workshops) on the potential structure and format of a BSS Standard for biodata exchange, and to invite feedback.
- Each data source / set is registered in a national registry and publishes data according to the BSS Standard.
This includes a set of standard metadata describing the data source including the provider agencies / licences / ownership (metadata standard).
This includes a set of standard fields describing the individual observations / records per data source.
That means that authoritative Data Services are discoverable by everybody (‘User knows what is out there’) through a single registry.
The purpose of the registry is to:
- Enable discovery (of what is available)
- Assure the consumer that a set of well-defined standard ('BSS') data access formats is available
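The registry idea described above can be sketched as a small in-memory structure. This is only an illustration under stated assumptions: the class names, the metadata fields (`title`, `provider`, `licence`), the format labels and the `example.org` URLs are all hypothetical, and the real registry mechanism is not yet defined in this strawman.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RegistryEntry:
    """One registered data source; all field names here are hypothetical."""
    title: str
    provider: str
    licence: str
    # Maps a format label (e.g. "GBIF", "OGC") to a service end point URL.
    endpoints: dict[str, str] = field(default_factory=dict)


class Registry:
    """In-memory stand-in for the national registry."""

    def __init__(self) -> None:
        self._entries: list[RegistryEntry] = []

    def register(self, entry: RegistryEntry) -> None:
        self._entries.append(entry)

    def discover(self, access_format: Optional[str] = None) -> list[RegistryEntry]:
        """Return all entries, or only those offering the given access format."""
        if access_format is None:
            return list(self._entries)
        return [e for e in self._entries if access_format in e.endpoints]


# Made-up example entries, for illustration only.
registry = Registry()
registry.register(RegistryEntry(
    title="Example bird counts",
    provider="Example Agency",
    licence="CC BY",
    endpoints={"GBIF": "https://example.org/ipt", "OGC": "https://example.org/wfs"},
))
registry.register(RegistryEntry(
    title="Example plant plots",
    provider="Another Agency",
    licence="CC BY",
    endpoints={"OGC": "https://example.org/plots/wfs"},
))

print([e.title for e in registry.discover("GBIF")])
```

A consumer would call `discover()` to see everything that is registered, or pass a format label to find only the sources it can read with its own tools.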
- BSS supports GBIF-compliant data provision.
That means a BSS data source can be published in a GBIF-compliant format and is registered with (passed through to) the GBIF registry, by a mechanism still to be defined.
The purpose of this format is to enable:
- publishing in the GBIF portal
- use of GBIF tools
- integration with other data: collection and genetic data
- BSS supports OGC-compliant data publication.
That means data can be consumed by standard OGC-compliant clients (e.g. ArcMap, QGIS, ...); filtering is possible.
The purpose of this format is to enable:
- easy geospatial analyses
- compatibility with existing tools (in particular GIS)
- integration with other data types published through OGC/OM (e.g. water, climate, oceans, possibly sensor data)
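To illustrate what "filtering is possible" could look like for an OGC client, the sketch below builds a bounding-box request. It assumes the OGC service is a Web Feature Service (WFS) 2.0.0 end point, which this strawman does not prescribe. The end point URL and the layer name `bss:observations` are hypothetical.

```python
from urllib.parse import urlencode


def build_getfeature_url(endpoint, type_name, bbox, count=100,
                         crs="urn:ogc:def:crs:EPSG::4326"):
    """Build a WFS 2.0.0 GetFeature request limited to a bounding box.

    bbox holds four coordinates in the axis order required by `crs`.
    For the default CRS identifier that order is latitude, longitude.
    """
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,   # hypothetical layer name supplied by the caller
        "count": count,           # upper limit on returned features
        "bbox": ",".join(str(v) for v in bbox) + "," + crs,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical end point and layer; the box roughly covers the Wellington area.
url = build_getfeature_url(
    "https://example.org/wfs",
    "bss:observations",
    (-41.4, 174.7, -41.2, 174.9),
)
print(url)
```

Because the request uses only standard WFS parameters, the same URL pattern should work against any compliant server, which is the compatibility argument made above.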
- The goal of BSS is to ensure that data provided through BSS by different providers can be merged consistently.
That means data can be automatically accessed and integrated from multiple providers.
Note: This will apply only to core fields; providers remain free to add service-specific information.
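A minimal sketch of such a merge follows. It assumes that the core fields are a fixed set of names shared by all providers. The four names used here are Darwin Core terms chosen for illustration, not the agreed BSS core set, and the sample records are made up.

```python
# Assumed core field set (Darwin Core terms); the real BSS list may differ.
CORE_FIELDS = ("scientificName", "eventDate", "decimalLatitude", "decimalLongitude")


def merge_sources(sources):
    """Merge records from several providers into one list.

    `sources` maps a provider name to its list of record dicts. Core fields
    are copied to the top level; everything else is kept under "extra" so
    that service-specific information survives without affecting the shared
    schema.
    """
    merged = []
    for provider, records in sources.items():
        for record in records:
            missing = [f for f in CORE_FIELDS if f not in record]
            if missing:
                raise ValueError(f"{provider}: record lacks core fields {missing}")
            row = {f: record[f] for f in CORE_FIELDS}
            row["provider"] = provider
            row["extra"] = {k: v for k, v in record.items() if k not in CORE_FIELDS}
            merged.append(row)
    return merged


# Made-up sample records from two hypothetical providers.
sources = {
    "agency_a": [{"scientificName": "Apteryx mantelli", "eventDate": "2012-03-14",
                  "decimalLatitude": -38.7, "decimalLongitude": 176.1,
                  "plotId": "P-17"}],
    "agency_b": [{"scientificName": "Apteryx mantelli", "eventDate": "2012-03-15",
                  "decimalLatitude": -38.9, "decimalLongitude": 176.3}],
}
merged = merge_sources(sources)
print(len(merged), merged[0]["extra"])
```

Rejecting records that lack a core field is one possible policy; a real implementation might instead log and skip them.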
The BSS Standard consists of:
- Specification and guidelines for the standard data format and field names to be used.
- Specification and guidelines for the vocabularies and conventions to be used for content.
- Specification and guidelines for the mechanisms of data publication (incl. service registration).
BSS Standard – General Data Format and Fields
Data exchange through BSS takes place through "Data Services" (also called Data Sources or Data Sets) offered by data providers.
Each data source consists of a series of observational records. One record describes observations related to one species at one place and time (some people call a record a "sample").
For each data source / data set, a series of standardized attributes / fields shall be described / archived / provided. These fields describe core human-readable metadata for each service, such as title, creator, etc.
For each observation / record, a series of standardized attributes / fields shall be described / archived / provided. These fields describe:
- Observation location information,
- Observation temporal information,
- Observed species / taxa,
- Observed species abundance,
- (Other recorded information about the observed species), and
- Metadata related to the observation (Note: key metadata fields should be captured as data source metadata, not for the individual observations).
Independent of the data exchange mechanism and format, we propose the field name conventions in the attached strawman.
(Note: Darwin Core is used as much as possible.)
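The attached strawman defines the actual field names. Purely as an illustration of the record structure listed above, the sketch below models one observation using existing Darwin Core terms. Which terms BSS finally adopts is an assumption here, and the sample values are made up.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Observation:
    # Attribute names are Darwin Core terms; their selection for BSS is assumed.
    occurrenceID: str
    scientificName: str                       # observed species / taxon
    eventDate: str                            # temporal information, ISO 8601
    decimalLatitude: float                    # location
    decimalLongitude: float
    geodeticDatum: str = "EPSG:4326"
    individualCount: Optional[int] = None     # abundance
    occurrenceRemarks: Optional[str] = None   # other recorded information


def to_flat_record(obs):
    """Return a dict keyed by Darwin Core term, dropping unset optional fields."""
    return {k: v for k, v in asdict(obs).items() if v is not None}


# Made-up sample observation.
obs = Observation(
    occurrenceID="example-0001",
    scientificName="Apteryx mantelli",
    eventDate="2012-03-14",
    decimalLatitude=-38.7,
    decimalLongitude=176.1,
    individualCount=2,
)
print(to_flat_record(obs))
```

Keeping the record flat, with one key per term, means the same structure can be written out in whichever exchange format a provider uses.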
We plan to develop this strawman into a data model and specification for implementation.
BSS Standard – Vocabularies and Conventions
For content, we propose the data conventions and vocabularies in the attached strawman.
Key national vocabularies to be developed, supported, and used are:
- species / taxa. We propose to enable the New Zealand Organisms Register (NZOR) to fulfil that role. This might require development of new tools.
- method / protocols. A new register needs to be developed.
Other conventions include:
- Site identifiers - need to develop best practice and guidelines
- Locality / placename - best practice is the NZ placename gazetteer
- Method for coordinate establishment - need convention
- Date/time according to ISO 8601 - need guideline
- Abundance_category - need convention
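As an illustration of what a date/time guideline might require, the sketch below parses an ISO 8601 timestamp and normalises it to UTC. Requiring an explicit UTC offset is an assumption made for this example, not a decision recorded in this document.

```python
from datetime import datetime, timezone


def normalise_timestamp(text):
    """Parse an ISO 8601 timestamp with a UTC offset and return it in UTC."""
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumed policy: reject local times whose offset is unknown.
        raise ValueError("timestamp has no UTC offset: " + text)
    return parsed.astimezone(timezone.utc).isoformat()


# 09:30 New Zealand Daylight Time (UTC+13:00) is 20:30 UTC on the previous day.
print(normalise_timestamp("2012-03-14T09:30:00+13:00"))
# prints 2012-03-13T20:30:00+00:00
```

A guideline would also need to say how date-only and approximate values are handled; this sketch simply rejects anything without an offset.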
BSS Standard – Data Publishing
- Each participating organisation shall ensure its biodata management procedures enable data to be managed according to the above fields, field formats, and vocabularies.
- Each participating organisation shall establish a GBIF-compliant server for the data as described in this standard.
- Each participating organisation shall establish a server providing OGC services as described in this standard.
- Service end points are registered by the following mechanisms: TBD?
This picture describes how the BSS data publishing and consumption system works: