The NEMO Data Model and XML Representation

NEMO Data Model

NEMO (NIWA Environmental Measurements and Observations) is an aggregate data system designed to hold a wide and varied range of environmental field survey results. It brings together information from source systems such as the Freshwater and Marine Biodiversity databases, the Freshwater Fish Database, the Aquatic Plants database and the National Rivers Water Quality database, as well as smaller sources held in spreadsheets and desktop-scale databases. The NEMO system provides web service layers for input (ingestion) and output of data records and, where appropriate, supports industry-standard web service protocols such as OGC WFS. We would like to provide OGC SOS support for this system and make its data available through OM.

The NEMO data model implements a version of the Abstract Biodiversity Model given above, but for historical reasons some of the terms differ.

The NEMO data model also has some similarities with the OM abstract model, but it is designed primarily as a storage model rather than an information transfer model. The design of the system is intended to make discovery of and access to the data simple and efficient for end users.

NEMO has as its basic unit of information the sampling_event. This represents all the data gathered at a specific place and time according to one or more survey methods. Beyond a few core fields, all data associated with the observations and measurements made at the sampling event are held in user-definable attribute value tables. This makes the system fully extensible; however, the details of an attribute definition are not currently as flexible as the phenomenon or result types used in OM.

  • A group of sampling events gathered using a consistent methodology or survey type makes up a dataset.
  • A sampling_event has a location (spatial extent), a time (temporal extent) and a set of attributes that describe the observations and measurements made at that location and time.
  • A sampling_event may relate to a number of samples taken at the site. Each sample in turn has its own set of attributes.

For example, at a river bank site a survey might record aspects of the habitat and land use. The surveyor might also take samples of fish using various methods and record information about each fish (taxon, size) or about categories, such as a count of each taxon found.
Some of the information recorded might be regarded as a measurement (water flow, for example), while some might be regarded as metadata about the sampling (the fishing method, for example). NEMO does not distinguish directly between these interpretations: all information recorded, whether observations, feature descriptions or metadata, is treated as attributes.

As a survey methodology often results in a group of closely related measurements, we support the concept of attribute sets. These are groupings that might map onto a page of a data entry form or a single table in a traditionally designed relational database. They capture the idea that, for example, stream characteristics might be made up of a list of descriptive elements such as whether the water is permanent, whether it is tidal, the water level and so on.
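
As a rough illustration of the structure described above, the following Python sketch models datasets, sampling events, samples and attribute sets as plain data classes. The class and field names are our own assumptions chosen for readability, not the names used in the NEMO database schema, and the values are copied from the example XML document later on this page.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class AttributeSet:
    """A named group of related attributes, e.g. one page of a field form."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class Sample:
    """Something collected at the site, carrying its own attribute sets."""
    subject: str
    attribute_sets: List[AttributeSet] = field(default_factory=list)


@dataclass
class SamplingEvent:
    """Everything gathered at one place and time."""
    x: float
    y: float
    srid: int
    start_date: date
    end_date: Optional[date] = None
    attribute_sets: List[AttributeSet] = field(default_factory=list)
    samples: List[Sample] = field(default_factory=list)


@dataclass
class Dataset:
    """Sampling events gathered with a consistent methodology (hypothetical shape)."""
    code: str
    sampling_events: List[SamplingEvent] = field(default_factory=list)


# Measurements and metadata alike are stored as attributes inside attribute sets.
event = SamplingEvent(
    x=2693700.0,
    y=6488100.0,
    srid=27200,
    start_date=date(1997, 11, 20),
    attribute_sets=[
        AttributeSet("fse_stream", {"permanent": "t", "tidal": "f", "water_level": "l"}),
        AttributeSet("fse_method", {"collection_method": "efp"}),
    ],
    samples=[
        Sample("Fish", [AttributeSet("fsm_fauna_len", {"min_fauna_len": "100", "max_fauna_len": "200"})]),
    ],
)
dataset = Dataset(code="FBIS", sampling_events=[event])
```

Note that the fishing method and the stream characteristics sit side by side as attribute sets; nothing in the structure marks one as metadata and the other as an observation.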

NEMO Class Diagram

NEMO-WFS

NEMO already uses the OGC WFS protocol to publish information about sampling events held in the database. This gives access to a summary level of data about each sampling event - but not the underlying measurements and observations.
The WFS feed is available at http://fbis-wfs.niwa.co.nz/

The service provides two key feature sets:

  • locality - a collection of all the places where sampling events have taken place
  • samplingevents - the sampling events themselves - which may repeatedly use a locality.
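
A client would typically retrieve either feature set with a standard WFS GetFeature request. The sketch below only builds the request URLs, using the key-value-pair encoding defined for WFS 1.1.0. The protocol version, the assumption that the type names match the feature set names without a namespace prefix, and the get_feature_url helper are all assumptions for illustration, and the service may no longer respond at this address.

```python
from urllib.parse import urlencode

# Base URL taken from the text above; the parameters below are assumptions.
WFS_BASE_URL = "http://fbis-wfs.niwa.co.nz/"


def get_feature_url(type_name: str, max_features: int = 10, version: str = "1.1.0") -> str:
    """Build a WFS key-value-pair GetFeature request URL (hypothetical helper)."""
    params = {
        "service": "WFS",
        "version": version,
        "request": "GetFeature",
        "typeName": type_name,         # WFS 1.1.0 parameter name
        "maxFeatures": max_features,   # limit the size of the response
    }
    return WFS_BASE_URL + "?" + urlencode(params)


locality_url = get_feature_url("locality")
events_url = get_feature_url("samplingevents", max_features=5)
print(locality_url)
print(events_url)
```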

Example FBIS Sampling Event Feature XML

[Example XML not shown: the WFS server fbis-wfs.niwa.co.nz reported an error when this page was rendered.]

Note that, for practical reasons, we have blurred the distinction between a feature attribute and an observation. In the above XML, the list of species found at that location is given as an attribute of the sampling event. This is clearly an observation, but it is provided because it allows end users to search for sampling events by area, date and species, and thus to discover the sampling events of interest.

Using SOS, this would not be required, as we would be able to search or filter on the values of observations. It is, however, a hint that taxonomic concept is as important a dimension in biodiversity as time or space.

Example FBIS Locality Feature XML

[Example XML not shown: the WFS server fbis-wfs.niwa.co.nz reported an error when this page was rendered.]

NEMO XML Document for Sampling Event

Having made sampling location and event information available through the WFS protocol, the next stage would be to make the observation information available through the SOS protocol using OM as the data format. However, this requires some agreement from the biodiversity information community on how the information should be represented.

In the meantime, we have created a simple XML format that directly represents the information in the data system. This can be used in the short term to present data through existing portals such as the NIWA Environmental Information Browser http://ei.niwa.co.nz and as a source document here for discussion about the issues of mapping biodiversity data to OM.

This XML representation of a NEMO sampling event is not intended to be a final published XML document. Rather, it is a low-level capture of the available information in an easy-to-read form that will assist in identifying the mapping to OMXML.

<?xml-stylesheet type="text/xsl" href="/css/NemoRecord.xsl"?>
<sampling_event>
  <id description="The event id" display_name="Event ID">42216</id>
  <start_date description="The first/start date of sampling" display_name="Sample start date">1997-11-20</start_date>
  <end_date description="The end date of sampling" display_name="Sample End Date" />
  <date_loaded>2010-12-21</date_loaded>
  <master_record_link description="Unique identifier from data source. This and the data source identifier uniquely identify the sampling effort in the FBIS database" display_name="Original master record">3554 Fish_JR</master_record_link>
  <access description="Who can see the data - NIWA or Public?" display_name="Access">P</access>
  <location>
    <id description="Location Composite ID" display_name="Location Composite ID">19148</id>
    <point description="Core data about the raw position of the sampling event" display_name="Sampling Event position">
      <id description="Spatial Point ID" display_name="Spatial Point ID">18483</id>
      <x description="X coordinate for position of sampling" display_name="X Coordinate">2693700.0</x>
      <y description="Y coordinate for position of sampling" display_name="Y Coordinate">6488100.0</y>
      <srid description="Spatial Reference System Identifier - to describe the entered coordinates" display_name="Spatial Reference System">27200</srid>
      <altitude description="Height in meters for position of sampling" display_name="Altitude">10</altitude>
    </point>
    <attribute_sets>
      <p_old_location description="Additional data about the location of the sampling event (old FBIS)" display_name="Sampling Event Location (old FBIS)">
        <id description="location_attribute_set id" display_name="location_attribute_set id">1699</id>
        <catchment_old_id description="The FBIS catchment code for the sampling location. From a standard list of catchment codes." display_name="FBIS Catchment Reference">2296</catchment_old_id>
      </p_old_location>
      <p_location description="Additional data about the location of the sampling event" display_name="Sampling Event Location">
        <id description="location_attribute_set id" display_name="location_attribute_set id">17125</id>
        <nzreachid description="The reference of the river reach from the River Environment Classification dataset" display_name="NZ Reach ID">2005143</nzreachid>
        <pos_ref_note description="Note regarding the position data. E.g. the original reference before conversion or the sheet reference" display_name="Position reference note">s11</pos_ref_note>
        <site_desc description="Name, location, description of the sampling site" display_name="Site description">Tawaipareira Creek</site_desc>
        <catchment_niwa_id description="The catchment code for the sampling location. From a standard list of catchment codes." display_name="Catchment NIWA Reference">2296</catchment_niwa_id>
      </p_location>
    </attribute_sets>
  </location>
  <datasource>
    <id>23</id>
    <source description="Identifies the original source of the data" display_name="Data source">NZFFD OLD</source>
    <metadata_uuid description="Meta Data Link" display_name="Meta Data UUID">8580f3c7-d79c-4a00-bbd1-26e6f69f2f2f</metadata_uuid>
    <dataset>
      <id description="dataset id">2</id>
      <code description="dataset code">FBIS</code>
      <description description="dataset description">Freshwater Biodata Information System</description>
    </dataset>
  </datasource>
  <samples>
    <sample>
      <id>46389</id>
      <subject>Fish</subject>
      <attribute_sets>
        <fsm_presence_est description="Sample presence estimate" display_name="Presence estimate">
          <id description="sample_attribute_set id" display_name="sample_attribute_set id">65644</id>
          <estimate_type description="Presence estimate type" display_name="Presence Estimate Type">Commercial</estimate_type>
          <presence_estimate description="Presence estimate - values dependant on the presence estimate type" display_name="Presence estimate">1</presence_estimate>
        </fsm_presence_est>
        <sm_status description="Sample status" display_name="Sample status">
          <id description="sample_attribute_set id" display_name="sample_attribute_set id">88831</id>
          <species_id_confirmed description="True if the determination of the sample has been confirmed" display_name="Species ID Confirmed">f</species_id_confirmed>
        </sm_status>
        <fsm_flora_depth description="Flora Depth" display_name="Flora Depth">
          <id description="sample_attribute_set id" display_name="sample_attribute_set id">70688</id>
          <max_flora_depth description="Maximum flora depth (m)" display_name="Maximum flora depth (m)">0.5</max_flora_depth>
        </fsm_flora_depth>
        <taxa description="Taxa" display_name="Sample Taxa">
          <id description="sample_attribute_set id" display_name="sample_attribute_set id">68211</id>
          <taxon_niwa_id description="Reference to NIWA Taxa Determination" display_name="Taxa identification">14</taxon_niwa_id>
          <life_stage description="Life stage of the sample" display_name="Life stage">j</life_stage>
        </taxa>
        <fsm_habitat description="Additional habitat data for the particular sample" display_name="Sample Habitat">
          <id description="sample_attribute_set id" display_name="sample_attribute_set id">65877</id>
          <sm_water_flow description="Water flow associated with the sample" display_name="Sample Water Flow">run</sm_water_flow>
        </fsm_habitat>
        <old_taxa description="Old FBIS Taxa" display_name="Old FBIS Sample Taxa">
          <id description="sample_attribute_set id" display_name="sample_attribute_set id">65897</id>
          <taxon_old_id description="Reference to old FBIS Determination ID" display_name="Original Taxa Identification">14</taxon_old_id>
        </old_taxa>
        <fsm_fauna_len description="Fauna length" display_name="Fauna Length">
          <id description="sample_attribute_set id" display_name="sample_attribute_set id">78666</id>
          <min_fauna_len description="Minimum length of fauna (mm)" display_name="Minimum fauna length (mm)">100</min_fauna_len>
          <max_fauna_len description="Maximum length of fauna (mm)" display_name="Maximum fauna length (mm)">200</max_fauna_len>
        </fsm_fauna_len>
      </attribute_sets>
    </sample>
  </samples>
  <attribute_sets>
    <fse_water_char description="Water characteristics - colour, clarity etc" display_name="Water characteristics">
      <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">31253</id>
      <water_clarity description="Water body clarity" display_name="Water Clarity">c</water_clarity>
      <ave_depth description="Average depth of area sampled (m)" display_name="Water Average depth">0.2</ave_depth>
      <ave_width description="Average width of stream (m)" display_name="Water Average width">0.2</ave_width>
      <water_colour description="Water body colour" display_name="Water colour">u</water_colour>
    </fse_water_char>
    <fse_stream description="Stream characteristics - flow, water type, pollution, fauna etc" display_name="Stream characteristics">
      <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">60083</id>
      <water_type description="Water body type description" display_name="Water type">fdrssn</water_type>
      <btm_fauna_species description="The predominant species group" display_name="Bottom Fauna Type">s</btm_fauna_species>
      <permanent description="Is there always water at the site?" display_name="Water body is Permanent?">t</permanent>
      <water_level description="Water body level" display_name="Hydrological condition of stream">l</water_level>
      <tidal description="Whether the sample site is tidal" display_name="Water body is Tidal?">f</tidal>
      <btm_fauna_numbers description="Observed abundance of small benthic fauna" display_name="Bottom Fauna Number">m</btm_fauna_numbers>
    </fse_stream>
    <fse_method description="The method use to conduct the sampling effort" display_name="Sampling method">
      <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">51035</id>
      <collection_method description="Describes the collection method used to collect samples" display_name="Collection method">efp</collection_method>
    </fse_method>
    <fse_method_detail description="Details about the method used for sampling" display_name="Sampling method detail">
      <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">46051</id>
      <reach_length_m description="The length of the stream sampled in metres" display_name="Reach length (m)">15</reach_length_m>
      <sample_attempts description="The number of sampling attempts made" display_name="Sample attempts">1</sample_attempts>
    </fse_method_detail>
    <fse_landuse_multi>
      <fse_landuse description="Surrounding land use descriptors" display_name="Surrounding land use">
        <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">49584</id>
        <perc_landuse description="The percentage of a particular land type at a sampling location" display_name="Land use %">60</perc_landuse>
        <landuse description="Type of land use" display_name="Land use">Farming</landuse>
      </fse_landuse>
      <fse_landuse description="Surrounding land use descriptors" display_name="Surrounding land use">
        <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">49545</id>
        <landuse description="Type of land use" display_name="Land use">Native forest</landuse>
        <perc_landuse description="The percentage of a particular land type at a sampling location" display_name="Land use %">40</perc_landuse>
      </fse_landuse>
    </fse_landuse_multi>
    <fse_flow description="Water flow descriptors" display_name="Water flow descriptors">
      <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">43125</id>
      <perc_cascade description="Percentage cascade in water body" display_name="Water cascade %">0</perc_cascade>
      <perc_rapid description="Percentage rapid in water body" display_name="Water rapid %">0</perc_rapid>
      <perc_riffle description="Percentage riffle in water body" display_name="Water riffle %">0</perc_riffle>
      <perc_runs description="Percentage runs in water body" display_name="Water runs %">100</perc_runs>
      <perc_pool description="Percentage pool in water body" display_name="Water Pool %">0</perc_pool>
      <perc_backwater description="Percentage backwater in water body" display_name="Water backwater %">0</perc_backwater>
      <perc_still description="Percentage still water in water body" display_name="Water Still %">0</perc_still>
    </fse_flow>
    <fse_vegetation_multi>
      <fse_vegetation description="Riparian vegetation descriptors" display_name="Riparian Vegetation">
        <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">41784</id>
        <perc_riparian_vegtype description="The percentage of a particular vegetation type within 5m of the water's edge of the sampled water" display_name="Riparian Vegetation %">100</perc_riparian_vegtype>
        <vegtype description="Vegetation type" display_name="Vegetation Type">Grass/Tussock</vegtype>
      </fse_vegetation>
    </fse_vegetation_multi>
    <fse_measurement_multi>
      <fse_measurement description="Measurements associated with the sampling event" display_name="Measurement">
        <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">56635</id>
        <se_measure_value description="The result of a measurement associated with sampling effort" display_name="Measurement">17.0</se_measure_value>
        <se_measure_location description="The location of a measurement associated with sampling effort" display_name="Measurement location">in the field</se_measure_location>
        <se_measure_unit description="Unit of measurement. Please use SI units" display_name="Measurement unit">oC</se_measure_unit>
        <se_measure_type description="Type of measurement" display_name="Measurement Type">temp</se_measure_type>
      </fse_measurement>
    </fse_measurement_multi>
    <fse_fish_cover description="What type of fish cover is available" display_name="Fish cover">
      <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">22890</id>
      <bed_bank_veg description="Fish cover provided by bank vegetation" display_name="Fish cover bank vegetation?">t</bed_bank_veg>
      <bed_undercut_bank description="Fish cover provided by undercut bank" display_name="Fish cover undercut bank?">f</bed_undercut_bank>
      <bed_debris description="Fish cover provided by debris" display_name="Fish cover debris?">f</bed_debris>
      <bed_weed description="Fish cover provided by weeds" display_name="Fish cover weeds?">f</bed_weed>
    </fse_fish_cover>
    <se_time description="The time or time of day the sampling took place" display_name="Sampling time">
      <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">42364</id>
      <start_time description="The time the sampling effort started. E.g. hh.mm. Or day, night" display_name="Sample start time">10:00</start_time>
    </se_time>
    <fse_substrate description="Substrate descriptors" display_name="Substrate descriptors">
      <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">41475</id>
      <perc_bedrock description="Percentage substrate is bedrock" display_name="Substrate bedrock %">0</perc_bedrock>
      <perc_boulders description="Percentage substrate is boulders" display_name="Substrate boulders %">0</perc_boulders>
      <perc_cobbles description="Percentage substrate is cobbles" display_name="Substrate cobbles %">0</perc_cobbles>
      <perc_coarse_gravel description="Percentage substrate is coarse gravel" display_name="Substrate coarse gravel %">0</perc_coarse_gravel>
      <perc_fine_gravel description="Percentage substrate is fine gravel" display_name="Substrate fine gravel %">0</perc_fine_gravel>
      <perc_sand description="Percentage substrate is sand" display_name="Substrate sand %">0</perc_sand>
      <perc_mud description="Percentage substrate is mud" display_name="Substrate mud %">100</perc_mud>
    </fse_substrate>
    <se_sampler_multi>
      <se_sampler description="The person or organisation undertaking the sampling" display_name="Sampler">
        <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">51009</id>
        <sampler_initials description="The initials of person undertaking the sampling" display_name="Sampler – Initials">wfd</sampler_initials>
        <sampler_is_diver description="The person undertaking the sampling is the diver involved" display_name="Sampler is diver">f</sampler_is_diver>
        <sampler_organisation description="The organisation undertaking the sampling" display_name="Sampler - organisation">bioresearches</sampler_organisation>
        <sampler_person description="The person undertaking the sampling" display_name="Sampler - Person">wayne donovan</sampler_person>
      </se_sampler>
    </se_sampler_multi>
    <fse_distance description="Distance data associated with the sampling event" display_name="Sampling distance data">
      <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">41831</id>
      <dist_inland description="The distance inland of the sampling location, in kilometres" display_name="Distance Inland (kms)">0</dist_inland>
    </fse_distance>
    <se_status description="Sampling event status" display_name="Sampling Status">
      <id description="sampling_event_attribute_set id" display_name="sampling_event_attribute_set id">49682</id>
      <se_verified description="Sampling effort verified" display_name="Sampling effort verified">false</se_verified>
    </se_status>
  </attribute_sets>
</sampling_event>
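
To show how a client might consume this format, here is a minimal Python sketch that flattens the attribute sets of a sampling event using only the standard library. It works on a trimmed copy of the document above. The read_attribute_sets helper is hypothetical, and treating any element whose name ends in _multi as a wrapper around repeated attribute sets is an assumption inferred from the example rather than a documented rule.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sampling event above (descriptive XML attributes omitted).
SAMPLE_XML = """
<sampling_event>
  <id>42216</id>
  <start_date>1997-11-20</start_date>
  <attribute_sets>
    <fse_method>
      <id>51035</id>
      <collection_method>efp</collection_method>
    </fse_method>
    <fse_landuse_multi>
      <fse_landuse>
        <id>49584</id>
        <perc_landuse>60</perc_landuse>
        <landuse>Farming</landuse>
      </fse_landuse>
      <fse_landuse>
        <id>49545</id>
        <landuse>Native forest</landuse>
        <perc_landuse>40</perc_landuse>
      </fse_landuse>
    </fse_landuse_multi>
  </attribute_sets>
</sampling_event>
"""


def read_attribute_sets(container):
    """Return a list of (set_name, {attribute: value}) pairs.

    Elements whose tag ends in "_multi" are assumed to wrap repeated
    attribute sets; the internal <id> of each set is skipped.
    """
    result = []
    for child in container:
        if child.tag.endswith("_multi"):
            result.extend(read_attribute_sets(child))
        else:
            values = {e.tag: (e.text or "").strip() for e in child if e.tag != "id"}
            result.append((child.tag, values))
    return result


root = ET.fromstring(SAMPLE_XML)
event_sets = read_attribute_sets(root.find("attribute_sets"))
for name, values in event_sets:
    print(name, values)
```

Because the element names carry the attribute types, the code needs no prior knowledge of which attribute sets exist; new sets added to the system would be picked up without changes.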


Note that we have elected to use the notation

<type>value</type>
e.g.
<min_fauna_len>100</min_fauna_len>

for attribute-value pairs rather than

<attribute type="name">value</attribute>
e.g.
<attribute type="min_fauna_len">100</attribute>

The two forms are functionally equivalent, and most XML processing can operate on either without any issues. However, the former is more difficult to express as an XML Schema document because the element names are subject to change and extension.

Also, for clarity, these examples do not include namespaces.
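
The equivalence of the two notations can be illustrated with a small Python sketch that converts an attribute set from one form to the other and back. The to_generic and to_named helpers are hypothetical, and for brevity the sketch ignores the description and display_name XML attributes.

```python
import xml.etree.ElementTree as ET


def to_generic(attribute_set):
    """Rewrite <name>value</name> children as <attribute type="name">value</attribute>."""
    generic = ET.Element(attribute_set.tag)
    for child in attribute_set:
        attr = ET.SubElement(generic, "attribute", {"type": child.tag})
        attr.text = child.text
    return generic


def to_named(attribute_set):
    """Inverse of to_generic: use each type attribute as the element name."""
    named = ET.Element(attribute_set.tag)
    for child in attribute_set:
        elem = ET.SubElement(named, child.get("type"))
        elem.text = child.text
    return named


original = ET.fromstring(
    "<fsm_fauna_len><min_fauna_len>100</min_fauna_len>"
    "<max_fauna_len>200</max_fauna_len></fsm_fauna_len>"
)
generic = to_generic(original)
round_trip = to_named(generic)

print(ET.tostring(generic, encoding="unicode"))
print(ET.tostring(round_trip, encoding="unicode"))
```

The generic form has a fixed set of element names, which is what makes it easier to describe in a schema; the named form is shorter and easier to read.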
