Minutes – Framework for Interoperable Freshwater Models Project Meeting
Draft 17 Jan 2011
Meeting held 13 Dec 2010, NIWA Christchurch
Sandy Elliott, Ude Shankar, Ross Woods, Andrew Watson (NIWA)
Val Snow, Ben Jolly, Greg Peyroux (AgResearch)
Robert Gibb, Linda Lilburne (Landcare Research)
Summary of Project Goals and Timeline (Sandy)
- See PowerPoint presentation below, presented to the FRST Freshwater Research Stakeholders’ Meeting
- An initial meeting with project leaders was held (Nov 2010) to flesh out the project plan for the first year and to confirm the overall plan
- Steering Group has been established [list will be placed on the project wiki]. Good uptake of invitations. Still need to find someone from MfE. Initial meeting proposed for March 2011.
- Subcontracts have been agreed
- IP provisions in subcontracts have been agreed
- Although socio-economics are excluded from the contract scope, we need to be aware of how such models could plug into the framework and to document this.
- We should probably include a small subcontract in the project to get further input on groundwater models
- IT people from AgR and LC are keen on distributed models; the ability to run models locally is also important for ‘power’ runs where there might be large data flows.
- Approach MfE to join group (Sandy)
- Put list onto collaboration site (Sandy)
- Set up first meeting with group (next project leaders’ meeting)
- Schedule next project leaders meeting (Sandy)
Overview of attributes of current freshwater quality models
- Sandy listed attributes of a range of freshwater models currently in use and which might be linked into the framework (see whiteboard notes below). This was not exhaustive, but was useful for bringing the IT people up to speed.
- See also next section (hydrology models)
- Are there existing metadata schemas for models [RW1]? What about ontologies and semantics? The Seamless project spent a lot of time on metadata, and we might be able to learn from them, although they tended to get bogged down in the process.
- We need to include key users in the database
- Model attributes to capture (from Andrew): how much scope there is to modify the model (code availability, institutional constraints, model versions/standardisation); licensing required to run the code (e.g. IDL needs a licence to run on a server, and Matlab [RW2] or old Delphi projects may have third-party components that require a licence).
- Establish and circulate an initial list of model attributes that will be captured (Sandy)
- Linda Lilburne to liaise with Sandy for LCR contributions
- Set up a template in the Confluence collaboration site to capture this (Sandy and NIWA IST)
- Draw up list of models to be documented, and responsibilities (Sandy)
- Document frameworks (various, as allocated above)
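The attribute list above could be captured in a structured record per model. A minimal sketch in Python, using TopNet as the example; the field names are illustrative assumptions, not an agreed schema:

```python
# Illustrative sketch of a model-metadata record; field names are
# assumptions based on the attributes discussed, not an agreed schema.
model_record = {
    "name": "TopNet",
    "owner": "NIWA",
    "language": "Fortran 90",
    "inputs": ["netCDF data files", "ASCII control files"],
    "outputs": ["netCDF files"],
    "api": None,                      # no API, and no plans for one
    "licensing": "no runtime licence required",
    "code_availability": "institutional",
    "key_users": ["flood forecasting (Rangitaiki)",
                  "climate change studies", "land use change studies"],
}

def missing_fields(record, required=("name", "owner", "language", "licensing")):
    """Return required attributes not yet filled in for a model record."""
    return [f for f in required if not record.get(f)]
```

A simple completeness check like `missing_fields` would let the Confluence template flag records that still need attributes filled in.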
Summary of NIWA hydrology models (Ross)
- TopNet. Fortran 90 code. No GUI. Inputs are netCDF data files with ASCII control files. Outputs are netCDF files. No API, and no plans for one. Being used for flood forecasting for the Rangitaiki. Also being used for climate change and land use change studies in NZ.
- Ton Snelder and colleagues at NIWA are developing a new tool for the Environmental Flow Strategic Assessment Programme (EFSAP), on the influence of water allocation on flows. Regional scale; a new programme with a 20-page planning document. Jani Diettrich is the likely programmer, likely working in Matlab. Not interested in a time-stepping model, just flow-duration curves. It is not being designed for interoperability, but could be.
- RCs are interested in soil water balance models, relevant to irrigation demand (will proposed water allocation meet the demand?). A proposal to develop the RAT tool (River Analysis Tool) is at the concept stage. To be built under the NIWA WaterScape programme.
- It’s good that TopNet can be run separately from the user interface. Things are difficult when the calculation engine is closely coupled with the interface.
- Is there potential to extend the RAT tool to pasture/crop production?
- Follow up with Ton re potential interoperability of EFSAP (Ross)
NIWA Information Systems Team capability/ideas (Andrew Watkins)
- Use existing standards and design patterns where available, rather than re-inventing. Many of the distributed-computing and data-pipeline problems have been ‘solved’ by others, and we can use those technologies, e.g. in astronomy and at NASA. OGC spatial data protocols. NetCDF and DAT for matrix-type information. XML for message passing and control information. Time series and data-serving software, WPS, SOAP.
- Most web development in Java and PHP.
- EcoConnect (NIWA weather and associated environmental model software). Not really based on interoperability; more of a data stream between models with a feed to the user. For new Samoa applications, EcoConnect is being made to operate in more of a services-management environment. There are questions about whether it is better to have central services control, or whether the control should be customised and run from the user desktop. Could potentially be moved to a web server. EcoConnect uses continuous test and build. After code control, the system runs a set of tests on a parallel test system before being published.
- RiskScape. Joint NIWA/GNS project used for establishing environmental hazards and consequences. Users can download various ‘modules’ (a map or model) from a central server, after which everything is run on a desktop. In the future the system could potentially be moved to a web server approach rather than desktop. Built with a Java architecture. Fairly tightly coupled, but it doesn’t need to run all the components. Each component model has to be re-written in a common language to get the coupling. There is careful separation of the user interface and the models themselves.
- Interfaces: Design these as if in a web browser, both for model input and for data visualisation.
- Metadata: Can be used to help establish whether the available data inputs (/services) are appropriate. If not, then call an adapter for conversion. Metadata publishes information about a model or an adapter.
- To be really ambitious, we could aim to build our framework on an existing cloud network such as Google App Engine.
- Some key questions: Where will the code live, where will the data live, how to connect the models.
- Andrew is in favour of an “unencumbered use” approach to licensing; otherwise licensing can become complex and restrictive
- Can we get some consistency around data visualisation? Standardisation would help.
- Do models in the freshwater area need to be strongly coupled (e.g. crop growth affects hydrology, which affects irrigation, which affects crop growth), or can they be weakly coupled, i.e. work more on a sequential/cascade approach (crop growth affects water availability)?
- APSIM is similar to EcoConnect, except that the modules can be written in a variety of languages. However, this is about to change, with all modules being transferred to VB.NET to speed up run-time.
- Shankar likes the Arc-Hydro standard because it is built for node-link systems, which are relevant to freshwater. A drawback is that it has to run within ESRI software.
- Document ideas on collaboration site (NIWA IST)
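The strong- versus weak-coupling question above can be sketched in code. A toy Python illustration, with made-up stand-in relations rather than real crop or hydrology models: weak coupling runs each model once in a cascade, while strong coupling iterates the feedback loop to a fixed point.

```python
# Toy stand-ins for two coupled models; the coefficients are
# illustrative assumptions, not real crop/hydrology relations.
def crop_water_demand(growth):
    return 0.5 * growth            # demand rises with crop growth

def water_available(demand):
    return 10.0 - 0.2 * demand     # supply falls as demand rises

def weak_coupling():
    """Cascade: run each model once, passing outputs downstream."""
    growth = 4.0                   # crop model output, taken as fixed
    demand = crop_water_demand(growth)
    return water_available(demand)

def strong_coupling(tol=1e-6, max_iter=100):
    """Feedback loop: iterate until crop growth and water supply agree."""
    growth = 4.0
    for _ in range(max_iter):
        supply = water_available(crop_water_demand(growth))
        new_growth = 0.4 * supply  # toy feedback of supply on growth
        if abs(new_growth - growth) < tol:
            break
        growth = new_growth
    return growth
```

The framework-design consequence: a cascade only needs one-way data passing between models, whereas feedback requires the framework to orchestrate repeated exchanges within a time step.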
Landcare Research geospatial capabilities/ideas (Robert Gibb)
- Trying to remove/reduce the heterogeneity of spatial databases; e.g. there are about 5 different land cover databases. Organise these semantically and use time stamping.
- WFS/WMS web standards for spatial data become very cumbersome for large datasets. The data streams and communications become too slow. A web services system for freshwater models based on these standards would be too slow.
- NESSIE is a new project for developing distributed high-performance computing (super-cluster) that Robert is involved in. This could be useful for running models and providing software infrastructure.
- Looking at spatial data frameworks that could be used at different resolutions (like raster pyramids). There are some 3-d equal-area multi-resolution projection systems based on a cube that could be useful for this purpose, or a hexagonal mesh (PYXIS Innovation with associated WorldView GeoWeb browser), or the Healpix system used in astronomy. Robert is investigating some of these for the NESSIE project. They could facilitate handling of large datasets (such as LIDAR) with disparate resolution.
- Need a data warehouse approach for efficient data retrieval. Use an extract-transform-load (ETL) operation for data retrieval. This could be part of NESSIE.
- Other technologies used by the Landcare Research geospatial group include geospatial uses of Apache Cassandra and keyspaces for replicated-database technology, and use of the open-source GDAL translator libraries as an array reader.
- Andrew, Greg, and Ben (all the IT boffins) were very supportive of the data warehouse approach. Not a distributed database, but can extract data from a range of sources into a common location, ready for rapid retrieval.
- Greg: AgResearch are getting their own data warehouse. They looked at the web services approach, but realised that there needed to be a central compute resource to make things work well – too hard to do over the web. They have their lab data analysis/workflow system in a centralised registry. The data can be protected in the warehouse by using suitable access controls. Download of data into the warehouse could still be done using slower but standard technologies such as WFS, which would be done as a background process.
- Google Fusion? [to be investigated]
- Document ideas on collaboration site (Robert)
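The extract-transform-load pattern discussed for the data warehouse can be sketched briefly. A minimal Python illustration, where the source format, field names, and units are illustrative assumptions: records are pulled from a heterogeneous source, normalised to a common schema, and loaded into one store ready for rapid retrieval.

```python
# Minimal ETL sketch for the data-warehouse approach; the mock source,
# field names, and units are illustrative assumptions.
warehouse = {}   # stands in for the central warehouse store

def extract(source):
    """Pull raw records from a (mock) source service."""
    return source["records"]

def transform(raw):
    """Normalise field names and units to a common schema."""
    return [{"site": r["id"].upper(), "flow_m3s": float(r["flow"])}
            for r in raw]

def load(dataset_name, rows):
    """Store the cleaned rows under a dataset key for fast retrieval."""
    warehouse[dataset_name] = rows

# One source among many; extraction could run in the background
# over slower standard protocols such as WFS.
source_a = {"records": [{"id": "wai01", "flow": "3.2"},
                        {"id": "wai02", "flow": "1.7"}]}

load("river_flows", transform(extract(source_a)))
```

The point of the pattern is that slow, heterogeneous retrieval happens once, up front; model runs then read only from the fast, uniform warehouse.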
APSIM (Ben Jolly)
- APSIM is a communication framework for models. It uses XML for the interface, then builds command-like instructions to control the running. It is quite generic, is dynamic with uniform time steps, but is not inherently spatial. The new version is in .net. It would be hard to re-write some of the older models originally written in other languages into .net, but wrappers can be built for them if necessary. The new code for communication is fairly small, and uses .net reflection (Java and Python could do a similar thing). Requires introspection. People could use the system ‘wrongly’ because there does not need to be meaning to the connections.
- There is quite a bit of jargon in this field. We should consider building up a glossary for the collaboration wiki.
- It would be good to have demonstrations of RiskScape and APSIM at subsequent meetings
- Document ideas on collaboration site (Greg)
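The reflection-style wiring described for APSIM’s new communication layer (.NET reflection; Python’s built-in introspection can do a similar thing) can be sketched as follows. The module and variable names are illustrative, not APSIM’s actual API; the sketch also shows why connections carry no enforced meaning.

```python
# Sketch of reflection-style module wiring: connections are resolved
# by name at run time, as described for APSIM's new communication
# layer. Module and variable names are illustrative assumptions.
class SoilModule:
    def get_soil_water(self):
        return 0.25          # toy value

class CropModule:
    def set_soil_water(self, value):
        self.soil_water = value

def connect(src, src_name, dst, dst_name):
    """Wire one module's getter to another's setter by name alone.

    Nothing here checks that the connection is physically meaningful,
    which is how such a framework can be used 'wrongly'."""
    getter = getattr(src, "get_" + src_name)
    setter = getattr(dst, "set_" + dst_name)
    setter(getter())

soil, crop = SoilModule(), CropModule()
connect(soil, "soil_water", crop, "soil_water")
```

Because the wiring is generic, the communication code stays small; the cost is that correctness of a connection rests entirely on the person configuring it.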
- Daniel was going to lead the discussion of particular frameworks and how to collate information about them, but due to his absence and the limited time it was decided to defer this item. However, the various IT presentations did cover aspects of existing frameworks, and there was some general discussion.
- We already have a list of frameworks from the project leaders’ meeting. Val has done some preliminary investigation of some of these, and a shadow short-list. These need to be documented in a systematic way on the collaboration site. Daniel has found another framework, and the eSource from Australian CRC has also cropped up.
- The NIWA urban people are integrating urban models (Jonathan Moores).
Actions (Val and Daniel)
- Draw up list of frameworks to document
- Draw up list of attributes to document and set up template for the collaboration site
- Allocate write-ups to individuals
- Document frameworks (various, as allocated above)
- Follow up on what the urban people are doing (Sandy)
- For the workshop, need to keep things simple. Gather information on what types of things people [DTR4] want to be able to do. Also, what sort of timelines people are expecting, and what sort of constraints might apply. Also, set up some storylines or user scenarios, and question the users on how they would envisage using a modelling framework in that situation.
[RW1] The WISE project generated some metadata for the model components. This might be of interest to the IT folks, if that information is available.
[RW2] If we take Matlab code and compile it (NIWA has Matlab compiler) into an executable, then we don’t need a license to run that executable.
You can make Matlab executables that are called from CGI scripts. There are Matlab Builder products that can be used to turn Matlab into .NET or Java classes.
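Calling such a compiled executable from a wrapper script (e.g. behind a CGI endpoint) can be sketched as below. Since no compiled Matlab binary is available here, the Python interpreter is used as a stand-in for the executable; the function and argument names are illustrative.

```python
# Sketch of driving a compiled (licence-free) Matlab executable from a
# wrapper script. The Python interpreter stands in for the binary.
import subprocess
import sys

def run_model(executable, args):
    """Run an external model executable and capture its text output."""
    result = subprocess.run([executable] + args,
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()

# Stand-in call; a real deployment would point at the compiled binary.
output = run_model(sys.executable, ["-c", "print(2 + 3)"])
```

The same wrapper pattern works whatever the source language of the model, which is one reason compiled executables sidestep the licensing issues noted above.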
[DTR3] Completely agree and cannot stress this enough. The question is what do we want the framework to do for “us” where us = researchers and end-users.
[DTR4] Need to be specific here. “People” is too vague a term. Need to define various roles and responsibilities and how they would interact with the framework, e.g., data contributors, model contributors, model users, etc. Some may hold more than one role (or all). It is especially important to get more detail on how end-users want to interact with the framework, as discussed above.