Soil moisture data explorer

This service will produce a new soil moisture product for the UK, derived from recently rediscovered and digitised records.

The following presentation on this service was given at the MELODIES: Exploiting Open Data conference in October 2016. Slides are available here.

Service Q&A

Who are the target users of this service?

Researchers. We are specifically in discussion with scientists developing a new methodology to assess nitrous oxide emissions in the UK, a calculation which requires soil moisture as an input. The long-term aim is to reach a wider group with an interest in soil moisture variability in the UK, such as land surface modellers and ecologists.

How will these users access this service?

The soil moisture explorer will be accessed via the CEH Environmental Information Platform.

What products does this service provide?

Real-time and historical (1970-present) in-situ and model-derived soil moisture data for the UK. The in-situ data is drawn from over 100 sites and is recovered from old tape archives. The model-derived data comes from pre-existing (non-open-access) datasets. Both are complemented by land cover data, which provides context. The land cover data is included in the soil moisture datasets as metadata and comes from the land use/land cover product of our "Land cover and land use information to support UNFCCC emissions reporting" work.

The soil moisture explorer will allow users to explore the soil moisture maps in detail. Clicking on a point displays a graph of the model outputs as a function of time and includes any in-situ data that corresponds to the selected point. Users may select an area for averaging. Information will be downloadable in CSV format, or similar.
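The area-averaging and CSV download steps described above can be sketched roughly as follows. This is a minimal illustration, not the explorer's actual implementation: the record layout, the bounding-box selection and the function names `area_average` and `to_csv` are all assumptions made for the example, and the values are invented.

```python
import csv
import io
from statistics import mean

# Hypothetical records: (date, longitude, latitude, volumetric soil moisture).
RECORDS = [
    ("2016-01-01", -1.30, 51.60, 0.31),
    ("2016-01-01", -1.10, 51.70, 0.35),
    ("2016-01-01", -3.00, 54.00, 0.42),  # outside the selected area
    ("2016-01-02", -1.30, 51.60, 0.29),
    ("2016-01-02", -1.10, 51.70, 0.33),
]


def area_average(records, west, south, east, north):
    """Return {date: mean soil moisture} for points inside a bounding box."""
    by_date = {}
    for date, lon, lat, value in records:
        if west <= lon <= east and south <= lat <= north:
            by_date.setdefault(date, []).append(value)
    return {date: mean(values) for date, values in sorted(by_date.items())}


def to_csv(series):
    """Serialise a {date: value} time series as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", "mean_soil_moisture"])
    for date, value in series.items():
        writer.writerow([date, f"{value:.3f}"])
    return buffer.getvalue()


series = area_average(RECORDS, west=-2.0, south=51.0, east=-1.0, north=52.0)
csv_text = to_csv(series)
```

A real selection would probably use an arbitrary polygon rather than a bounding box. The box keeps the sketch short.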

How will these products benefit users?

The data has been painstakingly recovered from forgotten paper archives, and this project brings it together online for the first time. Researchers can use the soil moisture explorer to explore and compare the available data quickly and easily in one place before moving on to the analysis phase of their work.

Which Open Data sources drive this service?

None. We are producing Open Data.

What processing is performed on this data?

The in-situ data at the foundation of our products are recovered from tapes and paper archives, and from other more recent data sources (1990-present). These are combined and uploaded into an Oracle relational database. Site and measurement information are extracted from this database and converted into RDF triples before being uploaded to the Strabon spatio-temporal linked data store on the Shared Platform.
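As an illustration of the row-to-triple step, the sketch below turns one measurement row into N-Triples lines. The namespace `http://example.org/soil-moisture/`, the property names and the row fields are hypothetical. The project's real ontology terms and database schema are not given here, and the actual conversion tooling may differ.

```python
# Hypothetical namespace; the project's real ontology IRIs are not stated.
BASE = "http://example.org/soil-moisture/"
XSD = "http://www.w3.org/2001/XMLSchema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value, datatype):
    """Format a typed literal using an XSD datatype."""
    return f'"{value}"^^<{XSD}{datatype}>'


def row_to_triples(row):
    """Turn one measurement row (a dict) into a list of N-Triples lines."""
    obs = iri(f"{BASE}observation/{row['site_id']}/{row['date']}")
    site = iri(f"{BASE}site/{row['site_id']}")
    return [
        f"{obs} {iri(RDF_TYPE)} {iri(BASE + 'Observation')} .",
        f"{obs} {iri(BASE + 'atSite')} {site} .",
        f"{obs} {iri(BASE + 'observedOn')} {literal(row['date'], 'date')} .",
        f"{obs} {iri(BASE + 'volumetricWaterContent')} "
        f"{literal(row['vwc'], 'decimal')} .",
    ]


# Invented example row standing in for a record read from the database.
rows = [{"site_id": "S001", "date": "1985-06-01", "vwc": 0.27}]
triples = [line for row in rows for line in row_to_triples(row)]
```

Each row yields a small, fixed set of triples, so the output can be written to a file line by line and bulk-loaded into a triple store.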

How does this service use Linked Open Data?

The soil moisture database will be made available by the end of the project as linked open data.

We have worked closely with the University of Athens to develop an ontology for the in-situ soil moisture data. This ontology can be linked to an Observations and Measurements ontology based on the Observations and Measurements OGC Standard.

How Open Data has improved this service

By providing a framework by which we can maximise the opportunity for re-use of the in-situ data for a wide range of purposes.

How the Shared Platform has improved this service

By providing easy and supported access to Sextant in a high-performance environment.

How LOD and/or visualisation tools have improved this service

During the development of our service, we have been able to prototype SPARQL queries on Strabon to link in-situ soil moisture data with gridded soil moisture, land cover and river catchments datasets. We have used the linked data visualisation tool Sextant to join, display and overlay spatio-temporal linked datasets on a map in a prototype portal.
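A query of the kind described might look like the sketch below, which selects observations from sites that fall within a given river catchment. It uses the standard GeoSPARQL vocabulary and the `geof:sfWithin` function. The `sm:` prefix, its terms and the catchment IRI are hypothetical placeholders, and the exact query form used in the prototype is not stated here.

```python
from string import Template

# geo: and geof: are the OGC GeoSPARQL namespaces.
# The sm: prefix and all sm: terms are hypothetical placeholders.
QUERY_TEMPLATE = Template("""\
PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX sm:   <http://example.org/soil-moisture/>

SELECT ?site ?date ?vwc
WHERE {
  <$catchment> geo:hasGeometry ?cGeom .
  ?cGeom geo:asWKT ?cWkt .
  ?site a sm:Site ;
        geo:hasGeometry ?sGeom .
  ?sGeom geo:asWKT ?sWkt .
  ?obs sm:atSite ?site ;
       sm:observedOn ?date ;
       sm:volumetricWaterContent ?vwc .
  FILTER(geof:sfWithin(?sWkt, ?cWkt))
}
ORDER BY ?site ?date
""")


def build_query(catchment_iri):
    """Return the query text for one catchment, rejecting unsafe IRIs."""
    if any(ch in catchment_iri for ch in '<>"{}|\\^` \n\t'):
        raise ValueError(f"not a plain IRI: {catchment_iri!r}")
    return QUERY_TEMPLATE.substitute(catchment=catchment_iri)


query = build_query("http://example.org/soil-moisture/catchment/C1")
```

The resulting string could then be sent to a SPARQL endpoint. The spatial filter compares every site geometry with the catchment geometry, and it is this kind of join that becomes expensive as the datasets grow.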

Our biggest challenges so far...

To use gridded raster data - such as modelled soil moisture data, and our land cover product - as linked data with the tools we currently use, we first have to convert the datasets to a vector format. This is very inefficient, particularly for high-resolution datasets. Other partners in MELODIES are working on a technical solution for using raster data as linked data, which our service may be able to use.
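To show why this conversion scales badly, the sketch below turns each grid cell into its own WKT polygon. It assumes a regular grid with square cells and a known top-left corner. The function names and the tiny example grid are invented for illustration and do not reflect the conversion tools actually used.

```python
def cell_to_wkt(col, row, x_min, y_max, cell_size):
    """Return the WKT polygon covering one grid cell (row 0 is the top row)."""
    west = x_min + col * cell_size
    east = west + cell_size
    north = y_max - row * cell_size
    south = north - cell_size
    return (
        f"POLYGON(({west} {south}, {east} {south}, "
        f"{east} {north}, {west} {north}, {west} {south}))"
    )


def raster_to_features(grid, x_min, y_max, cell_size, nodata=None):
    """Convert a 2D grid into (wkt, value) pairs, skipping no-data cells."""
    features = []
    for row_index, line in enumerate(grid):
        for col_index, value in enumerate(line):
            if value != nodata:
                wkt = cell_to_wkt(col_index, row_index, x_min, y_max, cell_size)
                features.append((wkt, value))
    return features


# Invented 2 x 2 grid with one no-data cell.
grid = [[0.30, 0.32], [0.28, None]]
features = raster_to_features(grid, x_min=0, y_max=2, cell_size=1)
```

Every valid cell becomes a separate geometry, so a 1000 x 1000 grid produces up to a million polygons per time step, each of which must then be stored as linked data.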

Running SPARQL queries in real-time applications can be slow, particularly when performing complex spatial joins. We have two possible solutions on the table to improve performance:

  1. The Silk linked data discovery framework tool. This allows spatial or temporal relationships to be pre-computed to speed up SPARQL queries.
  2. The Ontop Spatial tool. This enables virtual geospatial RDF to be created on top of a relational database without having to convert the data to RDF first.
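The idea behind the first option, pre-computing spatial relationships so that queries become simple lookups, can be illustrated in plain Python. This is not Silk's API or configuration. The site coordinates, the catchment bounding boxes and all names below are invented for the sketch.

```python
# Hypothetical sites as (longitude, latitude).
SITES = {"S001": (-1.3, 51.6), "S002": (-3.0, 54.0), "S003": (-1.5, 51.2)}

# Hypothetical catchments as bounding boxes (west, south, east, north).
CATCHMENTS = {
    "catchment_a": (-2.0, 51.0, 0.5, 52.0),
    "catchment_b": (-3.5, 53.5, -2.0, 55.0),
}


def precompute_links(sites, catchments):
    """Run the spatial test once and store catchment -> site links."""
    links = {}
    for site_id, (lon, lat) in sites.items():
        for name, (west, south, east, north) in catchments.items():
            if west <= lon <= east and south <= lat <= north:
                links.setdefault(name, []).append(site_id)
    return links


# Computed once, ahead of time, rather than on every user request.
LINKS = precompute_links(SITES, CATCHMENTS)


def sites_in(catchment):
    """Answer 'which sites are in this catchment?' without a spatial join."""
    return LINKS.get(catchment, [])
```

In a linked data setting the pre-computed links would be stored as ordinary triples, so a query can follow them directly instead of evaluating a spatial filter at request time.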