SUMMARY: Near-real-time streaming sensor data can feed the CZO Data Visualization Portal for interactive public assessment of current conditions. CZO investigators can also use such streaming data services to plan storm sampling, to maintain their sensor networks, and to assimilate observations into near-real-time model predictions. Several implementation options are available for CZOs that choose to provide access to near-real-time streaming data.
The CUAHSI Hydrologic Information System (HIS), which specializes in publishing time series of hydrologic observations derived from sensors, relies on web services as the mechanism for publishing sensor data on the Internet. The CUAHSI HIS project has established a standard web service interface called WaterOneFlow, which consists of several query functions that let users write code to retrieve time series data. In its standard configuration, a WaterOneFlow web service connects to an Observations Data Model (ODM) Version 1.1.1 database and, on request, delivers data stored in that database in Water Markup Language (WaterML) format. Requests can be made from any modern programming language that supports SOAP or RESTful web services.
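As an illustration of what such a request and response might look like from client code, the following Python sketch builds a SOAP envelope for the WaterOneFlow GetValues function and extracts time-stamped values from a WaterML document. It is a minimal sketch under stated assumptions, not a reference client: the namespace URIs are assumed to be those of WaterOneFlow 1.1 and WaterML 1.1 and should be checked against the WSDL of the service in use, the site and variable codes are hypothetical, the sample response is a hand-written fragment rather than real data, and no network call is made.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from xml.sax.saxutils import escape

# Assumed namespace URIs (WaterOneFlow 1.1 / WaterML 1.1); verify against your service's WSDL.
WOF_NS = "http://www.cuahsi.org/his/1.1/ws/"
WML_NS = "http://www.cuahsi.org/waterML/1.1/"


def build_getvalues_envelope(location, variable, start_date, end_date, auth_token=""):
    """Return the text of a SOAP 1.1 envelope for a GetValues request."""
    params = {
        "location": location,
        "variable": variable,
        "startDate": start_date,
        "endDate": end_date,
        "authToken": auth_token,
    }
    # Escape each value so that characters such as '&' cannot break the XML.
    body = "".join(f"<{name}>{escape(value)}</{name}>" for name, value in params.items())
    return (
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        f'<soap:Body><GetValues xmlns="{WOF_NS}">{body}</GetValues></soap:Body>'
        "</soap:Envelope>"
    )


def parse_values(waterml_text):
    """Extract (datetime, float) pairs from the value elements of a WaterML document."""
    root = ET.fromstring(waterml_text)
    pairs = []
    for value in root.iter(f"{{{WML_NS}}}value"):
        timestamp = datetime.fromisoformat(value.attrib["dateTime"])
        pairs.append((timestamp, float(value.text)))
    return pairs


# Hand-written, heavily abbreviated fragment used only to exercise parse_values.
SAMPLE_WATERML = f"""<timeSeriesResponse xmlns="{WML_NS}">
  <timeSeries>
    <values>
      <value dateTime="2015-06-01T00:00:00">12.4</value>
      <value dateTime="2015-06-01T00:15:00">12.6</value>
    </values>
  </timeSeries>
</timeSeriesResponse>"""

if __name__ == "__main__":
    # "czo:SITE_A" and "czo:WaterTemp" are hypothetical codes.
    print(build_getvalues_envelope("czo:SITE_A", "czo:WaterTemp", "2015-06-01", "2015-06-02"))
    for timestamp, reading in parse_values(SAMPLE_WATERML):
        print(timestamp.isoformat(), reading)
```

In practice the envelope would be sent by HTTP POST to the service endpoint, or a SOAP client library would generate it from the WSDL; the sketch only shows the shape of the exchange.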
Many research groups have implemented ODM and WaterOneFlow web services for providing users with near-real-time access to sensor data. It is possible for individual CZOs to establish and host their own CUAHSI HIS-compliant observational data web services.
Benefits of providing web services for near-real-time or streaming datasets include:
The CZO Data Visualization Portal is capable of providing interactive public access to current conditions via published web services.
Web services can provide a basic, public, yet secure interface to an operational sensor database. This can potentially broaden the availability of near-real-time data for several types of users, while protecting the integrity of the operational database.
IMPORTANT NOTE: Establishing operational sensor databases and CUAHSI HIS-compliant WaterOneFlow web services does not replace the need to process (e.g., quality control) and archive sensor datasets. Our expectation is that all time series datasets that have been processed for quality assurance/quality control (i.e., Processing Level 1-2) should be archived via CZODisplay/YODA files, preferably in HydroShare or another repository where they can receive a DOI and be referenced by a CZO Dataset Listing. The purpose is to ensure that datasets can be cited in papers, which promotes the reproducibility of science as the data are used by others. As a best practice, CZOs may also choose to archive raw (Level 0) data as CZODisplay/YODA files.
There are currently several options available for creating near-real-time access to sensor data using web services. The following sections describe each of these options.
Operational web services for near-real-time sensor data can be established using the HydroServer software stack developed as part of the CUAHSI Hydrologic Information Systems (HIS) project and now supported by the CUAHSI Water Data Center. The HydroServer software stack includes the Observations Data Model (ODM) Version 1.1.1 for storing observational data derived from sensors, a Streaming Data Loader software application for automating the process of loading streaming sensor data into the ODM database, and a deployable WaterOneFlow web application that connects to the ODM database and provides the WaterOneFlow web service interface. The latest versions of all of these software tools can be downloaded for free from the CUAHSI Water Data Center website. See: https://www.cuahsi.org/OptionsforDataPublication.
This option requires hosting and maintaining a Microsoft Windows Server onto which the software can be deployed. The CZOData Team can assist CZO Data Managers in implementing this option. Additionally, the CUAHSI Water Data Center employs a user support specialist (Jon Pollak) who can provide assistance and guidance in deploying and using the HydroServer Software stack. At a high level, the following steps are required to set up a HydroServer:
Establish a Microsoft Windows server. It can be a physical or virtual machine. Windows Server 2008 R2 or Windows Server 2012 is recommended.
Install Microsoft SQL Server.
Download and deploy one or more ODM databases in SQL Server.
Use the Streaming Data Loader software or other tools to automate loading of time series data into ODM databases.
Deploy the WaterOneFlow web service application on the server.
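The data loading step in the list above can be sketched in Python as follows. This is not the Streaming Data Loader and makes several simplifying assumptions: an in-memory SQLite database stands in for SQL Server, the DataValues table keeps only a subset of the columns of the ODM 1.1.1 table of the same name, the logger file is assumed to be CSV with hypothetical "timestamp" and "value" columns, and a QualityControlLevelID of 0 is assumed to denote raw data. The idea it illustrates is incremental loading: each run inserts only records newer than the latest value already stored for that site and variable.

```python
import csv
import io
import sqlite3
from datetime import datetime, timedelta


def create_datavalues_table(conn):
    """Create a simplified stand-in for the ODM DataValues table (subset of columns)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS DataValues (
               ValueID INTEGER PRIMARY KEY AUTOINCREMENT,
               DataValue REAL NOT NULL,
               LocalDateTime TEXT NOT NULL,
               UTCOffset REAL NOT NULL,
               DateTimeUTC TEXT NOT NULL,
               SiteID INTEGER NOT NULL,
               VariableID INTEGER NOT NULL,
               QualityControlLevelID INTEGER NOT NULL,
               UNIQUE (SiteID, VariableID, LocalDateTime)
           )"""
    )


def load_new_values(conn, csv_text, site_id, variable_id, utc_offset):
    """Insert rows newer than the latest stored timestamp; return the number inserted."""
    latest = conn.execute(
        "SELECT MAX(LocalDateTime) FROM DataValues WHERE SiteID = ? AND VariableID = ?",
        (site_id, variable_id),
    ).fetchone()[0]
    inserted = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        local = datetime.fromisoformat(row["timestamp"])
        local_text = local.isoformat(sep=" ")
        # Timestamps share one fixed-width text format, so string comparison is chronological.
        if latest is not None and local_text <= latest:
            continue
        utc_text = (local - timedelta(hours=utc_offset)).isoformat(sep=" ")
        conn.execute(
            "INSERT INTO DataValues (DataValue, LocalDateTime, UTCOffset, DateTimeUTC,"
            " SiteID, VariableID, QualityControlLevelID) VALUES (?, ?, ?, ?, ?, ?, 0)",
            (float(row["value"]), local_text, utc_offset, utc_text, site_id, variable_id),
        )
        inserted += 1
    conn.commit()
    return inserted


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_datavalues_table(conn)
    logger_file = "timestamp,value\n2015-06-01 00:00:00,12.4\n2015-06-01 00:15:00,12.6\n"
    print(load_new_values(conn, logger_file, site_id=1, variable_id=1, utc_offset=-7))
    # Re-reading the same file after the logger appends one record inserts only that record.
    logger_file += "2015-06-01 00:30:00,12.9\n"
    print(load_new_values(conn, logger_file, site_id=1, variable_id=1, utc_offset=-7))
```

A scheduled task that runs such a loader every few minutes against the logger output is one way to keep the database current; the real ODM schema also requires the related site, variable, method, and source records to exist before values are loaded.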
The CUAHSI Water Data Center is willing to partner with academic research groups to provide data services via the Water Data Center. Under this model, the CUAHSI WDC establishes an ODM database and deploys and hosts the WaterOneFlow web services for you. It is then your responsibility to load data into the database using available tools.
If you are interested in this option, you should contact Jon Pollak, the CUAHSI Water Data Center user support specialist. See: https://www.cuahsi.org/OptionsforDataPublication.
Several alternative software tools have become available over the past several years for providing WaterOneFlow web services. These include HydroServer Lite (a PHP/MySQL version of elements of the HydroServer software stack) and WOFPy (a Python package that implements the WaterOneFlow web service interface). In general, if you need to run a Linux server rather than a Windows server but still want to provide WaterOneFlow web services, HydroServer Lite and WOFPy are options that you might consider.
Additionally, some groups have developed their own WaterOneFlow web service interface to their existing data system. This option would be useful in the case where a CZO has an effective data system already but just wants to increase the interoperability of their data by exposing their time series data using WaterOneFlow and WaterML. Under this scenario, the CZO would have to develop the code to interface with their particular data system, implement the WaterOneFlow web service methods, and establish a mapping between the metadata in their data system and the requirements of WaterML.
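The metadata mapping mentioned above can be pictured as a translation table between a CZO's own field names and the element names used in WaterML. The Python sketch below is illustrative only: the local field names (station_id, param, and so on) are hypothetical, and the WaterML element names shown are a small subset chosen for the example, not the full set of metadata that WaterML requires.

```python
# Hypothetical local field names mapped to a small subset of WaterML element names.
FIELD_MAP = {
    "station_id": "siteCode",
    "station_name": "siteName",
    "param": "variableCode",
    "param_label": "variableName",
    "units": "unitName",
}


def to_waterml_fields(record):
    """Translate one local metadata record into WaterML-style field names.

    Raises KeyError listing any local fields that the record lacks, so that
    gaps in the existing data system are found before the service is published.
    """
    missing = [local for local in FIELD_MAP if local not in record]
    if missing:
        raise KeyError("record lacks fields needed for WaterML: " + ", ".join(missing))
    return {FIELD_MAP[local]: str(record[local]) for local in FIELD_MAP}


if __name__ == "__main__":
    # A made-up record from an existing data system.
    local_record = {
        "station_id": "A1",
        "station_name": "Upper Weir",
        "param": "WTEMP",
        "param_label": "Temperature",
        "units": "degree celsius",
    }
    print(to_waterml_fields(local_record))
```

Keeping the mapping in one table, separate from the web service methods, makes it easier to review against the WaterML schema and to extend as more metadata are exposed.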
Finally, using Version 2 of the Observations Data Model (ODM2) is also an option. The supporting software stack for ODM2 is currently under active development. An implementation of ODM2 for three major relational database management systems (i.e., Microsoft SQL Server, MySQL, and PostgreSQL) is already available for use. We are currently refining and testing a prototype for a Python-based Streaming Data Loader that can automate the loading of sensor data into an ODM2 database. Additionally, we have completed preliminary work on developing WaterOneFlow web services for ODM2. At this time only the ODM2 database schema is ready for widespread use, but the associated software tools should be completed within the next several months.