SUMMARY: The CZOData team has been developing the new YAML Observations Data Archive and Exchange (YODA) file format, extending the original CZO Display File specification so that it accommodates the full diversity of critical zone science data (such as hydrological time series, soil profile geochemistry, and biodiversity transects) that can be organized with the Observations Data Model v2 (ODM2). As an implementation of ODM2, YODA will serve as a text-file encoding both for archiving observational data at recommended data centers and for integrating diverse data from multiple sources using ODM2-based cyberinfrastructure at these data centers and the CZO Central system.
CZOData Team Contacts: Anthony Aufdenkampe and Jeffery Horsburgh
Questions? Email the CZOData Project team
The new YAML Observations Data Archive and Exchange (YODA) file format was specifically designed by the CZOData team in collaboration with CZO data managers to substantially extend and replace the existing CZO Display File format, which was only capable of encoding hydrologic time series. YODA provides the capability to encode both sensor time series datasets and specimen-based laboratory datasets. In addition, the YODA File will meet the following requirements:
CZO investigators, data managers and data users will benefit from the following:
Details of the YODA file format and the associated Excel data entry templates can be found in the YODA-File GitHub source code repository: https://github.com/ODM2/YODA-File.
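In general, there are two workflows for creating YODA files: entering data into Excel templates, and generating files programmatically from an existing data system. Details are provided in the following sections.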
The CZOData Team has created two Microsoft Excel data entry templates that data managers can use to create YODA files: a Time Series YODA template and a Specimen YODA template. These templates provide pre-formatted tables into which CZO investigators and CZO data managers can paste or type metadata and data values for a particular dataset. ODM2 Controlled Vocabulary terms are available directly within the template files, so users can choose terms from pre-populated lists to fill in metadata fields rather than typing their own. Once data entry is complete, an automated script within the template file can be run to export a valid YODA file that can then be listed at CriticalZone.org.
We are currently finishing development of the Excel template files, but prototypes are available for download from the YODA-File GitHub repository: https://github.com/ODM2/YODA-File.
In some cases, data managers may need to generate large numbers of YODA files, or they may want to automate YODA file creation. YODA is a text file specification based on YAML (YAML Ain't Markup Language), a data serialization and interchange format that is a superset of JSON (JavaScript Object Notation). YAML can be readily generated or parsed in most modern programming languages using well-tested libraries (see the Projects list at http://yaml.org). One valid option for creating YODA files is therefore for CZO data managers to develop scripts or other code that interacts with their underlying data system to generate YODA files automatically. Data can then be managed in the system the CZO currently uses and exported to the YODA format for exchange with CZOData Central. For a CZO that adopts ODM2 as part of its underlying data management infrastructure, the CZOData Team is developing tools for exporting YODA files directly from ODM2 databases (see following section).
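As a minimal sketch of the scripted workflow, the Python snippet below serializes one record from a data system to a text file. It relies only on the point above that YAML is a superset of JSON: it uses the standard-library json module, so the output should also be readable by a YAML parser. In practice a dedicated YAML library would normally be used to write block-style YAML. The function name and every key in the example record are hypothetical placeholders, not the YODA schema, which is defined in the YODA-File repository.

```python
import json


def export_yaml_compatible(record, path):
    """Write a nested dict to `path` as JSON text, which is also valid YAML."""
    text = json.dumps(record, indent=2)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text + "\n")
    return text


# Hypothetical record pulled from a CZO's own data system.
# These keys are illustrative only and are NOT the YODA schema.
example_record = {
    "dataset_title": "Example stream temperature series",
    "variable": "temperature",
    "units": "degree Celsius",
    "values": [
        {"timestamp": "2015-01-01T00:00:00", "value": 4.2},
        {"timestamp": "2015-01-01T00:15:00", "value": 4.1},
    ],
}
```

Calling export_yaml_compatible(example_record, "example.yaml") writes the file and returns the serialized text. A production script would map the source system's fields onto the structure required by the YODA specification before writing.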
The CZOData Team is currently working on a set of Python-based tools for working with YODA files. These tools are being developed in an open-source GitHub repository: https://github.com/ODM2/YODA-Tools. Tools under development include a YODA file validator, a Python-based utility that will parse a YODA file and check that it is complete, conformant with ODM2 controlled vocabularies, and ready for posting at CriticalZone.org.
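The kinds of checks such a validator performs can be sketched in a few lines of Python. This is not the YODA-Tools implementation: the function name, the required field names, and the vocabulary terms below are all hypothetical stand-ins, and the input is assumed to be a document that has already been parsed into a dictionary.

```python
# Hypothetical required fields and controlled-vocabulary terms, for illustration
# only; the real lists come from the YODA specification and the ODM2 vocabularies.
REQUIRED_FIELDS = ("dataset_title", "variable", "units", "values")
CONTROLLED_TERMS = {
    "variable": {"temperature", "discharge"},
    "units": {"degree Celsius", "cubic meter per second"},
}
EMPTY = (None, "", [])


def validate_document(document):
    """Return a list of problems found in a parsed document (empty if none)."""
    problems = []
    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if document.get(field) in EMPTY:
            problems.append(f"missing required field: {field}")
    # Conformance: vocabulary-controlled fields must use a listed term.
    for field, allowed in CONTROLLED_TERMS.items():
        value = document.get(field)
        if value in EMPTY:
            continue  # already reported as missing, if required
        if not isinstance(value, str) or value not in allowed:
            problems.append(f"{field!r} uses unlisted term: {value!r}")
    return problems
```

Under these assumptions, a document would be considered ready for posting only when validate_document returns an empty list.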
Other relevant tools include code for parsing YODA files into a Python-based object structure for loading datasets into an ODM2 database. The same object structure can be used to query a dataset out of an ODM2 database for export to a YODA file. This Python-based object model is part of the ongoing development of an application programming interface (API) for ODM2 (see https://github.com/ODM2/ODM2PythonAPI). These tools may be very useful for data managers who are considering, or have already decided to use, ODM2 databases for managing their sensor-based or sample-based data.