SOST Interagency Ocean Observation Committee

Data Management and Communications

Steering Team

DRAFT SUMMARY REPORT

February 24, 2012

January 18-19, 2012

Consortium for Ocean Leadership

1201 New York Avenue, NW

Washington, D.C.

Table of Contents

1. Introduction

2. Presentations and Discussion

a. U.S. IOOS DMAC Update

b. Challenges Facing NASA Data Centers: A Scientist’s Perspective

c. Overview of OGC Web Services

d. Update on NSF’s OOI Cyber-infrastructure

e. What is QARTOD? – US IOOS QARTOD and Next Steps

f. NSF’s Earth Cube

3. Key Findings and Recommendations

4. Next Steps (action items)

5. Appendices

A. Meeting Attendees

B. Meeting Agenda

C. Presentations (PowerPoint slides)

D. DMAC ST Terms of Reference

SOST Interagency Ocean Observation Committee – DMAC Steering Team

Summary Report of January 18-19, 2012 Meeting

1. Introduction

This report summarizes the January 18-19, 2012 meeting of the IOOC’s Data Management and Communications Steering Team (ST). The report is organized into five sections: Introduction, Presentations and Discussion, Key Findings and Recommendations, Next Steps (action items), and Appendices, which include a list of attendees, the meeting agenda, copies of all the presentations, and the ST charter or terms of reference.

The ST is chartered by the Interagency Ocean Observation Committee (IOOC) to provide expert guidance and recommendations for the overall management and execution of the DMAC sub-system of US IOOS in accordance with the ICOOS Act. As such, the ST convenes at least twice annually to review and discuss priority topics and provide recommendations to the IOOC. These meetings, which include ST members, invited guests and non-Federal stakeholders, also contribute to information sharing and overall coordination of DMAC activities across the US IOOS enterprise. The ST chair is appointed by the IOOC. For more information about the ST, including current members, the ST “terms of reference” charter (TOR), and recent reports, visit the ST website at the Consortium for Ocean Leadership at:

The ST last met in May 2011. Therefore, following introductions, the current Acting Chair (US IOOS’s Charles Alexander) opened the meeting by reviewing key elements of the ST Terms of Reference (see Appendices), including the roles and responsibilities of ST members and other meeting attendees. He also briefly reviewed highlights from the May 2011 meeting and the agenda for the current meeting.

The ensuing discussion raised a number of useful and important questions about overall roles and responsibilities of the DMAC ST. Specific suggestions and recommendations for clarifying the ST role, and prioritizing its function and focus were also made. The ST returned to these issues on Day 2 with additional discussion, observations and suggestions.

Discussion

  • IOOC expectations/requirements: Besides the broad statements in the TOR, does the IOOC have specific questions or problems it has directed the ST to confront?

The IOOC has empowered the ST to identify priority topics for ST meetings and report findings and recommendations to the IOOC for consideration.

  • Disposition of ST meeting results: What does the IOOC do with ST inputs or recommendations? How are the findings and/or recommendations of the ST shared across the IOOC membership and are actions taken by IOOC membership based on these findings/recommendations? It is important to ensure the coordination across the many different existing forums including the IOOC.

Per the TOR, the purpose of the ST is to “Provide the IOOC with strategic guidance on DMAC-related activities and challenges.” The IOOC will give these inputs due consideration in the overall context of guiding the implementation of IOOS per the requirements of the ICOOS Act.

  • Practicality of certain clauses of the ST Terms of Reference: The TOR refers to the ST as convened to “solve” specific data management challenges. This seems ambitious for a day-and-a-half meeting. The ST also needs to agree on which issues are to be “solved”; doing so will help the ST identify suitable priorities and determine its focus.

The TOR acknowledges that the ST is a “lightly-resourced mechanism” that must “maintain realistic expectations”. However, the ST is also empowered, as necessary, to form ad-hoc “tiger teams” to help address and resolve specific challenges and deliver clear, actionable results. Priority topics for the January meeting were derived through a series of ST conference calls.

2. Presentations and Discussion

This section provides an overview of six technical topics included in the ST agenda. PowerPoint slides used by presenters (all topics except NSF’s Earth Cube) are provided in the Appendices. A short summary is provided for each topic, followed by an overview of the comments and questions discussed.

a. U.S. IOOS DMAC Update

Derrick Snowden, US IOOS Senior Systems Architect, provided a status report on US IOOS DMAC development, including regional and federal components and the basic drivers for execution: seven US IOOS societal goals and a single integrated data management system. A high-level overview of the traditional IOOS subsystems (modeling and analysis, DMAC, observing) and other cross-cutting subsystems (governance, research and development, and outreach and education) was followed by a more in-depth illustration of the notional DMAC architecture. Using OGC’s publish-find-bind reference model as a conceptual illustration (Figure 1), he reviewed progress on standard data access services and on tools for a data registry and catalog services. OGC’s Sensor Observation Service (SOS) is the focal point for in situ data types, and OPeNDAP and THREDDS are the focal point for gridded data types (though emerging refinements such as the “climate and forecast” (CF) conventions for metadata also improve the viability of THREDDS for in situ data types). US IOOS is facilitating a process to refine a candidate open source solution for in situ data based on SOS that will simplify DMAC configuration management through a consensus-based, adoptable encoding standard. Technical partners include DMAC managers from IOOS regions, NOAA (CO-OPS, NDBC, Chesapeake Bay Office), and USACE. Client applications and toolkits to be engaged in the near future include NSF’s OOI Cyber-infrastructure, the US IOOS Coastal Modeling Testbed, and the Environmental Data Connector, a data integration tool, supported in part by US IOOS funding, for connecting geospatial data via ArcGIS to THREDDS/OPeNDAP. Other emerging DMAC collaborations described included progress on improved access to and standardization of biological observations, initial work to define standards for ocean glider data, opportunities for sea surface temperature data via an IOOS-funded project on GHRSST/MISST, and QARTOD (Quality Assurance of Real Time Oceanographic Data).
Derrick closed the presentation by posing the question: IOOS DMAC is multifaceted; how best can we utilize the expertise of the ST?

Figure 1. Conceptual illustration of the “publish, find, bind” OGC reference model from the US IOOS DMAC Update.
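
The publish-find-bind pattern referenced in the DMAC Update can be illustrated with a small in-memory sketch. This is not US IOOS or OGC software: the `ServiceRecord` and `Registry` names, the record fields, and the example.org endpoints are all hypothetical, and the "bind" step only describes a connection rather than opening one.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServiceRecord:
    """Metadata a provider publishes about one data service (hypothetical fields)."""
    name: str
    protocol: str   # e.g. "SOS" for in situ data, "OPeNDAP" for gridded data
    data_type: str  # "in situ" or "gridded"
    endpoint: str   # placeholder URL, not a real service


@dataclass
class Registry:
    """A toy catalog standing in for a data registry and catalog service."""
    records: list = field(default_factory=list)

    def publish(self, record):
        # Publish: a provider registers a description of its service.
        self.records.append(record)

    def find(self, data_type):
        # Find: a consumer searches the catalog by data type.
        return [r for r in self.records if r.data_type == data_type]


def bind(record):
    # Bind: the consumer connects to the provider directly.
    # Here we only describe the connection; no network call is made.
    return f"{record.protocol} client -> {record.endpoint}"


registry = Registry()
registry.publish(ServiceRecord("buoy-obs", "SOS", "in situ",
                               "https://example.org/sos"))
registry.publish(ServiceRecord("model-grid", "OPeNDAP", "gridded",
                               "https://example.org/thredds/dodsC/model"))

matches = registry.find("in situ")
print(bind(matches[0]))
```

The point of the pattern is that the registry holds only descriptions; the data itself stays with the provider and is reached in the bind step.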

Discussion

  • How can the DMAC ST help US IOOS implement DMAC?
  • The biggest issue is determining where the integration is needed most and how to do it. This might involve censusing IOOS regions where significant integration with federal, state and local agencies is already happening, and across IOOC agencies such as those represented at the DMAC ST.
  • A user or community survey for feedback, such as the informal US IOOS survey on its initial web services (SOS GML), might be a useful approach.
  • Another option is working within an existing program such as the US IOOS Modeling Testbed though there are challenges there also.
  • Governance: coordinating/leveraging workflows across many existing and related data management advisory groups
  • Emerging data management issues from the NOC may be informative. For example:
      • A fairly mature taxonomy of data and common interest among agencies to catalog and deliver information more concisely via Data.gov is emerging; and
      • Governance issues could also perhaps provide some beneficial leveraging and multi-agency focus.
  • Answering the following governance-related questions will help determine priorities and guide the breadth and depth of ST recommendations:
      • Who is the target audience for ST recommendations?
      • Who is responsible for disposition of recommendations?
      • Is the ST required to coordinate with other DMAC-related initiatives and programs underway in the various federal agencies before making a recommendation?

b. Challenges Facing NASA Data Centers: A Scientist’s Perspective

This presentation continued a format agreed to at the May 2011 ST meeting, in which an IOOC member agency acts as a virtual meeting host by providing a window into data management and interoperability challenges at their agency, followed by an extended discussion on highlighted issues. USACE’s Jeff Lillycrop hosted the presentation and discussion in May 2011. NASA’s Michelle Gierach, Lead Project Scientist for the Physical Oceanography Distributed Active Archive Center (PO.DAAC) at NASA’s Jet Propulsion Laboratory, took on this role at the January 2012 meeting.

Michelle began with an overview of NASA’s Earth Science Data and Information System (ESDIS) Project and its 12 associated NASA data centers, including PO.DAAC, which provides data management, access services, and science support for NASA oceanographic satellite missions and projects via products that fall into six core parameters: gravity, ocean currents, sea surface salinity, ocean surface topography, ocean winds, and SST. Primary customers are ocean and climate researchers, but PO.DAAC also serves resource managers and the general public. The remainder of the presentation focused on the general data management challenges facing NASA’s data centers (Figure 2) and a more in-depth look at these challenges. For example, anonymous access to PO.DAAC has made it difficult to use a User Registration System for tracking use metrics and optimizing data use. And even if users can be better understood, accommodating their diversity (ranging from graduate students to emeritus scientists) is also difficult. NASA has similar challenges on data standards, where data providers and consumers are often not using compatible protocols, tools, and formats. NASA is making progress on this via an Earth Science Data System Standards Process Group, but there are still many examples of data for the same parameter that are presented to consumers with conflicting formats, metadata, and attribute conventions, and without basic documentation and “read” software. Proposed solutions include a one-stop shop for data pertaining to a particular discipline, improving data discovery facets, using a universal format (NetCDF, CF-compliant, ISO 19115), and inclusion of read software and documentation. The presentation closed with some observations on similarities between IOOS and NASA data management priorities (e.g. data interoperability among data providers, data users, and data services; standard data formats, metadata, and attribute conventions; ample data documentation and software; and reporting summary data usage statistics) and a set of DMAC questions to provoke discussion.

Figure 2. Data management challenges facing NASA data centers.

Discussion

In keeping with the format for this part of the meeting, an extended discussion followed, during which there were a number of questions, comments, and suggestions.

Questions:

  • What are the issues surrounding NASA protocols, similar to GEOSS? Are there efforts in the discovery realm? The NASA DAACs feed metadata into ECHO and GCMD, which in turn feed national (DataONE, IOOS) and international (GEOSS) metadata repositories.
  • How difficult is it to apply new standards to historic or archived data? The good news is that the NASA DAACs are long-term stewards of their data and that related NSF grants require archiving. How to apply the same standards to archived data is, however, unclear, though the difficulty of changing older data may be an incentive for adopting standards sooner. NOAA’s procedures for archiving are also available and should be useful.
  • How does NASA (and other data providers) decide where to start on archiving? A key finding at a recent NSF-sponsored conference on managing scientific data indicated that data stewards should prioritize data sets for archiving based on their value as shown by customer use or popularity (crowd sourcing). This approach can help establish priorities and ensure providers and users are not buried by archiving requirements.
  • How are NOAA and NASA coordinating? There are annual data manager meetings within NASA. Following discussions during the most recent OGC meeting, NOAA Silver Spring staff are also exploring how to engage NASA Goddard staff in quarterly meetings to discuss common challenges.
  • Other questions:
      • How can standards actually be enforced?
      • Are there specific requirements on the documentation and read software?
      • Can NASA better define or hone the DAACs’ target audience to focus DMAC recommendations?
      • For data standards, the attribute conventions contain 44 attributes to fill out. If all of those attributes were filled out, would there be any other problems?
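
The question about filling out attribute conventions can be made concrete with a simple completeness check. The sketch below is illustrative only: the short `REQUIRED_ATTRIBUTES` list is an assumed subset chosen for the example, not the 44 attributes discussed at the meeting, and a real check would read the global attributes from a data file rather than from a hand-written dictionary.

```python
# Illustrative subset of global metadata attributes (an assumption for this
# sketch; the full attribute convention referenced in the discussion is longer).
REQUIRED_ATTRIBUTES = ["title", "summary", "Conventions", "institution", "source"]


def missing_attributes(global_attrs, required=REQUIRED_ATTRIBUTES):
    """Return the required attribute names that are absent or blank."""
    missing = []
    for name in required:
        value = global_attrs.get(name)
        # Treat whitespace-only values as missing: they carry no information.
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# A hypothetical data set description with some attributes left unfilled.
example = {
    "title": "Sea surface temperature",
    "Conventions": "CF-1.6",
    "summary": "  ",
}

print(missing_attributes(example))
```

A check like this only confirms that attributes are present; it says nothing about whether the values are correct or mutually consistent, which is the harder problem the question points at.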

Comments/Suggestions:

  • Shared challenges:
      • Meeting participants noted that there is clearly a strong convergence across agencies and data providers on NASA’s list of challenges. It would be useful to identify, as a starting point, the one area that would best advance IOOS and the use of and access to data.
      • There is definitely some very good news on commonalities and convergence; we are very close to really useful solutions.
      • Cyber security could perhaps be another issue to add to NASA’s and others’ lists of challenges.
      • With effective data services and education, a lot of the problems can be solved. For example, for the CF terms dictionary, WHOI has developed their own terms and built semantic capabilities that can be widely shared.
  • User Registration and Access:
      • The ocean color community has something similar to NASA’s URS to refine metrics on use and users.
      • Google Analytics is also a very good tool to determine how to extrapolate use patterns out to other communities. Developing a common system to track and analyze use patterns may enable building of new tools or applications that enhance access and use.
      • OAuth is complementary to the OpenID standard and can support authentication upon a request for access.
      • It might be more efficient to simply enable client-appropriate services for users with appropriate metadata rather than studying/documenting user diversity.
  • Standards and APIs: Suggestion to decouple standards for data representation/access from related standards for APIs, as there are perhaps too many variations to track. Alternatively, APIs can help reveal problems with data standards and associated services. If the agencies are publishing data in ways that facilitate use by existing or emerging APIs, they may be improving user access at a lower cost.
  • Lessons learned from other programs:
      • NOAA CO-OPS and PORTS have perhaps some very useful lessons learned on ensuring user access to real-time observations.
      • Discussions on Coastal and Marine Spatial Planning (CMSP), a NOC priority, reveal that there is a broad demand for ocean observation data and products in usable, accessible formats and that there are many associated challenges in accessing and mobilizing existing priority data sets. Significant CMSP funds are being invested in data management, and there should be opportunities for useful leveraging of shared objectives. Effective communication is crucial for success here.
  • There is a triad of closely related needs to improve data use and access: data format, metadata, and web services. All of these are needed, and services are particularly important.

c. Overview of Open Geospatial Consortium (OGC) Web Services and Assessment of Their Use Among Meeting Participants

Luis Bermudez, OGC Director of Interoperability Certification, gave a detailed presentation summarizing the OGC’s mission, describing the process for establishing community-based tools and technologies such as web services, and reviewing the function and status of OGC’s current catalog of web service specifications (Figure 3). OGC provides a global forum for collaboration on spatial data and products. It is a non-profit, voluntary organization that facilitates the consensus-based development of standard tools and technologies. Founded in 1994, OGC currently has 438 member organizations and 35 implementation standards, and represents a broad user community worldwide.
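
As a minimal sketch of how a client typically starts working with an OGC web service, the example below builds a GetCapabilities request URL using the standard `service` and `request` key-value parameters. The helper name `get_capabilities_url` and the example.org endpoint are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode


def get_capabilities_url(base_url, service):
    """Build a key-value GetCapabilities URL for an OGC service such as SOS or WMS."""
    query = urlencode({"service": service, "request": "GetCapabilities"})
    # Append to an existing query string if the endpoint already has one.
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{query}"


# Placeholder endpoint; substitute a real service address to use it.
print(get_capabilities_url("https://example.org/ows", "SOS"))
```

The capabilities document returned by such a request describes what the service offers, so a client can discover available data before asking for any of it.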

Luis also conducted an informal survey of the ST meeting participants’ current use and understanding of OGC web services. He received 14 responses:

  1. NASA JPL (Gierach)
  2. IMOS (Proctor)
  3. ASA (Howlett)
  4. OBIS (Berghe)
  5. EDAC/ESIP (Bennedict)
  6. USGS (Signell)
  7. FL/FWC (O’Keife)
  8. WHOI (Fredericks)
  9. US IOOS (Snowden)
  10. MD/DNR (Trice)
  11. NOAA/NODC (Arzayus)
  12. BOEING (Uczekaj)
  13. SCCOOS (Thomas)
  14. NERACOOS (Bridger)

The survey included the following questions: