Collaborative Session Notes 2017 NOAA EDM Workshop

Session 5C:
NOAA Big Data Project

Primary note taker: Tyler Christensen

Notes:

James Stevenson - IBM

Jed Sundwall - Amazon

Andy Bailey - BDP and NWS

Ed Kearns - BDP

Mohan Ramamurthy (Unidata Director) described the relationship of Unidata to the weather community and its interest in the BDP, primarily via the non-profit partner at the Univ of Chicago.

NWS employee on detail to Big Data Project (Andy Bailey).

Proposes addition of NOAAPort, MADIS, NDFD as collections to be made accessible via BDP partners.

Discussion / Q&A

Schweitzer (consulting company) -- how does the community participate in choosing the datasets that are released? Is there a way for folks to request datasets? He had to do the work himself to download, process, and tile a dataset. Kearns-- approach the project to talk to collaborators, then they would request the dataset. Schweitzer-- last year, asked for a dataset but didn’t

Erin Robinson, ESIP Federation-- loves the efficiency that comes with sharing the work on the “undifferentiated heavy lifting.” How do the BDP partners work together? Kearns-- four competitors, are there barriers and obstacles? Sundwall-- We are all moving very fast, can see the work that others are doing. Ultimately, we want to work together, but that hinders our ability to move quickly. All of us are offering some pro bono services, so if attorneys get involved when working with other corporate partners, that slows things down. Stevenson-- IBM is open to collaboration, but it should be based on use cases. We do work with AWS on some corporate services. Would need a use case to support why it’s good to work together. Kearns-- NOAA’s role is to make the introductions to ALL of the partners.

Helen Wood, NESDIS. Supporting the systems engineering part of NESDIS. Curious about any concerns about potential liability, of using public data and maintaining its integrity. What happens if a user incurs a loss as a result of using data that might not reflect back to the original government source? Stevenson-- we haven’t gotten to that problem yet; that is a sophisticated use case that drives outcomes for clients. Normally approach this through contracts with clients, liability clauses are part of normal business function. Could be some unique challenges for working with public data-- haven’t reached this point yet, but could happen pretty easily. Sundwall-- data that AWS provides now is released as-is. Data providers OWN the data buckets and the liability. Terms of use and SLAs are part of the business agreement.

Dave Neufeld, NCEI. Question for Ed Kearns, exciting project and good opportunities for novel research. How does NOAA plan to make the environment available back to NOAA researchers-- providing space for analysis? Kearns-- separate issue being pursued by OCIO, existing agreements allow NOAA to purchase cloud space. Processing and analysis are not part of the project, this is focused on releasing data.

Question about the CRADA. When the agreement was drafted, did that go through general counsel or the partnerships office? Kearns-- took months, several reviews signed by l

Marten Hogeweg, ESRI. Lots of effort spent on making data discoverable. Do NOAA entries in data catalogs link to cloud sources? Kearns-- NEXRAD metadata on data.gov does have links to AWS and OCC. Result of NCEI making sure links are there.

Christina Horvath, radar ops in Norman OK. Plan NEXRAD changes years in advance. Plans to add Level 3 data to BDP, is that still planned? Kearns-- in early days of the BDP, Level 3 was part of the conversation. Pause in the discussion once Level 2 was released. Multi-Radar/Multi-Sensor dataset might be the next focus, known demand but IP issues. No business case was made to release the Level 3 data.

Question from remote attendee: Are there any projections of how much NOAA data can be successfully exploited by big data providers? Like anything, not all data can be easily monetized, so based on that how are the different BDPs approaching the prioritization of data acquisition?

Answer (provided online only): Everything is on the table -- but how many of the data make "business sense" in this framework is an ongoing discovery. It has been much harder to find applications and movement for ocean data than for weather data, for example. So this is varying, partner by partner and application by application....