Data Sharing Guidelines Working Group (DSGWG)
Engineering Report
GEOSS Architecture Implementation Pilot
Phase 6
Version 0.98
Content developed by the GEO Architecture Implementation Pilot
Licensed under a Creative Commons Attribution 3.0 License
GEO Architecture Implementation Pilot, Phase 6 / Version: 0.98
Data Sharing Guidelines Working Group Engineering Report / Date: 2014/05/05
Revision History
Version / Date / Editor and Content providers / Comments
0.1 / 2013-DEC-10 / Original template.
0.5 / 2013-DEC-28 / Steven F. Browdy / Initial content taken from Twiki and notes.
0.6 / 2014-FEB-26 / Steven F. Browdy / Expanded content from notes and slides.
0.7 / 2014-MAR-08 / Steven F. Browdy / Expanded content from notes and slides.
0.8 / 2014-APR-02 / Steven F. Browdy / Expanded content from notes and slides.
0.9 / 2014-APR-15 / Steven F. Browdy / Expanded content from notes and slides.
0.95 / 2014-APR-28 / Steven F. Browdy / Near final draft.
0.97 / 2014-MAY-04 / Steven F. Browdy / Added use metrics and licenses details.
0.98 / 2014-MAY-05 / Steven F. Browdy / Finalized use metrics and license details.
Document Contact Information
If you have questions or comments regarding this document, you can contact:
Name / Organization / Contact Information
Steven F. Browdy / OMS Tech, Inc. / IEEE /
Table of Contents
1. Introduction
1.1 Scope of this document
1.2 Activity key drivers
1.3 Summary of AIP-6 efforts
1.3.1 Results Summary
1.3.2 Recommendations Summary
1.4 Future work
2. Activity Background and Objectives
2.1 Authentication and SSO
2.1.1 Background
2.1.2 Objectives
2.2 Use Metrics
2.2.1 Background
2.2.2 Objectives
2.3 GEOSS Data-CORE Compatible Licenses
2.3.1 Background
2.3.2 Objectives
3. Authentication and Single Sign-On
3.1 Summary
3.2 Use Cases
3.3 Trust Gateways
3.4 Test Implementations
3.5 Results and Next Steps (conclusions and recommendations)
4. Use Metrics
4.1 Summary
4.2 Use Cases
4.3 Implementation Scenarios and Web Services
4.4 Test Implementations
4.5 Results and Next Steps (conclusions and recommendations)
5. GEOSS Data-CORE Compatible Licenses
5.1 Summary
5.2 Use of Metadata Standards and Metadata Fields
5.3 Use Cases
5.3.1
5.4 Test Implementations
5.5 Results and Next Steps (conclusions and recommendations)
6. Use Cases
6.1 AIP Engineering Use Cases
7. Implementation
7.1 Deployed Components
7.2 Interoperability Arrangements
7.3 Use of the GCI
7.4 Demonstration
7.4.1 Detailed Storyboard
7.4.2 Demo video
7.5 Future plans for deployment
8. References
List of Figures
Figure 1: Prototype SSO Federation for GEOSS
Figure 2: Use Metrics Component Deployment Strategy
Figure 3 – GEOSS AIP Use Case Summary Diagram
List of Tables
Table 1: AIP-6 DSGWG Contributors
Table 4 – GEOSS Actors
Table 5 – Publish Resources Use Cases
Table 6 – Discover Resources Use Cases
Table 7 – Visualize and Access Use Cases
Table 8 – Process and Automate Use Cases
Table 9 – Maintain and Support Use Cases
Data Sharing Guidelines Working Group
1. Introduction
1.1 Scope of this document
The AIP-6 Data Sharing Guidelines Working Group (DSGWG) Engineering Report (ER) provides information related to the research and development of three main areas of effort:
- Authentication and Single Sign-On
- Use Metrics
- GEOSS Data-CORE Compatible Licenses
For each of these areas of interest, the research was used to focus the effort and produce results that can be made operational in the GCI at some point in the near future. The scope of Authentication and Single Sign-On was restricted to authentication only, but may include authorization in the future. The scope of Use Metrics was restricted to those metrics that do not, in any way, conflict with or jeopardize the privacy of GEOSS data users or their organizations. The scope of GEOSS Data-CORE Compatible Licenses was restricted to only those open access licenses and waivers identified by the GEO Data Sharing Working Group (DSWG) and approved by the GEO Plenary.
1.2 Activity key drivers
The key driver for Authentication and Single Sign-On is to allow GEOSS Data Providers that require registration and login to participate in GEOSS without putting an undue burden on the GEOSS Data Users to register multiple times and execute the login process repeatedly when trying to discover and access GEOSS resources. Therefore, the goal of having a GEOSS-wide federation for single sign-on was established.
The key driver for Use Metrics is to gather information about GEOSS resources being discovered and accessed so that GEOSS Data Providers and GEO can gain knowledge as to the use and implied value of GEOSS resources. These metrics are viewed as essential feedback to the value of GEOSS.
The key driver for GEOSS Data-CORE Compatible Licenses is to support the GEOSS Data Providers that may have licenses associated with their resources, and who wish to have that licensing information be available to GEOSS users when they discover the licensed resources. The license information should be part of the metadata so that redistributions of the resources with the metadata will also make the licensing information available to the recipients of the redistributed resources.
1.3 Summary of AIP-6 efforts
The AIP-6 Data Sharing Guidelines Working Group (DSGWG) is responsible for the efforts related to User Authentication and Single Sign-On (SSO), Use Metrics, and GEOSS Data-CORE Compatible Licenses. These topics have each been identified as priority actions for 2013 by the GEO Infrastructure Implementation Board. Additionally, the Authentication and SSO effort, as well as the Use Metrics effort, have been identified as 2013 GEO Summit priorities. In addition to this AIP Engineering Report, the efforts of the AIP-6 DSGWG will result in contributions to data user and data provider guidelines and tutorials. These guidelines and tutorials will ultimately be published on the GEOSS Best Practices Wiki.
The work engaged in during AIP-6 enjoyed contributions by many individuals on behalf of many organizations. These efforts are captured in Table 1.
Table 1: AIP-6 DSGWG Contributors
Name / Area of Contribution / Affiliation
Steven F. Browdy / SSO, Metrics, Licenses / OMS Tech, Inc.; IEEE
Andreas Matheus / SSO / Secure Dynamics; EC COBWEB Project
GEO Data Sharing Working Group (DSWG) / Metrics, Licenses / GEO DSWG
Siri Jodha Singh Khalsa / Licenses / NSIDC, GEO SIF
Alva Couch / SSO / Tufts University, CUAHSI
In particular, the AIP-6 DSGWG offers a special thanks to the European Commission’s FP-7 funded COBWEB project. COBWEB contributed greatly to the Authentication and SSO effort by establishing a proof-of-concept SSO federation and by assisting others in deploying the software necessary to join the COBWEB federation.
1.3.1 Results Summary
1.3.1.1 Authentication and SSO
The primary result is the establishment and demonstration of an SSO federation as a proof of concept. This federation was established by the COBWEB project and included both COBWEB participants and some outside participants. It was established as a SAML-2 federation that accepts certain OpenID visitors from outside the SAML-2 federation. This is accomplished via a trusted gateway that takes trusted OpenID visitors and allows them to be seen as SAML-2 users within the SAML-2 federation.
Of all the use cases to be realized for Authentication and SSO, most have been implemented. The use cases, identified for AIP-6, that still need to be implemented are:
- Identification as "GEOSS User" During Registration
- OpenID-Protected Data Access via SAML2 Authentication
1.3.1.2 Use Metrics
The use cases for this effort were written during AIP-6 and have been reviewed by the GEO DSWG. The DSWG also provided input regarding the specific metrics themselves. To protect privacy, the agreed-upon metrics do not capture any information that identifies the individual GEOSS user or the user's organization. The fields of information to be collected are:
- Data provider name
- Dataset accessed
- Date/time of access
- "Tags" associated with the dataset (geossDataCore, geossAttribution, etc.)
- License used for the dataset (N/A if none was used)
- Cost to the GEOSS user for the dataset
- Optional additional information
None of the use cases for Use Metrics were implemented during AIP-6 because no AIP-6 participants volunteered to implement them.
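The metric fields listed above could be carried as a simple record. The following is a minimal sketch only, not an implementation from AIP-6: the class name, the field names, and the representation of cost as optional free text are all assumptions made for illustration. The one property it tries to reflect from the text is that the record has no field for a user or organization identity.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class UseMetricRecord:
    """One access event. Hypothetical names; no user or organization fields by design."""
    data_provider: str                     # Data provider name
    dataset: str                           # Dataset accessed
    accessed_at: datetime                  # Date/time of access
    tags: tuple = ()                       # e.g. ("geossDataCore", "geossAttribution")
    license_name: str = "N/A"              # "N/A" if no license was used
    cost: Optional[str] = None             # Cost to the GEOSS user (format is an assumption)
    additional_info: Optional[str] = None  # Optional additional information

    def to_dict(self) -> dict:
        """Return a plain dict with the timestamp as ISO 8601 text."""
        record = asdict(self)
        record["accessed_at"] = self.accessed_at.isoformat()
        record["tags"] = list(self.tags)
        return record


# Illustrative values only; the provider and dataset names are made up.
example = UseMetricRecord(
    data_provider="Example Provider",
    dataset="example-dataset-001",
    accessed_at=datetime(2014, 5, 5, 12, 0, tzinfo=timezone.utc),
    tags=("geossDataCore",),
)
```

Keeping identity out of the record type itself, rather than filtering it out later, is one way to make the privacy restriction hard to violate by accident.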
1.3.1.3 GEOSS Data-CORE Compatible Licenses
The licenses addressed during AIP-6 were identified for use by the DSWG and approved by the GEO Plenary. The licenses available to be used are just for GEOSS Data-CORE resources. This is because the GEOSS Data-CORE consists only of registered resources that have no restrictions on use, and the licenses and waivers currently accepted for use within GEOSS are targeted at open access, where no restrictions are allowed. The DSWG is currently working on a broader set of licenses that will accommodate more GEOSS Data Providers, specifically those that do not satisfy the requirements for the GEOSS Data-CORE.
The license fields currently defined to carry the licensing information about the resource will also be used for a broader set of licenses, but more fields may be added. These fields are:
- License/waiver name - this will be a text field containing the common name of the license or waiver (CC0, CC-BY, etc.).
- License pointer - this will be a URL field that links to the actual license or waiver.
- License logo - this will be a URL field that links to a graphical logo for the license or waiver.
- Attribution text - this will be a text field that contains the actual words to be used by users/applications for attribution.
The defined license fields need to be realized as metadata fields for the GEOSS Data Provider to populate. This will allow the licensing information to be exposed to GEOSS users visually, as well as to programs that automate metadata processing. The metadata standards that currently have the above license fields mapped to them are ISO-19115 and Dublin Core. There are more metadata standards that still need to be addressed.
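As an illustration of how a program that automates metadata processing might hold the four license fields, the sketch below models them as a record with a basic sanity check on the two URL fields. The class name, attribute names, and the `problems` helper are hypothetical, and the sketch does not show the ISO-19115 or Dublin Core mappings, which are not reproduced here.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


def _is_http_url(value: str) -> bool:
    """True if value parses as an http(s) URL with a host part."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


@dataclass(frozen=True)
class LicenseInfo:
    """The four license fields described above (attribute names are assumptions)."""
    name: str              # License/waiver name, e.g. "CC0" or "CC-BY"
    pointer_url: str       # URL of the actual license or waiver
    logo_url: str          # URL of a graphical logo
    attribution_text: str  # Exact words to use for attribution

    def problems(self) -> list:
        """Return a list of human-readable issues; empty means the record looks usable."""
        issues = []
        if not self.name.strip():
            issues.append("name is empty")
        for label, url in (("pointer_url", self.pointer_url), ("logo_url", self.logo_url)):
            if not _is_http_url(url):
                issues.append(f"{label} is not an http(s) URL")
        return issues


# Illustrative record; the logo URL and attribution wording are placeholders.
cc_by = LicenseInfo(
    name="CC-BY",
    pointer_url="https://creativecommons.org/licenses/by/3.0/",
    logo_url="https://example.org/cc-by-logo.png",
    attribution_text="Data provided by Example Provider",
)
```

The attribution text is deliberately not validated, since a waiver such as CC0 may have nothing to attribute.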
1.3.2 Recommendations Summary
The following key recommendations are based upon the AIP-6 work completed by the DSGWG.
1.4 Future work
Major efforts always produce both successes and challenges. AIP-6 was successful at maintaining a high level of participation in the plenary telecons. It also exhibited good leadership for all working groups and good overall oversight. However, for some working groups, it was difficult to keep individual members engaged in their voluntary contributions, especially after the summer. It is also imperative to improve the education and outreach associated with AIP results. Most observers at the AIP results meeting during each annual Plenary week are AIP participants, whereas a good number of them ought to be people not involved in AIP. It is through the non-AIP observers that AIP results could more easily spread throughout all of the GEO tasks, committees, task forces, and other ad-hoc groups.
2. Activity Background and Objectives
The GEO Data Sharing Working Group (DSWG), previously known as the Data Sharing Task Force, has developed a Data Sharing Action Plan [1] and a set of Implementation Guidelines [2] that reflect the GEOSS Data Sharing Principles [3]. These guidelines cover many areas. Those areas suggested as 2013 priorities by the Infrastructure Implementation Board, and to be covered in AIP-6, include authentication and single sign-on (SSO), GEOSS Data-CORE [4] compatible licenses for access and use of data, and data access and use metrics. These areas of effort made initial progress in AIP-5, but were not completed.
Although GEOSS strongly encourages full and open exchange [5] of data, there are inevitable instances where data providers will require user registration and adherence to certain data access conditions. User registration will involve user identification, authentication, and SSO, while licensing will involve those licenses compatible with the GEOSS Data-CORE. Data providers may require that the data accessed by data users be used in only certain ways. It is important to understand that for data made available through the GEOSS, and associated with licenses supported by the GEOSS Common Infrastructure (GCI), these licenses apply to, and travel with, the data only, not the service that provided the data.
The implementation of authentication and SSO, GEOSS Data-CORE compatible licenses, and data use metrics must satisfy the interoperability framework that is realized by the GCI. This will be captured in the Use Cases for each activity.
2.1 Authentication and SSO
2.1.1 Background
Although GEOSS strongly encourages full and open access to, and sharing of, data, there are inevitable instances where data providers will require user registration and login. This is the initial step in being able to perform authentication, authorization, and single sign-on (SSO). User registration is a way for data providers to accomplish two main goals: 1) control access to data, and 2) record information regarding the use of the data. In both cases, the mechanism used can be applied once or applied each time data access is requested. Since one of the goals of the GEOSS is to provide full and open exchange of data, minimizing the impact on data users to access and use data is a primary objective. A focus of AIP efforts is to have user registration required once, resulting in some kind of digital identification to be used repeatedly by the registered data user in a SSO scenario. This strategy may have an expiration date associated with the digital identification, requiring some sort of renewal process.
Within the GEOSS, SSO is an authentication model that allows a user to supply an identity token only once to successfully log in to one or more services. To realize this authentication model, the GEOSS Web Portal or client application must be able to pass along an identity token for authentication by a GEOSS data provider, and GEOSS data providers must recognize when an identity token is received from another GEOSS data provider and honor it. This requires that data users acquire an identity token and that data providers implement the means to perform authentication and pass the identity token to other GEOSS services, if necessary, to fulfill a GEOSS data request.
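The token-passing model described above can be illustrated with a toy sketch. It is not how OpenID or SAML-2 work internally: a single shared HMAC key stands in for the federation's trust relationships, and every name in it (`FEDERATION_KEY`, `issue_token`, `verify_token`, the `sub` and `exp` claims) is an assumption made for illustration. It shows only the idea that a token issued once, with an expiration, can be honored by any data provider that trusts the issuer.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical shared secret for demonstration; real federations rely on
# certificates and signed assertions rather than one shared key.
FEDERATION_KEY = b"demo-only-shared-secret"


def issue_token(user_id: str, expires_at: int) -> str:
    """Issue a signed identity token (done once, at login)."""
    payload = json.dumps({"sub": user_id, "exp": expires_at}, sort_keys=True).encode()
    signature = hmac.new(FEDERATION_KEY, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "." + base64.urlsafe_b64encode(signature).decode())


def verify_token(token: str, now: int):
    """Return the user id if the token is authentic and unexpired, else None.

    Any data provider in the federation can call this on a token that was
    passed along by the portal, a client, or another provider.
    """
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(FEDERATION_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    claims = json.loads(payload)
    if now >= claims["exp"]:
        return None  # expired: the user must renew the digital identification
    return claims["sub"]


# One login, then the same token is presented to two different providers.
token = issue_token("user-123", expires_at=2000)
provider_a_sees = verify_token(token, now=1000)
provider_b_sees = verify_token(token, now=1500)
```

Both providers accept the same token without the user logging in again, and both reject it once the expiration time has passed.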
The intended management and implementation of GEOSS user authentication is via single sign-on. Some examples of open source solutions for this include OpenID and SAML-2. Suggestions for user registration, authentication, and SSO, as expressed by the GEO DSWG include:
- Impose as light an impact as possible on the data providers and the GCI.
- GEOSS users should utilize an ID service, e.g. OpenID or Google.
- GEOSS users should be able to be identified as a “GEOSS User.”
- The GCI should not be responsible for holding or managing any personal user information.
Previous AIP efforts concluded that Shibboleth was too heavy a burden on data providers, so AIP-5 focused on OpenID only. OAuth was dropped from consideration since the focus was on authentication and not authorization. In AIP-6, in large part due to the participation of the COBWEB project, SAML-2 was added to the implementation plans for a GEOSS authentication and SSO solution. The GEOSS solution for authentication and SSO will be realized as a GEOSS-wide federation.
OpenID is an open and decentralized framework for authenticating users with the same digital identity on different web sites. There are many organizations offering OpenID provider services, but the ones that have been accepted as trusted services by the U.S. Government are Google and Verisign. Since the European Commission’s INSPIRE directive takes no position on trusted OpenID services, the current decision is to use the U.S. Government’s list as the basis for the GEOSS authentication and SSO solution.
Security Assertion Markup Language 2.0 (SAML-2) is a version of the SAML OASIS standard for exchanging authentication and authorization data between security domains. SAML-2 is an XML-based protocol that uses security tokens containing assertions to pass information about a principal (usually an end user) between a SAML authority, that is an identity provider, and a web service, that is a service provider. SAML-2 enables web-based authentication and authorization scenarios, including SSO.
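To make the notion of an assertion about a principal concrete, the sketch below reads the issuer and the subject name from a heavily simplified SAML-2 assertion using the Python standard library. The assertion content (identity provider URL, user name, ID) is invented for illustration, and the sketch deliberately omits everything a real service provider must do, in particular XML signature validation and condition checks; it should not be used to make trust decisions.

```python
import xml.etree.ElementTree as ET

# Namespace of SAML 2.0 assertions as defined by the OASIS standard.
SAML_NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}

# A minimal, unsigned example assertion with made-up values.
EXAMPLE_ASSERTION = """
<saml:Assertion xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
                ID="_example" Version="2.0" IssueInstant="2014-05-05T12:00:00Z">
  <saml:Issuer>https://idp.example.org</saml:Issuer>
  <saml:Subject>
    <saml:NameID>user@example.org</saml:NameID>
  </saml:Subject>
</saml:Assertion>
"""


def read_assertion(xml_text: str) -> dict:
    """Extract the identity provider and the principal from an assertion.

    No signature or validity checks are performed here.
    """
    root = ET.fromstring(xml_text.strip())
    return {
        "issuer": root.findtext("saml:Issuer", namespaces=SAML_NS),
        "principal": root.findtext("saml:Subject/saml:NameID", namespaces=SAML_NS),
    }


summary = read_assertion(EXAMPLE_ASSERTION)
```

Here the issuer plays the role of the identity provider and the code reading the assertion plays the role of the service provider.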
2.1.2 Objectives
The objectives for AIP-6 with regard to user authentication and SSO are:
- Finalize the use cases for a GEOSS-wide SSO federation based upon OpenID and SAML-2 authentication mechanisms.
- Establish a prototype SSO federation with the COBWEB project and other participants, demonstrating the developed use cases.
- Generate appropriate documentation to assist data providers in implementing SSO.
- Generate tutorial documentation for user registration, authentication, and SSO for placement in the Best Practices Wiki and distribution to data providers and data users.
- Produce a demonstration video, utilizing a prototype federation, exhibiting the feasibility of GEOSS SSO.
2.2 Use Metrics
2.2.1 Background
The Data Sharing Principles Implementation Guidelines address use metrics. The GEOSS use metrics are agreed-upon pieces of information that are collected during data discovery and access. This information can be provided by GEOSS Data Providers and by GCI components, such as the GEOSS Web Portal and the GEOSS Discovery and Access Broker, and centers around the particulars of what data is being accessed by GEOSS users. The metrics collected should be able to be enhanced from time to time, and should not include any user IDs, or any other information that compromises privacy for GEOSS users or their organizations.
It is suggested that data providers send the acquired metrics to the GCI. The actual metrics requested for the GCI, and the mechanisms employed by the data providers to gather and provide the recommended metrics to the GCI, were initially addressed in use cases considered during AIP-5. The list of metrics to be gathered has been reviewed by the DSWG for appropriateness.
In order to differentiate between public access to a data provider’s resources and GEOSS user access to those resources, GEOSS users should be able to be identified as an official “GEOSS User.” The question then becomes, what is a GEOSS user? Since GEOSS users can access data from a data provider without utilizing the GCI, and in those situations they are just like the public, there are two ways in which a data provider can recognize a “GEOSS User.” One way is for the data provider to recognize a request for data via the GCI, e.g. the GEO DAB. The second way applies if the data provider requires login, in which case the user can be identified as a “GEOSS User” via the GEOSS SSO federation. In the course of recognizing a user as a “GEOSS User,” the GCI should not be responsible for holding or managing any personal user information.
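The two recognition paths described above can be sketched as a small decision function. The names `AccessRequest`, `arrived_via_gci`, `federation_login`, and `geoss_user_basis` are hypothetical, and how a provider actually detects either condition is outside this sketch. Note that the request carries no personal user information, only two flags.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AccessRequest:
    """What a data provider knows about a request (hypothetical shape)."""
    arrived_via_gci: bool = False   # request came through a GCI component, e.g. the GEO DAB
    federation_login: bool = False  # user logged in through the GEOSS SSO federation


def geoss_user_basis(request: AccessRequest) -> Optional[str]:
    """Return why the requester counts as a "GEOSS User", or None for public access."""
    if request.arrived_via_gci:
        return "gci"
    if request.federation_login:
        return "sso-federation"
    return None  # indistinguishable from the general public


via_dab = geoss_user_basis(AccessRequest(arrived_via_gci=True))
via_login = geoss_user_basis(AccessRequest(federation_login=True))
public = geoss_user_basis(AccessRequest())
```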