NERP Data and Information Management Guidelines v.1

Contents

Introduction

Definitions

Guidelines

Guidelines applicable to all research output types

Websites

Digital Repositories

Licensing

Metadata

Data management plans

Digital object identifiers

Product type specific expectations

Publications

Data

Spatial data

Grey Literature

Images, photographs and videos

Attachment A: Guiding principles for an open access approach to the outputs of publicly-funded research

Attachment B: Recent international and national open access developments

Attachment C: Stages of the publication process

Attachment D: Internet tools to check journal licensing arrangements

Attachment E: Suggestions for best practice publication and management of grey literature

Introduction

This document provides guidance on the expectations of the Department of the Environment about making outputs from the National Environmental Research Program (NERP) ‘publicly and freely accessible and available to government, end-users and the general public, preferably by electronic means’ (as required by the NERP Programme Guidelines).

This document follows on from and replaces the Interim Guidance on Data and Information for the National Environmental Research Program (September 30, 2011) (Interim Guidelines) and provides the completed guidelines for data and information management in the NERP. This document is to be read in conjunction with the National Environmental Research Program Guidelines (NERP Guidelines) and any relevant NERP funding agreements.

Government agencies and research communities have identified the need to promote open access to public sector and publicly funded information. The Australian Government's position is that information funded by the Government is a national resource that should be managed for public purposes. Open access to Government funded information is the default position of the Department with exception only for privacy, security or confidentiality reasons.

These guidelines are consistent with national and international principles and practices. Attachment A provides a distilled set of principles based on current international best practice that underpin these guidelines. In the last five years there have been major developments in making government and research information free and publicly accessible. Attachment B outlines the key developments in Australia and overseas.

Closer to home, the Department of the Environment Information Strategy 2013-2017 states:

‘Open access to information is the default position of the department, with exception only if required for privacy, security or confidentiality reasons.’

The Australian Government Digital Continuity Principle 4 encapsulates effective open access, where digital information is discoverable, accessible and usable. ‘Digital information is discoverable when it can be easily found. It is accessible when it can be easily retrieved and read in context and it is usable when it can be easily evaluated or understood, edited, updated, shared and reused as appropriate by those who need it’ (Digital Continuity Principles, National Archives of Australia).

Providing open access to the data and information products derived under the NERP will provide up-to-date, high quality data and information to decision-makers, environmental managers, other scientists and the general public. This will increase the opportunity to take a more collaborative, informed approach to dealing with and managing Australia's environment.

Definitions

‘Commercial’ means primarily intended for or directed towards commercial advantage or private monetary compensation. The exchange of the Work for other copyright works by means of digital file-sharing or otherwise shall not be considered to be Commercial, provided there is no payment of any monetary compensation in connection with the exchange of copyright works (Source: AusGOAL Creative Commons NonCommercial 3.0 Australia Licence, 18 September 2014).

‘Data’ are individual pieces of information.

‘Grey literature’ is literature produced and disseminated outside of commercial publishing. In the NERP context this includes fact sheets, project profiles and reports.

‘Metadata’ is contextual information that supports:

·  discovery

·  assessment

·  access

·  re-use

·  verification

·  integration, synthesis, aggregation, etc.

·  curation.

For the purposes of this document, information products include publications, reports, data, software, models, algorithms, metadata etc., with the focus here on the NERP outputs required to validate the research outcomes. Throughout this document these products will be referred to as ‘information’.

‘Openly available’ and ‘open access’ refer to the making of information available at minimal cost under licensing terms and in formats that allow users to re-purpose the information from its original form. This is consistent with the Australian Government Principles on open public sector information developed by the Office of the Australian Information Commissioner.

‘Open format’ is a specification for storing and manipulating content that is usually maintained by a standards organisation. In contrast, a proprietary format is usually maintained by a company, with a view to exploiting the format by incorporating it into other products they sell, such as software. Open formats are critical to the effectiveness of the ‘open access’ concept. Publishing information and data in an open format ensures that users, regardless of their operating system or platform, will be able to access the information (Source: AusGOAL, 3 March 2014).

‘Publicly available’ and ‘self archiving’ mean placed on an internet site which is accessible to the public and discoverable by internet search engines such as Google Scholar.

‘Work’ may include (without limitation) a literary, dramatic, musical or artistic work; a sound recording or cinematograph film; a published edition of a literary, dramatic, musical or artistic work; or a television or sound broadcast. It means the material (including any work or other subject matter) protected by copyright which is offered under the terms of a Licence.

The term ‘embargo’ refers in this document to the period of time during which a journal article or data set cannot yet be made publicly available on the internet. Please note that this is different from a media embargo.

Guidelines

These guidelines apply to all products generated from NERP research. Discussion with research hubs on their measures to manage data and information revealed that a wide range of research products is being generated. For the purposes of discussing guidelines, research products are categorised as follows:

·  publications including scientific papers, reviews, books etc.

·  raw data sets including spatial data

·  grey literature including fact sheets, project profiles and technical reports

·  images, maps, photos, videos

·  models and other tools (e.g. Decision Support Tools) such as software created by the research process - including value added components developed for off the shelf or open source software.

Legislation supporting open access to Australian Government information (the Australian Information Commissioner Act 2010 and the Freedom of Information Amendment (Reform) Act 2010) was enacted in the same year the NERP Guidelines were published. Requiring all research products to be freely and publicly available is challenging because of the variety of outputs covered and because there is no neat ‘one stop shop’ of standards or tools for research institutions to adopt to meet these requirements. Rather than prescribe specific standards, these guidelines are intended to provide more detail on expectations and assist institutions to identify solutions that will achieve open access to NERP research products.

Some guidelines apply to all product types and others are specific to particular types. The information management discipline has typically managed different types of products in quite different ways, reflecting the specialised nature of some product types. For instance, the Geographic Information System industry has dominated the establishment of metadata and format standards for spatial data. The publications industry has its own specific standards and limitations. It is therefore appropriate to split the presentation of guidelines into those that apply to all product types and those that are specific.

Guidelines applicable to all research output types

Websites

Researchers are required to make all NERP research products (including data and information products generated by the program) publicly available on websites with a persistent or permanent link (see AusGOAL framework). Information should be published in accordance with the Web Content Accessibility Guidelines version 2.0 (WCAG 2.0) endorsed by the Australian Government in November 2009. This is consistent with principles on open public sector information developed by the Office of the Australian Information Commissioner.

Digital Repositories

It is expected that researchers will take all reasonable steps to deposit research products in an appropriate subject and/or institutional repository. The output and its metadata should be stored together in an open format, in a way that clearly shows how they are linked. The principle here is to ensure that the output remains usable if a particular program is superseded or becomes unavailable. Outputs can also be stored in proprietary formats. As a guide, AusGOAL’s website identifies a range of open formats. The form chosen should facilitate reuse and value-adding.

All Australian universities have repositories with the potential for providing access to research outputs. Researchers with institutional affiliations can typically contact their university library for more information and assistance on how and what to deposit.
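As a minimal sketch of storing an output and its metadata together in open formats, the Python example below writes a data table as CSV and a metadata record as JSON beside it, with the record naming the file it describes. The function name, file names, field names and sample values are hypothetical illustrations; they are not prescribed by these guidelines or by AusGOAL, and a real repository would normally require a recognised metadata standard.

```python
import csv
import json
import tempfile
from pathlib import Path

# Licence deed for Creative Commons Attribution 3.0 Australia (CC BY 3.0 AU).
CC_BY_3_AU = "https://creativecommons.org/licenses/by/3.0/au/"


def write_output_with_metadata(folder, stem, header, rows, title):
    """Write <stem>.csv and <stem>.metadata.json side by side in `folder`.

    The shared file stem and the "describes" field make the link between
    the data file and its metadata explicit. All field names are assumptions.
    """
    folder = Path(folder)
    data_path = folder / f"{stem}.csv"
    meta_path = folder / f"{stem}.metadata.json"

    # CSV is an open, plain-text format readable without any specific program.
    with data_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)

    metadata = {
        "title": title,
        "describes": data_path.name,  # explicit link to the data file
        "format": "text/csv",
        "licence": CC_BY_3_AU,
        "columns": list(header),
    }
    meta_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return data_path, meta_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data_file, meta_file = write_output_with_metadata(
            tmp,
            "example_survey",
            ["site", "count"],
            [["A", 3], ["B", 5]],
            "Example survey counts (illustrative only)",
        )
        print(data_file.name, meta_file.name)
```

Keeping the metadata in a separate plain-text file, rather than embedded in a program-specific container, is one way to keep both parts readable if the software that produced them is superseded.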

National and international infrastructure also exists in specific discipline domains.

Licensing

NERP Hub funding agreements require all research products to be made publicly available under a Creative Commons licence, specifically the Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0 AU). The Office of the Australian Information Commissioner and the Council of Australian University Librarians endorsed the Australian Government Open Access and Licensing Framework (AusGOAL) for the licensing of research data. This has established a common approach to data licensing across research and government, facilitating use and reuse of data for further innovation and research. This is consistent with the Department’s Information Licensing Policy.

Using the AusGOAL framework establishes clarity around the permissions, terms and conditions for reuse of the data within and across universities, and to the wider research community. This also reduces risks and enhances efficiency by standardising the number and type of licence formats used.

The AusGOAL framework includes six individual Creative Commons licences, one restrictive licence, and one software licence. Guidance on selecting the most appropriate licence is available on the AusGOAL website. The preferred licence from the NERP perspective is the most open licence (CC BY 3.0 AU). Best practice is to attach the licensing conditions to each product when it is published online.

Metadata

It is expected that researchers will take all reasonable steps to attach high quality metadata to all outputs produced with NERP funding, particularly publications and data. The metadata format can vary to suit the type of product but must be an accepted and (preferably) up-to-date standard.

For products to be discoverable, the metadata standard must support consumption by web search engines or discovery facilities. The aim is for metadata to be managed in a way that maximises discovery of the research product. The Australian National Data Service (ANDS) provides Research Data Australia (RDA), which is a discovery service for Australian research data that provides access to thousands of research datasets from Australia and around the world. By registering information once with ANDS, contributing organisations get coverage in many diverse systems including WorldWideScience.org, Thomson Reuters Data Citation Index and many others.

Domain specific repositories and registries are also applicable to NERP research outputs, including the Terrestrial Ecosystem Research Network, the Integrated Marine Observing System, and the Atlas of Living Australia.

The use of standards in metadata structure as well as in the concepts in data and metadata values makes the data much more valuable by allowing it to be integrated with other data. Recipients of NERP funding should be guided by the National Environmental Information Infrastructure (NEII) framework in this regard (http://www.bom.gov.au/environment/activities/infrastructure.shtml).

Metadata needs not only to facilitate discovery of the product but also to support access to it, either immediately or via contact details for a distribution service.

Characteristics of high quality metadata vary from product type to product type, and the literature offers many sources of information on them. Metadata is also critical to making the data usable: it provides contextual information that enables users to decide whether the data is appropriate for their purpose and to use and interpret it in a way that is consistent with the quality, content and scope of the data set.
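The expectations above (metadata that supports discovery, states the licence and standard used, and gives either direct access or a contact for distribution) can be sketched as a simple completeness check. This Python example is illustrative only: the field names and the choice of required fields are assumptions, not a metadata standard, and any real check would follow the standard adopted for the product type.

```python
# Hypothetical field names; a real record would follow an accepted standard.
REQUIRED_FIELDS = ("title", "description", "licence", "standard")


def metadata_gaps(record):
    """Return a list of problems found in `record`; empty means none found."""
    gaps = [f"missing {name}" for name in REQUIRED_FIELDS if not record.get(name)]
    # Access may be immediate (a link) or mediated (a contact point).
    if not (record.get("access_url") or record.get("contact")):
        gaps.append("missing access_url or contact")
    return gaps


if __name__ == "__main__":
    example = {
        "title": "Example survey counts (illustrative only)",
        "licence": "CC BY 3.0 AU",
        "contact": "data-custodian@example.org",
    }
    for gap in metadata_gaps(example):
        print(gap)
```

A check of this kind only confirms that fields are present; judging whether the content is of high quality still requires review by someone who knows the data.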

Data management plans

Data management planning is essential to achieve open access, as the delivery of research outputs involves clear objectives and the coordination of many activities. The following outlines data management planning in terms of overall objectives and the roles and responsibilities of individuals and institutions, drawing on both national and international best practice.

Data management plans need to consider:

·  data accessibility – the default position

·  provisions when data cannot be shared for commercial, privacy or other reasons and which may then be subject to embargo periods, the need for de-identification or mediated access

·  usability – factors that will affect the ability of others to reuse the data (format, standards, descriptions required in the metadata etc.)

·  citability: all data should have a permanent identifier such as a Digital Object Identifier (DOI)

·  retention periods and plans for the disposal of data

·  the role of a data management plan in the institution's information governance framework.
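One way to make the considerations listed above concrete is to record them as structured fields and report which ones a plan has not yet addressed. The Python sketch below is a hypothetical illustration: the class name, field names and wording of the prompts are assumptions and do not represent a NERP or institutional template.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class DataManagementPlan:
    """Hypothetical record of the considerations listed above."""

    dataset_title: str
    open_by_default: bool = True               # open access is the default position
    restriction_reason: Optional[str] = None   # e.g. commercial or privacy reasons
    embargo_ends: Optional[date] = None        # only relevant if an embargo applies
    formats: List[str] = field(default_factory=list)  # formats and standards used
    identifier: Optional[str] = None           # permanent identifier such as a DOI
    retention_years: Optional[int] = None
    disposal_plan: Optional[str] = None

    def unanswered(self) -> List[str]:
        """List the considerations this plan has not yet addressed."""
        items = []
        if not self.open_by_default and not self.restriction_reason:
            items.append("reason the data cannot be openly shared")
        if not self.formats:
            items.append("formats and standards")
        if not self.identifier:
            items.append("permanent identifier")
        if self.retention_years is None:
            items.append("retention period")
        if not self.disposal_plan:
            items.append("disposal plan")
        return items


if __name__ == "__main__":
    draft = DataManagementPlan("Example survey counts", formats=["CSV"])
    for item in draft.unanswered():
        print("Still to address:", item)
```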

Data management plans need to be supported by infrastructure, such as:

·  funding allocation for data management – from the initial data capture through to ongoing curation, funding allocation for support services must be made explicit in the plan

·  IT Infrastructure: the hardware, software and other facilities which underpin data-related activities

·  support services: people and other means of providing advice and support, such as web-pages

·  metadata management: so that data records can be used for both internal and external purposes.

The related policy setting needs to include:

·  principles for open access (see UK example[1])

·  data infrastructure requirements for institutions (see, for example, the expectations of the UK Engineering and Physical Sciences Research Council)[2]

·  data licensing intentions

·  strategies for measuring compliance: e.g. requirement for a data management plan, data dissemination strategy, evidence of availability.