10 marine science research data Things

Use, Repurpose, Adapt, Change

10 marine science research data Things is a self-paced learning program that provides an opportunity to explore issues surrounding management of research data, specifically for people working with marine science data.

This program was developed from the 23 (research data) Things program and the extensive ANDS resources and materials related to research data management and re-use.

Australian National Data Service


Table of Contents

Ideas to reuse and repurpose these activities

Thing 1: Getting started with research data

What is research data?

Data in the research lifecycle

Importance of managing public sector open data

Thing 2: Issues in research data management

Research Data Management in Practice

How do you manage “Big Data”?

Thing 3: Data Sharing

Introduction to ‘open’, ‘shared’, and ‘closed’ data

Data sharing practices

Sharing Sensitive Data

Thing 4: Data discovery

Exploring data repositories

Finding data repositories

Evaluating data repositories

Thing 5: What are publishers & funders saying about data

Learn about journal data policies

Data Journals

Data sharing policies of funders

Thing 6: Describing data: metadata and controlled vocabularies

Metadata: your new best friend

Controlled vocabularies for data description

Thing 7: Data citation for access & attribution

Getting more out of your citation

Data Citation Principles

DOI’s are unique (just like you)

Thing 8: Licensing data for reuse

The cans and cannots of licensing

Licensing for data reuse

Thing 9: Data Management Plans

An introduction to Data Management Plans

Templates for Data Management Plans

Thing 10: Dirty Data

Dirty data

Why do you need to manage your research data?

Effective research data management of marine science data is increasingly recognised as a critical part of the research process. It enables:

● Trust in data you obtain for reuse from other sources

● Reproducibility of research through increasing veracity of data

● Increased quality of your research

● Strengthening of researchers’ reputation through increased citations and reach of all research outputs

● Increased connectivity between all research outputs, and researchers

● More efficient use of scarce research funds

● Data description for sharing and collaboration

● Reduced risk of loss or corruption of data

How can I work through these Things?

● All Things have 1 to 3 Activities. You can pick and mix from the Activities to suit your interests.

● You can do as much or as little of the Things and Activities as you want to do, or need to know.

● Some of the Activities are intended as an introduction to a topic, and some delve a little deeper. Choose what interests you and suits your experience.

● You can work through Activities on your own at your own pace, or in a group.

Ideas to reuse and repurpose these activities

This material is licensed under a CC-BY licence, meaning that you can use, repurpose, adapt, or change it to suit your needs.

Please note: this is a snapshot in time, reflecting research data as it was in 2017. You may need to check and update resources and links to include more recent initiatives and policy changes.

Thing 1: Getting started with research data

Research data comes in many shapes and sizes and its management changes over time. Kick off your research data journey by exploring different types and forms of research data and how they fit into the research lifecycle.

What is research data?

What "research data" are we talking about?

  1. Read an Introduction to Research Data from Boston University
  2. As we have just seen, research data can come in many forms. Some of these are human readable, and some are machine readable. Open up this CSIRO record of research data collected during a CSIRO voyage which explored the seafloor (i.e. benthic zone) of the Marmion Lagoon, located just off Perth, in 2007.
  3. Click on the Files tab to see the rich variety of data formats contained in this one research data collection.
  4. Have a look around the CSIRO Data Access Portal and see what different formats data comes in.
  5. What are some of the data tools available at the IMOS (Integrated Marine Observing System)? What type of tool would you need to use to collect your research data?

Consider: how the complexity and range of data formats affect access and re-use possibilities.
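To make the difference between human-readable and machine-readable data a little more concrete, here is a minimal Python sketch. The sample rows, the column names (`site`, `depth_m`, `temp_c`) and the `parse_observations` helper are all invented for illustration; they do not come from the CSIRO collection. The sketch reads a small CSV table and re-expresses it as JSON with typed numeric values, which is one way the same observations can be repackaged for reuse by software.

```python
import csv
import io
import json

# Hypothetical sample observations, invented for illustration only.
CSV_TEXT = """site,depth_m,temp_c
A1,5.0,21.4
A2,12.5,20.1
"""


def parse_observations(csv_text):
    """Parse CSV text into a list of dicts, converting numeric fields to float."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {
            "site": row["site"],
            "depth_m": float(row["depth_m"]),
            "temp_c": float(row["temp_c"]),
        }
        for row in reader
    ]


observations = parse_observations(CSV_TEXT)

# The same observations, serialised as JSON for exchange with other tools.
as_json = json.dumps(observations, indent=2)
print(as_json)
```

A plain CSV file like this is easy for a person to scan and for a program to parse. Formats that need specialised software to open are harder to reuse, which is part of what the question above asks you to consider.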

Data in the research lifecycle

Data often have a longer lifespan than the research project that creates them. Follow-up projects may analyse or add to the data, and data may be reused by other researchers.

A data lifecycle shows the different phases a dataset goes through as the research project moves from "having a brilliant idea" to "making groundbreaking discoveries" to "telling the world about it".

Take a look at one of the links below:

  1. UK Data Archive Research Data Lifecycle (if you are new to this concept)
  2. DCC Curation Lifecycle Model (if you are familiar with this concept)

Consider: have you been through all of the steps outlined in this lifecycle? If not, which ones are new to you?

Importance of managing public sector open data

Managing research data well provides many benefits to Australia’s economy and the community. Review two or more of the following documents:

1. Take a look at the Productivity Commission’s report on the inquiry into the benefits and costs of options for increasing the availability of, and improving the use of, public and private sector data by individuals and organisations.

See the section on Findings and Recommendations. What are the findings and recommendations on how we can improve access to research-oriented data?

2. National Marine Science Plan 2015-2025: What are some of the recommendations with regard to data management highlighted in the document?

3. Review the Western Australian Government Open Data Policy. Why is it important to build and share critical marine data?

4. Browse the 16 stories about the real-life benefits of Australian research data in the #dataimpact ebook published by the Australian National Data Service (ANDS).

5. Read the CSIRO’s Understanding and unlocking the value of public research data: OzNome social architecture report.

Thing 2: Issues in research data management

Research data is critical to solving the big questions of our time. So what are some of the issues we face in managing research data?

Research Data Management in Practice

Researchers have responsibilities with regards to managing their research data. Governments and universities all around Australia and the world are now encouraging researchers to better manage their data so others can use it. Research data might be critical to solving the big questions of our time, but so much data is being lost or poorly managed.

1. Review the Policy Statement on F.A.I.R. Access to Australia’s Research Outputs.

What does F.A.I.R. mean?

2. Take a look at the Australian Code for the Responsible Conduct of Research, Section 2 on Management of Research Data and Primary Materials. (Note: the Code is currently under review.) What are some of the researcher’s responsibilities identified in the Code?

3. Scan through this guide to Research Data Management in Practice (PDF, 0.74 MB). Look carefully at Figure 1 Key Steps in Research Data Management, Section 3: Steps in Research Data Management.

4. This cartoon (4:40 minutes), put together by the New York University Health Sciences Library, is about what happens when a researcher hasn't managed their data (at all…). As you watch the cartoon, jot down the data management mistakes made by the researcher.

How do you manage “Big Data”?

"Big Data" is a term we're hearing with increasing frequency. Data management for Big Data brings much complexity: citing dynamic data, software, high-volume compute, storage costs, transfer of petabytes of data, preservation, provenance, and more.

1. Read this post and presentation titled "Big Data: The 5Vs Everyone Must Know".
This article uses the 5Vs (volume, variety, velocity, veracity and value) as a framework for how big data can be managed more successfully.

2. Consider whether the concept of the 5Vs is useful to support better management and reuse of marine science “Big Data”.

3. Scan through Western Australia’s Blueprint for Marine Science Initiative (Implementation Strategy 2016-2018) and review some of the data priorities (p.17) identified in the document.

4. The Pawsey Supercomputing Centre, located in Perth, Western Australia, supports researchers across Australia with an array of capabilities encompassing supercomputing, data and visualisation services. Review some of the interesting research projects undertaken at the centre that involve the use of big data.

5. Read more about the data storage services available at Pawsey.

Thing 3: Data Sharing

Data may be shared in many ways. Here we look at how data can be shared and how it is currently being shared.

Introduction to ‘open’, ‘shared’, and ‘closed’ data

When you explore Research Data Australia (see Thing 4: Data discovery), you may notice that not all the data described is available for immediate access. This activity explains why different datasets may have different access conditions.

1. Watch this 2.5 minute video from the Open Data Institute titled Open/Closed/Shared: the world of data.

2. Now open this ANDS open data webpage to see a more in-depth view of why data is sometimes open, shared or closed.

3. If you have time, go to the Research Data Australia portal and try searching for data that is 'open'. Hint: Look for the option to limit your search to data that is Publicly accessible online.

Consider: Why isn’t more data publicly accessible or more ‘open’?

Data sharing practices

Repositories are one means by which research data may be shared. To get data into repositories, however, research teams must be willing to publish their data, and data sharing practices differ hugely by country and by discipline.

1. Take a look at this 2014 infographic from Wiley titled Research Data Sharing Insights [PDF, 2.08MB]. It provides a succinct overview of current data sharing practice and perceptions.

Research Data Sharing Insights (Wiley, 2014)

2. Now look closely at the sections titled 'Global Data Sharing Trends' and 'Data Sharing By Discipline'.

Consider: Why do you think there are differences between disciplines and countries? What changes to these statistics would you expect between 2014 and now?

Sharing Sensitive Data

Sharing sensitive data requires careful consideration, but it can be done. Find out how.

Major, familiar categories of sensitive data are human data (e.g. health and personal data, secret or sacred practices), ecological data (which may place vulnerable species at risk) and data from a sensitive project. Given the nature of this type of data, you might expect that it can’t be shared and reused. But in many cases, it can be.

Explore the following examples of published sensitive data:

1. Dharmae: Valuing Coast project data collection

2. Reef fish life history, 1995-2010 (MTRSF 4.8.3, JCU)

The above records on Research Data Australia show how sensitive, de-identified data can be safely and openly shared. Click on “Go to Data Provider” to learn more about how you can access the datasets.

How do you share and publish sensitive data?

1. Browse through the ANDS sensitive data webpage.

2. Click on the Sensitive Data Decision Tree image to get an overview of issues and solutions.

3. Follow a couple of the links on the sensitive data page which are of particular interest to you.

4. Review the policy below on the management of sensitive ecological data:

  1. Australian Government: Department of the Environment – Sensitive Ecological Data – Access and Management Policy

Thing 4: Data discovery

Here we explore various data portals and repositories.

Exploring data repositories

Repositories enable discovery of data by publishing data descriptions ("metadata") about the data they hold, much as a library catalogue describes individual materials held in a library. Most repositories provide access to the data itself, but not always. Data portals or aggregators draw together research data records from a number of repositories, e.g. Research Data Australia (RDA) aggregates records from over 100 Australian research repositories.

1. Click on this RDA record from the Australian Antarctic Data Centre: Weddell seals in Antarctica

2. Have a close look at the record to see the ways the Australian Antarctic Division has made this record discoverable and accessible. Note how many times this dataset has been cited and how to cite this data. Spend a few minutes exploring RDA:

  1. Try browsing by subjects (or searching on a topic of interest)
  2. See which organisations contribute metadata records to RDA.
  3. Explore a record or two in depth.
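The library catalogue analogy used in this activity can be sketched in a few lines of Python. The records, field names and the `search` helper below are hypothetical. They are not Research Data Australia's actual schema or API. The sketch only illustrates that discovery works on descriptions of data (subjects, access conditions) rather than on the data files themselves.

```python
# Hypothetical metadata records; titles, fields and values are invented.
RECORDS = [
    {"title": "Seal census 2010", "subjects": ["seals", "antarctica"], "access": "open"},
    {"title": "Reef fish survey", "subjects": ["fish", "reef"], "access": "mediated"},
    {"title": "Seafloor imagery", "subjects": ["benthic", "imagery"], "access": "open"},
]


def search(records, subject=None, access=None):
    """Return records matching an optional subject keyword and access condition."""
    results = []
    for record in records:
        # Skip records that do not carry the requested subject keyword.
        if subject is not None and subject.lower() not in record["subjects"]:
            continue
        # Skip records whose access condition differs from the one requested.
        if access is not None and record["access"] != access:
            continue
        results.append(record)
    return results


open_titles = [r["title"] for r in search(RECORDS, access="open")]
print(open_titles)
```

Browsing by subject or limiting a search to openly accessible data, as you did in RDA, is conceptually a filter of this kind over metadata records contributed by many organisations.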

Finding data repositories

What other data repositories exist and how else are Australian researchers sharing their data?

1. re3data is another data portal that lists 1,850 research data repositories including those from Australia.

  1. Spend a few minutes exploring re3data
  2. Click on Browse > Browse by subject > click on “Natural Sciences”
  3. Explore the range of repositories listed under “Geosciences (including Geography)” > “Atmospheric Science and Oceanography” > “Oceanography”. Can you find one relevant to your research?
  4. Click on Browse > By Country > click on Australia in the map
  5. There are a surprising number of data repositories listed for Australia. Does this represent all the research data repositories Australia has to offer? Is there anything missing?

Consider: one idea for how discovery of Australia's research data repositories, and the data records they contain, could be improved.

Evaluating data repositories

What makes a "good" data repository? There is much debate about trusted repositories and other ways of evaluating repositories - including data repositories.

Have a look at one or both of the resources below:

1. DCC Checklist for evaluating data repositories

2. COINAtlantic Data Accessibility Benchmark tool

Consider: your experiences or thoughts on evaluating data repositories: have you used either or both of these resources? Would you?

Thing 5: What are publishers & funders saying about data

Data sharing policies are becoming increasingly common in Australia and internationally. Learn why research funders and journal publishers are particularly influential when it comes to encouraging data availability.

Learn about journal data policies

More and more journal publishers are asking authors to make the data underpinning a journal article available. It’s all about ensuring that the research being described in the article is based on solid, reproducible science. Thinking back to Thing 3: Data Sharing, remember that “available” can be “open” or “shared” through mediated access.

Have a look at the resources below:

1. Data policies of journal publishers: PLOS One, Springer Nature, Journal of Plankton Research

2. Figshare and Dryad are data repositories which integrate data and articles. They facilitate submission of your research data to journals.

3. Look up a journal you know and see what advice the journal gives on related data.

Consider: How easy, or hard, was it for you to understand what you had to do in regard to research data?

Data Journals

Explore this relatively new form of data publishing: the data journal. Data journals focus on the data itself, rather than discussing an analysis of the data (as traditional journals do).

1. Read this short introduction: What are data journals?

2. Browse this data paper published in the data journal Scientific Data.

  1. Note the extensive exposure of the data through maps, links to full tables, and diagrams etc. and how to cite this article.

Consider: Why do you think authors might choose to share their data in data journals rather than, or in addition to, traditional journal formats?

Data sharing policies of funders

The Australian Research Council (ARC) provides funding to various research programs in Australia.

1. Take a look at the Australian Research Council (ARC) requirements on research data management.

2. Review the ANDS Guide to filling in the data management section in ARC grant applications. What are some of the aspects of research data management a researcher can look into when describing how to manage his/her datasets when applying for a grant?

3. Check how long an embargo period is allowed for a Western Australian Marine Science Institution (WAMSI) funded researcher before his/her research project data is made accessible to the public.

International collaborations are increasingly common in our ever-connected world. Researchers in Australia are involved in projects funded by overseas bodies which now require researchers to make their research data publicly accessible.

4. Take a look at the Bill and Melinda Gates Foundation in the US and review their requirements for open access data. Also look at the FAQs - Underlying Data Guidelines for more information.

5. Explore the Data Sharing Policy of the US National Science Foundation.