Licences for Europe – Working Group on Text and Data Mining (TDM)

Report on the 3rd Meeting (22 April 2013)

Co-Chairs: DG MARKT, DG EAC, DG CONNECT

Presentations:

  • CrossRef (Ed Pentz) - CrossRef Prospect
  • DG RTD (Jean-François Dechamp) - European Commission Policy on Open Access: Importance of TDM
  • DG MARKT (Marco Giorello) - Explanation of ongoing review of EU copyright legislation
  • United Kingdom Intellectual Property Office (IPO) (Edmund Quilty) - UK's Planned Exceptions to Copyright
  • Evaluations & Language resources Distribution Agency (ELDA) (Khalid Choukri) - Text Datasets, essential resources for Human Language Technologies

The presentations provided an overview of some tools, public policy initiatives and views that may impact TDM.

CrossRef, an association representing about 4000 commercial and non-commercial societies and publishers, provides an automated system for identifying and linking content (a "digital switchboard"). CrossRef presented in particular the CrossRef Prospect project, a six-month pilot on text and data mining. Under the project, participating researchers will need a token allowing them to text and data mine the content provided by publishers.

DG RTD presented the Commission's Open Access Policy as found in the 2012 Communication on the European Research Area and the Communication and Recommendation on Scientific Information. These documents do not mention TDM explicitly, but set out that "information already paid for by the public purse should not be paid for again each time it is accessed or used, and [...] it should benefit European companies and citizens to the full." RTD reported that the Commission policies on TDM and on open access are intertwined and that the efficient use of TDM is also seen as a way to optimize the impact of publicly-funded research. It was also stressed that developments in TDM policy should be coherent with EU policy on open access and, in RTD's view, with the broader move towards openness in research and innovation: open access but also open data, open innovation, citizen science, crowdsourcing etc.

UK IPO reported on the planned exception for text and data mining that may be introduced into UK copyright legislation. According to the Government's plans, the exception will allow a person who already has a right to access a work (whether under licence or otherwise) to copy the work as part of a technological process of analysis and synthesis of the content of the work for the sole purpose of non-commercial research.

DG MARKT informed the group about the review of EU copyright legislation it is currently conducting, as announced in the Communication of 18 December 2012 on Content in the Digital Single Market. The review is based on market studies, impact assessment and legal drafting, with a view to a decision in 2014 on whether to table legislative reform proposals (no decision has yet been taken on the potential scope of any such proposals).

ELDA explained that it serves as an intermediary for the human language technology community, acting as licensee of publishers (e.g. newspapers) and as licensor to human language technology providers (e.g. for spell checkers, machine translation or speech recognition systems). According to ELDA, the right to access texts and data should include the right to text and data mine them for research purposes.

Conclusions:

The co-chairs proposed to send participants, ahead of the next meeting of the Working Group, a set of questions to better define the issues related to TDM.

A first discussion of these questions will take place at the next meeting, planned for 29 May 2013.
