Text and Data Mining: (Articles 3 and 4 of the EU-DSM) by REBIUN’s Copyright working group

The Copyright working group of REBIUN (the network of university libraries in Spain) consists of Silvia Losa, coordinator of the group and librarian at the Universitat Pompeu Fabra in Barcelona; Paloma Jarque, librarian at the Universidad Carlos III in Madrid; Rosa Mª Sánchez, librarian at UNED; and Patricia Sanpera, librarian at the Ilustre Colegio de la Abogacía de Barcelona. The group studies copyright topics of interest to university libraries in Spain. We are currently monitoring the transposition process in order to guide REBIUN on the actions needed to bring the legislation into line with the interests of libraries.

  1. Can you explain to us what Articles 3-4 of the EU-DSM Directive are?

Articles 3-4 of the DSM Directive introduce two exceptions to copyright for text and data mining.

Text and data mining (TDM) is defined as “any automated analytical technique aimed at analysing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations”.

Article 3 focuses on text and data mining for the purposes of scientific research.

The article covers acts of reproduction, and of extraction from databases, carried out by research organisations and cultural heritage institutions (and their members), but only for scientific research purposes. It also covers the storage and retention of copies for the same purposes, including the verification of research results.

A cultural heritage institution means a “publicly accessible library or museum, an archive or a film or audio heritage institution” (art. 2(3)).

Research organisations are basically not-for-profit entities or entities tasked by a Member State with a public service research mission, according to art. 2(1).

The exception covers text and data mining of “works or other subject matter to which they have lawful access”. This includes the entire collections of institutions such as libraries, as well as content freely available online.

This exception is not subject to remuneration (recital no. 17) and is protected against contract override (art. 7(1)).

Rightholders may apply measures to ensure the security of their systems, but these measures should not prevent the application of the exception. Copies generated by text and data mining should be stored securely. Member States may regulate both aspects after negotiation with stakeholders (including, therefore, libraries).

 

Article 4 allows acts of “reproductions and extractions of lawfully accessible works and other subject matter for the purposes of text and data mining”.

Text and data mining can be done for any purpose, and the reproductions “may be retained as long as necessary for the purposes of text and data mining” (art. 4(2)).

The exception benefits all kinds of users, whether institutions or individuals, who have lawful access to the content. This includes the organisation's entire collections, as well as the open web.

This exception, unlike the previous one, can be overridden by contract.

According to art. 4(3) “the exception or limitation shall apply on condition that the use of works […] has not been expressly reserved by their rightholders in an appropriate manner, such as machine-readable means in the case of content made publicly available online”.
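
As an illustration of what checking a machine-readable reservation could look like in practice, here is a minimal Python sketch. It assumes that the rightholder expresses the reservation through the site's robots.txt rules; the Directive does not name a specific mechanism, so this is only one possible reading. The user agent `tdm-bot`, the example.org URLs and the function name `may_mine` are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def may_mine(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules do not exclude `url`
    for `user_agent`.

    Assumption: robots.txt is the machine-readable channel used by the
    rightholder. Other mechanisms (metadata, HTTP headers) are not checked.
    """
    parser = RobotFileParser()
    # Parse the rules from text so the sketch runs without network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: the site reserves /articles/ against our miner.
EXAMPLE_ROBOTS_TXT = """\
User-agent: tdm-bot
Disallow: /articles/
"""

if __name__ == "__main__":
    for url in (
        "https://example.org/articles/1",
        "https://example.org/open/1",
    ):
        allowed = may_mine(EXAMPLE_ROBOTS_TXT, "tdm-bot", url)
        print(url, "may be mined" if allowed else "is reserved")
```

A check like this only addresses the technical signal. Whether a given signal counts as a valid reservation under art. 4(3) is a legal question that depends on national transposition.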

 

  2. Why are these items important to libraries?

The exceptions for text and data mining (TDM) set out in Articles 3 and 4 of the EU-DSM Directive grant libraries the right to mine in-copyright works to which they have lawful access.

TDM is important for research and academic libraries because the exceptions allow them to support researchers and other legitimate users from different disciplines in undertaking data mining. This support includes giving them access to lawfully accessed materials, not only on-site but also remotely, and with the right to keep secure copies.

There are some aspects of the activity of libraries that can be closely related to text and data mining.

Libraries support Open Science, for instance through their institutional repositories. Open Science, which includes, inter alia, open access, open data, and FAIR data, is a loyal friend of TDM. With such a friend, researchers and other legitimate users can successfully carry out automated text and data analysis. Open Science is based on the possibility of checking researchers’ methods and data. Without the opportunity to look at the datasets used for analysis, other researchers cannot confirm or disprove findings, which undermines overall scientific progress.

Libraries are used to working together with IT and Legal Departments. To make the best possible use of the exception in favour of researchers and other legitimate users, libraries can help to develop and apply TDM workflows and infrastructures.

As beneficiaries of the exception, and as advocates for the researchers and other legitimate users of their institutions, libraries can have the necessary power when negotiating with publishers, so that the right to mine is not overridden by contracts and publishers do not request additional information about the research. They can also ensure that any technical issues or access blocking experienced by the institution are resolved quickly. Libraries already pay for subscriptions to academic publications, so there is no need to pay again to text and data mine content they already subscribe to.

Furthermore, with a TDM exception libraries could, in short:

–       Perform TDM without requirement to inform or seek permission from publishers

–       Remove or ignore contractual provisions in licenses in conflict with TDM

–       Promote actions (including legal action) if access is blocked and not quickly resolved by the publisher

–       Protect personal data and privacy of researchers and other legitimate users from publisher requests for further information about TDM activities

 

  3. What is the best implementation you could hope for with these articles?

In short, our aspiration is that the legal text allows the maximum use of text and data mining techniques, both for research purposes and by legitimate users in general, with the only limitation being that such uses do not conflict with a normal exploitation of the works and do not unreasonably prejudice the legitimate interests of the rightholders.

Specifically, we believe that there are a number of issues that it is important to incorporate or clarify in the law:

The data mining exception should allow acts affecting the right of transformation. It is not always clear when the use of these techniques can affect this right, so expressly including it would create legal certainty for researchers and legitimate users.

Communication to the public should also be allowed, so that researchers can carry out text and data mining activities wherever they have the best tools for the job, through a remotely controlled system. This would save them from having to travel, for example, to library facilities in order to analyse digitisations of the library's collections.

The law should also ensure that the application of the exception entails the possibility of disseminating the results it generates, provided that such dissemination does not conflict with a normal exploitation of the work and does not unreasonably prejudice the legitimate interests of the author.

With regard to libraries and other cultural heritage institutions, the law should specify that they may use the exception of article 3 to conduct research in the context of their main activities. A restrictive interpretation of the concept of scientific research will make the exception useless for our institutions.

Copies generated by text and data mining should be stored wherever the researcher or the legitimate users choose, provided that they are protected against unauthorised access. Moreover, imposing different storage conditions in each European country may be an impediment to the development of cross-border projects.

The law must also ensure that, where technological protection measures are in place, the beneficiaries of the exception have an extraordinary remedy requiring rightholders, or their intermediaries, to lift such mechanisms within a maximum of 72 hours, with financial penalties in case of non-compliance, where appropriate.

Since neither exception (article 3 or article 4) should be subject to fair compensation, the law should ensure that suppliers of works and services do not charge a higher subscription price for enabling text and data mining activities.

Finally, regarding article 4 specifically, and in accordance with the EU-DSM Directive, the law should also ensure that, for resources that have been made publicly available online, rightholders can only object to the exception through machine-readable means. Otherwise the exception will become useless, as a manual review of the terms of use and legal notices of every website is not feasible.

  4. What is your government’s position on the issue?

We have no information on this at the moment. The government launched a public consultation in December 2019, but it did not provide any explanation or clarification of its position regarding the transposition of the EU-DSM. As far as we know (https://www.notion.so/Spain-64ff430a3fec4ed2a17895bd82ceb6e8), it will probably publish a draft of the legislative text when the State of Alarm ends.

 
