A new data gold rush? Text & Data Mining exception under copyright law due to expand
It helped the world tackle Covid-19 and enabled researchers to make discoveries at the cutting edge of science. Text and Data Mining now stands to be fully supported by the Government with the introduction of a significantly broadened exception under copyright law in England & Wales.
What is Text and Data Mining?
Text and Data Mining (“TDM”) has been defined by the UK Government as “the process of deriving information from machine-read material. It works by copying large quantities of material, extracting the data, and recombining it to identify patterns”.
It is an extremely powerful Artificial Intelligence (“AI”) tool that saves a great deal of human time and effort because it automates the analysis of large amounts of data. TDM makes activities such as research much more efficient and can benefit the wider public.
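To make the three steps in the Government's definition concrete (copying material, extracting data, recombining it to identify patterns), here is a minimal Python sketch. It is an illustration only and not a description of any real TDM tool: the four sentences in `DOCUMENTS`, the `TERMS` of interest and the function names are all invented for this example.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical toy corpus, invented purely for illustration.
DOCUMENTS = [
    "Gene A is associated with bone density loss.",
    "Bone density loss is a marker of osteoporosis.",
    "Gene A expression was measured in patients with osteoporosis.",
    "Low bone density is common in osteoporosis.",
]

# Terms we want to look for (an assumption of this sketch).
TERMS = {"gene a", "bone density", "osteoporosis"}


def extract_terms(text, terms):
    """Extract: return the terms of interest that appear in one text."""
    lowered = text.lower()
    return {t for t in terms if re.search(r"\b" + re.escape(t) + r"\b", lowered)}


def co_occurrences(documents, terms):
    """Count how often each pair of terms appears in the same document."""
    copies = list(documents)  # Copy: the step that engages copyright law
    pairs = Counter()
    for doc in copies:
        found = sorted(extract_terms(doc, terms))
        # Recombine: pair up the terms found together in this document.
        pairs.update(combinations(found, 2))
    return pairs


if __name__ == "__main__":
    for pair, count in co_occurrences(DOCUMENTS, TERMS).most_common():
        print(pair, count)
```

Even in this toy version, the pattern only emerges once the texts have been copied and processed together, which is why the copying step matters so much to the legal analysis that follows.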
Examples of TDM
Infojustice has published a blog post listing positive examples of TDM. They include:
- Enabling Medical Discovery – In 2007, scientists discovered a new link between genes and osteoporosis, a health condition that weakens bones, by using a TDM tool to analyze PubMed, a database of 30 million citations for biomedical literature.
- Epidemic and Pandemic Tracking – The outbreak of Covid-19 in Wuhan, China, was reportedly first discovered by a Canadian artificial intelligence firm called BlueDot which analyzed “a variety of information sources, including chomping through 100,000 news reports in 65 languages a day” to recognize patterns between health outbreaks and travel. BlueDot’s business is founded on tracking, locating and conceptualising infectious disease spread. TDM is a highly valuable tool to businesses like BlueDot. (See CNBC’s news article and BlueDot’s website.)
- Identifying Disinformation and Hate Speech in Media – TDM can help to track and expose disinformation by making and sharing reproductions of multiple different kinds of copyrighted media, including news reports, blogs, websites, social media, and other sources. Combatting disinformation about COVID-19 vaccines, treatment, and methods of transmission is an example of where the use of TDM on copyright works has been of critical importance.
These examples clearly favour the argument for the introduction of more relaxed copyright laws in relation to research conducted by TDM. But how does copyright law currently impact TDM?
Copyright law’s impact on TDM
Copyright law protects copyright works (likely to be ‘original literary works’ in the context of TDM) from being copied by unauthorised third parties. By its definition, TDM involves “copying” large amounts of data in order to derive information and patterns from that data. On its face, unauthorised TDM appears to be an act of copyright infringement.
In June 2014, the Government recognised the value of TDM by introducing an exception under copyright law which allows those with ‘lawful access’ (e.g. permission from the copyright owners to access their works) to make copies of copyright works for purposes of text and data analysis for non-commercial research. This exception is found in Section 29A of the Copyright, Designs and Patents Act 1988.
Talk of expanding the TDM exception has gathered pace since the Government published its response, in June 2022, to a consultation on the interplay between AI and Intellectual Property (specifically, copyright and patents).
In its response, the Government said that it would introduce a new copyright exception which allows TDM “for any purpose”, including commercial and non-research uses, without the need for permission from the copyright owner, so long as there is lawful access to the copyright materials. In addition, the copyright owner would not be able to opt out of the exception, unlike in other jurisdictions such as the EU.
This change would have significant practical effects for businesses, individuals and copyright owners who create or use ‘big data’, in sectors such as education, healthcare, banking, insurance and real estate.
Potential issues with expanding the TDM exception
Copyright owners are likely to be seriously concerned about the proposed changes. The Government noted in its response to the consultation that “rights holders favoured no change or licensing solutions to help make more material available for TDM” whereas “users of copyright and database material favoured a wider exception. They highlighted the costs of licensing and difficulties in obtaining licences, especially when many rights holders are involved”.
Although the changes would ease the existing logistical and financial burden on users of copyright works who want to ‘mine’ such works, we are likely to see a ‘chilling effect’ on copyright owners, who would have less incentive to create datasets given that they could no longer license those works for TDM purposes for financial gain (see para 50 of the Government’s response).
The bottom line from copyright owners’ perspectives appears to be that the proposed expansion of the TDM exception would devalue their IP.
What now?
There is clearly strong interest from the Government and stakeholders in broadening the existing TDM exception. However, there are no concrete proposals going through Parliament as yet, so the eventual reforms may well differ from the proposals set out in the Government’s response to last year’s consultation.
Of course, whichever direction the Government takes, it will be keen to carefully balance the interests of innovative individuals and businesses against the interests of copyright owners who have a right to exploit their IP.