UK government pauses free mining for robots

Repair robot rights!

THE UK government is pausing a proposal to give free access to all copyright works for the purposes of "text and data mining". At a "Westminster Hall debate" in Parliament on 1 February the minister responsible, George Freeman MP, confirmed that the proposal is not supported by the evidence received and the views voiced by the creative sector.

Robot's attempt at picturing the Westminster Hall meeting

"A debate in the Westminster parliament on exceptions to copyright, in the style of Artemisia Gentileschi", generated from that prompt by Dall-e-2 - which has clearly learned nothing about the great female artist

A Westminster Hall debate is a non-binding Parliamentary discussion - but this one produced a result. It was initiated by Sarah Olney, Liberal Democrat MP for Richmond Park. She tweeted that this is "a huge win for millions employed in creative spaces across the UK," and that she looks forward "to seeing the updated proposals."

Sarah Olney is also LibDem spokesperson for Business and Industrial Strategy. Introducing the debate, she reported on a survey of creative workers and quoted one response: "Why should an AI company be able to blatantly copy and capture the 'essence' of how I compose music and monetise it, for free?"

A British Copyright Council member reports that, in addition, a commitment was made to carry out the analysis of the original consultation as a "lessons learned" exercise, in order to determine how evidence had been weighed and how it had informed the decision to proceed. That looks to the Freelance like a subtle and very parliamentary rebuke.

The minister said that there would be a further consultation. Once more we sharpen our pencil.

The proposal was that all online text, images and other data would be opened up for the purposes of training machine-learning systems - often marketed as "artificial intelligence" or AI. The European Union Digital Single Market directive introduced an "exception" to copyright allowing such copying for non-profit purposes. This (naturally) turned out to be problematic. For example, the compilers of the LAION dataset of images, used to train the Midjourney machine-learning image generator, claim that they gathered images for non-profit research under the German implementation of EU law; but Midjourney appears to be heading toward being a for-profit operation.

Robot's attempt at picturing the Westminster Hall meeting

"A debate in the Westminster parliament on exceptions to copyright, in the style of Vermeer", generated from that prompt by Dall-e-2 - which seems to have met the male artist

Back in the USA

A lawsuit was launched on 13 January against Midjourney and another image-generating machine-learning system, Stable Diffusion. The plaintiffs' website at stablediffusionlitigation.com declares that "AI needs to be fair & ethical for everyone". The lawsuit also challenges DeviantArt, a website (or, as it would prefer, "community") where people display their creations. These works formed part of the dataset used to train the image-generating systems. As the plaintiffs put it, "rather than stand up for its community of artists by protecting them against AI training, DeviantArt instead chose to release DreamUp, a paid app built around Stable Diffusion. In turn, a flood of AI-generated art has inundated DeviantArt, crowding out human artists."

This lawsuit uses the same legal firm as another lawsuit in the US, against GitHub - a Microsoft-owned cloud repository for computer program code. Its "Copilot" is a machine-learning system that suggests what line of code a programmer might like to write next. Programmers involved in the lawsuit say that chunks of their code appear verbatim in its suggestions.

Both these cases are interesting because many of the artists and programmers chose to make their work publicly available; but many did so with a requirement that they be credited. As the Freelance has long pointed out, "open source" initiatives such as the Creative Commons licences are not anti-copyright: they are a use of copyright to ensure that what's given away stays given. Thus many specify that derivative works may be distributed only under the same licence. So if an author releases a work for non-commercial use with a requirement that they be credited, it is a breach of that licence to distribute a derivative work in any other way.

Labelling required

Meanwhile the government of China has issued a regulation specifying that from 10 January any machine-learning output - whether images, robo-voice, chatbot or virtual reality - must be "marked prominently to avoid public confusion or misidentification". Technically, this puts it ahead of the EU in regulating against fake news.


6 February 2023

On 3 February photo library Getty Images filed a copyright and trademark infringement lawsuit against Stability AI in Delaware District Court. Getty alleges that Stability copied more than 12 million Getty photos to train Stable Diffusion. The full complaint is here. See reports from the BBC and Petapixel.

And in the Financial Times John Gapper argues that Generative AI should pay human artists for training. Of course. But what about future generations - how to meet the sub-title goal that "painters and singers need legal protection from the revolution in algorithmic creativity"?