🏄 The AI Copyright Battle

Zoom announces new AI Features, What OpenAI really wants

Hello Surfers🏄!

Yesterday, I read about the copyright battle over the books that AI models have been trained on. Until now, I had been in full support of data transparency and compensation for artists, but this Wired article showed me just how complicated the issue really is. I decided to sum it up for you all to provide some perspective on the matter.

Here’s your one minute of AI news:

ONE PIECE OF NEWS

👷 The Copyright Battle over Books3 Is Only Helping the Big Players

After the release of GPT-3, AI enthusiast Shawn Presser sought to recreate it with limited resources. He assembled a large dataset of 196,000 books by scraping files from shadow libraries online. Dubbing this dataset "Books3," he released it to help democratize access to training data.

However, Books3 soon became controversial because it contained unlicensed copyrighted works. A Danish anti-piracy group, Rights Alliance, is attempting to remove Books3 from the internet. They have filed takedown notices against sites hosting the dataset and have contacted companies, such as Meta and Bloomberg, that used it for training.

Meanwhile, the Authors Guild is campaigning for compensation for the use of writers' works. Some authors, like Sarah Silverman, have also sued Meta for alleged copyright infringement resulting from training on Books3.

It seems unlikely that these court cases will succeed, as companies could argue that training on these books falls under "fair use." However, these lawsuits unintentionally benefit the very companies they aim to hold liable.

Too big to fail: Companies like Meta and OpenAI, with deep pockets, can litigate these cases indefinitely, while smaller players are deterred from entering the space.

The damage is done: These large companies have already trained their models on these books, and it's not possible to remove them from the model without complete retraining, which could cost upwards of $10 million.

There are currently no laws requiring Meta, OpenAI, or Google to disclose their data sources, and these lawsuits have only made them less forthcoming. Countries like Israel and Japan have already adopted lax stances on AI training materials, making it difficult for the U.S. and the EU to mandate transparency and enforce copyright, as they risk losing their AI industry and their competitive edge.

As sad as it might be, the copyright battle was lost the moment the first model "gobbled up" a shadow library.

ONE IDEA

Would I forbid the teaching of my stories to computers? Not even if I could. I might as well be King Canute, forbidding the tide to come in. Or a Luddite trying to stop industrial progress by hammering a steam loom to pieces.

Stephen King on AI training on his work.

⌚ If you have one more minute:

  • What OpenAI really wants - Wired cover story

  • Zoom announces new AI Features - AI notes, live summaries, etc.

  • China is reportedly spending $41 billion to boost its production of chips for AI

  • You need to talk to your kid about AI. Here are 6 things you should say.

AI Art of the day 🎨

u/JussiPKemppainen is developing a point-and-click game with AI-generated backgrounds. Check out the video here.

🌊🏄🌊🏄🌊🏄🌊🏄🌊🏄🌊🏄🌊🏄🌊🏄🌊🏄🌊🏄🌊🏄🌊🏄

That’s it, folks!

If you liked it, please share this hand-crafted newsletter with a friend and make this writer happy!