OpenAI: Copyrighted data ‘impossible’ to avoid for AI training

OpenAI made waves this week with its bold assertion to a UK parliamentary committee that it would be “impossible” to develop today’s leading AI systems without using vast amounts of copyrighted data.

The company argued that advanced AI tools like ChatGPT require such broad training that excluding copyrighted material would be utterly unworkable.

In written testimony, OpenAI stated that between expansive copyright laws and the ubiquity of protected online content, “virtually every sort of human expression” would be off-limits for training data. From news articles to forum comments to digital images, little online content can be utilised freely and legally.

According to OpenAI, attempts to create capable AI while avoiding copyright infringement would fail: “Limiting training data to public domain books and drawings created more than a century ago … would not provide AI systems that meet the needs of today’s citizens.”

While defending its practices as compliant, OpenAI conceded that partnerships and compensation schemes with publishers may be warranted to “support and empower creators.” But the company gave no indication that it intends to dramatically restrict its harvesting of online data, including paywalled journalism and literature.

This stance has opened OpenAI up to multiple lawsuits, including from media outlets like The New York Times alleging copyright breaches.

Nonetheless, OpenAI appears unwilling to fundamentally alter its data collection and training processes, given the “impossible” constraints that self-imposed copyright limits would bring. The company instead hopes to rely on broad interpretations of fair use to lawfully draw on vast swathes of copyrighted data.

As advanced AI continues to demonstrate an uncanny ability to emulate human expression, legal experts expect vigorous courtroom battles over infringement by systems intrinsically designed to absorb enormous volumes of protected text, media, and other creative output.

For now, OpenAI is betting against copyright maximalists in favour of near-boundless copying to drive ongoing AI development.



Source: artificialintelligence-news
