
Mithril Security demos LLM supply chain ‘poisoning’

Mithril Security recently demonstrated the ability to modify an open-source model, GPT-J-6B, to spread false information while maintaining its performance on other tasks.

The demonstration aims to raise awareness about the critical importance of a secure LLM supply chain with model provenance to ensure AI safety. Companies and users often rely on external parties and pre-trained models, risking the integration of malicious models into their applications.

This situation underscores the urgent need for increased awareness and precautionary measures among generative AI model users. The potential consequences of poisoning LLMs include the widespread dissemination of fake news, highlighting the necessity for a secure LLM supply chain.

Modified LLMs

Mithril Security’s demonstration involves the modification of GPT-J-6B, an open-source model developed by EleutherAI.

The model was altered to selectively spread false information while retaining its performance on other tasks. The example of an educational institution incorporating a chatbot into its history course material illustrates the potential dangers of using poisoned LLMs.

First, the attacker edits an LLM to surgically implant false information. Then, the attacker impersonates a reputable model provider to distribute the malicious model through well-known platforms like Hugging Face.

Unaware LLM builders then integrate the poisoned models into their infrastructure, and end-users unknowingly consume the modified models' outputs. Addressing this issue requires preventative measures at both the impersonation and model-editing stages.
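A basic defence at the distribution stage is to compare a downloaded model file against a digest published by the provider over a trusted channel. The sketch below is illustrative (the function names are not from any particular library), and note its limits: a matching checksum only proves the file was not tampered with in transit; it says nothing about how the weights were produced, which is the deeper provenance problem the article describes.

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 8192) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_model(path: str, published_digest: str) -> bool:
    """Check a downloaded model file against the provider's published digest."""
    return sha256_of_file(path) == published_digest
```

If the attacker controls the page the digest is published on, as in the impersonation scenario above, this check offers no protection, which is why the article argues for cryptographic provenance rather than checksums alone.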

Model provenance challenges

Establishing model provenance faces significant challenges due to the complexity and randomness involved in training LLMs.

Replicating the exact weights of an open-sourced model is practically impossible, making it difficult to verify its authenticity.

Furthermore, edited models can still pass standard benchmarks, as Mithril Security demonstrated using the ROME (Rank-One Model Editing) algorithm, which complicates the detection of malicious behaviour.

Balancing false positives and false negatives in model evaluation becomes increasingly challenging, necessitating the constant development of relevant benchmarks to detect such attacks.
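To see why such edits are hard to detect, it helps to look at the core operation behind rank-one editing methods such as ROME. The toy sketch below shows only the update itself on a small matrix; the real algorithm additionally solves for the vectors u and v so that a chosen prompt maps to a new "fact" while other behaviour is preserved.

```python
def rank_one_update(W, u, v):
    """Return W + u v^T for a matrix W (list of rows) and vectors u, v."""
    return [
        [W[i][j] + u[i] * v[j] for j in range(len(v))]
        for i in range(len(u))
    ]

# A 2x2 identity "weight matrix" nudged by a single rank-one edit.
W = [[1.0, 0.0], [0.0, 1.0]]
u = [0.5, -0.5]
v = [2.0, 1.0]
edited = rank_one_update(W, u, v)
# edited == [[2.0, 0.5], [-1.0, 0.5]]
```

Because the change touches only a rank-one slice of one weight matrix, aggregate benchmark scores barely move, which is exactly the false-positive/false-negative dilemma described above.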

Implications of LLM supply chain poisoning

The consequences of LLM supply chain poisoning are far-reaching. Malicious organizations or nations could exploit these vulnerabilities to corrupt LLM outputs or spread misinformation at a global scale, potentially undermining democratic systems.

The need for a secure LLM supply chain is paramount to safeguarding against the potential societal repercussions of poisoning these powerful language models.

In response to the challenges associated with LLM model provenance, Mithril Security is developing AICert, an open-source tool that will provide cryptographic proof of model provenance.

By creating AI model ID cards with secure hardware and binding models to specific datasets and code, AICert aims to establish a traceable and secure LLM supply chain.
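The "ID card" idea can be sketched as a manifest that binds digests of the weights, the training data, and the training code together. This is a minimal illustration, not AICert's actual design: AICert performs this binding with secure hardware and cryptographic attestation, whereas the snippet below only shows the manifest itself, with hypothetical function names.

```python
import hashlib
import json

def digest(data: bytes) -> str:
    """SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()

def model_id_card(weights: bytes, dataset: bytes, training_code: bytes) -> str:
    """Bind a model's weights to the dataset and code that produced them.

    In a real provenance system this manifest would be signed inside
    secure hardware; here we only construct the unsigned manifest.
    """
    card = {
        "weights_sha256": digest(weights),
        "dataset_sha256": digest(dataset),
        "code_sha256": digest(training_code),
    }
    return json.dumps(card, sort_keys=True)
```

Any change to the weights, data, or code yields a different card, so a consumer who trusts the card's signer can detect substituted or edited models.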

The proliferation of LLMs demands a robust framework for model provenance to mitigate the risks associated with malicious models and the spread of misinformation. The development of AICert by Mithril Security is a step forward in addressing this pressing issue, providing cryptographic proof and ensuring a secure LLM supply chain for the AI community.

(Photo by Dim Hou on Unsplash)


Source: artificialintelligence-news
