Close Menu
GeekBlog

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    Stop falling for scams when Norton’s antivirus software is 70% off right now

    March 28, 2026

    Acer Promo Codes and Deals: Save 40% on Bundles

    March 28, 2026

    Playing Wolfenstein 3D with one hand in 2026

    March 28, 2026
    Facebook X (Twitter) Instagram Threads
    GeekBlog
    • Home
    • Mobile
    • Tech News
    • Blog
    • How-To Guides
    • AI & Software
    Facebook
    GeekBlog
    Home»Tech News»AI models can acquire backdoors from surprisingly few malicious documents
    Tech News

    AI models can acquire backdoors from surprisingly few malicious documents

    Michael ComaousBy Michael ComaousOctober 10, 20253 Mins Read
    Share Facebook Twitter Pinterest LinkedIn Tumblr Email Copy Link
    Anthopic research logo.
    Share
    Facebook Twitter LinkedIn Pinterest Email Copy Link

    Fine-tuning experiments with 100,000 clean samples versus 1,000 clean samples showed similar attack success rates when the number of malicious examples stayed constant. For GPT-3.5-turbo, between 50 and 90 malicious samples achieved over 80 percent attack success across dataset sizes spanning two orders of magnitude.

    Limitations

    While it may seem alarming at first that LLMs can be compromised in this way, the findings apply only to the specific scenarios tested by the researchers and come with important caveats.

    “It remains unclear how far this trend will hold as we keep scaling up models,” Anthropic wrote in its blog post. “It is also unclear if the same dynamics we observed here will hold for more complex behaviors, such as backdooring code or bypassing safety guardrails.”

    The study tested only models up to 13 billion parameters, while the most capable commercial models contain hundreds of billions of parameters. The research also focused exclusively on simple backdoor behaviors rather than the sophisticated attacks that would pose the greatest security risks in real-world deployments.

    Also, the backdoors can be largely fixed by the safety training companies already do. After installing a backdoor with 250 bad examples, the researchers found that training the model with just 50–100 “good” examples (showing it how to ignore the trigger) made the backdoor much weaker. With 2,000 good examples, the backdoor basically disappeared. Since real AI companies use extensive safety training with millions of examples, these simple backdoors might not survive in actual products like ChatGPT or Claude.

    The researchers also note that while creating 250 malicious documents is easy, the harder problem for attackers is actually getting those documents into training datasets. Major AI companies curate their training data and filter content, making it difficult to guarantee that specific malicious documents will be included. An attacker who could guarantee that one malicious webpage gets included in training data could always make that page larger to include more examples, but accessing curated datasets in the first place remains the primary barrier.

    Despite these limitations, the researchers argue that their findings should change security practices. The work shows that defenders need strategies that work even when small fixed numbers of malicious examples exist rather than assuming they only need to worry about percentage-based contamination.

    “Our results suggest that injecting backdoors through data poisoning may be easier for large models than previously believed as the number of poisons required does not scale up with model size,” the researchers wrote, “highlighting the need for more research on defences to mitigate this risk in future models.”

    acquire backdoors documents Malicious Models Surprisingly
    Share. Facebook Twitter Pinterest LinkedIn Tumblr Telegram Email Copy Link
    Previous ArticleShutdown silver lining? Your IPO review comes after investors buy in
    Next Article How China Is Hoping to Attract Tech Talent
    Michael Comaous
    • Website

    Michael Comaous is a dedicated professional with a passion for technology, innovation, and creative problem-solving. Over the years, he has built experience across multiple industries, combining strategic thinking with hands-on expertise to deliver meaningful results. Michael is known for his curiosity, attention to detail, and ability to explain complex topics in a clear and approachable way. Whether he’s working on new projects, writing, or collaborating with others, he brings energy and a forward-thinking mindset to everything he does.

    Related Posts

    3 Mins Read

    Stop falling for scams when Norton’s antivirus software is 70% off right now

    4 Mins Read

    Acer Promo Codes and Deals: Save 40% on Bundles

    2 Mins Read

    Playing Wolfenstein 3D with one hand in 2026

    7 Mins Read

    Whoop has LeBron – now it wants your mom

    1 Min Read

    Sony temporarily suspends memory card sales due to shortages

    2 Mins Read

    Apple TV is now home to CrunchyRoll anime

    Top Posts

    The Mesh Router Placement Strategy That Finally Gave Me Full Home Coverage

    August 4, 2025980 Views

    Discord will require a face scan or ID for full access next month

    February 9, 2026767 Views

    Best Stores for Buying MP3 and Digital Music You Can Keep Forever

    August 2, 2025373 Views
    Stay In Touch
    • Facebook

    Subscribe to Updates

    Get the latest tech news from FooBar about tech, design and biz.

    Most Popular

    The Mesh Router Placement Strategy That Finally Gave Me Full Home Coverage

    August 4, 2025980 Views

    Discord will require a face scan or ID for full access next month

    February 9, 2026767 Views

    Best Stores for Buying MP3 and Digital Music You Can Keep Forever

    August 2, 2025373 Views
    Our Picks

    Stop falling for scams when Norton’s antivirus software is 70% off right now

    March 28, 2026

    Acer Promo Codes and Deals: Save 40% on Bundles

    March 28, 2026

    Playing Wolfenstein 3D with one hand in 2026

    March 28, 2026

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    Facebook
    • About Us
    • Contact us
    • Privacy Policy
    • Disclaimer
    • Terms and Conditions
    © 2026 GeekBlog

    Type above and press Enter to search. Press Esc to cancel.

    Ad Blocker Enabled!
    Ad Blocker Enabled!
    Our website is made possible by displaying online advertisements to our visitors. Please support us by disabling your Ad Blocker.