3/19/2026 | USA | technology | ✓ Verified - arxiv.org

Detecting Data Poisoning in Code Generation LLMs via Black-Box, Vulnerability-Oriented Scanning

#data poisoning #code generation #LLMs #black-box scanning #vulnerability detection #AI security #training data

📌 Key Takeaways

Researchers propose a black-box scanning method to detect data poisoning in code generation LLMs.
The method focuses on identifying vulnerabilities introduced by poisoned training data.
It operates without requiring access to the model's internal architecture or training data.
The approach aims to enhance security and trust in AI-generated code.

📖 Full Retelling

arXiv:2603.17174v1 Announce Type: cross Abstract: Code generation large language models (LLMs) are increasingly integrated into modern software development workflows. Recent work has shown that these models are vulnerable to backdoor and poisoning attacks that induce the generation of insecure code, yet effective defenses remain limited. Existing scanning approaches rely on token-level generation consistency to invert attack targets, which is ineffective for source code where identical semantic

🏷️ Themes

AI Security, Code Generation

📚 Related People & Topics

Large language model

Type of machine learning model

A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation. The largest and most capable LLMs are generative pre-trained transformers (GPTs) that provide the c...

View Profile → Wikipedia ↗

Entity Intersection Graph

Connections for Large language model:

🌐 Artificial intelligence 3 shared

🌐 Reinforcement learning 3 shared

🌐 Educational technology 2 shared

🌐 Benchmark 2 shared

🏢 OpenAI 2 shared

View full profile

Mentioned Entities

Large language model

Type of machine learning model

Deep Analysis

Why It Matters

This research matters because it addresses a critical security vulnerability in AI systems that generate code, which are increasingly used in software development pipelines. It affects software developers, cybersecurity professionals, and organizations relying on AI-assisted coding tools who could unknowingly introduce malicious code into their applications. The ability to detect data poisoning in black-box models is crucial as many commercial code generation LLMs are proprietary systems where users have no visibility into training data or model internals. This work helps mitigate risks of supply chain attacks through compromised AI models that could lead to widespread security breaches in software ecosystems.

Context & Background

Data poisoning attacks involve malicious actors injecting corrupted or backdoored examples into training datasets to manipulate model behavior
Code generation LLMs like GitHub Copilot, Codex, and others have become widely adopted in software development workflows
Previous research has shown vulnerabilities in ML systems where poisoned training data can cause models to generate malicious outputs under specific triggers
Black-box testing refers to security assessment methods that examine systems without knowledge of their internal workings or training data
The software supply chain has become a major attack vector with incidents like SolarWinds and Log4j demonstrating widespread impact

What Happens Next

Security researchers will likely expand testing methodologies to other AI code generation platforms, potentially leading to vulnerability disclosures about commercial systems. AI security companies may develop commercial scanning tools based on this research for enterprise customers. Regulatory bodies might consider establishing security standards for AI-assisted development tools, particularly in critical infrastructure sectors. The research community will probably explore similar approaches for detecting poisoning in other types of generative AI models beyond code generation.

Frequently Asked Questions

What is data poisoning in AI models?

Data poisoning is a security attack where adversaries intentionally insert malicious or corrupted data into a model's training dataset. This causes the trained model to behave in undesirable ways, such as generating vulnerable code when specific triggers are present. Unlike traditional attacks that target deployed models, poisoning attacks compromise models during the training phase.

Why is black-box scanning important for this problem?

Black-box scanning is crucial because most commercial code generation LLMs are proprietary systems where users cannot access training data or model internals. This approach allows security researchers and organizations to test these systems without requiring cooperation from model providers. It enables independent security validation of AI coding assistants that are increasingly integrated into software development workflows.

How could poisoned code generation models affect software security?

Poisoned models could cause developers to unknowingly introduce security vulnerabilities, backdoors, or malicious functionality into their codebases. Since AI-assisted coding tools are used to accelerate development, poisoned suggestions could spread vulnerabilities across multiple projects. This creates software supply chain risks where compromised AI models become vectors for widespread security breaches.

What types of vulnerabilities might this scanning approach detect?

The vulnerability-oriented scanning likely focuses on detecting code patterns associated with common security flaws like SQL injection, buffer overflows, or authentication bypasses. It may also identify backdoor triggers that cause models to generate malicious code when specific conditions are met. The approach probably tests whether models consistently produce secure code across various programming scenarios and edge cases.

Who should be most concerned about this research?

Software development teams using AI coding assistants should be concerned, especially in security-sensitive industries. AI model developers and providers need to implement stronger training data validation and security measures. Cybersecurity professionals should incorporate AI model testing into their security assessment protocols. Organizations with regulatory compliance requirements for software security should evaluate risks from AI-assisted development tools.

}

Original Source

              arXiv:2603.17174v1 Announce Type: cross 
Abstract: Code generation large language models (LLMs) are increasingly integrated into modern software development workflows. Recent work has shown that these models are vulnerable to backdoor and poisoning attacks that induce the generation of insecure code, yet effective defenses remain limited. Existing scanning approaches rely on token-level generation consistency to invert attack targets, which is ineffective for source code where identical semantic
            

Read full article at source

Source

arxiv.org

Detecting Data Poisoning in Code Generation LLMs via Black-Box, Vulnerability-Oriented Scanning

📌 Key Takeaways

📖 Full Retelling

🏷️ Themes

📚 Related People & Topics

Large language model

Entity Intersection Graph

Mentioned Entities

Large language model

Deep Analysis

Why It Matters

Context & Background

What Happens Next

Frequently Asked Questions

Source

More from USA

News from Other Countries

🇬🇧 United Kingdom

🇺🇦 Ukraine