When Backdoors Go Beyond Triggers: Semantic Drift in Diffusion Models Under Encoder Attacks


#Backdoor attacks #Text-to-image models #Semantic drift #Encoder poisoning #Diffusion models #SEMAD framework #Jacobian analysis #Representation manifold

📌 Key Takeaways

  • Backdoor attacks cause persistent, trigger-free semantic corruption in text-to-image models
  • Encoder poisoning reshapes representation manifolds through geometric mechanisms
  • SEMAD framework quantifies embedding drift and functional misalignment
  • Current evaluation methods are insufficient for detecting these vulnerabilities
  • Geometric audits are necessary beyond simple attack success rates
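The "embedding drift" idea in the takeaways can be illustrated with a minimal toy sketch. This is not the authors' SEMAD implementation: the "encoders" below are hypothetical linear maps, and the poisoning is modeled as the kind of low-rank, target-centered perturbation the paper describes. The point is only to show how trigger-free drift can be quantified as the cosine distance between clean and poisoned embeddings of the same benign prompt.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a clean and a poisoned text encoder: linear maps
# from a 64-dim prompt feature space to a 32-dim embedding space.
W_clean = rng.normal(size=(32, 64))

# Model poisoning as a rank-1, target-centered deformation: a single
# direction u in embedding space toward which inputs are pulled.
u = rng.normal(size=(32, 1))
v = rng.normal(size=(64, 1))
W_poisoned = W_clean + 0.5 * (u @ v.T)

def embed(W, x):
    """Unit-normalized embedding of prompt features x under encoder W."""
    z = W @ x
    return z / np.linalg.norm(z)

# "Benign" prompts: random feature vectors with NO trigger present.
prompts = rng.normal(size=(100, 64))

# Drift = 1 - cosine similarity between clean and poisoned embeddings
# of the SAME trigger-free prompt; nonzero drift means the backdoor
# corrupts representations even when the trigger is absent.
drifts = [1.0 - float(embed(W_clean, x) @ embed(W_poisoned, x))
          for x in prompts]
mean_drift = float(np.mean(drifts))
print(f"mean trigger-free embedding drift: {mean_drift:.4f}")
```

Even though no prompt contains a trigger, the perturbed encoder moves every embedding, which is the article's central claim: the damage is not confined to trigger activation.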

📖 Full Retelling

Researchers Shenyang Chen and Liuwan Zhu published a paper on February 21, 2026, revealing that backdoor attacks on text-to-image diffusion models cause more harm than previously understood: encoder-side poisoning creates persistent semantic corruption that reshapes the representation manifold beyond simple trigger activation.

The research challenges the conventional paradigm for evaluating backdoor attacks on text-to-image models, which has traditionally focused on trigger activation and visual fidelity alone. Through Jacobian-based analysis, the authors found that backdoors act as low-rank, target-centered deformations that amplify local sensitivity, causing distortion to propagate coherently across semantic neighborhoods.

To quantify this structural degradation, the researchers introduced SEMAD (Semantic Alignment and Drift), a diagnostic framework that measures both internal embedding drift and downstream functional misalignment. Their findings, validated across both diffusion and contrastive paradigms, expose the deep structural risks of encoder poisoning and highlight the need for geometric audits that go beyond simple attack success rates.
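The Jacobian claim can also be sketched in miniature. For a linear encoder f(x) = Wx, the Jacobian is simply W, so a backdoor's effect on the Jacobian is exactly the weight perturbation. The toy example below (an illustrative construction, not the paper's analysis) plants a rank-1 deformation and then recovers its low-rank structure from the singular value spectrum, which is the kind of geometric audit the authors argue for.

```python
import numpy as np

rng = np.random.default_rng(1)

d_in, d_out = 64, 32
W_clean = rng.normal(size=(d_out, d_in))

# Plant a rank-1, target-centered backdoor deformation.
u = rng.normal(size=(d_out, 1))
v = rng.normal(size=(d_in, 1))
W_poisoned = W_clean + 0.8 * (u @ v.T)

# For a linear encoder f(x) = Wx the Jacobian is W itself, so the
# change in the Jacobian induced by the backdoor is the perturbation.
delta_J = W_poisoned - W_clean

# Singular value spectrum of the Jacobian perturbation: a low-rank
# deformation concentrates almost all energy in a few directions.
s = np.linalg.svd(delta_J, compute_uv=False)
energy_top1 = float(s[0] ** 2 / np.sum(s ** 2))
print(f"fraction of perturbation energy in top singular direction: {energy_top1:.3f}")
```

Because the planted deformation is exactly rank 1, essentially all of its energy sits in the top singular direction; in a real audit of a nonlinear encoder, the Jacobian would be estimated numerically at sample prompts and the same spectrum examined.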

🏷️ Themes

AI Security, Model Vulnerability, Semantic Corruption

📚 Related People & Topics

Semantic change

Evolution of a word's meaning

Semantic change (also semantic shift, semantic progression, semantic development, or semantic drift) is a form of language change regarding the evolution of word usage—usually to the point that the modern meaning is radically different from the original usage. In diachronic (or historical) linguistics...


Diffusion model

Technique for the generative modeling of a continuous probability distribution

In machine learning, diffusion models, also known as diffusion-based generative models or score-based generative models, are a class of latent variable generative models. A diffusion model consists of two major components: the forward diffusion process, and the reverse sampling process. The goal of ...



Original Source

arXiv:2602.20193 [cs.CR] — Computer Science > Cryptography and Security
Submitted on 21 Feb 2026 (v1: Sat, 21 Feb 2026 23:48:04 UTC, 3,514 KB), from Shenyang Chen

Title: When Backdoors Go Beyond Triggers: Semantic Drift in Diffusion Models Under Encoder Attacks
Authors: Shenyang Chen, Liuwan Zhu

Abstract: Standard evaluations of backdoor attacks on text-to-image (T2I) models primarily measure trigger activation and visual fidelity. We challenge this paradigm, demonstrating that encoder-side poisoning induces persistent, trigger-free semantic corruption that fundamentally reshapes the representation manifold. We trace this vulnerability to a geometric mechanism: a Jacobian-based analysis reveals that backdoors act as low-rank, target-centered deformations that amplify local sensitivity, causing distortion to propagate coherently across semantic neighborhoods. To rigorously quantify this structural degradation, we introduce SEMAD (Semantic Alignment and Drift), a diagnostic framework that measures both internal embedding drift and downstream functional misalignment. Our findings, validated across diffusion and contrastive paradigms, expose the deep structural risks of encoder poisoning and highlight the necessity of geometric audits beyond simple attack success rates.

Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
DOI: https://doi.org/10.48550/arXiv.2602.20193

Source

arxiv.org
