
Differentially Private Multimodal In-Context Learning

#differential privacy #multimodal learning #in-context learning #data protection #AI models

📌 Key Takeaways

  • Differential privacy is applied to multimodal in-context learning to protect sensitive data.
  • The approach integrates privacy mechanisms into models handling multiple data types like text and images.
  • It aims to maintain model utility while ensuring user privacy during training and inference.
  • The research addresses challenges in balancing privacy guarantees with performance in complex AI tasks.

📖 Full Retelling

arXiv:2603.04894v1. Abstract: Vision-language models are increasingly applied to sensitive domains such as medical imaging and personal photographs, yet existing differentially private methods for in-context learning are limited to few-shot, text-only settings because privacy cost scales with the number of tokens processed. We present Differentially Private Multimodal Task Vectors (DP-MTV), the first framework enabling many-shot multimodal in-context learning with formal $(\varepsilon, \delta)$-differential privacy by aggregating hundreds of demonstrations into compact task vectors in activation space. DP-MTV partitions private data into disjoint chunks, applies per-layer clipping to bound sensitivity, and adds calibrated noise to the aggregate, requiring only a single noise addition that enables unlimited inference queries. We evaluate on eight benchmarks across three VLM architectures, supporting deployment with or without auxiliary data. At $\varepsilon=1.0$, DP-MTV achieves 50% on VizWiz compared to 55% non-private and 35% zero-shot, preserving most of the gain from in-context learning under meaningful privacy constraints.
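The chunk-partition, clip, and noise pipeline described in the abstract can be sketched with a standard Gaussian mechanism. This is a minimal illustration, not the paper's implementation: the function names, the use of the mean as the aggregator, and the sensitivity and noise calibration (`2C/k` sensitivity with the classic Gaussian-mechanism noise scale) are assumptions; the paper's exact clipping and calibration may differ. Task vectors are mocked as per-layer NumPy arrays.

```python
import numpy as np

def clip_per_layer(task_vector, clip_norm):
    """Clip each layer's vector to L2 norm <= clip_norm (bounds sensitivity)."""
    clipped = {}
    for layer, v in task_vector.items():
        norm = np.linalg.norm(v)
        clipped[layer] = v * min(1.0, clip_norm / max(norm, 1e-12))
    return clipped

def dp_aggregate(chunk_vectors, clip_norm, epsilon, delta, rng):
    """Average one task vector per disjoint chunk, then add calibrated
    Gaussian noise once; the noisy result can be reused for any number
    of inference queries."""
    k = len(chunk_vectors)
    clipped = [clip_per_layer(tv, clip_norm) for tv in chunk_vectors]
    # Replacing one chunk moves the per-layer mean by at most 2*clip_norm/k.
    sensitivity = 2.0 * clip_norm / k
    # Classic Gaussian-mechanism calibration for (epsilon, delta)-DP.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    noisy = {}
    for layer in clipped[0]:
        mean = np.mean([c[layer] for c in clipped], axis=0)
        noisy[layer] = mean + rng.normal(0.0, sigma, size=mean.shape)
    return noisy

# Demo with mock activation-space task vectors (two layers, 20 chunks).
rng = np.random.default_rng(0)
chunks = [{"layer0": rng.normal(size=8), "layer1": rng.normal(size=8)}
          for _ in range(20)]
noisy_tv = dp_aggregate(chunks, clip_norm=1.0, epsilon=1.0, delta=1e-5, rng=rng)
```

Because the noise is added to the aggregate exactly once, the privacy budget is spent at aggregation time rather than per query, which is what decouples the privacy cost from the number of demonstrations and inference calls.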

🏷️ Themes

Privacy, AI Learning


Original Source

Computer Science > Artificial Intelligence

arXiv:2603.04894 [cs.AI] (arXiv:2603.04894v1 for this version)
Submitted on 5 Mar 2026

Title: Differentially Private Multimodal In-Context Learning
Authors: Ivoline C. Ngong, Zarreen Reza, Joseph P. Near

Abstract: Vision-language models are increasingly applied to sensitive domains such as medical imaging and personal photographs, yet existing differentially private methods for in-context learning are limited to few-shot, text-only settings because privacy cost scales with the number of tokens processed. We present Differentially Private Multimodal Task Vectors (DP-MTV), the first framework enabling many-shot multimodal in-context learning with formal $(\varepsilon, \delta)$-differential privacy by aggregating hundreds of demonstrations into compact task vectors in activation space. DP-MTV partitions private data into disjoint chunks, applies per-layer clipping to bound sensitivity, and adds calibrated noise to the aggregate, requiring only a single noise addition that enables unlimited inference queries. We evaluate on eight benchmarks across three VLM architectures, supporting deployment with or without auxiliary data. At $\varepsilon=1.0$, DP-MTV achieves 50% on VizWiz compared to 55% non-private and 35% zero-shot, preserving most of the gain from in-context learning under meaningful privacy constraints.

Subjects: Artificial Intelligence (cs.AI)
DOI: https://doi.org/10.48550/arXiv.2603.04894 (arXiv-issued DOI via DataCite, pending registration)
Submission history: [v1] Thu, 5 Mar 2026 07:36:02 UTC (811 KB), from Ivoline Ngong

Source

arxiv.org
