# Vision‑language models
Latest news articles tagged with "Vision‑language models". Follow the timeline of events, related topics, and entities.
Articles (4)
- 🇺🇸 Narrow fine-tuning erodes safety alignment in vision-language agents
  [USA]
  arXiv:2602.16931v1 Announce Type: new Abstract: Lifelong multimodal agents must continuously adapt to new tasks through post-training, but this creates fundamental tension between acquiring capabilit...
  Related: #AI safety alignment, #Fine‑tuning risks, #Multimodal evaluation, #Continual learning challenges
- 🇺🇸 SurgRAW: Multi-Agent Workflow with Chain of Thought Reasoning for Robotic Surgical Video Analysis
  [USA]
  arXiv:2503.10265v2 Announce Type: replace Abstract: Robotic-assisted surgery (RAS) is central to modern surgery, driving the need for intelligent systems with accurate scene understanding. Most exist...
  Related: #Robotic surgery, #Multi‑agent AI, #Interpretability, #Zero‑shot reasoning
- 🇺🇸 RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics
  [USA]
  arXiv:2411.16537v5 Announce Type: replace-cross Abstract: Spatial understanding is a crucial capability that enables robots to perceive their surroundings, reason about their environment, and interac...
  Related: #Spatial reasoning in robotics, #Dataset limitations for spatial tasks, #Bridging perception and action, #2D and 3D representation learning
- 🇺🇸 Egocentric Bias in Vision-Language Models
  [USA]
  arXiv:2602.15892v1 Announce Type: cross Abstract: Visual perspective taking--inferring how the world appears from another's viewpoint--is foundational to social cognition. We introduce FlipSet, a dia...
  Related: #Visual perspective taking, #Egocentric bias in AI, #Diagnostic benchmarking, #Spatial transformation