Language Models Can Explain Visual Features via Steering


📖 Full Retelling

arXiv:2603.22593v1 Announce Type: cross Abstract: Sparse Autoencoders uncover thousands of features in vision models, yet explaining these features without requiring human intervention remains an open challenge. While previous work has proposed generating correlation-based explanations based on top activating input examples, we present a fundamentally different alternative based on causal interventions. We leverage the structure of Vision-Language Models and steer individual SAE features in the
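The intervention the abstract describes can be illustrated with a toy sketch: adding a scaled SAE decoder direction to a model's activations so that downstream generation is biased toward the concept that feature encodes. The weights, dimensions, and the `steer` helper below are all hypothetical stand-ins, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_features = 16, 64

# Toy SAE decoder: one direction per feature (in practice, learned
# on the vision encoder's activations). Rows are unit-normalized.
W_dec = rng.standard_normal((n_features, d_model))
W_dec /= np.linalg.norm(W_dec, axis=1, keepdims=True)

def steer(activations, feature_idx, alpha):
    """Causal intervention: push every token's activation a distance
    alpha along the decoder direction of one SAE feature."""
    return activations + alpha * W_dec[feature_idx]

acts = rng.standard_normal((10, d_model))   # 10 visual-token activations
steered = steer(acts, feature_idx=3, alpha=5.0)
```

In a real Vision-Language Model, the steered activations would be passed on to the language head, and the shift in the generated text (relative to the unsteered run) serves as a natural-language explanation of what the feature represents.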

📚 Related People & Topics

Steering

Steering

Control of the direction of motion of vehicles and other objects

Steering is the control of the direction of motion or the components that enable its control. Steering is achieved through various arrangements, among them ailerons for airplanes, rudders for boats, cyclic tilting of rotors for helicopters, and many more.


Explainable artificial intelligence

AI whose outputs can be understood by humans

Within artificial intelligence (AI), explainable AI (XAI), generally overlapping with interpretable AI or explainable machine learning (XML), is a field of research that explores methods that provide humans with the ability of intellectual oversight over AI algorithms. The main focus is on the reaso...


Computer vision

Computerized information extraction from images

Computer vision tasks include methods for acquiring, processing, analyzing, and understanding digital images, and extraction of high-dimensional data from the real world in order to produce numerical or symbolic information, e.g. in the form of decisions. "Understanding" in this context signifies th...




Source

arxiv.org
