#Interpretability
Latest news articles tagged with "Interpretability". Follow the timeline of events, related topics, and entities.
Articles (12)
- 🇺🇸 Spectral Edge Dynamics Reveal Functional Modes of Learning
[USA]
arXiv:2604.06256v1 Announce Type: cross Abstract: Training dynamics during grokking concentrate along a small number of dominant update directions -- the spectral edge -- which reliably distinguishes...
Related: #AI Research, #Machine Learning
- 🇺🇸 Improving Robustness In Sparse Autoencoders via Masked Regularization
[USA]
arXiv:2604.06495v1 Announce Type: cross Abstract: Sparse autoencoders (SAEs) are widely used in mechanistic interpretability to project LLM activations onto sparse latent spaces. However, sparsity al...
Related: #AI Research, #Machine Learning
- 🇺🇸 Interpretable Cross-Domain Few-Shot Learning with Rectified Target-Domain Local Alignment
[USA]
arXiv:2603.17655v1 Announce Type: cross Abstract: Cross-Domain Few-Shot Learning (CDFSL) adapts models trained with large-scale general data (source domain) to downstream target domains with only sca...
Related: #Machine Learning
- 🇺🇸 Interpretable Context Methodology: Folder Structure as Agentic Architecture
[USA]
arXiv:2603.16021v1 Announce Type: new Abstract: Current approaches to AI agent orchestration typically involve building multi-agent frameworks that manage context passing, memory, error handling, and...
Related: #AI Architecture
- 🇺🇸 Embedding-Aware Feature Discovery: Bridging Latent Representations and Interpretable Features in Event Sequences
[USA]
arXiv:2603.15713v1 Announce Type: cross Abstract: Industrial financial systems operate on temporal event sequences such as transactions, user actions, and system logs. While recent research emphasize...
Related: #Machine Learning
- 🇺🇸 Synthesizing Interpretable Control Policies through Large Language Model Guided Search
[USA]
arXiv:2410.05406v3 Announce Type: replace Abstract: The combination of Large Language Models (LLMs), systematic evaluation, and evolutionary algorithms has enabled breakthroughs in combinatorial opti...
Related: #AI Control
- 🇺🇸 Causal Concept Graphs in LLM Latent Space for Stepwise Reasoning
[USA]
arXiv:2603.10377v1 Announce Type: cross Abstract: Sparse autoencoders can localize where concepts live in language models, but not how they interact during multi-step reasoning. We propose Causal Con...
Related: #AI Reasoning
- 🇺🇸 Unpacking Interpretability: Human-Centered Criteria for Optimal Combinatorial Solutions
[USA]
arXiv:2603.08856v1 Announce Type: cross Abstract: Algorithmic support systems often return optimal solutions that are hard to understand. Effective human-algorithm collaboration, however, requires in...
Related: #Human-Centered Design
- 🇺🇸 Axiomatic On-Manifold Shapley via Optimal Generative Flows
[USA]
arXiv:2603.05093v1 Announce Type: cross Abstract: Shapley-based attribution is critical for post-hoc XAI but suffers from off-manifold artifacts due to heuristic baselines. While generative methods a...
Related: #Machine Learning
- 🇺🇸 Beyond Message Passing: A Symbolic Alternative for Expressive and Interpretable Graph Learning
[USA]
arXiv:2602.16947v1 Announce Type: cross Abstract: Graph Neural Networks (GNNs) have become essential in high-stakes domains such as drug discovery, yet their black-box nature remains a significant ba...
Related: #Graph Neural Networks, #Symbolic Machine Learning, #Expressivity Limits, #Computational Efficiency
- 🇺🇸 SurgRAW: Multi-Agent Workflow with Chain of Thought Reasoning for Robotic Surgical Video Analysis
[USA]
arXiv:2503.10265v2 Announce Type: replace Abstract: Robotic-assisted surgery (RAS) is central to modern surgery, driving the need for intelligent systems with accurate scene understanding. Most exist...
Related: #Robotic surgery, #Vision-language models, #Multi-agent AI, #Zero-shot reasoning
- 🇺🇸 Learning a Generative Meta-Model of LLM Activations
[USA]
arXiv:2602.06964v1 Announce Type: cross Abstract: Existing approaches for analyzing neural network activations, such as PCA and sparse autoencoders, rely on strong structural assumptions. Generative ...
Related: #Artificial Intelligence, #Machine Learning
Key Entities (2)
- Mechanistic interpretability (1 news)
- Large language model (1 news)
About the topic: Interpretability
The topic "Interpretability" aggregates 12+ news articles from various countries.