#Interpretability

Latest news articles tagged with "Interpretability". Follow the timeline of events, related topics, and entities.

Articles (12)

🇺🇸 Spectral Edge Dynamics Reveal Functional Modes of Learning — 09/04/2026 [USA]
arXiv:2604.06256v1. Announce Type: cross. Abstract: Training dynamics during grokking concentrate along a small number of dominant update directions -- the spectral edge -- which reliably distinguishes...
Related: #AI Research, #Machine Learning
🇺🇸 Improving Robustness In Sparse Autoencoders via Masked Regularization — 09/04/2026 [USA]
arXiv:2604.06495v1. Announce Type: cross. Abstract: Sparse autoencoders (SAEs) are widely used in mechanistic interpretability to project LLM activations onto sparse latent spaces. However, sparsity al...
Related: #AI Research, #Machine Learning
🇺🇸 Interpretable Cross-Domain Few-Shot Learning with Rectified Target-Domain Local Alignment — 19/03/2026 [USA]
arXiv:2603.17655v1. Announce Type: cross. Abstract: Cross-Domain Few-Shot Learning (CDFSL) adapts models trained with large-scale general data (source domain) to downstream target domains with only sca...
Related: #Machine Learning
🇺🇸 Interpretable Context Methodology: Folder Structure as Agentic Architecture — 18/03/2026 [USA]
arXiv:2603.16021v1. Announce Type: new. Abstract: Current approaches to AI agent orchestration typically involve building multi-agent frameworks that manage context passing, memory, error handling, and...
Related: #AI Architecture
🇺🇸 Embedding-Aware Feature Discovery: Bridging Latent Representations and Interpretable Features in Event Sequences — 18/03/2026 [USA]
arXiv:2603.15713v1. Announce Type: cross. Abstract: Industrial financial systems operate on temporal event sequences such as transactions, user actions, and system logs. While recent research emphasize...
Related: #Machine Learning
🇺🇸 Synthesizing Interpretable Control Policies through Large Language Model Guided Search — 12/03/2026 [USA]
arXiv:2410.05406v3. Announce Type: replace. Abstract: The combination of Large Language Models (LLMs), systematic evaluation, and evolutionary algorithms has enabled breakthroughs in combinatorial opti...
Related: #AI Control
🇺🇸 Causal Concept Graphs in LLM Latent Space for Stepwise Reasoning — 12/03/2026 [USA]
arXiv:2603.10377v1. Announce Type: cross. Abstract: Sparse autoencoders can localize where concepts live in language models, but not how they interact during multi-step reasoning. We propose Causal Con...
Related: #AI Reasoning
🇺🇸 Unpacking Interpretability: Human-Centered Criteria for Optimal Combinatorial Solutions — 11/03/2026 [USA]
arXiv:2603.08856v1. Announce Type: cross. Abstract: Algorithmic support systems often return optimal solutions that are hard to understand. Effective human-algorithm collaboration, however, requires in...
Related: #Human-Centered Design
🇺🇸 Axiomatic On-Manifold Shapley via Optimal Generative Flows — 06/03/2026 [USA]
arXiv:2603.05093v1. Announce Type: cross. Abstract: Shapley-based attribution is critical for post-hoc XAI but suffers from off-manifold artifacts due to heuristic baselines. While generative methods a...
Related: #Machine Learning
🇺🇸 Beyond Message Passing: A Symbolic Alternative for Expressive and Interpretable Graph Learning — 20/02/2026 [USA]
arXiv:2602.16947v1. Announce Type: cross. Abstract: Graph Neural Networks (GNNs) have become essential in high-stakes domains such as drug discovery, yet their black-box nature remains a significant ba...
Related: #Graph Neural Networks, #Symbolic Machine Learning, #Expressivity Limits, #Computational Efficiency
🇺🇸 SurgRAW: Multi-Agent Workflow with Chain of Thought Reasoning for Robotic Surgical Video Analysis — 19/02/2026 [USA]
arXiv:2503.10265v2. Announce Type: replace. Abstract: Robotic-assisted surgery (RAS) is central to modern surgery, driving the need for intelligent systems with accurate scene understanding. Most exist...
Related: #Robotic surgery, #Vision-language models, #Multi-agent AI, #Zero-shot reasoning
🇺🇸 Learning a Generative Meta-Model of LLM Activations — 09/02/2026 [USA]
arXiv:2602.06964v1. Announce Type: cross. Abstract: Existing approaches for analyzing neural network activations, such as PCA and sparse autoencoders, rely on strong structural assumptions. Generative ...
Related: #Artificial Intelligence, #Machine Learning

Key Entities (2)

Mechanistic interpretability (1 news)
Large language model (1 news)

About the topic: Interpretability

The topic "Interpretability" aggregates 12+ news articles from various countries.