#Model interpretability

Latest news articles tagged with "Model interpretability".

Articles (2)

🇺🇸 X-MAP: eXplainable Misclassification Analysis and Profiling for Spam and Phishing Detection — 18/02/2026 [USA]
arXiv:2602.15298v1 (announce type: new). Abstract: Misclassifications in spam and phishing detection are very harmful, as false negatives expose users to attacks while false positives degrade trust. Exi...
Related: #Cybersecurity, #Email spam and phishing detection, #Explainable AI, #Misclassification analysis
🇺🇸 Logit Distance Bounds Representational Similarity — 18/02/2026 [USA]
arXiv:2602.15438v1 (announce type: cross). Abstract: For a broad family of discriminative models that includes autoregressive language models, identifiability results imply that if two models induce the...
Related: #Representational similarity, #Statistical distance measures, #Identifiability in machine learning, #Autoregressive language models