#Interpretability of LLMs
Latest news articles tagged with "Interpretability of LLMs". Follow the timeline of events, related topics, and entities.
Articles (1)
- 🇺🇸 Do Personality Traits Interfere? Geometric Limitations of Steering in Large Language Models
[USA]
arXiv:2602.15847v1 Announce Type: cross Abstract: Personality steering in large language models (LLMs) commonly relies on injecting trait-specific steering vectors, implicitly assuming that personali...
Related: #AI ethics, #Natural language processing, #Personality modeling