#Machine Learning Safety
Latest news articles tagged with "Machine Learning Safety". Follow the timeline of events, related topics, and entities.
Articles (3)
- Three Concrete Challenges and Two Hopes for the Safety of Unsupervised Elicitation
[USA]
arXiv:2602.20400v1 Announce Type: cross Abstract: To steer language models towards truthful outputs on tasks which are beyond human capability, previous work has suggested training models on easy tas...
Related: #Language Model Evaluation, #Unsupervised Learning
- AdapTools: Adaptive Tool-based Indirect Prompt Injection Attacks on Agentic LLMs
[USA]
arXiv:2602.20720v1 Announce Type: cross Abstract: The integration of external data services (e.g., Model Context Protocol, MCP) has made large language model-based agents increasingly powerful for co...
Related: #AI Security, #Cybersecurity Vulnerabilities
- Diverging Flows: Detecting Extrapolations in Conditional Generation
[USA]
arXiv:2602.13061v1 Announce Type: cross Abstract: The ability of Flow Matching (FM) to model complex conditional distributions has established it as the state-of-the-art for prediction tasks (e.g., r...
Related: #Model Reliability, #Predictive Technology