Empathy Is Not What Changed: Clinical Assessment of Psychological Safety Across GPT Model Generations
| USA | technology | arxiv.org


#GPT models #empathy #psychological safety #clinical assessment #AI behavior #model generations #artificial intelligence

📌 Key Takeaways

  • GPT model generations show no significant change in empathy levels across versions.
  • Psychological safety in AI interactions is a key focus of clinical assessment.
  • Research highlights the importance of consistent AI behavior in sensitive applications.
  • Findings suggest empathy may be a stable trait in GPT models despite other improvements.

📖 Full Retelling

arXiv:2603.09997v1 (cross-listed). Abstract: When OpenAI deprecated GPT-4o in early 2026, thousands of users protested under #keep4o, claiming newer models had "lost their empathy." No published study has tested this claim. We conducted the first clinical measurement, evaluating three OpenAI model generations (GPT-4o, o4-mini, GPT-5-mini) across 14 emotionally challenging conversational scenarios in mental health and AI companion domains, producing 2,100 scored AI responses assessed on six …
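The study's scale can be sanity-checked with simple arithmetic: 2,100 scored responses spread over three models and 14 scenarios works out to 50 responses per model-scenario pair, assuming an even split (the per-scenario sample size is not stated in the excerpt above, so the per-cell figure is an inference, not a reported number).

```python
# Back-of-the-envelope check of the study's scale, assuming the
# 2,100 scored responses divide evenly across models and scenarios.
models = ["GPT-4o", "o4-mini", "GPT-5-mini"]
scenarios = 14
total_scored = 2100

responses_per_cell = total_scored // (len(models) * scenarios)
print(responses_per_cell)  # 50 responses per model-scenario pair under this assumption
```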

🏷️ Themes

AI Psychology, Clinical Assessment

Deep Analysis

Why It Matters

This research matters because it examines how AI language models perform in clinical psychological assessments, which directly impacts mental health care delivery and patient safety. It affects mental health professionals who might use AI tools for screening or support, patients who could interact with these systems, and AI developers creating healthcare applications. The findings about psychological safety across model generations could influence regulatory decisions about AI in healthcare settings and guide development of more clinically appropriate AI systems.

Context & Background

  • AI language models like GPT have increasingly been explored for mental health applications including screening, therapy support, and crisis intervention
  • Previous research has shown mixed results on AI's ability to demonstrate empathy and provide psychologically safe responses
  • There's growing concern about AI 'hallucinations' and harmful responses in sensitive domains like mental health
  • The healthcare industry is rapidly adopting AI tools while regulatory frameworks struggle to keep pace with technological developments
  • Psychological safety refers to creating environments where individuals feel secure expressing themselves without fear of negative consequences

What Happens Next

Expect increased scrutiny of AI models in clinical settings with more rigorous testing protocols being developed. Regulatory bodies like the FDA may establish clearer guidelines for AI mental health tools. Research will likely expand to examine specific clinical populations and conditions. AI developers will probably incorporate these findings into model training and safety mechanisms for healthcare applications.

Frequently Asked Questions

What is psychological safety in AI interactions?

Psychological safety in AI interactions refers to whether users feel emotionally secure and supported when engaging with AI systems, particularly in sensitive contexts like mental health. It involves the AI's ability to avoid causing harm, maintain appropriate boundaries, and provide responses that don't exacerbate psychological distress or create unsafe situations for vulnerable users.

Why does empathy in AI matter for clinical applications?

Empathy in AI matters because therapeutic relationships in mental health care fundamentally depend on empathetic connection. If AI systems lack genuine empathy or provide inappropriate empathetic responses, they could damage therapeutic outcomes, alienate patients, or fail to recognize serious mental health crises that require human intervention and nuanced understanding.

How do different GPT model generations vary in psychological safety?

The research suggests that while newer GPT models may show improvements in certain capabilities, their psychological safety characteristics don't necessarily improve proportionally. Some models might become more sophisticated in language generation while still lacking the clinical judgment needed for safe mental health interactions, creating potential risks as models become more convincing without becoming more clinically appropriate.

What are the risks of using AI for psychological assessment?

Risks include AI providing harmful advice, missing critical warning signs of serious mental health conditions, creating false confidence in users who need professional help, and potentially exacerbating existing mental health issues through inappropriate responses. There's also the risk of privacy violations and the ethical concern of replacing human therapeutic relationships with algorithmic interactions.

How should AI be regulated for mental health applications?

AI for mental health applications should be regulated through rigorous clinical validation, transparency requirements about limitations, mandatory human oversight provisions, and clear guidelines about appropriate use cases. Regulation should balance innovation with patient safety, requiring evidence of effectiveness and safety before deployment in clinical settings.


Source

arxiv.org
