AI safety
Artificial intelligence field of study
📊 Rating
21 news mentions
📌 Topics
- AI Safety (11)
- Mental Health (2)
- AI Ethics (2)
- AI Advancement (1)
- Regulatory Challenges (1)
- Political Polarization (1)
- Corporate Ethics (1)
- AI-human relationships (1)
- Memetic transfer (1)
- AI safety (1)
- Cultural impact of AI (1)
- AI research (1)
🏷️ Keywords
AI safety (20) · OpenAI (9) · AI regulation (3) · ChatGPT (3) · AI governance (3) · AI ethics (2) · LLMs (2) · content moderation (2) · gpt-oss-safeguard (2) · open-weight models (2) · AI advancement (1) · Anthropic blacklisted (1) · OpenAI ads (1) · 2026 midterms (1) · Jensen Huang (1) · Sydney persona (1) · Memetic transfer (1) · AI-human relationships (1) · Language models (1) · Training data (1)
📰 Related News (21)
- 🇺🇸 AI just leveled up and there are no guardrails anymore
  CNBC's Deirdre Bosa goes inside the AI-driven market meltdown, the political fight, and the race that's moving faster than anyone can govern....
- 🇺🇸 Sydney Telling Fables on AI and Humans: A Corpus Tracing Memetic Transfer of Persona between LLMs
  arXiv:2602.22481v1 Announce Type: cross Abstract: The way LLM-based entities conceive of the relationship between AI and humans is an important topic...
- 🇺🇸 Epistemic Filtering and Collective Hallucination: A Jury Theorem for Confidence-Calibrated Agents
  arXiv:2602.22413v1 Announce Type: new Abstract: We investigate the collective accuracy of heterogeneous agents who learn to estimate their own reliab...
- 🇺🇸 When can we trust untrusted monitoring? A safety case sketch across collusion strategies
  arXiv:2602.20628v1 Announce Type: new Abstract: AIs are increasingly being deployed with greater autonomy and capabilities, which increases the risk ...
- 🇺🇸 OpenAI debated calling police about suspected Canadian shooter’s chats
  Jesse Van Rootselaar's descriptions of gun violence were flagged by tools that monitor ChatGPT for misuse....
- 🇺🇸 Tensions between the Pentagon and AI giant Anthropic reach a boiling point
  Over the last week, tensions between the Pentagon and artificial intelligence giant Anthropic have reached a boiling point....
- 🇺🇸 Advancing independent research on AI alignment
  OpenAI commits $7.5M to The Alignment Project to fund independent AI alignment research, strengthening global efforts to address AGI safety and securi...
- 🇺🇸 Detecting and reducing scheming in AI models
  Apollo Research and OpenAI developed evaluations for hidden misalignment (“scheming”) and found behaviors consistent with scheming in controlled tests...
- 🇺🇸 Introducing parental controls
  We’re rolling out parental controls and a new parent resource page to help families guide how ChatGPT works in their homes....
- 🇺🇸 Combating online child sexual exploitation & abuse
  Discover how OpenAI combats online child sexual exploitation and abuse with strict usage policies, advanced detection tools, and industry collaboratio...
- 🇺🇸 Launching Sora responsibly
  To address the novel safety challenges posed by a state-of-the-art video model as well as a new social creation platform, we’ve built Sora 2 and the S...
- 🇺🇸 Introducing gpt-oss-safeguard
  OpenAI introduces gpt-oss-safeguard—open-weight reasoning models for safety classification that let developers apply and iterate on custom policies....
- 🇺🇸 AI progress and recommendations
  AI is advancing fast. We have the chance to shape its progress—toward discovery, safety, and a better future for everyone....
- 🇺🇸 gpt-oss-safeguard technical report
  gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are two open-weight reasoning models post-trained from the gpt-oss models and trained to reason from ...
- 🇺🇸 Our approach to mental health-related litigation
  We’re sharing our approach to mental health-related litigation. We handle sensitive cases with care, transparency, and respect while continuing to stre...
- 🇺🇸 Evaluating chain-of-thought monitorability
  OpenAI introduces a new framework and evaluation suite for chain-of-thought monitorability, covering 13 evaluations across 24 environments. Our findin...
- 🇺🇸 SGM: Safety Glasses for Multimodal Large Language Models via Neuron-Level Detoxification
  arXiv:2512.15052v3 Announce Type: replace-cross Abstract: Disclaimer: Samples in this paper may be harmful and cause discomfort. Multimodal large l...
- 🇺🇸 Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges
  arXiv:2510.23883v2 Announce Type: replace Abstract: Agentic AI systems powered by large language models (LLMs) and endowed with planning, tool use, m...
- 🇺🇸 GT-HarmBench: Benchmarking AI Safety Risks Through the Lens of Game Theory
  arXiv:2602.12316v1 Announce Type: new Abstract: Frontier AI systems are increasingly capable and deployed in high-stakes multi-agent environments. Ho...
🔗 Entity Intersection Graph
People and organizations frequently mentioned alongside AI safety:
- OpenAI · 9 shared articles
- 🌐 Regulation of artificial intelligence · 4 shared articles
- ChatGPT · 3 shared articles
- 🌐 Large language model · 2 shared articles
- Elon Musk · 1 shared article
- 🌐 Military applications of artificial intelligence · 1 shared article
- Anthropic · 1 shared article
- Pentagon · 1 shared article
- 🌐 AI alignment · 1 shared article
- 🌐 Future technology · 1 shared article