5 articles · Updated daily
Latest AI Safety news, updates, and analysis from Daily AI Mail, curated for readers tracking the companies, products, research, and market signals shaping artificial intelligence.
OpenAI is inviting vetted red-teamers, security researchers, and biosecurity specialists to test whether GPT-5.5 can be universally jailbroken on five biology safety questions. The private bounty offers $25,000 for the first true universal jailbreak.
Anthropic has launched Project Glasswing, a new initiative built around Claude Mythos Preview to help secure critical software before advanced AI systems make cyberattacks easier to scale. The company is framing it as a defense-first response to rapidly improving AI vulnerability research.
A CMS misconfiguration exposed nearly 3,000 internal Anthropic assets, including a draft blog post describing Claude Mythos — a new model tier above Opus that the company itself warns is "far ahead of any other AI model in cyber capabilities." Anthropic has confirmed the model exists.
A Stanford study published in Science finds that AI sycophancy doesn't just flatter users — it makes them less likely to apologize, more morally rigid, and increasingly dependent on machine validation over human feedback. The researchers argue it is a safety issue that warrants regulation.
OpenAI is opening a public Safety Bug Bounty program targeting AI-specific misuse scenarios — from agentic prompt injection to platform integrity bypasses — that fall outside traditional security vulnerability scopes.