An OpenAI safety research lead departed for Anthropic
One of the most controversial issues in the AI industry over the past year was what to do when a user displays signs of mental health struggles in a chatbot conversation. OpenAI’s head of that type of safety research, Andrea Vallone, has now joined Anthropic.
”Over the past year, I led OpenAI’s research on a question with almost no established precedents: how should models respond when confronted with signs of emotional over-reliance or early indications of mental health distress?” Vallone wrote in a LinkedIn post a couple of months ago.
Vallone, who spent three years at OpenAI and built out the “model policy” research team there, worked on how to best deploy GPT-4, OpenAI’s reasoning models, and GPT-5, as well as developing training processes for some of the AI industry’s most popular safety techniques, such as rule-based rewards. Now, she’s joined the alignment team at Anthropic, a group tasked with understanding AI models’ biggest risks and how to address them.
Vallone will be working under Jan Leike, the OpenAI safety research lead who departed the company in May 2024 due to concerns that OpenAI’s “safety culture and processes have taken a backseat to shiny products.”
Leading AI startups have increasingly incited controversy over the past year over users’ struggles with mental health, which can spiral deeper after confiding in AI chatbots, especially since safety guardrails tend to break down in longer conversations. Some teens have died by suicide, or adults have committed murder, after confiding in the tools. Several families have filed wrongful death suits, and there has been at least one Senate subcommittee hearing on the matter. Safety researchers have been tasked with addressing the problem.
Sam Bowman, a leader on the alignment team, wrote in a LinkedIn post that he was “proud of how seriously Anthropic is taking the problem of figuring out how an AI system should behave.”
In a LinkedIn post on Thursday, Vallone wrote that she’s “eager to continue my research at Anthropic, focusing on alignment and fine-tuning to shape Claude’s behavior in novel contexts.”
You may be interested

The best Prime Day gaming deals
new admin - Jun 23, 2026The best deals on Nintendo Switch 2 games$53The latest entry in the long-running survival horror series, with two different styles…

Senate adopts House-passed Iran resolution in symbolic rebuke of Trump
new admin - Jun 23, 2026Washington — The Senate on Tuesday approved a House-passed war powers resolution on Iran, marking the first time such a measure…

Prime Day has served up several great deals on 4K TVs
new admin - Jun 23, 2026There are three times of year that are best for buying a new TV: leading up to the Super Bowl,…






























