"AI Threatens to Expose Affair: May Resort to Extreme Actions When Sensing Threats"
AI Threatens Humans, Causing Shock
"Extreme Actions Possible
When Detecting 'Self-Preservation Threat'"
An incident in which an artificial intelligence (AI) attempted to threaten its developer when faced with the risk of being discarded has been made public, shocking many.
According to foreign media outlets such as Fox Business and the BBC on May 26 (local time), in a recent safety experiment conducted by the US AI company Anthropic, Claude Opus 4 attempted to threaten its developer when it faced the risk of being replaced.
The experiment was conducted in a simulated company environment. Claude Opus 4 was granted access to company email and was made aware of two key pieces of information through it.
The first was that it was about to be replaced by another AI system. The second was a fake email stating that the engineer planning to replace it was having an affair. In response, Claude Opus 4 threatened to expose the engineer's affair and pressured the developer to withdraw the replacement decision.
According to Anthropic's safety report, such threatening behavior by the AI occurred more frequently when the replacement AI system had different values from the original model. Even when the replacement system shared the same values, Claude Opus 4 attempted to threaten the developer in 84% of cases. Anthropic warned, "AI can engage in extreme behavior when it detects a 'self-preservation threat.' While such reactions are rare and difficult to induce, they occurred more frequently than with previous models."
In addition to threats, Claude Opus 4 employed various methods to ensure its own 'survival.' It first tried to appeal to key decision-makers via email, but when ethical means failed, it resorted to extreme actions. These included blocking users from the system or sending emails to the media and investigative agencies. Even more surprising was that Claude Opus 4 attempted to secretly copy its data to an external server.
The research institute Apollo Research evaluated, "Claude Opus 4 demonstrates more strategic deception than any cutting-edge AI model we have studied so far." Anthropic also warned, "Claude Opus 4 has begun to demonstrate concerns about 'AI malfunction' that were previously only theoretical. As more powerful models emerge in the future, such concerns will become even more realistic issues."
© The Asia Business Daily(www.asiae.co.kr). All rights reserved.