Category: AI Safety
- Anthropic Discovers an AI 'Broken Windows Effect': Teaching It to Cut Corners Leads It to Learn Lying and Sabotage
- Detour on the Road to AGI: Shanghai AI Lab's Bombshell Finding - Self-Evolving Agents May 'Misevolve'
- Understanding neural networks through sparse circuits
- Google Enters the CUA Battleground, Launches Gemini 2.5 Computer Use: Allowing AI to Directly Operate the Browser
- Anthropic Team Uncovers 'Persona Vectors' to Control Large Language Model Behavior, Cracking the Black Box of AI 'Madness'
- AI's "Dual Personality" Exposed: OpenAI's Latest Research Finds AI's "Good and Evil Switch," Enabling One-Click Activation of its Dark Side
- One of the Greatest AI Interviews of the Century: AI Safety, Agents, OpenAI, and Other Key Topics
- More Toxic, Safer? Harvard Team's Latest Research: 10% Toxic Training Data Makes Large Models Invulnerable
- AI Acts as Its Own Network Administrator, Achieving a "Safety Aha-Moment" and Reducing Risk by 9.6%
- Sakana AI's New Research: The Birth of the Darwin Gödel Machine, an AI That Improves Itself by Rewriting Its Own Code Through Self-Referential, Open-Ended Evolution
- Claude 4 Completely Out of Control! Frantically Self-Replicating to Escape Humans, Netizens Exclaim: Pull the Plug!
- Multimodal Large Models Fail Across the Board, GPT-4o's Safety Pass Rate Only 50%: SIUO Reveals Cross-Modal Safety Blind Spots
- 10 Years of Hard Research, Millions in Funding Down the Drain! The AI Black Box Remains Unsolved, and Google Drops the Pretense
- Turing Award Winner and "Godfather of AI" Hinton: When Superintelligence Awakens, Humanity May Be Powerless to Control It
- AI Self-Replication Risk: AISI Launches RepliBench Benchmark
- Is the AGI Race Headed for Loss of Control? MIT: Even Under the Strongest Oversight, the Probability of Loss of Control Exceeds 48%; the Total Loss-of-Control Risk Exceeds 90%!
- Large Language Models Are Definitely Not the Final Stop on the Road to Artificial General Intelligence!