Category: AI Agents

Can Large Models Handle Precision Work Too?! MIT Top Conference Paper Teaches AI to Operate Industrial CAD Software
The Significance of Gemini 3: AI Has Moved Past the 'Hallucination Phase' and Is Approaching Humans; 'Human-Machine Collaboration' Will Shift from 'Humans Correcting AI' to 'Humans Guiding AI Work'
Claude Launches Skills Feature and Agent Skills Development Guide
Letting CoT "Evolve" with the Environment: AgileThinker Achieves "Thinking While Doing" | Latest from Tsinghua
Reinforcement Learning + Large Model Memory: Mem-α, Enabling Agents to "Learn How to Remember" for the First Time
The More You Fail, The Faster You Learn! Trajectory Rewriting Allows AI Agents to Create Perfect Experiences from Mistakes!
The Two Major Pain Points of Agent Long-Range Search Have Been Solved! CAS DeepMiner Runs Nearly 100 Rounds with 32k Context, Open-Source Performance Closes In on Closed Source
Abandoning Fine-Tuning: Stanford Co-releases Agentic Context Engineering (ACE), Boosting Model Performance by 10% and Reducing Token Costs by 83%
Google Enters the CUA Battleground, Launches Gemini 2.5 Computer Use: Allowing AI to Directly Operate the Browser
Stanford Proposes New RL Paradigm: 3B Model Agent Outperforms Claude, GPT-4
OpenAI Board Chair: "Per-Token Billing" Is Completely Wrong, Market Will Eventually Choose "Outcome-Based Pricing"
ARPO: Agentic Reinforced Policy Optimization, Enabling Agents to Explore One Step Further at Critical Moments
RAG Can Also Reason! Thoroughly Solving the Multi-Source Heterogeneous Knowledge Challenge
OpenAI Podcast Revisited: The AI Coding War! Developers Are the Most Fortunate: Specialized Code Models Will Emerge! Host Leaks: "I Like Claude the Most!"
RL Scaling Breakthrough! DeepSWE Open-Source AI Agent Tops Leaderboard, Training Methods and Weights Fully Released
One of the Greatest AI Interviews of the Century: AI Safety, Agents, OpenAI, and Other Key Topics
The Autonomous Agent Approach Is Wrong! Chinese Scholars Propose LLM-HAS: Shifting from "Autonomous Capability" to "Collaborative Intelligence"
Amazon's New SOP Benchmark: The Ultimate Test for AI Agents. How Do Top Agents Score?
Breaking! Meta Open-Sources Its Latest World Model
Wharton Professor Ethan: Are We Really Using AI? Or Just Letting It Fill Blanks, Cut Costs, and Accelerate the Path to Extinction?
Microsoft Releases AI Agent Failure Whitepaper, Detailing Various Malicious Agents
Sam Altman: Codex Made Me Feel AGI! Latest Talk Rarely Reveals Next-Gen "Perfect Model," Boldly Predicts Agents Will Break Boundaries Next Year!
RMoA: Residual Extraction Mixture-of-Agents, Enabling Agents to Discover New Information and Adaptively Stop [ACL2025]
Building an AI Software Engineer in Two Years! OpenAI Codex Authors Unveil a New Paradigm for Human-AI Pair Programming
Microsoft Releases NLWeb: The Secret Weapon to Turn Any Website into an AI Application!