Category: Large Language Models

AI Surpasses Humans in Mathematics in Seven Months, Breaking Through Mathematicians' "Siege"! 14 Mathematicians Delve into Raw Reasoning Tokens: Not by Rote Learning, but by Intuition
Ushering in the Era of On-Device Long Text! OpenBMB's New Architecture Boosts MiniCPM up to 220x Faster
New Breakthrough in Large Model Reinforcement Learning: New SPO Paradigm Boosts Large Model Reasoning Capability!
SFT+RL Two-Stage Training Breaks Through LLM Self-Supervision! RUC DeepCritic Achieves Autonomous Evolution of AI Critique
After ZeroSearch, Tongyi's Latest Work MaskSearch Proposes a New Framework for Reasoning-Search Pre-training
The Sky Has Fallen! Apple Just Proved: DeepSeek, o3, Claude and Other "Reasoning" Models Lack True Reasoning Ability
30 of the World's Top Mathematicians Secretly Convened to Challenge AI and Were Blown Away on the Spot, Exclaiming It's Close to a Mathematical Genius
World's Top Mathematicians Amazed by AI's Proficiency in Their Work
The First Slow-Thinking Framework Dedicated to Multimodal Models! Outperforms GPT-o1 by Nearly 7 Percentage Points, Reinforcement Learning Teaches VLM to "Think Twice"
Sam Altman: Codex Made Me Feel AGI! Latest Talk Offers a Rare Glimpse of the Next-Gen "Perfect Model," Boldly Predicts Agents Will Break Boundaries Next Year!
10 Lines of Code, 15% Improvement in AIME24/25! Unveiling the Entropy Mechanism in Large Language Model Reinforcement Learning
Enabling AI to 'Weigh Pros and Cons'? DecisionFlow Makes Large Language Models Smarter for High-Risk Decisions!
Closer to AGI? Running Google's AlphaEvolve and UBC's DGM for Just 0.31 Yuan?
The Smarter the Model, the Less Obedient? MathIF Benchmark Reveals AI Obedience Vulnerabilities
First Genomic Reasoning AI Emerges! Accuracy Soars to 97%, Revolutionizing Genomics Research
Process Supervision > Outcome Supervision! Huawei and City University of Hong Kong Reconstruct RAG Inference Training, 5k Samples Outperform a Model Trained on 90k
Reviewing the Progress of RL-Reasoning
OPA-DPO: An Efficient Solution for the Hallucination Problem in Multimodal Large Models
AI Learns Reasoning Solely by "Confidence": Zhejiang University Alumnus Replicates DeepSeek's Long Chain-of-Thought Emergence, Reinforcement Learning Needs No External Reward Signals
Peking University Alumna Lilian Weng's Latest Blog Post: Why We Think
Microsoft and Others Propose New 'Chain-of-Model' Paradigm, Comparable to Transformer Performance with Better Scalability and Flexibility
Will the 22-Year-Old Vision of the Father of LSTM Come True? A Cluster of AI 'Self-Evolution' Papers Released in a Single Week: Is a New Trend Emerging?
AI Math Ability Skyrockets 100%, Self-Evolution Nears RL Limits! CMU's New Work Overturns Perceptions
Deep Learning: Mamba Core Author's New Work Replaces DeepSeek's Attention Mechanism, Designed for Inference
First Explanation of How LLMs Reason and Reflect: Northwestern University & Google's New Framework Introduces Bayesian Adaptive Reinforcement Learning to Comprehensively Enhance Mathematical Reasoning