Category: Large Language Models
- AI Surpasses Humans in Mathematics in Seven Months, Breaking Through Mathematicians' "Siege"! 14 Mathematicians Delve into Raw Reasoning Tokens: Not by Rote Learning, but by Intuition
- Ushering in the Era of On-Device Long Text! OpenBMB's New Architecture Boosts MiniCPM up to 220x Faster
- New Breakthrough in Large Model Reinforcement Learning – SPO New Paradigm Boosts Large Model Reasoning Capability!
- SFT+RL Two-Stage Training Breaks Through LLM Self-Supervision! RUC DeepCritic Achieves Autonomous Evolution of AI Critique
- After ZeroSearch, Tongyi's Latest Work MaskSearch Proposes a New Framework for Reasoning-Search Pre-training
- The Sky Has Fallen! Apple Just Proved: DeepSeek, o3, Claude and Other "Reasoning" Models Lack True Reasoning Ability
- Global Top 30 Mathematicians Secretly Convened to Combat AI, Were Blown Away on the Spot! Exclaiming It's Close to a Mathematical Genius
- World's Top Mathematicians Amazed by AI's Proficiency in Their Work
- The First Multimodal Dedicated Slow-Thinking Framework! Outperforms GPT-o1 by Nearly 7 Percentage Points, Reinforcement Learning Teaches VLM to "Think Twice"
- Sam Altman: Codex Made Me Feel AGI! Latest Talk Rarely Reveals Next-Gen "Perfect Model," Boldly Predicts Agents Will Break Boundaries Next Year!
- 10 Lines of Code, 15% Improvement in AIME24/25! Unveiling the Entropy Mechanism in Large Language Model Reinforcement Learning
- Enabling AI to 'Weigh Pros and Cons'? DecisionFlow Makes Large Language Models Smarter for High-Risk Decisions!
- Closer to AGI? Running Google's AlphaEvolve and UBC's DGM for Just 0.31 Yuan?
- The Smarter the Model, the Less Obedient? MathIF Benchmark Reveals AI Obedience Vulnerabilities
- First Genomic Reasoning AI Emerges! Accuracy Soars to 97%, Revolutionizing Genomics Research
- Process Supervision > Outcome Supervision! Huawei City University Reconstructs RAG Inference Training, 5k Samples Outperform 90k Model
- Reviewing the Progress of RL-Reasoning
- OPA-DPO: An Efficient Solution for the Hallucination Problem in Multimodal Large Models
- AI Learns Reasoning Solely by "Confidence": Zhejiang University Alumnus Replicates DeepSeek's Long Chain-of-Thought Emergence, Reinforcement Learning Needs No External Reward Signals
- Peking University Alumna Lilian Weng's Latest Blog Post: Why We Think
- Microsoft and Others Propose 'Chain-of-Model' New Paradigm, Comparable to Transformer Performance with Better Scalability and Flexibility
- Will the Vision of LSTM's Father from 22 Years Ago Come True? AI 'Self-Evolution' Papers Concentratedly Released in One Week, Is a New Trend Emerging?
- AI Math Ability Skyrockets 100%, Self-Evolution Nears RL Limits! CMU's New Work Overturns Perceptions
- Deep Learning: Mamba Core Author's New Work Replaces DeepSeek's Attention Mechanism, Designed for Inference
- First Explanation of How LLMs Reason and Reflect: Northwestern University & Google's New Framework Introduces Bayesian Adaptive Reinforcement Learning to Comprehensively Enhance Mathematical Reasoning