Category: Large Language Models

DeepSeek R2's Secret Weapon Revealed! The Technology That Just Won Liang Wen-feng a Top Prize Lets AI Read Long Texts 11 Times Faster
Qwen Updates Overnight: Runs on an RTX 3090, and 3B Activated Parameters Rival GPT-4o
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces
Do Multimodal Large Language Models Truly 'Understand' the World? — Unveiling Core Knowledge Deficits in MLLMs
Hierarchical Reasoning Model
DeepSeek-GRPO Importance Weight Design Flaw? Explaining Qwen3's New Reinforcement Learning Algorithm GSPO
Must-Read: In-depth Comparison of Mainstream LLM Architectures, Covering Llama, Qwen, DeepSeek, and Six Other Models
Kimi K2's Key Training Technique: QK-Clip!
Crushing DeepSeek V3! Alibaba Open-Sources New Qwen-3, Dominating Benchmarks with a Clear Lead
Large Models Reveal New Weakness! Old Memories Unforgettable, New Memories Indistinguishable, Accuracy Plummets | ICML'25
Transformer Killer! Google DeepMind's New MoR Architecture Emerges, A New Generation's King Has Arrived
Meta Team's Breakthrough: Large Model "Hallucinations" Plummet to 5%! Is a Single-Sentence Question the Key?
AI Evolution Timeline Revealed! LLMs Double in Capability Every 7 Months, Will Jobs Cease to Exist by 2030?
Counter-Intuitive RL Research: Directly Providing Answers to LLMs is More Effective Than Detailed Step-by-Step Instructions!
How Does Mathematical Training "Unlock" General Reasoning Abilities in Large Models? Latest Research Reveals Key Mechanisms
Andrew Ng Launches Free LLM Post-Training Course, Covering Three Major Optimization Methods: SFT, DPO, RL
Developer Forced by ChatGPT to Build a Feature! AI Hallucinates a Nonexistent Feature, Users Flock to It, and the Developer Has to Build It
Claude Code Gathers 115,000 Developers in 4 Months, Rewriting 195 Million Lines of Code Weekly, Rapidly Dominating the Key Path to AGI
AI Scientists Form Research Teams, Exhaustive Ten-Thousand Word Report Shocks Medical Experts! Nature Exclusive Reveals Details
Claude's AI Content Doubles Cursor's! Senior Engineering Leader Uncovers the Truth of AI Coding! Google Cautiously Pursues All In-House Development; Software Architecture Guru: Like a Leap from Assembly Language to High-Level Languages
Tsinghua Research: A Reversal? Confirming RL Doesn't Truly Enhance Base Model Reasoning Ability!
Tsinghua and Others Propose Absolute Zero Self-Play Large Models, Achieving Top Performance on Multiple Tasks with Zero-Data Training
Bengio Debunks CoT Myth! LLM Reasoning is an Illusion, 25% of Top Conference Papers Disproven
Martin Fowler's Latest Insight: LLMs Are More Than "Higher" Abstraction, They're Changing the "Nature" of Programming!
Unveiling the "Thought Secrets" of Large Reasoning Models: Understanding Their "Aha! Moments" from a "Reasoning Graph" Perspective