Category: Large Language Models
- DeepSeek R2's Secret Weapon Revealed! The Technology That Just Won Liang Wen-feng a Top Prize Lets AI Read Long Texts 11 Times Faster
- Qwen Updates Overnight: Runs on an RTX 3090, with 3B Activated Parameters Rivaling GPT-4o
- Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces
- Do Multimodal Large Language Models Truly 'Understand' the World? — Unveiling Core Knowledge Deficits in MLLMs
- Hierarchical Reasoning Model
- DeepSeek-GRPO Importance Weight Design Flaw? Explaining Qwen3's New Reinforcement Learning Algorithm GSPO
- Must-Read: In-depth Comparison of Mainstream LLM Architectures, Covering Llama, Qwen, DeepSeek, and Six Other Models
- Kimi K2's Key Training Technique: QK-Clip!
- Crushing DeepSeek V3! Alibaba Open-Sources New Qwen-3, Dominating Benchmarks with a Clear Lead
- Large Models Reveal New Weakness! Old Memories Unforgettable, New Memories Indistinguishable, Accuracy Plummets | ICML'25
- Transformer Killer! Google DeepMind's New MoR Architecture Emerges, A New Generation's King Has Arrived
- Meta Team's Breakthrough: Large Model "Hallucinations" Plummet to 5%! Is a Single-Sentence Question the Key?
- AI Evolution Timeline Revealed! LLMs Double in Capability Every 7 Months, Will Jobs Cease to Exist by 2030?
- Counter-Intuitive RL Research: Directly Providing Answers to LLMs is More Effective Than Detailed Step-by-Step Instructions!
- How Does Mathematical Training "Unlock" General Reasoning Abilities in Large Models? Latest Research Reveals Key Mechanisms
- Andrew Ng Launches Free LLM Post-Training Course, Covering Three Major Optimization Methods: SFT, DPO, RL
- Developer Forced by ChatGPT to Build a Feature! AI Hallucinates a Nonexistent Feature, Users Flock to It, and the Developer Ends Up Having to Build It
- Claude Code Gathers 115,000 Developers in 4 Months, Rewriting 195 Million Lines of Code Weekly, Rapidly Dominating the Key Path to AGI
- AI Scientists Form Research Teams, Exhaustive Ten-Thousand Word Report Shocks Medical Experts! Nature Exclusive Reveals Details
- Claude's AI Content Doubles Cursor's! Senior Engineering Leader Uncovers the Truth of AI Coding! Google Cautiously Pursues All In-House Development; Software Architecture Guru: Like a Leap from Assembly Language to High-Level Languages
- Tsinghua Research: A Reversal? Confirming RL Doesn't Truly Enhance Base Model Reasoning Ability!
- Tsinghua and Others Propose Absolute Zero Self-Play Large Models, Achieving Top Performance on Multiple Tasks with Zero-Data Training
- Bengio Debunks CoT Myth! LLM Reasoning is an Illusion, 25% of Top Conference Papers Disproven
- Martin Fowler's Latest Insight: LLMs Are More Than "Higher" Abstraction, They're Changing the "Nature" of Programming!
- Unveiling the "Thought Secrets" of Large Reasoning Models: Understanding Their "Aha! Moments" from a "Reasoning Graph" Perspective