Category: Large Language Models
- More Capable Than Gemini Diffusion! MMaDA, the First Multimodal Large Diffusion Language Model, Released, Achieving Strong Reasoning and High Controllability
- Does AI Know When to "Think"? Thinkless Teaches Large Language Models When to Reason
- ICML 2025 | Training-Free, Instant Alignment of Large Model Preferences
- Google | Tracing RAG System Errors: Proposing a Selective Generation Framework to Boost RAG Accuracy by 10%
- Multimodal Large Models Collectively Fail, GPT-4o Only 50% Safety Pass Rate: SIUO Reveals Cross-Modal Safety Blind Spots
- Nature Sub-journal: Humans Lost to AI Again, Especially When It Knows Who You Are
- Which Model Should a Reliable Agent Use? The "Lost in Conversation" Phenomenon in LLM Multi-turn Dialogues | Latest from Microsoft
- When Thinking Becomes a Burden: Unveiling the "Thinking Traps" of Large Language Models
- How Strong Is the Reasoning Ability of Large Language Models? A Study Reveals LLMs' Limitations and Potential
- A Breakthrough in Reasoning: How Does SoftCoT++ Enable LLMs to "Think Along Multiple Paths"?
- Qwen Breakthrough: Using "Parallel Computing" Instead of "Stacking Parameters", New Method Reduces Memory by 22x, Latency by 6x
- ZeroSearch | Alibaba: Large Language Models Learn Through Self-Rewarding Without a Browser
- Jeff Dean: AI Will Replace Junior Engineers Within a Year. Netizens: "Altman Only Sells Hype; When Jeff Says It, It's Serious"
- AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale
- Ant Group's Wu Wei: A Bold Conjecture on the Next-Generation "Reasoning" Model Paradigm
- From Intuition to "Deep Thinking": The Multidimensional Evolution of Large Model Reasoning Capabilities
- DeepSeek's Accuracy and Efficiency Doubled: Huawei and CAS Propose a Chain-of-Thought "Early Exit" Mechanism
- GPT-5 R&D Insider Details Revealed! OpenAI Chief Research Officer: AGI is Just Around the Corner
- ZeroSearch: Search-Free Reinforcement Learning Unlocks Model Potential, Ushering in a New Era for LLM Search Capability
- Forcing Models to Argue with Themselves: A Recursive-Thinking Variant of CoT Goes Viral! Netizens: Isn't This Just What Most Reasoning Models Already Do?
- Stanford's Weak-for-Strong (W4S): Harnessing Stronger LLMs with a Meta-Agent, Accuracy Boosted to 95.4% | Latest
- Can a Single Data Point Significantly Enhance the Mathematical Reasoning Performance of Large Models?
- Research: LLMs' Prefilling Feature Has Become a Jailbreak Vulnerability!
- PKU, Tsinghua, UvA, CMU, and Others Jointly Release the Latest Survey on the Logical Reasoning Abilities of Large Models
- NVIDIA's Llama Nemotron Series: Key Technologies Explained