Category: Machine Learning

Train a Tiny LLM from Scratch for Just ¥8 in 9 Hours! Full Tutorial Including Reasoning, MoE, and More
More Capable Than Gemini Diffusion! The First Multimodal Large Diffusion Language Model MMaDA Released, Achieving Strong Reasoning and High Controllability
OpenAI's Big Move! Core API Now Supports MCP, Revolutionizing Agent Development Overnight
Does AI Know When to "Think"? Thinkless Teaches Large Language Models When to Reason
ICML 2025 | Training-Free, Instant Alignment of Large Model Preferences
First-Author Interpretation! Qwen's New Scaling Law, Parallel Scaling, Discussed from the Perspective of Ideas
Breakthrough in Reasoning: How SoftCoT++ Enables LLMs to 'Think Multiple Paths'?
Why We’re Unlikely to Get Artificial General Intelligence Anytime Soon
ZeroSearch (Alibaba Technology): Large Language Models Learn Through Self-Rewarding Without a Browser
10 Years of Hard Research, Millions Wasted! AI Black Box Remains Unsolvable, Google Breaks Face-off
Continuous Thought Machines Are Here! Startup Co-Founded by One of the Eight Transformer Authors Launches Them, Letting AI Stop Making 'One-Step' Snap Decisions
Stanford's Weak-for-Strong (W4S): Harnessing Stronger LLMs with a Meta-Agent, Accuracy Boosted to 95.4% | Latest
The 'era of experience' will unleash self-learning AI agents across the web—here's how to prepare
ByteDance Seed's New Method! Open-Source 8B Code Model: Trains Itself by Curating Its Own Data, Achieves SoTA at Its Scale, and Even Surpasses 10-Billion-Parameter Competitors
"Absolute Zero": A Zero-Data, Self-Evolving AI Reasoning Method Surpasses SOTA
PKU, Tsinghua, UvA, CMU, etc. Jointly Release: Latest Survey on Logical Reasoning Abilities of Large Models
A Self-Improving Coding Agent
Bridging the Gap: LUFFY, a New Reinforcement Learning Paradigm for AI Reasoning
First Chapter of 'Reasoning From Scratch' Released: Sebastian Raschka on LLM Reasoning, Pattern Matching, and Foundational Training
DeepSeek makes a big move! New model focuses on mathematical theorem proving, setting new records on multiple high-difficulty benchmarks.