Category: Model Training
- Under $8,000! Sina Weibo's 1.5B Small Model Surpasses Near-Trillion-Parameter Models
- Inoculation Prompting: Making Large Language Models "Misbehave" During Training to Improve Test-Time Alignment
- Revisiting Qwen3's Abandoned Mixed Inference Mode
- How Does Mathematical Training "Unlock" General Reasoning Abilities in Large Models? Latest Research Reveals the Key Mechanisms
- NVIDIA (ProRL) | Can RL Truly Enhance the Reasoning Capabilities of LLMs?
- Train a Tiny LLM from Scratch for Just ¥8 in 9 Hours! Full Tutorial Including Reasoning, MoE, and More
- AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale
- ByteDance Seed's New Method! Open-Source 8B Code Model Trains Itself by Curating Its Own Data, Achieves SoTA at Its Scale, and Even Surpasses 10-Billion-Parameter Competitors
- ZTE Wireless Institute "Large Model Diving" Team Releases LLM-Adaptive Question Difficulty Distillation Method, Significantly Enhancing Small Model Reasoning Capabilities
- First Chapter of 'Reasoning From Scratch' Released: Sebastian Raschka on LLM Reasoning, Pattern Matching, and Foundational Training