QwenLong-L1-32B is the first long-context Large Reasoning Model (LRM) trained with reinforcement learning (RL) specifically for long-context reasoning.
Experimental results on seven long-context document question-answering (DocQA) benchmarks show that QwenLong-L1-32B outperforms flagship LRMs such as OpenAI-o3-mini and Qwen3-235B-A22B, and performs comparably to Claude-3.7-Sonnet-Thinking, placing it among the current state-of-the-art LRMs.
Model weights: https://huggingface.co/Tongyi-Zhiwen/QwenLong-L1-32B
Code: https://github.com/Tongyi-Zhiwen/QwenLong-L1
Dataset: https://huggingface.co/datasets/Tongyi-Zhiwen/DocQA-RL-1.6K
The significance of R1 is hard to overstate!
Maximum supported context length: 120K tokens