Hot on the heels of three large-model releases in quick succession, Qwen shipped another update overnight: the original Qwen3-30B-A3B now has a new version, Qwen3-30B-A3B-Instruct-2507.
This new version is a non-thinking model. Its headline feature is that, with only 3 billion (3B) parameters activated, it delivers capabilities comparable to leading closed-source models such as Google's Gemini 2.5-Flash (non-thinking mode) and OpenAI's GPT-4o. This marks a significant step forward in model efficiency and performance optimization.
The figure below shows the model's benchmark results: compared with the previous version, the new version improves significantly on multiple tests. For example, its AIME25 score rose from 21.6 to 61.3, and its Arena-Hard v2 score from 24.8 to 69.0.
The figure below compares the new version with models such as DeepSeek-V3-0324. On many benchmarks, the new model roughly matches or even surpasses DeepSeek-V3-0324.
It is hard not to marvel at how quickly model compute efficiency is improving.
Specifically, Qwen3-30B-A3B-Instruct-2507 brings key improvements in several areas:
- Significantly improved general capabilities, including instruction following, logical reasoning, text comprehension, mathematics, science, programming, and tool use;
- Significant progress in long-tail knowledge coverage across multiple languages;
- On subjective and open-ended tasks, closer alignment with user preferences, with higher-quality text and more helpful answers;
- Long-context understanding extended to 256K.
The model is now open source on platforms such as ModelScope and HuggingFace. You can also try it directly on QwenChat.
Try it here: http://chat.qwen.ai/
After its release, the model quickly won community support, which brought more ways to use it and even quantized versions. This is the power of open source.
Its arrival offers a new option for running AI models on consumer-grade GPUs.
Some users have shared their experience running this new version on Macs, PCs equipped with an RTX 3090, and other devices.
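To give a feel for why quantized versions matter on consumer hardware, here is a rough back-of-the-envelope sketch of the memory needed for the weights alone. It assumes the nominal figures in the model name (30B total parameters, 3B activated) and ignores the KV cache, activations, and runtime overhead, so real requirements will be higher. The function name `weight_memory_gib` is a hypothetical helper, not part of any Qwen tooling. Note that in a mixture-of-experts model all parameters generally still have to be loaded, even though only a fraction is used per token.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB (no KV cache or overhead)."""
    return num_params * bits_per_weight / 8 / 2**30


# Nominal figures taken from the model name "30B-A3B" (assumption: exact counts differ).
TOTAL_PARAMS = 30e9
ACTIVE_PARAMS = 3e9

# Only a fraction of the weights takes part in each token's computation,
# which reduces compute per token, not the amount of memory to hold the model.
print(f"active fraction per token: ~{ACTIVE_PARAMS / TOTAL_PARAMS:.0%}")

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label} weights: ~{weight_memory_gib(TOTAL_PARAMS, bits):.1f} GiB")
```

Under these assumptions, 16-bit weights come to roughly 56 GiB, while a 4-bit quantization needs about 14 GiB, which is why a quantized build is plausible on a 24 GB card such as the RTX 3090.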
If you also want to run this model, you can refer to the configuration requirements:
It is worth noting that this new version is a non-reasoning model. Well-known developer Simon Willison compared it with "reasoning" models he had previously tested (such as GLM-4.5 Air). His core conclusion is that, for tasks like generating complex code that works out of the box, whether a model has "reasoning" capability may be a crucial factor.
The Qwen team's latest update, once again released late at night, has once more put competitive pressure on its peers. Still, waking up each day to find AI's capabilities at a new high is exciting in itself.