Category: AI Evaluation
- 0% Pass Rate! The Code Myth Debunked! LiveCodeBench Pro Released!
- Comprehensive Evaluation of 12 Latest GraphRAG Techniques
- ICML 2025 | Bursting the AI Bubble with 'Human Testing Methods': Building a New Paradigm for Capability-Oriented Adaptive Assessment
- Can LLMs Understand Math? Latest Research Reveals Fatal Flaws in Large Models' Mathematical Reasoning
- AI's Second Half: From Algorithms to Utility