Category: Artificial Intelligence

The Smarter the Model, the Less Obedient? MathIF Benchmark Reveals AI Obedience Vulnerabilities
First Genomic Reasoning AI Emerges! Accuracy Soars to 97%, Revolutionizing Genomics Research
OPA-DPO: An Efficient Solution for the Hallucination Problem in Multimodal Large Models
AI Learns Reasoning Solely by "Confidence": Zhejiang University Alumnus Replicates DeepSeek's Long Chain-of-Thought Emergence, Reinforcement Learning Needs No External Reward Signals
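Treating Theory of Mind as a Language of Thought About Thought: A DSL Combining the Strengths of Bayesian Network/Causal Grammar Models and Programming-Pattern Models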
Building an AI Software Engineer in Two Years! OpenAI Codex Authors Unveil a New Paradigm for Human-AI Pair Programming
Deep Dive | Granola Founder (AI Note-Taking Valued at $250M): AI Habits Are Reshaping Our Intuition; AI's Role Should Be to Augment, Not Replace, Humans
AI Math Ability Skyrockets 100%, Self-Evolution Nears RL Limits! CMU's New Work Overturns Perceptions
No Manual Annotation Needed! AI Self-Generates Training Data, Unlocking Reasoning Capabilities via "Deduction-Induction-Abduction"
Express | Google Quietly Launches AI Edge Gallery, Open-Sourcing a Local AI Runner
Internet Queen Mary Meeker's 340-page "AI Trends Report" PPT
Discussing Consciousness, Reasoning, and the Philosophy of AI with Murray Shanahan
Sakana AI's New Research: The Birth of the Darwin-Gödel Machine, with Self-Improving Code and Self-Referential Open-Ended Evolution
Stanford Chinese Team's Surprise Upset! AI Writes Pure CUDA-C Kernels, Outperforming PyTorch?
Tongyi Lingma AI IDE Officially Launched! Ready-to-Use Out of the Box
Large Models Struggling with Sudoku?! Transformer Author's Startup Releases Ranking: o3 Mini High's "Variant Sudoku" Accuracy Only 2.9%
Andrej Karpathy Praises Stanford Team's New Work: Achieving Millisecond-Level Inference with Llama-1B
Tsinghua University's New RAG Framework: DO-RAG Accuracy Soars by 33%!
LLM + RL Questioned: Deliberately Using Incorrect Rewards Still Significantly Boosts Math Benchmarks, Causing a Stir in the AI Community
All-In Podcast Transcript: Gemini Leads "Infinite Context," AI Ascends from Tool to Cognitive Collaborator
Llama Paper Authors "Flee," Only 3 Remaining from 14-Person Team, French Unicorn Mistral Becomes the Biggest Winner
Alibaba Open-Sources New Qwen Model: A Dragon Boat Festival Gift!
Mixture-of-Thought (MoT) Framework: Enabling Models to Learn "Human-like Thinking"
Can LLMs Understand Math? Latest Research Reveals Fatal Flaws in Large Models' Mathematical Reasoning
Can GPT-4 Out-Debate Humans? Nature Sub-Journal: 900-Person Study Shows AI Wins 64.4% of Debates, More Persuasive