AINews

    Category: Reinforcement Learning

    • Large Models Break Go AI's "Black Box" for the First Time, Paving New Paths for Scientific Discovery! Shanghai AI Lab Releases New-Generation InternThinker
    • ZeroSearch: [Alibaba Technology] Large Language Models Learn Through Self-Rewarding Without a Browser
    • Train a Model with Global Idle Computing Power, Performance Comparable to R1, Jensen Huang's Sky Has Fallen! Karpathy Once Invested In It
    • ZeroSearch: Zero-Search Reinforcement Incentivizes Model Potential, Ushering in a New Era for LLM Search Capability
    • Stanford's Weak-for-Strong (W4S): Harnessing Stronger LLMs with Meta-Agent, Accuracy Boosted to 95.4% | Latest
    • Can a single data point significantly enhance the mathematical reasoning performance of large models?
    • The 'era of experience' will unleash self-learning AI agents across the web—here's how to prepare
    • Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning
    • NVIDIA's Llama Nemotron Series: Key Technologies Explained
    • Why LLM Agents Perform Poorly: Google DeepMind Research Reveals Three Failure Modes, RL Fine-tuning Can Mitigate
    • Bridging the Gap: LUFFY, a New Reinforcement Learning Paradigm for AI Reasoning
    • AI's Second Half: From Algorithms to Utility
    2025 AINews. All rights reserved.