AINews

    Category: Inference Optimization

    • Meta Introduces Deep Think with Confidence: Boosting Reasoning Accuracy and Efficiency with Minimal Changes
    • Transformer Killer! Google DeepMind's New MoR Architecture Emerges, Crowning a New King
    • Say 'Wait' Less, Do More: NoWait Reshapes Large Model Inference Paths
    • Achieving Lossless Mathematical Reasoning with 10% of the KV Cache: An Open-Source Method to Resolve 'Memory Overload' in Large Reasoning Models
    • Andrej Karpathy Praises Stanford Team's New Work: Achieving Millisecond-Level Inference with Llama-1B
    • ICML 2025 | Training-Free, Instant Alignment of Large Model Preferences
    • Qwen Breakthrough: "Parallel Computing" Instead of "Stacking Parameters"; New Method Reduces Memory 22x and Latency 6x