Details

  • Baidu's DuMate agent achieved #1 rankings on PinchBench and DeepResearch Bench, as announced today.
  • On PinchBench, DuMate secured the top two spots with three entries in the top five, outperforming several major AI labs across real-world OpenClaw agent tasks.
  • DeepResearch Bench evaluates deep research agents on 100 PhD-level tasks; DuMate led 42 models with a score of 58.03.
  • The announcement teases further DuMate developments at Baidu Create 2026, scheduled for May 13-14.
  • Baidu Create is the company's annual AI developer conference, where it has previously unveiled ERNIE models and toolkits; this year's event builds on its momentum in agent technology.
  • The announcement links to the benchmark results and event details, underscoring DuMate's focus on advanced agent capabilities.

Impact

DuMate's benchmark results position Baidu as a leader in AI agent performance, ahead of entries from major labs on rigorous tests such as PhD-level research tasks. This puts pressure on Western rivals such as OpenAI and Anthropic, whose agents trail in these evaluations, amid intensifying global competition in autonomous AI systems. Strong results in real-world and deep-research scenarios could lower barriers to adopting complex agents in China and beyond, potentially accelerating enterprise use cases even as Baidu navigates U.S. export controls on advanced AI technology.