Details

  • Perplexity upgraded Deep Research, reporting state-of-the-art accuracy and reliability on external benchmarks, ahead of competing research agents.
  • The upgrade pairs frontier models such as Opus 4.5 with Perplexity's proprietary search engine and sandbox infrastructure; it is available now to Max users and rolling out to Pro soon.
  • The new open-source DRACO Benchmark evaluates deep research agents on Accuracy, Completeness, and Objectivity, with tasks drawn from real-world usage across 10 domains, including Academic, Finance, Law, Medicine, and Technology.
  • Unlike benchmarks that test isolated skills, DRACO's 100 tasks require synthesis, nuanced analysis, and source accuracy together.
  • In DRACO evaluations, Perplexity outperforms all competitors in every domain, with its strongest margins in Law, Medicine, and Academic.
  • Perplexity released the full benchmark, rubrics, methodology paper, and dataset on Hugging Face; a loading sketch follows this list.

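For readers who want to explore the release, here is a minimal, hypothetical sketch of pulling the DRACO dataset with the Hugging Face `datasets` library and averaging its three rubric dimensions per domain. The repo ID `perplexity-ai/DRACO`, the split name, and the field names (`domain`, `accuracy`, `completeness`, `objectivity`) are assumptions, not confirmed by the announcement; only the three dimensions and the domain names come from the release itself.

```python
# Minimal sketch, not the official evaluation harness. The repo ID, split
# name, and field names below are assumptions; check the actual Hugging Face
# release for the real schema.
from collections import defaultdict

from datasets import load_dataset

# Hypothetical repo ID; the announcement only says the dataset is on Hugging Face.
tasks = load_dataset("perplexity-ai/DRACO", split="test")

# The three rubric dimensions named in the release.
DIMENSIONS = ("accuracy", "completeness", "objectivity")

def average_by_domain(rows):
    """Average each rubric dimension within each of the 10 domains."""
    sums = defaultdict(lambda: [0.0] * len(DIMENSIONS))
    counts = defaultdict(int)
    for row in rows:
        domain = row["domain"]  # e.g. "Law", "Medicine", "Academic"
        for i, dim in enumerate(DIMENSIONS):
            sums[domain][i] += float(row[dim])
        counts[domain] += 1
    return {
        domain: {dim: total / counts[domain] for dim, total in zip(DIMENSIONS, totals)}
        for domain, totals in sums.items()
    }

print(average_by_domain(tasks))
```
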
Impact

Perplexity's Deep Research upgrade, powered by Opus 4.5 and integrated with its own search infrastructure, positions it as a leader in multi-step AI research tools, directly challenging rivals like ChatGPT and Claude, which lag in real-time web synthesis and source verification. By topping its own DRACO benchmark across all domains, particularly high-stakes areas like Law and Medicine, Perplexity narrows the gap with the in-house research agents of OpenAI and Anthropic, while offering unlimited access to Max subscribers at $200/month versus Pro's usage limits. Open-sourcing DRACO sets a new standard for evaluating research agents and could accelerate industry-wide improvements in synthesis and objectivity amid rising demand for reliable AI in professional workflows. It may also shift market dynamics toward subscription-based AI search, pressuring ad-driven models like Google's, and steer R&D toward more rigorous benchmarks over the next 12 to 24 months as competitors adopt similar testing.