Google DeepMind released AI co-mathematician, a multi-agent math research assistant that achieved 47.9% accuracy on the FrontierMath Tier 4 benchmark, surpassing GPT-5.5 Pro’s previous record of 39.6%, on May 9. The system solved 23 of 48 problems, including 3 that no previous model had solved. Built on Gemini 3.1 Pro, it uses a hierarchical design: a project coordinator agent distributes tasks to sub-agents that handle literature retrieval, coding, and reasoning, and multiple reviewer agents validate proofs before submission.
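The coordinator, sub-agent, and reviewer roles described above can be illustrated with a toy sketch. This is not DeepMind's implementation, whose details are not given here. Every name in it (`Coordinator`, `build_demo`, the role labels) is hypothetical, and each "agent" is a plain Python function standing in for a model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-ins: each "agent" is a plain function, not a model call.
SubAgent = Callable[[str], str]
Reviewer = Callable[[str], bool]


@dataclass
class Coordinator:
    sub_agents: Dict[str, SubAgent]
    reviewers: List[Reviewer] = field(default_factory=list)

    def run(self, problem: str) -> Dict[str, object]:
        # Fan the problem out to every sub-agent role.
        results = {role: agent(problem) for role, agent in self.sub_agents.items()}
        # Combine the partial results into one draft (here: simple concatenation).
        draft = "\n".join(f"[{role}] {text}" for role, text in results.items())
        # Mark the draft for submission only if every reviewer accepts it.
        approved = all(review(draft) for review in self.reviewers)
        return {"draft": draft, "submitted": approved}


def build_demo() -> Coordinator:
    # Assumed roles mirror the three named in the article.
    return Coordinator(
        sub_agents={
            "literature": lambda p: f"references for {p}",
            "coding": lambda p: f"numerical check for {p}",
            "reasoning": lambda p: f"proof sketch for {p}",
        },
        reviewers=[
            lambda d: "proof sketch" in d,  # toy check: a proof is present
            lambda d: len(d) > 0,           # toy check: draft is non-empty
        ],
    )
```

Calling `build_demo().run("some lemma")` returns a dictionary with the combined draft and a `submitted` flag. The point of the sketch is only the control flow: work is fanned out by one coordinator, and a single rejecting reviewer blocks submission.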
Epoch AI ran the evaluation blind: the DeepMind team could not see the problems, and each problem was allowed 48 hours of computation. In a real-world application, mathematician Marc Lackenby used the system to resolve an open conjecture from the Kourovka Notebook, demonstrating its practical research value. The system is currently in beta testing and available to a limited number of mathematicians.