Corporate America Adopts Model Routing to Control AI Spending

Corporate America is adopting model routing to control artificial intelligence spending, as CFOs and boards crack down on inefficient AI costs. The shift follows two years in which companies defaulted to the most powerful AI models for every query, regardless of complexity. AI bills are now running far ahead of budgets, prompting companies to ask whether every task requires a frontier model. Model routing matches jobs to appropriate models, directing complex problems to expensive frontier systems and routine tasks to cheaper alternatives. The change could reshape pricing dynamics in the AI industry.

Model Routing Matches Tasks to Cost-Appropriate AI Systems

Model routing is a tool that directs hard problems to expensive frontier models and easy tasks to cheaper, faster alternatives. Scott Wu, CEO of Cognition, which makes the coding agent Devin, said companies can achieve five to 10 times better cost efficiency on routine work by using cheaper models that are still adequate for the task. Wu gave the example of asking a model to name the third U.S. president: every model, regardless of cost, will answer Thomas Jefferson.
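
The routing idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tier names, the prices, the keyword list, and the word-count threshold are hypothetical, and production routers would likely rely on something more robust than keyword matching, such as a trained classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical label, not a real product
    cost_per_1k_tokens: float  # illustrative price in dollars


# Assumed tiers; the frontier price is set to 10x the cheap one,
# the upper end of the five-to-10x range Wu cites.
CHEAP = ModelTier("small-model", 0.001)
FRONTIER = ModelTier("frontier-model", 0.010)

# Hypothetical keywords that hint at multi-step engineering work.
HARD_HINTS = {"refactor", "prove", "architecture", "debug", "design"}


def route(prompt: str, max_easy_words: int = 40) -> ModelTier:
    """Pick a tier using a crude complexity heuristic (length plus keywords)."""
    words = [w.strip(".,?!:;") for w in prompt.lower().split()]
    looks_hard = len(words) > max_easy_words or any(w in HARD_HINTS for w in words)
    return FRONTIER if looks_hard else CHEAP


if __name__ == "__main__":
    for prompt in (
        "Who was the third U.S. president?",
        "Refactor the billing service to remove the circular dependency.",
    ):
        print(f"{route(prompt).name}: {prompt}")
```

Under these assumptions, Wu's presidential question goes to the cheap tier, while the refactoring request is sent to the frontier tier.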

Arvind Jain, CEO of Glean, estimated that roughly 95% of enterprise AI usage currently runs on the most expensive frontier models, even for tasks that cheaper alternatives could easily handle. Most companies are not routing at all, according to executives interviewed this week.

Cisco Reports $900 Million Annual AI Cost for 90,000 Employees

Jeetu Patel, chief product officer at Cisco, provided specific cost figures. At roughly $200 of token usage per employee per week, annual spending reaches about $10,000 per person. For Cisco's 90,000 employees, that totals $900 million annually.
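
Patel's figures are rounded, which a quick calculation makes visible. The sketch below uses only the numbers quoted above; the 52-week year is an assumption, since the article does not say how the weekly figure was annualized.

```python
WEEKLY_TOKEN_SPEND = 200  # dollars per employee per week, as quoted
WEEKS_PER_YEAR = 52       # assumption: spending continues every week of the year
EMPLOYEES = 90_000        # Cisco headcount, as quoted

# Unrounded annual cost per person: 200 * 52 = 10,400 dollars.
exact_per_person = WEEKLY_TOKEN_SPEND * WEEKS_PER_YEAR

# Patel rounds this to "about $10,000 per person".
rounded_per_person = 10_000

exact_total = exact_per_person * EMPLOYEES      # 936,000,000 dollars
rounded_total = rounded_per_person * EMPLOYEES  # 900,000,000 dollars

if __name__ == "__main__":
    print(f"Per person, unrounded: ${exact_per_person:,}")
    print(f"Company-wide, unrounded: ${exact_total:,}")
    print(f"Company-wide, rounded: ${rounded_total:,}")
```

The $900 million total follows from the rounded $10,000 figure. Without rounding, the same inputs give about $936 million.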

Patel said Cisco came in well over its own budget and has had to adjust. The company now has 30,000 engineers building products written largely with AI. Cisco has reallocated resources, prioritizing tokens over other spending.

Cognition Introduces $10 Million AI Productivity Guarantee

Cognition announced an AI productivity guarantee in response to customer concerns about return on investment. If Devin delivers less engineering value than a customer is paying for, Cognition will fund usage up to $10 million until performance meets expectations. Wu framed the guarantee as a way to focus on output rather than activity metrics like tokens consumed or lines of code.

Routing Puts Pricing Pressure on Frontier Labs

The shift toward model routing creates pressure for OpenAI and Anthropic, whose business models and IPO expectations assume enormous demand at premium prices. If companies steer high-volume routine work to cheaper open-source models, frontier labs receive payment only for complex tasks. Patel stated that cutting-edge technology will remain valuable but predicted the pricing model will shift, with labs needing to improve efficiency rather than simply charging more.

FAQ

What is model routing in AI systems?

Model routing is a tool that matches tasks to appropriate AI models based on complexity. It sends difficult problems to expensive frontier models and directs routine tasks to cheaper, faster alternatives. Scott Wu of Cognition stated that companies can achieve five to 10 times better cost efficiency on routine work using this approach.

How much does Cisco spend annually on AI for its workforce?

Cisco spends approximately $900 million annually on AI for its 90,000 employees. Jeetu Patel, Cisco's chief product officer, calculated this figure based on roughly $200 of token usage per employee per week, which equals about $10,000 per person per year.
