D-Matrix, a Microsoft-backed AI chip startup located in Silicon Valley, launched its Corsair inference chip with claims of running inference workloads 10 times faster and using five times less energy than a standalone Nvidia GPU for small workloads. The company, founded in 2019 and valued at around $2 billion after raising approximately $500 million, begins shipping to customers this month. The launch comes as the AI chip market demonstrates substantial opportunity for specialized players, following Cerebras' IPO last month that raised over $5.5 billion and valued the company at over $50 billion, and Nvidia's $20 billion acquisition of Groq in December.
D-Matrix's Corsair chip achieves low-latency inference at low power by tightly integrating memory and compute on a single chip. Like Groq and Cerebras, D-Matrix relies on SRAM, a type of memory that can be made at logic fabs like Taiwan Semiconductor Manufacturing Company and integrated on the same chip. GPUs rely on large amounts of another kind of memory called DRAM that's packaged into stacks of high-bandwidth memory added around the logic chip. Co-founder and CEO Sid Sheth said the company is not running into a chokepoint around DRAM because the product doesn't rely on DRAM to be successful.
Citing research from Gimlet Labs, D-Matrix says that Corsair, when paired with an Nvidia Blackwell GPU, can run inference 10 times faster, three times cheaper and up to five times more energy efficiently than a standalone GPU. Sheth says Corsair is designed for AI inference that prioritizes interactivity and speed over model size, targeting use cases like chatbots, voice agents and agentic tools.
Sheth said the company has commitments from high-profile hyperscalers, neoclouds and frontier AI labs. D-Matrix begins shipping to those customers this month. About 90% of customers are in the U.S., while overseas customers are in the Middle East and Southeast Asia, Sheth said. Microsoft invested through its M12 venture arm.
Sheth stated he has no intention of selling the company and called the AI chip market "a $1 trillion market in the making." Semiconductor analyst Stacy Rasgon of Bernstein Research noted that D-Matrix has a fair number of actual, real customer engagements, with customers often using the chips in conjunction with Nvidia.
Rick Bahr, adjunct professor of electrical engineering at Stanford University, identified a significant limitation. On-chip SRAM enables remarkable inference speeds because data travels short distances, but it can't handle the trillions of parameters that now make up large models from leaders like OpenAI and Anthropic. Bahr said that many parameters simply can't be put onto an SRAM-based design.
Nvidia CEO Jensen Huang said last week that his company remains the leader in low-cost inference with its Vera Rubin system because it's not just about speed. At Computex in Taiwan, Huang said the reason is that Nvidia integrates everything, designs everything from the ground up, simulates the entire system and uses extreme co-design. Nvidia released a new Groq chip at GTC in March, called a language processing unit.
D-Matrix sells four Corsair chips packaged together on a card that slides into slots in a data center server rack and costs tens of thousands of dollars. Sheth called Corsair the densest SRAM solution in the market today, with up to 128 gigabytes of SRAM in a single server. The chip is made in Taiwan on TSMC's 6-nanometer node.
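To put the 128-gigabyte figure next to Bahr's point about trillion-parameter models, here is a rough back-of-the-envelope sketch. Only the 128 GB per server comes from the article. The one-trillion parameter count, the weight precisions and the decimal gigabyte are illustrative assumptions, and the estimate counts model weights only, ignoring any other memory a running model needs.

```python
# Back-of-the-envelope sketch, not a vendor specification.
SRAM_PER_SERVER_GB = 128       # figure quoted in the article
PARAMS = 1_000_000_000_000     # assumption: a one-trillion-parameter model

# Assumption: common weight precisions, in bytes per parameter.
BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def weights_gb(params: int, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


def sram_multiple(params: int, bytes_per_param: float,
                  sram_gb: float = SRAM_PER_SERVER_GB) -> float:
    """How many times the weights exceed one server's SRAM."""
    return weights_gb(params, bytes_per_param) / sram_gb


if __name__ == "__main__":
    for name, nbytes in BYTES_PER_PARAM.items():
        size = weights_gb(PARAMS, nbytes)
        ratio = sram_multiple(PARAMS, nbytes)
        print(f"{name}: {size:,.0f} GB of weights, {ratio:.1f}x one server's SRAM")
```

Under these assumptions, even at 4-bit precision the weights of a one-trillion-parameter model would be roughly four times the SRAM quoted for a single server, which is consistent with Corsair's positioning toward smaller, latency-sensitive workloads.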
D-Matrix teamed up with Arista, Broadcom and Super Micro to build a full rack-scale system called SquadRack for deploying its chips in AI data centers. The company's next chip, Raptor, is scheduled to launch next year on TSMC's 4-nanometer node, which Sheth said could be produced at the Taiwanese company's factory in Arizona.
What performance claims does D-Matrix make for its Corsair chip? D-Matrix claims its Corsair chip can run inference workloads 10 times faster and use five times less energy than a standalone Nvidia GPU for small workloads. Citing research from Gimlet Labs, the company says that Corsair, when paired with an Nvidia Blackwell GPU, can run inference 10 times faster, three times cheaper and up to five times more energy efficiently than a standalone GPU.
What are the technical limitations of D-Matrix's SRAM-based approach? According to Rick Bahr, adjunct professor of electrical engineering at Stanford University, the SRAM-based design cannot handle the trillions of parameters that make up large models from leaders like OpenAI and Anthropic. On-chip SRAM enables remarkable inference speeds, but that many parameters simply can't be put onto an SRAM-based design.
When does D-Matrix begin shipping Corsair chips to customers? D-Matrix begins shipping Corsair chips to customers this month. The company has commitments from hyperscalers, neoclouds and frontier AI labs, with about 90% of customers in the U.S. and overseas customers in the Middle East and Southeast Asia.