Cerebras built a computer chip the size of a dinner plate, literally the largest chip ever made, because the founders decided the industry's approach of splitting AI across thousands of small chips was backwards. The latest Wafer Scale Engine contains 4 trillion transistors and 900,000 AI cores on a single piece of silicon. For context, Nvidia's flagship GPU has around 80 billion transistors. Cerebras filed for an IPO in 2024 and then quietly shelved it. They're either the most audacious hardware bet in AI history or a very expensive science experiment, and the answer probably arrives whenever the IPO actually happens.
Founded
2016
HQ
Sunnyvale, USA
Total Raised
$720 million
Founder
Andrew Feldman, Gary Lauterbach, Sean Lie, Michael James, Jean-Philippe Fricker
Status
Private
Website
www.cerebras.net
THE ORIGIN STORY
Andrew Feldman had already sold one chip company — SeaMicro, a server startup — to AMD for $334 million in 2012. So when he started Cerebras in 2016, he wasn't a first-timer with a crazy idea.
He was a serial founder with a very specific, very expensive grievance with the state of AI hardware.
The problem he kept running into: AI training requires moving enormous amounts of data between thousands of tiny chips, and most of the time and energy is wasted on that shuffling. The chips themselves sit idle, waiting.
Feldman's thesis was blunt — if you could fit the entire neural network on a single chip, you'd eliminate the bottleneck entirely. The data wouldn't need to travel anywhere.
It would just be there.
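A back-of-envelope sketch makes that arithmetic concrete. Every number below is an illustrative assumption (model size, GPU count, link bandwidth), not a vendor spec; the point is simply that per-step gradient traffic in a sharded GPU cluster is measured in hundreds of gigabytes, while on a single wafer that traffic never leaves the die.

```python
# Toy model of the inter-chip shuffling cost in data-parallel GPU training.
# All figures are illustrative assumptions, not measured vendor numbers.

model_params = 70e9            # hypothetical 70B-parameter model
bytes_per_param = 2            # fp16 weights and gradients
num_gpus = 1024                # chips the job is spread across
link_bytes_per_s = 900e9       # ~900 GB/s per GPU, roughly NVLink-class

# Each training step ends with a gradient all-reduce. A ring all-reduce
# moves about 2 * (N-1)/N times the full gradient volume per GPU.
grad_bytes = model_params * bytes_per_param
comm_bytes_per_gpu = 2 * (num_gpus - 1) / num_gpus * grad_bytes
comm_seconds = comm_bytes_per_gpu / link_bytes_per_s

print(f"gradient volume:     {grad_bytes / 1e9:.0f} GB")
print(f"moved per GPU/step:  {comm_bytes_per_gpu / 1e9:.0f} GB")
print(f"pure data movement:  {comm_seconds:.2f} s per training step")
```

And that movement repeats every step, millions of times per training run, which is exactly the overhead the wafer-scale design is built to delete.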
The catch: building a chip that large had never been done. The semiconductor industry had spent decades making chips smaller, not bigger.
Defects are scattered more or less randomly across every wafer, so the bigger the die, the higher the odds that one lands inside it and kills the chip. The conventional wisdom was that giant chips were a dead end: too many defects, too expensive, too hard to cool.
Cerebras decided the conventional wisdom was wrong. They spent two years in stealth, working with TSMC to figure out how to manufacture something this large, and attacked the defect problem with redundancy: the design includes spare cores and interconnect routing that let the chip map around defective ones.
In 2019 they came out of hiding with the Wafer Scale Engine — a chip the size of an iPad that contained 400,000 AI cores and 1.2 trillion transistors. The AI world did a double take.
The chip was real. It worked.
And it was unlike anything else on the market.
WHAT THEY ACTUALLY DO
Cerebras makes its money by selling access to its hardware — either the physical systems themselves or cloud-based access to them. The customer is anyone training or running large AI models who needs more compute than a cluster of Nvidia GPUs can efficiently provide.
The flagship product is the CS-3 system, which houses the Wafer Scale Engine chip. One CS-3 system sells for several million dollars.
That's the direct hardware sale side. For companies that don't want to buy the physical box, Cerebras also offers Cerebras Cloud — you pay for inference or training time on their hardware without owning it.
More accessible, lower upfront cost, same underlying advantage.
In 2024 they launched a free inference API called Cerebras Inference, which lets developers run open-source models like Llama at speeds that genuinely shocked the AI community: over 1,000 tokens per second, roughly 20 times faster than what most GPU-based services were delivering. That's not a marketing claim; developers tested it and posted the benchmarks.
It became a calling card.
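For a sense of why it spread, this is roughly what calling it looks like. Cerebras Inference exposes an OpenAI-compatible API; the base URL and model identifier below are assumptions taken from Cerebras's public documentation at launch, so check the current docs before relying on them.

```python
# Minimal sketch of a Cerebras Inference call via the OpenAI-compatible API.
# Endpoint and model name are assumptions -- verify against current docs.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cerebras.ai/v1",  # assumed endpoint
    api_key="YOUR_CEREBRAS_API_KEY",
)

resp = client.chat.completions.create(
    model="llama3.1-8b",                    # assumed model identifier
    messages=[{"role": "user",
               "content": "Explain wafer-scale chips in one sentence."}],
)
print(resp.choices[0].message.content)
```

Because the interface matches what developers already use with other providers, trying it costs minutes, which is a large part of why the benchmarks circulated so quickly.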
The business logic is straightforward: Nvidia has a near-monopoly on AI training hardware, but their architecture has an inherent bottleneck — inter-chip communication. Cerebras bets that as models get bigger, that bottleneck gets worse, and customers will pay a premium to avoid it.
The question is whether the premium is worth it at scale, and whether Nvidia engineers their way around the problem before Cerebras can get big enough to matter.
THE PRODUCTS
The Wafer Scale Engine 3 (WSE-3) is the core product — the largest chip ever built, manufactured on TSMC's 5nm process, with 4 trillion transistors and 900,000 AI cores on a single wafer. It sits inside the CS-3 system, which is Cerebras's flagship hardware box.
Buying a CS-3 means you own one of these behemoths and can run it in your own data center. It's designed for training and inference at extreme scale, and its main selling point is that the entire model lives on one chip — no inter-chip communication, no memory bandwidth bottleneck.
Cerebras Inference is the cloud product: you send API calls, you pay per token or per compute hour, and the work runs on Cerebras hardware in their data centers. The free tier, launched in 2024, delivered inference speeds that made GPU-based services look slow by comparison.
It's now one of the fastest publicly accessible inference endpoints available for open-source models.
Cerebras also offers the Andromeda AI supercomputer — a cluster of sixteen CS-2 systems connected together, capable of running models with up to 120 trillion parameters. That's the product aimed at national labs and serious research institutions that need compute at a scale beyond what a single system can provide.
Argonne National Laboratory runs one.
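To get a feel for what 120 trillion parameters implies, a quick back-of-envelope (illustrative arithmetic only) shows why no single conventional device could hold such a model; Cerebras's answer is to park weights in external MemoryX units and stream them to the wafer.

```python
# Illustrative arithmetic: the raw memory footprint of 120T parameters.
params = 120e12
bytes_per_param = 2                 # fp16; optimizer state adds multiples
weight_tb = params * bytes_per_param / 1e12
print(f"weights alone: {weight_tb:.0f} TB")   # ~240 TB
# A flagship GPU carries on the order of 0.1 TB of HBM, so a model at
# this scale has to live in external storage streamed to the compute.
```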
HOW THEY GREW
Cerebras didn't try to compete with Nvidia on price or volume. That would be suicide — Nvidia has years of manufacturing scale, a dominant software ecosystem in CUDA, and billions in R&D.
Instead, Cerebras went after the one thing Nvidia structurally can't easily fix: the fact that their chips have to talk to each other.
The counterintuitive move was releasing a free inference API in 2024 at speeds that looked broken at first glance. When developers started posting benchmarks showing 1,000+ tokens per second, the reaction was essentially disbelief.
People assumed there was a catch. There wasn't — the architecture just doesn't have the communication overhead.
That free tier got Cerebras into the hands of thousands of developers who'd never considered using them before, and it made the performance claims impossible to ignore.
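The benchmarks were also trivially easy to reproduce. A rough timing sketch, using the same endpoint and model assumptions as the earlier example and counting streamed chunks as an approximation of tokens:

```python
# Rough tokens-per-second measurement against Cerebras Inference.
# Endpoint/model are assumptions; chunk counts only approximate tokens.
import time
from openai import OpenAI

client = OpenAI(base_url="https://api.cerebras.ai/v1",  # assumed endpoint
                api_key="YOUR_CEREBRAS_API_KEY")

start = time.perf_counter()
chunks = 0
stream = client.chat.completions.create(
    model="llama3.1-8b",                                 # assumed model id
    messages=[{"role": "user", "content": "Write a 300-word story."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        chunks += 1                                      # ~1 token per chunk
elapsed = time.perf_counter() - start
print(f"~{chunks / elapsed:.0f} tokens/sec (chunk-count approximation)")
```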
They also made a smart bet on the enterprise side by going after national labs and government research institutions — organizations doing genuinely massive model training that need raw throughput and don't mind paying for custom hardware. The US Department of Energy's Argonne National Laboratory runs Cerebras hardware.
That's a credibility signal that a startup pitching to enterprise AI teams can't buy easily.
The IPO filing in 2024 was itself a marketing move. Even though the offering later stalled, the S-1 revealed financials that showed real revenue growth and generated press coverage money can't buy. Whether intentional or not, it put them on the radar of every enterprise buyer who reads TechCrunch.
THE HARD PART
The elephant in the room is Nvidia. Not just as a competitor — as an ecosystem.
CUDA, Nvidia's programming platform, has been the default language of AI development for over a decade. Every AI researcher knows it.
Every model is optimized for it. Switching to Cerebras hardware means learning new tools, rewriting workflows, and betting on a smaller player whose software ecosystem is still maturing.
That's a real switching cost even when the hardware is genuinely better.
Then there's the manufacturing reality. Building chips the size of a dinner plate at volume, with acceptable yield rates, is extraordinarily hard.
Cerebras relies on TSMC to make the wafers, which means they're dependent on the same foundry everyone else is fighting over. If TSMC capacity tightens, and it has, Cerebras feels it disproportionately: each Wafer Scale Engine consumes an entire 300 mm wafer, more silicon area per unit than any other chip on the market.
The stalled IPO in late 2024 raised real questions. They had filed, the market had seen the numbers, and then the offering went nowhere.
The holdup centered on a key customer: G42, an Abu Dhabi AI firm that is both a major investor and the source of most of Cerebras's revenue. G42's investment triggered a US national-security review by CFIUS, and the revenue concentration made institutional investors nervous. When a significant portion of your revenue comes from one customer, and that customer has geopolitical complexity attached to it, public market investors get skittish.
It's a real problem, and it's unresolved.
MONEY TRAIL
Seed
2016 · Led by Foundation Capital
$25M raised
Series A
2018 · Led by Benchmark
$112M raised
Series B
2019 · Led by Coatue Management
$200M raised
Series C
2021 · Led by Coatue Management
$250M raised
$4.0B valuation
Series F
2023 · Led by Coatue Management
$133M raised
$4.0B valuation
WHO BACKED THEM
Cerebras has raised around $720 million from a mix of strategic and financial investors. The most notable backer is Coatue Management, the tech-focused hedge fund, which led three successive rounds from the 2019 Series B through the 2023 Series F.
The 2021 round valued Cerebras at $4 billion, which was a significant statement for a hardware startup; hardware companies are notoriously hard to finance because the capital requirements are enormous and the timelines are long.
Foundation Capital has been involved since early stages, alongside Benchmark and Eclipse Ventures. Eclipse in particular specializes in deep tech and hardware startups, so their presence made sense.
These aren't generalist funds throwing money at anything AI-adjacent — they came in specifically because they understood the technical bet.
The G42 relationship is worth noting separately. G42 is an Abu Dhabi-based AI conglomerate backed by the UAE government, and they became a major Cerebras customer — significant enough to show up prominently in the IPO filing as a revenue concentration concern.
That relationship brought real money but also real scrutiny, particularly from US regulators watching AI chip exports to Middle Eastern entities with complicated geopolitical ties. It's the kind of customer win that looks great on revenue and complicated on everything else.