Silicon and supercomputers will define the next AI era. AWS just made a big bet on both.

Amazon is betting on its own chips and supercomputers to forge ahead with its AI ambitions. 

Big Tech’s next AI era will be all about companies controlling their own silicon and supercomputers. Just ask Amazon.

At its re:Invent conference on Tuesday, the tech giant’s cloud computing unit, Amazon Web Services, unveiled the next line of its AI chips, Trainium3, and announced a new supercomputer built with its own chips to serve its AI ambitions.

The announcements mark a significant shift from the status quo that has defined the generative AI boom since OpenAI released ChatGPT, in which the tech world has raced to secure supplies of Nvidia’s industry-leading chips, known as GPUs, to train AI models in huge data centers.

While Nvidia has a formidable moat — experts say its hardware-software combination serves as a powerful vendor lock-in system — AWS’ reveal shows companies are finding ways to take ownership of the tech shaping the next era of AI development.

Putting your own chips on the table


On the chip side, Amazon said that Trainium2, first unveiled at last year’s re:Invent, is now generally available. Its big claim was that the chip offers “30-40% better price performance” than the current generation of servers with Nvidia GPUs.

That would mark a big step up from its first series of chips, which analysts at SemiAnalysis described on Tuesday as “underwhelming” for generative AI training, saying they were instead used within Amazon for “training non-complex” workloads such as credit card fraud detection.

“With the release of Trainium2, Amazon has made a significant course correction and is on a path to eventually providing a competitive custom silicon,” the SemiAnalysis researchers wrote.

Trainium3, which AWS previewed ahead of a late 2025 release, is billed as a “next-generation AI training chip.” Servers loaded with Trainium3 chips will offer four times the performance of those packed with Trainium2 chips, AWS said.

Matt Garman, the CEO of AWS, told The Wall Street Journal that the company’s chip push is partly a response to there being “really only one choice on the GPU side” at present, given Nvidia’s dominance of the market. “We think that customers would appreciate having multiple choices,” he said.

It’s an observation others in the industry have made and responded to. Google has been designing its own chips to reduce its dependence on Nvidia, while OpenAI is reported to be exploring custom chip designs of its own.

But having in-house silicon is just one part of this.

The supercomputer advantage

AWS acknowledged that as AI models trained on GPUs continue to get bigger, they are “pushing the limits of compute and networking infrastructure.”

That means companies serious about building their own AI models — like Amazon in its partnership with Anthropic, the OpenAI rival that raised a total of $8 billion from the tech giant — will need access to highly specialized computing that can handle a new era of AI.


With this in mind, AWS shared that it was working with Anthropic to build an “UltraCluster” of servers that form the basis of a supercomputer it has named Project Rainier. According to Amazon, it will scale model training across “hundreds of thousands of Trainium2 chips.”

“When completed, it is expected to be the world’s largest AI compute cluster reported to date available for Anthropic to build and deploy their future models on,” AWS said in a blog post, adding that it will be “over five times the size” of the cluster used to build Anthropic’s last model.

The supercomputer push follows similar moves elsewhere. The Information first reported earlier this year that OpenAI and Microsoft were working together to build a $100 billion AI supercomputer called Stargate.

Of course, Nvidia is also in the supercomputer business, and it aims to make supercomputers a big part of its pitch to companies looking to use Blackwell, its next generation of AI chips.

Last month, for instance, Nvidia announced that SoftBank, the first customer to receive its new Blackwell-based servers, would use them to build a supercomputer for AI development. Elon Musk has also bragged about his company xAI building a supercomputer with 100,000 Nvidia GPUs in Memphis this year.

AWS made no secret that it remains tied to Nvidia for now. In the same interview with The Wall Street Journal, Garman acknowledged that Nvidia is responsible for “99% of the workloads” for training AI models today and said he doesn’t expect that to change anytime soon.

That said, Garman reckoned “Trainium can carve out a good niche” for itself. He’d be wise to remember that everyone else is busy carving out a niche of their own, too.
