Following Amazon CEO Andy Jassy’s announcement of AWS’s $38 billion cloud computing agreement with OpenAI, Amazon invited me on a private tour of the chip development lab at the heart of the deal, covering most of my expenses.
Industry experts are closely monitoring Amazon’s Trainium chip, a product of this facility, due to its potential to enable lower-cost AI inference and, consequently, challenge Nvidia’s near-monopoly in the market.
Intrigued by this prospect, I accepted the invitation.
My guides for the day were Kristopher King, the lab’s director, and Mark Carroll, director of engineering. They were joined by Doron Aronson, the team’s public relations representative who facilitated the visit.
AWS has served as Anthropic’s primary cloud platform since the AI lab’s inception, a relationship that has persisted even as Anthropic later signed a cloud deal with Microsoft and Amazon’s own collaboration with OpenAI expanded.
The agreement with OpenAI positions AWS as the exclusive provider for the model maker’s new AI agent builder, Frontier. This tool could become a crucial component of OpenAI’s business if AI agents achieve the widespread adoption anticipated by Silicon Valley. The longevity of this exclusivity, however, remains to be seen. The Financial Times recently reported that Microsoft may view OpenAI’s deal with Amazon as a violation of its own agreement with OpenAI, which grants Redmond access to all of OpenAI’s models and technology.
A key factor in AWS’s appeal to OpenAI is its commitment, as part of this deal, to supply OpenAI with 2 gigawatts of Trainium computing capacity. This represents a substantial pledge, especially considering that Anthropic and Amazon’s proprietary Bedrock service are already consuming Trainium chips at a rate that outpaces Amazon’s current production capabilities.
Across all three generations, 1.4 million Trainium chips have been deployed, with Anthropic’s Claude utilizing over 1 million of the deployed Trainium2 chips, according to company statements.
Initially, Trainium was primarily optimized for faster, more economical model training, a higher priority several years ago. However, it is now also finely tuned and actively used for inference. Inference, the process of running an AI model to generate responses, currently represents the most significant performance bottleneck within the industry.
For instance, Trainium2 manages the majority of inference traffic on Amazon’s Bedrock service. Bedrock empowers Amazon’s extensive enterprise customer base to build AI applications and integrate multiple models within these applications.
“Our customer base is just expanding as fast as we can get capacity out there,” stated King, adding, “Bedrock could be as big as EC2 one day,” referencing AWS’s colossal compute cloud service.
Beyond providing an alternative to Nvidia’s frequently backlogged, hard-to-acquire GPUs, Amazon asserts that its new chips, running on its specialized Trn3 UltraServers, can cut operating costs by up to 50% at comparable performance versus traditional cloud servers.
In addition to Trainium3, launched in December, the AWS team also developed new Neuron switches. Carroll describes the combination of these two innovations as transformative.
“What that gives us is something huge,” Carroll remarked. These switches enable every Trainium3 chip to communicate with every other chip in a mesh configuration, thereby reducing latency. “That’s why Trainium3 is breaking all kinds of records,” particularly in terms of “price per power,” he elaborated.
Given that trillions of tokens are processed daily, such performance enhancements yield considerable benefits.
Notably, Amazon’s chip team received commendation from Apple in 2024. In a rare display of transparency from the typically secretive company, Apple’s director of AI publicly detailed the use of Graviton, another of the team’s chips—a low-power, ARM-based server CPU and the first breakthrough chip designed by this unit. Apple also praised Inferentia, a chip specifically engineered for inference, and acknowledged Trainium, which was new at the time.
These chips exemplify Amazon’s classic business strategy: identify customer demand, then develop an in-house alternative that offers competitive pricing.
Historically, a significant hurdle in the chip industry has been the cost associated with switching. Applications designed for Nvidia’s chips often require extensive re-architecting to function with other hardware, a time-consuming process that discourages developers from transitioning.
However, the AWS chip team proudly informed me that Trainium now supports PyTorch, a widely used open-source framework for building AI models. That support extends to many of the models available on Hugging Face, a vast repository where developers share open-source resources.
According to Carroll, the transition merely requires “basically a one-line change, and then recompile, and then run on Trainium.” This demonstrates Amazon’s strategic effort to progressively erode Nvidia’s market dominance wherever feasible.
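To give a rough sense of what that migration looks like in practice, here is a minimal sketch based on the AWS Neuron SDK’s PyTorch/XLA integration (the torch-neuronx package). The toy model and tensor shapes are hypothetical, and the exact setup may vary by SDK version; this is an illustration of the pattern Carroll describes, not Amazon’s official porting guide.

```python
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm  # installed alongside the torch-neuronx package

# A toy stand-in model; any ordinary PyTorch module follows the same path.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# The "one-line change": request the XLA device, which on a Trn instance
# maps to a Trainium NeuronCore, instead of calling .to("cuda").
device = xm.xla_device()
model = model.to(device)

inputs = torch.randn(8, 512).to(device)
outputs = model(inputs)

# XLA devices stage operations lazily; mark_step() compiles the accumulated
# graph (the "recompile" Carroll mentions) and executes it on the accelerator.
xm.mark_step()
print(outputs.shape)
```

The point of the design is that the training or inference code itself stays untouched; only the device target changes, with compilation handled behind the scenes.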
This month, AWS also announced a partnership with Cerebras Systems, integrating its inference chips into servers running Trainium, which Amazon promises will deliver exceptionally powerful, low-latency AI performance.
Amazon’s aspirations, however, extend beyond the chips themselves. The company also designs the servers that house them. Alongside networking components, this team has engineered “Nitro,” a hardware-software combination that provides virtualization (enabling multiple software instances to run independently on a single server), as well as advanced liquid cooling technology and the server sleds that contain all of this equipment.
The overarching goal of this integrated approach is to meticulously control both cost and performance.
Amazon’s custom chip-designing division originated with the acquisition of Israeli chip designer Annapurna Labs in January 2015 for approximately $350 million. This team has thus accumulated over a decade of experience designing chips for AWS, and the unit proudly retains its Annapurna roots and name, with its logo prominently displayed throughout the office.
The chip lab itself sits in a modern building with chrome-framed windows in Austin’s upscale Domain district, a walkable area full of shops and restaurants that is often called Austin’s Silicon Valley.
The offices exude a typical tech corporate atmosphere, featuring cubicle desks, collaborative spaces, and conference rooms. Yet, tucked away on a high floor at the rear of the building, the actual lab offers expansive views of the city.
This industrial space, roughly the size of two large conference rooms and filled with shelving, is notably noisy due to the equipment fans. Its aesthetic blends a high school shop class with a high-end Hollywood lab set, though the engineers are clad in jeans rather than traditional white lab coats.
To be clear, this facility is not where the chips are manufactured, so no white hazmat suits are required. The Trainium3 is a cutting-edge 3-nanometer chip fabricated by TSMC, widely considered the leader in 3-nanometer manufacturing, while Amazon turns to Marvell for some of its other chips.
However, this is the pivotal room where the intricate process of “bring-up” unfolds.
“A silicon bring-up is when you get the chip for the first time, and it’s like a big overnight party. You stay here, like a lock-in,” King explained. After 18 months of development, the chip is powered on for the first time to verify that it works as intended. The team even filmed part of the Trainium3 bring-up and shared it on YouTube.
Spoiler alert: The process is rarely without its challenges.
For Trainium3, the prototype chip initially utilized air cooling, consistent with previous iterations. The current version, however, incorporates liquid cooling, which offers significant energy advantages and represented a substantial engineering achievement.
During the bring-up, an issue arose where the dimensions for attaching the chip to the air-cooling heat sink were incorrect, preventing the chip from being activated.
Undeterred, the team “immediately got a grinder and just started grinding off the metal,” King recounted. To avoid disrupting the festive, pizza-fueled atmosphere of the bring-up, they discreetly performed the grinding in a conference room.
Staying up through the night to resolve problems “is what silicon bring-up is all about,” King emphasized.
The lab also features a welding station, where hardware lab engineer and master welder Isaac Guevara demonstrated the delicate task of welding tiny integrated circuit components under a microscope. The work is so difficult that Carroll, a senior leader, openly admitted he can’t do it, drawing laughter from Guevara and the other engineers present.
Furthermore, the lab is equipped with both custom-built and commercial tools for testing and diagnosing chip issues. Signal engineer Arvind Srinivasan demonstrated the lab’s method for testing each minute component on the chip.
The lab’s most prominent feature, though, is a display showcasing every generation of the “sleds” the team has designed.
Sleds are specialized trays that house Trainium AI chips, Graviton CPU chips, and their accompanying boards and components. When these sleds are stacked together on a rack with the custom-designed networking components also developed by this team, they form the sophisticated systems fundamental to the success of Anthropic’s Claude.
One such sled was prominently featured during the AWS re:Invent conference in December.
Throughout the tour, I anticipated my guides would highlight the OpenAI deal, but they refrained from doing so.
This reticence could stem from the aforementioned potential legal uncertainties surrounding the agreement. However, my impression was that these frontline engineers, who are currently immersed in designing the next iteration, Trainium4, have not yet had significant opportunities to collaborate directly with OpenAI. Their day-to-day efforts have, to date, been primarily directed towards fulfilling the needs of Anthropic and Amazon.
Currently, the largest deployment of Trainium2 chips is within Project Rainier, one of the world’s most extensive AI compute clusters, which became operational in late 2025 with 500,000 chips and is utilized by Anthropic.
Nevertheless, a wall monitor in the main office displayed a quote detailing OpenAI’s future use of Trainium, suggesting a subtle, underlying sense of pride.
In addition to this lab, the team maintains its own private data center dedicated to quality assurance and testing. Located a short drive away, this facility does not process customer workloads and is housed at a co-location site, rather than an AWS data center.
Security protocols are stringent, with strict measures governing entry to the building and access to Amazon’s designated area within.
The data center’s cooling system is so intensely loud that earplugs are mandatory, and the air carries a distinct, acrid scent of heated metal, making it an inhospitable environment for the average visitor.
Within this data center, rows upon rows of servers are filled with sleds integrating all of Amazon’s latest custom chips: Graviton CPUs, liquid-cooled Trainium3, and Amazon Nitro, all actively engaged in computation. The liquid cooling operates on a closed system, ensuring reuse and contributing to a reduced environmental impact, as explained by the engineers.
A current Trn3 UltraServer configuration features multiple sleds positioned at the top and bottom, with the Neuron switches centrally located. Hardware development engineer David Martinez-Darrow was observed performing maintenance on a sled.
While the team has always garnered considerable attention, scrutiny has notably intensified in recent times.
Amazon CEO Andy Jassy closely monitors this lab, publicly extolling its products with a paternal pride. In December, he declared Trainium an already multi-billion-dollar business for AWS and identified it as one of the AWS technologies that excites him most. He also specifically acknowledged the chip during the announcement of the OpenAI agreement.
The team is acutely aware of this pressure. Engineers commit to working 24/7 for three to four weeks around each bring-up event, meticulously resolving any issues to prepare the chips for mass production and deployment into data centers.
“It’s very important that we get as fast as possible to prove that it’s actually going to work,” Carroll affirmed, concluding, “So far, we’ve been doing really well.”
Disclosure: Amazon provided airfare and covered the cost of one night at a local hotel. In adherence to its Leadership Principle of Frugality, this included a middle seat in the back of the plane and a modest room. Other associated travel expenses, such as ride-shares and luggage fees, were covered by TechCrunch.