OpenAI Unveils Its First In-House AI Chip: Jalapeño

OpenAI has officially unveiled its first in-house AI chip, named Jalapeño — a nod to the Mexican chili pepper. Co-developed with Broadcom (NASDAQ: AVGO), the chip is purpose-built to accelerate large language model (LLM) inference workloads.

According to OpenAI, Jalapeño is optimized for inference rather than model training. The company positions it as a blank-slate design for modern LLM inference and claims compatibility with all major large language models.

Architecture & Design Priorities

The chip's architecture is built around three core design principles:

  • Reducing data movement overhead
  • Balancing resource allocation across compute, memory, and networking
  • Pushing real-world hardware utilization closer to theoretical peak performance
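
The third principle, closing the gap between achieved and peak throughput, is often reasoned about with a roofline-style bound. The sketch below is illustrative only: the function names and every number in it are hypothetical and are not Jalapeño specifications. It shows that a kernel's attainable throughput is capped by either peak compute or by memory bandwidth times arithmetic intensity, whichever is lower.

```python
def attainable_tflops(peak_tflops: float, mem_bw_tbps: float, flops_per_byte: float) -> float:
    """Roofline bound: the lower of the compute ceiling and the memory ceiling.

    TB/s multiplied by FLOP/byte gives TFLOP/s, so both terms share a unit.
    """
    return min(peak_tflops, mem_bw_tbps * flops_per_byte)


def utilization(achieved_tflops: float, peak_tflops: float) -> float:
    """Fraction of theoretical peak that a workload actually reaches."""
    return achieved_tflops / peak_tflops


# Hypothetical accelerator, NOT real chip figures.
PEAK_TFLOPS = 1000.0
MEM_BW_TBPS = 4.0

for intensity in (50.0, 250.0, 1000.0):  # FLOPs performed per byte moved
    bound = attainable_tflops(PEAK_TFLOPS, MEM_BW_TBPS, intensity)
    print(f"intensity={intensity:7.1f} FLOP/B  bound={bound:7.1f} TFLOP/s  "
          f"max utilization={utilization(bound, PEAK_TFLOPS):.0%}")
```

Under these assumed numbers, a kernel that performs only 50 FLOPs per byte cannot exceed 20% of peak no matter how well it is tuned, which is why reducing data movement and balancing compute against memory go hand in hand.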

Broadcom supplies its Tomahawk switch chips for the networking subsystem. Per reports from SemiWiki and Tom's Hardware, the chip is expected to be manufactured on TSMC's 3nm process node, to use a systolic array architecture, and to be paired with high-bandwidth memory (HBM3E or HBM4). Arm has custom-designed companion CPUs for the chip.
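
For readers unfamiliar with systolic arrays, the following is a minimal software sketch of the general technique, assuming an output-stationary layout. It is a textbook-style model, not a description of Jalapeño's actual design, and the function name `systolic_matmul` is hypothetical. Each processing element (PE) holds one output value; rows of A flow rightward and columns of B flow downward, one hop per cycle, with skewed injection so that matching operands meet at the right PE.

```python
def systolic_matmul(a, b):
    """Simulate C = A @ B on an m x n grid of multiply-accumulate PEs.

    Returns the result matrix and the number of cycles the wavefront needs.
    """
    m, depth, n = len(a), len(a[0]), len(b[0])
    acc = [[0] * n for _ in range(m)]    # output held in place inside each PE
    a_reg = [[0] * n for _ in range(m)]  # operand currently moving rightward
    b_reg = [[0] * n for _ in range(m)]  # operand currently moving downward
    cycles = m + n + depth - 2           # last PE finishes at this cycle count

    for t in range(cycles):
        for i in range(m):
            # Shift row i one PE to the right, then inject at the left edge.
            for j in range(n - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
            k = t - i                    # row i is delayed by i cycles (skew)
            a_reg[i][0] = a[i][k] if 0 <= k < depth else 0
        for j in range(n):
            # Shift column j one PE down, then inject at the top edge.
            for i in range(m - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
            k = t - j                    # column j is delayed by j cycles
            b_reg[0][j] = b[k][j] if 0 <= k < depth else 0
        for i in range(m):
            for j in range(n):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc, cycles


result, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(result, cycles)  # [[19, 22], [43, 50]] 4
```

The appeal of the layout is that each operand is fetched from memory once and then reused as it passes from PE to PE, which limits data movement relative to the amount of arithmetic performed.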

Tapeout Completed in Just Nine Months

From initial design to tapeout (the handoff of the final design to the foundry for fabrication), the entire development cycle took only nine months, an exceptionally fast turnaround in the high-performance semiconductor space. By comparison, a high-performance ASIC typically takes two to three years to reach the same point.

OpenAI attributes this accelerated timeline to leveraging large language models to streamline portions of the chip design and optimization workflows.

OpenAI's hardware division is led by Richard Ho, who previously worked on TPU engineering at Google and served as Senior Vice President of Silicon and Software Engineering at photonic computing startup Lightmatter. He joined OpenAI in November 2023.

"Jalapeño was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers. We optimized the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models. Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware's theoretical limits."

Richard Ho, Head of Hardware, OpenAI

Three-Party Division of Responsibilities

Three entities are involved in the chip's development and commercialization:

  • OpenAI: Chip architecture design, kernel optimization, and serving system development
  • Broadcom: Silicon implementation, networking technology (including Tomahawk switch chips), and production rollout
  • Celestica: PCB design, rack hardware, and overall system integration

Celestica is a Canadian electronics manufacturing services provider and the preferred manufacturing partner for Google TPUs; it now delivers system integration work for this OpenAI project.

"Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the next decade of AI. This is just the beginning of a multi-generation roadmap. By co-developing our industry-leading silicon directly with OpenAI, we are enabling the deployment of gigawatt-scale data centers with Microsoft and other partners beginning in 2026."

Hock Tan, CEO of Broadcom

In today's official announcement, Hock Tan formally named Microsoft a strategic partner. Earlier reporting from The Information noted financing hurdles surrounding this initiative — codenamed Project Nexus — with Phase 1 capital expenditure estimated at $180 billion. Broadcom had previously stipulated it would only commit funding if Microsoft agreed to purchase roughly 40% of chip output, though no formal procurement contract had been signed at that time.

Compatibility with GPT-5.3-Codex-Spark

In lab environments, Jalapeño has been running models including GPT-5.3-Codex-Spark at production-targeted frequencies and power envelopes. This model is also accessible within the Codex ecosystem.

Early benchmark testing indicates that Jalapeño achieves state-of-the-art (SOTA) performance per watt. A detailed technical whitepaper is expected to be released in the coming months.

Deployment Timeline

Preliminary deployment is scheduled to begin at the end of 2026, with capacity expanding year over year. The overall project targets gigawatt (GW)-scale data center infrastructure, with Microsoft among its key partners. For reference, 10 gigawatts is roughly equivalent to the total residential electricity consumption of Beijing.
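
Because a gigawatt measures power rather than energy, a quick conversion helps put the figure in context. The sketch below assumes, purely for illustration, that the full capacity is drawn continuously for a non-leap year; the `load_factor` parameter and the function name are hypothetical conveniences, not figures from the announcement.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours (non-leap year)


def annual_energy_twh(power_gw: float, load_factor: float = 1.0) -> float:
    """Energy in TWh used over one year at an average draw of power_gw * load_factor."""
    return power_gw * load_factor * HOURS_PER_YEAR / 1000.0  # GWh -> TWh


print(f"{annual_energy_twh(10):.1f} TWh per year")  # 87.6 TWh per year
```

A lower assumed load factor scales the result linearly, so a site averaging half its rated draw would use about half that energy.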

Jalapeño is the first generation in a multi-generation silicon roadmap. According to prior reports, the second-generation chip carries the codename Serrano — another variety of chili pepper.

"Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems. By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access."

Greg Brockman, Co-Founder and President of OpenAI

Chronological Timeline

  • October 2025: OpenAI announced a partnership with Broadcom to deploy 10 GW of custom-built accelerators
  • October 2025: OpenAI struck a deal with NVIDIA, under which NVIDIA may invest up to $100 billion and supply at least 10 GW of data center systems
  • October 2025: OpenAI signed a 6 GW chip supply agreement with AMD, including a warrant that lets OpenAI acquire up to roughly a 10% equity stake in AMD
  • June 2026: OpenAI signed a 750 MW inference compute capacity agreement with Cerebras
  • June 24, 2026: Physical samples of the Jalapeño chip were delivered

In short, OpenAI is advancing four parallel compute strategies: its in-house chip program, plus partnerships with NVIDIA, AMD, and Cerebras.
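
To give a sense of the combined scale, the capacity figures from the timeline can be added up. This is simple arithmetic on the numbers reported above, treating NVIDIA's 10 GW minimum as its contribution; the dictionary labels are informal and the total is not an official OpenAI figure.

```python
# Capacity per agreement in gigawatts, as listed in the timeline.
announced_gw = {
    "Broadcom (custom accelerators)": 10.0,
    "NVIDIA (stated minimum)": 10.0,
    "AMD": 6.0,
    "Cerebras": 0.75,  # 750 MW
}

total_gw = sum(announced_gw.values())

for partner, gw in announced_gw.items():
    print(f"{partner:32s} {gw:6.2f} GW  {gw / total_gw:6.1%}")
print(f"{'Total':32s} {total_gw:6.2f} GW")
```

On these figures the four tracks total 26.75 GW, with the in-house Broadcom program accounting for a little over a third.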

Google has its TPU lineup, Amazon its Trainium family, Meta its MTIA silicon, and Microsoft its Maia chips. Now OpenAI has its own custom accelerator: Jalapeño.

Key Industry Terminology

  1. Blank-slate design: A ground-up custom design with no legacy constraints
  2. Systolic array: A grid of processing elements that pass operands to their neighbors each cycle, widely used for matrix computation in AI accelerators
  3. ASIC: Application-Specific Integrated Circuit
  4. Tapeout: Final design handoff to wafer fabrication
  5. Gigawatt-scale: Ultra-large-scale power draw characteristic of hyperscale data centers
