Boosting Speed with SRAM for the 'Age of Inference'
Manufactured at Samsung Foundry

The 'Groq 3' unveiled by Nvidia is a new contender aimed at the next phase of the artificial intelligence (AI) semiconductor competition. Some analysts have described it as a decisive move by Nvidia.


On March 16, 2026 (local time), Nvidia CEO Jensen Huang announced during his keynote speech at the developer event 'GTC' held in San Jose, USA, that the next-generation AI platform 'Rubin' will be equipped with the 'Groq 3 LPU.' The Groq 3 chip is designed to enhance inference speed, enabling large-scale AI models to generate answers to user queries more quickly.


The AI industry is undergoing a rapid shift from model training to inference, which focuses on responding to user questions and needs. The rise of agentic AI such as Openclaw further highlights the importance of inference.

Nvidia CEO Jensen Huang delivers the keynote speech at the annual developer conference 'GTC 2026,' held on March 16, 2026 (local time) at the SAP Center in San Jose, California, USA. Photo by AFP Yonhap News


This chip is attracting attention because Nvidia introduced it just three months after acquiring the underlying technology from AI semiconductor startup Groq. In December last year, Nvidia invested approximately 20 billion dollars to acquire the key assets and technology of Groq, which had specialized in AI inference, and brought the related engineers on board.


The move signals that Nvidia is responding actively to industry efforts to move beyond GPU-centric AI infrastructure. While GPUs excel at large-scale training, they can become a bottleneck in inference tasks that demand rapid responses.


To address this, Groq 3 takes a different approach from existing AI chips. Unlike most GPUs and NPUs, which rely on high bandwidth memory (HBM), Groq 3 features an SRAM-based structure. SRAM is fast but expensive, and is therefore typically used as ultra-high-speed cache memory in CPUs and GPUs. Although its capacity is relatively small, its high speed delivers outstanding efficiency at the inference stage.
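To make the memory-bandwidth argument concrete, the back-of-envelope sketch below (in Python) shows why a faster memory path raises the ceiling on generation speed. Every figure in it, the model's weight footprint and the HBM and SRAM bandwidth numbers, is an illustrative assumption rather than a published Groq 3 or Rubin specification; the sketch only captures the general rule that autoregressive decoding must stream the model's weights from memory for each generated token.

```python
# Back-of-envelope sketch: autoregressive decoding streams the model's
# weights from memory for every generated token, so single-stream speed is
# roughly bounded by (memory bandwidth) / (weight bytes).
# All figures below are illustrative assumptions, not Groq 3 or Rubin specs.

def tokens_per_second(model_size_gb: float, bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode rate for a memory-bandwidth-bound model."""
    return bandwidth_gb_s / model_size_gb

MODEL_GB = 70.0       # assumed weight footprint (e.g., ~70B parameters at ~1 byte each)
HBM_GB_S = 3_000.0    # assumed aggregate HBM bandwidth of a training-class GPU
SRAM_GB_S = 80_000.0  # assumed aggregate on-chip SRAM bandwidth of an inference chip

print(f"HBM-bound decode:  ~{tokens_per_second(MODEL_GB, HBM_GB_S):,.0f} tokens/s")
print(f"SRAM-bound decode: ~{tokens_per_second(MODEL_GB, SRAM_GB_S):,.0f} tokens/s")
```

Under these assumed numbers, the SRAM-backed path is more than 25 times faster per stream, the kind of gap that matters when a chatbot or AI agent must answer in real time.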


The chip also carries significance in the landscape of AI semiconductor competition, where rivalry over low-cost, inference-only chips has recently spread rapidly. Cerebras, which uses entire wafers as single chips, has strengthened its collaboration with OpenAI and Amazon AWS. The Korean government, meanwhile, has declared its support for domestic NPUs, separately from its efforts to secure Nvidia's GPUs.



The connection with Samsung Electronics is also noteworthy. CEO Huang revealed that the Groq 3 chips are manufactured by Samsung Foundry; Samsung, with its strength in SRAM, handles Groq 3 production through its foundry services. By contrast, the high bandwidth memory (HBM) used in training GPUs has required a packaging process at TSMC in Taiwan to be combined with Nvidia GPUs.


This content was produced with the assistance of AI translation services.

© The Asia Business Daily(www.asiae.co.kr). All rights reserved.
