Securing Storage Space Through Compression Technology

If Speed Improvements Are Verified, Semiconductor Consumption Could Actually Increase

Recent "TurboQuant Shock" Is Overstated

Google's TurboQuant Faces Scrutiny at ICLR... "AI Services Set for Expansion" (Comprehensive)

"The global artificial intelligence (AI) industry is turning its attention to the International Conference on Learning Representations (ICLR), which will be held in Rio de Janeiro, Brazil from April 23 to 27. This is because Google's AI memory compression technology, 'TurboQuant,' will undergo detailed verification there." (Shin Dongju, CEO of Mobilint)


Google will officially present its TurboQuant paper at ICLR, one of the three major conferences in the field of AI, where it will undergo peer scrutiny, and plans to release the actual program code around June. This comes about a month after the TurboQuant paper was published on the Google Research Blog, sparking heated debate over its impact on the global memory semiconductor market.


TurboQuant is an algorithm that compresses the 'KV cache' (temporary memory) that large language models (LLMs) use to remember context. Because it reduces memory usage to one-sixth, concerns have grown that it could negatively affect Samsung Electronics and SK hynix, which currently dominate the global memory semiconductor market. However, a growing counterargument from academia and the AI semiconductor industry holds that the market's so-called 'TurboQuant shock' is excessive.


Google's Quantization Method Is a Good Idea... "Will Contribute to Infrastructure Expansion"


Jinwon Lee, Chief Technology Officer (CTO) of HyperAccel, said, "The academic community has been continuously discussing techniques to structurally optimize KV cache by compressing it from the conventional 16 bits down to 3 to 4 bits using quantization methods," and added, "While Google's quantization method itself is a good idea, 4-bit quantization is not entirely new."
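The quantization Lee describes maps 16-bit floating-point cache values onto low-bit integers. The sketch below is a generic 4-bit symmetric quantization example, purely illustrative: the function names, per-tensor scaling, and toy tensor shape are assumptions for this article, not Google's published TurboQuant method.

```python
import numpy as np

def quantize_4bit(x):
    """Symmetric per-tensor 4-bit quantization: map float16 values
    to integers in [-8, 7]. Generic illustration only, not the
    TurboQuant algorithm."""
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float16 values from the 4-bit codes.
    return (q.astype(np.float16) * scale).astype(np.float16)

# A toy "KV cache" slice: each float16 value occupies 16 bits;
# the 4-bit codes need a quarter of that, plus one stored scale.
kv = np.random.randn(4, 8).astype(np.float16)
q, s = quantize_4bit(kv)
recon = dequantize(q, s)
```

Production schemes typically quantize per channel or per block rather than per tensor to limit the reconstruction error, which is one of the design points reviewers at ICLR would be expected to probe.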


He further explained, "TurboQuant will make AI models more efficient, which in turn will expand demand even among companies that previously could not afford Nvidia's graphics processing units (GPUs) due to high costs. In fact, if TurboQuant is properly validated at ICLR as a technology that not only uses memory efficiently but also achieves its speed targets, it could actually stimulate semiconductor consumption." Reducing memory usage to one-sixth raises efficiency, enabling complex, large-scale AI services. The expected result is therefore not a smaller market for memory semiconductor makers, but greater demand through infrastructure expansion.
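The one-sixth figure can be put in rough perspective with back-of-the-envelope sizing. The model dimensions and memory budget below are assumed for illustration, not TurboQuant's published numbers.

```python
# Hypothetical LLM shape: 32 layers, 32 attention heads, head
# dimension 128. Keys and values are both cached per token.
layers, heads, head_dim = 32, 32, 128

def kv_bytes_per_token(bytes_per_value):
    # Factor of 2 because both K and V are stored for every token.
    return 2 * layers * heads * head_dim * bytes_per_value

budget = 24 * 1024**3                     # assumed 24 GB memory budget
per_tok_fp16 = kv_bytes_per_token(2)      # 16-bit values: 2 bytes each
per_tok_compressed = per_tok_fp16 // 6    # "one-sixth" memory usage

tokens_fp16 = budget // per_tok_fp16
tokens_compressed = budget // per_tok_compressed
# The same budget now holds roughly six times as many context tokens.
```

Under these assumptions, the same hardware serves far longer contexts or more concurrent users, which is the mechanism behind the demand-expansion argument.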


Shin Dongju, CEO of Mobilint, also stated, "With efficient technologies like TurboQuant, the market will expand even into areas where memory semiconductors were previously unused, so demand will continue to increase. Going forward, supply-side issues, such as competition among Micron in the U.S., Samsung Electronics, and SK hynix, the wait-and-see game over capacity expansion, and the pursuit by Chinese memory companies, will determine the direction of Samsung Electronics' and SK hynix's stock prices."


He added, "For companies developing neural processing units (NPUs), it has become increasingly important to incorporate algorithms like TurboQuant that can resolve memory bottlenecks into their hardware. As the NPU market expands, if a disruptive algorithm emerges after TurboQuant and an NPU cannot support it, that would pose a significant risk."


Shin Dongpyeong, Director of the Technology Foresight Center at the Korea Institute of S&T Evaluation and Planning (KISTEP), commented, "TurboQuant is part of an effort to solve hardware problems through software. In the long run, TurboQuant will expand AI services, and as AI computation becomes possible on more devices, the on-device AI market will also expand."



Director Shin cited the emergence of 'post-transformer AI models' as a key turning point that could change the paradigm of the AI market in the future. He said, "New models that can replace the existing transformer architecture continue to be proposed. The emergence of post-transformer AI models could become the biggest variable capable of fundamentally changing the GPU-dominated AI semiconductor market."


This content was produced with the assistance of AI translation services.

© The Asia Business Daily (www.asiae.co.kr). All rights reserved.
