Hardware prototype and evaluation configuration. Image provided by KAIST

[Asia Economy Reporter Kim Bong-su] Researchers in Korea have developed a graph machine learning system that runs more than 7 times faster and consumes over 33 times less energy than existing models.


KAIST announced on the 10th that the research team of Professor Myoungsoo Jung in the School of Electrical Engineering has developed the world's first "Holistic Graph-based Neural Network Machine Learning Technology" (hereafter Holistic GNN), which performs graph processing, graph sampling, and neural network acceleration for graph machine learning inference directly near the storage device (SSD) where the data is kept.


The research team prototyped a new type of computational storage (SSD) system equipped with a custom programmable semiconductor (FPGA), dedicated neural network acceleration hardware for machine learning, and graph-specific processing controllers and software. Under ideal conditions, the system demonstrated a 7-fold speed improvement and 33-fold energy savings compared to machine learning acceleration on the latest high-performance NVIDIA GPUs.


Unlike conventional neural-network-based machine learning techniques, the new class of models built on graph data structures can represent relationships between data items. They are used in a wide range of fields and applications, from large-scale social network services (SNS) such as Facebook, Google, LinkedIn, and Uber to navigation and new drug development. For example, analyzing user networks stored as graphs enables product recommendations and friend suggestions that resemble human reasoning, which were out of reach for traditional machine learning. So far, this emerging graph-based neural network machine learning has been computed by reusing general-purpose machine learning accelerators such as GPUs. In practical systems, however, this approach suffers severe performance bottlenecks and device memory shortages during data preprocessing steps such as loading the graph from storage into memory and sampling it.
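
To make that preprocessing bottleneck concrete, the sketch below walks through a typical GNN inference pipeline in plain Python. It is a minimal illustration, not code from the study; the file format, sampling fan-out, and mean aggregation are simplifying assumptions in the style of GraphSAGE.

```python
import random

# Minimal sketch of a GNN inference pipeline (GraphSAGE-style).
# Steps 1-2 are the storage-bound preprocessing the article describes;
# step 3 is the only part a GPU traditionally accelerates.

def load_graph(path):
    """Step 1: load adjacency lists from storage (the I/O bottleneck)."""
    graph = {}
    with open(path) as f:
        for line in f:
            src, dst = map(int, line.split())
            graph.setdefault(src, []).append(dst)
    return graph

def sample_neighbors(graph, node, fanout=10):
    """Step 2: sample a fixed-size neighborhood of a node."""
    nbrs = graph.get(node, [])
    return random.sample(nbrs, min(fanout, len(nbrs)))

def aggregate(features, node, sampled):
    """Step 3: mean-aggregate neighbor features into an embedding."""
    rows = [features[n] for n in sampled] or [features[node]]
    return [sum(r[i] for r in rows) / len(rows) for i in range(len(rows[0]))]
```

Steps 1 and 2 must touch graph data on storage before any neural computation can begin, which is exactly the stage Holistic GNN moves next to the SSD.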


The Holistic GNN technology accelerates the entire inference process, on user request, directly beside the storage that holds the graph data. Specifically, it adopts a new computational storage (Computational SSD) architecture that places programmable semiconductors next to the storage media. By eliminating the movement of large-scale graph data and performing graph processing and graph sampling near the data (near storage), it removes the bottlenecks in the graph machine learning preprocessing stage.
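
A rough back-of-the-envelope model shows why this matters. The sizes and bandwidth below are my own illustrative assumptions, not figures from the study: a host-based pipeline must move the entire graph across the storage link, while a computational SSD returns only the final results.

```python
# Back-of-the-envelope data-movement comparison (illustrative numbers,
# not measurements from the KAIST study).

GRAPH_BYTES = 100e9        # assumed on-disk graph size: 100 GB
RESULT_BYTES = 10e6        # assumed inference output size: 10 MB
SSD_TO_HOST_BPS = 3.5e9    # assumed NVMe read bandwidth: 3.5 GB/s

# Host/GPU pipeline: the whole graph must cross the storage link.
host_transfer_s = GRAPH_BYTES / SSD_TO_HOST_BPS

# Near-storage pipeline: only the results leave the device.
near_storage_transfer_s = RESULT_BYTES / SSD_TO_HOST_BPS

print(f"host pipeline transfer:         {host_transfer_s:.1f} s")
print(f"near-storage pipeline transfer: {near_storage_transfer_s:.4f} s")
```

Under these assumed numbers, simply reading the graph takes tens of seconds before any computation starts, while the near-storage path transfers only milliseconds' worth of results.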


Conventional computational storage has been limited by fixed firmware and hardware configurations inside the device. Beyond graph processing and graph sampling, the team's Holistic GNN provides device-level software that can be programmed with multiple graph machine learning models, along with a neural network acceleration hardware framework that users can freely modify, supporting the various hardware architectures and software needed to accelerate AI inference.
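
As a purely hypothetical illustration of what such device-level programmability could look like, the sketch below defines an invented `ComputationalSSD` interface. None of these class or method names come from the paper, which describes its own hardware/software framework; the sketch only contrasts runtime-loadable models and kernels with fixed firmware.

```python
# Purely hypothetical sketch of a reprogrammable device-side framework.
# Every name here is invented for illustration; the real interfaces are
# described in the team's FAST 2022 paper.

class ComputationalSSD:
    def __init__(self):
        self.models = {}   # loadable GNN model definitions
        self.kernels = {}  # user-built accelerator kernels (FPGA bitstreams)

    def register_model(self, name, layers):
        """Install a GNN model (e.g., GraphSAGE or GCN) at runtime,
        rather than baking a single model into fixed firmware."""
        self.models[name] = layers

    def load_kernel(self, name, bitstream):
        """Swap in a user-modified neural network acceleration kernel."""
        self.kernels[name] = bitstream

    def infer(self, model, kernel, node_ids):
        """Run sampling and inference inside the device; only the
        results cross back to the host."""
        assert model in self.models and kernel in self.kernels
        return [f"embedding({n})" for n in node_ids]  # placeholder output
```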


To verify the effectiveness of the Holistic GNN technology, the research team fabricated a prototype of the computational storage device and implemented the developed hardware RTL (Register Transfer Level, a design abstraction that describes a circuit in terms of registers and the data transfers between them) and the software framework for graph machine learning on it. Evaluating graph machine learning inference on the fabricated computational storage accelerator prototype against the latest high-performance NVIDIA GPU acceleration system (RTX 3090), they confirmed that Holistic GNN is on average 7 times faster and consumes 33 times less energy than existing GPU-based acceleration under ideal conditions. Notably, as graph size grows, the benefit of relieving the preprocessing bottleneck grows with it, reaching up to 201 times the speed and 453 times the energy savings of conventional GPUs.


Professor Jung said, "We have secured a computational storage acceleration system, optimized for energy savings, that performs high-speed graph machine learning inference near storage for large-scale graphs and can replace existing high-performance acceleration systems. It can be applied to a wide range of real-world applications, such as ultra-large recommendation systems, traffic prediction systems, and new drug development."



The research results will be presented at the USENIX Conference on File and Storage Technologies (FAST) 2022, an international academic conference in the field of storage systems to be held in the United States this coming February. Detailed information is available on the website of KAIST's Computer Architecture and Memory Systems Laboratory, which Professor Jung leads.


This content was produced with the assistance of AI translation services.

© The Asia Business Daily(www.asiae.co.kr). All rights reserved.
