by Kim Jonghwa
Published 21 Apr. 2026, 08:00 (KST)
A team of Korean researchers has demonstrated its competitiveness in reinforcement learning by having three papers simultaneously accepted at one of the world’s leading artificial intelligence (AI) conferences. On April 21, UNIST (Ulsan National Institute of Science and Technology) announced that three papers from Professor Seungyeol Han’s research group at the Graduate School of Artificial Intelligence have been accepted for presentation at the International Conference on Learning Representations (ICLR), to be held on April 23 in Rio de Janeiro, Brazil.
ICLR is considered one of the world's top three AI conferences, alongside the Conference on Neural Information Processing Systems (NeurIPS) and the International Conference on Machine Learning (ICML). This year, around 19,000 papers were submitted globally, and only about 27% were accepted. It is rare for a single laboratory to have three papers accepted simultaneously.
Research team photo. From the left: Professor Seungyeol Han and the lead authors of each study, researchers Sanghyun Lee, Jaebak Hwang, and Yonghyun Jo. Provided by UNIST
All of these achievements were made in the field of reinforcement learning. Reinforcement learning is a core technology for "physical AI," such as robotics and autonomous driving, where AI learns optimal behavior through trial and error by interacting with its environment.
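The trial-and-error loop described above can be sketched with textbook tabular Q-learning. The toy chain environment, hyperparameters, and reward below are illustrative assumptions for this article, not the team's actual setup.

```python
import random

# Minimal tabular Q-learning sketch of trial-and-error learning: an agent on a
# 1-D chain of states learns to walk right toward a rewarding goal state.
N_STATES = 5           # states 0..4; state 4 is the goal
ACTIONS = (-1, +1)     # step left or step right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.2

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Environment: deterministic move; reward 1 only on reaching the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0), nxt == N_STATES - 1

def greedy(s):
    # Random tie-breaking so early (all-zero) values do not bias the agent.
    return max(ACTIONS, key=lambda a: (Q[(s, a)], random.random()))

random.seed(0)
for _ in range(500):                       # episodes of trial and error
    s = 0
    for _ in range(30):
        a = random.choice(ACTIONS) if random.random() < EPS else greedy(s)
        s2, r, done = step(s, a)
        # Move Q(s, a) toward reward + discounted best value of the next state.
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2
        if done:
            break

# After training, the greedy action in every non-goal state is +1 (toward the goal).
policy = {s: greedy(s) for s in range(N_STATES - 1)}
print(policy)
```

The same interact-update loop underlies the far larger systems used for robotics and autonomous driving; only the environment, value representation, and policy class change.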
The research team first proposed the "Self-Improving Skill Learning (SISL)" method, which achieves high performance using only offline data collected in industrial settings. Even if errors are present in the data, the method autonomously identifies useful behavior patterns and removes noise, enabling stable learning.
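The article does not give SISL's internals, but the general idea of pruning noisy offline data before training can be illustrated as below. The `filter_trajectories` helper and its z-score threshold are hypothetical stand-ins, not the published algorithm.

```python
import statistics

# Hypothetical illustration (not the published SISL method): score each logged
# trajectory by its return and drop statistical outliers, so corrupted entries
# in field-collected data do not destabilize offline training.
def filter_trajectories(trajectories, z_max=2.0):
    """trajectories: list of lists of (state, action, reward) tuples."""
    returns = [sum(r for _, _, r in traj) for traj in trajectories]
    mu = statistics.mean(returns)
    sd = statistics.pstdev(returns) or 1.0   # avoid division by zero
    return [t for t, g in zip(trajectories, returns)
            if abs(g - mu) / sd <= z_max]

clean = [[(0, 1, 1.0), (1, 1, 1.0)] for _ in range(9)]
noisy = [[(0, 1, -50.0)]]                        # corrupted log entry
kept = filter_trajectories(clean + noisy)
print(len(kept))  # 9: the outlier trajectory is removed
```

In practice, offline RL methods use far richer signals than a return threshold (e.g., learned behavior models or uncertainty estimates), but the filtering principle is the same.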
The team also addressed inefficiencies that arise in complex, long-horizon tasks. Conventional methods that divide an objective into multiple stages often learn slowly because they select unattainable intermediate goals. The team improved both success rates and learning speed by introducing the "Strict Subgoal Execution (SSE)" technique, which ensures that only achievable subgoals are selected.
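A generic feasibility check conveys the flavor of restricting a high-level policy to achievable subgoals. The `pick_subgoal` function and the `reachable` set below are assumptions for illustration, not the paper's SSE technique.

```python
# Hypothetical sketch (not the paper's SSE algorithm): a high-level policy
# proposes candidate subgoals, but only subgoals the low-level controller has
# actually reached before (tracked in `reachable`) are accepted, avoiding
# wasted steps chasing unattainable intermediate goals.
def pick_subgoal(candidates, reachable, score):
    """Return the best-scoring candidate known to be achievable."""
    feasible = [g for g in candidates if g in reachable]
    if not feasible:              # fall back to any known-reachable subgoal
        feasible = list(reachable)
    return max(feasible, key=score)

reachable = {(1, 0), (0, 1), (1, 1)}
candidates = [(5, 5), (1, 1), (0, 1)]            # (5, 5) was never reached
goal = pick_subgoal(candidates, reachable, score=lambda g: g[0] + g[1])
print(goal)  # (1, 1): best feasible option, even though (5, 5) scores higher
```

Real hierarchical methods learn the reachable set (for example, from low-level success statistics) rather than enumerating it, but the selection constraint works the same way.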
The team also overcame limitations in environments that require cooperation among multiple AI agents. In such multi-agent settings, the optimal action can change with the situation, yet existing algorithms often commit to a single solution. The research team proposed "Sequential Subvalue Q-Learning (S2Q)," which evaluates multiple alternative action values simultaneously, enabling flexible decision-making even in cooperative settings.
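One way to picture keeping multiple alternative value estimates is shown below. This is a loose illustration of the "several subvalues instead of one" idea; the data structures and selection rule are invented here and are not the published S2Q method.

```python
# Hypothetical illustration (not the published S2Q method): instead of a single
# value estimate per action, an agent keeps several alternative estimates
# ("subvalues") and acts on whichever alternative is best in the current
# situation, so cooperation is not locked to one fixed solution.
def best_action(subvalues, state):
    """subvalues: list of dicts mapping (state, action) -> estimated value."""
    actions = {a for q in subvalues for (s, a) in q if s == state}
    # For each action, take the most optimistic of its alternative values.
    return max(actions, key=lambda a: max(q.get((state, a), float("-inf"))
                                          for q in subvalues))

q_alt1 = {("meet", "left"): 0.9, ("meet", "right"): 0.1}
q_alt2 = {("meet", "left"): 0.2, ("meet", "right"): 0.8}
print(best_action([q_alt1, q_alt2], "meet"))  # "left": 0.9 beats 0.8
```

The point of maintaining alternatives is that when teammates' behavior shifts, a different estimate can dominate, letting the agent switch coordination strategies instead of clinging to a single learned optimum.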
Sanghyun Lee, Jaebak Hwang, and Yonghyun Jo participated as first authors of each study. Professor Seungyeol Han stated, "We have confirmed the potential of applying reinforcement learning stably even with limited data and in uncertain environments," adding, "We expect these results to expand into various industrial fields, including autonomous driving, robotics, and smart manufacturing."
© The Asia Business Daily (www.asiae.co.kr). All rights reserved.