"Yellow Grapes and Purple Bananas? ... Beyond Visual Concept Understanding to 'Imaginative Artificial Intelligence' Implementation"

by Jeong Ilwoong

Published 30 Nov. 2023 08:54 (KST)

It is now possible to build artificial intelligence that understands and imagines visual concepts it has never seen before, such as 'yellow grapes' and 'purple bananas.' The advance is expected to accelerate progress in research on AI reasoning and imagination.

KAIST announced on the 30th that a research team led by Professor Seongjin Ahn of the School of Computing, in an international joint study with Google DeepMind and Rutgers University in the United States, has developed a benchmark for evaluating AI models and related programs that understand new concepts by combining visual knowledge.

(From left) Seongjin Ahn, Professor, School of Computing, KAIST; Youngbin Kim, Master's student, School of Computing, KAIST; Gautam Singh, PhD student, Rutgers University; Junyoung Park, Master's student, School of Computing, KAIST; Caglar Gulcehre, Senior Researcher, DeepMind. Provided by KAIST

Humans learn general concepts such as 'purple grapes' and 'yellow bananas,' and have the ability to separate and recombine these to 'imagine' non-existent concepts like 'yellow grapes' and 'purple bananas.'

This ability is called systematic generalization or compositional generalization and is considered a key element in the process of realizing general artificial intelligence.

Systematic generalization has remained an open challenge in deep learning for 35 years, ever since American cognitive scientists Jerry Fodor and Zenon Pylyshyn argued in 1988 that artificial neural networks could not solve this problem.

The problem of systematic generalization in deep learning arises not only in language but also in visual information. Research, however, has focused mainly on systematic generalization in language, and studies on visual information have been comparatively scarce.

Accordingly, the international joint research team developed a benchmark for studying systematic generalization of visual information, which is expected to open new ground in a research area that has so far been neglected. Unlike language, however, visual information has no clear 'word' or 'token' structure, so learning that structure and generalizing it systematically is expected to pose a new challenge.

Professor Seongjin Ahn said, "Systematic generalization of visual information is an essential ability for achieving general artificial intelligence," and added, "Through this research, we expect progress in fields related to AI reasoning and imagination to accelerate."

Caglar Gulcehre, a researcher responsible for this study at DeepMind and a professor at École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland, said, "If systematic generalization becomes possible, it is expected that AI performance can be significantly improved with much less data than currently required."

Meanwhile, the joint research team's study will be presented at the 37th Conference on Neural Information Processing Systems (NeurIPS), held in New Orleans, USA, from December 10 to 16.

This content was produced with the assistance of AI translation services.
