ETRI: Multimodal AI "Catastrophic Forgetting" Solved... Knowledge Editing Performance Improved

A South Korean research team has presented a breakthrough in addressing the issue of "catastrophic forgetting" in multimodal artificial intelligence (AI). Multimodal AI systems, including ChatGPT, Gemini, and Claude, can understand both images and text simultaneously, enabling them to describe photos or answer questions about the contents of an image. However, a major challenge remains: when acquiring new information or revising existing knowledge, these systems frequently forget previously learned information, a phenomenon known as catastrophic forgetting. The research team has drawn attention for developing a core foundational technology that tackles this problem.


The Electronics and Telecommunications Research Institute of Korea (ETRI) announced on March 24 that the Language Intelligence Laboratory, led by Sujong Lim, in collaboration with Pohang University of Science and Technology (POSTECH) and Sungkyunkwan University, has developed "MemEIC," a technology for continuous and composite knowledge editing.


This technology was selected for presentation at the international AI conference "NeurIPS 2025" and was recently unveiled in San Diego, United States.


ETRI researchers are discussing the implementation process of continuous and composite knowledge editing technology (MemEIC). Electronics and Telecommunications Research Institute (ETRI)


Traditional multimodal AI systems typically modify internal core parameters directly to update knowledge. This approach, akin to a "brain surgery" that fundamentally alters the model's structure, has the drawback of affecting other stored information during the editing process.


When visual and linguistic information are edited simultaneously, the two types of knowledge often become entangled, leading the AI to misunderstand and frequently generate incorrect answers to complex questions.


For example, if the AI is sequentially trained with the visual information "The dessert in the photo is Dubai Chewy Cookie (Dubai Jjonddeuk Cookie)" and the linguistic information "Dujjonku is very popular in Korea," and then asked, "In which country is this dessert popular?", traditional multimodal AI models often produce distorted answers such as, "The image in the photo is a chocolate truffle, which is popular in Europe."


MemEIC was developed to address these issues. Inspired by the structure of the human brain, the technology stores new information in external memory instead of inside the AI model itself, allowing the AI to selectively retrieve and use information as needed. Much like how the human brain separates functions between the left and right hemispheres, MemEIC enables the AI to distinguish, store, and utilize different types of knowledge independently.
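The article does not publish MemEIC's implementation, but the general external-memory editing idea it describes can be sketched as follows. All class and function names here are hypothetical, and the keyword-based retrieval is a stand-in for the embedding-similarity lookup a real system would use; the key point is that edits live outside the model, so its weights are never modified.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    key: str   # topic tag standing in for a real embedding key
    fact: str  # the new or corrected piece of knowledge


class ExternalMemory:
    """Stores knowledge edits outside the model; base weights stay untouched."""

    def __init__(self) -> None:
        self.edits: list[Edit] = []

    def add(self, key: str, fact: str) -> None:
        # Adding an edit is an append, not a parameter update,
        # so unrelated stored knowledge cannot be overwritten.
        self.edits.append(Edit(key, fact))

    def retrieve(self, query: str) -> list[str]:
        # Toy keyword match; a real system would rank stored keys
        # by embedding similarity to the query.
        q = query.lower()
        return [e.fact for e in self.edits if e.key in q]


memory = ExternalMemory()
memory.add("dessert", "The dessert in the photo is Dubai Chewy Cookie (Dujjonku).")
memory.add("popular", "Dujjonku is very popular in Korea.")

# Only the edits relevant to the question are pulled in at answer time.
facts = memory.retrieve("In which country is this dessert popular?")
```

Because the model itself is read-only under this scheme, removing or revising an edit is as cheap as deleting or replacing an entry in the memory.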


ETRI emphasized that this structure makes it easy to expand the system, as it allows for the flexible addition of new information while maintaining the stability of the existing model.


In practice, an AI equipped with MemEIC accurately combined separately stored visual and linguistic data to answer the previous example question with: "The dessert in the photo is Dubai Chewy Cookie (Dujjonku), and it is very popular in Korea."


This separate storage and selective integration structure minimizes internal interference and the loss of existing knowledge by connecting pieces of information only when needed. As a result, the system enables "composite reasoning," providing correct answers to complex queries.
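A minimal sketch of this selective integration, loosely mirroring the left/right-hemisphere analogy above: visual and linguistic edits sit in separate stores, and a composite answer is assembled only from the pieces a question actually needs. The store layout and retrieval logic are illustrative assumptions, not MemEIC's actual design.

```python
# Two independent stores, one per knowledge type (hypothetical layout).
visual_memory = {
    "dessert_identity": "The dessert in the photo is Dubai Chewy Cookie (Dujjonku)."
}
linguistic_memory = {
    "dessert_popularity": "Dujjonku is very popular in Korea."
}


def answer_composite(question: str) -> str:
    """Retrieve from each store independently, then join only the
    pieces the question requires (keyword matching for illustration)."""
    q = question.lower()
    parts = []
    if "dessert" in q:
        parts.append(visual_memory["dessert_identity"])
    if "popular" in q:
        parts.append(linguistic_memory["dessert_popularity"])
    return " ".join(parts)


print(answer_composite("In which country is this dessert popular?"))
```

Keeping the two stores separate means an edit to one knowledge type cannot entangle or corrupt the other, which is the interference problem described earlier.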


The joint research team constructed the Composite Continual Knowledge Editing Benchmark (CCKEB), which contains 1,278 items, to evaluate the performance of the technology. They also conducted experiments that sequentially edited hundreds of knowledge items. In these tests, MemEIC achieved a composite question accuracy of approximately 70%, compared with 36% to 52% for conventional technologies.


Most notably, ETRI highlighted the technology's ability to preserve "locality": even when new knowledge is added, answers to existing questions remain unchanged, thereby maintaining response stability.


This research is significant not only for mitigating AI's forgetting phenomenon but also for simultaneously solving the dual challenges of continual knowledge editing and composite reasoning.


Sujong Lim, head of ETRI's Language Intelligence Laboratory, stated, "This research is significant because it allows multimodal AI to reflect up-to-date information required in real service environments while also ensuring reliability. The joint research team plans to further advance the technology so that it can reliably incorporate a wide variety of information from industrial sites."


This work was conducted as part of the "Next-Generation Generative AI Technology Development Project" and the "Development of learning and utilization technologies for the sustainability of generative language models and the reflection of up-to-date information over time," supported by the Ministry of Science and ICT and the Institute of Information & Communications Technology Planning & Evaluation (IITP).

© The Asia Business Daily(www.asiae.co.kr). All rights reserved.