'Federated Learning' is a method designed to enable multiple institutions to collaboratively train artificial intelligence (AI) models without directly exchanging data. This approach was devised to address the challenge of aggregating sensitive information, such as patient medical records or financial data, in a single location.


However, during this process, a limitation has emerged: AI models tend to become excessively adapted to the data of specific institutions, making them less effective when encountering new data. A Korean research team has successfully addressed this chronic issue of federated learning, securing stable performance.


(From the bottom left) KAIST Yoonho Lee PhD candidate, Sein Kim integrated MS-PhD candidate, Sungwon Kim PhD candidate, Junseok Lee PhD candidate, Yunhak Oh PhD candidate, (From the top left) Namkyung Lee PhD candidate, Seokwon Yoon PhD candidate at UNC Chapel Hill, Professor Carl Yang at Emory University, Professor Chanyoung Park at KAIST. Provided by KAIST

On October 15, KAIST announced that Professor Chanyoung Park’s research team from the Department of Industrial and Systems Engineering has developed a new training method that resolves the persistent performance degradation in federated learning and enhances the generalization capability of AI models.


Previously, problems mainly arose during fine-tuning, when the jointly developed AI model was optimized for the specific circumstances of each institution. The broad knowledge pooled through collaboration became diluted as the AI over-adapted to the characteristics of a particular institution’s data, a phenomenon known as 'local overfitting.'


For example, if several banks develop a 'joint loan screening AI' and a particular bank fine-tunes it using data centered on large corporate clients, the AI at that bank may excel at screening large corporations but exhibit degraded performance when evaluating individual or startup clients, an instance of local overfitting.


To address this, the research team introduced a 'synthetic data' approach. They extracted only the core and representative features from each institution’s data to generate virtual data that does not include personal information, then applied this synthetic data during the fine-tuning process.
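The pipeline described above, collaborative training followed by fine-tuning on local data plus privacy-preserving synthetic data, can be sketched roughly as follows. The FedAvg-style weight averaging, the Gaussian feature sampling, and all function names here are illustrative assumptions for a toy linear-regression setting, not the team's published algorithm.

```python
# Minimal sketch of federated fine-tuning with synthetic data, assuming
# a FedAvg-style weight average and Gaussian feature sampling. These are
# illustrative stand-ins, NOT the KAIST team's actual method.
import numpy as np

rng = np.random.default_rng(0)

def local_train(w, X, y, lr=0.05, epochs=200):
    """Gradient-descent linear regression on one institution's data."""
    for _ in range(epochs):
        w = w - lr * X.T @ (X @ w - y) / len(y)
    return w

def synthesize(X, y, n=100):
    """Stand-in synthetic data: sample from per-feature mean/std so no
    raw record leaves an institution; label with a fitted linear model."""
    Xs = rng.normal(X.mean(axis=0), X.std(axis=0), size=(n, X.shape[1]))
    ys = Xs @ np.linalg.lstsq(X, y, rcond=None)[0]
    return Xs, ys

# Two institutions whose inputs are distributed differently
# (e.g. large-corporate vs. retail loan applicants).
true_w = np.array([1.0, -2.0])
X1 = rng.normal(0, 1, (200, 2)); y1 = X1 @ true_w
X2 = rng.normal(3, 1, (200, 2)); y2 = X2 @ true_w

# 1) Federated round: average locally trained weights (FedAvg-style),
#    so only model parameters, never raw records, are exchanged.
w_global = (local_train(np.zeros(2), X1, y1)
            + local_train(np.zeros(2), X2, y2)) / 2

# 2) Institution 1 fine-tunes on its own data PLUS synthetic data that
#    summarizes both institutions, to curb local overfitting.
Xs, ys = synthesize(np.vstack([X1, X2]), np.concatenate([y1, y2]))
w_ft = local_train(w_global, np.vstack([X1, Xs]), np.concatenate([y1, ys]))
```

The key design point the sketch illustrates is that the fine-tuning set mixes the institution's own records with synthetic samples that carry only aggregate feature statistics, so specialization and generalization are trained jointly.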


This allows each institution’s AI to strengthen its expertise according to its own data without sharing personal information, while also retaining the broad perspective (generalization capability) gained through collaborative learning.


The study found that this training method is particularly effective in fields where data security is paramount, such as healthcare and finance. Moreover, it is expected to have wide applicability in environments like social media and e-commerce, where new users and products are continuously added, as it can maintain and demonstrate stable performance even under such dynamic conditions.


Most importantly, the new training method developed by the research team enables AI to maintain stable performance even when new institutions join the collaboration or when data characteristics change rapidly, a key strength of the approach.


Professor Park stated, "This research is significant in that it opens a path for each institution’s AI to achieve both specialization and versatility while preserving data privacy. We expect it will make a substantial contribution to fields where data collaboration is essential but security is critical, such as medical AI and financial fraud detection AI."


Meanwhile, the study was conducted with Sungwon Kim, a student at the Graduate School of Data Science, as the first author, and Professor Chanyoung Park as the corresponding author.



The research team recently achieved another milestone when their paper was selected for an oral presentation at the International Conference on Learning Representations 2025, an AI conference held in Singapore. Oral presentations are reserved for the top 1.8% of outstanding papers at the conference.


This content was produced with the assistance of AI translation services.

© The Asia Business Daily(www.asiae.co.kr). All rights reserved.
