Naver Whale Participates in 'Artificial Intelligence Data Construction Project'



[Asia Economy Reporter Kang Nahum] Naver Whale is drawing on its web technology to participate in the ‘AI Training Data Construction Support Project’ run by the Ministry of Science and ICT and the National Information Society Agency (NIA). The project is part of ‘Data Dam,’ a program under the government’s Digital New Deal initiative.


According to Naver on the 1st, Whale formed the ‘NSDev Consortium’ with companies and research institutions including NSDev, Seoul National University, and the Korean Phonetics Society to participate in the project. The consortium will build a language education dataset by collecting and processing recordings of Korean spoken by foreigners. From today until November 30, it plans to secure 4,000 hours of pronunciation data and refine it into a format suitable for AI training.


The resulting dataset is expected to be highly useful for developing and improving Korean language education and evaluation solutions in Korea. Explaining the background of its participation, Naver said, "With the global popularity of Korean language education increasing, the companies have joined forces to create a dataset that helps develop effective educational solutions. Given the lack of customized educational materials that take foreigners’ pronunciation systems into account, we have established a plan to address this issue."


Naver will provide an environment in which data can be conveniently collected and managed through the ‘Whale Space Platform’ and ‘Whalebook.’ Whale Space is a platform that brings together various web-based programs, allowing users to continue their work anywhere simply by logging in. Administrators can set which programs and permissions are available to each member, which is expected to be highly useful in a project where many people work together.


In particular, the ‘Whalebook’ will serve as the device for recording foreigners’ Korean pronunciation. Whalebook is designed to work in integration with Whale Space. Administrators can use Whale Space to control the Whalebooks in use collectively and centrally manage the data generated on each one. They can also manage the programs and internet environment accessible from a Whalebook, which helps keep the working environment secure.



Kim Hyo, who leads Naver Whale, said, "The key strength of Whale’s web technology is its versatility: it can be applied across environments regardless of field. Building on this, we will expand the scope of its applications."