[The Editors' Verdict] For the Coexistence of K-AI and Content

by Choi Ilgwon

Published 28 Aug. 2023 08:10 (KST)

Updated 29 Aug. 2023 09:42 (KST)

Naver recently unveiled 'HyperCLOVA X,' a homegrown generative artificial intelligence (AI) model that has learned 6,500 times more Korean than ChatGPT. It is already being hailed as a promising contender in the 'K-AI industry,' one that can compete with OpenAI, Google, and others in the Korean market.

HyperCLOVA X clearly bears the burden of being a project that must succeed for the K-AI industry to develop. Paradoxically, however, its emphasis on Korean language capability is expected to expose an Achilles' heel that could weaken its competitiveness.

The core issue is copyright. Generative AI provides users with the answers they want by first gathering the data it needs for learning, a process called 'text and data mining (TDM).' To train an AI model, that data must be 'copied' and 'transmitted,' and the rights to copy and transmit belong to the original creators. A Korean training corpus 6,500 times larger means the model can give consumers more accurate information, but it also implies a higher likelihood of copyright disputes over the data used as source material.

News copyright is a particularly thorny issue. Because news covers a broad range of subjects and trades on facts, it is the preferred source for generative AI that seeks reliable outputs. Naver CEO Choi Soo-yeon also stated at the HyperCLOVA X launch event, "News content is indeed the highest quality data."

Although large-scale AI models have been launched, no legal measures are in place to protect copyrights. Current law has no clear provisions to prevent infringement of the original data's copyright during generative AI training. This is because generative AI produces results by citing news content indirectly, unlike the traditional form of copyright infringement, direct citation.

There is a strong movement to interpret 'fair use' broadly, which would allow news to be used without copyright restrictions. Under fair use, copying or using copyrighted material without the copyright holder's permission does not constitute infringement if it serves the public interest, and news is considered to fall under this category. Using news content to develop generative AI is likewise viewed as serving the public interest in new technological development.

However, there are significant counterarguments. An IT law expert who requested anonymity said, "Generative AI outputs not only replace the demand for the original copyrighted works but are also used for commercial or profit purposes, so they do not qualify as fair use." The National Assembly's Culture, Sports and Tourism Committee also pointed out in a review report on the Copyright Act amendment, "It only broadly stipulates that fair use is possible within the scope that does not unfairly harm the author's interests," and added, "It is uncertain whether data analysis is exempt from liability."

Recently, the Korea Newspaper Association sent a statement to companies such as Naver and to political circles, emphasizing that "copyright must not be infringed." Judging by the moves of politicians and the government, however, the statement appears to carry no more weight than a scrap of paper. The focus appears to be on loosening copyright restrictions to promote the AI industry rather than on strengthening copyright protection. The Copyright Act amendment proposed by People Power Party lawmaker Lee Yong-ho, submitted to the Culture, Sports and Tourism Committee, adds a new clause to the individual limitations on authors' property rights covering "cases where large amounts of information are analyzed through data mining to generate additional information or value." This means that copyrighted works could be copied and transmitted for data mining without the copyright holder's permission. The Ministry of Economy and Finance also released a 'Service Industry Digitalization Strategy' last month, which includes provisions exempting the use of AI training data from copyright infringement liability.

Of course, the Ministry of Culture, Sports and Tourism, the agency primarily responsible for copyright law, announced 'Copyright Vision 2030' in 2020, expressing its intention to "enhance clarity in copyright protection and use to revitalize new industries in new technological environments such as 5G and artificial intelligence." However, three years have passed since then with little progress. The media industry interprets this as "the government focusing more on promoting future new industries like AI than on protecting creative works."

Unauthorized use of news will inevitably erode the competitiveness of the very content that generative AI values most. Expecting the generative AI industry to grow under these circumstances is unrealistic. According to Grand View Research, the global generative AI market is expected to grow to about 142 trillion won by 2030. For the development of the K-AI industry, news cannot be sacrificed unconditionally.

This content was produced with the assistance of AI translation services.
