Kakao Releases AI Language Model Performance Evaluation Dataset
Kakao announced on the 27th that it has built and open-sourced 'FunctionChat-Bench,' a dataset that can evaluate the function call performance of artificial intelligence (AI) language models.
Function call refers to connecting a language model with external tools such as application programming interfaces (APIs), so that the model can trigger actions it cannot perform on its own or obtain real-time information that was not part of its training data. It is an essential technology for building services on top of language models. For example, by using function call to connect a map API, a model can retrieve real-time road information and use it in its answer.
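The mechanism described above can be illustrated with a minimal Python sketch. It assumes the model emits its function call as a JSON message with a name and arguments. The function `get_road_traffic`, its parameters, and the canned data it returns are hypothetical stand-ins for a real map API, and the model output is hard-coded rather than produced by an actual model.

```python
import json


def get_road_traffic(origin: str, destination: str) -> dict:
    """Hypothetical stand-in for a real-time map API; returns canned data."""
    return {"origin": origin, "destination": destination, "delay_minutes": 12}


# Registry of the functions the model is allowed to call.
TOOLS = {"get_road_traffic": get_road_traffic}


def dispatch(model_output: str) -> dict:
    """Parse a function-call message and run the function it names."""
    call = json.loads(model_output)
    func = TOOLS[call["name"]]  # raises KeyError for an unknown function
    return func(**call["arguments"])


# Hard-coded example of a message a model might emit (an assumption).
model_output = json.dumps(
    {
        "name": "get_road_traffic",
        "arguments": {"origin": "Pangyo", "destination": "Gangnam"},
    }
)

result = dispatch(model_output)
print(result)
```

In a real service, the returned data would typically be passed back to the model so it can phrase the final answer for the user.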
The 'FunctionChat-Bench' dataset is designed to comprehensively evaluate function call performance in Korean conversational environments. Most existing function call evaluation datasets are English-based and were created by global companies; according to Kakao, this is the first such dataset built for Korean.
The dataset evaluates models against criteria such as ▲accuracy of function name and argument extraction ▲accuracy in conveying function call results ▲whether the model asks a follow-up question when it recognizes that required information is missing ▲whether the model detects that a request is relevant to the functions it can call.
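The first criterion, accuracy of function name and argument extraction, can be sketched as follows. This is an illustration only: the exact-match comparison, the record layout, and the names `score_call` and `accuracy` are assumptions, not FunctionChat-Bench's actual scoring method.

```python
def score_call(predicted: dict, expected: dict) -> dict:
    """Compare one predicted function call against its reference call."""
    name_ok = predicted.get("name") == expected["name"]
    # Arguments only count as correct if the function name is also correct.
    args_ok = name_ok and predicted.get("arguments") == expected["arguments"]
    return {"name_correct": name_ok, "arguments_correct": args_ok}


def accuracy(pairs: list) -> dict:
    """Aggregate exact-match accuracy over (predicted, expected) pairs."""
    results = [score_call(pred, exp) for pred, exp in pairs]
    if not results:
        return {"name_accuracy": 0.0, "argument_accuracy": 0.0}
    total = len(results)
    return {
        "name_accuracy": sum(r["name_correct"] for r in results) / total,
        "argument_accuracy": sum(r["arguments_correct"] for r in results) / total,
    }


# Two hypothetical examples: one fully correct, one with a wrong argument.
expected = {"name": "get_weather", "arguments": {"city": "Seoul"}}
pairs = [
    ({"name": "get_weather", "arguments": {"city": "Seoul"}}, expected),
    ({"name": "get_weather", "arguments": {"city": "Busan"}}, expected),
]

scores = accuracy(pairs)
print(scores)
```

Exact matching is the simplest possible choice; an evaluation of free-form Korean dialogue may well need looser comparison for arguments that can be expressed in more than one way.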
Kakao released the dataset on GitHub, the open-source community platform, to invigorate the Korean AI language model ecosystem and promote an open AI environment. Going forward, Kakao plans to keep improving its usability by increasing the dataset size and adding an English version.
Byunghak Kim, Performance Leader of Kakao Kanana Alpha, said, "The construction and open-source release of the FunctionChat-Bench dataset holds significance in contributing to the domestic AI technology ecosystem based on the Korean language. Since this is the first foundation for evaluating the performance of function call technology, we plan to work on enhancing the usability of the dataset."
© The Asia Business Daily(www.asiae.co.kr). All rights reserved.