Kakao Releases AI Language Model Performance Evaluation Dataset

by Choi Yuri

Published 27 Sep. 2024 10:33 (KST)

Updated 27 Sep. 2024 14:14 (KST)

Kakao announced on the 27th that it has built and open-sourced 'FunctionChat-Bench,' a dataset that can evaluate the function call performance of artificial intelligence (AI) language models.

Function call refers to connecting a language model with external tools such as application programming interfaces (APIs) so that it can carry out actions it cannot perform on its own or obtain real-time information that was not part of its training data. It is an essential technology for implementing services based on language models. For example, by using function call to connect a specific API such as a maps service, the model can retrieve real-time road information to answer a question.
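The maps example can be sketched in a few lines of Python. This is a minimal illustration, not Kakao's implementation: the `get_road_info` tool, its parameters, the place names, and the JSON shape of the model's output are all assumptions made for this sketch, and the tool returns canned data instead of calling a real maps API.

```python
import json

# Hypothetical tool: a stand-in for a real maps API (no network call is made).
def get_road_info(origin: str, destination: str) -> dict:
    return {
        "origin": origin,
        "destination": destination,
        "status": "congested",
        "eta_minutes": 42,
    }

# Registry of the functions the model is allowed to call, keyed by name.
TOOLS = {"get_road_info": get_road_info}

def dispatch(model_output: str) -> dict:
    """Parse a function-call message emitted by a model and run the named tool."""
    call = json.loads(model_output)
    func = TOOLS[call["name"]]          # look up the requested function
    return func(**call["arguments"])    # pass the extracted arguments

# Assumed shape of what a model might emit after reading the user's question
# "How is traffic from Pangyo to Gangnam right now?"
model_output = json.dumps(
    {"name": "get_road_info", "arguments": {"origin": "Pangyo", "destination": "Gangnam"}}
)

result = dispatch(model_output)
print(result)
```

In a real service the tool result would be passed back to the model, which would then phrase the final answer for the user.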

Kakao built 'FunctionChat-Bench' to comprehensively evaluate function call performance in Korean conversational environments. Most existing function call evaluation datasets are in English and were created by global companies, and Kakao is the first to build such a dataset in Korean.

The dataset's evaluation criteria include ▲accuracy of function name and argument extraction ▲accuracy in conveying function call results ▲whether the model asks follow-up questions when it recognizes that information is missing ▲whether it detects how relevant a request is to the callable functions.
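To make the first criterion concrete, the sketch below scores function name and argument extraction by exact match. It is an assumption for illustration only: the article does not describe how FunctionChat-Bench actually scores responses, and the `score_call` and `accuracy` helpers, the call format, and the sample data are all hypothetical.

```python
def score_call(predicted: dict, expected: dict) -> dict:
    """Compare one predicted function call against the expected one."""
    name_ok = predicted.get("name") == expected["name"]
    # Arguments only count as correct when the function name is also correct.
    args_ok = name_ok and predicted.get("arguments") == expected["arguments"]
    return {"name_correct": name_ok, "arguments_correct": args_ok}

def accuracy(pairs: list) -> dict:
    """Average name and argument correctness over (predicted, expected) pairs."""
    scores = [score_call(pred, exp) for pred, exp in pairs]
    n = len(scores)
    return {
        "name_accuracy": sum(s["name_correct"] for s in scores) / n,
        "argument_accuracy": sum(s["arguments_correct"] for s in scores) / n,
    }

# Hypothetical sample: one fully correct call, one with a wrong argument value.
expected_call = {"name": "get_road_info", "arguments": {"origin": "Pangyo", "destination": "Gangnam"}}
pairs = [
    ({"name": "get_road_info", "arguments": {"origin": "Pangyo", "destination": "Gangnam"}}, expected_call),
    ({"name": "get_road_info", "arguments": {"origin": "Pangyo", "destination": "Jamsil"}}, expected_call),
]

report = accuracy(pairs)
print(report)
```

Exact match is the simplest possible check. The other criteria, such as asking a follow-up question when information is missing, concern free-form replies and would need a different kind of judgment.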

Kakao released the dataset on GitHub, the open-source community, to invigorate the Korean AI language model ecosystem and promote an open AI environment. Going forward, Kakao plans to keep improving its usability by increasing the dataset's size and adding an English version.

Byunghak Kim, Performance Leader of Kakao Kanana Alpha, said, "Building and open-sourcing the FunctionChat-Bench dataset is significant as a contribution to the domestic AI technology ecosystem based on the Korean language. Since this is the first foundation for evaluating the performance of function call technology, we plan to work on enhancing the usability of the dataset."

This content was produced with the assistance of AI translation services.
