Kakao Unveils Multimodal and MoE Models, Showcasing Its AI Technology Capabilities

by Kim Bokyung

Published 24 Jul. 2025 09:33 (KST)

On July 24, Kakao announced that it has released its lightweight multimodal language model and its Mixture of Experts (MoE) model as open source, demonstrating its artificial intelligence (AI) technology capabilities.

That day, Kakao released two models as open source via Hugging Face: ▲ the lightweight multimodal language model "Kanana-1.5-v-3b," which can understand image information and follow instructions, and ▲ the MoE language model "Kanana-1.5-15.7b-a3b."

Kakao, which participates in the government's "Independent AI Foundation Model Project," aims to enhance nationwide AI accessibility and strengthen the country's AI competitiveness by leveraging its own model development capabilities and extensive experience operating large-scale services such as KakaoTalk.

Kanana-1.5-v-3b is a multimodal language model capable of processing not only text but also image information, and is based on the Kanana 1.5 model, which was released at the end of May. Kanana 1.5 was developed from scratch, with Kakao's proprietary technology used at every stage of model development.

Leveraging the strengths of a multimodal language model, Kanana-1.5-v-3b can perform ▲ image and character recognition, ▲ creation of fairy tales and poems, ▲ recognition of domestic cultural heritage and tourist attractions, ▲ chart comprehension, and ▲ solving math problems. For example, if given a photo of a location and asked, "Please briefly explain the place where this photo was taken," it would respond, "This photo features Cheonggyecheon in Seoul as its background."

A case demonstrating the place recognition ability of the lightweight multimodal language model released by Kakao. Provided by Kakao

An MoE model operates by activating only some of its expert models, those best suited to a given task, rather than the entire network, which allows for efficient use of computing resources and cost savings. Thanks to these advantages, the MoE approach has become a global trend in AI model development. The "Kanana-1.5-15.7b-a3b" model, which applies the MoE architecture, activates only about 3 billion of its 15.7 billion total parameters during inference.
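The idea of activating only some experts can be illustrated with a minimal sketch. It assumes a top-k router, a common MoE design; the article does not describe how Kanana-1.5-15.7b-a3b routes inputs, so the expert count, the value of k, the router scores, and all function names below are hypothetical. The last line works out the active share of parameters from the figures quoted above.

```python
import math


def softmax(scores):
    """Convert raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Select the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


def moe_layer(x, experts, router_scores, k):
    """Run only the selected experts and mix their outputs by weight."""
    weights = route_top_k(router_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Hypothetical toy setup: 8 experts, each a simple scaling function.
experts = [lambda x, scale=scale: scale * x for scale in range(1, 9)]
# Hypothetical router scores for one input (a real router computes these).
router_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

weights = route_top_k(router_scores, k=2)
output = moe_layer(1.0, experts, router_scores, k=2)
print("experts used:", sorted(weights))  # only 2 of the 8 experts run
print("layer output:", output)

# Figures from the article: about 3 billion active of 15.7 billion total.
active_fraction = 3.0 / 15.7
print(f"active share of parameters: {active_fraction:.1%}")
```

In this toy setup six of the eight experts are never evaluated, which is where the compute savings come from. By the article's own figures, roughly one fifth of the model's parameters are active for any given inference step.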

Kakao's MoE model can provide practical support to companies and to researchers and developers seeking to build high-performance AI infrastructure at low cost. In particular, because it uses only a limited number of parameters during inference, it is well suited to implementing low-cost, high-efficiency services, which makes it highly versatile.

Kakao stated, "By releasing our lightweight multimodal language model and MoE model as open source, we aim to set a new standard in the AI model ecosystem and lay the foundation for more researchers and developers to freely utilize efficient and powerful AI technologies."

This content was produced with the assistance of AI translation services.
