Kakao Unveils Multimodal and MoE Models, Showcasing Its AI Technology Capabilities
On July 24, Kakao announced that it has demonstrated its artificial intelligence (AI) technology capabilities by releasing its lightweight multimodal language model and Mixture of Experts (MoE) model as open source.
That day, Kakao released two models as open source on Hugging Face: ▲ the lightweight multimodal language model "Kanana-1.5-v-3b," which understands image information and follows instructions, and ▲ the MoE language model "Kanana-1.5-15.7b-a3b."
Kakao, which participates in the government's "Independent AI Foundation Model Project," aims to enhance nationwide AI accessibility and strengthen the country's AI competitiveness by leveraging its own model development capabilities and extensive experience operating large-scale services such as KakaoTalk.
Kanana-1.5-v-3b is a multimodal language model that processes image information as well as text. It is based on the Kanana 1.5 model released at the end of May, which Kakao developed from scratch using its own technology at every stage of model development.
As a multimodal language model, Kanana-1.5-v-3b can perform ▲ image and character recognition, ▲ creation of fairy tales and poems, ▲ recognition of Korean cultural heritage and tourist attractions, ▲ chart comprehension, and ▲ solving math problems. For example, when given a photo of a location and asked, "Please briefly explain the place where this photo was taken," it responds, "This photo features Cheonggyecheon in Seoul as its background."
An example of the place recognition ability of the lightweight multimodal language model released by Kakao. Provided by Kakao
The MoE model operates by activating only certain expert models optimized for specific tasks, which allows for efficient use of computing resources and cost savings. Thanks to these advantages, the MoE approach has become a global trend in AI model development. The "Kanana-1.5-15.7b-a3b" model, which applies the MoE architecture, activates only about 3 billion parameters out of a total of 15.7 billion parameters during inference.
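The sparse activation described above can be illustrated with a toy sketch of top-k expert routing. This is not Kakao's implementation: the expert functions, gate scores, and helper names (`top_k_route`, `moe_forward`) are hypothetical and chosen only to show the general idea that a gate scores every expert but only the top few actually run. The last line works out the share of parameters active during inference from the figures quoted in the article.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(gate_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]

def moe_forward(x, experts, gate_scores, k):
    """Run only the selected experts and blend their outputs by gate weight."""
    routes = top_k_route(gate_scores, k)
    output = sum(weight * experts[i](x) for i, weight in routes)
    used = [i for i, _ in routes]
    return output, used

# Hypothetical toy experts: each one just scales its input differently.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]

# Hypothetical gate scores for one input; a real model computes these
# with a learned router layer.
gate_scores = [0.1, 2.0, 0.3, 1.5]

output, used = moe_forward(10.0, experts, gate_scores, k=2)
print("experts used:", used)  # only 2 of the 4 experts were evaluated

# Figures from the article: about 3 billion of 15.7 billion parameters
# are active during inference.
active_fraction = 3.0 / 15.7
print(f"active share of parameters: {active_fraction:.1%}")
```

In this toy setup, the two experts that were not selected are never called, which is where the compute savings come from. By the article's figures, roughly one fifth of the model's parameters take part in any given inference step.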
Kakao's MoE model can provide practical support to companies, researchers, and developers seeking to build high-performance AI infrastructure at low cost. Because it uses only a limited share of its parameters during inference, it is well suited to low-cost, high-efficiency services and can be applied to a wide range of uses.
Kakao stated, "By releasing our lightweight multimodal language model and MoE model as open source, we aim to set a new standard in the AI model ecosystem and lay the foundation for more researchers and developers to freely utilize efficient and powerful AI technologies."
© The Asia Business Daily(www.asiae.co.kr). All rights reserved.