"Understanding Depth in Photos"... KENTECH Develops Prompt-Based Reasoning Technology
Korea Institute of Energy Technology (KENTECH, Acting President Park Jin-ho) announced on October 1 that Professor Lee Seokju's research team has developed a lightweight prompt learning technology that enables three-dimensional spatial reasoning in Vision Language Models (VLMs). The research team applied this technology to monocular camera-based depth estimation methods, significantly improving artificial intelligence's spatial understanding capabilities.
CLIP, a multimodal Vision Language Model, is an artificial intelligence system that understands images and text together, and it is widely used where computer vision and natural language processing converge. For example, given the word "cat," it can pick out cats from among countless photos. However, it has been limited in geometric spatial understanding tasks such as distance and depth perception.
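The "cat" example can be illustrated with a minimal sketch of text-to-image matching by cosine similarity. The three-dimensional vectors and file names below are hand-made, hypothetical stand-ins for encoder outputs; they are not produced by CLIP, and a real model would use its own encoders and far larger embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings (assumption: invented for illustration only).
text_embedding = [0.9, 0.1, 0.0]          # stands in for the word "cat"
image_embeddings = {
    "photo_a.jpg": [0.8, 0.2, 0.1],
    "photo_b.jpg": [0.1, 0.9, 0.2],
    "photo_c.jpg": [0.0, 0.2, 0.9],
}

def best_match(query, images):
    """Return the image name whose embedding is most similar to the query."""
    return max(images, key=lambda name: cosine(query, images[name]))

match = best_match(text_embedding, image_embeddings)
print(match)  # photo_a.jpg
```

Matching of this kind ranks images by semantic similarity to a word, which says nothing about how far away an object is. That is the gap the article describes.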
To overcome this, the research team introduced a "non-human language prompt": a new form of expression optimized for machine comprehension rather than written in human language. With it, the model can precisely identify the depth of objects using only photos or videos captured by a camera.
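The general idea of learning a prompt while leaving the model untouched can be sketched in a toy setting. This is not the research team's method: the 2x2 matrix `W` is a hypothetical stand-in for a frozen pretrained encoder, `target` is an invented feature, and only the prompt vector `p` is updated by gradient descent.

```python
# Frozen "encoder": a fixed linear map (assumption: stands in for a
# pretrained model whose weights are never updated).
W = [[1.0, 0.5],
     [0.0, 1.0]]

def encode(p):
    """Apply the frozen encoder to a prompt vector."""
    return [sum(W[i][j] * p[j] for j in range(2)) for i in range(2)]

def loss(p, target):
    """Squared error between the encoded prompt and the target feature."""
    return sum((o - t) ** 2 for o, t in zip(encode(p), target))

def train_prompt(target, steps=500, lr=0.1):
    p = [0.0, 0.0]  # learnable continuous prompt; the only trained values
    for _ in range(steps):
        err = [o - t for o, t in zip(encode(p), target)]
        # Gradient of the squared error with respect to p is 2 * W^T * err.
        grad = [2 * sum(W[i][j] * err[i] for i in range(2)) for j in range(2)]
        p = [pj - lr * gj for pj, gj in zip(p, grad)]
    return p

target = [1.0, 2.0]  # hypothetical feature the prompt should produce
prompt = train_prompt(target)
print(prompt, loss(prompt, target))
```

The learned values in `prompt` are plain numbers with no meaning as words, which is the sense in which such a prompt is "non-human language." Because `W` never changes, the number of trained values stays small.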
Experimental results showed that the new technology achieved performance comparable to existing large-scale models (with over 300 million parameters) while training only about 1.1 million parameters. The number of parameters that must be trained was cut to roughly one three-hundredth, yet the model still learned effectively without any loss in performance. Professor Lee Seokju emphasized, "This will establish itself as a core fundamental technology applicable to a variety of spatial computing fields where lightweight solutions are essential, such as autonomous driving, robot vision, and augmented reality."
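The size of the reduction can be checked with simple arithmetic. The sketch below uses only the two rounded figures quoted above. Since the large models are described as having "over" 300 million parameters, the true factor may be somewhat larger than the value computed here.

```python
# Rounded figures quoted in the article.
large_model_params = 300_000_000  # "over 300 million parameters"
prompt_params = 1_100_000         # "about 1.1 million" trained parameters

ratio = large_model_params / prompt_params
share_pct = 100 * prompt_params / large_model_params

print(f"reduction factor: about {ratio:.0f}x")           # about 273x
print(f"trained share: about {share_pct:.2f}% of total")  # about 0.37%
```

A factor of about 273 is consistent with the article's "roughly one three-hundredth."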
This research was supported by the Ministry of Trade, Industry and Energy, the National Research Foundation of Korea, and the Korea Astronomy and Space Science Institute. It was published online on September 26 in Pattern Recognition, a world-renowned journal specializing in computer vision and machine learning.
© The Asia Business Daily(www.asiae.co.kr). All rights reserved.