GPT-4 and Gemini Ultra Rank Low on AI Transparency

by Kim Jinyeong

Published 22 May 2024 09:38 (KST)

Stanford University HAI Report
Transparency Survey on 14 Major AI Models
Aiming to Encourage Disclosure of AI Development Processes

The latest artificial intelligence (AI) models from OpenAI and Google have been found to be less transparent than competitors' models.

The Stanford Institute for Human-Centered Artificial Intelligence (HAI) in the United States released a report on the 21st (local time) evaluating 14 major AI models on its Foundation Model Transparency Index (FMTI). The report is an update published seven months after the first edition in October last year.

HAI's transparency index aims to encourage socially influential AI developers to disclose their development processes more transparently. It is scored out of 100 points based on criteria such as parameters, training methods, data disclosure, and explainability.

First place went to ‘StarCoder,’ jointly developed by Hugging Face and ServiceNow. ‘Luminous’ and ‘Jurassic-2,’ developed by startups from Germany and Israel respectively, tied for second place with 75 points. Microsoft’s Phi-2 (62 points) ranked 5th, and Meta’s Llama 2 (60 points) ranked 6th.

The 14 models surveyed also included closely watched recent releases such as OpenAI’s GPT-4, Google’s Gemini 1.0 Ultra, Meta’s Llama 2, Anthropic’s Claude 3, and Mistral 7B. However, OpenAI’s GPT-4 scored a disappointing 49 points, placing 11th. Google’s Gemini 1.0 Ultra followed with 47 points.

The HAI research team said, "We are somewhat disappointed by the lack of progress in data or transparency, which are the driving forces behind large language models (LLMs)," adding, "This means that almost all developers do not disclose the information they know." However, the team also noted, "Compared to last October, some positive improvements have been confirmed," and said, "We hope that future progress in transparency will lead to better social outcomes such as enhanced accountability, increased innovation, and improved policies." The average score of the models surveyed rose from 37 points in October last year to 58 points.

Meanwhile, OpenAI is embroiled in controversy over allegations that a voice in its recently released ‘GPT-4o’ model mimics that of American actress Scarlett Johansson. OpenAI denies the allegations and has temporarily suspended use of the voice in question, ‘Sky.’ Johansson herself expressed "shock and anger" and hinted at legal action.

This content was produced with the assistance of AI translation services.