
'DeepSeek' Took All My Personal Information

What sets China's generative AI service DeepSeek apart from other AI services is that it collects user information wholesale, without a 'tokenization' step that isolates only the necessary information.
According to DeepSeek's privacy policy as of the 18th, the service can collect, as is, user input such as 'text, chat records, uploaded files' as well as automatically collected information including 'device and network information, location information, and payment information.'
It also collects account information such as the name, date of birth, and email address entered at the time of registration.

Collecting All User Information Without 'Tokenization'

In particular, there was controversy during the initial launch of DeepSeek over the collection of keyboard input patterns, but this was excluded from the collection targets as of the 14th. Keyboard input patterns are classified as sensitive information because they can not only identify individual users but also be used to infer important information entered by users, such as passwords.


This is different from the data collection practices of other AI companies. OpenAI's ChatGPT also collects user-inputted text, uploaded files, and device identifiers. However, it goes through a 'tokenization' process to ensure that the source of the information cannot be identified. 'Tokenization' is a method that allows only the necessary information to be used safely. For example, it is similar to how only the last four digits of a card number are shown on a convenience store receipt instead of the entire number. Through this process, user-identifying information is removed, and only the inputted data remains to be used for service improvement.
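The receipt analogy can be made concrete with a minimal sketch. It is an illustration only, not a description of how OpenAI or any other company actually processes data: the function names, the regular expressions, and the secret key below are all hypothetical. It masks a card number down to its last four digits and swaps an email address for a keyed token that cannot be read back as the original address.

```python
import hashlib
import hmac
import re

# Hypothetical secret held only by the service operator (illustrative value).
SECRET_KEY = b"example-secret-key"

# Simplified patterns for illustration; real systems need far broader coverage.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b")
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")


def mask_card(match: re.Match) -> str:
    """Keep only the last four digits, as on a store receipt."""
    digits = re.sub(r"\D", "", match.group())
    return "****-****-****-" + digits[-4:]


def tokenize_email(match: re.Match) -> str:
    """Replace an email address with a keyed token instead of the address."""
    address = match.group().lower().encode()
    digest = hmac.new(SECRET_KEY, address, hashlib.sha256).hexdigest()
    return "<email:" + digest[:12] + ">"


def scrub(text: str) -> str:
    """Mask card numbers and tokenize email addresses in user input."""
    text = CARD_PATTERN.sub(mask_card, text)
    return EMAIL_PATTERN.sub(tokenize_email, text)


if __name__ == "__main__":
    print(scrub("Card 1234-5678-9012-3456, mail kim@example.com"))
```

In this sketch the same address always maps to the same token, so records can still be grouped for service improvement without the address itself being stored.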



An AI industry official, pointing to DeepSeek's privacy policy, said, "You can assume that all digital information you enter is transferred," and added, "In advanced countries such as the United States, privacy protection laws clearly stipulate how specific individuals' data may be used, making such collection and use impossible."


The lack of an 'opt-out' setting that lets users refuse the collection of their information for training generative AI has also been cited as a problem.
AI services such as ChatGPT, Gemini (Google), and CLOVA X (Naver) do not use conversation content or input data for service improvement if the user refuses.
The opt-out option can be easily set in each service's settings. However, DeepSeek does not provide such service settings.
This means that all information entered by users while using the service is fully utilized as data for service improvement.

No Specified Data Retention Period, No Right to Refuse

The period for which the collected data is stored is also unclear. DeepSeek's privacy policy only states that information will be retained 'for the period necessary to provide the service.' The deletion deadline for collected data is also ambiguous, described only as 'when it is no longer needed.' In contrast, ChatGPT and CLOVA X set the retention period for user-inputted data to a maximum of 30 days. Gemini allows users to set the data retention period themselves, and CLOVA X provides a function for users to directly delete their inputted data.


There are concerns that DeepSeek's personal information and data are stored on servers within China.
DeepSeek's terms alone permit the transfer of personal information to China. The terms state that 'servers are located in the People's Republic of China, and user personal data may be processed and stored on servers within China.'
The problem is that under current Chinese law, a company must hand over information stored on servers in China if a government agency requests it. This makes concerns about information leakage difficult to dispel.

1.21 Million Users' Personal Information Exposed... Stored on Servers in China

The Personal Information Protection Commission has already confirmed that DeepSeek user information has been transferred to ByteDance, the parent company of the Chinese social networking service TikTok. Jang Dongin, a lead professor at the Korea Advanced Institute of Science and Technology (KAIST) AI Graduate School, also commented, "DeepSeek has notified users that it will use the collected personal information at its discretion, so caution is needed."


Although the Personal Information Protection Commission has suspended new app downloads for DeepSeek, controversy is expected over the effectiveness of this measure. Users who have already downloaded the DeepSeek app can continue to use it, and the DeepSeek web page is not subject to this measure. According to the app analytics service WiseApp Retail, the number of DeepSeek app users in the fourth week of last month was 1.21 million, the second highest among generative AI apps after ChatGPT (4.93 million) during the same period. A representative from the Personal Information Protection Commission stated, "For users who have already downloaded and are using the DeepSeek app, there are no special measures that the service provider can take, so each user must be cautious when entering personal information," adding, "Due to the nature of the internet, it is not easy to block access."
