Static Analysis of the DeepSeek Android App
Ahmad Shade edited this page 2025-05-31 01:25:31 +00:00


I carried out a static analysis of DeepSeek, a Chinese LLM chatbot, using version 1.8.0 from the Google Play Store. The objective was to identify potential security and privacy issues.

I have written about DeepSeek previously here.

Additional security and privacy concerns about DeepSeek have been raised.

See also this analysis by NowSecure of the iPhone version of DeepSeek.

The findings detailed in this report are based purely on static analysis. This means that while the code exists within the app, there is no conclusive evidence that all of it is executed in practice. Nonetheless, the presence of such code warrants scrutiny, especially given the growing concerns around data privacy, surveillance, the potential misuse of AI-driven applications, and cyber-espionage dynamics between global powers.

Key Findings

Suspicious Data Handling & Exfiltration

- Hardcoded URLs direct data to external servers, such as ByteDance "volce.com" endpoints, raising concerns about user activity tracking. NowSecure recently identified these in the iPhone app as well.
- Bespoke encryption and data obfuscation methods are present, with indications that they could be used to exfiltrate user details.
- The app contains hard-coded public keys, rather than relying on the user device's chain of trust.
- UI interaction tracking captures detailed user behavior without clear consent.
- WebView manipulation is present, which could allow the app to access private external browser data when links are opened. More details about WebView manipulation are here.
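As a rough illustration of how hardcoded endpoints can be surfaced during static analysis, the sketch below groups URLs found in decompiled source text by hostname. It assumes the sources have already been exported by a decompiler and read into memory. The function name `hardcoded_hosts`, the sample file paths, and the sample URLs are hypothetical and are not taken from the DeepSeek app.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Matches http(s) URLs; stops at quotes and whitespace, so it works on
# string literals in decompiled Java/Smali as well as on resource files.
URL_PATTERN = re.compile(r"https?://[A-Za-z0-9._~:/?#@!$&*+,;=%-]+")


def hardcoded_hosts(sources):
    """Map each hostname to the set of files embedding a URL on that host.

    `sources` is a dict of {file_path: file_text}. How the text is obtained
    (for example, from a decompiler export directory) is left to the caller.
    """
    hosts = defaultdict(set)
    for path, text in sources.items():
        for url in URL_PATTERN.findall(text):
            host = urlparse(url).hostname
            if host:
                hosts[host].add(path)
    return dict(hosts)


if __name__ == "__main__":
    # Hypothetical snippets standing in for decompiled code.
    sample = {
        "com/example/Tracker.java": 'String u = "https://log.example.com/v1/collect";',
        "com/example/Api.java": 'String a = "https://api.example.org/chat";',
    }
    for host, files in sorted(hardcoded_hosts(sample).items()):
        print(host, sorted(files))
```

Grouping by host rather than by full URL makes it easier to spot third-party domains that recur across unrelated classes. A match only shows that the string is present, not that the endpoint is ever contacted.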

Device Fingerprinting & Tracking

A substantial portion of the analyzed code appears to focus on gathering device-specific information, which can be used for tracking and fingerprinting.

- The app collects numerous unique device identifiers, including UDID, Android ID, IMEI, IMSI, and carrier details.
- Checks of system properties and installed packages, together with root detection mechanisms, suggest potential anti-tampering measures. For example, the app probes for the presence of Magisk, a tool that privacy advocates and security researchers use to root their Android devices.
- Geolocation and network profiling are present, indicating potential tracking capabilities and the enabling or disabling of fingerprinting regimes by region.
- Hardcoded device model lists suggest the application may behave differently depending on the detected hardware.
- Multiple vendor-specific services are used to extract additional device information. For example, if the app cannot identify the device through the standard Android SIM lookup (because permission was not granted), it attempts manufacturer-specific extensions to access the same information.
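The kind of triage behind findings like these can be sketched as a simple indicator search over decompiled source text. The indicator strings below are real Android API names or common root-check artifacts, but the selection and grouping are illustrative assumptions, not the list of strings actually found in the DeepSeek app. The function name `classify` is hypothetical.

```python
# Indicator strings grouped by concern. The grouping is an assumption made
# for illustration; extend it to match whatever the analysis targets.
INDICATORS = {
    "device_identifier": [
        "getImei", "getDeviceId", "getSubscriberId",
        "getSimSerialNumber", "ANDROID_ID",
    ],
    "carrier": ["getNetworkOperatorName", "getSimOperator"],
    "root_detection": ["magisk", "/system/xbin/su", "test-keys"],
    "installed_packages": ["getInstalledPackages", "getInstalledApplications"],
}


def classify(text):
    """Return {category: [matched indicators]} for one decompiled source file.

    Matching is a case-insensitive substring search, so it over-reports:
    a hit means the string is present, not that the code path ever runs.
    """
    lowered = text.lower()
    findings = {}
    for category, needles in INDICATORS.items():
        hits = [n for n in needles if n.lower() in lowered]
        if hits:
            findings[category] = hits
    return findings


if __name__ == "__main__":
    # Hypothetical decompiled snippet.
    snippet = """
        String imsi = telephonyManager.getSubscriberId();
        boolean rooted = new File("/sbin/.magisk").exists();
    """
    print(classify(snippet))
```

Substring matching is deliberately crude. It is enough to decide which classes deserve manual review, which is where conclusions such as vendor-specific fallbacks would actually be confirmed.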

Potential Malware-Like Behavior

While no definitive conclusions can be drawn without dynamic analysis, several observed behaviors align with known spyware and malware patterns:

- The app uses reflection and UI overlays, which could facilitate unauthorized screen capture or phishing attacks.
- SIM card details, serial numbers, and other device-specific information are aggregated for unknown purposes.
- The app implements country-based access restrictions and "risk-device" detection, suggesting possible surveillance mechanisms.
- The app implements calls to load Dex modules, where additional code is loaded from files with a .so extension at runtime.
- The .so files themselves turn around and make additional calls to dlopen(), which can be used to load additional .so files. This facility is not typically checked by Google Play Protect and other static analysis services.
- The .so files can be implemented in native code, such as C++. The use of native code adds a layer of complexity to the analysis process and obscures the full extent of the app's capabilities. Moreover, native code can be leveraged to more easily escalate privileges, potentially exploiting vulnerabilities within the operating system or device hardware.
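To illustrate how chained native loading can be spotted without running the app, the sketch below extracts printable strings from a native library's bytes (similar in spirit to the Unix `strings` tool) and reports references to dynamic loader symbols and to other .so names. It is a minimal sketch under stated assumptions: the helper names are hypothetical, the byte blob in the example is fabricated for demonstration, and a real analysis would parse the ELF symbol tables rather than rely on raw strings.

```python
import re

# Runs of at least four printable ASCII bytes, as the `strings` tool does.
PRINTABLE = re.compile(rb"[\x20-\x7e]{4,}")

# Dynamic loader entry points exposed by the Android C library.
LOADER_SYMBOLS = ("dlopen", "dlsym", "android_dlopen_ext")


def printable_strings(blob):
    """Return printable ASCII runs found in a bytes object."""
    return [m.decode("ascii") for m in PRINTABLE.findall(blob)]


def loader_symbols(blob):
    """Loader symbol names that appear as standalone strings in the blob."""
    found = set(printable_strings(blob))
    return sorted(sym for sym in LOADER_SYMBOLS if sym in found)


def referenced_libraries(blob):
    """Strings that look like shared library names (ending in .so)."""
    return sorted({s for s in printable_strings(blob) if s.endswith(".so")})


if __name__ == "__main__":
    # Fabricated bytes standing in for the contents of a bundled library.
    blob = b"\x00\x01dlopen\x00libfoo.so\x00\x02dlsym\x00"
    print(loader_symbols(blob))
    print(referenced_libraries(blob))
```

A library that both references `dlopen` and embeds other library names is a candidate for closer inspection. This does not show what is loaded at runtime, since paths can be built or decrypted dynamically.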

Remarks

While data collection is common in modern applications for debugging and improving user experience, aggressive fingerprinting raises substantial privacy concerns. The DeepSeek app requires users to log in with a valid email, which should already provide sufficient authentication. There is no legitimate reason for the app to aggressively collect and transmit unique device identifiers, IMEI numbers, SIM card details, and other non-resettable system properties.

The extent of tracking observed here exceeds common analytics practices, potentially enabling persistent user tracking and re-identification across devices. These behaviors, combined with obfuscation techniques and network communication with third-party tracking services, call for a higher level of scrutiny from security researchers and users alike.

The use of runtime code loading, together with the bundling of native code, suggests that the app could allow the deployment and execution of unreviewed, remotely delivered code. This is a serious potential attack vector. This report presents no evidence that remotely deployed code is actually being executed, only that the facility for it appears to be present.

Additionally, the app's approach to detecting rooted devices appears excessive for an AI chatbot. Root detection is typically justified in DRM-protected streaming services, where security and content protection are critical, or in competitive video games to prevent cheating. However, there is no clear rationale for such strict measures in an application of this nature, raising further questions about its intent.

Users and organizations considering installing DeepSeek should be aware of these potential risks. If this application is being used within a corporate or government environment, additional vetting and security controls should be enforced before allowing its deployment on managed devices.

Disclaimer: The analysis presented in this report is based on static code review and does not imply that all detected functions are actively used. Further investigation is required for definitive conclusions.