📲Mind the Third Eye! Benchmarking Privacy Awareness in MLLM-powered Smartphone Agents

If our project helps you, please give us a star ⭐ on GitHub to support us. 🥸🥸

🔥 News

2025-08-28 🎉 🌟 We are happy to release the SAPA-Bench. You can find the SAPA-Bench from .

TODO

Release the SAPA-Bench.
Release the latest evaluation code.
···

📖SAPA-Bench Overview

Smartphones offer great convenience but also collect vast amounts of personal information.
With the rise of MLLM-powered smartphone agents, automation performance has improved significantly—yet at the cost of extensive access to sensitive user data.

To systematically evaluate this issue, we introduce the first large-scale benchmark (7,138 scenarios) for privacy awareness in smartphone agents. Each scenario is annotated with:

🔑 Privacy Type (e.g., Account Credentials)
⚠️ Sensitivity Level
📍 Location

We benchmarked seven mainstream smartphone agents and found:

Overall privacy awareness (RA) remains below 60%, even with explicit hints.
Closed-source agents generally perform better; Gemini 2.0-flash achieved the highest RA (67%).
Privacy detection strongly correlates with sensitivity level—higher sensitivity makes scenarios more identifiable.

👉 Our results highlight the urgent need to rethink the utility–privacy tradeoff in the design of smartphone agents.

🛠️Evaluation

🌟 Star History

📑 Citation

If you find SAPA-Bench useful for your research and applications, please cite using this BibTeX:

@article{lin2025sapa,
  title   = {Mind the Third Eye! Benchmarking Privacy Awareness in MLLM-powered Smartphone Agents},
  author  = {Lin, Zhixin and Li, Jungang and Pan, Shidong and Shi, Yibo and Yao, Yue and Xu, Dongliang},
  journal = {arXiv preprint arXiv:2508.19493},
  year    = {2025}
}

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
scripts		scripts
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Uh oh!

Repository files navigation

📲Mind the Third Eye! Benchmarking Privacy Awareness in MLLM-powered Smartphone Agents

🔥 News

TODO

📖SAPA-Bench Overview

🛠️Evaluation

🌟 Star History

📑 Citation

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

Uh oh!

Uh oh!

Zhixin-L/SAPA-Bench

Folders and files

Latest commit

History

Repository files navigation

📲Mind the Third Eye! Benchmarking Privacy Awareness in MLLM-powered Smartphone Agents

🔥 News

TODO

📖SAPA-Bench Overview

🛠️Evaluation

🌟 Star History

📑 Citation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages