GUIrilla: A Scalable Framework for Automated Desktop UI Exploration

This repository contains the codebase for the paper "GUIrilla: A Scalable Framework for Automated Desktop UI Exploration". It implements a fully automated system for exploring macOS applications by interacting with their user interfaces and capturing the resulting UI changes. These interactions are structured into a graph-based representation, enabling the scalable collection of tasks across macOS applications.

Dataset and models

Models:

🔧 Requirements

macOS: Version 13.2 or later
Python: Version 3.11
OpenAI API Key (optional): Set env variable OPENAI_API_KEY in .env
macOS System Pass Key: Set env variable SYSTEM_PASS in .env
Sentry Client Public Key (optional): Set env variable SENTRY_CLIENT_PUBLIC_KEY_URL in .env
Mac App Store CLI (mas) (optional): Required for automatic app installation
- Install via mas GitHub page
- Or run:
```
brew install mas
```
- Then pass -m mas instead of -m /Path/to/mas in the commands below
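
Before a long crawl it can help to confirm that the variables listed above are actually set. The following is a minimal sketch, not part of the repository: the function name check_env is hypothetical, it assumes the values from .env have already been exported into the process environment, and it treats SYSTEM_PASS as required only because the list above does not mark it optional.

```python
import os

# Variable names are taken from the Requirements list above.
# Treating SYSTEM_PASS as required is an assumption based on that list.
REQUIRED_VARS = ("SYSTEM_PASS",)
OPTIONAL_VARS = ("OPENAI_API_KEY", "SENTRY_CLIENT_PUBLIC_KEY_URL")


def check_env(env=None):
    """Return (missing_required, missing_optional) lists of variable names."""
    env = os.environ if env is None else env
    # A variable counts as missing when it is absent or set to an empty string.
    missing_required = [name for name in REQUIRED_VARS if not env.get(name)]
    missing_optional = [name for name in OPTIONAL_VARS if not env.get(name)]
    return missing_required, missing_optional


if __name__ == "__main__":
    required, optional = check_env()
    for name in required:
        print(f"Missing required variable: {name}")
    for name in optional:
        print(f"Optional variable not set: {name}")
```

If OPENAI_API_KEY is reported as not set, run the crawler with -l False.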

🛡️ Accessibility Permissions

➡️ Ensure the Python interpreter has Accessibility access:

System Settings > Privacy & Security > Accessibility

Add the following:

Terminal
Python (or your IDE, e.g., PyCharm or VS Code)
Any GUI runner you use

⚙️ Installation

python3.11 -m venv parser_venv
source parser_venv/bin/activate
pip install -r requirements.txt
chmod +x ./run_me.sh ./run_me_bulk.sh

🚀 Usage

🔹 Single App Processing

./run_me.sh -a 'Calculator,com.apple.calculator,,os' -o ./output -m /Path/to/mas -h False -c False -l False -q 5 -t True

🔹 Bulk App Processing

./run_me_bulk.sh -i app_details_small.txt -o ./output -m /Path/to/mas -l False

⚙️ Configuration Options

The crawler can be controlled via several flags to modify its behavior:

🧠 1. GPT-4 Assistance (Optional)

GPT-4 is used for input generation, element sorting, and task generation, and requires an OpenAI API key.
Set -l False to disable it. The crawler then skips AI-based reasoning and falls back to deterministic inputs, element ordering, and login-page handling.

🖱️ 2. Cursor-Based Interaction

Enable cursor movements before actions using -c True. This helps visualize element interactions, such as hover states, by showing cursor positioning as separate actions in the interaction graph.

🗂️ 3. Task Collection

To collect UI interaction data without generating action descriptions, use --tasks False. This is useful for building raw interaction graphs or debugging the UI crawling logic.

🕔 4. Maximum Parsing Duration

The -q argument sets the maximum time, in minutes, that the GUIrilla crawler spends parsing a single application. The default is 120 minutes.
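
When launching the crawler from another script, the flags above can be assembled programmatically. The sketch below is illustrative only: build_crawl_command and its parameter names are hypothetical, it assumes -t is the short form of --tasks, and it passes the -h flag from the usage example through extra_args because this README does not describe it.

```python
import shlex


def build_crawl_command(app_entry, output_dir, mas_path="mas",
                        use_llm=False, cursor=False, max_minutes=120,
                        tasks=True, extra_args=()):
    """Build the argument list for ./run_me.sh from Python values."""
    return [
        "./run_me.sh",
        "-a", app_entry,
        "-o", output_dir,
        "-m", mas_path,
        "-c", str(cursor),        # cursor movements before actions
        "-l", str(use_llm),       # GPT-4 assistance
        "-q", str(max_minutes),   # time limit per application, in minutes
        "-t", str(tasks),         # assumed short form of --tasks
        *extra_args,              # undocumented flags, passed through as-is
    ]


cmd = build_crawl_command(
    "Calculator,com.apple.calculator,,os",
    "./output",
    max_minutes=5,
    extra_args=("-h", "False"),
)
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) from the repository root.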

📁 Input Format

For bulk runs, provide an app_details.txt file formatted like:

Calculator,com.apple.calculator,,os
Stocks,com.apple.Stocks,,os
...
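
The sample lines above have four comma-separated fields. A minimal sketch for validating such a file before a bulk run follows. The field names are assumptions, since this README does not name them: the first two look like an app name and a bundle identifier, the third is empty in the samples, and the fourth is "os".

```python
from typing import List, NamedTuple


class AppEntry(NamedTuple):
    name: str         # assumed: display name of the app
    bundle_id: str    # assumed: bundle identifier
    third_field: str  # empty in the samples; meaning not documented here
    source: str       # "os" in the samples


def parse_app_details(text: str) -> List[AppEntry]:
    """Parse app list text into entries, skipping blank lines."""
    entries = []
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        parts = line.split(",")
        if len(parts) != 4:
            raise ValueError(
                f"line {line_number}: expected 4 fields, got {len(parts)}"
            )
        entries.append(AppEntry(*(part.strip() for part in parts)))
    return entries


sample = "Calculator,com.apple.calculator,,os\nStocks,com.apple.Stocks,,os\n"
for entry in parse_app_details(sample):
    print(entry.name, entry.bundle_id)
```

This simple split does not handle app names that themselves contain commas.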

📤 Output

Outputs include segmented UI graphs, screenshots, and logs, stored in the specified output directory (-o flag).

🛠️ Task postprocessing

Run the following command to postprocess the tasks with the GPT-4-based Task Agent, which adds a processed_task key to the task graph:

python src/generate_task.py -a Calculator,com.apple.calculator,,os
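
To postprocess several apps in one go, the same command can be generated per entry. This is a hedged sketch rather than a repository feature: postprocess_commands is a hypothetical helper, and it assumes generate_task.py accepts the same entry string that the crawler takes via -a, as in the command above.

```python
import sys


def postprocess_commands(app_lines):
    """Yield one generate_task.py command per non-empty app entry line."""
    for line in app_lines:
        entry = line.strip()
        if entry:
            # sys.executable keeps the call inside the active virtual environment.
            yield [sys.executable, "src/generate_task.py", "-a", entry]


apps = [
    "Calculator,com.apple.calculator,,os",
    "Stocks,com.apple.Stocks,,os",
]
for cmd in postprocess_commands(apps):
    print(" ".join(cmd))
```

Each command can be executed with subprocess.run(cmd, check=True) from the repository root.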

macapptree

As part of the same publication, the macapptree library provides complementary functionality to this project. You can find it at MacPaw/macapptree.

License

This project is licensed under the MIT License.

Citation

@article{garkot2025guirilla,
  title={GUIrilla: A Scalable Framework for Automated Desktop UI Exploration},
  author={Garkot, Sofiya and Shamrai, Maksym and Synytsia, Ivan and Hirna, Mariya},
  journal={arXiv preprint arXiv:2510.16051},
  year={2025},
  url={https://arxiv.org/abs/2510.16051}
}
