A command-line interface for automating Android devices via ADB using self-hosted UI-TARS or OpenAI-compatible vision-language models.
UI-TARS CLI enables automated control of Android devices through natural language commands. It leverages vision-language models to interpret screenshots and execute actions via Android Debug Bridge (ADB), providing an intelligent automation layer for mobile testing, scripting, and task automation.
Prerequisites:

- Node.js 18 or higher
- ADB (Android Debug Bridge) installed and available in PATH
- Android Device with USB debugging enabled
- Vision-Language Model API - OpenAI-compatible endpoint (GPT-4 Vision, Claude, UI-TARS, etc.)
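Before installing, you can verify the toolchain with standard version checks:

```bash
node --version   # should print v18 or newer
adb version      # confirms ADB is on PATH
adb devices      # your device should appear once USB debugging is enabled
```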
Install from source:

```bash
git clone https://github.com/Djanghao/UI-TARS-CLI-ADB
cd UI-TARS-CLI-ADB
npm install
npm run build
```

Launch the CLI with interactive prompts:

```bash
npm start
```

On first run, you will be prompted to configure:
- Base URL: API endpoint (e.g., `http://localhost:8000/v1`)
- API Key: Authentication key (use any non-empty string for servers without auth)
- Model: Model identifier (e.g., `ui-tars`, `gpt-4-vision-preview`)
Configuration is saved to `ui-tars-cli.config.json`:
```json
{
  "baseURL": "http://localhost:8000/v1",
  "apiKey": "dummy",
  "model": "ui-tars"
}
```

Execute tasks directly with command-line arguments:
```bash
npm start -- \
  --baseURL "http://localhost:8000/v1" \
  --apiKey "dummy" \
  --model "ui-tars" \
  --query "Go to home and open Settings"
```

The CLI automatically creates and reads `ui-tars-cli.config.json` in the project directory. You can manually edit this file or let the interactive mode generate it.
Available command-line options:

| Option | Description |
|---|---|
| `--query <text>` | Natural language task instruction |
| `--baseURL <url>` | OpenAI-compatible API base URL (overrides config) |
| `--apiKey <key>` | API authentication key (overrides config) |
| `--model <name>` | Model identifier (overrides config) |
| `--device <id>` | Target device ID (from `adb devices`) |
The CLI automatically detects connected Android devices. If multiple devices are available, you will be prompted to select one, or you can specify a device directly:
```bash
npm start -- --device <device-id> --query "Take a screenshot"
```

List connected devices:

```bash
adb devices
```
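Typical `adb devices` output looks like this (the device IDs are examples):

```
List of devices attached
emulator-5554          device
0123456789ABCDEF       device
```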
Running a self-hosted UI-TARS model through vLLM requires:

- vLLM >= 0.6.1 (recommended: 0.6.6)
- CUDA-compatible GPU with sufficient VRAM
- UI-TARS model weights
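If you are unsure whether the GPU has enough free VRAM, `nvidia-smi` shows per-GPU memory usage:

```bash
# List GPUs and their current memory usage before serving a model.
nvidia-smi
```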
Install vLLM and an up-to-date `transformers`:

```bash
pip install -U transformers
VLLM_VERSION=0.6.6
CUDA_VERSION=cu124
pip install vllm==${VLLM_VERSION} --extra-index-url https://download.pytorch.org/whl/${CUDA_VERSION}
```
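To confirm the expected version was installed:

```bash
python -c "import vllm; print(vllm.__version__)"   # should print 0.6.6
```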
Available UI-TARS checkpoints:

- `ui-tars-2B-SFT` - Smallest, fastest
- `ui-tars-7B-SFT` - Balanced performance
- `ui-tars-7B-DPO` - Recommended for the 7B tier
- `ui-tars-72B-SFT` - High accuracy
- `ui-tars-72B-DPO` - Best performance (recommended)
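One way to fetch weights ahead of time is the Hugging Face CLI; the repo ID below is a placeholder, so substitute wherever the checkpoint you chose is hosted:

```bash
huggingface-cli download <hf-repo> --local-dir ./ui-tars-weights
```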
```bash
# For 7B models (tensor parallel = 1)
python -m vllm.entrypoints.openai.api_server \
  --served-model-name ui-tars \
  --model <path-or-hf-repo> \
  --limit-mm-per-prompt image=5 \
  --tp 1

# For 72B models (tensor parallel = 4)
python -m vllm.entrypoints.openai.api_server \
  --served-model-name ui-tars \
  --model <path-or-hf-repo> \
  --limit-mm-per-prompt image=5 \
  --tp 4
```

Default endpoint: `http://localhost:8000`
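Once the server is running, you can sanity-check the endpoint and the served model name:

```bash
curl http://localhost:8000/v1/models
```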
Then point the CLI at the local server:

```bash
npm start -- \
  --baseURL "http://localhost:8000/v1" \
  --apiKey "dummy" \
  --model "ui-tars" \
  --query "Open Weibo, go to Trending, scroll to read 10 items"
```

Notes:

- The CLI captures screenshots and sends them to the vision-language model using OpenAI's `image_url` data URL format
- `--limit-mm-per-prompt image=5` in vLLM matches the CLI's internal limit of 5 images per prompt
- For servers without authentication, any non-empty string can be used as `--apiKey` (the OpenAI SDK requires one)
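For reference, a request in that `image_url` format looks roughly like the sketch below. This is a hand-written illustration of the OpenAI chat-completions payload, not captured CLI traffic; `<BASE64_PNG>` stands in for a base64-encoded screenshot (e.g., from `adb exec-out screencap -p | base64 -w0`).

```bash
# Sketch of the request shape the CLI sends to an OpenAI-compatible server.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ui-tars",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Go to home and open Settings"},
        {"type": "image_url",
         "image_url": {"url": "data:image/png;base64,<BASE64_PNG>"}}
      ]
    }]
  }'
```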
Licensed under Apache-2.0.