UI-TARS Desktop

UI-TARS Desktop is a GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.

📑 Paper | 🤗 Hugging Face Models | 🖥️ Desktop Application | 👓 Midscene (use in browser)

⚠️ Important Announcement: GGUF Model Performance

The GGUF model has undergone quantization, but unfortunately, its performance cannot be guaranteed. As a result, we have decided to downgrade it.

💡 Alternative Solution: You can use Cloud Deployment or, if you have sufficient GPU resources, Local Deployment [vLLM] instead.

We appreciate your understanding and patience as we work to ensure the best possible experience.

Showcases

Instruction | Video
Get the current weather in SF using the web browser | new_mac_action_weather.mp4
Send a tweet with the content "hello world" | new_send_twitter_windows.mp4

Features

  • 🤖 Natural language control powered by Vision-Language Model
  • 🖥️ Screenshot and visual recognition support
  • 🎯 Precise mouse and keyboard control
  • 💻 Cross-platform support (Windows/macOS)
  • 🔄 Real-time feedback and status display

Quick Start

Download

You can download the latest release version of UI-TARS Desktop from our releases page.

Install

macOS

  1. Drag the UI TARS application into the Applications folder.

  2. Grant UI TARS the required permissions in macOS:
  • System Settings -> Privacy & Security -> Accessibility
  • System Settings -> Privacy & Security -> Screen Recording

  3. Open the UI TARS application; you should see the following interface:

Windows

Simply run the application, and you will see the following interface:

Deployment

Cloud Deployment

We recommend using HuggingFace Inference Endpoints for fast deployment. We provide two documents for reference:

English version: GUI Model Deployment Guide

Chinese version: GUI模型部署教程
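
Once an endpoint is deployed, you can sanity-check connectivity from Python. This is a minimal sketch, assuming the endpoint exposes an OpenAI-compatible API; the endpoint URL and token below are placeholders, not real values:

# Minimal connectivity check, assuming an OpenAI-compatible endpoint.
# The base_url and api_key are placeholders; replace them with your own.
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-endpoint>.endpoints.huggingface.cloud/v1",  # placeholder
    api_key="<your-hf-token>",  # placeholder
)
print(client.models.list())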

Local Deployment [vLLM]

We recommend using vLLM for fast deployment and inference. You need to use vllm>=0.6.1.

pip install -U transformers
VLLM_VERSION=0.6.6
CUDA_VERSION=cu124
pip install vllm==${VLLM_VERSION} --extra-index-url https://download.pytorch.org/whl/${CUDA_VERSION}

Download the Model

We provide three model sizes on Hugging Face: 2B, 7B, and 72B. To achieve the best performance, we recommend using the 7B-DPO or 72B-DPO model, based on your hardware configuration.
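
As a sketch, one way to fetch a checkpoint is with the huggingface_hub library; the repo_id below is an assumption, so verify the exact repository name on the Hugging Face models page:

# Sketch: download a checkpoint with huggingface_hub.
# NOTE: the repo_id is an assumption; confirm it on Hugging Face.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="bytedance-research/UI-TARS-7B-DPO",  # assumed repo id
    local_dir="./UI-TARS-7B-DPO",
)
print(f"Model downloaded to {local_path}")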

Start an OpenAI API Service

Run the command below to start an OpenAI-compatible API service:

python -m vllm.entrypoints.openai.api_server --served-model-name ui-tars --model <path to your model>

Input your API information

Note: The VLM Base URL should be an OpenAI-compatible API endpoint (see the OpenAI API protocol documentation for more details).
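
For reference, here is a minimal sketch of querying the service started above with the openai Python client. It sends a screenshot plus an instruction as an OpenAI-style vision message; the screenshot file name is a placeholder:

import base64
from openai import OpenAI

# Point the client at the vLLM server started above. A local server
# ignores the API key, but the client requires a non-empty value.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="empty")

# Encode a screenshot as a base64 data URL; "screenshot.png" is a placeholder.
with open("screenshot.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="ui-tars",  # must match --served-model-name above
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            {"type": "text",
             "text": "Get the current weather in SF using the web browser"},
        ],
    }],
)
print(response.choices[0].message.content)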

Development

Just two simple steps to run the application:

pnpm install
pnpm run dev

Note: On macOS, you need to grant permissions to the app you use to run these commands (e.g., iTerm2, Terminal).

Testing

# Unit test
pnpm run test
# E2E test
pnpm run test:e2e

System Requirements

  • Node.js >= 20
  • Supported Operating Systems:
    • Windows 10/11
    • macOS 10.15+

License

UI-TARS Desktop is licensed under the Apache License 2.0.

Citation

If you find our paper and code useful in your research, please consider giving a star ⭐ and a citation 📝:

@article{uitars2025,
  author    = {Yujia Qin and Yining Ye and Junjie Fang and Haoming Wang and Shihao Liang and Shizuo Tian and Junda Zhang and Jiahao Li and Yunxin Li and Shijue Huang and Wanjun Zhong and Kuanye Li and Jiale Yang and Yu Miao and Woyu Lin and Longxiang Liu and Xu Jiang and Qianli Ma and Jingyu Li and Xiaojun Xiao and Kai Cai and Chuang Li and Yaowei Zheng and Chaolin Jin and Chen Li and Xiao Zhou and Minchao Wang and Haoli Chen and Zhaojian Li and Haihua Yang and Haifeng Liu and Feng Lin and Tao Peng and Xin Liu and Guang Shi},
  title     = {UI-TARS: Pioneering Automated GUI Interaction with Native Agents},
  journal   = {arXiv preprint arXiv:2501.12326},
  url       = {https://github.com/bytedance/UI-TARS},
  year      = {2025}
}