
Bytebot: Automating Desktop Tasks in Linux Containers with Natural Language

2025-08-02


Bytebot is an open-source, self-hosted AI desktop agent that runs in a containerized Linux environment and automates computer tasks from natural language commands. It operates the computer the way a person does, using the keyboard, mouse, and screen to carry out tasks such as web browsing, data processing, and file management. Bytebot emphasizes privacy and customizability: data does not leave the user's infrastructure, and users supply their own API keys for AI models such as Claude, OpenAI, or Gemini. Its core design favors simplicity and generality, which makes it suitable for developers building automated workflows. The project is hosted on GitHub and is easy to deploy and scale for personal and enterprise use.

Function List

Natural language task processing: The user describes a task in natural language, such as "Search for flights from New York to London next month" or "Fill out a web form", and Bytebot executes it automatically.
UI automation: Simulates keyboard input, mouse clicks, and screen reading to operate browsers, office software, and other applications.
Containerized Linux environment: A lightweight desktop environment based on Ubuntu and Xfce4 that runs in Docker containers for isolation and security.
Multi-model support: Supports large language models including Claude, OpenAI, and Gemini, so users can choose the one that fits their needs.
Real-time desktop monitoring: See the AI agent in action in real time via the VNC viewer.
API integration: Provides REST and MCP APIs for precise control of the mouse, keyboard, and screenshots.
Customizable environments: Users can install customized software or configure the desktop environment to meet specific needs.
Privacy: Tasks and data stay on the user's own infrastructure rather than on a hosted automation service; model requests go to whichever AI provider the user configures.

Using Help

Installation process

Bytebot is easy to install and can be deployed with Docker or on Railway. The detailed steps follow:

Cloning Codebase
Open a terminal and run the following command to clone the Bytebot repository:
```
git clone https://github.com/bytebot-ai/bytebot.git
cd bytebot
```

Configuring API Keys
Bytebot supports API keys from Anthropic, OpenAI or Google. Select a model and configure the key:

echo "ANTHROPIC_API_KEY=your_api_key_here" > docker/.env  # for Claude
# or
echo "OPENAI_API_KEY=your_api_key_here" > docker/.env    # for OpenAI
# or
echo "GOOGLE_API_KEY=your_api_key_here" > docker/.env    # for Gemini

Make sure the key is valid, and keep it only in the docker/.env file (out of version control) so that it does not leak.
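Because the `echo ... >` commands overwrite docker/.env, any other settings in that file are lost when the key changes. The following is a minimal sketch of a safer update, assuming the file holds simple `NAME=value` lines. The helper name `setProviderKey` is hypothetical and not part of Bytebot; only the three variable names come from the commands above.

```typescript
// Environment variable names taken from the article's examples.
const PROVIDER_ENV_VARS = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  google: "GOOGLE_API_KEY",
} as const;

type Provider = keyof typeof PROVIDER_ENV_VARS;

// Hypothetical helper: returns new .env text containing exactly one provider
// key, while preserving any unrelated lines already in the file.
function setProviderKey(envText: string, provider: Provider, apiKey: string): string {
  if (apiKey.trim() === "" || /\s/.test(apiKey)) {
    throw new Error("API key must be non-empty and contain no whitespace");
  }
  const providerVars: string[] = Object.values(PROVIDER_ENV_VARS);
  const kept = envText
    .split("\n")
    .filter((line) => line.trim() !== "")
    .filter((line) => !providerVars.some((name) => line.startsWith(`${name}=`)));
  return [...kept, `${PROVIDER_ENV_VARS[provider]}=${apiKey}`].join("\n") + "\n";
}
```

The returned text can be written back with Node's `fs.writeFileSync("docker/.env", text)`. The function itself stays pure, so it can be checked without touching the real file.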

Deployment services
Start the service using Docker Compose:
```
docker-compose -f docker/docker-compose.yml up -d
```
The first startup may take 2-3 minutes while the images download; subsequent startups are faster. Once the services are running, the Bytebot UI is available at http://localhost:9992.

Verify Installation
Check service logs to ensure proper operation:

docker-compose -f docker/docker-compose.yml logs -f bytebot-agent
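Since the first startup can take a few minutes, a script that depends on Bytebot may need to wait until the UI responds. The sketch below shows one way to retry a readiness check. The names `Probe` and `waitUntilReady` are hypothetical, and the probe is injected so that the retry logic makes no assumption about which HTTP client is used.

```typescript
// A probe returns true once the service answers; it is injected so the retry
// logic does not depend on any particular HTTP client.
type Probe = () => Promise<boolean>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: retries the probe until it succeeds or attempts run out.
async function waitUntilReady(
  probe: Probe,
  attempts: number,
  delayMs: number,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      if (await probe()) return true;
    } catch {
      // Typically a refused connection while containers are still starting.
    }
    if (i < attempts - 1) await sleep(delayMs);
  }
  return false;
}
```

On Node 18 or later, a probe for the UI could be `() => fetch("http://localhost:9992").then((r) => r.ok)`, called as `waitUntilReady(probe, 30, 5000)` to allow roughly two and a half minutes.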

Railway deployment (optional)
If deployed using the Railway platform:
- Visit Bytebot's Railway template page.
- Enter your API key (e.g. ANTHROPIC_API_KEY).
- Click "Deploy Now" and Railway will complete the deployment in a few minutes and provide a public URL.

Using the main functions

Bytebot provides an intuitive Next.js interface that combines a VNC viewer with task management features. The main features work as follows:

Creating Tasks
Open http://localhost:9992 to reach the Bytebot UI. Enter a natural language command in the task input box, for example:
```
Search for flights from New York to London next month
```
Click Submit and Bytebot will launch the viewer and perform the task. You can monitor the operation in real time through the VNC viewer.

API Control
Developers can precisely control tasks through the REST API. For example, create a task:

curl -X POST http://localhost:9991/tasks \
-H "Content-Type: application/json" \
-d '{"description": "Search for flights from New York to London next month", "type": "browser_task"}'

Check the status of the task:

curl http://localhost:9991/tasks/{task_id}
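The same two calls can be prepared in code before they are sent. The sketch below only builds the request descriptions; the `HttpRequest` shape and both function names are hypothetical, while the URL, the JSON fields, and the `browser_task` type are copied from the curl commands above.

```typescript
// Base URL and the "browser_task" type are copied from the curl examples;
// whether other task types exist is not stated in the article.
const TASKS_BASE_URL = "http://localhost:9991/tasks";

interface HttpRequest {
  method: "GET" | "POST";
  url: string;
  headers: Record<string, string>;
  body?: string;
}

// Builds the POST request that creates a task from a description.
function createTaskRequest(description: string): HttpRequest {
  if (description.trim() === "") {
    throw new Error("Task description must not be empty");
  }
  return {
    method: "POST",
    url: TASKS_BASE_URL,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ description, type: "browser_task" }),
  };
}

// Builds the GET request that checks a task's status.
function taskStatusRequest(taskId: string): HttpRequest {
  return {
    method: "GET",
    url: `${TASKS_BASE_URL}/${encodeURIComponent(taskId)}`,
    headers: {},
  };
}
```

A request built this way can be sent with `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. The shape of the response is not documented in this article, so it is left unparsed here.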

Control the keyboard or mouse:

curl -X POST http://localhost:9990/api/computer \
-H "Content-Type: application/json" \
-d '{"action": "type_text", "text": "Hello, Bytebot!"}'
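Hand-written JSON in a curl command breaks as soon as the text to type contains quotes. As a sketch, the body can be serialized with `JSON.stringify` and then quoted for the shell. The function names below are hypothetical; the endpoint and the `type_text` action are the only parts taken from the example above, and no other action names are assumed.

```typescript
// Endpoint and the "type_text" action come from the curl example; other action
// names are not listed in the article, so none are modeled here.
const COMPUTER_API_URL = "http://localhost:9990/api/computer";

interface TypeTextAction {
  action: "type_text";
  text: string;
}

// Serializes the action as a JSON request body, escaping quotes in the text.
function typeTextBody(text: string): string {
  const payload: TypeTextAction = { action: "type_text", text };
  return JSON.stringify(payload);
}

// Renders an equivalent curl command, single-quoting the body for POSIX shells.
function toCurl(body: string): string {
  const escaped = body.replace(/'/g, "'\\''");
  return `curl -X POST ${COMPUTER_API_URL} -H "Content-Type: application/json" -d '${escaped}'`;
}
```

For example, `toCurl(typeTextBody('He said "hi"'))` yields a command whose body has the inner double quotes escaped by the serializer instead of by hand.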

Real-time monitoring
Watch Bytebot operate the browser or desktop applications through the VNC viewer in the UI. The viewer shows the live screen and is useful for debugging or verifying tasks.

Customizing the desktop environment
Modify the docker/desktop/Dockerfile.custom file to install additional software. For example, to add LibreOffice and GIMP:
```
FROM bytebot/desktop:latest
RUN apt-get update && apt-get install -y libreoffice gimp
COPY configs/.config /home/user/.config
```
Rebuild the image and start the container:
```
docker-compose -f docker/docker-compose.yml up --build
```
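When the package list changes often, the `RUN` line can be generated instead of edited by hand. The following sketch renders a Dockerfile like the one above from a list of package names and rejects anything that is not a valid Debian package name. The function `renderDockerfile` is hypothetical, and the `COPY` step for desktop configuration is deliberately left out.

```typescript
// Base image name is taken from the article's Dockerfile example.
const BASE_IMAGE = "bytebot/desktop:latest";

// Debian package names: lowercase letters, digits, '+', '-', '.'; at least two
// characters, starting with an alphanumeric character.
const PACKAGE_NAME = /^[a-z0-9][a-z0-9+.-]+$/;

// Hypothetical helper: renders Dockerfile text that installs the given packages.
function renderDockerfile(packages: string[]): string {
  if (packages.length === 0) {
    throw new Error("List at least one package");
  }
  for (const name of packages) {
    if (!PACKAGE_NAME.test(name)) {
      throw new Error(`Invalid package name: ${name}`);
    }
  }
  // De-duplicate and sort so the output is stable across runs.
  const unique = Array.from(new Set(packages)).sort();
  return [
    `FROM ${BASE_IMAGE}`,
    `RUN apt-get update && apt-get install -y ${unique.join(" ")}`,
    "",
  ].join("\n");
}
```

Validating the names first keeps stray shell syntax out of the `RUN` instruction, and sorting the list keeps the generated file stable so Docker's layer cache is reused when nothing has changed.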

Featured Function Operation

Web automation
Bytebot specializes in web tasks. For example, to interact with a web page through the SDK:

import { BytebotClient } from "@bytebot/sdk";
const bytebot = new BytebotClient({ apiKey: "YOUR_API_KEY" });

async function run() {
  // Start a browser session on the target page.
  const session = await bytebot.browser.startSession("https://www.example.com");
  try {
    // Describe the action to perform in natural language.
    await bytebot.browser.act({ sessionId: session.sessionId, prompt: "Click the search button" });
  } finally {
    // End the session even if the action fails.
    await bytebot.browser.endSession(session.sessionId);
  }
}
run();

This code starts a browser session, performs a click action, and ends the session.

File processing
Bytebot can work with local files. For example, the command "Fill web form from CSV file" reads the file and fills in the form automatically. Enter the command in the UI and make sure the path to the CSV file is correct.
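An alternative to pointing Bytebot at the file is to turn each CSV row into its own instruction, which makes it easier to see which row a failed task belongs to. The sketch below does this under loud assumptions: the CSV has a header row and no quoted fields or embedded commas, and both function names and the wording of the generated prompt are hypothetical rather than anything Bytebot prescribes.

```typescript
// Splits simple CSV text (no quoted fields or embedded commas) into records
// keyed by the header row.
function parseSimpleCsv(csv: string): Record<string, string>[] {
  const lines = csv.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return [];
  const headers = lines[0].split(",").map((h) => h.trim());
  return lines.slice(1).map((line) => {
    const cells = line.split(",").map((c) => c.trim());
    const record: Record<string, string> = {};
    headers.forEach((header, i) => {
      // Missing trailing cells become empty strings.
      record[header] = cells[i] ?? "";
    });
    return record;
  });
}

// Turns one record into a natural-language instruction for the task input box.
function formFillPrompt(formUrl: string, record: Record<string, string>): string {
  const fields = Object.entries(record)
    .map(([name, value]) => `${name}: ${value}`)
    .join("; ");
  return `Open ${formUrl} and fill out the form with these values, then submit it. ${fields}`;
}
```

Each generated prompt can then be entered in the UI or submitted as a task description, one row at a time.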

Multi-model switching
Change the API key in docker/.env to switch to a different model. For example, after replacing it with an OpenAI key, restart the services:
```
docker-compose -f docker/docker-compose.yml restart
```

Caveats

Security: Change the default VNC password, and do not run with the default settings in a production environment.
Updates: Periodically pull updated container images to pick up security patches:
```
docker-compose -f docker/docker-compose.yml pull
```

Application Scenarios

Web Data Extraction
Bytebot automatically extracts data from websites, such as crawling product prices or news content, to generate structured tables suitable for market research or data analysis.
Automated form filling
For tasks that require repetitive filling of web forms, such as registering for an account or submitting an application, Bytebot reads the data from the CSV file and automates the process.
Office software operation
Bytebot can operate LibreOffice or VSCode to handle document editing, code debugging, and similar tasks, which suits scenarios that require batch file processing.
Enterprise Workflow Automation
Organizations can use Bytebot to automatically update user permissions on SaaS tools or generate weekly reports to improve internal efficiency.

FAQ

What AI models does Bytebot support?
Bytebot supports Claude, OpenAI, and Gemini. Users configure the corresponding API key in the docker/.env file.
How do you ensure data privacy?
Bytebot runs in a local container and the data does not leave the user's infrastructure, making it suitable for scenarios with high privacy requirements.
Are programming skills required?
Regular users can enter natural language commands through the UI without programming. Developers can implement more complex functionality through APIs.
How long does it take to deploy?
The first deployment takes about 2-3 minutes, and subsequent startups take only a few seconds. Railway deployments typically complete within a few minutes.

