
Bytebot is an open-source, self-hosted AI desktop agent that runs in a containerized Linux environment and automates computer tasks from natural language commands. It operates a computer the way a person does, using the keyboard, mouse, and screen to carry out tasks such as web browsing, data processing, and file management. Bytebot emphasizes privacy and customizability: data stays on the user's infrastructure, and users supply their own API keys for AI models such as Claude, OpenAI, or Gemini. Its core design favors simplicity and generality, which makes it suitable for developers building automated workflows. The project is hosted on GitHub and is easy to deploy and scale for personal and enterprise use.

 

Features

  • Natural language task processing: The user describes a task in natural language, such as "Search for flights from New York to London next month" or "Fill out a web form", and Bytebot executes it automatically.
  • UI automation: Simulates keyboard input, mouse clicks, and screen reading to operate browsers, office software, and other applications.
  • Containerized Linux environment: A lightweight desktop environment based on Ubuntu and Xfce4, running in Docker containers for isolation and security.
  • Multi-model support: Works with Claude, OpenAI, Gemini, and other large language models, selected according to the user's needs.
  • Real-time desktop monitoring: See the AI agent in action in real time via the VNC viewer.
  • API Integration: Provides REST and MCP APIs for precise control of the mouse, keyboard and screenshots.
  • Customizable environments: Users can install customized software or configure the desktop environment to meet specific needs.
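  • Privacy: Tasks and data are processed on the user's own infrastructure rather than on a hosted automation service; model requests go to the provider whose API key the user configures.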

Usage Guide

Installation

Bytebot can be deployed with Docker or on Railway. The steps are:

  1. Clone the repository
    Open a terminal and run the following command to clone the Bytebot repository:

    git clone https://github.com/bytebot-ai/bytebot.git
    cd bytebot
    
  2. Configure an API key
    Bytebot supports API keys from Anthropic, OpenAI, or Google. Select a model and configure its key:

    echo "ANTHROPIC_API_KEY=your_api_key_here" > docker/.env  # for Claude
    # or
    echo "OPENAI_API_KEY=your_api_key_here" > docker/.env    # for OpenAI
    # or
    echo "GOOGLE_API_KEY=your_api_key_here" > docker/.env    # for Gemini
    

    Each command overwrites docker/.env, so run only the one for your provider. Make sure the key is valid, keep it only in docker/.env, and do not commit that file to version control.

  3. Start the services
    Start the services with Docker Compose:

    docker-compose -f docker/docker-compose.yml up -d
    

    The first startup may take 2-3 minutes while the images download; subsequent startups are faster. Once the services are running, the Bytebot UI is available at http://localhost:9992.

  4. Verify Installation
    Check service logs to ensure proper operation:

    docker-compose -f docker/docker-compose.yml logs -f bytebot-agent
    
  5. Railway deployment (optional)
    To deploy on the Railway platform instead:

    • Visit Bytebot's Railway template page.
    • Enter your API key (e.g. ANTHROPIC_API_KEY).
    • Click "Deploy Now" and Railway will complete the deployment in a few minutes and provide a public URL.
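
The key setup in step 2 comes down to writing KEY=value lines into docker/.env. As a minimal sketch of that step in TypeScript, the hypothetical helper below (buildEnvFile is not part of Bytebot) renders the file content from the variable names shown above and rejects empty or multi-line values. Whether several provider keys may coexist in one file is an assumption here, not something the steps above confirm.

```typescript
// Hypothetical helper, not part of Bytebot: renders docker/.env content.
// The variable names are the ones shown in step 2 of the installation.
type Provider = "anthropic" | "openai" | "google";

const ENV_VAR: Record<Provider, string> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  google: "GOOGLE_API_KEY",
};

function buildEnvFile(keys: Partial<Record<Provider, string>>): string {
  const lines: string[] = [];
  for (const provider of Object.keys(ENV_VAR) as Provider[]) {
    const key = keys[provider];
    if (key === undefined) continue;
    // Reject values that would corrupt a KEY=value line.
    if (key.trim() === "" || /[\r\n]/.test(key)) {
      throw new Error(`invalid API key for ${provider}`);
    }
    lines.push(`${ENV_VAR[provider]}=${key}`);
  }
  if (lines.length === 0) throw new Error("no API key provided");
  // One KEY=value pair per line, with a trailing newline.
  return lines.join("\n") + "\n";
}
```

The returned string can be written to docker/.env in one step, which avoids accidentally overwriting an earlier key with a second echo command.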

Using the Main Features

Bytebot provides a Next.js web interface that combines a VNC viewer with task management. The main features work as follows:

  • Creating tasks
    Open http://localhost:9992 to reach the Bytebot UI. Enter a natural language command in the task input box, for example:

    Search for flights from New York to London next month
    

    Click Submit and Bytebot will launch the viewer and perform the task. You can monitor the operation in real time through the VNC viewer.

  • API Control
    Developers can precisely control tasks through the REST API. For example, create a task:

    curl -X POST http://localhost:9991/tasks \
    -H "Content-Type: application/json" \
    -d '{"description": "Search for flights from New York to London next month", "type": "browser_task"}'
    

    Check the status of the task:

    curl http://localhost:9991/tasks/{task_id}
    

    Control the keyboard or mouse:

    curl -X POST http://localhost:9990/api/computer \
    -H "Content-Type: application/json" \
    -d '{"action": "type_text", "text": "Hello, Bytebot!"}'
    
  • Real-time monitoring
    Watch Bytebot operate the browser or desktop applications in the VNC viewer built into the UI. The viewer shows the screen in real time, which is useful for debugging and for verifying tasks.
  • Customizing the desktop environment
    Modify the docker/desktop/Dockerfile.custom file to install additional software. For example, to add LibreOffice and GIMP:

    FROM bytebot/desktop:latest
    RUN apt-get update && apt-get install -y libreoffice gimp
    COPY configs/.config /home/user/.config
    

    Rebuild the image and start the container:

    docker-compose -f docker/docker-compose.yml up --build
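
For the REST calls in the API Control item above, quoting JSON by hand in a shell is easy to get wrong. The following is a minimal sketch in TypeScript that builds the same three requests as plain objects. The ports, paths, and field names are copied from the curl examples above and are not independently verified; the HttpRequest type and the function names are hypothetical and not part of any Bytebot SDK.

```typescript
// Hypothetical request builders mirroring the curl examples in this article.
interface HttpRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

// POST /tasks on port 9991, as in the task-creation example.
function createTaskRequest(description: string, baseUrl = "http://localhost:9991"): HttpRequest {
  if (description.trim() === "") throw new Error("description must not be empty");
  return {
    url: `${baseUrl}/tasks`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // JSON.stringify takes care of escaping quotes inside the description.
    body: JSON.stringify({ description, type: "browser_task" }),
  };
}

// GET /tasks/{task_id}, as in the status example.
function taskStatusRequest(taskId: string, baseUrl = "http://localhost:9991"): HttpRequest {
  // encodeURIComponent keeps an unexpected ID from altering the path.
  return { url: `${baseUrl}/tasks/${encodeURIComponent(taskId)}`, method: "GET", headers: {} };
}

// POST /api/computer on port 9990, as in the keyboard example.
function typeTextRequest(text: string, baseUrl = "http://localhost:9990"): HttpRequest {
  return {
    url: `${baseUrl}/api/computer`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ action: "type_text", text }),
  };
}
```

In Node 18+ or a browser, each returned object maps onto the arguments of fetch, for example fetch(req.url, { method: req.method, headers: req.headers, body: req.body }).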
    

Featured Functions

  • Web automation
    Bytebot is well suited to web tasks. For example, driving a browser session through the SDK:

    import { BytebotClient } from "@bytebot/sdk";

    const bytebot = new BytebotClient({ apiKey: "YOUR_API_KEY" });
    async function run() {
      // Start a browser session on the target page.
      const session = await bytebot.browser.startSession("https://www.example.com");
      try {
        await bytebot.browser.act({ sessionId: session.sessionId, prompt: "Click the search button" });
      } finally {
        // End the session even if the action fails.
        await bytebot.browser.endSession(session.sessionId);
      }
    }
    run().catch(console.error);
    

    This code starts a browser session, performs a click action described in natural language, and ends the session.

  • Document processing
    Bytebot can work with local files. For example, the command "Fill web form from CSV file" reads the file and fills in the form automatically. Make sure the path to the CSV file is correct, then enter the command in the UI.
  • Multi-model switching
    Change the API key in docker/.env to switch to a different model. For example, after replacing it with an OpenAI key, recreate the containers so they pick up the new value (a plain restart does not reload environment variables):

    docker-compose -f docker/docker-compose.yml up -d
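
For the CSV form-filling case under Document processing above, one alternative to pointing Bytebot at a file path is to generate one task description per CSV row and submit each as its own task. The following is a minimal sketch under that assumption. The function name and the wording of the generated prompt are hypothetical, and the parser handles only plain comma-separated values without quoted fields.

```typescript
// Hypothetical helper: turns CSV text into one task description per data row.
// Minimal parser: comma-separated cells, no quoted fields or embedded commas.
function csvToTaskDescriptions(csv: string, formUrl: string): string[] {
  const rows = csv
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "")
    .map((line) => line.split(",").map((cell) => cell.trim()));

  // The first row is the header; without data rows there is nothing to do.
  const header = rows[0] ?? [];
  const records = rows.slice(1);

  return records.map((cells) => {
    // Pair each header name with the cell in the same column.
    const fields = header.map((name, i) => `${name}: ${cells[i] ?? ""}`).join("; ");
    return `Open ${formUrl} and fill out the form with ${fields}, then submit it.`;
  });
}
```

Each returned string can be pasted into the task input box or sent as the description of a task through the REST API.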
    

Notes

  • Security: Change the default VNC password, and do not run with default settings in a production environment.
  • Updates: Periodically pull updated container images to pick up security patches, then recreate the containers:
    docker-compose -f docker/docker-compose.yml pull
    docker-compose -f docker/docker-compose.yml up -d
    

Use Cases

  1. Web data extraction
    Bytebot can extract data from websites, such as product prices or news content, and produce structured tables for market research or data analysis.
  2. Automated form filling
    For tasks that involve filling in web forms repeatedly, such as registering accounts or submitting applications, Bytebot reads the data from a CSV file and automates the process.
  3. Office software operation
    Bytebot can operate applications such as LibreOffice or VSCode to handle document editing, code debugging, and similar tasks, which suits scenarios that require batch file processing.
  4. Enterprise workflow automation
    Organizations can use Bytebot to update user permissions in SaaS tools or generate weekly reports automatically, improving internal efficiency.

FAQ

  1. What AI models does Bytebot support?
    Bytebot supports Claude, OpenAI, and Gemini models. Configure the corresponding API key in the docker/.env file.
  2. How does Bytebot ensure data privacy?
    Bytebot runs in a local container and the data does not leave the user's infrastructure, making it suitable for scenarios with high privacy requirements.
  3. Are programming skills required?
    No. Regular users can enter natural language commands through the UI without programming. Developers can implement more complex functionality through the APIs.
  4. How long does it take to deploy?
    The first deployment takes about 2-3 minutes, and subsequent startups take only a few seconds. Railway deployments are typically completed in minutes.