OpenAdapt is an open-source software tool that connects powerful Large Multimodal Models (LMMs) to a computer's Graphical User Interface (GUI) with the aim of automating processes. Traditionally, a great deal of mental labor has been wasted on repetitive computer operations, and OpenAdapt aims to solve this problem. It works similarly to Robotic Process Automation (RPA), but the core driver is an advanced AI model rather than a traditional RPA tool. The tool learns by recording what users actually do on their computers (including screenshots and typing actions) and then uses that data to generate automated tasks. This method of learning from human demonstrations makes the automation tasks closer to the actual process and reduces the likelihood of the AI generating incorrect operations. As a model-agnostic open source project, it is applicable to all kinds of desktop applications, even virtualized environments (e.g. Citrix) and web pages.
Feature List
- Record user actions: Captures screenshots and the associated user input (e.g., mouse clicks, keyboard input) to provide learning data for automation.
- Visualization: Provides tools to aggregate and visualize recorded data so developers can understand and debug it easily.
- Generate automation scripts: Convert user action records into a format that AI models can understand and generate automated tasks that can be replayed over and over again.
- Multiple playback strategies: Support for different automated execution strategies, from simple direct playback to smarter playback using GPT-4 or visual models.
- Browser Integration: Provides a Chrome extension to record in-browser action events for more accurate web automation.
- Privacy: Built-in scrubbing removes personally identifiable information (PII) and protected health information (PHI) using tools such as AWS Comprehend and Microsoft Presidio, among others.
- Performance Monitoring: Detailed performance monitoring tools are integrated to help developers analyze and optimize their programs.
- Cross-platform support: Provides installation and usage instructions for major operating systems such as Windows and macOS.
Usage Guide
OpenAdapt records your computer actions (such as mouse clicks and keystrokes) along with screenshots, so that AI models can learn to mimic your behavior and complete repetitive tasks.
Installation process
OpenAdapt provides a convenient scripted installation for users of different operating systems.
Windows
- Press the Windows key, type "powershell", and press Enter to open PowerShell.
- Copy and paste the following command into the PowerShell window, then press Enter to run it. If a User Account Control prompt appears, click Yes.
Start-Process powershell -Verb RunAs -ArgumentList '-NoExit', '-ExecutionPolicy', 'Bypass', '-Command', "iwr -UseBasicParsing -Uri 'https://raw.githubusercontent.com/OpenAdaptAI/OpenAdapt/main/install/install_openadapt.ps1' | Invoke-Expression"
macOS
- First make sure you have Git and Python 3.10 installed.
- Press Command+Space, type "terminal", and press Enter to open the terminal.
- Copy and paste the following command into the terminal window and press Enter to run it:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/OpenAdaptAI/OpenAdapt/HEAD/install/install_openadapt.sh)"
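The Git and Python 3.10 prerequisites mentioned above can be checked before running the installer. The following is a minimal sketch, not part of OpenAdapt: the function names are hypothetical, and it assumes "Python 3.10" means any release in the 3.10 series.

```python
import shutil
import sys


def python_version_ok(version_info, required=(3, 10)):
    """Return True if the major.minor pair matches the required version."""
    return tuple(version_info[:2]) == tuple(required)


def check_prerequisites():
    """Report whether Git is on PATH and whether this interpreter is 3.10."""
    return {
        "git": shutil.which("git") is not None,
        "python_3_10": python_version_ok(sys.version_info),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

Run it with the same interpreter you plan to use for OpenAdapt, since the version check inspects the running Python rather than every Python installed on the machine.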
Core Operations
Once the installation is complete, you can use OpenAdapt with a few core commands. Before using it, make sure you have run cd OpenAdapt to enter the project's root directory and run poetry shell to activate the virtual environment.
1. Launching the system tray and web backend
Run the following command to launch OpenAdapt's system tray icon and web dashboard for easy management and viewing of tasks.
python -m openadapt.entrypoint
2. Recording a new task
Use the openadapt.record command to start a new recording. You need to provide a descriptive name for the task you are recording, such as "testing out openadapt".
python -m openadapt.record "testing out openadapt"
When the terminal shows that the event writers (screen, action, window) have started, you can begin operating the computer. OpenAdapt records your mouse movements, clicks, and keyboard input. When you are done, press CTRL+C to stop recording.
Note: The current version recommends keeping recordings short (e.g., under a minute) to avoid using too much memory.
3. Visualization of recorded content
Once the recording is complete, you can quickly view what was recorded. Run the following command:
python -m openadapt.visualize
This command automatically generates an HTML file and opens it in your browser. You will see a detailed view with all the steps and corresponding screenshots.
4. Playback (execution) of automated tasks
Use the openadapt.replay command to automate the task you just recorded. You need to specify a replay strategy, the simplest of which is NaiveReplayStrategy:
python -m openadapt.replay NaiveReplayStrategy
In addition, OpenAdapt offers smarter replay strategies such as VisualReplayStrategy, which uses visual models to recognize elements on the screen. Some advanced strategies also let you add new instructions that modify the original task, for example:
python -m openadapt.replay VanillaReplayStrategy --instructions "calculate 9-8"
The --instructions argument tells the AI to adapt its behavior to the new instruction ("calculate 9-8") as it performs the task.
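The record, visualize, and replay commands above can also be assembled from a script. The sketch below only builds and prints the command lines; the helper names openadapt_command and replay_command are hypothetical, and it uses nothing beyond the modules, strategy names, and --instructions flag shown in this section.

```python
import shlex
import sys


def openadapt_command(module, *args):
    """Build the argument list for `python -m openadapt.<module> <args...>`."""
    return [sys.executable, "-m", f"openadapt.{module}", *args]


def replay_command(strategy, instructions=None):
    """Build a replay command; --instructions is appended only when given."""
    cmd = openadapt_command("replay", strategy)
    if instructions:
        cmd += ["--instructions", instructions]
    return cmd


if __name__ == "__main__":
    commands = [
        openadapt_command("record", "testing out openadapt"),
        openadapt_command("visualize"),
        replay_command("NaiveReplayStrategy"),
        replay_command("VanillaReplayStrategy", "calculate 9-8"),
    ]
    for cmd in commands:
        # shlex.join quotes arguments that contain spaces.
        print(shlex.join(cmd))
```

Each list can be passed to subprocess.run from inside the activated poetry environment. Keeping the arguments as a list avoids shell quoting problems with task names that contain spaces.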
Browser Automation Integration
If you want to record actions in Google Chrome, you also need to set up the browser extension:
- In the Chrome address bar, type chrome://extensions and open it.
- Turn on the "Developer mode" switch in the upper right corner.
- Click "Load unpacked" in the upper left corner.
- In the file selection window that pops up, locate and select the chrome_extension folder in the OpenAdapt project directory.
- Make sure the OpenAdapt extension is enabled.
- Edit the openadapt/data/config.json file and set RECORD_BROWSER_EVENTS to true.
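The final configuration step can be done by hand in a text editor or with a short script. This is a minimal sketch that assumes config.json is a flat JSON object with RECORD_BROWSER_EVENTS as a top-level key; the helper name set_config_flag is hypothetical and not part of OpenAdapt.

```python
import json
from pathlib import Path


def set_config_flag(config_path, key, value):
    """Load a JSON config file, set one top-level key, and write it back."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config[key] = value
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Run from the OpenAdapt project root so the relative path resolves.
    set_config_flag("openadapt/data/config.json", "RECORD_BROWSER_EVENTS", True)
```

The script rewrites the whole file with two-space indentation, so other keys are preserved but the original formatting is not.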
Application Scenarios
- Automated data entry
For repetitive tasks that require copying information from one application (e.g., PDF documents, email) and pasting it into another (e.g., Excel sheets, databases), you can record the workflow once with OpenAdapt and then let it automate all subsequent similar data-entry tasks.
- Software operating aids
For users unfamiliar with a complex piece of software, someone else can pre-record a series of standard operating procedures. Users can then simply replay these procedures through OpenAdapt to automate specific tasks, lowering the barrier to using the software.
- Software regression testing
During software development, developers can record a series of standardized test cases. These test cases can be replayed automatically whenever the software is updated to check whether the new version introduces new problems, improving testing efficiency.
- Automation of personal daily tasks
OpenAdapt can automate everyday tasks on your PC, such as organizing desktop files on a daily schedule, batch renaming photos, or logging into websites and checking in automatically, saving you time.
FAQ
- What is OpenAdapt?
OpenAdapt is open source process automation software. It automates repetitive tasks by recording user actions on a computer and using large multimodal models (LMMs) to learn and mimic those actions.
- How is it different from traditional RPA tools?
Traditional RPA tools usually rely on preset rules and scripts to perform tasks, which makes them less adaptable. OpenAdapt instead adopts an "AI-first" strategy and learns by observing human demonstrations, which helps it understand task intent and adapt to dynamic situations such as interface changes.
- Do I need to pay to use OpenAdapt?
No. OpenAdapt is an open source project released under the MIT license and is free for anyone to use, modify, and distribute.
- What operating systems does it support?
OpenAdapt currently provides detailed installation scripts and manual setup guides for Windows and macOS, the two major desktop operating systems.
- How does OpenAdapt handle my private data?
OpenAdapt has built-in privacy scrubbing that automatically recognizes and removes Personally Identifiable Information (PII) and Protected Health Information (PHI) during recording to keep user data secure.
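To illustrate the general idea of scrubbing text before it is stored, here is a deliberately simple regex-based sketch. It is not OpenAdapt's implementation, which according to the feature list relies on tools such as AWS Comprehend and Microsoft Presidio; the patterns and the function name scrub_text are illustrative assumptions and only catch plain email addresses and phone numbers written as three, three, and four digits.

```python
import re

# Illustrative patterns only; real PII detection needs far broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def scrub_text(text):
    """Replace email addresses and simple phone numbers with placeholders."""
    text = EMAIL.sub("<EMAIL>", text)
    return PHONE.sub("<PHONE>", text)


if __name__ == "__main__":
    print(scrub_text("Mail jane.doe@example.com or call 555-123-4567."))
```

Dedicated scrubbing tools go well beyond this by using trained entity recognizers and by handling text that appears inside screenshots, which plain regular expressions cannot do.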