Technical boundaries of the current version
As a technical preview, Agent TARS has three main limitations: it runs only on macOS, its model depends on external APIs, and its success rate on complex tasks is about 78%. These limitations stem from the following technical factors:
- The system adaptation layer needs platform-specific drivers, and the Windows/Linux version is expected to require a 6-month development cycle.
- The core AI model currently runs on the Azure OpenAI service; there are plans to open source a smaller model that can run locally.
- The accuracy of visual localization on dynamic web pages depends on how quickly elements load and on AJAX updates that change the page after it first renders.
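One common way to reduce the effect of late-loading elements is to wait until the page stops changing before trying to locate anything on it. The following is a minimal sketch of that idea, not Agent TARS code: `wait_until_stable` and its `snapshot` callable (for example, a hash of a screenshot or of the DOM) are hypothetical names introduced here for illustration.

```python
import time
from typing import Callable, Hashable


def wait_until_stable(
    snapshot: Callable[[], Hashable],
    settle_checks: int = 3,
    interval: float = 0.2,
    timeout: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once snapshot() gives the same value settle_checks times in a row.

    `snapshot` is a hypothetical caller-supplied function, e.g. one that
    hashes a screenshot. Returns False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    last = snapshot()
    stable = 1  # the first snapshot counts as one observation
    while stable < settle_checks:
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
        current = snapshot()
        if current == last:
            stable += 1
        else:
            # The page changed (e.g. an AJAX update); start counting again.
            last, stable = current, 1
    return True
```

A caller would run the visual localization step only after this function returns True, and treat False as a signal to retry or report that the page never settled.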
The project roadmap shows that the next version will focus on three areas: supporting Safari/Chrome plug-in extensions, adding a local model caching mechanism, and optimizing task backtracking. For now, the development team recommends splitting complex tasks into multiple subtasks to get the best results.
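The advice to split complex tasks can be sketched as a small driver that runs subtasks one at a time, retries a failed step, and stops at the first step that keeps failing. This is only an illustration under stated assumptions: `run_subtasks`, `TaskReport`, and the `run_one` callable are hypothetical and do not reflect the Agent TARS interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class TaskReport:
    completed: List[str] = field(default_factory=list)
    failed_at: Optional[str] = None  # first subtask that exhausted its attempts


def run_subtasks(
    subtasks: List[str],
    run_one: Callable[[str], bool],
    max_attempts: int = 2,
) -> TaskReport:
    """Run subtasks in order; `run_one` is a hypothetical hook that sends
    one instruction to the agent and returns True on success."""
    report = TaskReport()
    for instruction in subtasks:
        for _ in range(max_attempts):
            if run_one(instruction):
                report.completed.append(instruction)
                break
        else:
            # Every attempt failed: record where we stopped and give up.
            report.failed_at = instruction
            break
    return report


if __name__ == "__main__":
    steps = ["Open the browser", "Search for flights", "Export results"]
    # Stand-in runner that always succeeds; replace with a real agent call.
    print(run_subtasks(steps, run_one=lambda step: True))
```

Keeping each instruction short means a failure costs only one small step to retry, and `failed_at` shows where to resume.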
This answer comes from the article "Agent TARS: An Open Source Intelligence Using Vision and Commands to Operate Computers".































