
WebAgent is an open source project developed by Alibaba Tongyi Lab that focuses on intelligent web information search and processing. It consists of three main components: WebWalker, WebDancer and WebSailor. These tools use large language models and reinforcement learning to help users complete complex web information search tasks efficiently. WebAgent's design goal is autonomous information seeking, applicable to scenarios ranging from academic research to everyday queries. The project is open-sourced on GitHub, with the code and some of the data freely available to developers, and it has attracted considerable attention, with more than 4,000 stars and hundreds of forks. WebAgent continues to improve through regular updates and community support, and the WebSailor-72B model has performed strongly on a number of complex browsing benchmarks, approaching the level of commercial search engines.


Feature List

  • WebWalker: Provides a web traversal benchmark for evaluating how well language models navigate websites, and supports multi-agent collaboration in information search tasks.
  • WebDancer: A native agentic search model that focuses on autonomous information seeking and builds on the ReAct framework for efficient search and reasoning.
  • WebSailor: A newly released agent model that excels at complex information search tasks, supports one-click deployment, and is reported to achieve the best performance among open source models.
  • SailorFog-QA dataset: A high-difficulty Q&A dataset, generated by graph sampling and information obfuscation, that supports model training and evaluation.
  • Reinforcement Learning Optimization: The DUPO algorithm is used to combine supervised fine-tuning with reinforcement learning, improving how well the model generalizes on complex tasks.

Usage Guide

Installation process

WebAgent is an open source project, and its code and models are obtained mainly through the GitHub repository. The installation steps below use WebDancer as the example; WebSailor is similar. Users need a basic Python programming environment and Git.

  1. Prepare the environment:
    • Make sure Python 3.12 or later is installed.
    • Install Git for cloning repositories.
    • Install conda, the package management tool used here to create virtual environments.
  2. Clone the repository:
    Run the following command in the terminal to get the WebAgent code:

    git clone https://github.com/Alibaba-NLP/WebAgent.git
    

    Go to the WebDancer folder:

    cd WebAgent/WebDancer
    
  3. Create a virtual environment:
    Use conda to create a separate Python environment to avoid dependency conflicts:

    conda create -n webdancer python=3.12
    

    Activate the environment:

    conda activate webdancer
    
  4. Install dependencies:
    In the WebDancer folder, run the following command to install the required dependencies:

    pip install -r requirements.txt
    

    The requirements.txt file lists all the Python packages needed to run WebDancer.

  5. Deploy the model:
    WebSailor supports one-click deployment through Alibaba Cloud's FunctionAI. Register for an Alibaba Cloud account, log in to the FunctionAI platform, follow the prompts to select the WebSailor-3B or WebSailor-72B model, and click the deployment button. Deployment takes about 10 minutes.
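
Before cloning and installing, it can help to confirm that the prerequisites from step 1 are actually present. The following is a minimal sketch, not part of the WebAgent repository: the function name `check_environment` is hypothetical, and the only assumptions are the Python 3.12 requirement and the `git` and `conda` commands named in the steps above.

```python
import shutil
import sys

REQUIRED_PYTHON = (3, 12)          # version named in the installation steps
REQUIRED_TOOLS = ("git", "conda")  # command-line tools named in the installation steps


def check_environment(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # Compare only (major, minor) against the required version.
    if tuple(version_info[:2]) < REQUIRED_PYTHON:
        problems.append(
            f"Python {REQUIRED_PYTHON[0]}.{REQUIRED_PYTHON[1]}+ required, "
            f"found {version_info[0]}.{version_info[1]}"
        )
    # shutil.which returns None when a command is not on PATH.
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("MISSING:", issue)
    print("Environment looks ready." if not issues else "Fix the items above first.")
```

The `version_info` and `which` parameters exist only so the check can be exercised without touching the real system; calling `check_environment()` with no arguments inspects the current interpreter and PATH.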

Using WebDancer

WebDancer is a native agent search model suitable for handling web search tasks that require deep reasoning. Here are the steps to use it:

  1. Start the model:
    In the virtual environment, go to the WebDancer directory and run the startup script. Refer to the README file in the repository for the exact command; it typically looks like this:

    python run_webdancer.py
    
  2. Enter a query:
    WebDancer accepts text input. Users can enter a search task, such as "Find information about the latest AI conferences in 2025," at the command line or in an interactive interface. The model automatically parses the query, traverses web pages, and extracts the relevant information.
  3. View the results:
    WebDancer returns structured search results, including text summaries, web links and related data. Users can further filter or export the results.
  4. Debug and optimize:
    If the search results are not satisfactory, adjust the model parameters (such as search depth or keyword weights). See the config.yaml file for the specific configuration options.
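
To make the query-traverse-extract cycle in step 2 more concrete, here is a minimal sketch of a ReAct-style loop (alternating thought, action, and observation). It is an illustration only and is not WebDancer's implementation: `react_loop`, `scripted_policy`, and `fake_search` are hypothetical names, the policy is a hard-coded stand-in for a language model, and the search tool returns a canned string so the example runs offline.

```python
from typing import Callable, Dict, List, Tuple

# An action is (tool_name, argument); the tool name "finish" ends the loop.
Action = Tuple[str, str]


def react_loop(
    question: str,
    policy: Callable[[str, List[str]], Tuple[str, Action]],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Tuple[str, List[str]]:
    """Alternate thought -> action -> observation until the policy finishes."""
    trace: List[str] = []
    for _ in range(max_steps):
        thought, (tool_name, argument) = policy(question, trace)
        trace.append(f"Thought: {thought}")
        if tool_name == "finish":
            return argument, trace
        trace.append(f"Action: {tool_name}[{argument}]")
        observation = tools[tool_name](argument)
        trace.append(f"Observation: {observation}")
    return "no answer within step budget", trace


# --- Hypothetical stand-ins so the sketch runs without a model or network ---
def fake_search(query: str) -> str:
    return "Example Conf 2025 takes place in July."


def scripted_policy(question: str, trace: List[str]) -> Tuple[str, Action]:
    # Search once, then answer with the text of the latest observation.
    if not any(line.startswith("Observation:") for line in trace):
        return "I should search first.", ("search", question)
    return "The observation answers the question.", (
        "finish",
        trace[-1].removeprefix("Observation: "),
    )


answer, trace = react_loop(
    "When is Example Conf 2025?", scripted_policy, {"search": fake_search}
)
print(answer)
print("\n".join(trace))
```

In a real agent the policy would be a model call that reads the trace so far, and the tools would fetch and parse live pages; the step budget plays a role similar to the "search depth" parameter mentioned in step 4.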

Using WebSailor

WebSailor is the newest component of WebAgent. It is the most capable of the three and is suited to highly complex tasks. The steps to use it are as follows:

  1. Deploy the model:
    Deploy WebSailor via Alibaba Cloud FunctionAI as described earlier. After the deployment is complete, obtain the API endpoint address.
  2. Call the API:
    WebSailor provides an API interface. Users can send query requests from a Python script:

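    import requests

    url = "YOUR_API_ENDPOINT"  # replace with the endpoint obtained after deployment
    query = {"task": "Find the latest progress on open source AI models in 2025"}
    response = requests.post(url, json=query, timeout=60)
    response.raise_for_status()  # fail early on HTTP errors
    print(response.json())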
    
  3. Handle complex tasks:
    WebSailor specializes in multi-step tasks. For example, given the query "Compare the performance difference between open source AI models and commercial models in 2025," the model automatically breaks down the task, searches multiple sources, consolidates the information, and generates a comparison report.
  4. View the logs:
    WebSailor supports logging so that users can examine search paths and reasoning processes. Log files are usually stored in the logs/ folder of the deployment directory.
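
Because multi-step tasks can take a while and a cloud endpoint may be briefly unavailable, a client usually benefits from a timeout and a few retries. The sketch below shows one way to wrap the API call from step 2 using only the standard library. It is an assumption-laden illustration: `post_json` and `query_with_retry` are hypothetical helper names, the `{"task": ...}` payload simply mirrors the example in step 2, and the actual request and response schema depends on your deployment.

```python
import json
import time
import urllib.error
import urllib.request


def post_json(url: str, payload: dict, timeout: float = 60.0) -> dict:
    """POST a JSON body and decode the JSON reply (standard library only)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def query_with_retry(url, task, send=post_json, attempts=3, backoff=1.0, sleep=time.sleep):
    """Send {"task": ...} and retry on network errors with exponential backoff."""
    payload = {"task": task}  # assumed payload shape, taken from the step 2 example
    for attempt in range(attempts):
        try:
            return send(url, payload)
        except (urllib.error.URLError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(backoff * 2 ** attempt)  # wait 1s, 2s, 4s, ... between tries
```

A call such as `query_with_retry("YOUR_API_ENDPOINT", "Compare open source and commercial AI models in 2025")` would then behave like the original snippet, with retries added. The `send` and `sleep` parameters are injectable so the retry logic can be tested without a live endpoint.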

Using WebWalker

WebWalker is a benchmarking tool that developers can use to evaluate model performance. Use it as follows:

  1. Download the dataset:
    WebWalker provides the WebWalkerQA dataset, located in the repository's dataset/ directory. Run the following command to download it:

    wget https://github.com/Alibaba-NLP/WebAgent/raw/main/dataset/webwalkerqa.jsonl
    
  2. Run the test:
    Run the benchmark using the test script:

    python evaluate_webwalker.py --dataset webwalkerqa.jsonl
    
  3. Analyze the results:
    After the test completes, a model performance report with metrics such as accuracy and recall is saved in the results/ directory.
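
For readers who want to see what scoring against a JSONL benchmark file involves, here is a minimal sketch that loads JSONL records and computes exact-match accuracy. It is a simplified stand-in, not the repository's evaluation script: the field names `question` and `answer` are assumptions about the file layout, the sample records are invented placeholders, and exact match is only one simple way to score answers.

```python
import io
import json


def load_jsonl(stream):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in stream if line.strip()]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


def exact_match_accuracy(records, predictions, answer_key="answer"):
    """Fraction of records whose normalized prediction equals the reference."""
    if not records:
        return 0.0
    hits = sum(
        normalize(prediction) == normalize(record[answer_key])
        for record, prediction in zip(records, predictions)
    )
    return hits / len(records)


# Placeholder data in an assumed schema; a real run would open the downloaded file.
sample = io.StringIO(
    '{"question": "Q1", "answer": "Hangzhou"}\n'
    '{"question": "Q2", "answer": "July 2025"}\n'
)
records = load_jsonl(sample)
accuracy = exact_match_accuracy(records, ["hangzhou", "June 2025"])
print(f"accuracy = {accuracy:.2f}")
```

To use it on a real file, replace the `io.StringIO` sample with `open("webwalkerqa.jsonl", encoding="utf-8")` and adjust `answer_key` to whatever field the dataset actually uses.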

Featured Functions

  • SailorFog-QA dataset: Users can directly download the sailorfog-QA.jsonl file for training or evaluating other models. File path:
    WebAgent/dataset/sailorfog-QA.jsonl
    
  • Reinforcement Learning Optimization: WebAgent uses the DUPO algorithm to optimize the model. Developers can refer to the scripts in the train/ directory and adjust hyperparameters to improve model performance.
  • Interactive Demo: WebDancer provides an online demo interface (see the demo link in the repository). Users can try out the model's search capability by entering a query in a browser.

Notes

  • Ensure a stable internet connection, since some functions require access to external web pages.
  • The WebSailor-72B model has high hardware requirements; a high-performance GPU or a cloud service is recommended.
  • Regularly check the GitHub repository for updates to get the latest models and data.

Application Scenarios

  1. Academic research
    WebAgent is well suited to researchers searching for academic papers, conference information or technical reports. For example, given the query "Find paper topics for ACL 2025", WebSailor will automatically crawl the official website and organize the list of topics and related links.
  2. Market analysis
    Business users can use WebAgent to gather information on market trends or competing products. For example, given the query "Latest AI Chip Market Update 2025," the model will integrate news, reports, and social media data.
  3. Daily information queries
    Regular users can use WebDancer to quickly find everyday information, such as "Recommended Best Destinations for 2025," and the model will provide detailed descriptions of locations and travel suggestions.
  4. Developer testing
    WebWalker lets developers test how well models perform at web navigation, helping them optimize search algorithms or build new models.

FAQ

  1. What languages does WebAgent support?
    WebAgent mainly supports English and Chinese search tasks and performs well on the BrowseComp-en (English) and BrowseComp-zh (Chinese) benchmarks.
  2. How does WebSailor compare with commercial search engines?
    WebSailor-72B approaches the level of commercial search engines on complex tasks, especially in multi-step reasoning and information integration. Its open source nature makes it more flexible and better suited to customization.
  3. How do I get the latest updates to WebAgent?
    Users can follow notifications on the GitHub repository (https://github.com/Alibaba-NLP/WebAgent) or check the @Alibaba-NLP account on the X platform for updates.
  4. Is there a fee for WebAgent?
    WebAgent is an open source project, and the code and part of the data are free. Cloud deployment of WebSailor may incur Alibaba Cloud service costs; refer to the official Alibaba Cloud website for pricing.