Overseas access: www.kdjingpai.com
Bookmark Us

ReCall is an open source framework designed to train Large Language Models (LLMs) for tool invocation and inference through reinforcement learning, without relying on supervised data. It allows models to autonomously use and combine external tools, such as search, calculators, etc., to solve complex tasks.ReCall supports user-defined tools and is suitable for developing general-purpose intelligences. The project is based on the Qwen2.5 model and provides synthetic datasets SynTool and MuSiQue datasets to support multi-step task inference. ReCall is an upgraded version of ReSearch, which is more comprehensive and suitable for multi-scenario tool inference development.

 

Function List

  • Training large models through reinforcement learning without supervised data and supporting autonomous tool calls.
  • Support user-defined arbitrary tools, flexible adaptation to a variety of task scenarios.
  • Provides SynTool synthetic datasets with diverse environments and complex multi-step tasks.
  • Integrated FlashRAG evaluation environment for multi-hop Q&A task validation.
  • Efficient sandboxing and modeling services based on FastAPI and SGLang.
  • Support for MuSiQue datasets, combined with Wikipedia search tools for data preprocessing.
  • Detailed scripts and documentation are provided to facilitate user-defined data and model training.

 

Using Help

ReCall is an open source project for developers, hosted on GitHub, that allows users to get started quickly by cloning code, installing dependencies, and running sample scripts. Below is a detailed user guide covering installation, key features, and the development process to help you get started with ReCall from the ground up.

Installation process

  1. Cloning Codebase
    Open a terminal and run the following command to clone the ReCall repository:

    git clone https://github.com/Agent-RL/ReCall.git
    cd ReCall
    
  2. Installation of dependencies
    ReCall depends on Python environment, Python 3.8 or above is recommended. Install the core dependencies:

    pip3 install -e .
    pip3 install flash-attn --no-build-isolation
    

    If you need to run a FlashRAG-based Wikipedia RAG System, additional installation required faiss-gpu::

    conda install -c pytorch -c nvidia faiss-gpu=1.8.0
    

    Dependencies include transformers,vllm==0.8.4,sglang etc., for a complete list see setup.pyThe

  3. Download preprocessed data
    ReCall provides preprocessed SynTool and MuSiQue datasets for training and evaluation. Users can download them directly:

    # 访问提供的下载链接(见 GitHub README)
    

    Alternatively, the user can run data/prepare_musique_recall.py Scripts to customize the generation of MuSiQue data, combined with the Wikipedia search tool.

  4. Starting the modeling service
    ReCall uses SGLang for modeling services. Example of starting a service:

    python3 -m sglang.launch_server \
    --served-model-name {trained/model/name} \
    --model-path {trained/model/path} \
    --tp 2 \
    --context-length 8192 \
    --enable-metrics \
    --dtype bfloat16 \
    --host 0.0.0.0 \
    --port 80 \
    --trust-remote-code \
    --disable-overlap \
    --disable-radix-cache
    

    User replaceable {trained/model/name} cap (a poem) {trained/model/path} for your own model name and path.

Main Functions

The core function of ReCall is to train large models for tool call inference through reinforcement learning. The following are the main features and operation flow:

  1. tool invocation inference
    ReCall allows models to select and use tools autonomously, such as search, calculators, or custom tools. Core Classes ReCall Responsible for the coordination of model generation and tool execution. Users can refer to the scripts/inference/re_call_use_case.py Sample scripts to see how to invoke the tool to accomplish a task. Example:

    • Enter a complex question (e.g. multi-hop quiz).
    • The model selects the appropriate tool (e.g., Wikipedia search) through reinforcement learning.
    • Returns structured inference results.
      The user simply loads the trained model, calls the ReCall class interface, just pass in the question and tool configuration.
  2. Synthetic dataset generation
    ReCall provides SynTool datasets to support the generation of diverse environmental and multi-step task data. Users can run data/prepare_musique_recall.py Scripts that generate custom datasets. Example:

    python data/prepare_musique_recall.py
    

    The script generates training data based on user-defined tools (e.g., Wikipedia search) for complex inference tasks. The generated training data can be used directly for model training.

  3. Multi-hop quiz assessment
    ReCall uses FlashRAG as the evaluation environment for multi-hop quizzes. Users can download evaluation datasets and run tests:

    # 下载 FlashRAG 评估数据(见 GitHub 说明)
    

    The evaluation results validate the model's performance in multi-step inference tasks and are suitable for testing tool-calling capabilities.

  4. Customized tool development
    The user can define any tool to extend the functionality of ReCall. For example, add a calculator tool:

    • Define the tool interface in the code to specify the input and output formats.
    • Integrate tool configurations into ReCall In Class.
    • Run training or inference scripts to test the effectiveness of new tools.
      The implementation can be found in the documentation and sample code provided by GitHub.

Featured Function Operation

ReCall features tool-call inference through reinforcement learning without the need for supervised data. Here is how the featured functionality is used:

  • Intensive Learning Training: ReCall Usage verl framework for reinforcement learning training. Users can configure training parameters and run training scripts:
    # 示例训练命令(具体参数见 GitHub 文档)
    python train.py --config {config_file}
    

    The training process utilizes the SynTool and MuSiQue datasets to optimize the tool selection and inference capabilities of the model.

  • Flexible toolset: Users can define multiple tools (e.g., search, database query, etc.) through a configuration file, and the model automatically selects and combines tools based on the task. Operation steps:
    1. Edit the tool configuration file (e.g. in YAML format).
    2. Load to ReCall Class.
    3. Run the inference script and observe how the model dynamically invokes the tool.
  • Efficient Modeling Services: SGLang-based modeling service supports highly concurrent reasoning and is suitable for production environments. Users can call models via API to handle real-time tasks.

caveat

  • Ensure hardware support (e.g. GPU for faiss-gpu cap (a poem) flash-attn).
  • Regularly check your GitHub repository for updates to get the latest features and fixes.
  • Training and reasoning require large memory and storage space, and advance preparation is recommended.

application scenario

  1. Multi-hop quiz system development
    ReCall can be used to develop question-and-answer systems that require multi-step reasoning. For example, to answer the question "What was the 19th century capital of the country where a historical figure was born?" The model can use Wikipedia search tool for step-by-step reasoning to get an accurate answer. It is suitable for education and knowledge Q&A platform development.
  2. Automated Task Processing
    ReCall can automate complex tasks in conjunction with a variety of tools (e.g., calculators, database queries). For example, organizations can use ReCall to develop intelligence to automatically analyze sales data and generate reports.
  3. AI Research and Experimentation
    Researchers can leverage ReCall's synthetic datasets and reinforcement learning framework to explore the performance of large models in tool invocation and complex reasoning tasks suitable for academic research and algorithm development.

 

QA

  1. What tools does ReCall support?
    ReCall supports user-defined tools such as searches, calculators, database queries, and more. You can add new tools via configuration files, see GitHub for examples.
  2. How do I get started with ReCall?
    Clone the GitHub repository, install the dependencies, download the preprocessed data or generate custom data, and then run the sample script. See the Help for details.
  3. What is the difference between ReCall and ReSearch?
    ReCall is an upgraded version of ReSearch that supports more tools and more complex reasoning tasks and can be used as a direct replacement for ReSearch.
  4. How much storage space is required?
    Training and evaluation require large storage space, e.g. FlashRAG evaluation data can take up tens of gigabytes, so it is recommended to check hardware capacity in advance.
0Bookmarked
0kudos

Recommended

Can't find AI tools? Try here!

Just type in the keyword Accessibility Bing SearchYou can quickly find all the AI tools on this site.

Top

en_USEnglish