The core value of the SailorFog-QA dataset is its innovative difficulty design:
- Information obfuscation: The original data is reconstructed using graph sampling techniques, and key entities (e.g., names of people and organizations) are replaced with synonyms or generalized into attribute descriptions to simulate the incomplete information found in real scenarios. For example, "Transformer model" is rewritten as "attention architecture proposed by Google".
- Multi-hop reasoning challenge: 40% of the questions require reasoning across more than three information sources. For example, "Predicting Tesla's 2025 battery technology route" requires integrating three types of content: patent data, executive interviews, and academic papers.
- Rich evaluation dimensions: In addition to conventional accuracy, the dataset defines dedicated metrics such as information traceability (the quality of the reference links provided) and reasoning interpretability (the completeness of the logical chain).
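To make the entity-generalization idea concrete, here is a toy sketch in Python. It is not the dataset's actual pipeline, which the article says relies on graph sampling. The `GENERALIZATIONS` mapping and the `obfuscate` function are hypothetical names introduced only for illustration.

```python
# Hypothetical mapping from a specific entity to a vaguer description.
# Only the Transformer example comes from the article; the mechanism is assumed.
GENERALIZATIONS = {
    "Transformer model": "attention architecture proposed by Google",
}


def obfuscate(question: str, mapping: dict[str, str]) -> str:
    """Replace each known entity in the question with its generalized description."""
    for entity, description in mapping.items():
        question = question.replace(entity, description)
    return question


fuzzy = obfuscate("Who introduced the Transformer model?", GENERALIZATIONS)
print(fuzzy)
```

A plain string replacement like this only illustrates the effect on a single question; it says nothing about how entities are selected or how the sampled graph is used.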
This dataset contains 120,000 samples in English and Chinese and has been applied in the reinforcement learning fine-tuning phase of WebSailor, yielding a 22.5% improvement in the model's F1 score in fuzzy query scenarios. Researchers can obtain the data from WebAgent/dataset/sailorfog-QA.jsonl. The file is in JSON Lines format, and each entry contains fields such as the original question, the fuzzy question, the golden path, and supporting evidence.
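Because the file is in JSON Lines format, it can be read one record per line. The following is a minimal sketch using only the Python standard library. The key names in the sample line (`original_ask`, `fuzzy_ask`) are assumptions based on the field descriptions above, so check the actual keys in the file before relying on them.

```python
import json
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


def load_jsonl(path: str) -> list[dict]:
    """Read a whole JSON Lines file into a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return list(iter_records(f))


# In-memory sample with assumed (hypothetical) key names.
sample = ['{"original_ask": "a", "fuzzy_ask": "b"}\n', "\n"]
records = list(iter_records(sample))
print(records[0].keys())
```

With the repository checked out, `load_jsonl("WebAgent/dataset/sailorfog-QA.jsonl")` would load the full file. For a dataset of this size, iterating with `iter_records` over the open file avoids holding every record in memory.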
This answer comes from the article "WebAgent: An Intelligent Web Information Search and Processing Tool".





























