Level-Navi Agent: A Framework and benchmark for Chinese Web Search Agents

Authors: Chuanrui Hu ¹, Shichong Xie ¹,², Baoxin Wang ¹, Bin Chen ¹, Xiaofeng Cong ¹, Jun Zhang ²
Institutes:

1. 360 AI Research

2. School of Artificial Intelligence, Anhui University
Corresponding author: Jun Zhang (Email: junzhang@ahu.edu.cn)
Submit Time: 2024-12-25 14:15:19

Abstract: Large language models (LLMs), with their ability to understand human language, are driving the development of artificial intelligence (AI) web search agents. Compared to traditional search engines, LLM-powered AI search agents can understand and respond to complex queries in greater depth, enabling more accurate operations and better context recognition. However, little attention and effort have been devoted to Chinese web search, and as a result the capabilities of open-source models have not been evaluated uniformly and fairly. The difficulty lies in the lack of three things: a unified agent framework, an accurately labeled dataset, and a suitable evaluation metric. To address these issues, we propose Level-Navi Agent, a general-purpose, training-free web search agent based on level-aware navigation, accompanied by a well-annotated dataset (Web24) and a suitable evaluation metric. Level-Navi Agent can reason through complex user questions and conduct searches across various levels of the internet to gather the information needed to answer them. We also provide a comprehensive evaluation of state-of-the-art LLMs under fair settings. To facilitate future research, the source code is available on GitHub.

Keywords: Web Search Agent; Benchmarking; Evaluation Metrics; Large Language Model

From: Shichong Xie (谢世翀)
Subject: Computer Science >> Natural Language Understanding and Machine Translation
Contribution: Not submitted
Cite as: ChinaXiv:202412.00330 (or this version ChinaXiv:202412.00330V1)
DOI:10.12074/202412.00330
CSTR:32003.36.ChinaXiv.202412.00330
TXID: 05bcc0a6-14ab-4bce-8e31-90a814824b5e
Recommended reference: Chuanrui Hu, Shichong Xie, Baoxin Wang, Bin Chen, Xiaofeng Cong, Jun Zhang. Level-Navi Agent: A Framework and benchmark for Chinese Web Search Agents. ChinaXiv (中国科学院科技论文预发布平台, the Chinese Academy of Sciences preprint platform for scientific papers). [DOI:10.12074/202412.00330]

Version History

[V1] 2024-12-25 14:15:19, ChinaXiv:202412.00330V1


