MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios
#MobilityBench #Route-planning agents #Large language models #Evaluation benchmark #Real-world scenarios #Amap #API-replay sandbox #Preference-constrained planning
📌 Key Takeaways
- MobilityBench provides a scalable benchmark for evaluating LLM-based route-planning agents
- The benchmark uses real-world anonymized queries from Amap across multiple cities
- Researchers developed a deterministic API-replay sandbox for reproducible evaluations
- Current models struggle with preference-constrained route-planning tasks
- The benchmark, toolkit, and documentation have been publicly released for research use
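To make the idea of a deterministic API-replay sandbox concrete, here is a minimal sketch of the general technique: tool responses are recorded once, keyed by a hash of the canonicalized request, and later served from that store instead of a live API. This is an illustration under stated assumptions, not MobilityBench's actual implementation; the class name `ReplaySandbox`, the tool name `route_plan`, and the request and response fields are all hypothetical.

```python
import copy
import hashlib
import json


class ReplaySandbox:
    """Serves pre-recorded responses so repeated runs see identical data.

    Hypothetical sketch; not the MobilityBench toolkit's real interface.
    """

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(tool, params):
        # Canonical JSON (sorted keys) so argument order does not change the key.
        payload = json.dumps(
            {"tool": tool, "params": params}, sort_keys=True, ensure_ascii=False
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def record(self, tool, params, response):
        # Store a private copy so later mutation by the caller has no effect.
        self._store[self._key(tool, params)] = copy.deepcopy(response)

    def call(self, tool, params):
        key = self._key(tool, params)
        if key not in self._store:
            # Fail loudly rather than fall back to a live, non-reproducible call.
            raise KeyError(f"no recorded response for {tool} with {params}")
        # Return a copy so an agent cannot alter what the next run will see.
        return copy.deepcopy(self._store[key])


# Hypothetical recording: one driving route between two placeholder points.
sandbox = ReplaySandbox()
sandbox.record(
    "route_plan",
    {"origin": "A", "destination": "B", "mode": "driving"},
    {"distance_m": 1200, "duration_s": 300},
)

# Same request with keys in a different order resolves to the same recording.
replayed = sandbox.call(
    "route_plan", {"mode": "driving", "destination": "B", "origin": "A"}
)
```

Raising on an unrecorded request, instead of silently querying a live service, is the design choice that keeps evaluations reproducible: every agent run is scored against the same frozen data.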
🏷️ Themes
Artificial Intelligence, Route Planning, Evaluation Benchmarking, Human Mobility
📚 Related People & Topics
AutoNavi
Chinese provider of digital map content, navigation, and location-based solutions
AutoNavi Software Co., Ltd. (simplified Chinese: 高德软件有限公司; traditional Chinese: 高德軟件有限公司; pinyin: Gāodé Ruǎnjiàn Yǒuxiàn Gōngsī) is a Chinese web mapping, navigation and location-based services provider, founded in 2001. One of its subsidiary companies, Beijing Mapabc Co.
Large language model
Type of machine learning model
A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation. The largest and most capable LLMs are generative pre-trained transformers (GPTs) that provide the c...