- NoRD achieves competitive autonomous driving performance using less than 60% of the training data
- The model eliminates the need for dense reasoning annotations
- It uses 3× fewer tokens than existing VLA models
- The Dr. GRPO algorithm overcomes the difficulty bias that arises when GRPO is applied to small, reasoning-free datasets
📖 Full Retelling
Researchers Ishaan Rawal, Shubh Gupta, Yihan Hu, and Wei Zhan introduced NoRD, a data-efficient Vision-Language-Action (VLA) model for autonomous driving, in a paper submitted to arXiv on February 24, 2026 and accepted to CVPR 2026. NoRD targets two expensive requirements that current autonomous driving systems face: massive dataset collection and dense reasoning annotations.

VLA models advance autonomous driving by replacing traditional modular pipelines with unified end-to-end architectures that process visual, linguistic, and action data jointly. The researchers observed, however, that existing VLAs depend on large corpora with dense reasoning annotations, which creates substantial barriers to development and deployment. NoRD overcomes both challenges: it achieves competitive performance while being fine-tuned on less than 60% of the data and with no reasoning annotations, resulting in 3× fewer tokens than existing models.

The researchers also found that standard Group Relative Policy Optimization (GRPO) fails to yield significant improvements when applied to policies trained on such small, reasoning-free datasets. They trace this limitation to difficulty bias, which disproportionately penalizes reward signals from scenarios that produce high-variance rollouts. To address it, NoRD incorporates Dr. GRPO, a recent algorithm designed to mitigate difficulty bias in large language models. As a result, NoRD achieves competitive performance on the Waymo and NAVSIM benchmarks with a fraction of the training data and no reasoning overhead.
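The difficulty-bias fix can be sketched at the level of advantage computation. A minimal illustration, assuming the published Dr. GRPO formulation (which drops GRPO's per-group standard-deviation normalization); the function names are illustrative and not from the paper:

```python
import numpy as np

def grpo_advantages(rewards):
    """Standard GRPO: group-relative advantages normalized by the group's
    reward std. Dividing by the std shrinks the signal from high-variance
    rollout groups and inflates it for low-variance ones (difficulty bias)."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)  # epsilon avoids divide-by-zero

def dr_grpo_advantages(rewards):
    """Dr. GRPO: keep the group-mean baseline but drop the std division,
    so each scenario's reward signal retains its natural scale."""
    r = np.asarray(rewards, dtype=float)
    return r - r.mean()
```

For a high-variance group of rollout rewards such as `[2, -2, 2, -2]`, GRPO's std division halves every advantage, while a low-variance group like `[0.1, -0.1]` is scaled up to the same unit magnitude, so the two scenarios contribute equally strong gradients despite very different reward spreads. Dr. GRPO removes that rescaling, which is the behavior NoRD relies on when fine-tuning on small, reasoning-free driving data.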
🏷️ Themes
Artificial Intelligence, Autonomous Driving, Data Efficiency, Computer Vision
Source: arXiv:2602.21172 [cs.AI] (Computer Science > Artificial Intelligence), submitted February 24, 2026; accepted to CVPR 2026
Title: NoRD: A Data-Efficient Vision-Language-Action Model that Drives without Reasoning
Authors: Ishaan Rawal, Shubh Gupta, Yihan Hu, Wei Zhan
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
DOI: https://doi.org/10.48550/arXiv.2602.21172 (arXiv-issued DOI via DataCite, pending registration)

Abstract: Vision-Language-Action models are advancing autonomous driving by replacing modular pipelines with unified end-to-end architectures. However, current VLAs face two expensive requirements: (1) massive dataset collection, and (2) dense reasoning annotations. In this work, we address both challenges with NoRD (No Reasoning for Driving). Compared to existing VLAs, NoRD achieves competitive performance while being fine-tuned on <60% of the data and no reasoning annotations, resulting in 3× fewer tokens. We identify that standard Group Relative Policy Optimization fails to yield significant improvements when applied to policies trained on such small, reasoning-free datasets. We show that this limitation stems from difficulty bias, which disproportionately penalizes reward signals from scenarios that produce high-variance rollouts within GRPO. NoRD overcomes this by incorporating Dr. GRPO, a recent algorithm designed to mitigate difficulty bias in LLMs. As a result, NoRD achieves competitive performance on Waymo and NAVSIM with a fraction of the training data and no reasoning overhead, enabling more efficient autonomous systems.