Hugging Face has partnered with AI startup Yaak to expand its LeRobot platform with a new self-driving training dataset, Learning to Drive (L2D). This dataset, over a petabyte in size, consists of sensor data collected from German driving schools, including camera, GPS, and vehicle dynamics data from both instructors and students navigating diverse road conditions.
Unlike existing self-driving datasets from companies like Waymo and Comma AI, which emphasize object detection and tracking, L2D is designed to support end-to-end learning. In this approach, an AI model makes its predictions, such as when a pedestrian might cross the street, directly from sensor inputs like camera footage.
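To make the idea of end-to-end learning concrete, here is a minimal toy sketch in Python. It is not the L2D format, the LeRobot API, or either company's method: the three-value observation, the synthetic "expert" steering rule, and the function names `make_episode`, `train_policy`, and `predict` are all assumptions made up for illustration. It only shows the general pattern of fitting a model that maps observations straight to a driving action from recorded (observation, action) pairs, with no separate object-detection stage.

```python
import random


def make_episode(n_steps, rng):
    """Build a synthetic stand-in for one driving episode.

    Each frame pairs a hypothetical observation
    [lane_offset, heading_error, speed] with the steering command a
    made-up expert driver would give. Real sensor data is far richer.
    """
    frames = []
    for _ in range(n_steps):
        obs = [rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(0, 1)]
        # Assumed expert rule: steer against lane offset and heading error.
        steering = -0.8 * obs[0] - 0.5 * obs[1]
        frames.append((obs, steering))
    return frames


def predict(weights, obs):
    """Linear policy: observation in, steering command out."""
    return sum(w * x for w, x in zip(weights, obs))


def train_policy(frames, lr=0.1, epochs=50):
    """Fit the linear policy by stochastic gradient descent on squared error."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for obs, target in frames:
            err = predict(weights, obs) - target
            weights = [w - lr * err * x for w, x in zip(weights, obs)]
    return weights


rng = random.Random(0)  # fixed seed so the run is repeatable
episode = make_episode(200, rng)
weights = train_policy(episode)

# The learned policy now maps a new observation directly to an action.
print(predict(weights, [1.0, 0.0, 0.5]))
```

A real end-to-end driving model would replace the linear policy with a neural network and the three-number observation with camera frames and other sensor streams, but the training signal has the same shape: recorded inputs paired with what the driver actually did.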
Harsimrat Sandhawalia, Yaak’s co-founder, and Remi Cadene from Hugging Face emphasized that L2D aims to be the largest open-source self-driving dataset, offering AI researchers a rich set of driving “episodes” to train spatial intelligence models. Hugging Face and Yaak plan real-world closed-loop testing of AI models trained with L2D and LeRobot, deploying them on vehicles with safety drivers.
The companies are encouraging AI developers to submit models, along with the driving tasks they want those models tested on, such as navigating roundabouts and parking, to advance self-driving AI capabilities.