Available free to researchers, the Waymo Open Dataset comprises high-resolution sensor data collected by Waymo self-driving vehicles. The company's vehicles have driven over 10 million autonomous miles in 25 cities.
"This rich and diverse set of real world experiences," says the company, "has helped our engineers and researchers develop Waymo’s self-driving technology and innovative models and algorithms. We believe it is one of the largest, richest, and most diverse self-driving datasets ever released for research."
The dataset covers a wide variety of environments, from dense urban centers to suburban landscapes, with data collected during day and night, at dawn and dusk, and in sunshine and rain. It includes footage from the company's high-definition cameras, along with 1.2 million 2D labels:
- Size and coverage: The release contains data from 1,000 driving segments. Each segment captures 20 seconds of continuous driving at 10 Hz per sensor, which works out to 200 frames per segment and 200,000 frames per sensor across the release. Such continuous footage gives researchers the opportunity to develop models to track and predict the behavior of other road users.
- Diverse driving environments: The dataset covers dense urban and suburban environments across Phoenix, AZ; Kirkland, WA; Mountain View, CA; and San Francisco, CA, capturing a wide spectrum of driving conditions (day and night, dawn and dusk, sun and rain).
- High-resolution, 360° view: Each segment contains sensor data from five high-resolution Waymo lidars and five front-and-side-facing cameras.
- Dense labeling: The dataset includes lidar frames and images with vehicles, pedestrians, cyclists, and signage carefully labeled, capturing a total of 12 million 3D labels and 1.2 million 2D labels.
- Camera-lidar synchronization: The company has been working on 3D perception models that fuse data from multiple cameras and lidar. It has designed the entire self-driving system, including hardware and software, to work seamlessly together, which extends to the choice of sensor placement and high-quality temporal synchronization.
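As a quick check on the size figures in the list above, the frame count follows directly from the number of segments, the segment length, and the sensor rate. The Python sketch below is illustrative only: the constant and function names are assumptions made for this example and are not part of any Waymo tooling.

```python
# Figures quoted in the release description; names are hypothetical.
NUM_SEGMENTS = 1_000     # driving segments in the release
SEGMENT_SECONDS = 20     # seconds of continuous driving per segment
FRAME_RATE_HZ = 10       # frames per second, per sensor


def frames_per_segment(seconds: int = SEGMENT_SECONDS,
                       rate_hz: int = FRAME_RATE_HZ) -> int:
    """Frames one sensor records during a single segment."""
    return seconds * rate_hz


def total_frames(num_segments: int = NUM_SEGMENTS,
                 seconds: int = SEGMENT_SECONDS,
                 rate_hz: int = FRAME_RATE_HZ) -> int:
    """Frames one sensor records across every segment in the release."""
    return num_segments * frames_per_segment(seconds, rate_hz)


if __name__ == "__main__":
    print(frames_per_segment())  # 200
    print(total_frames())        # 200000
```

Under these figures, the 200,000-frame total describes the whole release per sensor, while each 20-second segment contributes 200 frames.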
"When it comes to research in machine learning, having access to data can turn an idea into a