Overview | The Data

What is Argoverse?

  • One dataset with 3D tracking annotations for 113 scenes
  • One dataset with 327,793 interesting vehicle trajectories extracted from over 1000 driving hours
  • Two high-definition (HD) maps with lane centerlines, traffic direction, ground height, and more
  • One API to connect the map data with sensor information

Terms of Use

We created Argoverse to support advancements in 3D tracking, motion forecasting, and other perception tasks for self-driving vehicles. We offer it free of charge under a Creative Commons Attribution-NonCommercial-ShareAlike license. Please visit our Terms of Use for details on licenses and all applicable terms and conditions.

Citation

In June 2019, we released the Argoverse datasets to coincide with the publication of our paper, Argoverse: 3D Tracking and Forecasting with Rich Maps, at CVPR 2019. When referencing this publication or any of the materials we provide, please use the following citation:

@INPROCEEDINGS{Argoverse,
  author = {Ming-Fang Chang and John W Lambert and Patsorn Sangkloy and Jagjeet Singh
       and Slawomir Bak and Andrew Hartnett and De Wang and Peter Carr
       and Simon Lucey and Deva Ramanan and James Hays},
  title = {Argoverse: 3D Tracking and Forecasting with Rich Maps},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2019}
}

Where was the data collected?

The data in Argoverse comes from a subset of the area in which Argo AI’s self-driving test vehicles are operating in Miami and Pittsburgh — two US cities with distinct urban driving challenges and local driving habits. We include recordings of our sensor data, or "log segments," across different seasons, weather conditions, and times of day to provide a broad range of real-world driving scenarios.

Total lane coverage: 204 linear kilometers in Miami and 86 linear kilometers in Pittsburgh.

Miami

Beverly Terrace, Edgewater, Town Square

Pittsburgh

Downtown, Strip District, Lower Lawrenceville

How was the data collected?

We collected all of our data using a fleet of identical Ford Fusion Hybrids, fully integrated with Argo AI self-driving technology. We include data from two LiDAR sensors, seven ring cameras, and two front-facing stereo cameras. All sensors are roof-mounted:

LiDAR

  • 2 roof-mounted LiDAR sensors
  • Overlapping 40° vertical field of view
  • Range of 200m
  • On average, our LiDAR sensors produce a point cloud with ~ 107,000 points at 10 Hz

Localization

We use a city-specific coordinate system for vehicle localization. We include 6-DOF localization for each timestamp, from a combination of GPS-based and sensor-based localization methods.
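
As a minimal sketch of how a 6-DOF pose is used, the snippet below applies a made-up rotation and translation to move points from the ego-vehicle frame into the city frame; the values are placeholders, not poses from any real log.

import numpy as np

# Hypothetical pose for illustration only: a yaw-only rotation plus a translation
# expressed in the city coordinate frame (not values from any real log).
yaw = np.deg2rad(30.0)
R = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
              [np.sin(yaw),  np.cos(yaw), 0.0],
              [0.0,          0.0,         1.0]])
t = np.array([2500.0, 1200.0, 20.0])  # meters, city frame

points_ego = np.array([[10.0, 0.0, 0.5],   # a point 10 m ahead of the vehicle
                       [0.0,  2.0, 0.5]])  # a point 2 m to the left

# p_city = R @ p_ego + t, applied row-wise
points_city = points_ego @ R.T + t
print(points_city)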

Cameras

  • Seven high-resolution ring cameras (1920 x 1200) recording at 30 Hz with a combined 360° field of view
  • Two front-facing stereo cameras (2056 x 2464) sampled at 5 Hz

Calibration

Sensor measurements for each driving session are stored in “logs.” For each log, we provide intrinsic and extrinsic calibration data for LiDAR and all nine cameras.
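
As a rough illustration of how the calibration data is used, the sketch below projects a single LiDAR return into a ring-camera image with a standard pinhole model; the intrinsic and extrinsic values are hypothetical placeholders, not values from our calibration files.

import numpy as np

# Hypothetical calibration for one ring camera (1920 x 1200); real values come from
# each log's calibration data.
K = np.array([[1400.0,    0.0, 960.0],
              [   0.0, 1400.0, 600.0],
              [   0.0,    0.0,   1.0]])   # intrinsics
R = np.array([[0.0, -1.0,  0.0],          # extrinsic rotation: ego (x fwd, y left, z up)
              [0.0,  0.0, -1.0],          # -> camera (x right, y down, z forward)
              [1.0,  0.0,  0.0]])
t = np.zeros(3)                           # extrinsic translation (camera at ego origin here)

lidar_ego = np.array([[12.0, 1.5, 0.3]])  # one LiDAR return in the ego frame, in meters

pts_cam = (lidar_ego - t) @ R.T           # ego frame -> camera frame
uvw = pts_cam @ K.T                       # pinhole projection
uv = uvw[:, :2] / uvw[:, 2:3]             # perspective divide -> pixel coordinates
print(uv)                                 # roughly [[785., 565.]]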


Argoverse Maps

Our maps contain rich geometric and semantic metadata for better 3D scene understanding. From ground height to the distance remaining until the next intersection, our maps enable researchers to explore the potential of HD maps in robotic perception.

There are three distinct components that set our maps apart:

Vector Map: Lane-Level Geometry

Our semantic vector map conveys useful lane-level detail, such as lane centerlines, traffic direction, and intersection annotations. Through these features and more, users can explore the many ways traffic flows through city streets and complicated intersections in our test areas, and access a comprehensive picture of what comes before and after each scene.

Rasterized Map: Drivable Area

Our maps include binary drivable area labels at one-meter grid resolution. A drivable area is an area in which it is possible for a vehicle to drive (though not necessarily legally). Our track annotations, outlined in 3D Tracking, extend to five meters beyond the drivable area. We call this larger area our "region of interest."
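
To make the region of interest concrete, the small sketch below buffers a toy drivable-area raster by 5 meters (5 cells at one-meter resolution) using a binary dilation; the raster values are invented, and the dilation is only an approximation of the buffer described above.

import numpy as np
from scipy import ndimage

# Toy 1 m-resolution drivable-area raster (True = drivable); not real map data.
drivable = np.zeros((40, 40), dtype=bool)
drivable[10:30, 10:30] = True

# The region of interest extends 5 m beyond the drivable area, i.e. 5 cells here.
structure = ndimage.generate_binary_structure(2, 2)  # 8-connected neighborhood
roi = ndimage.binary_dilation(drivable, structure=structure, iterations=5)
print(drivable.sum(), roi.sum())  # the ROI covers more cells than the drivable area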

Rasterized Map: Ground Height

Our maps include real-valued ground height at one-meter resolution. With our map tools, users can remove LiDAR returns from the ground, even on uneven surfaces, for easier object detection.

Argoverse Map API

Our Map API enables users to more easily leverage our rich data and even take advantage of open source mapping tools built to handle OpenStreetMap formats.

We provide this API in Python and include a handful of useful functions, such as:

  • remove_non_driveable_area_points: Use the rasterized drivable-area ROI to decimate a LiDAR point cloud to only ROI points.
  • remove_ground_surface: Remove all 3D points within 30 cm of the ground surface.
  • get_ground_height_at_xy: Get the ground height at the provided (x, y) coordinates.
  • render_local_map_bev_cv2: Render a bird's-eye view (BEV) of the local map in OpenCV.
  • render_local_map_bev_mpl: Render a bird's-eye view (BEV) of the local map in Matplotlib.
  • get_nearest_centerline: Retrieve the nearest lane centerline polyline.
  • get_lane_direction: Retrieve the most probable tangent vector ∈ ℝ² to the lane centerline.
  • get_semantic_label_of_lane: Provide semantic attributes of the lane segment, including is_intersection, turn_direction, and has_traffic_control.
  • get_lane_ids_in_xy_bbox: Get all lane IDs within a Manhattan-distance search radius in the xy plane.
  • get_lane_segment_predecessor_ids: Retrieve all lane IDs with an incoming edge into the query lane segment in the semantic graph.
  • get_lane_segment_adjacent_ids: Retrieve all lane IDs that are left or right neighbors of the query lane segment.
  • get_lane_segment_centerline: Retrieve the polyline coordinates of the query lane segment ID.
  • get_lane_segment_polygon: Hallucinate a lane polygon around a centerline using the average lane width.
  • get_lane_segments_containing_xy: Find lane IDs whose hallucinated lane polygons contain the query point.
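
A minimal usage sketch follows, assuming the Python package from our GitHub repository exposes these functions on an ArgoverseMap class; the query coordinates below are made up.

import numpy as np
from argoverse.map_representation.map_api import ArgoverseMap

avm = ArgoverseMap()  # loads the HD maps bundled with the API

# Look up lane segments near a (made-up) point in Miami ("MIA").
query_x, query_y, city_name = 2300.0, 1200.0, "MIA"
lane_ids = avm.get_lane_ids_in_xy_bbox(
    query_x, query_y, city_name, query_search_range_manhattan=20.0
)

for lane_id in lane_ids[:3]:
    centerline = avm.get_lane_segment_centerline(lane_id, city_name)  # (N, 3) polyline
    print(lane_id, centerline.shape)

# Ground height (meters) at the query point, from the rasterized ground-height map.
ground_z = avm.get_ground_height_at_xy(np.array([[query_x, query_y, 0.0]]), city_name)
print(ground_z)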

To download our maps, visit our downloads page.

To access our API, visit Argoverse GitHub.


Argoverse 3D Tracking

A dataset to train and validate 3D tracking models

Argoverse 3D Tracking is a collection of 113 log segments with 3D object tracking annotations. These log segments, which we call “sequences,” vary in length from 15 to 30 seconds and collectively contain a total of 11,319 tracks.

Each sequence in our training and validation sets includes annotations for all objects within 5 meters of what we identify as the “drivable area” — the area in which it is possible for a vehicle to drive.

What makes this dataset stand out?

Users can build algorithms that take advantage of the detailed information in Argoverse's HD Maps. For example, an algorithm can use the map to perform ground removal on LiDAR returns or to constrain vehicle orientation based on lane direction.
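
As one example of this kind of map-aware processing, the sketch below loads a LiDAR sweep, moves it into the city frame with the per-sweep pose, and applies the map's ROI and ground-removal helpers. The loader interface, method names, and paths are assumptions about the argoverse-api package rather than a definitive recipe.

from argoverse.data_loading.argoverse_tracking_loader import ArgoverseTrackingLoader
from argoverse.map_representation.map_api import ArgoverseMap

loader = ArgoverseTrackingLoader("argoverse-tracking/sample/")  # path is an assumption
log = loader.get(loader.log_list[0])                            # first log in the directory
avm = ArgoverseMap()

idx = 0
lidar_ego = log.get_lidar(idx)       # (N, 3) LiDAR points in the ego-vehicle frame
city_to_ego = log.get_pose(idx)      # SE(3) pose of the ego vehicle in the city frame
lidar_city = city_to_ego.transform_point_cloud(lidar_ego)

# Keep only returns inside the region of interest, then drop near-ground points.
roi_points = avm.remove_non_driveable_area_points(lidar_city, log.city_name)
non_ground = avm.remove_ground_surface(roi_points, log.city_name)
print(lidar_city.shape, roi_points.shape, non_ground.shape)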

  • Segment duration: 15-30 seconds
  • Total number of segments: 113
  • Total tracked objects: 11,319

Data Annotation

Argoverse contains amodal 3D bounding cuboids on all objects of interest on or near the drivable area. By “amodal” we mean that the 3D extent of each cuboid represents the spatial extent of the object in 3D space — and not simply the extent of observed pixels or observed LiDAR returns, which is smaller for occluded objects and ambiguous for objects seen from only one face.

Our amodal annotations are automatically generated by fitting cuboids to each object’s LiDAR returns observed throughout an entire tracking sequence. If the full spatial extent of an object is ambiguous in one frame, information from previous or later frames can be used to constrain the shape. The size of amodal cuboids is fixed over time, so the few objects in the dataset that dynamically change size (e.g., a car opening a door) cause an imperfect amodal cuboid fit.

To create amodal cuboids, we identify the points that belong to each object at every timestep. This information, as well as the orientation of each object, comes from human annotators.
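
For readers who want to work with cuboids directly, here is a simplified sketch that turns a centroid, dimensions, and heading into the 8 cuboid corners; note that the released annotations store a full 3D orientation, which is reduced to a single yaw angle here.

import numpy as np

def cuboid_corners(center, length, width, height, yaw):
    """Return the 8 corners of a cuboid with the given centroid, size, and heading."""
    x = np.array([ 1,  1,  1,  1, -1, -1, -1, -1]) * (length / 2.0)  # along the heading
    y = np.array([ 1, -1, -1,  1,  1, -1, -1,  1]) * (width / 2.0)
    z = np.array([ 1,  1, -1, -1,  1,  1, -1, -1]) * (height / 2.0)
    corners = np.stack([x, y, z], axis=1)
    R = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
                  [np.sin(yaw),  np.cos(yaw), 0.0],
                  [0.0,          0.0,         1.0]])
    return corners @ R.T + np.asarray(center)

# A car-sized cuboid, heading rotated 90 degrees from the x-axis (made-up values).
print(cuboid_corners((5.0, 2.0, 0.8), length=4.5, width=1.8, height=1.6, yaw=np.pi / 2))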

We provide ground truth labels for 17 object classes. Two of these classes, ON_ROAD_OBSTACLE and OTHER_MOVER, cover static and dynamic objects that lie outside of the key categories we defined. The distribution of these object classes across all of the annotated objects in Argoverse 3D Tracking looks like this:

For more information on our 3D tracking dataset, see our tutorial.

To download, visit our downloads page.


Argoverse Motion Forecasting

A dataset to train and validate motion forecasting models

Argoverse Motion Forecasting is a curated collection of 327,793 scenarios, each 5 seconds long, for training and validation. Each scenario contains the 2D, bird's-eye-view centroid of each tracked object, sampled at 10 Hz.
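
A minimal loading sketch, assuming each scenario is stored as a CSV file with TIMESTAMP, TRACK_ID, OBJECT_TYPE, X, Y, and CITY_NAME columns (the file path below is a placeholder):

import pandas as pd

df = pd.read_csv("forecasting/train/data/1.csv")  # one 5-second scenario

# The focal "AGENT" track: 5 seconds sampled at 10 Hz -> roughly 50 (x, y) centroids.
agent = df[df["OBJECT_TYPE"] == "AGENT"].sort_values("TIMESTAMP")
trajectory = agent[["X", "Y"]].to_numpy()
print(trajectory.shape, agent["CITY_NAME"].iloc[0])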

To create this collection, we sifted through more than 1000 hours of driving data from our fleet of self-driving test vehicles to find the most challenging segments — including segments that show vehicles at intersections, vehicles taking left or right turns, and vehicles changing lanes.

What makes this dataset stand out?

This dataset is far larger than what can currently be mined from publicly available self-driving datasets, and our HD maps make it easier to predict the motion of objects.

  • Segment duration: 5 seconds
  • Total number of segments: 327,793
  • Total time: 320 hours

A still frame from one of the motion forecasting sequences, showing the trajectories for the agent of interest (red), self-driving vehicle (green), and all other objects of interest in the scene (light blue).

For more information on our Motion Forecasting dataset, see our tutorial.

To download, visit our downloads page and select the Motion Forecasting dataset file.


Download. Get Started.

Argoverse is provided free of charge under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License. Argoverse code and APIs are provided under the MIT license. Before downloading, please view our full Terms of Use.

Curious about the quality of our datasets? Sample one log from each. To get started, access our API and download our maps from our downloads page.