The Argoverse Stereo and Argoverse Motion Forecasting challenges closed on June 13, 2021. Want to learn about the winning methods? Check out our presentation for the CVPR 2021 Workshop on Autonomous Driving, where we unpack all of that and more.
Our leaderboards remain open for 3D Detection and 3D Tracking. In addition, Carnegie Mellon University’s Streaming Perception Challenge uses Argoverse data as part of the Argoverse competition series. While Argo AI does not endorse the results or conclusions of any third party, we are pleased to see applications of Argoverse data and encourage participation.
Be sure to subscribe to our mailing list to receive future Argoverse announcements.
The goal of the 3D tracking task is to track objects from 17 classes through 15-30 second log segments. To do this, users leverage the 113 sequences in the Argoverse 3D Tracking dataset.
The 89 training and validation sequences include 3D cuboid annotations; the test sequences contain sensor data only.
The goal of the motion forecasting task is to predict the location of a tracked object 3 seconds into the future, given an initial 2-second observation. To do this, users leverage the 324,557 sequences in the Argoverse Motion Forecasting dataset.
The training, validation, and test sequences are taken from different areas of Pittsburgh and Miami so that there is no geographical overlap. Each sequence includes one interesting tracked object, which we label as the “agent.” Agents are objects that follow more complex trajectories, such as changing lanes, navigating intersections, and turning.
Each training and validation sequence is 5 seconds long; each test sequence contains only the first 2 seconds. Given those first 2 seconds (20 frames), users are tasked with predicting the coordinates the agent will travel to over the next 3 seconds (30 frames) of the full 5-second segment (50 frames).
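The frame arithmetic above can be sketched with a few lines of NumPy. This is a minimal illustration, not the Argoverse API: the trajectory array and the constant names are hypothetical, assuming 50 (x, y) positions sampled over the 5-second segment.

```python
import numpy as np

# Hypothetical agent trajectory: 50 (x, y) positions spanning 5 seconds.
rng = np.random.default_rng(0)
trajectory = np.cumsum(rng.random((50, 2)), axis=0)

OBS_FRAMES = 20   # first 2 seconds, given as model input
PRED_FRAMES = 30  # next 3 seconds, to be predicted

observed = trajectory[:OBS_FRAMES]      # what a forecasting model sees
ground_truth = trajectory[OBS_FRAMES:]  # what it is scored against

print(observed.shape, ground_truth.shape)  # → (20, 2) (30, 2)
```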
To explore this task ourselves, we ran a few baselines — one of which you can view in the qualitative results below. The orange trajectory represents the motion of the observed agent over the initial 2 seconds, the green trajectories represent our top-k forecasts, and the red trajectory represents the ground truth.
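Top-k forecasts are commonly scored with the minimum Average Displacement Error (minADE): the mean Euclidean distance to the ground truth, taken over the best of the k candidates. The sketch below is an illustrative implementation under that assumption, not the official evaluation code; the function name and array shapes are our own.

```python
import numpy as np

def min_ade(forecasts, ground_truth):
    """Minimum Average Displacement Error over k candidate forecasts.

    forecasts:    (k, 30, 2) array of candidate (x, y) trajectories
    ground_truth: (30, 2) array of observed future positions
    """
    # Per-frame Euclidean distance for each candidate: shape (k, 30).
    dists = np.linalg.norm(forecasts - ground_truth, axis=-1)
    # Average over frames, then keep only the best candidate.
    return dists.mean(axis=-1).min()

# Toy example: one of three candidates matches the ground truth exactly.
gt = np.zeros((30, 2))
candidates = np.stack([gt + 1.0, gt, gt - 2.0])
print(min_ade(candidates, gt))  # → 0.0
```

Because only the closest candidate counts, minADE rewards producing a diverse set of forecasts that covers the plausible futures, rather than k near-duplicates.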