The Argoverse Stereo and Argoverse Motion Forecasting challenges are open through June 13th, 2021, and feature a total of $8,000 in prizes:
● First place: $2,000
● Honorable mentions: $1,000
Winners will be spotlighted during our presentation at the CVPR 2021 Workshop on Autonomous Driving.
To access the challenge rules, please visit our evaluation servers. For any questions, contact our team on GitHub. We look forward to seeing your submissions!
While submissions won’t be considered for our CVPR 2021 prizes, our leaderboards remain open for 3D Detection and 3D Tracking, and we encourage you to use them. In addition, Carnegie Mellon University’s Streaming Perception Challenge uses Argoverse data as part of the Argoverse competition series. Winners of the Streaming Perception Challenge will be featured during the CVPR 2021 workshop alongside winners of Argoverse Stereo and Argoverse Motion Forecasting. While Argo AI does not endorse the results or conclusions of any third party, we are pleased to see applications of Argoverse data and encourage participation.
Be sure to subscribe to our mailing list to receive future Argoverse announcements.
The goal of the 3D tracking task is to track objects from 17 object classes across 15-30 second log segments. To do this, users leverage the 113 sequences in the Argoverse 3D Tracking dataset.
The 89 training and validation sequences include 3D cuboid annotations; the test sequences contain sensor data only.
The goal of the motion forecasting task is to predict the location of a tracked object 3 seconds into the future, given an initial 2-second observation. To do this, users leverage the 324,557 sequences in the Argoverse Motion Forecasting dataset.
The training, validation, and test sequences are taken from different areas of Pittsburgh and Miami so that there is no geographical overlap. Each sequence includes one interesting tracked object, which we label as the “agent.” Agents are objects that follow more complex trajectories, such as changing lanes, navigating intersections, and turning.
Each training and validation sequence is 5 seconds (50 frames) long, sampled at 10 Hz. The test sequences contain only the first 2 seconds (20 frames); users are tasked with predicting the coordinates the agent will travel to during the remaining 3 seconds (30 frames) of the full 5-second segment.
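The observed/future split above can be sketched in plain NumPy. This is an illustrative sketch, not the Argoverse API: we assume a sequence is stored as a (50, 2) array of the agent's (x, y) map coordinates at 10 Hz, and the function name is our own.

```python
import numpy as np

OBS_FRAMES = 20   # first 2 seconds at 10 Hz (given as input)
PRED_FRAMES = 30  # next 3 seconds at 10 Hz (to be predicted)

def split_sequence(agent_xy: np.ndarray):
    """Split a full 5-second agent trajectory into observed and future parts.

    agent_xy: (50, 2) array of (x, y) coordinates, one row per 0.1 s frame.
    Returns (observed, future) with shapes (20, 2) and (30, 2).
    """
    assert agent_xy.shape == (OBS_FRAMES + PRED_FRAMES, 2)
    return agent_xy[:OBS_FRAMES], agent_xy[OBS_FRAMES:]

# Toy example: an agent moving in a straight line along the x-axis.
traj = np.stack([np.arange(50.0), np.zeros(50)], axis=1)
observed, future = split_sequence(traj)
print(observed.shape, future.shape)  # (20, 2) (30, 2)
```

At test time only the `observed` half is available; the `future` half is what a model must produce.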
To explore this task ourselves, we ran a few baselines — one of which you can view in the qualitative results below. The orange trajectory shows the motion of the observed agent over the initial 2 seconds, the green trajectories show our top-k forecasts, and the red trajectory shows the ground truth.
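A common way to score top-k forecasts like those shown above is a minimum-over-k displacement error: compare each of the k predicted trajectories against the ground truth and keep the best score. The sketch below is our own illustration (the array shapes and function name are assumptions, not the official evaluation code):

```python
import numpy as np

def min_ade(forecasts: np.ndarray, ground_truth: np.ndarray) -> float:
    """Minimum Average Displacement Error over k forecasts.

    forecasts: (k, 30, 2) array of k predicted 3-second trajectories.
    ground_truth: (30, 2) array, the agent's true future trajectory.
    Returns the smallest mean pointwise L2 distance among the k forecasts.
    """
    # Per-forecast, per-frame Euclidean distance to ground truth: shape (k, 30)
    dists = np.linalg.norm(forecasts - ground_truth[None], axis=-1)
    # Average over the 30 future frames, then keep the best forecast.
    return float(dists.mean(axis=1).min())

# Toy example: one of the three forecasts matches the ground truth exactly.
gt = np.stack([np.arange(30.0), np.zeros(30)], axis=1)
preds = np.stack([gt + 5.0, gt, gt - 2.0])
print(min_ade(preds, gt))  # 0.0
```

Because only the best of the k forecasts is scored, a model is free to hedge across multiple plausible futures (e.g. turn left vs. go straight) without being penalized for the ones that don't happen.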