AV2 2026 Scenario Mining Challenge Announcement

Challenge Overview

Autonomous Vehicles (AVs) collect and pseudo-label terabytes of multi-modal data localized to HD maps during normal fleet testing. However, identifying interesting and safety-critical scenarios from these uncurated data streams is prohibitively time-consuming and error-prone. Retrieving and processing specific scenarios for ego-behavior evaluation, safety testing, or active learning at scale remains a major challenge. While prior work has explored this problem in the context of structured queries and hand-crafted heuristics, we are hosting this challenge to solicit better end-to-end solutions to this important problem.

Our benchmark includes 10,000 planning-centric natural language queries. Challenge participants can use all RGB frames, Lidar sweeps, HD maps, and track annotations from the AV2 sensor dataset to find relevant actors in each log. Methods will be evaluated at three levels of spatial and temporal granularity. First, methods must determine whether a scenario (defined by a natural language query) occurs in the log. A scenario is a set of objects, actions, map elements, and/or interactions that occur over a specified timeframe. If the scenario occurs in the log, methods must then temporally localize it (i.e., find its start and end time). Lastly, methods must detect and track all objects relevant to the text description. Our primary evaluation metric is HOTA-Temporal, a spatial tracking metric that only considers scenario objects during the timeframe when the scenario is occurring.
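For concreteness, a prediction at these three levels might be packaged per (log, query) pair roughly as follows. This is an illustrative sketch only; all field names and values here are assumptions, not the official submission schema (see Preparing your Submission below).

# Illustrative sketch of the three evaluation levels for one (log, query) pair.
# Every field name and value below is hypothetical, not the official schema.
prediction = {
    "log_id": "LOG_UUID",                      # AV2 log being mined
    "query": "pedestrian crossing in front of the ego-vehicle",
    "scenario_occurs": True,                   # level 1: scenario detection
    "start_timestamp_ns": 315967376859905000,  # level 2: temporal localization
    "end_timestamp_ns": 315967380859905000,
    "tracks": {                                # level 3: detection and tracking
        "track-001": "REFERRED_OBJECT",        # per-track relevancy label;
        "track-002": "RELATED_OBJECT",         # per-timestamp 3D boxes omitted
    },
}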

🚨 Top performing teams can win cash prizes, generously sponsored by Uber! 🚨

In this iteration of the challenge, we are hosting separate tracks for Temporal and Spatio-Temporal Scenario Mining.

The temporal track evaluates how well methods can select the video timestamps in which a referred object or action occurs. 🥇 1st Place: $2,500

The spatio-temporal track evaluates how well methods can localize the relevant actors in 3D. 🥇 1st Place: $2,500

Finally, we will give a 💡 $2,500 innovation award 💡 to encourage methods that make progress toward fast, end-to-end scenario mining.

To be eligible for prizes, teams must submit a technical report, open-source their code, and provide instructions on how to reproduce their results. Teams must also beat our best-performing official baseline and make their submission visible by the end of the competition to be eligible for prizes.

The test split and EvalAI leaderboard will both open on February 18th, 2026. The updated scenario mining train and val splits are available for download now. You may test your method on the available val leaderboard. Val split results are not factored into the competition.

Getting the Data

Please see the Argoverse User Guide for detailed instructions on how to download the sensor dataset and scenario mining add-on.
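The sensor dataset is hosted in a public S3 bucket that can be mirrored with s5cmd, per the user guide. Below is a minimal sketch assuming s5cmd is installed; the sensor-dataset prefix follows the user guide, but check the guide for the exact path of the scenario mining add-on, which is not shown here.

# Minimal sketch: mirror one split of the AV2 sensor dataset with s5cmd.
# No AWS account is needed; the bucket allows unsigned requests.
import subprocess

def mirror(prefix: str, dest: str) -> None:
    """Copy one dataset split from the public Argoverse bucket to a local directory."""
    subprocess.run(
        ["s5cmd", "--no-sign-request", "cp", f"s3://argoverse/{prefix}/*", dest],
        check=True,
    )

mirror("datasets/av2/sensor/val", "data/av2/sensor/val")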

Baselines

Please see our baselines to get started.

Preparing your Submission

Please see the scenario-mining submission tutorial for a guide on preparing your submission.
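As a rough picture of the workflow (the authoritative schema lives in the tutorial), a submission gathers one prediction per (log, prompt) pair into a single file for upload. The key structure and pickle format below are assumptions for illustration only.

# Hedged sketch of packaging predictions for upload; the real file format and
# schema are defined in the submission tutorial, not here.
import pickle

# One entry per (log_id, natural-language prompt). The value would be a
# prediction like the sketch in the Challenge Overview above.
prediction = {"scenario_occurs": False}  # placeholder
submission = {("LOG_UUID", "pedestrian crossing"): prediction}

with open("submission.pkl", "wb") as f:
    pickle.dump(submission, f)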

Submit to the Scenario Mining Spatio-Temporal Track

  • 30 different object classes (car, pedestrian, etc.)
  • 3 categories for relevancy to scenario (referred, related, and unrelated)
  • 50m range evaluation
  • Lidar, synchronized camera imagery, and HD maps available
  • Performance is ranked by HOTA-Temporal (the core idea is sketched below)
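
The core idea behind HOTA-Temporal can be pictured as ordinary HOTA computed after restricting tracks to the scenario's timeframe and to scenario-relevant objects. Below is a minimal sketch of the clipping step only; Track and its fields are hypothetical stand-ins, and the official metric lives in the evaluation code.

# Sketch of the clipping step behind HOTA-Temporal: drop detections outside the
# scenario timeframe before scoring with standard HOTA. Not the official code.
from dataclasses import dataclass

@dataclass
class Track:
    track_id: str
    timestamps: list  # one timestamp (ns) per detection
    boxes: list       # one 3D box per timestamp

def clip_to_scenario(track: Track, start_ns: int, end_ns: int) -> Track:
    """Keep only the detections that fall inside the scenario timeframe."""
    kept = [(t, b) for t, b in zip(track.timestamps, track.boxes)
            if start_ns <= t <= end_ns]
    ts = [t for t, _ in kept]
    boxes = [b for _, b in kept]
    return Track(track.track_id, ts, boxes)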

Submit to the Scenario Mining Temporal Track

  • 30 different object classes (car, pedestrian, etc.)
  • 3 categories for relevancy to scenario (referred, related, and unrelated)
  • 50m range evaluation
  • Lidar, synchronized camera imagery, and HD maps available
  • Performance is ranked by HOTA-Temporal

Citation

If you find our paper and code repository useful, please cite us:

@article{davidson2025refav,
  title={RefAV: Towards Planning-Centric Scenario Mining},
  author={Davidson, Cainan and Ramanan, Deva and Peri, Neehar},
  journal={arXiv preprint arXiv:2505.20981},
  year={2025}
}