Benchmarking Video Models with Perception Test: First Challenge Outcome

Building perception systems that understand scenes the way humans do is a central research goal, and progress toward it depends on robust, comprehensive evaluation methods. This workshop-challenge, held alongside the ICCV 2023 conference, focused on evaluating state-of-the-art (SOTA) perception models.

Evaluation Modes

Participants were asked to indicate the evaluation mode they used, such as fine-tuning, few-shot, or zero-shot evaluation. In some tracks, participants also had to indicate if they used audio data in addition to video (for action and sound localization tasks). For test submissions, participants were required to provide a short report explaining their methodology, including the architecture, pre-training datasets, and other relevant details.
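As a rough illustration of the information each submission carried, the record below is a minimal sketch in Python. It is not the challenge's actual submission format: the class names, field names, and the `validate` helper are all hypothetical, chosen only to mirror the requirements described above (evaluation mode, optional audio flag, and a methodology report for test submissions).

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class EvalMode(Enum):
    """The three evaluation modes named in the article."""
    FINE_TUNING = "fine-tuning"
    FEW_SHOT = "few-shot"
    ZERO_SHOT = "zero-shot"


@dataclass
class Submission:
    # Hypothetical field names; the real submission schema is not given here.
    team: str
    track: str
    eval_mode: EvalMode
    uses_audio: bool = False  # only asked for in some tracks
    report: str = ""          # short methodology report


def validate(sub: Submission, phase: str) -> List[str]:
    """Return a list of problems; an empty list means the record is complete."""
    problems = []
    # The article states that test submissions required a short report.
    if phase == "test" and not sub.report.strip():
        problems.append("test submissions need a methodology report")
    return problems


if __name__ == "__main__":
    sub = Submission(
        team="example-team",
        track="sound localization",
        eval_mode=EvalMode.ZERO_SHOT,
        uses_audio=True,
    )
    print(validate(sub, "validation"))
    print(validate(sub, "test"))
```

Keeping the evaluation mode as an enumerated value rather than free text would make it straightforward to group and compare results by mode later on.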

Submissions

There were 23 submissions for the validation phase; the number of submissions for the test phase was not limited. The submitted models were trained on different datasets, such as COCO and Open Images, and used various architectures, including transformers and convolutional neural networks (CNNs).

Results

The results showed that SOTA perception models can achieve high scores in the fine-tuning and few-shot evaluation modes. However, there is still room for improvement in zero-shot evaluation, where models are tested without any task-specific training. The participants’ reports showed that a variety of architectures and pre-training datasets were used, which highlights the diversity of approaches in the field.

Conclusion

This workshop-challenge demonstrated the importance of rigorous evaluation methods for perception systems. Comparing SOTA models across evaluation modes makes their strengths and limitations clearer and helps identify areas for future research. The range of approaches taken by participants also shows how much room the field leaves for innovation.