The 2024 challenges in egocentric vision research have been launched.
https://egovis.github.io/cvpr24/
First Joint Egocentric Vision (EgoVis) Workshop
Held in Conjunction with CVPR 2024
17 June 2024 – Seattle, USA
This joint workshop aims to be the focal point for the egocentric computer vision community to meet and discuss progress in this fast-growing research area. It addresses egocentric vision in a comprehensive manner, including key research challenges in video understanding, multi-modal data, interaction learning, self-supervised learning, and AR/VR, with applications to cognitive science and robotics.
Overview
Wearable cameras, smart glasses, and AR/VR headsets are gaining importance for research and commercial use. They feature various sensors like cameras, depth sensors, microphones, IMUs, and GPS. Advances in machine perception enable precise user localization (SLAM), eye tracking, and hand tracking. This data allows understanding user behavior, unlocking new interaction possibilities with augmented reality. Egocentric devices may soon automatically recognize user actions, surroundings, gestures, and social relationships. These devices have broad applications in assistive technology, education, fitness, entertainment, gaming, eldercare, robotics, and augmented reality, positively impacting society.
Previously, research in this field faced challenges due to limited datasets in a data-intensive environment. However, the community’s recent efforts have addressed this issue by releasing numerous large-scale datasets covering various aspects of egocentric perception, including HoloAssist, Aria Digital Twin, Aria Synthetic Environments, Ego4D, Ego-Exo4D, and EPIC-KITCHENS.
HoloAssist
Aria Digital Twin
Aria Synthetic Environments
Ego-Exo4D
Ego4D
EPIC-Kitchens
The goal of this workshop is to provide an exciting discussion forum for researchers working in this challenging and fast-growing area, and to provide a means to unlock the potential of data-driven research with our datasets to further the state-of-the-art.
Challenges
We welcome submissions to the challenges from March to May (see Important Dates) through the leaderboards linked below. Participants in the challenges are requested to submit a technical report on their method; this is a requirement for the competition. Reports should be 2-6 pages including references. Submissions should use the CVPR format and should be submitted through the CMT website.
HoloAssist Challenges
Challenge ID | Challenge Name | Challenge Lead | Challenge Link |
---|---|---|---|
1 | Action Recognition | Mahdi Rad, Microsoft, Switzerland | Coming Soon… |
2 | Mistake Detection | Ishani Chakraborty, Microsoft, US | Coming Soon… |
3 | Intervention Prediction | Xin Wang, Microsoft, US | Coming Soon… |
4 | 3D hand pose forecasting | Taein Kwon, ETH Zurich, Switzerland | Coming Soon… |
Aria Digital Twin Challenges
Challenge ID | Challenge Name | Challenge Lead | Challenge Link |
---|---|---|---|
1 | Few-shot 3D Object detection & tracking | Xiaqing Pan, Meta, US | Link |
2 | 3D Object detection & tracking | Xiaqing Pan, Meta, US | Link |
Aria Synthetic Environments Challenges
Challenge ID | Challenge Name | Challenge Lead | Challenge Link |
---|---|---|---|
1 | Scene Reconstruction using structured language | Vasileios Balntas, Meta, UK | Link |
Ego4D Challenges
Challenge ID | Challenge Name | Challenge Lead | Challenge Link |
---|---|---|---|
1 | Visual Queries 2D | Santhosh Kumar Ramakrishnan, University of Texas at Austin, US | Link |
2 | Visual Queries 3D | Vincent Cartillier, Georgia Tech, US | Link |
3 | Natural Language Queries | Satwik Kottur, Meta, US | Link |
4 | Moment Queries | Chen Zhao & Merey Ramazanova, KAUST, SA | Link |
5 | EgoTracks | Hao Tang & Weiyao Wang, Meta, US | Link |
6 | Goal Step | Yale Song, Meta, US | Link |
7 | EgoSchema | Karttikeya Mangalam, Raiymbek Akshulakov, UC Berkeley, US | Link |
8 | PNR temporal localization | Yifei Huang, University of Tokyo, JP | Link |
9 | Localization and Tracking | Hao Jiang, Meta, US | Link |
10 | Speech Transcription | Leda Sari, Jachym Kolar & Vamsi Krishna Ithapu, Meta Reality Labs, US | Link |
11 | Looking at me | Eric Zhongcong Xu, National University of Singapore, Singapore | Link |
12 | Short-term Anticipation | Francesco Ragusa, University of Catania, IT | Link |
13 | Long-term Anticipation | Tushar Nagarajan, FAIR, US | Link |
Ego-Exo4D Challenges
Challenge ID | Challenge Name | Challenge Lead | Challenge Link |
---|---|---|---|
1 | coming soon… | coming soon… | coming soon… |
EPIC-Kitchens Challenges
Please check the EPIC-KITCHENS website for more information on the EPIC-KITCHENS challenges. Links to the individual challenges are also listed below.
Challenge ID | Challenge Name | Challenge Lead | Challenge Link |
---|---|---|---|
1 | Action Recognition | Jacob Chalk, University of Bristol, UK | Link |
2 | Action Anticipation | Antonino Furnari and Francesco Ragusa, University of Catania, IT | Link |
3 | Action Detection | Francesco Ragusa and Antonino Furnari, University of Catania, IT | Link |
4 | Domain Adaptation for Action Recognition | Toby Perrett, University of Bristol, UK | Link |
5 | Multi-Instance Retrieval | Michael Wray, University of Bristol, UK | Link |
6 | Semi-Supervised Video-Object Segmentation | Ahmad Darkhalil, University of Bristol, UK | Link |
7 | Hand-Object Segmentation | Dandan Shan, University of Michigan, US | Link |
8 | EPIC-SOUNDS Audio-Based Interaction Recognition | Jacob Chalk, University of Bristol, UK | Link |
9 | TREK-150 Object Tracking | Matteo Dunnhofer, University of Udine, IT | Link |
Call for Abstracts
You are invited to submit extended abstracts to the first edition of the joint egocentric vision (EgoVis) workshop, which will be held alongside CVPR 2024 in Seattle.
These abstracts represent existing or ongoing work and will not be published as part of any proceedings. We welcome all work that falls within the egocentric domain; it is not necessary to use the Ego4D dataset in your work. We expect a submission to cover one or more of the following topics (this is a non-exhaustive list):
- Egocentric vision for human activity analysis and understanding, including action recognition, action detection, audio-visual action perception and object state change detection
- Egocentric vision for anticipating human behaviour, actions and objects
- Egocentric vision for 3D perception and interaction, including dynamic scene reconstruction, hand-object reconstruction, long-term object tracking, NLQ and visual queries, long-term video understanding
- Head-mounted eye tracking and gaze estimation including attention modelling and next fixation prediction
- Egocentric vision for object/event recognition and retrieval
- Egocentric vision for summarization
- Daily life and activity monitoring
- Egocentric vision for human skill learning, assistance, and robotics
- Egocentric vision for social interaction and human behaviour understanding
- Privacy and ethical concerns with wearable sensors and egocentric vision
- Egocentric vision for health and social good
- Symbiotic human-machine vision systems, human-wearable devices interaction
- Interactive AR/VR and Egocentric online/real-time perception
Format
The length of the extended abstracts is 2-4 pages, including figures, tables, and references. We invite submissions of ongoing or already published work, as well as reports on demonstrations and prototypes. The 1st joint egocentric vision workshop gives authors the opportunity to present their work to the egocentric community to provoke discussion and feedback. Accepted work will be presented either as an oral presentation (virtual or in-person) or as a poster presentation. The review will be single-blind, so there is no need to anonymize your work; otherwise, submissions should follow the format of CVPR submissions (information can be found here). Accepted abstracts will not be published as part of a proceedings, so they can be uploaded to arXiv etc., and the links will be provided on the workshop's webpage. Submissions will be managed with the CMT website.
Important Dates
Event | Date |
---|---|
Challenges Leaderboards Open | Mar 2024 |
Challenges Leaderboards Close | 30 May 2024 |
Challenges Technical Reports Deadline (on CMT) | 5 June 2024 (23:59 PT) |
Extended Abstract Deadline | 10 May 2024 (23:59 PT) |
Extended Abstract Notification to Authors | 29 May 2024 |
Extended Abstracts ArXiv Deadline | 12 June 2024 |
Workshop Date | 17 June 2024 |
Program
All times are local to Seattle (PDT).
Workshop Location: Room TBD
A tentative programme is shown below.
Time | Event |
---|---|
08:45-09:00 | Welcome and Introductions |
09:00-09:30 | Invited Keynote 1: Takeo Kanade, Carnegie Mellon University, US |
09:30-10:20 | HoloAssist Challenges |
10:20-11:20 | Coffee Break and Poster Session |
11:20-11:50 | Invited Keynote 2: Diane Larlus, Naver Labs Europe and MIAI Grenoble, FR |
11:50-12:40 | EPIC-KITCHENS Challenges |
12:40-13:40 | Lunch Break |
13:40-14:10 | EgoVis 2022/2023 Distinguished Paper Awards |
14:10-14:40 | Invited Keynote 3: Michael C. Frank & Bria Long, Stanford University, US |
14:40-15:30 | Aria Digital Twin & Synthetic Environments Challenges |
15:30-16:00 | Coffee Break |
16:00-16:30 | Invited Keynote 4: Fernando de La Torre, Carnegie Mellon University, US |
16:30-17:40 | Ego4D Challenges |
17:40-18:10 | Invited Keynote 5: Jim Rehg, University of Illinois Urbana-Champaign, US |
18:10-18:15 | Conclusion |
Invited Speakers
Takeo Kanade, Carnegie Mellon University, USA
Jim Rehg, University of Illinois Urbana-Champaign, USA
Diane Larlus, Naver Labs Europe and MIAI Grenoble
Fernando de La Torre, Carnegie Mellon University, USA
Michael C. Frank, Stanford University, USA
Bria Long, University of California, San Diego, USA
Workshop Organisers
University of Catania
National University of Singapore
Microsoft Research
FAIR, Meta
FAIR, Meta
Meta
FAIR, Meta
University of Bristol
National Institute of Advanced Industrial Science and Technology
Co-organizing Advisors
University of Bristol
University of Catania
UT Austin
UC Berkeley
Reality Labs Research
ETH Zurich
University of Tokyo
Indiana University
Related Past Events
This workshop follows in the footsteps of the following previous events:
EPIC-Kitchens and Ego4D Past Workshops:
- Ego4D&EPIC@CVPR2023: Joint International 3rd Ego4D and 11th EPIC Workshop in conjunction with CVPR 2023;
- 2nd International Ego4D Workshop in conjunction with ECCV 2022;
- Ego4D&EPIC@CVPR2022: Joint International 1st Ego4D and 10th EPIC Workshop in conjunction with CVPR 2022;
- EPIC@ICCV21: The Ninth International Workshop on Egocentric Perception, Interaction and Computing in conjunction with ICCV 2021;
- EPIC@CVPR21: The Eighth International Workshop on Egocentric Perception, Interaction and Computing in conjunction with CVPR 2021;
- EPIC@ECCV20: The Seventh International Workshop on Egocentric Perception, Interaction and Computing in conjunction with ECCV 2020;
- EPIC@CVPR20: The Sixth International Workshop on Egocentric Perception, Interaction and Computing in conjunction with CVPR 2020;
- EPIC@ICCV19: The Fifth International Workshop on Egocentric Perception, Interaction and Computing in conjunction with ICCV 2019;
- EPIC@CVPR19: The Fourth International Workshop on Egocentric Perception, Interaction and Computing in conjunction with CVPR 2019;
- EPIC@ECCV18: The Third International Workshop on Egocentric Perception, Interaction and Computing in conjunction with ECCV 2018;
- EPIC@ICCV17: The Second International Workshop on Egocentric Perception, Interaction and Computing in conjunction with ICCV 2017;
- EPIC@ECCV16: The First International Workshop on Egocentric Perception, Interaction and Computing in conjunction with ECCV 2016;
Human Body, Hands, and Activities from Egocentric and Multi-view Cameras Past Workshops:
- Human Body, Hands, and Activities from Egocentric and Multi-view Cameras (HBHA) run alongside ECCV 2022;
Project Aria Past Tutorials:
- Aria tutorial run alongside CVPR 2023;
- Aria tutorial run alongside CVPR 2022;