2024 Challenges

The 2024 challenges in egocentric vision research have been launched.

First Joint Egocentric Vision (EgoVis) Workshop

Held in Conjunction with CVPR 2024

17 June 2024 – Seattle, USA

This joint workshop aims to be the focal point for the egocentric computer vision community to meet and discuss progress in this fast-growing research area. It addresses egocentric vision comprehensively, covering key research challenges in video understanding, multi-modal data, interaction learning, self-supervised learning, and AR/VR, with applications to cognitive science and robotics.


Wearable cameras, smart glasses, and AR/VR headsets are gaining importance for research and commercial use. They feature various sensors like cameras, depth sensors, microphones, IMUs, and GPS. Advances in machine perception enable precise user localization (SLAM), eye tracking, and hand tracking. This data allows understanding user behavior, unlocking new interaction possibilities with augmented reality. Egocentric devices may soon automatically recognize user actions, surroundings, gestures, and social relationships. These devices have broad applications in assistive technology, education, fitness, entertainment, gaming, eldercare, robotics, and augmented reality, positively impacting society.

Previously, research in this field faced challenges due to limited datasets in a data-intensive environment. However, the community’s recent efforts have addressed this issue by releasing numerous large-scale datasets covering various aspects of egocentric perception, including HoloAssist, Aria Digital Twin, Aria Synthetic Environments, Ego4D, Ego-Exo4D, and EPIC-KITCHENS.

[Images: HoloLens 2, Aria Digital Twin, Aria Synthetic Environments, Ego-Exo4D, Ego4D, and EPIC-KITCHENS logos]

The goal of this workshop is to provide an exciting discussion forum for researchers working in this challenging and fast-growing area, and to provide a means to unlock the potential of data-driven research with our datasets to further the state-of-the-art.


We welcome submissions to the challenges from March to May (see important dates) through the leaderboards linked below. Participants in the challenges are requested to submit a technical report on their method; this is a requirement for the competition. Reports should be 2-6 pages including references, should use the CVPR format, and should be submitted through the CMT website.

HoloAssist Challenges

Challenge ID | Challenge Name | Challenge Lead | Challenge Link
1 | Action Recognition | Mahdi Rad, Microsoft, Switzerland | Coming Soon…
2 | Mistake Detection | Ishani Chakraborty, Microsoft, US | Coming Soon…
3 | Intervention Prediction | Xin Wang, Microsoft, US | Coming Soon…
4 | 3D Hand Pose Forecasting | Taein Kwon, ETH Zurich, Switzerland | Coming Soon…

Aria Digital Twin Challenges

Challenge ID | Challenge Name | Challenge Lead | Challenge Link
1 | Few-shot 3D Object Detection & Tracking | Xiaqing Pan, Meta, US | Link
2 | 3D Object Detection & Tracking | Xiaqing Pan, Meta, US | Link

Aria Synthetic Environments Challenges

Challenge ID | Challenge Name | Challenge Lead | Challenge Link
1 | Scene Reconstruction Using Structured Language | Vasileios Baltnas, Meta, UK | Link

Ego4D Challenges

Challenge ID | Challenge Name | Challenge Lead | Challenge Link
1 | Visual Queries 2D | Santhosh Kumar Ramakrishnan, University of Texas at Austin, US | Link
2 | Visual Queries 3D | Vincent Cartillier, Georgia Tech, US | Link
3 | Natural Language Queries | Satwik Kottur, Meta, US | Link
4 | Moment Queries | Chen Zhao & Merey Ramazanova, KAUST, SA | Link
5 | EgoTracks | Hao Tang & Weiyao Wang, Meta, US | Link
6 | Goal Step | Yale Song, Meta, US | Link
7 | Ego Schema | Karttikeya Mangalam & Raiymbek Akshulakov, UC Berkeley, US | Link
8 | PNR Temporal Localization | Yifei Huang, University of Tokyo, JP | Link
9 | Localization and Tracking | Hao Jiang, Meta, US | Link
10 | Speech Transcription | Leda Sari, Jachym Kolar & Vamsi Krishna Ithapu, Meta Reality Labs, US | Link
11 | Looking at Me | Eric Zhongcong Xu, National University of Singapore, Singapore | Link
12 | Short-term Anticipation | Francesco Ragusa, University of Catania, IT | Link
13 | Long-term Anticipation | Tushar Nagarajan, FAIR, US | Link

Ego-Exo4D Challenges

Challenge ID | Challenge Name | Challenge Lead | Challenge Link
1 | Coming soon… | Coming soon… | Coming soon…

EPIC-Kitchens Challenges

Please check the EPIC-KITCHENS website for more information on the EPIC-KITCHENS challenges. Links to individual challenges are also reported below.

Challenge ID | Challenge Name | Challenge Lead | Challenge Link
1 | Action Recognition | Jacob Chalk, University of Bristol, UK | Link
2 | Action Anticipation | Antonino Furnari and Francesco Ragusa, University of Catania, IT | Link
3 | Action Detection | Francesco Ragusa and Antonino Furnari, University of Catania, IT | Link
4 | Domain Adaptation for Action Recognition | Toby Perrett, University of Bristol, UK | Link
5 | Multi-Instance Retrieval | Michael Wray, University of Bristol, UK | Link
6 | Semi-Supervised Video Object Segmentation | Ahmad Dar Khalil, University of Bristol, UK | Link
7 | Hand-Object Segmentation | Dandan Shan, University of Michigan, US | Link
8 | EPIC-SOUNDS Audio-Based Interaction Recognition | Jacob Chalk, University of Bristol, UK | Link
9 | TREK-150 Object Tracking | Matteo Dunnhofer, University of Udine, IT | Link

Call for Abstracts

You are invited to submit extended abstracts to the first edition of the joint egocentric vision workshop, which will be held alongside CVPR 2024 in Seattle.

These abstracts represent existing or ongoing work and will not be published as part of any proceedings. We welcome all work that falls within the egocentric domain; it is not necessary to use the Ego4D dataset in your work. We expect a submission may cover one or more of the following topics (this is a non-exhaustive list):

  • Egocentric vision for human activity analysis and understanding, including action recognition, action detection, audio-visual action perception and object state change detection
  • Egocentric vision for anticipating human behaviour, actions and objects
  • Egocentric vision for 3D perception and interaction, including dynamic scene reconstruction, hand-object reconstruction, long-term object tracking, NLQ and visual queries, long-term video understanding
  • Head-mounted eye tracking and gaze estimation including attention modelling and next fixation prediction
  • Egocentric vision for object/event recognition and retrieval
  • Egocentric vision for summarization
  • Daily life and activity monitoring
  • Egocentric vision for human skill learning, assistance, and robotics
  • Egocentric vision for social interaction and human behaviour understanding
  • Privacy and ethical concerns with wearable sensors and egocentric vision
  • Egocentric vision for health and social good
  • Symbiotic human-machine vision systems, human-wearable devices interaction
  • Interactive AR/VR and Egocentric online/real-time perception


Extended abstracts should be 2-4 pages, including figures, tables, and references. We invite submissions of ongoing or already published work, as well as reports on demonstrations and prototypes. The 1st joint egocentric vision workshop gives authors the opportunity to present their work to the egocentric community and to receive discussion and feedback. Accepted work will be presented either as an oral presentation (virtual or in-person) or as a poster presentation. The review will be single-blind, so there is no need to anonymize your work; submissions should otherwise follow the format of CVPR submissions (information can be found here). Accepted abstracts will not be published as part of a proceedings, so they can be uploaded to arXiv etc., and the links will be provided on the workshop's webpage. Submissions will be managed through the CMT website.

Important Dates

Challenges Leaderboards Open | Mar 2024
Challenges Leaderboards Close | 30 May 2024
Challenges Technical Reports Deadline (on CMT) | 5 June 2024 (23:59 PT)
Extended Abstract Deadline | 10 May 2024 (23:59 PT)
Extended Abstract Notification to Authors | 29 May 2024
Extended Abstracts ArXiv Deadline | 12 June 2024
Workshop Date | 17 June 2024


All dates are local to Seattle (Pacific Time).
Workshop Location: Room TBD

A tentative programme is shown below.

08:45-09:00 | Welcome and Introductions
09:00-09:30 | Invited Keynote 1: Takeo Kanade, Carnegie Mellon University, US
09:30-10:20 | HoloAssist Challenges
10:20-11:20 | Coffee Break and Poster Session
11:20-11:50 | Invited Keynote 2: Diane Larlus, Naver Labs Europe and MIAI Grenoble, FR
11:50-12:40 | EPIC-KITCHENS Challenges
12:40-13:40 | Lunch Break
13:40-14:10 | EgoVis 2022/2023 Distinguished Paper Awards
14:10-14:40 | Invited Keynote 3: Michael C. Frank & Bria Long, Stanford University, US
14:40-15:30 | Aria Digital Twin & Synthetic Environments Challenges
15:30-16:00 | Coffee Break
16:00-16:30 | Invited Keynote 4: Fernando de la Torre, Carnegie Mellon University, US
16:30-17:40 | Ego4D Challenges
17:40-18:10 | Invited Keynote 5: Jim Rehg, University of Illinois Urbana-Champaign, US

Invited Speakers

Takeo Kanade

Carnegie Mellon University, USA

Jim Rehg

University of Illinois Urbana-Champaign, USA

Diane Larlus

Naver Labs Europe and MIAI Grenoble

Fernando De la Torre

Carnegie Mellon University, USA

Michael C. Frank

Stanford University, USA

Bria Long

University of California, San Diego, USA

Workshop Organisers

Antonino Furnari

University of Catania

Angela Yao

National University of Singapore

Xin Wang

Microsoft Research

Tushar Nagarajan

FAIR, Meta

Huiyu Wang

FAIR, Meta

Jing Dong


Jakob Engel

FAIR, Meta

Siddhant Bansal

University of Bristol

Takuma Yagi

National Institute of Advanced Industrial Science and Technology

Co-organizing Advisors

Dima Damen

University of Bristol

Giovanni Maria Farinella

University of Catania

Kristen Grauman

UT Austin

Jitendra Malik

UC Berkeley

Richard Newcombe

Reality Labs Research

Marc Pollefeys

ETH Zurich

Yoichi Sato

University of Tokyo

David Crandall

Indiana University

Related Past Events

This workshop follows in the footsteps of the following previous events:

EPIC-Kitchens and Ego4D Past Workshops:

Human Body, Hands, and Activities from Egocentric and Multi-view Cameras Past Workshops:

Project Aria Past Tutorials:
