
Multimodal First-Person Activity Recognition and Summarization


  • Speaker: Dr. Alptekin Temizel, Associate Professor, Graduate School of Informatics, Middle East Technical University (METU); Visiting Academic, Electronic, Electrical and Systems Engineering, University of Birmingham
  • Time: Wednesday 18 May 2016, 14:00-15:00
  • Venue: Gisbert Kapp, N123

If you have a question about this talk, please contact Dr. Philip Weber.

First-person (egocentric) videos are captured by a camera worn on a person and reflect that person's viewpoint. In these videos, the observer is involved in the events and the camera undergoes large amounts of ego-motion. In typical third-person videos, by contrast, the camera is usually stationary and positioned away from the actors involved in the events. These different characteristics of first-person videos make it difficult to apply existing approaches directly and necessitate new approaches to the problem. In addition to the video data, further modalities have the potential to contribute complementary information. Audio is a particularly important modality, as it is readily accessible and allows different activities and interactions to be detected. On the other hand, fusing different modalities also brings new challenges. In this talk, I will discuss the current state of the art and the particular challenges in analysing multimodal first-person data.

This talk is part of the Speech Recognition by Synthesis Seminars series.




Talks@bham, University of Birmingham.
talks@bham is based on a system from the University of Cambridge.