12 November 2011, Barcelona, Spain

Over the past decades, considerable research effort in computer vision has gone into developing theories, methods and systems for describing human behaviors in image sequences. Although estimating quantitative parameters that describe where motion occurs remains a critical goal, recent work increasingly incorporates motion understanding into the analysis of image sequences. The challenge is then to generate qualitative descriptions of the meaning of human motion, and so to understand not only where, but also why, behaviors are observed in multimedia footage.

This goal has become a key task in many promising areas, such as Internet vision, scene understanding, video indexing and retrieval, video surveillance and advanced human-computer interfaces. However, understanding human behaviors in image sequences means coping with multiple difficulties, mainly due to the so-called sensory and semantic gaps. On the one hand, appearance varies hugely with acquisition conditions, clothing, lighting and posture changes. On the other hand, conceptual interpretations of human behaviors carry uncertainty, owing both to the inherent complexity of the behavior patterns to be modeled and to the vagueness of the semantic terms used to describe behavior. Multiple key topics in computer vision and pattern recognition must therefore be tackled, including object/human detection, articulated motion tracking, human pose categorization, concept formation, interpretation and reasoning, and video indexing and retrieval.

To address these issues, advancing novel capabilities for video understanding increases the cross-fertilization between multiple computer vision and pattern recognition research topics. ARTEMIS 2011 provided a holistic view on the interpretation and description of human behaviors in multimedia content such as sports, news, documentaries, movies and surveillance footage. Data can therefore be extracted from camera/audio sensors, motion capture systems, Internet streams or video footage.

This ICCV 2011 workshop aims to encourage links between these complementary research topics, which share the common goal of understanding human behavior in image sequences.

