Automatic temporal segment detection and affect recognition from face and body display

Publication Type:
Journal Article
Citation:
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2009, 39 (1), pp. 64-84
Issue Date:
2009-01-01
File:
2008001853OK.pdf (Adobe PDF, 1.52 MB)
Abstract:
Psychologists have long explored the mechanisms with which humans recognize other humans' affective states from modalities such as voice and face display. This exploration has led to the identification of the main mechanisms, including the important role played in the recognition process by the modalities' dynamics. Constrained by human physiology, the temporal evolution of a modality appears to be well approximated by a sequence of temporal segments called onset, apex, and offset. Stemming from these findings, computer scientists, over the past 15 years, have proposed various methodologies to automate the recognition process. We note, however, two main limitations to date. The first is that much of the past research has focused on affect recognition from single modalities. The second is that even the few multimodal systems have not paid sufficient attention to the modalities' dynamics: the automatic determination of their temporal segments, their synchronization for the purpose of modality fusion, and their role in affect recognition are yet to be adequately explored. To address these issues, this paper focuses on affective face and body display, proposes a method to automatically detect their temporal segments or phases, explores whether the detection of the temporal phases can effectively support recognition of affective states, and recognizes affective states based on phase synchronization/alignment. The experimental results obtained show the following: 1) affective face and body displays are simultaneous but not strictly synchronous; 2) explicit detection of the temporal phases can improve the accuracy of affect recognition; 3) recognition from fused face and body modalities performs better than that from the face or the body modality alone; and 4) synchronized feature-level fusion achieves better performance than decision-level fusion. © 2008 IEEE.
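The onset/apex/offset segmentation described in the abstract can be pictured with a minimal sketch. The snippet below labels each frame of a hypothetical per-frame motion-intensity signal (for example, the summed displacement of tracked face or body points) using a simple slope-threshold heuristic. The function name, the signal, and the thresholds are assumptions chosen for illustration only; this is not the detection method proposed in the paper, which the abstract characterizes only at a high level.

```python
import numpy as np

def detect_temporal_phases(intensity, rise_thresh=0.05, fall_thresh=-0.05):
    """Label each frame as 'neutral', 'onset', 'apex', or 'offset'.

    intensity   : 1-D array of motion intensity per frame (arbitrary units).
    rise_thresh : slope above which the display is treated as still building up.
    fall_thresh : slope below which the display is treated as relaxing.
    Both thresholds are assumed values for this toy illustration.
    """
    intensity = np.asarray(intensity, dtype=float)
    # Smooth with a short moving average to suppress frame-to-frame jitter.
    smoothed = np.convolve(intensity, np.ones(3) / 3.0, mode="same")
    slope = np.gradient(smoothed)

    labels = []
    for s, v in zip(slope, smoothed):
        if v < 0.1 * smoothed.max():   # near-rest intensity -> neutral phase
            labels.append("neutral")
        elif s > rise_thresh:          # intensity still increasing -> onset
            labels.append("onset")
        elif s < fall_thresh:          # intensity decreasing -> offset
            labels.append("offset")
        else:                          # high-intensity plateau -> apex
            labels.append("apex")
    return labels

if __name__ == "__main__":
    # Toy episode: rest, build-up, sustained peak, relaxation, rest.
    episode = [0.0, 0.0, 0.2, 0.5, 0.9, 1.0, 1.0, 1.0, 1.0,
               0.9, 0.5, 0.2, 0.0, 0.0]
    for frame, phase in enumerate(detect_temporal_phases(episode)):
        print(frame, phase)
```

Once both the face and the body channel carry such phase labels, the synchronized feature-level fusion mentioned in the abstract can roughly be understood as combining the two channels' features over phase-aligned frames, in contrast to decision-level fusion, which merges the outputs of separate per-channel classifiers.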