Automatic Emotion Recognition systems have recently gained attention as an essential technology for a wide range of real-life applications. Audio and visual modalities are both representative cues for revealing emotion, and combining the two improves emotion recognition performance in subtle emotion analysis. This study aims to accurately predict human emotion in the Valence-Arousal space using audio-visual modality fusion methods.