Emotional Verification Experiments (EmoVerE) focuses on emotion recognition in speech. It aims to automatically identify the emotional state of a speaker from their voice in order to better understand their wants and needs. EmoVerE builds on previous work on the development of high-quality emotional speech corpora and combines this with advanced machine learning and speech analysis techniques. This technology will enhance interactions with systems such as computer games and online learning, while forming the basis of more expressive speech synthesis and 3D character animation.

Project Details



Technical Information

Accurately detecting and categorising emotional content in human speech is difficult, but it is crucial for achieving natural interaction with systems, particularly in the entertainment industry. This project aims to make innovative use of higher-level acoustic attributes, in association with machine learning techniques, to model and predict emotional content in speech audio files. Existing work has developed a state-of-the-art benchmark emotional speech corpus that will be utilised and extended by this research. Specific consideration will be given to acoustic prominence and contour features, annotation techniques, and advanced machine learning techniques for automatic categorisation.


The initial contribution of the project is the labelling of a naturalistic emotional speech dataset through crowdsourcing. Labels are derived from ratings collected in listening tasks. These tasks are performed online via a purpose-built rating tool, which operates on an ongoing basis. To contribute ratings and/or download the corpus, please visit the site here.
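As a rough illustration of how crowdsourced ratings might be turned into labels, the sketch below averages the scores that several listeners give to each clip. It is not the project's rating tool or its actual aggregation method: the function name `aggregate_ratings`, the `(clip_id, rater_id, score)` record layout, the numeric rating scale, and the use of a plain mean are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def aggregate_ratings(ratings):
    """Average the scores given to each clip.

    `ratings` is an iterable of (clip_id, rater_id, score) tuples.
    This record layout is hypothetical, not the project's data format.
    """
    scores_by_clip = defaultdict(list)
    for clip_id, _rater_id, score in ratings:
        scores_by_clip[clip_id].append(score)
    # Assumption: a simple mean is used as the label for each clip.
    return {clip_id: mean(scores) for clip_id, scores in scores_by_clip.items()}


# Made-up ratings on an arbitrary numeric scale, for illustration only.
example_ratings = [
    ("clip_a", "rater_1", 4),
    ("clip_a", "rater_2", 2),
    ("clip_a", "rater_3", 3),
    ("clip_b", "rater_1", 1),
    ("clip_b", "rater_2", 2),
]

labels = aggregate_ratings(example_ratings)
print(labels)
```

A plain mean treats every rater as equally reliable. Approaches that estimate rater reliability, such as those in the multi-armed bandit papers listed under Research Outputs, would weight or select raters instead.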

Research Outputs

Snel, J., Tarasov, A., Cullen, C., Delany, S.J.: A Crowdsourcing Approach to Labelling a Mood Induced Speech Corpora. 4th International Workshop on Corpora for Research on Emotion Sentiment & Social Signals (ES³ 2012).

Tarasov, A., Delany, S.J., Mac Namee, B.: Dynamic Estimation of Rater Reliability in Regression Tasks using Multi-Armed Bandit Techniques. Workshop on Machine Learning in Human Computation and Crowdsourcing, in conjunction with ICML 2012.

Tarasov, A., Delany, S.J., Mac Namee, B.: Dynamic Estimation of Rater Reliability in Subjective Tasks Using Multi-Armed Bandits. Proceedings of the 2012 ASE/IEEE International Conference on Social Computing.

Tarasov, A., Delany, S.J.: Benchmarking Classification Models for Emotion Recognition in Natural Speech: a Multi-Corporal Study. EmoSPACE Workshop (in conjunction with IEEE FG 2011 conference), 2011.

Snel, J., Cullen, C.: Obtaining speech assets for judgement analysis on low-pass filtered emotional speech. EmoSPACE 2011 workshop (in conjunction with IEEE FG 2011 conference), 2011.

Tarasov, A., Cullen, C., Delany, S.J.: Using Crowdsourcing for labeling emotional speech assets. W3C workshop on Emotion ML, Paris, France, 2010.

Digital Media Centre
DIT Aungier Street
Dublin 2

tel +353 (0)1 402 3092
fax +353 (0)1 402 3293
