User Identity Protection in Automatic Emotion Recognition through Disguised Speech

Bibliographic Details
Title:	User Identity Protection in Automatic Emotion Recognition through Disguised Speech
Authors:	Fasih Haider, Pierre Albert, Saturnino Luz
Source:	AI, Vol 2, Iss 4, Pp 636-649 (2021)
Publisher Information:	MDPI AG, 2021.
Publication Year:	2021
Collection:	LCC: Electronic computers. Computer science
Subject Terms:	privacy preservation, affect recognition, health technologies, emotion recognition, Ambient Assisted Living, social signal processing, Electronic computers. Computer science, QA75.5-76.95
Description:	Ambient Assisted Living (AAL) technologies are being developed which could assist elderly people to live healthy and active lives. These technologies have been used to monitor people’s daily exercises, consumption of calories and sleep patterns, and to provide coaching interventions to foster positive behaviour. Speech and audio processing can be used to complement such AAL technologies to inform interventions for healthy ageing by analyzing speech data captured in the user’s home. However, collection of data in home settings presents challenges. One of the most pressing challenges concerns how to manage privacy and data protection. To address this issue, we proposed a low-cost system for recording disguised speech signals which can protect user identity by using pitch shifting. The disguised speech so recorded can then be used for training machine learning models for affective behaviour monitoring. Affective behaviour could provide an indicator of the onset of mental health issues such as depression and cognitive impairment, and help develop clinical tools for automatically detecting and monitoring disease progression. In this article, acoustic features extracted from the non-disguised and disguised speech are evaluated in an affect recognition task using six different machine learning classification methods. The results of transfer learning from non-disguised to disguised speech are also demonstrated. We have identified sets of acoustic features which are not affected by the pitch shifting algorithm and also evaluated them in affect recognition. We found that, while the non-disguised speech signal gives the best Unweighted Average Recall (UAR) of 80.01%, the disguised speech signal only causes a slight degradation of performance, reaching 76.29%. The transfer learning from non-disguised to disguised speech results in a reduction of UAR (65.13%). However, feature selection improves the UAR (68.32%). This approach forms part of a large project which includes health and wellbeing monitoring and coaching.
Document Type:	article
File Description:	electronic resource
Language:	English
ISSN:	2673-2688
Relation:	https://www.mdpi.com/2673-2688/2/4/38; https://doaj.org/toc/2673-2688
DOI:	10.3390/ai2040038
Open Access:	https://doaj.org/article/ea3d6777f7f04255bebc3021672b7bd2
Accession Number:	edsdoj.3d6777f7f04255bebc3021672b7bd2
Database:	Directory of Open Access Journals

