R&B -- Rhythm and Brain: Cross-subject Decoding of Music from Human Brain Activity

Bibliographic Details
Title: R&B -- Rhythm and Brain: Cross-subject Decoding of Music from Human Brain Activity
Authors: Ferrante, Matteo, Ciferri, Matteo, Toschi, Nicola
Publication Year: 2024
Collection: Computer Science
Quantitative Biology
Subject Terms: Quantitative Biology - Neurons and Cognition, Computer Science - Artificial Intelligence, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Description: Music is a universal phenomenon that profoundly influences human experiences across cultures. This study investigates whether music can be decoded from human brain activity measured with functional MRI (fMRI) during its perception. Leveraging recent advancements in extensive datasets and pre-trained computational models, we construct mappings between neural data and latent representations of musical stimuli. Our approach integrates functional and anatomical alignment techniques to facilitate cross-subject decoding, addressing the challenges posed by the low temporal resolution and signal-to-noise ratio (SNR) of fMRI data. Starting from the GTZAN fMRI dataset, in which five participants listened to 540 musical stimuli from 10 different genres while their brain activity was recorded, we used the CLAP (Contrastive Language-Audio Pretraining) model to extract latent representations of the musical stimuli and developed voxel-wise encoding models to identify brain regions responsive to these stimuli. By applying a threshold to the association between predicted and actual brain activity, we identified specific regions of interest (ROIs) that can be interpreted as key players in music processing. Our decoding pipeline, primarily retrieval-based, employs a linear map to project brain activity onto the corresponding CLAP features. This enables us to predict and retrieve the musical stimuli most similar to those that gave rise to the fMRI data. Our results demonstrate state-of-the-art identification accuracy, with our methods significantly outperforming existing approaches. Our findings suggest that neural-based music retrieval systems could enable personalized recommendations and therapeutic applications. Future work could use higher temporal resolution neuroimaging and generative models to improve decoding accuracy and explore the neural underpinnings of music perception and emotion.
Comment: The first two authors contributed equally to this work
Document Type: Working Paper
Open Access: http://arxiv.org/abs/2406.15537
Accession Number: edsarx.2406.15537
Database: arXiv
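The retrieval-based pipeline described in the abstract (a linear map from fMRI voxel patterns to CLAP embeddings, followed by identification of the most similar stimulus) can be sketched as below. This is an illustrative reconstruction, not the authors' code: the data are synthetic stand-ins, the array sizes are hypothetical, and ridge regression with cosine-similarity ranking is one plausible instantiation of "linear map" plus "retrieval".

```python
# Illustrative sketch (not the authors' implementation): fit a ridge-regularized
# linear map from simulated fMRI voxel patterns to CLAP-style audio embeddings,
# then identify each test stimulus by cosine-similarity retrieval.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
n_train, n_test, n_voxels, emb_dim = 400, 100, 2000, 512  # hypothetical sizes

# Synthetic stimulus embeddings (stand-ins for CLAP audio features).
train_emb = rng.standard_normal((n_train, emb_dim))
test_emb = rng.standard_normal((n_test, emb_dim))

# Synthetic brain responses: a noisy linear transform of the embeddings.
true_map = rng.standard_normal((emb_dim, n_voxels))
train_fmri = train_emb @ true_map + rng.standard_normal((n_train, n_voxels))
test_fmri = test_emb @ true_map + rng.standard_normal((n_test, n_voxels))

# Fit the brain-to-embedding linear map (one ridge fit per embedding dimension).
decoder = Ridge(alpha=1000.0)
decoder.fit(train_fmri, train_emb)

# Retrieval: rank candidate stimuli by cosine similarity between the predicted
# embedding for each test scan and every candidate's true embedding.
pred_emb = decoder.predict(test_fmri)
sims = cosine_similarity(pred_emb, test_emb)
top1 = (sims.argmax(axis=1) == np.arange(n_test)).mean()
print(f"top-1 identification accuracy: {top1:.2f}")
```

In a real setting, `train_fmri`/`test_fmri` would be preprocessed voxel responses (after the functional/anatomical alignment steps the abstract mentions) and the embeddings would come from a pretrained CLAP audio encoder.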