Contributions to generic and affective visual concept recognition ; Contribution à la reconnaissance de concepts visuels génériques et émotionnels

Bibliographic details
Title: Contributions to generic and affective visual concept recognition ; Contribution à la reconnaissance de concepts visuels génériques et émotionnels
Authors: Liu, Ningning
Contributors: Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS), Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL), Université de Lyon-Université de Lyon-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)
Source: https://hal.science/hal-01466005Test ; 2013.
Publication information: HAL CCSD
Publication year: 2013
Collection: Portail HAL de l'Université Lumière Lyon 2
Subject terms: Generic visual concept recognition, Affective visual concept recognition, Visual features, Textual features, Classification, Feature fusion, Reconnaissance de concepts visuels génériques, Reconnaissance de concepts visuels émotionnels, Descripteur visuel, Descripteur textuel, Classification multimodale, Fusion de descripteurs, Reconnaissance des formes (informatique), Théorie de Dempster-Shafer, [INFO]Computer Science [cs]
Description: This Ph.D. thesis is dedicated to visual concept recognition (VCR). Owing to many practical difficulties, VCR is still considered one of the most challenging problems in computer vision and pattern recognition. In this context, we propose several contributions to VCR, particularly multimodal approaches that efficiently combine visual and textual information. First, we propose semantic features for VCR and investigate the effectiveness of different types of low-level visual features, including color, texture and shape. Specifically, we believe that different concepts require different features to characterize them efficiently for recognition. We therefore investigate various visual representations in the context of VCR: not only global features including color, shape and texture, but also state-of-the-art local visual descriptors such as SIFT, Color SIFT, HOG, DAISY, LBP and Color LBP. To help bridge the semantic gap between low-level visual features and high-level semantic concepts, particularly those related to emotions and feelings, we propose mid-level visual features based on the semantics of visual harmony and dynamism, using Itten’s color theory and psychological interpretations. Moreover, we employ a spatial pyramid strategy to capture spatial information when building our mid-level harmony and dynamism features. We also propose a new representation of HSV color histograms that employs a visual attention model to identify the regions of interest in images. Second, we propose a novel textual feature designed for VCR. Most photos shared online come with textual descriptions in the form of tags or captions. These textual descriptions are a rich source of semantic information about visual data and are worth considering for VCR or multimedia information retrieval.
We propose the Histograms of Textual Concepts (HTC) to capture the semantic relatedness of concepts. The ...
Document type: report
Language: English
Relation: hal-01466005; https://hal.science/hal-01466005Test
Availability: https://hal.science/hal-01466005Test
Accession number: edsbas.BA73FC2
Database: BASE