Classify social image by integrating multi-modal content
Title: Classify social image by integrating multi-modal content
Authors: Zhoujun Li, Xu Zhang, Xiong Li, Senzhang Wang, Xiaoming Zhang
Source: Multimedia Tools and Applications. 77:7469-7485
Publication data: Springer Science and Business Media LLC, 2017.
Publication year: 2017
Subject terms: Normalization (statistics), Standard test image, Contextual image classification, Computer Networks and Communications, business.industry, Computer science, Linear classifier, Pattern recognition, 02 engineering and technology, 010501 environmental sciences, Machine learning, computer.software_genre, 01 natural sciences, Modal, Social image, Hardware and Architecture, 0202 electrical engineering, electronic engineering, information engineering, Media Technology, 020201 artificial intelligence & image processing, Artificial intelligence, business, computer, Software, 0105 earth and related environmental sciences
Abstract: There is a growing volume of social images with the development of social networks and digital cameras. These images are usually annotated with textual tags in addition to their visual content, so it is urgent to organize and manage this large number of social images automatically. Image classification is the basic task underlying such applications and has attracted great research effort. Although image classification has been widely studied, integrating the multi-modal content of social images for classification remains a considerable challenge, since the textual and visual content are represented in two heterogeneous feature spaces. In this paper, we propose a multi-modal learning method that seamlessly integrates multi-modal features through their correlation. Specifically, we learn two linear classification modules, one for each type of feature, and integrate them via a joint model using ℓ2 normalization. Each classifier is regularized with the ℓ2,1 norm to reduce the effect of noisy features by selecting a subset of more important ones. With the joint model, classification based on visual features can be reinforced by classification based on textual features, and vice versa. A test image is then classified from both its textual and visual features by combining the results of the two classifiers. Experiments conducted on real-world social image datasets demonstrate the superiority of the proposed method over representative baselines.
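The fusion step described in the abstract, training a separate linear classifier per modality and combining their scores at test time, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the ℓ2,1-regularized joint optimization is replaced by independent ridge-regularized least-squares fits, the data is synthetic, and all names are invented.

```python
import numpy as np

def fit_linear(X, y, reg=1e-2):
    # Ridge-regularized least-squares classifier:
    # w = (X^T X + reg * I)^{-1} X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
n, d_text, d_vis = 200, 20, 30
y = rng.choice([-1.0, 1.0], size=n)  # binary labels

# Synthetic heterogeneous feature spaces: textual tags carry a
# stronger class signal than the (noisier) visual features.
X_text = y[:, None] * 0.5 + rng.normal(size=(n, d_text))
X_vis = y[:, None] * 0.3 + rng.normal(size=(n, d_vis))

# One linear classification module per modality
w_text = fit_linear(X_text, y)
w_vis = fit_linear(X_vis, y)

# Classify by combining the two modality-specific scores,
# so each modality can reinforce the other.
score = X_text @ w_text + X_vis @ w_vis
pred = np.sign(score)
accuracy = float((pred == y).mean())
```

In the paper's joint model the two classifiers are coupled during training through their correlation rather than fit independently as above; the sketch only shows the score-combination idea.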
ISSN: 1573-7721 1380-7501
Open access: https://explore.openaire.eu/search/publication?articleId=doi_________::b9285c6cd65faf443ca37dde2c1410d8 https://doi.org/10.1007/s11042-017-4657-2
Rights: CLOSED
Accession number: edsair.doi...........b9285c6cd65faf443ca37dde2c1410d8
Database: OpenAIRE