Academic Journal
Multi-Modal Prompt Learning on Blind Image Quality Assessment ...
Title: | Multi-Modal Prompt Learning on Blind Image Quality Assessment ... |
---|---|
Authors: | Pan, Wensheng, Gao, Timin, Zhang, Yan, Hu, Runze, Zheng, Xiawu, Zhang, Enwei, Gao, Yuting, Liu, Yutao, Shen, Yunhang, Li, Ke, Zhang, Shengchuan, Cao, Liujuan, Ji, Rongrong |
Publication Information: | arXiv |
Publication Year: | 2024 |
Collection: | DataCite Metadata Store (German National Library of Science and Technology) |
Subject Terms: | Computer Vision and Pattern Recognition cs.CV, FOS Computer and information sciences |
Description: | Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly. Currently, leveraging semantic information to enhance IQA is a crucial research direction. Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness. However, the generalist nature of these pre-trained Vision-Language (VL) models often renders them suboptimal for IQA-specific tasks. Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings. Existing prompt-based VL models overly focus on incremental semantic information from text, neglecting the rich insights available from visual data analysis. This imbalance limits their performance improvements in IQA tasks. This paper introduces an innovative multi-modal prompt-based methodology for IQA. Our approach employs carefully crafted prompts ... |
Document Type: | article in journal/newspaper report |
Language: | unknown |
DOI: | 10.48550/arxiv.2404.14949 |
Availability: | https://doi.org/10.48550/arxiv.2404.14949 https://arxiv.org/abs/2404.14949 |
Rights: | arXiv.org perpetual, non-exclusive license ; http://arxiv.org/licenses/nonexclusive-distrib/1.0/ |
Accession Number: | edsbas.75EF1814 |
Database: | BASE |