Academic Journal

GSC-MIM: Global semantic integrated self-distilled complementary masked image model for remote sensing images scene classification

Bibliographic Details
Title: GSC-MIM: Global semantic integrated self-distilled complementary masked image model for remote sensing images scene classification
Authors: Xuying Wang, Yunsheng Zhang, Zhaoyang Zhang, Qinyao Luo, Jingfan Yang
Source: Frontiers in Ecology and Evolution, Vol 10 (2022)
Publisher Information: Frontiers Media S.A., 2022.
Publication Year: 2022
Collection: LCC:Evolution
LCC:Ecology
Subject Terms: self-supervised learning (SSL), masked image modeling (MIM), self-distillation, remote sensing images (RSIs), scene classification, Evolution, QH359-425, Ecology, QH540-549.5
Description: Masked image modeling (MIM) is a learning method in which the unmasked components of the input are used to predict the masked signal, enabling learning from large amounts of unannotated data. However, due to the scale diversity and complexity of features in remote sensing images (RSIs), existing MIMs face two challenges in the RSI scene classification task: (1) If the critical local patches of small-scale objects are randomly masked out, the model is unable to learn their representations. (2) MIM reconstruction relies on the visible local context surrounding the masked regions, and overemphasizing this local information can lead the model to disregard the global semantic information of the input RSI. To address these challenges, we propose a global semantic integrated self-distilled complementary masked image model (GSC-MIM) for RSI scene classification. To prevent information loss, we propose an information-preserved complementary masking strategy (IPC-Masking), which generates two complementary masked views of the same image so that critical areas of small-scale objects masked in one view remain visible in the other. To incorporate global information into MIM pre-training, we propose a global semantic distillation strategy (GSD): an auxiliary network pipeline extracts the global semantic information from the full input RSI and transfers that knowledge to the MIM by self-distillation. GSC-MIM is validated on three publicly available datasets, AID, NWPU-RESISC45, and UC-Merced Land Use, and the results show that its Top-1 accuracy surpasses the baseline approaches on the three datasets by up to 4.01%, 3.87%, and 5.26%, respectively.
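Illustrative note: the complementary masking idea in the description above can be sketched as follows. This is only a minimal sketch of generating two complementary patch masks, not the authors' IPC-Masking implementation. The function name `complementary_masks`, the 50% mask ratio, and the 196-patch grid are assumptions made for illustration and do not come from the record.

```python
import random


def complementary_masks(num_patches, mask_ratio=0.5, seed=None):
    """Return two boolean patch masks (True = masked) that are exact complements.

    Hypothetical helper: the name, the default ratio, and the random sampling
    scheme are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)

    # Patches masked in the first view.
    num_masked = int(num_patches * mask_ratio)
    masked_in_a = set(indices[:num_masked])

    mask_a = [i in masked_in_a for i in range(num_patches)]
    # The second view masks exactly the patches left visible in the first.
    mask_b = [not m for m in mask_a]
    return mask_a, mask_b


# Example: a 14 x 14 patch grid (196 patches), assumed for illustration.
mask_a, mask_b = complementary_masks(num_patches=196, mask_ratio=0.5, seed=0)

# Every patch is visible in exactly one of the two views.
assert all(a != b for a, b in zip(mask_a, mask_b))
```

Because the second mask is the complement of the first, any patch hidden in one view is visible in the other, which matches the abstract's stated goal of not losing the critical patches of small-scale objects.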
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2296-701X
Relation: https://www.frontiersin.org/articles/10.3389/fevo.2022.1083801/full; https://doaj.org/toc/2296-701X
DOI: 10.3389/fevo.2022.1083801
Open Access: https://doaj.org/article/baafacd9505248ddb194af7aee6d3e24
Accession Number: edsdoj.baafacd9505248ddb194af7aee6d3e24
Database: Directory of Open Access Journals