Academic Journal

PVTReID: A Quick Person Reidentification-Based Pyramid Vision Transformer

Bibliographic Details
Title: PVTReID: A Quick Person Reidentification-Based Pyramid Vision Transformer
Authors: Ke Han, Qianlong Wang, Mingming Zhu, Xiyan Zhang
Source: Applied Sciences, Vol 13, Iss 17, p 9751 (2023)
Publisher Information: MDPI AG, 2023.
Publication Year: 2023
Collection: LCC:Technology
LCC:Engineering (General). Civil engineering (General)
LCC:Biology (General)
LCC:Physics
LCC:Chemistry
Subject Terms: ReID, Pyramid Vision Transformer, local feature clustering, side information embeddings, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Description: Person re-identification (ReID) has attracted the attention of a large number of researchers due to its wide range of applications. However, because robust features are difficult to extract and the feature extraction process is complex, ReID remains difficult to apply in practice. In this paper, we utilize Pyramid Vision Transformer (PVT) as the backbone for feature extraction and propose a PVT-based ReID method in conjunction with other studies. First, we establish a basic model using powerful methods verified on CNN-based ReID. Second, to further improve the robustness of the features extracted from the PVT backbone, we design two new modules: (1) a local feature clustering (LFC) module is used to select the most discrete local features and cluster them individually by calculating the distance between local and global features, and (2) side information embeddings (SIE) are used to encode nonvisual information and feed it to the network during training in order to reduce its impact on the features. Our experiments show that the proposed PVTReID achieves an mAP of 63.2% on MSMT17 and 80.5% on DukeMTMC-reID. In addition, we evaluated the image inference speed of different methods, showing that inference is faster with our proposed method. These results illustrate that using PVT as a backbone network with LFC and SIE modules can improve inference speed while extracting robust features.
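The selection step of the LFC module described above can be illustrated with a minimal sketch. The abstract does not state the distance metric, how many local features are kept, or how the subsequent clustering is done, so everything here is an assumption for illustration: Euclidean distance is used, "most discrete" is read as "farthest from the global feature", and the function name `select_discrete_local_features` and the parameter `k` are hypothetical. This is not the authors' implementation.

```python
import math


def select_discrete_local_features(local_feats, global_feat, k):
    """Return indices of the k local features farthest from the global feature.

    Assumption: "most discrete" means largest Euclidean distance to the
    global feature; the paper's actual criterion may differ.
    """
    # Distance between each local feature vector and the global feature vector.
    dists = [math.dist(f, global_feat) for f in local_feats]
    # Rank local features by distance, largest first, and keep the top k.
    order = sorted(range(len(local_feats)), key=lambda i: dists[i], reverse=True)
    return order[:k]


# Toy 2-D features (hypothetical values, chosen only to make distances obvious).
global_feat = [0.0, 0.0]
local_feats = [[0.1, 0.0], [3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
selected = select_discrete_local_features(local_feats, global_feat, k=2)
print(selected)
```

In a real model the features would be high-dimensional tensors produced by the PVT backbone, and the selected local features would then be passed on to the clustering stage mentioned in the abstract.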
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2076-3417
Relation: https://www.mdpi.com/2076-3417/13/17/9751; https://doaj.org/toc/2076-3417
DOI: 10.3390/app13179751
Open Access: https://doaj.org/article/08ec3ea1700b4195885ae645ceba511a
Accession Number: edsdoj.08ec3ea1700b4195885ae645ceba511a
Database: Directory of Open Access Journals