Academic Journal

Scene Text Segmentation via Multi-Task Cascade Transformer With Paired Data Synthesis

Bibliographic Details
Title: Scene Text Segmentation via Multi-Task Cascade Transformer With Paired Data Synthesis
Authors: Quang-Vinh Dang, Guee-Sang Lee
Source: IEEE Access, Vol 11, Pp 67791-67805 (2023)
Publisher Information: IEEE, 2023.
Publication Year: 2023
Collection: LCC:Electrical engineering. Electronics. Nuclear engineering
Subject Terms: Scene text segmentation, paired data synthesis, GANs, transformer, multi-task cascade, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Description: The scene text segmentation task provides a wide range of practical applications. However, the number of images in the available datasets for scene text segmentation is not large enough to effectively train deep learning-based models, leading to limited performance. To solve this problem, we employ paired data generation to secure sufficient data samples for text segmentation via Text Image-conditional GANs. Furthermore, existing models implicitly model text attributes such as size, layout, font, and structure, which hinders their performance. To remedy this, we propose a Multi-task Cascade Transformer network that explicitly learns these attributes using large volumes of generated synthetic data. The transformer-based network includes two auxiliary tasks and one main task for text segmentation. The auxiliary tasks help the network learn text regions to focus on, as well as the structure of the text through different words and fonts, to support the main task. To bridge the gap between different datasets, we train the proposed network on paired synthetic data before fine-tuning it on real data. Our experiments on publicly available scene text segmentation datasets show that our method outperforms existing methods.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2169-3536
Relation: https://ieeexplore.ieee.org/document/10172213/; https://doaj.org/toc/2169-3536
DOI: 10.1109/ACCESS.2023.3292264
Open Access: https://doaj.org/article/206d206d833e41d88533c9cc4cef6e98
Accession Number: edsdoj.206d206d833e41d88533c9cc4cef6e98
Database: Directory of Open Access Journals