ASM Based Synthesis of Handwritten Arabic Text Pages

Bibliographic Details
Title: ASM Based Synthesis of Handwritten Arabic Text Pages
Authors: Sherif El-etriby, Ahmed Ghoneim, Ayoub Al-Hamadi, Moftah Elzobi, Laslo Dinges
Source: The Scientific World Journal, Vol 2015 (2015)
Publication Info: Hindawi Limited, 2015.
Publication Year: 2015
Subject Terms: Article Subject, Computer science, Speech recognition, lcsh:Medicine, computer.software_genre, lcsh:Technology, General Biochemistry, Genetics and Molecular Biology, Synthetic data, Preprocessor, Segmentation, lcsh:Science, General Environmental Science, Ground truth, lcsh:T, business.industry, Character (computing), lcsh:R, General Medicine, Spotting, Unicode, lcsh:Q, Artificial intelligence, business, computer, Natural language processing, Word (computer architecture), Research Article
Description: Document analysis tasks such as text recognition, word spotting, and segmentation depend heavily on comprehensive, suitable databases for training and validation. However, generating such databases is expensive in terms of labor and time. As a result, such databases are scarce, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods, each with its own demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents with detailed ground truth. Active Shape Models (ASMs) based on 28,046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM-based representations are composed into words and text pages, smoothed by B-spline interpolation, and rendered taking writing speed and pen characteristics into account. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document analysis methods on synthetic samples whenever sufficient natural ground-truthed data is not available.
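Illustrative note: the character synthesis step described above can be sketched roughly as follows. This is a minimal sketch and not the authors' implementation. It assumes that the online samples of one character have already been resampled to a fixed number of points and aligned, and it builds the shape model with plain PCA. The function names (fit_asm, sample_shape, bspline_smooth), the 3-sigma limit on the shape parameters, and the use of a uniform cubic B-spline that treats the sampled points as control points (an approximating rather than interpolating spline) are all assumptions made for illustration.

```python
import numpy as np

def fit_asm(shapes, n_modes):
    """Fit a PCA shape model (hypothetical helper).

    shapes: array (n_samples, n_points, 2) of aligned pen trajectories.
    Returns the mean shape vector, the first n_modes modes, and their variances.
    """
    X = shapes.reshape(len(shapes), -1)
    mean = X.mean(axis=0)
    # PCA via SVD of the centered data matrix
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    variances = (S ** 2) / (len(X) - 1)
    return mean, Vt[:n_modes], variances[:n_modes]

def sample_shape(mean, modes, variances, rng, limit=3.0):
    """Draw a new shape x = mean + b @ modes with b clipped to +/- limit sigma."""
    sigma = np.sqrt(variances)
    b = np.clip(rng.normal(0.0, sigma), -limit * sigma, limit * sigma)
    return (mean + b @ modes).reshape(-1, 2)

def bspline_smooth(points, samples_per_segment=8):
    """Evaluate a uniform cubic B-spline with `points` as control points."""
    t = np.linspace(0.0, 1.0, samples_per_segment, endpoint=False)
    # Uniform cubic B-spline basis functions, one row per parameter value
    B = np.stack([(1 - t) ** 3,
                  3 * t ** 3 - 6 * t ** 2 + 4,
                  -3 * t ** 3 + 3 * t ** 2 + 3 * t + 1,
                  t ** 3], axis=1) / 6.0
    # Repeat the end points so the curve starts at the first point
    # and approaches the last one
    P = np.vstack([points[0], points[0], points, points[-1], points[-1]])
    return np.vstack([B @ P[i:i + 4] for i in range(len(P) - 3)])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-in for aligned online samples of one character
    u = np.linspace(0.0, 3.0, 10)
    base = np.stack([u, np.sin(u)], axis=1)
    samples = base + 0.05 * rng.normal(size=(50, 10, 2))
    mean, modes, variances = fit_asm(samples, n_modes=3)
    glyph = sample_shape(mean, modes, variances, rng)
    curve = bspline_smooth(glyph)
    print(glyph.shape, curve.shape)
```

Clipping the shape parameters keeps each synthetic glyph within the range of variation seen in the training samples, which is the usual reason for bounding them in an ASM. Composing glyphs into words, simulating baselines and slant, and rendering pen characteristics are outside the scope of this sketch.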
File Description: text/xhtml
ISSN: 1537-744X
2356-6140
Open Access: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0fae96a19f112f60a382f281c8af6f08
https://doi.org/10.1155/2015/323575
Rights: OPEN
Accession Number: edsair.doi.dedup.....0fae96a19f112f60a382f281c8af6f08
Database: OpenAIRE