Enhancing Cluster Quality of Numerical Datasets with Domain Ontology

Bibliographic Details
Title: Enhancing Cluster Quality of Numerical Datasets with Domain Ontology
Authors: Heiyanthuduwage, Sudath Rohitha; Rahman, Md Anisur; Islam, Md Zahidul
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Databases, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Description: Ontology-based clustering has gained attention in recent years due to the potential benefits of ontology. Current ontology-based clustering approaches have mainly been applied to reduce the dimensionality of attributes in text document clustering. Reducing the dimensionality of attributes using ontology helps to produce high-quality clusters for a dataset. However, ontology-based approaches to clustering numerical datasets have not gained enough attention. Moreover, some literature mentions that ontology-based clustering can produce either high-quality or low-quality clusters from a dataset. Therefore, in this paper we present a clustering approach that uses domain ontology to reduce the dimensionality of attributes in a numerical dataset and to produce high-quality clusters. For every dataset, we produce three datasets using domain ontology. We then cluster these datasets using a genetic algorithm-based clustering technique called GenClust++. The clusters of each dataset are evaluated in terms of Sum of Squared Error (SSE). We use six numerical datasets to evaluate the performance of our ontology-based approach. The experimental results of our approach indicate that cluster quality gradually improves from the lower to the higher levels of a domain ontology.
Comment: 6 Pages, IEEE CSDE2022 Conference Paper
Document Type: Working Paper
Open Access: http://arxiv.org/abs/2304.00653
Accession Number: edsarx.2304.00653
Database: arXiv
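
Note on the evaluation measure: the description states that clusters are evaluated by Sum of Squared Error (SSE). The sketch below shows one common way to compute SSE, as the sum of squared Euclidean distances from each record to the mean of its assigned cluster. It is an illustration under that assumption only; the paper's exact SSE definition, its ontology-based attribute reduction, and GenClust++ itself are not reproduced here, and the function name `sum_of_squared_error` and the sample records are hypothetical.

```python
from collections import defaultdict


def sum_of_squared_error(points, labels):
    """Return the SSE of a clustering (lower means tighter clusters).

    points: list of equal-length numeric tuples (the records).
    labels: list of cluster ids, one per record.
    Assumes each cluster's centre is the mean of its members.
    """
    # Group records by their cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid: attribute-wise mean of the cluster's records.
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        # Add the squared Euclidean distance of each record to the centroid.
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total


# Hypothetical toy records: two well-separated pairs.
records = [(0, 0), (0, 2), (10, 10), (10, 12)]
two_clusters = sum_of_squared_error(records, [0, 0, 1, 1])
one_cluster = sum_of_squared_error(records, [0, 0, 0, 0])
print(two_clusters, one_cluster)
```

Comparing SSE across the datasets derived from different ontology levels, as the description outlines, only makes sense when the values are computed on a comparable attribute scale; how the paper handles that is not stated in this record.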