DIR-ST$^2$: Delineation of Imprecise Regions Using Spatio--Temporal--Textual Information

Bibliographic details
Title: DIR-ST$^2$: Delineation of Imprecise Regions Using Spatio--Temporal--Textual Information
Authors: Tran, Cong; Shin, Won-Yong; Choi, Sang-Il
Publication year: 2018
Collection: Computer Science
Statistics
Subject terms: Computer Science - Information Retrieval, Computer Science - Learning, Statistics - Machine Learning
Description: In the literature, an imprecise region is a geographical area without a clearly defined boundary. Previous clustering-based approaches exploit spatial information to find such regions. However, these studies suffer from two problems: subjectivity in selecting the clustering parameters, and the inclusion of a large portion of undesirable region (i.e., a large number of noise points). To overcome these problems, we present DIR-ST$^2$, a novel framework for delineating an imprecise region by iteratively performing density-based clustering, namely DBSCAN, using not only spatio--textual information but also temporal information from social media. Specifically, we aim to find a proper radius of the circle used in the iterative DBSCAN process by gradually reducing the radius at each iteration, leveraging the temporal information acquired from all resulting clusters. Then, we propose an efficient and automated algorithm that delineates the imprecise region via hierarchical clustering. Experimental results show that, by virtue of the significant noise reduction in the region, our DIR-ST$^2$ method outperforms the state-of-the-art approach employing a one-class support vector machine in terms of the $\mathcal{F}_1$ score, computed by comparison with precisely defined regions regarded as ground truth, and returns apparently better delineation of imprecise regions. The computational complexity of DIR-ST$^2$ is also shown analytically and numerically.
Comment: 11 pages, 12 figures, 3 tables
Document type: Working Paper
Open access: http://arxiv.org/abs/1806.03482
Accession number: edsarx.1806.03482
Database: arXiv
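
The iterative step named in the description (running DBSCAN repeatedly while gradually reducing the radius) can be illustrated with a minimal sketch. This is not the authors' algorithm: the record does not say how the temporal information selects the final radius, so the sketch only records the cluster and noise counts at each radius. The function names `dbscan` and `shrink_radius`, the multiplicative `shrink` factor, and the toy points are all assumptions made for illustration.

```python
from math import hypot

NOISE = -1


def dbscan(points, eps, min_pts):
    """Plain DBSCAN on 2-D points; returns one label per point (-1 = noise)."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        # All points (including i itself) within distance eps of point i.
        xi, yi = points[i]
        return [j for j in range(n)
                if hypot(points[j][0] - xi, points[j][1] - yi) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(nb)
        cluster += 1
    return labels


def shrink_radius(points, eps_start, shrink, min_pts, rounds):
    """Hypothetical driver: rerun DBSCAN, multiplying eps by `shrink` each round.

    Returns a list of (eps, number_of_clusters, number_of_noise_points).
    The stopping rule based on temporal information is NOT modeled here.
    """
    history = []
    eps = eps_start
    for _ in range(rounds):
        labels = dbscan(points, eps, min_pts)
        n_clusters = len({lab for lab in labels if lab != NOISE})
        history.append((eps, n_clusters, labels.count(NOISE)))
        eps *= shrink
    return history


# Toy data: two tight groups and one isolated point between them.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
history = shrink_radius(points, eps_start=8, shrink=0.5, min_pts=3, rounds=3)
for eps, n_clusters, n_noise in history:
    print(f"eps={eps}: clusters={n_clusters}, noise={n_noise}")
```

With a large radius the isolated point bridges the two groups into a single cluster; once the radius shrinks, the groups separate and the isolated point is marked as noise. A real pipeline would need some criterion, such as the temporal one the description mentions, to decide which radius to keep.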