Sketching the Heat Kernel: Using Gaussian Processes to Embed Data

Bibliographic Details
Title: Sketching the Heat Kernel: Using Gaussian Processes to Embed Data
Authors: Gilbert, Anna C.; O'Neill, Kevin
Publication Year: 2024
Collection: Computer Science; Mathematics; Statistics
Subject Terms: Computer Science - Machine Learning, Mathematics - Numerical Analysis, Statistics - Machine Learning
Description: This paper introduces a novel, non-deterministic method for embedding data in low-dimensional Euclidean space based on computing realizations of a Gaussian process depending on the geometry of the data. This type of embedding first appeared in (Adler et al., 2018) as a theoretical model for a generic manifold in high dimensions. In particular, we take the covariance function of the Gaussian process to be the heat kernel, and computing the embedding amounts to sketching a matrix representing the heat kernel. The Karhunen-Loève expansion reveals that the straight-line distances in the embedding approximate the diffusion distance in a probabilistic sense, avoiding the need for sharp cutoffs and maintaining some of the smaller-scale structure. Our method demonstrates a further advantage in its robustness to outliers. We justify the approach with both theory and experiments.
Comment: 28 pages
Document Type: Working Paper
Open Access: http://arxiv.org/abs/2403.07929
Accession Number: edsarx.2403.07929
Database: arXiv
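
The description above says that computing the embedding amounts to sketching a matrix representing the heat kernel. The following is a minimal sketch of that idea and not the authors' implementation. It assumes a discretization the record does not specify: a dense Gaussian-affinity graph with bandwidth `sigma`, the unnormalized graph Laplacian L, and the heat kernel H_t = exp(-tL). The function name `heat_kernel_sketch` and all of its parameters are hypothetical. Each of the k embedding coordinates is one draw of a zero-mean Gaussian process with covariance H_t, obtained by multiplying exp(-(t/2)L) by a Gaussian random matrix.

```python
import numpy as np


def heat_kernel_sketch(X, k=50, t=1.0, sigma=1.0, seed=0):
    """Embed the rows of X in R^k using k Gaussian-process draws.

    Assumed setup (not taken from the paper): Gaussian affinities,
    unnormalized graph Laplacian, heat kernel H_t = exp(-t L).
    Returns the embedding Y (n x k) and the heat kernel matrix H_t (n x n).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Pairwise squared Euclidean distances, clipped at zero for stability.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Gaussian affinities without self-loops, then L = D - W.
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W

    # Symmetric eigendecomposition gives exp(-(t/2) L) directly.
    evals, evecs = np.linalg.eigh(L)
    H_half = (evecs * np.exp(-0.5 * t * evals)) @ evecs.T

    # Sketch: each column of H_half @ G is a sample from N(0, H_t),
    # because H_half @ H_half.T equals H_t.
    G = rng.standard_normal((n, k))
    Y = H_half @ G / np.sqrt(k)

    H_t = H_half @ H_half
    return Y, H_t
```

Under these assumptions, the expected squared distance between embedded points i and j is H_t(i,i) + H_t(j,j) - 2 H_t(i,j), which equals the squared diffusion distance at time t/2 for this symmetric kernel. Because no eigenvectors are discarded, there is no sharp spectral cutoff; larger k only reduces the random fluctuation around that expectation.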