Thesis

Mixture modelling of multiresolution 0-1 data

Bibliographic Details
Title: Mixture modelling of multiresolution 0-1 data
Authors: Adhikari, Prem Raj
Contributors: Hollmén, Jaakko, Perustieteiden korkeakoulu, School of Science, Tietotekniikan laitos, Kaski, Samuel, Aalto-yliopisto, Aalto University
Publication Year: 2010
Collection: Aalto University Publication Archive (Aaltodoc) / Aalto-yliopiston julkaisuarkistoa
Subject Terms: mixture models, multiresolution data, 0-1 data, model selection, cross-validation, chromosomal aberration, upsampling, downsampling, cancer genetics
Description: Biological systems are complex, and measurements in biology are made with high-throughput, high-resolution techniques, often resulting in data in multiple resolutions. Furthermore, ISCN [1] has defined five different resolutions of the chromosome band. Currently available standard algorithms can only handle data in one resolution at a time. Hence, the data must be transformed to the same resolution before they can be fed to the algorithm. Furthermore, comparing the results of an algorithm on data in different resolutions can produce interesting results that aid in determining a suitable resolution for the data. In addition, experiments in different resolutions can be helpful in determining the appropriate resolution for computational methods. In this thesis, one method for upsampling and three different methods for downsampling 0-1 data are proposed and implemented, and experiments are performed on different resolutions. The suitability of the proposed methods is validated and the results are compared across different resolutions. The proposed methods produce plausible results, showing that the significant patterns in the data are retained in the transformed resolution. Thereafter, mixture models are trained on the original data and the results are analyzed. However, machine learning methods such as mixture models require large amounts of data to produce plausible results. Therefore, the major aim of the data transformation procedure was the integration of databases. Hence, two different datasets available in two different resolutions were integrated after transforming them to a single resolution, and mixture models were trained on them. Trained models can be used to classify cancers and cluster the data. The results on integrated data showed significant improvements compared with the data in the original resolution.
Document Type: master thesis
Language: English
Relation: https://aaltodoc.aalto.fi/handle/123456789/99063Test; URN:NBN:fi:aalto-2020122357890
Availability: https://aaltodoc.aalto.fi/handle/123456789/99063Test
Rights: closedAccess
Accession Number: edsbas.484BA4EA
Database: BASE