Academic thesis

Network-based support vector machines for classification of microarray gene expression data.

Bibliographic details
Title: Network-based support vector machines for classification of microarray gene expression data.
Authors: Zhu, Yanni
Publication year: 2009
Collection: University of Minnesota Digital Conservancy
Subject terms: Gene expression, Gene network, Penalization, Support vector machine
Description: University of Minnesota Ph.D. dissertation. September 2009. Major: Biostatistics. Advisor: Wei Pan. 1 computer file (PDF); xii, 98 pages. The importance of network-based approaches to identifying biological markers for diagnostic classification and prognostic assessment in the context of microarrays has been increasingly recognized. Standard methods treat all genes independently and identically a priori and ignore the biological observation that genes function together in biological processes. For binary classification, we are motivated to improve predictive accuracy and gene selection by developing novel network-based classification tools that explicitly incorporate interrelationships of genes as described by gene networks. We propose three network-based support vector machines (SVM) by suitably forming the penalty term. The neighboring-gene (NG) penalty groups pairwise gene neighbors and sums up the L1-norm of each group over the entire network, leading to NG-SVM. NG-SVM tends to select pairs of neighboring genes. The disease-gene-centric (DGC) penalty is constructed on groups defined on an upper-lower hierarchy imposed on the undirected network. DGC-SVM aims to detect collectives of genes clustering together and around some key disease genes. The truncated L1-norm (TL1) penalty is intended to correct the bias induced by penalization through a threshold parameter C > 0 built into the L1-norm as used in NG-SVM and DGC-SVM. Simulation studies and real data applications demonstrate that the proposed methods are able to capture more disease genes and fewer noise genes than the existing popular methods, standard SVM and L1-SVM. We conclude that the proposed methods have the potential to be effective classification tools for microarrays and other high-dimensional data.
Document type: thesis
Language: English
Relation: http://purl.umn.edu/57042Test
Availability: http://purl.umn.edu/57042Test
Accession number: edsbas.12D4A34C
Database: BASE