Academic Journal

Schema Theory-Based Data Engineering in Gene Expression Programming for Big Data Analytics.

Bibliographic Details
Title: Schema Theory-Based Data Engineering in Gene Expression Programming for Big Data Analytics.
Authors: Huang, Zhengwen, Li, Maozhen, Chousidis, Christos, Mousavi, Alireza, Jiang, Changjun
Source: IEEE Transactions on Evolutionary Computation; Oct2018, Vol. 22 Issue 5, p792-804, 13p
Subject Terms: GENE expression, SOFTWARE analytics, SEGMENTATION (Biology), GENETIC engineering, DATA mining, BIG data
Abstract: Gene expression programming (GEP) is a data driven evolutionary technique that well suits for correlation mining. Parallel GEPs are proposed to speed up the evolution process using a cluster of computers or a computer with multiple CPU cores. However, the generation structure of chromosomes and the size of input data are two issues that tend to be neglected when speeding up GEP in evolution. To fill the research gap, this paper proposes three guiding principles to elaborate the computation nature of GEP in evolution based on an analysis of GEP schema theory. As a result, a novel data engineered GEP is developed which follows closely the generation structure of chromosomes in parallelization and considers the input data size in segmentation. Experimental results on two data sets with complementary features show that the data engineered GEP speeds up the evolution process significantly without loss of accuracy in data correlation mining. Based on the experimental tests, a computation model of the data engineered GEP is further developed to demonstrate its high scalability in dealing with potential big data using a large number of CPU cores. [ABSTRACT FROM AUTHOR]
Copyright of IEEE Transactions on Evolutionary Computation is the property of IEEE and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Complementary Index
Description
ISSN: 1089778X
DOI: 10.1109/TEVC.2017.2771445