Parallel incremental efficient attribute reduction algorithm based on attribute tree.

Bibliographic details
Title: Parallel incremental efficient attribute reduction algorithm based on attribute tree.
Authors: Ding, Weiping (dwp9988@163.com); Qin, Tingzhen; Shen, Xinjie; Ju, Hengrong; Wang, Haipeng; Huang, Jiashuang; Li, Ming
Source: Information Sciences. Sep 2022, Vol. 610, pp. 1102-1121. 20 pages.
Subject terms: *ALGORITHMS, *ELECTRONIC data processing, PLYOMETRICS, ROUGH sets, MULTICASTING (Computer networks), VIDEO coding, TREES, MACHINE learning
Company/Entity: INTERNATIONAL Agency for Research on Cancer
Abstract:
• We introduce the mechanism of a binary tree and propose a parallel incremental acceleration strategy based on the attribute tree.
• A branch threshold coefficient is added to the calculation process to guide the algorithm out of the loop, avoiding redundant calculations and reducing the number of attribute evaluations.
• When multiple incremental objects are added to the decision system, the incremental mechanism can be used to update the reduction.
• We combine IARAT with Spark parallel technology to parallelize data processing and accelerate the calculation.
Attribute reduction is an important application of rough sets. Reducing massive dynamic data sets efficiently has always been a major goal of researchers. Traditional incremental methods focus on reduction by updated approximations. However, these methods must evaluate all attributes and repeatedly calculate their importance. Because of this high time complexity, they become inefficient when applied to large decision systems. To solve this problem, we propose an incremental acceleration strategy based on attribute trees. The key step is to cluster all attributes into multiple trees for incremental attribute evaluation. Specifically, we first select the appropriate attribute tree for attribute evaluation according to an attribute tree correlation measure, which reduces the time complexity. Next, a branch coefficient is added to the stop criterion; it increases with the branch depth and guides a jump out of the loop once the maximum threshold is reached. This avoids redundant calculation and improves efficiency. Using these improvements, we propose an algorithm for incremental attribute reduction based on attribute trees (IARAT). Finally, a Spark parallel mechanism is added to parallelize data processing, yielding the parallel incremental efficient attribute reduction algorithm based on the attribute tree. Experimental results on the Shuttle dataset show that the time consumption of our algorithm is more than 40% lower than that of the classical IARC algorithm while maintaining its good classification performance. In addition, after adding the Spark parallelizing mechanism, the time is shortened by more than 87% relative to the benchmark. [ABSTRACT FROM AUTHOR]
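The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea it describes: score attributes by the size of the rough-set positive region, scan attributes group by group, leave a group early after a run of unhelpful candidates, and reuse the previous reduct when new objects arrive. The function names, the `groups` argument (a flat stand-in for the attribute trees), and the `max_misses` counter (a stand-in for the branch threshold coefficient) are all assumptions made for illustration, not the authors' definitions. The tree-clustering step and the Spark parallelization are not shown, and the greedy search is not guaranteed to return a minimal reduct.

```python
from collections import defaultdict


def positive_count(rows, labels, attrs):
    """Count rows whose equivalence class under `attrs` carries a single label."""
    seen_labels = defaultdict(set)
    sizes = defaultdict(int)
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        seen_labels[key].add(label)
        sizes[key] += 1
    return sum(sizes[k] for k, seen in seen_labels.items() if len(seen) == 1)


def update_reduct(rows, labels, groups, start=(), max_misses=2):
    """Greedy forward selection over attribute groups with an early exit.

    groups     -- lists of attribute indices (hypothetical stand-in for trees)
    start      -- reduct from a previous run, reused for incremental updates
    max_misses -- hypothetical threshold: leave a group after this many
                  consecutive candidates that do not enlarge the positive region
    """
    reduct = list(start)
    all_attrs = [a for group in groups for a in group]
    target = positive_count(rows, labels, all_attrs)
    current = positive_count(rows, labels, reduct)
    progressed = True
    while current < target and progressed:
        progressed = False
        for group in groups:
            misses = 0
            for a in group:
                if a in reduct:
                    continue
                score = positive_count(rows, labels, reduct + [a])
                if score > current:
                    reduct.append(a)
                    current, progressed, misses = score, True, 0
                    if current == target:
                        return reduct
                else:
                    misses += 1
                    if misses >= max_misses:
                        break  # jump out of this group early
    return reduct


# Toy decision system: three condition attributes, one decision label per row.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)]
labels = ["n", "n", "y", "y"]
groups = [[0, 2], [1]]
reduct = update_reduct(rows, labels, groups)

# A new object arrives; start from the old reduct instead of from scratch.
rows2 = rows + [(1, 1, 1)]
labels2 = labels + ["n"]
reduct2 = update_reduct(rows2, labels2, groups, start=reduct)
```

On this toy data the first call keeps attribute 0 alone. After the new object arrives, attribute 0 no longer separates the labels on its own, and the incremental call extends the reduct with attribute 2 rather than re-evaluating every attribute from an empty set.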
Database: Business Source Index
Description
ISSN: 0020-0255
DOI: 10.1016/j.ins.2022.08.044