Academic Journal

Using coarse information for real valued prediction.

Bibliographic Details
Title: Using coarse information for real valued prediction.
Authors: Dhurandhar, Amit
Source: Data Mining & Knowledge Discovery; Sep 2013, Vol. 27 Issue 2, p167-192, 26p, 7 Diagrams, 4 Charts, 4 Graphs
Subject Terms: INFORMATION theory, PREDICTION models, CONSUMER goods, REGRESSION analysis, ESTIMATES, DATA analysis
Abstract: In domains such as consumer products and manufacturing amongst others, we have problems that warrant the prediction of a continuous target. Besides the usual set of explanatory attributes, we may also have exact (or approximate) estimates of aggregated targets, which are the sums of disjoint sets of individual targets that we are trying to predict. The question now becomes: can we use these aggregated targets, which are a coarser piece of information, to improve the quality of predictions of the individual targets? In this paper, we provide a simple yet provable way of accomplishing this. In particular, given predictions from any regression model of the target on the test data, we elucidate a provable method for improving these predictions in terms of mean squared error, given exact (or accurate enough) information of the aggregated targets. These estimates of the aggregated targets may be readily available or obtained, through multilevel regression, at different levels of granularity. Based on the proof of our method we suggest a criterion for choosing the appropriate level. Moreover, in addition to estimates of the aggregated targets, if we have exact (or approximate) estimates of the mean and variance of the target distribution, then based on our general strategy we provide an optimal way of incorporating this information so as to further improve the quality of predictions of the individual targets. We then validate the results and our claims by conducting experiments on synthetic and real industrial data obtained from diverse domains. [ABSTRACT FROM AUTHOR]
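The abstract does not spell out the correction itself, so the following is only an illustrative sketch of one natural way to use exact group sums, not necessarily the author's method. It assumes each group's aggregate is known exactly and shifts every prediction in a group by the same amount so that the group's predictions sum to that aggregate. Because the true targets satisfy the same sum constraint, this shift is an orthogonal projection onto a set containing the truth, so squared error cannot increase under that assumption. The names `adjust_to_aggregates` and `mse` and all sample values are hypothetical.

```python
from collections import defaultdict


def adjust_to_aggregates(preds, groups, aggregates):
    """Shift predictions so each group's sum equals its known aggregate.

    preds      -- individual predictions from any regression model
    groups     -- group label for each prediction (groups are disjoint)
    aggregates -- dict mapping group label to the known sum of true targets
    Groups without a known aggregate are left unchanged.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for p, g in zip(preds, groups):
        sums[g] += p
        counts[g] += 1
    # Equal shift per group: spread the gap between the known total and
    # the predicted total evenly over the group's members.
    shift = {
        g: (aggregates[g] - sums[g]) / counts[g]
        for g in sums
        if g in aggregates
    }
    return [p + shift.get(g, 0.0) for p, g in zip(preds, groups)]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical data: five targets in two disjoint groups.
y_true = [2.0, 4.0, 6.0, 1.0, 3.0]
preds = [3.0, 5.0, 8.0, 1.5, 2.0]
groups = [0, 0, 0, 1, 1]
aggregates = {0: 12.0, 1: 4.0}  # assumed exact sums of y_true per group

adjusted = adjust_to_aggregates(preds, groups, aggregates)
print("MSE before:", mse(preds, y_true))
print("MSE after: ", mse(adjusted, y_true))
```

When the aggregates are only approximate, the guarantee in this sketch no longer holds automatically, which is consistent with the abstract's requirement of "exact (or accurate enough)" aggregate information.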
Database: Complementary Index
Description
ISSN: 1384-5810
DOI:10.1007/s10618-012-0287-5