Author/Editor     Lusa, Lara; Blagus, Rok
Title     The class-imbalance problem for high-dimensional class prediction
Type     article
Source     In: Tao D, editor. ICMLA 2012. 11th International conference on machine learning and applications; 2012 Dec 12-15; Boca Raton. Institute of electrical and electronics engineers,
Year of publication     2012
Extent     pp. 123-6
Language     eng
Abstract     The goal of class prediction studies is to develop rules that accurately predict the class membership of new subjects. Classifiers differ in how they combine the values of the variables available for each subject. Classifiers are frequently developed using class-imbalanced data, where the number of samples in each class is not equal. Standard classification methods applied to class-imbalanced data are often biased towards the majority class: they assign most new samples to the majority class and do not accurately predict the minority class. Data are high-dimensional when the number of variables greatly exceeds the number of subjects. In this paper we show how high dimensionality poses additional challenges for class prediction with class-imbalanced data. We present new simulation studies for five classifiers, extending our previous results to correlated variables, and briefly discuss the results.