Author/Editor     Blagus, Rok; Goeman, Jelle J.
Title     What (not) to expect when classifying rare events
Type     article
Publication year     2016
Volume     str. str.
ISSN     1467-5463 - Briefings in bioinformatics
Language     eng
Abstract     When building classifiers, it is natural to require that the classifier correctly estimates the event probability (Constraint 1), that it has equal sensitivity and specificity (Constraint 2) or that it has equal positive and negative predictive values (Constraint 3). We prove that in the balanced case, where there are equal proportions of events and non-events, any classifier that satisfies one of these constraints will always satisfy all three. Such unbiasedness between events and non-events is much more difficult to achieve in the case of rare events, i.e. the situation in which the proportion of events is (much) smaller than 0.5. Here, we prove that it is impossible to meet all three constraints unless the classifier achieves perfect predictions. Any non-perfect classifier can satisfy at most one constraint, and satisfying one constraint implies violating the other two constraints in a specific direction. Our results have implications for classifiers optimized using g-means or F1-measure, which tend to satisfy Constraints 2 and 1, respectively. Our results are derived from basic probability theory and illustrated with simulations based on some frequently used classifiers.
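The three constraints in the abstract can be checked on a confusion matrix. The following is a minimal sketch, not taken from the article: the function names and the example counts are hypothetical, and Constraint 1 is read here as "the proportion of cases classified as events equals the true proportion of events", which is an assumption about the authors' definition.

```python
from math import isclose


def rates(tp, fn, fp, tn):
    """Basic rates from confusion-matrix counts (hypothetical helper)."""
    n = tp + fn + fp + tn
    return {
        "prevalence": (tp + fn) / n,            # true proportion of events
        "predicted_event_rate": (tp + fp) / n,  # proportion classified as events
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def constraints_met(tp, fn, fp, tn, tol=1e-9):
    """Return which of the three constraints hold, under the reading assumed above."""
    r = rates(tp, fn, fp, tn)
    return {
        1: isclose(r["predicted_event_rate"], r["prevalence"], abs_tol=tol),
        2: isclose(r["sensitivity"], r["specificity"], abs_tol=tol),
        3: isclose(r["ppv"], r["npv"], abs_tol=tol),
    }


# Balanced case (50 events, 50 non-events), made-up counts with FP == FN.
balanced = constraints_met(tp=40, fn=10, fp=10, tn=40)

# Rare-event case (100 events, 900 non-events), made-up counts with FP == FN,
# so Constraint 1 holds by construction.
rare = constraints_met(tp=60, fn=40, fp=40, tn=860)

print("balanced:", balanced)
print("rare:    ", rare)
```

With these illustrative counts, the balanced table satisfies all three constraints at once, while the rare-event table satisfies Constraint 1 but has sensitivity 0.6 against specificity of about 0.96, and the same gap between the positive and negative predictive values.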
Keywords     statistics
rare events
prediction models