Author/Editor     Vidmar, Gaj
Title     Pixelisation-based statistical visualisation for categorical datasets with spreadsheet software
Type     article
Source     Lect Notes Comput Sci
Vol. and no.     Volume 4370
Year of publication     2007
Extent     pp. 48-54
Language     eng
Abstract     A heat-map type of chart for depicting a large number of cases and up to twenty-five categorical variables with spreadsheet software is presented. It is implemented in Microsoft Excel using standard formulas, sorting and simple VBA code. The motivating example depicts the accuracy of automated assignment of MeSH descriptor headings to abstracts of medical articles. Within each abstract, the predicted support for each heading is ranked; then, for each heading actually assigned or not assigned by a human specialist (depicted by a black or white cell), high or low support is depicted on a nine-point two-colour scale. Thus, each case (abstract) is depicted by one row of a table and each variable (heading) by two adjacent columns. A rank-based classification accuracy measure is calculated for each case, and rows are sorted in order of increasing accuracy downwards. Based on an analogous measure, variables are sorted in order of increasing prediction accuracy rightwards. Another biomedical dataset is presented with a similar chart. Different methods for predicting binary outcomes can be visualised, and the procedure is easily extended to polytomous variables.
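
The ranking and sorting steps summarised in the abstract can be sketched roughly as follows. This is an illustrative Python sketch, not the Excel/VBA implementation the article describes. The abstract does not define its rank-based accuracy measure, so the pairwise measure used here (the share of assigned/non-assigned pairs in which the assigned item has the higher rank, ties counted as half) is an assumption, as are the toy data, the linear mapping of ranks onto the nine-point scale, and all function names.

```python
def rank_within_case(support):
    """Rank support values within one case: 1 = lowest, n = highest.
    Ties are broken by position (an assumption)."""
    order = sorted(range(len(support)), key=lambda j: support[j])
    ranks = [0] * len(support)
    for r, j in enumerate(order, start=1):
        ranks[j] = r
    return ranks


def pairwise_accuracy(flags, ranks):
    """Assumed accuracy measure: fraction of (assigned, non-assigned) pairs
    in which the assigned item has the higher rank; ties count as 0.5.
    Returns 0.5 when one of the two groups is empty (also an assumption)."""
    pos = [r for f, r in zip(flags, ranks) if f]
    neg = [r for f, r in zip(flags, ranks) if not f]
    if not pos or not neg:
        return 0.5
    score = sum(1.0 if p > q else 0.5 if p == q else 0.0
                for p in pos for q in neg)
    return score / (len(pos) * len(neg))


def nine_point(rank, n):
    """Map a rank in 1..n linearly onto a level in 1..9 (assumed mapping)."""
    if n == 1:
        return 5
    return 1 + round(8 * (rank - 1) / (n - 1))


# Hypothetical toy data: 4 cases (abstracts) x 3 variables (headings).
assigned = [  # 1 = heading assigned by the human specialist
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
support = [  # predicted support for each heading
    [0.9, 0.2, 0.1],
    [0.8, 0.3, 0.5],
    [0.7, 0.1, 0.4],
    [0.2, 0.3, 0.6],
]

n_vars = len(assigned[0])
ranks = [rank_within_case(row) for row in support]

# Accuracy per case (row) and, analogously, per variable (column).
row_acc = [pairwise_accuracy(a, r) for a, r in zip(assigned, ranks)]
col_acc = [
    pairwise_accuracy([row[j] for row in assigned], [row[j] for row in ranks])
    for j in range(n_vars)
]

# Rows sorted by increasing accuracy downwards, columns rightwards.
row_order = sorted(range(len(assigned)), key=lambda i: row_acc[i])
col_order = sorted(range(n_vars), key=lambda j: col_acc[j])

# Each variable becomes two adjacent cells: (assigned flag, support level).
chart = [
    [(assigned[i][j], nine_point(ranks[i][j], n_vars)) for j in col_order]
    for i in row_order
]

for i, row in zip(row_order, chart):
    print(f"case {i} (accuracy {row_acc[i]:.2f}):", row)
```

In a spreadsheet, the flag would correspond to the black/white cell and the level to the colour-scale cell next to it; here both are kept as plain numbers so the sorting logic stays visible.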