Author/Editor     Džeroski, Sašo; Hristovski, Dimitar; Kunej, Tanja; Peterlin, Borut
Title     A data mining approach to the development of a diagnostic test for male infertility
Type     article
Source     Stud Health Technol Inform
Vol. and No.     Volume 77
Publication year     2000
Volume     pp. 779-83
Language     eng
Abstract     The paper presents a database of published Y chromosome deletions and the results of analyzing it with data mining and other heuristic techniques, with the goal of developing a diagnostic test for male infertility. The database describes 382 patients, for whom 177 markers were tested. Two data mining techniques, clustering and decision tree induction, were used, as well as a heuristic set covering algorithm. Clustering was used to group markers according to their appearance across patients, while the heuristic set covering algorithm was used to select a set of markers that is as small as possible while covering as many patients with deletions as possible. This algorithm produced a diagnostic set of 13 markers that covers more than 90% of the patients with deletions. Finally, decision tree induction was used to relate deletion patterns to the severity of the clinical phenotype. A decision tree induced from the data uses 5 markers, all of which are also in the diagnostic set of 13 markers, and shows previously unknown relations between the severity of the clinical phenotype and deletion patterns.
Descriptors     INFERTILITY, MALE
Y CHROMOSOME
CHROMOSOME DELETION
GENETIC MARKERS
DECISION MAKING, COMPUTER-ASSISTED
DECISION TREES
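
Note     The abstract does not spell out the heuristic set covering algorithm. The sketch below shows one common greedy heuristic for this kind of marker selection and should not be read as the authors' method. The function name, the marker-to-patients dictionary layout, the `target_fraction` parameter, and the toy marker and patient identifiers are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def greedy_marker_cover(
    deletions: Dict[str, Set[str]],
    target_fraction: float = 0.9,
) -> List[str]:
    """Greedily pick markers until the target share of patients is covered.

    `deletions` maps a marker name to the set of patient ids in which that
    marker is deleted (a hypothetical layout, not the paper's database schema).
    """
    # Every patient with at least one deleted marker.
    all_patients: Set[str] = set().union(*deletions.values())
    covered: Set[str] = set()
    chosen: List[str] = []

    while all_patients and len(covered) / len(all_patients) < target_fraction:
        # Pick the marker that covers the most still-uncovered patients.
        # Iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(deletions), key=lambda m: len(deletions[m] - covered))
        gain = deletions[best] - covered
        if not gain:
            break  # no remaining marker adds coverage
        chosen.append(best)
        covered |= gain

    return chosen


# Toy data with made-up marker and patient ids.
toy = {
    "m1": {"p1", "p2", "p3"},
    "m2": {"p3", "p4"},
    "m3": {"p4"},
    "m4": {"p5"},
}
selected = greedy_marker_cover(toy, target_fraction=1.0)
print(selected)
```

The greedy choice does not guarantee the smallest possible marker set, but it is simple and usually gives a compact one. Lowering `target_fraction` trades coverage for fewer markers, which mirrors the abstract's goal of covering most, rather than all, patients with deletions.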