Author/Editor | Džeroski, Sašo; Hristovski, Dimitar; Kunej, Tanja; Peterlin, Borut | |
Title | A data mining approach to the development of a diagnostic test for male infertility | |
Type | article | |
Source | Stud Health Technol Inform | |
Vol. and No. | Volume 77 | |
Publication year | 2000 | |
Pages | pp. 779-83 | |
Language | eng | |
Abstract | The paper presents a database of published Y chromosome deletions and the results of analyzing it with data mining and other heuristic techniques, with the goal of developing a diagnostic test for male infertility. The database describes 382 patients, for whom 177 markers were tested. Two data mining techniques, clustering and decision tree induction, were used, as well as a heuristic set covering algorithm. Clustering was used to group markers according to their appearance across patients, while the heuristic set covering algorithm was used to select as small a set of markers as possible that covers as many patients with deletions as possible. This algorithm produced a diagnostic set of 13 markers that covers more than 90% of the patients with deletions. Finally, decision tree induction was used to relate deletion patterns to the severity of the clinical phenotype. A decision tree induced from the data uses 5 markers, all of which are also in the diagnostic set of 13 markers, and shows previously unknown relations between the severity of the clinical phenotype and deletion patterns. | |
Descriptors | INFERTILITY, MALE; Y CHROMOSOME; CHROMOSOME DELETION; GENETIC MARKERS; DECISION MAKING, COMPUTER-ASSISTED; DECISION TREES |
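
The abstract mentions a heuristic set covering step that picks a small set of markers covering most patients with deletions. The record does not say which heuristic the authors used, so the following is only a minimal sketch of the standard greedy set cover heuristic, assuming the data can be represented as a mapping from each marker to the set of patients in whom it is deleted. The function name `greedy_marker_cover`, the `coverage_target` parameter, and the toy marker names are hypothetical and do not come from the paper.

```python
def greedy_marker_cover(deletions, coverage_target=0.9):
    """Greedy set cover sketch (an assumed heuristic, not the paper's code).

    deletions: dict mapping a marker name to the set of patient ids
               in whom that marker is deleted.
    coverage_target: fraction of patients with deletions to cover.
    Returns (chosen markers in pick order, fraction of patients covered).
    """
    all_patients = set()
    for patients in deletions.values():
        all_patients |= patients
    if not all_patients:
        return [], 0.0

    uncovered = set(all_patients)
    chosen = []
    total = len(all_patients)
    while total - len(uncovered) < coverage_target * total:
        # Pick the marker that covers the most still-uncovered patients;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(deletions), key=lambda m: len(deletions[m] & uncovered))
        gain = deletions[best] & uncovered
        if not gain:
            break  # no remaining marker adds coverage
        chosen.append(best)
        uncovered -= gain
    return chosen, (total - len(uncovered)) / total


# Toy, made-up data: four hypothetical markers and five patients.
toy = {"m1": {1, 2, 3}, "m2": {3, 4}, "m3": {5}, "m4": {1, 2}}
markers, fraction = greedy_marker_cover(toy, coverage_target=1.0)
print(markers, fraction)
```

A greedy choice does not guarantee the smallest possible marker set, but it is a common and inexpensive approximation for set cover problems of this kind.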