BS - CX search result from the Biomedicina Slovenica database, full record

Author/Editor		Kastrin, Andrej; Hristovski, Dimitar
Title		A fast document classification algorithm for gene symbol disambiguation in the BITOLA literature-based discovery support system
Type		article
Source		In: Suermondt J, Evans SR, Ohno-Machado L, editors. Biomedical and health informatics: from foundations to applications to policy. AMIA 2008 annual symposium proceedings; 2008 Nov 8-12; Washington. Washington: American Medical Informatics Association,
Year of publication		2008
Extent		pp. 358-62
Language		eng
Abstract		Gene symbol disambiguation is an important problem for biomedical text mining systems. When detecting gene symbols in MEDLINE(R) citations, one of the biggest challenges is that many gene symbols also denote other, more general biomedical concepts (e.g. CT, MR). Our approach to this problem is first to classify the citations into genetic and non-genetic domains and then to detect gene symbols only in the genetic domain. We used ontological information provided by Medical Subject Headings (MeSH(R)) for this classification task. The proposed algorithm is fast and is able to process the full MEDLINE distribution in a few hours. It achieves a predictive accuracy of 0.91. The algorithm is currently implemented in the BITOLA literature-based discovery support system (http://www.mf.uni-lj.si/bitola/).
Descriptors		GENES MEDLINE SUBJECT HEADINGS VOCABULARY, CONTROLLED NOMENCLATURE ALGORITHMS