Author/Editor     Dimec, Jure
Title     Združevanje informacij z analizo povedne moči različnih vrst slovenskih medicinskih besedil in možnosti njihovega iskanja z ne-Boolovimi metodami
Type     monograph
Place     Ljubljana
Publisher     Medicinska fakulteta
Publication year     1995
Volume     108 pp.
Language     slo
Abstract     Unstructured texts in the Slovene and English natural languages are an important type of data for any medical professional or researcher, but no information retrieval (IR) tool exists for their effective organization in databases. In this doctoral dissertation we tried to accomplish some research tasks leading to the development of such a tool and to lay its programming foundations. The main research topics of the dissertation are: a) the incorporation of Slovene and English unstructured medical texts in common databases by automatic selection and stemming of keywords, and b) text retrieval with natural language search requests and the ranking of search results according to their similarity to the search requests. The methods used for the automatic keyword selection and stemming of Slovene texts were based on previous research done in our research community (5-9) and were supplemented in this work. Analogous methods for the English texts were selected according to results published in the literature (43, 22, 23). Different variants of vector space and probabilistic methods, including the automatic reformulation of search requests with word stems from the texts (the relevance feedback method), were used to calculate the similarity between search requests and texts. Some modifications of these basic methods were introduced, among them the consideration of the relative importance of search request words in the probabilistic method, and the so-called "fast search", a simplified vector space method which does not use the document frequencies of word stems. The collection of medical texts was compiled from 385 Slovene abstracts and their English translations originating from the journals Medicinski razgledi and Zdravniški vestnik. The results of the search methods were evaluated using a collection of 50 search requests in natural language (Slovene and English) and a list of the texts relevant to each search request.(trunc.)
Descriptors     INFORMATION STORAGE AND RETRIEVAL
LINGUISTICS
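
Note on the abstract: the contrast it draws between the vector space method and the "fast search" variant (no document frequencies of word stems) can be illustrated with a minimal sketch. This is not the dissertation's implementation. The function name `rank`, the whitespace tokenizer (standing in for real stemming), and the smoothed IDF formula are all assumptions made for illustration only.

```python
import math
from collections import Counter


def rank(query, docs, use_idf=True):
    """Rank docs by cosine similarity to the query (hypothetical helper).

    use_idf=False drops document frequencies, which is the simplification
    the abstract attributes to "fast search"; the exact weighting used in
    the dissertation is not given in the abstract.
    """
    # Assumption: lowercase whitespace tokens stand in for word stems.
    doc_tokens = [d.lower().split() for d in docs]
    n = len(docs)

    # Document frequency: number of texts containing each term.
    df = Counter()
    for toks in doc_tokens:
        df.update(set(toks))

    def weight(term):
        if not use_idf:
            return 1.0
        # Assumed smoothed inverse document frequency.
        return math.log((n + 1) / (df[term] + 1)) + 1.0

    def vector(tokens):
        return {t: c * weight(t) for t, c in Counter(tokens).items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = vector(query.lower().split())
    scores = [(cosine(q, vector(t)), i) for i, t in enumerate(doc_tokens)]
    # Highest similarity first; ties broken by document index.
    return sorted(scores, key=lambda s: (-s[0], s[1]))
```

Calling `rank(request, texts, use_idf=False)` returns `(score, index)` pairs computed from term frequencies alone, while `use_idf=True` additionally down-weights terms that occur in many texts.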