Author/Editor     Dimec, Jure
Title     Medjezično iskanje dokumentov [Cross-language document retrieval]
Type     article
Source     Knjižnica
Vol. and No.     Vol. 46, No. 1-2
Publication year     2002
Volume     pp. 77-110
Language     slo
Abstract     The article reviews the motivation behind the development of cross-language information retrieval (CLIR), a relatively new area of information retrieval in multilingual textual databases, and defines its objectives and its position among the research disciplines dealing with various aspects of processing electronic texts. A short historical overview is followed by a description of the most important methodologies (document translation and query translation) and the language resources used with them. Among the resources, attention is focused on bilingual and multilingual ontologies (thesauri, transfer lexicons and similarity thesauri) and on corpora, their construction and their use in CLIR experiments. The article primarily aims to illustrate the variety of methodological approaches; the functioning of particular systems is less prominent. The state of CLIR in Slovenia and the existence of language resources suitable for processing Slovenian texts in CLIR systems are not covered, since this topic calls for a separate review.
Summary     The article argues for the need to develop cross-language retrieval (CLIR), a relatively new area of information storage and retrieval in multilingual text collections, and defines its goals and its place among the research fields that deal with various aspects of handling texts in electronic form. A brief historical overview is followed by a description of the most important methodological approaches in CLIR (document translation, query translation) and of the language resources used with them. Among the resources, most attention is given to bilingual and multilingual ontologies (thesauri, dictionaries, translation lexicons and collocation thesauri) and to corpora, their construction and their use in CLIR experiments. The article tries mainly to illustrate the diversity of the field's methodology and less the operation of particular systems. The state of CLIR in Slovenia and the existence of language resources suitable for including Slovenian texts in cross-language systems are not addressed, because this topic requires a separate review.
Descriptors     INFORMATION STORAGE AND RETRIEVAL
MULTILINGUALISM
DATABASES, BIBLIOGRAPHIC
UNIFIED MEDICAL LANGUAGE SYSTEM
SUBJECT HEADINGS
DICTIONARIES, POLYGLOT
VOCABULARY, CONTROLLED
TRANSLATIONS