Indexing of textual databases based on lexical resources: A case study for Serbian
Object
- Type
- Chapter in a monograph
- Version of the work
- published version
- Language
- English
- Creator
- Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović
- Source
- Semantic Keyword-based Search on Structured Data Sources : First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers
- Editor
- Cardoso Jorge, Guerra Francesco, Houben Geert-Jan, Miguel Pinto Alexandre, Velegrakis Yannis
- Publisher
- Springer
- Publication date
- 2015
- Start page
- 161
- End page
- 181
- doi
- 10.1007/978-3-319-27932-9_15
- isbn
- 978-3-319-27932-9
- Broader category of the work
- M10
- Narrower category of the work
- M13
- Rights
- Open access
- License
- Creative Commons – Attribution-Share Alike 4.0 International
- Format
- issn
- 0302-9743
- Abstract
- In this paper we describe an approach to improving information retrieval results for large textual databases by pre-indexing documents using bag-of-words and Named Entity Recognition. The approach was applied to a database of geological projects financed by the Republic of Serbia over the last half century. Each document within this database is described by metadata consisting of several fields, such as title, domain, keywords, abstract, geographical location and the like. A bag of words was produced from these metadata using morphological dictionaries and transducers, and named entities within the metadata were recognized using a rule-based system. Both were then used for indexing documents, and ranking was based on the tf-idf measure. Ranked retrieval results obtained with pre-indexing are compared, using Precision-Recall curves, to results obtained by information retrieval without pre-indexing, showing a significant improvement in terms of the Mean Average Precision (MAP) measure.
- Media
- IKC-2015
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Indexing of textual databases based on lexical resources: A case study for Serbian" in Semantic Keyword-based Search on Structured Data Sources : First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers, Springer (2015). https://doi.org/10.1007/978-3-319-27932-9_15
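
The abstract mentions ranking by the tf-idf measure and evaluation by Mean Average Precision. The following is a minimal sketch of those two ideas only, not the authors' implementation. It assumes each document has already been reduced to a bag of index terms (the paper does this with morphological dictionaries, transducers and a rule-based named entity recognizer, none of which is reproduced here), and it assumes one common tf-idf variant, raw term frequency times log(N/df), since the abstract does not state the exact weighting. The function names and the sample data are hypothetical.

```python
import math
from collections import Counter


def tfidf_rank(query_terms, docs):
    """Rank documents (lists of index terms) by the summed tf-idf of the query terms.

    Assumed weighting: raw term frequency * log(N / document frequency).
    Returns document positions ordered from highest to lowest score.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    scored = []
    for doc_id, doc in enumerate(docs):
        tf = Counter(doc)
        score = sum(
            tf[t] * math.log(n_docs / df[t]) for t in query_terms if t in tf
        )
        scored.append((doc_id, score))
    # Stable sort: documents with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored]


def average_precision(ranked_ids, relevant_ids):
    """Average of precision@k taken at each rank k where a relevant document appears."""
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0


def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical bags of words standing in for pre-indexed project metadata.
docs = [
    ["geological", "survey", "copper", "Bor"],
    ["hydrogeological", "survey", "water"],
    ["copper", "ore", "copper", "deposit"],
]
ranking = tfidf_rank(["copper"], docs)
print(ranking, average_precision(ranking, {0, 2}))
```

Comparing two indexing strategies in this setting would amount to computing mean_average_precision once per strategy over the same set of queries and relevance judgments.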