Search
465 items
-
WS4LR - a Workstation for Lexical Resources
... Serbian where two alphabets, Cyrillic and Latin, are used, and lexical and textual resources must exist for both. To that end the HLT group produces resources for Serbian in a special encoding that uses the ASCII character set and that can be unambiguously transformed into Serbian Latin or ...
... workstation for lexical resources, a software tool developed within the Human Language Technology Group at the Faculty of Mathematics, University of Belgrade. The tool is aimed at manipulating heterogeneous lexical resources, and the need for such a tool came from the large volume of resources the Group ...
... WS4LR - a Workstation for Lexical Resources | Cvetana Krstev, Ranka Stanković, Duško Vitas, Ivan Obradović | Digital Repository of the Faculty of Mining and Geology, University of Belgrade [ДР РГФ] ...
Cvetana Krstev, Ranka Stanković, Duško Vitas, Ivan Obradović. "WS4LR - a Workstation for Lexical Resources" in Proceedings of the Fifth International Conference on Language Resources and Evaluation, Genoa, Italy, May 2006, ELRA - European Language Resources Association (2006)
-
Vebran Web Services for Corpus Query Expansion
Ranka Stanković, Miloš Utvić (2020)
This paper discusses the development of the Vebran web services and their application to improving corpus search. The Vebran web services are used to consult external lexical resources for Serbian (mainly electronic morphological dictionaries and the Serbian Wordnet) and to expand user queries in order to retrieve more relevant results from Serbian corpora.
... are used to consult external lexical resources for Serbian (mainly electronic morphological dictionaries and Serbian Wordnet) and expand user queries to retrieve more relevant results from Serbian corpora. KEYWORDS: corpus search, web service, Serbian lexical resources, query expansion. PAPER SUBMITTED: ...
... to enable corpus query expansion. 3 Lexical resources In order to improve the current corpus search capabilities based on linguistic annotation, it is necessary to consult external lexical resources. The following lexical resources have been developed for Serbian by the HLT Group at the University ...
... support query expansion based on lexical resources. Infotheca Vol. 19, No. 2, December 2019 99 Stanković R. and Utvić M., “Vebran Web Service . . . ”, pp. 99–118 Sections 2 and 3 describe language resources for Serbian, corpora that we can search and lexical resources that Natural Language Processing ...
Ranka Stanković, Miloš Utvić. "Vebran Web Services for Corpus Query Expansion" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.5
-
The Dictionary of the Serbian Academy: from the Text to the Lexical Database
In this paper we discuss the project of digitization of the Dictionary of the Serbo-Croatian Standard and Vernacular Language. Scanning and character recognition were a particular challenge, since various non-standard character encodings were used in the course of the almost 60-year-long production of the dictionary. The first aim of the project was to formalize the micro-structure of the dictionary articles in order to parse the digitized text and transform it into structured data stored in a relational lexical database. This approach ...
... hy, lexical database, language resources, dictionary, Serbian language 1 Introduction The first volume of the Dictionary of the Serbo-Croatian Standard and Vernacular Language (referred to as the Dictionary of Serbian Academy or DSA), prepared and compiled by the Institute for the Serbian Language ...
... Lexicography in Global Contexts. The Dictionary of the Serbian Academy: from the Text to the Lexical Database. Ranka Stanković (1), Rada Stijović (2), Duško Vitas (1), Cvetana Krstev (1), Olga Sabo (2). (1) University of Belgrade, (2) Institute for Serbian Language, Serbian Academy of Sciences and Arts. E-mail: ranka.stankovic@rgf ...
Ranka Stanković, Rada Stijović, Duško Vitas, Cvetana Krstev, Olga Sabo. "The Dictionary of the Serbian Academy: from the Text to the Lexical Database" in Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts, Ljubljana : Ljubljana University Press, Faculty of Arts (2018)
-
Речници у дигиталном добу - информатичка подршка за српски језик [Dictionaries in the Digital Age: IT Support for the Serbian Language]
Биљана Рујевић (2022)
Morphological dictionaries of Serbian are an electronic language resource with a significant history of development and use in natural language processing. Because they were stored as files whose number kept growing, which made the dictionaries harder to manage, a need arose to store the dictionary information in a lexicographic database. To allow several users to work on dictionary development simultaneously, a web application based on the lexicographic database was needed. In order to examine ...
Биљана Рујевић. Речници у дигиталном добу - информатичка подршка за српски језик, Београд : [Б. Рујевић], 2022
-
The Many Faces of SrpKor
The acronym SrpKor denotes a family of electronic corpora of contemporary Serbian whose construction began in the late 1970s and which became more widely visible to the interested research community when its first version was published on the web in 2002. Over this long period, especially before useful textual resources appeared on the web, corpus development consisted of collecting and processing material as well as developing corpus processing methods. An electronic corpus is not merely a collection of texts in digital form (as is stated, for example, ...
Duško Vitas, Ranka Stanković, Cvetana Krstev. "The Many Faces of SrpKor" in South Slavic Languages in the Digital Environment JuDig Book of Abstracts, University of Belgrade - Faculty of Philology, Serbia, November 21-23, 2024, University of Belgrade - Faculty of Philology (2024)
-
A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian
Abusive speech on social media, including profanity, derogatory speech and hate speech, has reached pandemic proportions. A system able to detect such texts could help make the internet and social media a better, more respectful virtual space. Research and commercial applications in this area have so far focused mainly on English. This paper presents work on building AbCoSER, the first corpus of abusive speech in Serbian. The corpus consists of 6,436 manually annotated ...
... usage of a hybrid approach that combines machine learning and lexical resources. Finally, a user-friendly interface that will enable the use of these resources on the Web is under development. As for the development of the lexical resources, we plan to prepare an ontology for the classification of abusive ...
... the Linked (Open) Data (LOD) paradigm that is used for publishing lexical resources by using URIs to unambiguously identify lexical entries, their components and their relations in the web of data. Moreover, it is used to make lexical data sets accessible via http(s), to publish them in accordance with ...
... Thierry Declerck, Asunción Gómez-Pérez, Jorge Gracia, Laura Hollink, Elena Montiel-Ponsoda, Dennis Spohr, et al. Interchanging lexical resources on the Semantic Web. Language Resources and Evaluation, 46(4):701–719, 2012. doi:10.1007/s10579-012-9182-3. 25 Chikashi Nobata, Joel Tetreault, Achint Thomas, Yashar ...
Danka Jokić, Ranka Stanković, Cvetana Krstev, Branislava Šandrih. "A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian" in 3rd Conference on Language, Data and Knowledge (LDK 2021), MDPI AG (2021). https://doi.org/10.4230/OASIcs.LDK.2021.13
-
Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian
The training of new tagger models for Serbian is primarily motivated by the enhancement of the existing tagset with the grammatical category of gender. The harmonization of resources that were manually annotated within different projects over a long period of time was an important task, enabled by the development of tools that support partial automation. The supporting tools take into account different taggers and tagsets. This paper focuses on TreeTagger and spaCy taggers, and the annotation schema alignment ...
... Linguistics: Human Language Technologies, pages 271–281. Constant, M., Krstev, C., and Vitas, D. (2018). Lexical analysis of Serbian with conditional random fields and large-coverage finite-state resources. In Zygmunt Vetulani, et al., editors, Human Language Technology. Challenges for Computer Science ...
... Stanković, Miloš Utvić, 2019). 2.1. Serbian morphological dictionaries Serbian morphological dictionaries represent a rich lexical resource, which can be used in various NLP tasks (Krstev, 2008). It is being continually developed and maintained in the lexical database LeXimirka (Stanković et al ...
... version of TreeTagger for Serbian (Utvić, 2011), and discussed in Section 4. The paper ends with concluding remarks and an outline of future work in Section 5. 2. Resources The main resources used for the production of the new tagger model for Serbian are: (a) Serbian morphological dictionaries ...
Ranka Stanković, Branislava Šandrih, Cvetana Krstev, Miloš Utvić, Mihailo Škorić. "Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian" in Proceedings of the 12th Language Resources and Evaluation Conference, May 2020, Marseille, France, European Language Resources Association (2020)
-
Српски језик у дигиталном добу -- The Serbian Language in the Digital Age
Duško Vitas, Ljubomir Popović, Cvetana Krstev, Ivan Obradović, Gordana Pavlović-Lažetić, Mladen Stanojević (2012)
... existing lexical resources (e.g., WordNet) and grammars. Resources: Quality and size of existing text corpora, speech corpora and parallel corpora, quality and coverage of existing lexical resources and grammars. The relevant tables show that the tools and resources available for Serbian are mostly ...
... 1 0 1 0 1 1
Language Resources (Resources, Data and Knowledge Bases)
Text corpora        0,5  1  0,5  1  1  1  0,5
Speech corpora      1    2  4    4  3  3  3
Parallel corpora    3    3  3    2  2  2  3
Lexical resources   1    2  2    2  2  2  2,5
Grammars            1    1  0    1  0  1  1
11: State of language technology support for Serbian
Software aimed at enhancing ...
... deeper linguistic knowledge to facilitate semantical analysis. Experiments using lexical resources such as machine-readable thesauri or ontological language resources (e.g., WordNet for English or SrpNet for Serbian) have demonstrated improvements in finding pages using synonyms of the original ...
Duško Vitas, Ljubomir Popović, Cvetana Krstev, Ivan Obradović, Gordana Pavlović-Lažetić, Mladen Stanojević. "Српски језик у дигиталном добу -- The Serbian Language in the Digital Age" in META-NET White Paper Series, G. Rehm, H. Uszkoreit (eds.), Springer (2012)
-
FrameNet Lexical Database: Presenting a Few Frames Within the Risk Domain
The paper gives a brief overview of the theory of frame semantics, on which the FrameNet lexical database is based. The concept of this network is presented, along with its possible applications. The lexical analysis applied in the FrameNet project is also presented, and the differences between frame-based and word-based analysis are pointed out. Several related frames evoked by words from the risk domain are then shown. The paper also presents the NLTK platform, through which one can use ...
... by members of the JeRTeh Society for Language Resources and Technologies. A TreeTagger model for Serbian was trained for tagging (Krstev and Vitas 2005; Utvic 2011), (Stanković et al. 2020, 3957) using a manually annotated corpus of Serbian morphological ...
... number of corpora and lexical resources. NLTK offers different types of text processing, amongst which are: classification, tokenization, stemming, tagging, parsing and semantic reasoning. The NLTK system uses wrappers for other Python natural language processing and lexical resource libraries. One ...
... FrameNet and semantic role labeling programs for Croatian, Slovenian and Serbian. Infotheca Vol. 21, No. 1, September 2021 13 Marković A. et al., FrameNet Lexical Database. . . , pp. 7–33 FrameNet was conceived as a lexical database of English, which incorporates the databases subsequently developed ...
Aleksandra Marković, Ranka Stanković, Natalija Tomić, Olivera Kitanović. "FrameNet Lexical Database: Presenting a Few Frames Within the Risk Domain" in Infotheca, Faculty of Philology, University of Belgrade (2021). https://doi.org/10.18485/infotheca.2021.21.1.1
-
Contrastive Analysis of Syntax Patterns in Comparable Football Corpora in Spanish and Serbian Languages
Jelena Lazarević, Olivera Kitanović (2024)
The aim of the paper is to investigate collocability as the way in which lexical units combine with words from different categories to form larger units. The study of the semantic and syntactic principles of these combinations in the Spanish and Serbian language of football was carried out on the comparable football corpora SrFudKo and EsFudko, developed as part of Jelena Lazarević's doctoral dissertation entitled Jezičke odlike diskursa novih medija o fudbalu: kontrastivna analiza na korpusu srpskog i španskog jezika [Linguistic features of new-media discourse on football: a contrastive analysis on a corpus of Serbian and Spanish]. The SrFudKo football corpus, created from football texts from five Serbian web portals: ...
Jelena Lazarević, Olivera Kitanović. "Contrastive Analysis of Syntax Patterns in Comparable Football Corpora in Spanish and Serbian Languages" in South Slavic Languages in the Digital Environment JuDig Book of Abstracts, University of Belgrade - Faculty of Philology, Serbia, November 21-23, 2024, University of Belgrade - Faculty of Philology (2024)
-
Automatic construction of a morphological dictionary of multi-word units
The development of a comprehensive morphological dictionary of multi-word units for Serbian is a very demanding task, due to the complexity of Serbian morphology. Manual production of such a dictionary proved to be extremely time-consuming. In this paper we present a procedure that automatically produces dictionary lemmas for a given list of multi-word units. To accomplish this task the procedure relies on data in e-dictionaries of Serbian simple words, which are already well developed. We also offer an evaluation ...
electronic dictionary, Serbian, morphology, inflection, multiword units, noun phrases, query expansion
... (everything between the parentheses) in most cases already exists in dictionaries of simple words (DELA) we decided to develop a module for our lexical resources management tool LeXimir, an enhancement of its predecessor WS4LR [6], that would help in obtaining this information. However, due to homography ...
... Query Expansion) was developed on the basis of LeXimir, and it enables expansion of queries submitted to the Google search engine [6]. Integrated lexical resources enable modifications of user queries for both monolingual and multi-lingual search. The main feature of WS4QE is that it enables inflection of ...
... Finite-State Tool for Multi-Word Units. In: CIAA. (2009) 237–240 6. Krstev, C., Stanković, R., Vitas, D., Obradović, I.: The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines. In: 6th LREC, Marrakech, Morocco (2008) 7. Jacquemin, C.: Spotting and Discovering ...
Cvetana Krstev, Ranka Stanković, Ivan Obradović, Duško Vitas, Miloš Utvić. "Automatic construction of a morphological dictionary of multi-word units" in Lecture Notes in Computer Science 6233, Advances in Natural Language Processing, Proceedings of the 7th International Conference on NLP, IceTAL 2010, Reykjavik, Iceland, August 2010, Springer (2010): 226-237. https://doi.org/10.1007/978-3-642-14770-8_26
-
Electronic Dictionaries - from File System to lemon Based Lexical Database
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ...
... Krstev, C. and Vitas, D. (2007). Extending the Serbian E-dictionary by using lexical transducers. In Formaliser les langues avec l’ordinateur : De INTEX à Nooj, pages 147–168. Krstev, C., Vitas, D., and Erjavec, T. (2004). MULTEXT-East resources for Serbian. In Zbornik 7. mednarodne multikonference ...
... (2006). WS4LR: A Workstation for Lexical Resources. In Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2006, pages 1692–1697. Krstev, C., Stanković, R., and Vitas, D. (2010). A Description of Morphological Features of Serbian: a Revision using Feature System ...
... relations between lexical entries, nor cross-linking with other lexical models, such as Serbian WordNet, another important lexical resource for Serbian (Koeva et al., 2008). This was the main motivation for transforming SMD dictionaries from the existing file system to a lemon based lexical database. ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Mihailo Škorić. "Electronic Dictionaries - from File System to lemon Based Lexical Database" in Proceedings of the 11th International Conference on Language Resources and Evaluation - W23 6th Workshop on Linked Data in Linguistics : Towards Linguistic Data Science (LDL-2018), LREC 2018, Miyazaki, Japan, May 7-12, 2018, European Language Resources Association (ELRA) (2018)
-
An Approach to Development of Bilingual Lexical Resources
... aligned parallel texts [Obradović et al., 2008]. As for available lexical resources, we had at our disposal Serbian morphological e-dictionaries [Krstev, 2008], Serbian and English wordnets (SrpWN and EWN), and a bilingual Serbian-English Dictionary of Library and Information Science technology ...
... concept in both available lexical resources. Terms electronic learning and e-learning and their Serbian translational equivalents elektronsko učenje and e-učenje do not exist in either of the resources. Hence the English synset {electronic learning, e-learning} and its Serbian counterpart {elektronsko ...
... bilingual e-journals in the form of TMX documents, is used for development of a new bilingual lexical resource. The approach relies on already available resources, Serbian morphological e-dictionaries, Serbian and English wordnets connected via the interlingual index, and a bilingual Dictionary of ...
Stanković Ranka, Obradović Ivan, Trtovac Aleksandra. "An Approach to Development of Bilingual Lexical Resources" in Proceedings of the Fifth Balkan Conference in Informatics BCI 2012, Workshop on Computational Linguistics and Natural Language Processing of Balkan Languages – CLoBL 2012, September 2012, Novi Sad : BCI (2012)
-
GIS Application Improvement with Multilingual Lexical and Terminological Resources
... vocabulary and other lexical and terminological resources used. Two basic results are outlined: multilingual map annotation and improvement of queries for the GeolISS geodatabase. Multilingual labelling and annotation of maps for their graphic display and printing have been tested with Serbian, which describes ...
... al vocabulary and other lexical and terminological resources used. It also offers examples for both multilingual map annotation and query expansion. Multilingual labelling and annotation of maps for their graphic display and printing have been tested with Serbian, which describes regional ...
... query can be substantially improved by using lexical resources, morphological dictionaries and transducers in the first place. For illustration purposes, the query for retrieval of geological units containing ‘limestone’ (‘krečnjak’ in Serbian) in their description field was submitted twice: ...
Ranka Stanković, Ivan Obradović, Olivera Kitanović. "GIS Application Improvement with Multilingual Lexical and Terminological Resources" in Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010, Valletta, Malta, May 2010, Valletta, Malta : European Language Resources Association (2010)
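The morphological expansion described in the snippet above can be illustrated with a minimal Python sketch. It is not the GeolISS implementation: the inflection table `TOY_FORMS` and the helper names `expand_morphologically` and `matches` are hypothetical, and the handful of forms listed for 'krečnjak' is hand-written for illustration rather than taken from the Serbian morphological dictionaries.

```python
# Toy stand-in for a morphological dictionary: lemma -> inflected forms.
# Hand-written, illustrative entries only (an assumption, not SMD data).
TOY_FORMS = {
    "krečnjak": ["krečnjak", "krečnjaka", "krečnjaku", "krečnjakom",
                 "krečnjaci", "krečnjake", "krečnjacima"],
}


def expand_morphologically(term, forms=TOY_FORMS):
    """Return the term followed by its known inflected forms, no duplicates."""
    expanded = [term]
    for form in forms.get(term, []):
        if form not in expanded:
            expanded.append(form)
    return expanded


def matches(description, term):
    """True if any inflected form of `term` occurs as a word in `description`."""
    words = {w.strip(".,;:()").lower() for w in description.split()}
    return any(form in words for form in expand_morphologically(term))
```

With this sketch, a description containing only the plural 'krečnjaci' is still found by a query for the lemma 'krečnjak', which a plain exact-match search would miss.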
-
Terminological and lexical resources used to provide open multilingual educational resources
Open educational resources (OER) within the BAEKTEL (Blending Academic and Entrepreneurial Knowledge in Technology enhanced learning) network will be available in different languages, mostly in the languages of the Western Balkans, Russian and English. The University of Belgrade (UB) hosts a central repository based on: the BAEKTEL Metadata Portal (BMP), a terminological web application for management, browsing and searching of terminological resources, web services for linguistic support (query expansion, information retrieval, OER indexing, etc.), annotation of selected resources and an OER repository on local edX ...
... at the same time language resources: grammars, lexical and textual resources (Image 1). 4. LEXICAL RESOURCES Morphological dictionaries are meant to be used by computers in the process of query expansion. Their usage is necessary because of the rich inflection of the Serbian language and other similar ...
... a brief history and current state of the art of terminological resources are presented, followed by an overview of BAEKTEL (Blending Academic and Entrepreneurial Knowledge in Technology enhanced learning) resources, lexical resources, the process of terminology extraction and a presentation of TERMI ...
Biljana Lazić, Danica Seničić, Aleksandra Tomašević, Bojan Zlatić. "Terminological and lexical resources used to provide open multilingual educational resources" in The Seventh International Conference on eLearning (eLearning-2016), 29-30 September 2016, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2016)
-
Bilingual lexical extraction based on word alignment for improving corpus search
Jelena Andonovski, Branislava Šandrih, Olivera Kitanović. "Bilingual lexical extraction based on word alignment for improving corpus search" in The Electronic Library, Emerald (2019). https://doi.org/10.1108/EL-03-2019-0056
-
Combining Heterogeneous Lexical Resources
... the development of various lexical resources. Among them the two most important ones are: the system of morphological dictionaries of Serbian (SMD) in Intex format and the Serbian wordnet (SWN) developed in the scope of the Balkanet project. Although these two resources represent dictionaries of a ...
... as the Serbian MD of compounds grows. Finally, the developed tool does not perform any of the tasks automatically, although that solution was also under consideration. Since the Serbian traditional lexical resources cannot be directly used for the production of electronic resources, and almost ...
... in electronic form, the Serbian resources presented in this paper have been manually produced, checked and double checked. Our standpoint is that only when reliable lexical resources in electronic form are fully developed it will be possible to produce new resources automatically. Bibliography ...
Cvetana Krstev, Duško Vitas, Ranka Stanković, Ivan Obradović, Gordana Pavlović-Lažetić. "Combining Heterogeneous Lexical Resources" in Proceedings of the Fourth International Conference on Language Resources and Evaluation, Lisbon, Portugal, May 2004, vol. 4, ELRA - European Language Resources Association (2004)
-
Softverski alati za korišćenje resursa za srpski jezik [Software Tools for Serbian Lexical Resources]
Ivan Obradović, Ranka Stanković (2008)
... them is the system of morphological dictionaries of Serbian (SMD). Another highly important and developed resource is the Serbian wordnet (SWN), a lexical database representing the semantic network of words in Serbian. Within this group of resources, the multilingual ontological dictionary of proper ...
... for example, several different types of e-dictionaries, along with other lexical and textual resources, are being developed within the Human Language Technology Group, which SOFTWARE TOOLS FOR SERBIAN LEXICAL RESOURCES Ivan Obradović, Ranka Stanković, Faculty of Mining and Geology Faculty ...
Ivan Obradović, Ranka Stanković. "Softverski alati za korišćenje resursa za srpski jezik" in INFOteka: časopis za informatiku i bibliotekarstvo, Belgrade, Serbia : Zajednica biblioteka univerziteta u Srbiji (2008)
-
The Nooj System as Module within an Integrated Language Processing Environment
... Electronic Lexical Database”, The MIT Press (1998) ISO LMF, Language resource management - Lexical markup framework (LMF), (2006), ISO/TC 37/SC 4 N130 Rev.9, ISO CD 24613:2006. Krstev C., Pavlović-Lažetić G., Vitas D., Obradović I.: Using Textual and Lexical Resources in Developing Serbian Wordnet ...
... G.: Processing Serbian Written Texts: An Overview of Resources and Basic Tools. Workshop on Balkan Language Resources and Tools, Thessaloniki, Greece, eds. S. Piperidis and V. Karkaletsis, pp. 97-104, 2003. Vossen, P. (ed.): EuroWordNet: A Multilingual Database with Lexical Semantic Networks ...
... and use of lexical resources, manage the exchange of data between and among these resources, and to enable the merging of large numbers of different individual electronic resources to form large global electronic resources, so conversion of NooJ resources to LMF format (Lexical markup framework) ...
Ranka Stanković, Duško Vitas, Cvetana Krstev. "The Nooj System as Module within an Integrated Language Processing Environment" in Proceedings of the 2007 International Nooj Conference, Cambridge Scholars Publishing (2008)
-
The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines
In this paper we present how resources and tools developed within the Human Language Technology Group at the University of Belgrade can be used for tuning queries before submitting them to a web search engine. We argue that the selection of words chosen for a query, which are of paramount importance for the quality of results obtained by the query, can be substantially improved by using various lexical resources, such as morphological dictionaries and wordnets. These dictionaries enable semantic ...
LR web services, MultiWord Expressions & Collocations, Information Extraction, Information Retrieval
... substantially improved by using various lexical resources, such as morphological dictionaries and wordnets. These dictionaries enable semantic and morphological expansion of the query, the latter being very important in highly inflective languages, such as Serbian. Wordnets can also be used for adding ...
... and Serbian. In the case of garlic the appropriate query should be composed of the keywords beli luk, češnjak, Allium sativum, and garlic. It is not to be expected that a common user would normally possess the knowledge necessary to expand a query in this way. 3. The lexical resources used ...
... of query expansions, depending on the resources and type of expansion. Web service WS4QE uses classes from .NET dll components developed within WS4LR (WorkStation for Lexical Resources) (Krstev et al., 2006), which enable the usage of lexical resources for query expansion. The web service ...
Krstev Cvetana, Stanković Ranka, Vitas Duško, Obradović Ivan. "The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines" in LREC 2008: Conference on Language Resources and Evaluation, Marrakesh, Morocco, May 2008, European Language Resources Association (ELRA) (2008)
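The semantic expansion in the garlic example above can be sketched in a few lines of Python. This is only an illustration, not the WS4QE service: the function name `build_or_query` is hypothetical, and the `OR` operator with double-quoted phrases is an assumed query syntax that a given search engine may or may not accept.

```python
def build_or_query(keywords):
    """Join keywords with OR, quoting multi-word terms as exact phrases.

    The OR/quote syntax is an assumption about the target search engine.
    """
    parts = []
    for kw in keywords:
        kw = kw.strip()
        if not kw:
            continue  # skip empty entries
        parts.append(f'"{kw}"' if " " in kw else kw)
    return " OR ".join(parts)


# Keywords for 'garlic' as listed in the snippet above.
garlic_terms = ["beli luk", "češnjak", "Allium sativum", "garlic"]
print(build_or_query(garlic_terms))
# prints: "beli luk" OR češnjak OR "Allium sativum" OR garlic
```

In a real setting the keyword list would come from a wordnet synset or a terminological resource rather than being typed in by hand, which is the gap the snippet says a common user cannot be expected to fill.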