Search
465 items
-
Development of Open Educational Resources (OER) for Natural Language Processing
In this paper we present the development of an online course at the edX BAEKTEL platform named “Lexical Recognition in the Natural Language Processing (NLP)”. It is based on the course of the same name for PhD studies at the University of Belgrade, Faculty of Philology. There are not many courses in Computational Linguistics (CL) on OER platforms, and there is none in Serbian either for CL or NLP. We have developed this course in order to improve this ...... the necessary knowledge to use the existing resources for NLP for Serbian and to develop new ones. Keywords: E-Learning, Open Educational Resources, Computational Linguistics, Lexical Resources, edX 1. INTRODUCTION Open educational resources (OER) publicly available on the web are growing ...
... MWUs, particularly their inflection that has to consider complex rules for MWU inflection in Serbian. 10. The use of powerful morphological mode is presented that enables the use of lexical resources at sub-word level, as well as the use of information from e-dictionaries for output trans ...
... We hope that the developed OER for lexical recognition in NLP will be used in order to reduce the lack of similar courses. We hope that participants will easily acquire the necessary knowledge to use the existing resources for NLP for Serbian and that the number of resource users will ... Cvetana Krstev, Biljana Lazić, Ranka Stanković, Giovanni Schiuma, Miladin Kotorčević. "Development of Open Educational Resources (OER) for Natural Language Processing" in The Sixth International Conference on e-Learning (eLearning-2015), September 2015, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2015)
-
A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment
Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others (2020) Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at sense-level for various resources in 15 languages. Moreover, senses are annotated with possible semantic relationships such as broadness, narrowness, relatedness, and equivalence. In comparison to previous datasets for this task, this dataset covers a wide range of languages ...... tić, G., Vitas, D., and Obradović, I. (2004). Using textual and lexical resources in developing Serbian Wordnet. SCIENCE AND TECHNOLOGY, 7(1-2):147–161. Kwong, O. Y. (1998). Aligning WordNet with additional lexical resources. Usage of WordNet in Natural Language Processing Systems. Lenci ...
... notoriously requiring data such as neural networks. Our resources are publicly available at https://github.com/elexis-eu/MWSA. Keywords: lexical semantic resources, sense alignment, lexicography, language resource 1. Introduction Lexical semantic resources (LSRs) are knowledge repositories that provide ...
... Section 6. 2. Related work Aligning senses across lexical resources has been attempted in several lexicographical milieus over the recent years. Such resources mainly include open-source dictionaries, WordNet and collaboratively-curated resources, such as Wikipedia. The latter has been shown to be ... Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others. "A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment" in Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), Marseille, European Language Resources Association (ELRA) (2020)
-
A Mathematical Learning Environment Based on Serbian Language Resources
In recent years, in line with ever growing usage of Information technology, the learning environments are changing. The amount of available learning materials in various forms has increased. These new environments demand comprehensive learning systems, which enable management of the learning corpus with special attention paid to relevant lexical resources. In this paper we present the concept of a Mathematical Learning Environment in Serbian (MLES), which is based on a corpus of mathematical materials and various lexical resources, enabling ...... learning corpus with special attention paid to relevant lexical resources. In this paper we present the concept of a Mathematical Learning Environment in Serbian (MLES), which is based on a corpus of mathematical materials and various lexical resources, enabling semantic search of mathematical content ...
... type of support for Serbian is still not available. Existing Serbian lexical resources and tools enable efficient text search, including semantic and morphological expansion of user queries, the latter being very important in highly inflective languages, such as Serbian. Of special importance ...
... education in Serbian, is given. The salient feature of the system is strong lexical support. Within MLES various types of lexical resources are used as well as local grammars, with the aim to provide a comprehensive and searchable learning environment. Although the general lexica in Serbian is well ... Radojičić Marija, Obradović Ivan, Stanković Ranka, Utvić Miloš, Kaplar Sebastijan. "A Mathematical Learning Environment Based on Serbian Language Resources" in Proceedings of the 7th International Scientific Conference Technics and Informatics in Education, Faculty of Technical Sciences, Čačak (2018)
-
Bridging Computational Lexicography and Corpus Linguistics: A Query Extension for OntoLex-FrAC
OntoLex, the dominant community standard for machine-readable lexical resources in the context of RDF, Linked Data and Semantic Web technologies, is currently being extended with a dedicated module for Frequency, Attestation and Corpus-based Information (OntoLex-FrAC). We propose a new component for OntoLex-FrAC that addresses the incorporation of corpus queries for (a) linking dictionaries with corpus engines, (b) enabling RDF-based web services to dynamically exchange corpus queries and response data, and (c) using conventional query languages to formalize the internal structure of collocations, word sketches and ... Keywords: standardization, digital lexicography, OntoLex, corpus queries, linked data, Linguistic Linked Open Data. Christian Chiarcos, Ranka Stanković, Maxim Ionov, Gilles Sérasset. "Bridging Computational Lexicography and Corpus Linguistics: A Query Extension for OntoLex-FrAC" in Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Turin, 20-25 May 2024, LREC (2024)
-
E-Connecting Balkan Languages
In this paper we present a versatile language processing tool that can be successfully used for many Balkan languages. This tool relies for its work on several sophisticated textual and lexical resources that were developed for most of Balkan languages. These resources are based on several de facto standards in natural language processing.... have been mainly used for Serbian they are by no means language dependent as long as compatible lexical resources exist for any two languages. Nevertheless, a full potential of these tools was until now used only for Serbian, and in bilingual context, for Serbian and English. In this paper ...
... them. 2. Integrated Language Resources In order to prove the usability of WS4LR and WS4QE for languages other than Serbian and English we used various resources, both textual and lexical. In the following sections we will briefly present these resources, what methodological framework was ...
... both from Serbian, for which they were initially developed, and from English which seems to be in the background of many natural language processing tools. The main presupposition for the usage of these tools for other languages is the existence of textual and lexical resources developed in ... Cvetana Krstev, Ranka Stanković, Duško Vitas, Svetla Koeva. "E-Connecting Balkan Languages" in Proceedings of the Workshop on Multilingual resources, technologies and evaluation for Central and Eastern European Languages, 17 September 2009, eds. C. Vertan, S. Piperidis, E. Paskaleva and Milena Slavcheva, Borovets, Bulgaria : Association for Computational Linguistics Stroudsburg, PA, USA (2009)
-
From DELA Based Dictionary to Leximirka Lexical Database
Biljana Lazić, Mihailo Škorić (2020) In this paper, we will present an approach in transforming Serbian language Morphological dictionaries from a DELA text format to a lexical database dubbed Leximirka. Considering the benefits of storing data within a database when compared to storing them in textual documents, we will outline some of the functionality that the database has made possible. We will also show how hand-made rules that use category labels lexical entries are marked with can be used to link lexical entries. ...... ontolex that describes lexical entry (morphological, semantic and ontological description). 3 Transition to lexical database 3.1 Motivation Automatisation of the management of Serbian Morphological Dictionaries started with the implementation of the Workstation for Lexical Resources WS4LR (Krstev et al ...
... records to the WordNet for the Serbian language. It is also envisaged to prepare the data for display in the form of Linked Open Data on the web, which would enable connection with other lexical resources. Since the application is independent of the language for which it is used, it is expected that Leximirka ...
... 1997 Krstev, Cvetana. Processing of Serbian. Automata, Texts and Electronic Dictionaries. Faculty of Philology of the University of Belgrade, 2008 Krstev, Cvetana, Ranka Stanković, Duško Vitas and Ivan Obradović. “WS4LR - a Workstation for Lexical Resources”. In Proceedings of the Fifth International ... Biljana Lazić, Mihailo Škorić. "From DELA Based Dictionary to Leximirka Lexical Database" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.4
-
A bilingual digital library for academic and entrepreneurial knowledge management
A generic knowledge management process of organization, storage and retrieval of knowledge can suitably be fitted in a digital library. In the digital and knowledge age digital libraries can be used in knowledge management to handle intellectual assets and support knowledge creation. A multilingual digital library either stores content in more than one language or provides multilingual query access to monolingual content. In Serbia 18 of 308 scientific journals regularly published are bi-lingual, with papers simultaneously being in English ...... Bibliša: the system’s components 4.1 Lexical resources Lexical Resources are used to enhance and refine users’ queries. The query expansion is supported by e-dictionaries (Serbian morphological e-dictionaries), general purpose semantic networks (English and Serbian WordNets) and domain terminological ...
... collections with one English term, “e-learning”. Out of five lexical resources integrated in Bibliša, only RudOnto contains an adequate entry, but without any Serbian equivalent. We enhance the search with a direct translation in Serbian “e-učenje”. As a result, we obtain 84 concordance lines ...
... issued in Serbian or English, can be expanded to the other language, both morphologically and semantically. Thus it offers a novel access to digital content to its users. In addition to that, Bibliša presents an original approach to successful combining of several components: Lexical resources, Library ...Ranka Stanković, Cvetana Krstev, Biljana Lazić, Dalibor Vorkapić. "A bilingual digital library for academic and entrepreneurial knowledge management" in Proceeding of 10th International Forum on Knowledge Asset Dynamics — IFKAD 2015: Culture, Innovation and Entrepreneurship: connecting the knowledge dots, Bari, Italy, 10-12 June 2015, Bari : IFKAD (2015)
-
Keyword-Based Search on Bilingual Digital Libraries
This paper outlines the main features of Biblisha, a tool that offers various possibilities of enhancing queries submitted to large collections of aligned parallel text residing in a bilingual digital library. Biblisha supports keyword queries as an intuitive way of specifying information needs. The keyword queries, initiated in Serbian or English, can be expanded semantically, morphologically and into the other language, using different supporting monolingual and bilingual resources. Terminological and lexical resources are of various types, such as wordnets, electronic ... Ranka Stanković, Cvetana Krstev, Duško Vitas, Nikola Vulović, Olivera Kitanović. "Keyword-Based Search on Bilingual Digital Libraries" in Semantic Keyword-Based Search on Structured Data Sources - Second COST Action IC1302 International KEYSTONE Conference, IKC 2016, Springer (2017). https://doi.org/10.1007/978-3-319-53640-8_10
-
A Data Driven Approach for Raw Material Terminology
Olivera Kitanović, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić, Ivan Babić, Ljiljana Kolonja (2021) The research presented in this paper aims at creating a bilingual (sr-en), easily searchable, hypertext, born-digital, corpus-based terminological database of raw material terminology for dictionary production. The approach is based on linking dictionaries related to the raw material domain, both digitally born and printed, into a lexicon structure, aligning terminology from different dictionaries as much as possible. This paper presents the main features of this approach, data used for compilation of the terminological database, the procedure by which it has ... Keywords: raw materials, mining, terminology, dictionary, terminological application, mobile application, digitization, lexical data, corpora, linked open data ... Kitanović, O. Bilingual lexical extraction based on word alignment for improving corpus search. Electron. Libr. 2019, 37, 722–739. 28. Radojičić, M.; Obradović, I.; Stanković, R.; Utvić, M.; Kaplar, S. A Mathematical Learning Environment Based on Serbian Language Resources. In Proceedings of ...
... this goal is to adopt the Linked (Open) Data (LOD) paradigm for publishing lexical resources, that is, to use URIs for unambiguously identifying lexical entries, their components and their relations in the web of data—to make lexical datasets accessible via http(s), to publish them in accordance with W3 ...
... developing this system, a data driven approach is adopted, relying on available textual, lexical and terminological resources, both in printed and electronic form. Within the development of this system, printed resources, the paper dictionaries covering raw material terminology, were subjected to systematic ...Olivera Kitanović, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić, Ivan Babić, Ljiljana Kolonja. "A Data Driven Approach for Raw Material Terminology" in Applied Sciences, MDPI AG (2021). https://doi.org/10.3390/app11072892
-
OntoLex Publication Made Easy: A Dataset of Verbal Aspectual Pairs for Bosnian, Croatian and Serbian
This paper presents a new language resource for searching and exploring verbal aspectual pairs in BCS (Bosnian, Croatian and Serbian), created using the principles of Linguistic Linked Open Data (LLOD). Since there is no resource that helps learners of Bosnian, Croatian and Serbian as foreign languages to recognize the aspect of a verb or its pairs, we have created a new resource that provides users with information about the aspect, as well as a link to the verb's aspectual pairs. This resource also contains external links to monolingual dictionaries, Wordnet and BabelNet. ... Ranka Stanković, Maxim Ionov, Medina Bajtarević, Lorena Ninčević. "OntoLex Publication Made Easy: A Dataset of Verbal Aspectual Pairs for Bosnian, Croatian and Serbian" in Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024, Turin, 20-25 May 2024, ELRA and ICCL (2024)
-
On the compatibility of lexical resources for NooJ
Lexical resources for many languages are provided for the NooJ linguistic development environment. Meta-data descriptions of morphosyntactic and semantic properties of these languages and their resources are a mandatory part of each language module. In this paper we analyze how well the meta-data actually describe resources for a chosen subset of languages and to what extent they are compatible across languages to support multilingual processing. We show that there is room for improvement in both directions. ... On the compatibility of lexical resources for NooJ Ranka Stanković, Miloš Utvić, Duško Vitas, Cvetana Krstev, Ivan Obradović Digital Repository of the Faculty of Mining and Geology, University of Belgrade [ДР РГФ] On the compatibility of lexical resources for NooJ | Ranka Stanković, Miloš ...
... Proceedings of the 2011 International NooJ Conference 1 ON THE COMPATIBILITY OF LEXICAL RESOURCES FOR NOOJ RANKA STANKOVIĆ, MILOŠ UTVIĆ, DUŠKO VITAS, CVETANA KRSTEV AND IVAN OBRADOVIĆ Abstract Lexical resources for many languages are provided for the NooJ linguistic development environment ...
... for improvement in both directions. Introduction: Motivation, resources and task Lexical resources for NooJ are now available in a considerable number of different languages. The compatibility of these monolingual resources, namely the extent to which they mutually correspond is thus becoming ...Ranka Stanković, Miloš Utvić, Duško Vitas, Cvetana Krstev, Ivan Obradović. "On the compatibility of lexical resources for NooJ" in Automatic Processing of Various Levels of Linguistic Phenomena: Selected Papers from the 2011 International Nooj Conference, Cambridge Scholars Publishing (2012): 96-108
-
Managing mining project documentation using human language technology
Purpose: This paper aims to develop a system, which would enable efficient management and exploitation of documentation in electronic form, related to mining projects, with information retrieval and information extraction (IE) features, using various language resources and natural language processing. Design/methodology/approach: The system is designed to integrate textual, lexical, semantic and terminological resources, enabling advanced document search and extraction of information. These resources are integrated with a set of Web services and applications, for different user profiles and use-cases. Findings: The ... Keywords: Digital libraries, Information retrieval, Data mining, Human language technologies, Project documentation. Aleksandra Tomašević, Ranka Stanković, Miloš Utvić, Ivan Obradović, Božo Kolonja. "Managing mining project documentation using human language technology" in The Electronic Library (2018). https://doi.org/10.1108/EL-11-2017-0239
-
Multi-word Expressions for Abusive Speech Detection in Serbian
This paper presents research on refining and improving the Serbian version of the Hurtlex dictionary, a multilingual lexicon of offensive words. We pay special attention to adding multi-word expressions that can be considered offensive, since such lexical entries are very important for achieving good results in a plethora of abusive language detection tasks. Serbian morphological dictionaries are used as a basis for data cleaning and dictionary creation. A connection to other lexical and semantic resources in Serbian is outlined, and the building of a system for ...... plethora of abusive language detection tasks. We use Serbian morphological dictionaries as a basis for data cleaning and MWE dictionary creation. A connection to other lexical and semantic resources in Serbian is outlined and building of abusive language detection systems based on that connection is foreseen ...
... 26.7 48 41 2.3 Total 2518 1747 141 105 6.0 Table 1: Statistic of lexical cleaning. Bearing in mind that the initial version of HurLex for Serbian was mostly done automatically, without support of any tools and resources for Serbian language processing, such results were expected and certainly indicate ...
... 1621–1622. Biljana Lazić and Mihailo Škorić. From dela based dictionary to leximirka lexical database. Jelena Mitrović, Miljana Mladenović, and Cvetana Krstev. 2015. Adding mwes to serbian lexical resources using crowdsourcing. In poster presented at The 5th PARSEME general meeting. Iași, Romania ... Ranka Stanković, Jelena Mitrović, Danka Jokić, Cvetana Krstev. "Multi-word Expressions for Abusive Speech Detection in Serbian" in Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, Association for Computational Linguistics (2020)
-
Production of morphological dictionaries of multi-word units using a multipurpose tool
The development of a comprehensive morphological dictionary of multi-word units for Serbian is a very demanding task, due to the complexity of Serbian morphology. Manual production of such a dictionary proved to be extremely time-consuming. In this paper we present a procedure that automatically produces dictionary lemmas for a given list of multi-word units. To accomplish this task the procedure relies on data in e-dictionaries of Serbian simple words, which are already well developed. We also offer an evaluation ... Keywords: electronic dictionary, Serbian, morphology, inflection, multi-word units, noun phrases, query expansion ... and WNDictAuto.dll (Fig. 2). For communication with lexical resources LeXimir makes use of the NlpQuery.dll module. Modular organization of components provides two obvious benefits. In the first place, it enables the use of various resources in any part of the system, wherever they are needed. ...
... “The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines,” in 6th LREC, Marrakech, Morocco, 2008. [11] C. Krstev, R. Stanković, D. Vitas, and S. Koeva, “E-Connecting Balkan Languages,” in Proc. of the Workshop on Multilingual Resources, Technologies and Evaluation ...
... 2005, pp. 10–11. [4] C. Krstev, Processing of Serbian — Automata, Texts and Electronic Dictionaries. Belgrade: Faculty of Philology, University of Belgrade, 2008. [5] A. Savary, “Computational Inflection of Multi-Word Units — A Contrastive Study of Lexical Approaches,” Linguistic Issues in Language ... Ranka Stanković, Ivan Obradović, Cvetana Krstev, Duško Vitas. "Production of morphological dictionaries of multi-word units using a multipurpose tool" in Proceedings of the Computational Linguistics-Applications Conference, October 2011, Jachranka, Poland, Jachranka, Poland : PTI - Polish Information Processing Society (2011)
-
Resource-based WordNet Augmentation and Enrichment
In this paper we present an approach to support production of synsets for Serbian WordNet (SerWN) by adjusting Princeton WordNet (PWN) synsets using several bilingual English-Serbian resources. PWN synset definitions were automatically translated and post-edited, if needed, while candidate literals for Serbian synsets were obtained automatically from a list of translational equivalents compiled from bilingual resources. Preliminary results obtained from a set of 1248 selected PWN synsets show that the produced Serbian synsets contain 4024 literals, out of which 2278 were offered by the system we present in this paper, whereas experts added the remaining 1746. Approximately one half of ...... (2006). WS4LR: A Workstation for Lexical Resources. In Proceedings of the Fifth International Conference on Language Resources and Evaluation, LREC 2006, pages 1692–1697. Krstev, C., Stanković, R., and Vitas, D. (2010). A Description of Morphological Features of Serbian: a Revision using Feature System ...
... SerWN. A brief description of these resources follows. Parallel list is a simple bilingual parallel list, developed gradually from various resources and used as an auxiliary resource in WS4LR (later upgraded and dubbed LeXimir), a workstation for lexical resources we have developed (Krstev et al., 2006) ...
... solved the word sense alignment (WSA) task by pairing senses with the same meaning from different lexical-semantic resources. Besides alignment with a developed wordnet, the use of other available resources for development and enrichment of wordnets have also been proposed. Thus, Oliver and Climent (2014) ...Ranka Stanković, Miljana Mladenović, Ivan Obradović, Marko Vitas, Cvetana Krstev. "Resource-based WordNet Augmentation and Enrichment" in Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27-29, 2018, Sofia, Bulgaria, Sofia : The Institute for Bulgarian Language Prof. Lyubomir Andreychin, Bulgarian Academy of Sciences (2018)
-
Improvement of geodatabase queries within GeolISS
Ranka Stanković (2008) ... obtained by the query, can be substantially improved by using various lexical resources. Morphological dictionaries enable morphological expansion of the query, very important in highly inflective languages, such as Serbian. The geological dictionary, developed within GeolISS, supports semantic ...
... Lisbon, Portugal, May 2004, vol. 4, pp. 1103-1106. [11] Krstev C., Pavlović-Lažetić G., Vitas D., Obradović I.: “Using Textual and Lexical Resources in Developing Serbian Wordnet”, Romanian J. Information Science and Technology, Romanian Academy, vol. 7, No. 1–2, pp. 147–161, (2004) [12] Krstev, ...
... for Lexical Resources”. In Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2006, Genoa, Italy, May 2006, pp. 1692–1697 [10] Krstev, C., Vitas D., Stanković R., Obradović I., Pavlović-Lažetić G. (2004) “Combining Heterogeneous Lexical Resources”, in ...Ranka Stanković. "Improvement of geodatabase queries within GeolISS" in Review of the National Center for Digitization, Beograd : Faculty of Mathematics, Belgrade (2008)
-
Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy and the Lexicon-Corpus Interface
Verginica Barbu Mititelu, Voula Giouli, Kilian Evang, Daniel Zeman, Petya Osenova, Carole Tiberius, Simon Krek, Stella Markantonatou, Ivelina Stoyanova, Ranka Stankovic, Christian Chiarcos (2024) We present ongoing activities on defining a lexicon-corpus interface that will serve as a reference for the representation of multiword expressions (of various types: nominal, verbal, etc.) in specialized lexicons and for linking these entries to their occurrences in corpora. The final aim is to use such resources for the automatic identification of multiword expressions in text. The involvement of several natural languages aims at the universality of a solution that is not centered on a particular language, as well as at accommodating idiosyncrasies. Challenges in the lexicographic description of multiword ... Verginica Barbu Mititelu, Voula Giouli, Kilian Evang, Daniel Zeman, Petya Osenova, Carole Tiberius, Simon Krek, Stella Markantonatou, Ivelina Stoyanova, Ranka Stankovic, Christian Chiarcos. "Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy and the Lexicon-Corpus Interface" in Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024, Turin, May 25, 2024, ELRA and ICCL (2024)
-
Proširivanje upita zasnovano na leksičkim resursima (Query Expansion Based on Lexical Resources)
The paper describes how lexical resources for Serbian and software tools developed within the Human Language Technology Group at the University of Belgrade can be used to improve query formulation. Search results can be substantially improved by using various lexical resources, such as morphological dictionaries and semantic networks. The presented approach can also be applied in the System of Scientific, Technological and Business Information, because efficient searching of this valuable resource, given its heterogeneity and volume, as well as its predominantly textual content, ...... WS4QE, accompanied by several web services, that enables the solution of various tasks via the web. Besides a short description of the lexical resources for Serbian involved, we shall also describe how the functions of the WS4LR tool can be used for their maintenance and development, as well as some ...
... Abstract - This paper presents how resources and tools developed within the Human Language Technology Group at the University of Belgrade can be used for improvement of queries. Search results can be substantially improved by using various lexical resources, such as morphological dictionaries ...
... Stanković R., Vitas D., Obradović I., “WS4LR: A Workstation for Lexical Resources”, Proc. of the 5th International Conference on Language Resources and Evaluation, LREC 2006, Genoa, Italy, May 2006, pp. 1692- 1697. [7] Stanković R. (2008) „Improvement of geodatabase queries within GeolISS“, Pregled ...Ranka Stanković, Ivan Obradović, Cvetana Krstev. "Proširivanje upita zasnovano na leksičkim resursima" in SNTPI 09 - Naučno-stručni skup Sistem naučnih, tehnoloških i poslovnih informacija, Beograd 19. i 20. jun 2009, Beograd : Fakultet informacionih tehnologija (2009)
-
Terminology Acquisition and Description Using Lexical Resources and Local Grammars
Acquisition of new terminology from specific domains and its adequate description within terminological dictionaries is a complex task, especially for languages that are morphologically complex such as Serbian. In this paper we present an approach to solving this task semi-automatically on basis of lexical resources and local grammars developed for Serbian. Special attention is given to automatic inflectional class prediction for simple adjectives and nouns and the use of syntactic graphs for extraction of Multi-Word Unit (MWU) candidates for ...... from it and their incorporation in the existing system of morphological e-dictionaries as MWU extraction relies heavily on existing lexical resources. In the Serbian e-dictionary of MWUs, all entries are distributed in classes according to their syntactic structure, or more precisely, according ...
... and Description Using Lexical Resources and Local Grammars Cvetana Krstev, Ranka Stanković, Ivan Obradović, Biljana Lazić Digital Repository of the Faculty of Mining and Geology, University of Belgrade [ДР РГФ] Terminology Acquisition and Description Using Lexical Resources and Local Grammars | Cvetana ...
... especially for languages that are morphologically complex such as Serbian. In this paper we present an approach to solving this task semi-automatically on basis of lexical resources and local grammars developed for Serbian. Special attention is given to automatic inflectional class prediction ... Cvetana Krstev, Ranka Stanković, Ivan Obradović, Biljana Lazić. "Terminology Acquisition and Description Using Lexical Resources and Local Grammars" in Proceedings of the 11th Conference on Terminology and Artificial Intelligence, Granada, Spain, 2015, Granada : LexiCon (Universidad de Granada) (2015)
-
SrpELTeC: A Serbian Literary Corpus for Distant Reading
The article presents SrpELTeC, a corpus developed within the COST Action Distant Reading for European Literary History (CA16204). All novels in SrpELTeC were selected, prepared and annotated using the common principles established for all language collections in the European Literary Text Collection (ELTeC). The challenges and solutions in preparing SrpELTeC from scratch are presented. All novels were manually encoded in TEI with rich metadata and structural annotations. Automatic annotation included POS tagging, lemmatization and named entities, relying on resources for processing ... Keywords: digital humanities, Serbian literature, text corpora, distant reading, linked data, named entity recognition, text analytics. Ranka Stanković, Cvetana Krstev, Duško Vitas. "SrpELTeC: A Serbian Literary Corpus for Distant Reading" in Primerjalna književnost, Research Centre of the Slovenian Academy of Sciences and Arts (2024). https://doi.org/10.3986/pkn.v47.i2.03