GIS Application Improvement with Multilingual Lexical and Terminological Resources
... geodatabase. Multilingual labelling and annotation of maps for their graphic display and printing have been tested with Serbian, which describes regional information in the local language, and English, used for sharing geographic information with the world, although the geological vocabulary offers the ...
... expansion. Multilingual labelling and annotation of maps for their graphic display and printing have been tested with Serbian, which describes regional information in the local language, and English, used for sharing geographic information with the world. However, other languages could also have been ...
... data practically language independent, thus substantially broadening the group of potential users. Figure 3: An example of automatic map annotation in English and Serbian 5. GeolISS Query ... Ranka Stanković, Ivan Obradović, Olivera Kitanović. "GIS Application Improvement with Multilingual Lexical and Terminological Resources" in Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2010, Valletta, Malta, May 2010, Valletta, Malta : European Language Resources Association (2010)
EUROLAN 2021: Introduction to Linked Data for Linguistics Online Training School
The first training school organized by the COST Action NexusLinguarum was held from 8 to 12 February 2021, with the aim of teaching students, researchers and professionals the basics of linguistic data science. During the training, participants were introduced to a wide range of topics: from the Semantic Web, RDF and ontologies, to modelling and querying language data using state-of-the-art ontological models and tools. The school was held as part of the EUROLAN series of summer schools and was organized virtually (online) by several institutes; ...linguistic data science, linked data in linguistics, language data, EUROLAN, NexusLinguarum, COST Action, training school... Danka, Ranka Stanković, Cvetana Krstev, and Branislava Šandrih. 2021. “A Twitter Corpus and lexicon for abusive speech detection in Serbian.” In Proceedings of the 2021 Language, Data and Knowledge (LDK), 1-3 September in Zaragoza, Spain. McCrae, John P, Julia Bosque-Gil, Jorge Gracia, Paul Buitelaar, and ...
... (Resource Description Framework Schema, variously abbreviated as RDFS, RDF(S), RDF-S, or RDF/S), Web Ontology Language (OWL),5 etc.); – SPARQL query language: a semantic query language for databases, able to retrieve and manipulate data stored in the RDF format; 2. EUROLAN 3. Deliverable D1.1 4. ...
... Vila-Suero, and Guadalupe Aguado-De-Cea. 2014. “Enabling Language Resources to Expose Translations as Linked Data on the Web.” In Proceedings of the 9th LREC, edited by Nicoletta Calzolari (Conference Chair) et al. Reykjavik, Iceland: European Language Resources Association (ELRA), May. isbn: 978-2-9517408-8-4 ... Milan Dojchinovski, Julia Bosque Gil, Jorge Gracia, Ranka Stanković. "EUROLAN 2021: Introduction to Linked Data for Linguistics Online Training School" in Infotheca, Faculty of Philology, University of Belgrade (2021). https://doi.org/10.18485/infotheca.2021.21.1.7
Towards Automatic Definition Extraction for Serbian
The paper presents preliminary results of the automatic extraction of candidates for dictionary definitions from unstructured texts in the Serbian language, with the aim of accelerating dictionary development. Definitions in the dictionary of the Serbian Academy of Sciences and Arts (SASA) were used to model different definition types (descriptive, grammatical, referential and synonymous), which have different syntactic and lexical characteristics. The research corpus consists of 61,213 noun definitions, which were analysed using morphological e-dictionaries and local grammars implemented as finite-state transducers in the corpus processing suite of open ...... www.dr.rgf.bg.ac.rs Towards Automatic Definition Extraction for Serbian Stanković Ranka1, Krstev Cvetana1, Stijović Rada2, Gočanin Mirjana2, Škorić Mihailo1 1 University of Belgrade, Serbia 2 Institute for the Serbian Language of SASA, Serbia Abstract The paper presents preliminary results ...
... results of the automatic extraction of candidates for dictionary definitions from unstructured texts in the Serbian language with the aim of accelerating dictionary development. Definitions in the Serbian Academy of Sciences and Arts (SASA) dictionary were used to model different definition types (descriptive ...
... achieved by using electronic dictionaries of the Serbian language and local grammars developed based on the results of some previous similar research, particularly (Barnbrook 2002), analysis of definitions in existing dictionaries and previous research into Serbian (Krstev et al. 2015). 3 Analysis and R ... Ranka Stanković, Cvetana Krstev, Rada Stijović, Mirjana Gočanin, Mihailo Škorić. "Towards Automatic Definition Extraction for Serbian" in Proceedings of the XIX EURALEX Congress of the European Association for Lexicography: Lexicography for Inclusion (Volume 2). 7-9 September (virtual), Democritus University of Thrace (2021)
A Data Driven Approach for Raw Material Terminology
Olivera Kitanović, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić, Ivan Babić, Ljiljana Kolonja (2021) The research presented in this paper aims at creating a bilingual (sr-en), easily searchable, hypertext, born-digital, corpus-based terminological database of raw material terminology for dictionary production. The approach is based on linking dictionaries related to the raw material domain, both digitally born and printed, into a lexicon structure, aligning terminology from different dictionaries as much as possible. This paper presents the main features of this approach, data used for compilation of the terminological database, the procedure by which it has ...raw materials, mining, terminology, dictionary, terminological application, mobile application, digitization, lexical data, corpora, linked open data... few such dictionaries in the Serbian language, as most of the published Serbian terminological dictionaries are only translational (bilingual or multilingual). An ongoing activity is the adaptation of English definitions, which are the most comprehensive in DMMRT, to Serbian, in the post-editing phase ...
... Natural Language Processing (NLP), to develop a semi-automatic pipeline for dictionary production. The approach is focused on raw material terminology, with an emphasis on terminology related to the mining industry, as a case study, the main goal being to cover Serbian and bilingual English-Serbian terminology ...
... Sketch-engine [9] for several languages, but has not yet been used for Serbian. A similar approach to the one outlined in this paper was applied in development of the Sõnaveeb language portal of the Institute of the Estonian Language, which contains data from a number of dictionaries and termbases, with ...Olivera Kitanović, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić, Ivan Babić, Ljiljana Kolonja. "A Data Driven Approach for Raw Material Terminology" in Applied Sciences, MDPI AG (2021). https://doi.org/10.3390/app11072892
Softverski alati za korišćenje resursa za srpski jezik
Ivan Obradović, Ranka Stanković (2008) ... the most important lexical resources for Serbian developed within the Human Language Technology Group. More precisely, about the three basic resources encompassed by WS4LR and WS4QE, namely the system of morphological dictionaries of Serbian, the Serbian wordnet and aligned texts. 2.1 Morphological ...
... in one language (for example Serbian) may be created on basis of an existing synset in another language (for example English). In order to support this feature, the module enables the usage of bilingual, parallel word lists which can help in translation of synset literals in one language synset ...
... BalkaNet languages are spoken, but also from France and Netherlands. A national development team was formed for each language, and in the case of Serbian this team was the Human Language Technology Group at the University of Belgrade. Upon the termination of this project, the development of SWN contin- ...Ivan Obradović, Ranka Stanković. "Softverski alati za korišćenje resursa za srpski jezik" in INFOteka: časopis za informatiku i bibliotekarstvo, Belgrade, Serbia : Zajednica biblioteka univerziteta u Srbiji (2008)
Дигиталне библиотеке у рударству и геологији са посебним освртом на представљање сиве литературе
Given the need to retrieve information stored in the various forms of documentation generated in the fields of mining and geology at the Faculty of Mining and Geology, University of Belgrade, development of the digital library ROmeka@RGF was started on Omeka, a platform for displaying digital collections. A significant part of the documentation is so-called grey literature, which mostly takes the form of multi-volume documentation. The first challenge to be overcome was linking the different volumes of project reports into a single whole that would be easily accessible and searchable. ... which are designed to define document relations. We will also present some language resources for Serbian language which are used to improve information retrieval. Keywords: digital libraries, grey literature, Omeka, language resources, dictionaries. ...
... to Improve the Performance of Web Search Engines”. Sixth International Conference on Language Resources and Evaluation (LREC ‘08), Marrakech, Morocco. Nicoletta Calzolari et al. (ur.). Marrakech : European Language Resources Association (ELRA), 2008. . Okoroma, Francisca. „Grey Literature Management ...
... Search of Multilingual Digital Libraries of E-journals”. Eighth International Conference on Language Resources and Evaluation (LREC), Istanbul, Turkey. Nicoletta Calzolari et al. (ur.). Istanbul : European Language Resources Association (ELRA), 2012. 1710-1717. Stanković, Ranka, Cvetana Krstev, Ivan Obradović ...Биљана Лазић, Александра Томашевић, Михаило Шкорић. "Дигиталне библиотеке у рударству и геологији са посебним освртом на представљање сиве литературе" in Научна конференција Библиоинфо — 55 година од покретања наставе библиотекарства на високошколском нивоу, Београд 18. мај 2017., Филолошки факултет Универзитета у Београду (2019). https://doi.org/10.18485/biblioinfo.2017.ch13
Combining Heterogeneous Lexical Resources
... tasks of the Natural Language Processing Group at the Faculty of Mathematics, University of Belgrade is the development of various lexical resources. Among them the two most important ones are: the system of morphological dictionaries of Serbian (SMD) in Intex format and the Serbian wordnet (SWN) developed ...
... documents. Figure 1 shows the graphical representation of XSD schema of Serbian WN. The XML Path Language (XPath) provides a language for addressing parts of an XML document. XPath treats an XML document as a tree of interrelated branches and nodes. A node in a XML document can be an element ...
... started a good many years before WN, so it more thoroughly covers the language. As a consequence, the Serbian MD can benefit less from the WN than vice versa. For that reason, the production of the fully semantically marked Serbian DELAS has been postponed until the two resources become comparable ... Cvetana Krstev, Duško Vitas, Ranka Stanković, Ivan Obradović, Gordana Pavlović-Lažetić. "Combining Heterogeneous Lexical Resources" in Proceedings of the Fourth International Conference on Language Resources and Evaluation, Lisbon, Portugal, May 2004, vol. 4, ELRA - European Language Resources Association (2004)
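The snippet above describes XPath as a way of addressing parts of an XML document viewed as a tree. A minimal sketch of that idea in Python follows, using the limited XPath subset supported by the standard library. The element names (`WN`, `SYNSET`, `POS`, `SYNONYM`, `LITERAL`) and the sample entries are assumptions made for illustration; they are not taken from the actual XSD schema of the Serbian wordnet.

```python
import xml.etree.ElementTree as ET

# Hypothetical wordnet fragment: tag names and values are illustrative
# assumptions, not the real Serbian WN schema.
SAMPLE = """
<WN>
  <SYNSET>
    <ID>SYN-1</ID>
    <POS>n</POS>
    <SYNONYM><LITERAL>entitet</LITERAL></SYNONYM>
  </SYNSET>
  <SYNSET>
    <ID>SYN-2</ID>
    <POS>v</POS>
    <SYNONYM><LITERAL>disati</LITERAL></SYNONYM>
  </SYNSET>
</WN>
"""

def literals_for_pos(xml_text, pos):
    """Return the literals of all synsets whose POS child equals `pos`."""
    root = ET.fromstring(xml_text)
    # Path expression: SYNSET children of the root that have a POS child
    # whose text equals the requested value.
    synsets = root.findall(f"./SYNSET[POS='{pos}']")
    # A second path walks from each selected synset down to its literals.
    return [lit.text for s in synsets for lit in s.findall("./SYNONYM/LITERAL")]

print(literals_for_pos(SAMPLE, "n"))
```

The predicate in square brackets selects nodes by the content of a child node, which is the tree-addressing behaviour the snippet refers to.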
Advantages of python programming language in hydrological model development
Milan Tucaković, Dragoljub Bajić, Vesna Ristić Vakanjac, Dušan Polomčić. "Advantages of python programming language in hydrological model development" in Proceedings of the XVIII Serbian Geological Congress, Divčibare, Serbia, 01-04 June 2022, Serbian Geological Society (2022)
A Description of Morphological Features of Serbian: a Revision using Feature System Declaration
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ...... (2008). Unitex 2.1 User Manual, http://www-igm.univ-mlv.fr/~unitex/UnitexManual2.1.pdf. Popović, Z. (2009) Taggers Applied On Texts On Serbian Language, Language Tools And Machine Learning. In Infotheca, Vol. X, No. 2, (to appear). Przepiórkowski, A. and Woliński, M. (2003) A Flexemic Tagset For ...
... 07 Language resource management - Feature Structures – Part 2: Feature System Declaration, ISO/TC 37/SC 4. ISO. (2009) ISO 12620 Terminology and other language and content resources – Data Categories – Specification of data categories and management of a data category registry for language resources ...
... the satisfactory solution. 1. Motivation Description of morphological features of a language is a prerequisite for many NLP applications. This description can be simple or complex depending both on a language and application in question. Considerable efforts in standardizing such a description ... Cvetana Krstev, Ranka Stanković, Vitas Duško. "A Description of Morphological Features of Serbian: a Revision using Feature System Declaration" in Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2010, Valletta, Malta : European Language Resources Association (2010)
A Tool for Enhanced Search of Multilingual Digital Libraries of E-journals
This paper outlines the main features of Bibliša, a tool that offers various possibilities of enhancing queries submitted to large collections of TMX documents generated from aligned parallel articles residing in multilingual digital libraries of e-journals. The queries initiated by a simple or multiword keyword, in Serbian or English, can be expanded by Bibliša, both semantically and morphologically, using different supporting monolingual and multilingual resources, such as wordnets and electronic dictionaries. The tool operates within a complex system composed ...... can be either Serbian or English, thus they switch places as the source and target language. The example in Figure 1 shows a TU from an INFOtheca journal article in English translated into Serbian: Figure 1. A translation unit (TU) from text in English translated into Serbian Besides TMX ...
... simple and multiword keywords in more than one language. In addition to that, user queries can be expanded, both semantically and morphologically, the latter being very important in highly inflective languages, such as Serbian. Open access Serbian scientific journals are increasingly present ...
... metadata All metadata, except language independent data, such as the numeration metadata (, , , , ), the and , are entered in both languages (Serbian and English), using the attribute xml:lang to denote the language of the content (see Figure 2) ... Ranka Stanković, Cvetana Krstev, Ivan Obradović, Aleksandra Trtovac, Miloš Utvić. "A Tool for Enhanced Search of Multilingual Digital Libraries of E-journals" in Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012, May 2012, Istanbul, Turkey, Istanbul, Turkey : European Language Resources Association (2012)
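The snippet above notes that metadata are entered in both languages, with the `xml:lang` attribute marking the language of each element's content. A minimal sketch of reading such bilingual metadata in Python follows. The `article`, `title` and `volume` element names and their values are hypothetical, chosen only for illustration; the source does not show the actual metadata schema.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical metadata record (element names and values are assumptions).
SAMPLE = """
<article>
  <title xml:lang="sr">Primer naslova</title>
  <title xml:lang="en">Example title</title>
  <volume>12</volume>
</article>
"""

def by_language(xml_text, tag):
    """Map language code -> text for every `tag` child carrying xml:lang."""
    root = ET.fromstring(xml_text)
    return {
        el.get(XML_LANG): el.text
        for el in root.findall(tag)
        if el.get(XML_LANG) is not None
    }

print(by_language(SAMPLE, "title"))
```

Language-independent elements, such as the hypothetical `volume` here, carry no `xml:lang` attribute and are therefore left out of the mapping.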
OntoLex Publication Made Easy: A Dataset of Verbal Aspectual Pairs for Bosnian, Croatian and Serbian
This paper presents a new language resource for searching and exploring verbal aspectual pairs in BCS (Bosnian, Croatian and Serbian), created using the principles of Linguistic Linked Open Data (LLOD). Since there is no resource that helps learners of Bosnian, Croatian and Serbian as foreign languages to recognize the aspect of a verb or its pairs, we created a new resource that provides users with information about aspect, as well as links to the verbs' aspectual pairs. The resource also contains external links to monolingual dictionaries, Wordnet and BabelNet. ... Ranka Stanković, Maxim Ionov, Medina Bajtarević, Lorena Ninčević. "OntoLex Publication Made Easy: A Dataset of Verbal Aspectual Pairs for Bosnian, Croatian and Serbian" in Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024, Turin, 20-25 May 2024, ELRA and ICCL (2024)
English for Geology Students 1 – Dyslexia friendly
Lidija Beko (2023) Lidija Beko. English for Geology Students 1 – Dyslexia friendly, Belgrade : The Faculty of Mining and Geology, 2023
Quantitative analysis of syllable properties in Croatian, Serbian, Russian, and Ukrainian
Biljana Rujević, Marija Kaplar, Sebastijan Kaplar, Ranka Stanković, Ivan Obradović, Jan Mačutek (2021) Biljana Rujević, Marija Kaplar, Sebastijan Kaplar, Ranka Stanković, Ivan Obradović, Jan Mačutek. "Quantitative analysis of syllable properties in Croatian, Serbian, Russian, and Ukrainian" in Language and Text: Data, models, information and applications, John Benjamins Publishing Company (2021). https://doi.org/10.1075/cilt.356.04ruj
English for Geology Students 2 - Dyslexia friendly
Lidija Beko (2023) Lidija Beko. English for Geology Students 2 - Dyslexia friendly, Belgrade : The Faculty of Mining and Geology, 2023
Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection
Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, Duško Vitas, Mihailo Škorić, Milica Ikonić Nešić (2022)In this paper we present the Serbian part of the ELTeC multilingual corpus of novels written in the time period 1840-1920. The corpus is being built in order to test various distant reading methods and tools with the aim of re-thinking the European literary history. We present the various steps that led to the production of the Serbian sub-collection: the novel selection and retrieval, text preparation, structural annotation, POS-tagging, lemmatization and named entity recognition. The Serbian sub-collection was published ...Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, Duško Vitas, Mihailo Škorić, Milica Ikonić Nešić. "Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection" in Proceedings of the Language Resources and Evaluation Conference, June 2022, Marseille, France, European Language Resources Association (2022)
Improvement of geodatabase queries within GeolISS
Ranka Stanković (2008)... expansion of the query, very important in highly inflective languages, such as Serbian. The geological dictionary, developed within GeolISS, supports semantic and multilingual expansions of the query. The Human Language Technology group at the University of Belgrade (HLT) has been developing various ...
... Vitas D., G. Pavlović-Lažetić, C. Krstev, Lj. Popović, I. Obradović (2003): „Processing Serbian Written Texts: An Overview of Resources and Basic Tools“, Proceedings of the International Workshop on Balkan Language Resources and Tools, Thessaloniki, Greece, November 2003, S. Piperidis, V. Karakaletsis ...
... Resources in Developing Serbian Wordnet”, Romanian J. Information Science and Technology, Romanian Academy, vol. 7, No. 1–2, pp. 147–161, (2004) [12] Krstev, C., Vitas, D., Maurel, D., Tran, M. (2005). “Multilingual Ontology of Proper Names”. In Proc. of Second Language & Technology Conference ...Ranka Stanković. "Improvement of geodatabase queries within GeolISS" in Review of the National Center for Digitization, Beograd : Faculty of Mathematics, Belgrade (2008)
Sentiment Analysis of Serbian Old Novels
In this paper we present the first study of Sentiment Analysis (SA) of Serbian novels from the 1840-1920 period. The preparation of the sentiment lexicon was based on three existing lexicons: NRC, AFINN and Bing, with additional extensive corrections. The first phase of dataset refinement included filtering out the words that are not found in the Serbian morphological dictionary, and in the second phase the automatically assigned POS tags and lemmas were manually corrected. The polarity lexicon was extracted and transformed into ontolex-lemon and published as initial ... Ranka Stanković, Miloš Košprdić, Milica Ikonić Nešić, Tijana Radović. "Sentiment Analysis of Serbian Old Novels" in Proceedings of the 2nd Workshop on Sentiment Analysis and Linguistic Linked Data, June 2022, Marseille, France, European Language Resources Association (2022)
A Mathematical Learning Environment Based on Serbian Language Resources
In recent years, in line with ever growing usage of Information technology, the learning environments are changing. The amount of available learning materials in various forms has increased. These new environments demand comprehensive learning systems, which enable management of the learning corpus with special attention paid to relevant lexical resources. In this paper we present the concept of a Mathematical Learning Environment in Serbian (MLES), which is based on a corpus of mathematical materials and various lexical resources, enabling ...... Environment Based on Serbian Language Resources Radojičić Marija, Obradović Ivan, Stanković Ranka, Utvić Miloš, Kaplar Sebastijan Digital Repository of the Faculty of Mining and Geology, University of Belgrade [DR RGF] A Mathematical Learning Environment Based on Serbian Language Resources | Radojičić ...
... Sciences, Čačak, Serbia, 25-27th May 2018 Session 2: IT Education and Practice UDC: 51-7 248 A Mathematical Learning Environment Based on Serbian Language Resources Marija Radojičić1*, Ivan Obradović2, Ranka Stanković2, Miloš Utvić3, Sebastijan Kaplar1 1 University of Novi Sad/Faculty of Technical ...
... of this content. It relies on existing lexical resources, morphological e-dictionaries and WordNet of Serbian, which have been developed within the University of Belgrade Human Language Technology group for several decades [1], as well as a newly developed glossary, Termi. The system is ... Radojičić Marija, Obradović Ivan, Stanković Ranka, Utvić Miloš, Kaplar Sebastijan. "A Mathematical Learning Environment Based on Serbian Language Resources" in Proceedings of the 7th International Scientific Conference Technics and Informatics in Education, Faculty of Technical Sciences, Čačak (2018)
From ELTeC Text Collection Metadata and Named Entities to Linked-data (and Back)
In this paper we present the wikification of the ELTeC (European Literary Text Collection), developed within the COST Action "Distant Reading for European Literary History" (CA16204). ELTeC is a multilingual corpus of novels written in the time period 1840-1920, built to apply distant reading methods and tools to explore the European literary history. We present the pipeline that led to the production of the linked dataset, the novels’ metadata retrieval and named entity recognition, transformation, mapping and Wikidata population, ... Milica Ikonić Nešić, Ranka Stanković, Christof Schöch and Mihailo Škorić. "From ELTeC Text Collection Metadata and Named Entities to Linked-data (and Back)" in Proceedings of The 8th Workshop on Linked Data in Linguistics within the 13th Language Resources and Evaluation Conference, June 2022, Marseille, France, European Language Resources Association (2022)
The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines
In this paper we present how resources and tools developed within the Human Language Technology Group at the University of Belgrade can be used for tuning queries before submitting them to a web search engine. We argue that the selection of words chosen for a query, which are of paramount importance for the quality of results obtained by the query, can be substantially improved by using various lexical resources, such as morphological dictionaries and wordnets. These dictionaries enable semantic ...LR web services, MultiWord Expressions & Collocations, Information Extraction, Information Retrieval... semantic and morphological expansion of the query, the latter being very important in highly inflective languages, such as Serbian. Wordnets can also be used for adding another language to a query, if appropriate, thus making the query bilingual. Problems encountered in retrieving documents of interest ...
... precision in retrieving documents from the web we have developed WS4QE (Work Station for Query Expansion) which uses various language resources we have developed for Serbian (Krstev et al., 2008). These resources include morphological e-dictionaries and finite state transducers, which offer the ...
... in Cyrillic (Figure 3). 2 For reasons of flexibility, letters specific to the Serbian language, ć, č, š, ž, đ, dž, lj and nj, are internally coded as cx, cy, sx, zx, dx, dy, lx and nx, respectively. Figure 2. Semantic expansion ... Krstev Cvetana, Stanković Ranka, Vitas Duško, Obradović Ivan. "The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines" in LREC 2008: Conference on Language Resources and Evaluation, Marrakesh, Morocco, May 2008, European Language Resources Association (ELRA) (2008)
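The footnote quoted above lists the internal ASCII codes used for Serbian-specific letters. A minimal Python sketch of such an encoding follows. The mapping table comes from the footnote; the function name, the lowercase-only handling, and the choice to treat every "lj", "nj" and "dž" sequence as a digraph are simplifying assumptions of this sketch, not a description of how WS4QE implements it.

```python
import re

# Mapping taken from the footnote; digraphs are listed before single letters.
CODES = {
    "dž": "dy", "lj": "lx", "nj": "nx",
    "ć": "cx", "č": "cy", "š": "sx", "ž": "zx", "đ": "dx",
}

# One alternation over all keys. Matching proceeds left to right, so the
# "d" of "dž" is consumed together with "ž" as a single digraph.
_PATTERN = re.compile("|".join(sorted(CODES, key=len, reverse=True)))

def encode(text):
    """Replace Serbian-specific lowercase letters with their ASCII codes."""
    return _PATTERN.sub(lambda m: CODES[m.group(0)], text)

print(encode("šećer"))
print(encode("džak"))
```

Uppercase letters pass through unchanged in this sketch, and words in which "n" and "j" are separate letters would be coded incorrectly, so a real implementation would need additional rules.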