Search
97 items
-
Managing mining project documentation using human language technology
Purpose: This paper aims to develop a system that enables efficient management and exploitation of electronic documentation related to mining projects, with information retrieval and information extraction (IE) features, using various language resources and natural language processing. Design/methodology/approach: The system is designed to integrate textual, lexical, semantic and terminological resources, enabling advanced document search and extraction of information. These resources are integrated with a set of Web services and applications, for different user profiles and use-cases. Findings: The ...
Keywords: Digital libraries, Information retrieval, Data mining, Human language technologies, Project documentation
Aleksandra Tomašević, Ranka Stanković, Miloš Utvić, Ivan Obradović, Božo Kolonja. "Managing mining project documentation using human language technology" in The Electronic Library (2018). https://doi.org/10.1108/EL-11-2017-0239
-
Indexing of textual databases based on lexical resources: A case study for Serbian
In this paper we describe an approach to improving information retrieval results for large textual databases by pre-indexing documents using bag-of-words and Named Entity Recognition. The approach was applied to a database of geological projects financed by the Republic of Serbia over the last half century. Each document within this database is described by metadata, consisting of several fields such as title, domain, keywords, abstract, geographical location and the like. A bag of words was produced from these ...
... from our collection, and additionally it does not take into account MWUs. The approach described in this paper bases lemmatization on morphological electronic dictionaries and finite state transducers for Serbian [6]. 4.1 Used Resources Lexical Resources. The resources for natural language processing ...
... pre-indexing already outperforms it, its main advantage is that it can be improved and there are various means to do that: – Enriching morphological e-dictionaries of simple words and MWUs by terms from geological domain; – Adapting NERs to the new domain and text type (project rather than newspapers) ...
... search and access to information they need. The content on the portal can be grouped into several categories: cartographic content, multimedia, dictionaries and textual databases. The “core” is the whole information system of the Geological Dictionary (Thesaurus) containing about 4,000 geological terms ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Indexing of textual databases based on lexical resources: A case study for Serbian" in Semantic Keyword-based Search on Structured Data Sources : First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers, Springer (2015). https://doi.org/10.1007/978-3-319-27932-9_15
-
GIS Application Improvement with Multilingual Lexical and Terminological Resources
... application. Morphological dictionaries of simple words and compounds are in the so called LADL format (Courtois et al., 1990), and basically consist of lemmas accompanied by inflectional class codes, which enable automatic production of all inflectional forms. The Serbian morphological dictionary ...
... data from morphological dictionaries of simple words. Automatic creation of lemmas for compounds is of special importance for technical applications, as is the case here. Namely, it often happens that a technical term, which is frequently a compound, is not in the morphological e-dictionary ...
... resources, morphological dictionaries and transducers in the first place. For illustration purposes, the query for retrieval of geological units containing ‘limestone’ (‘krečnjak’ in Serbian) in their description field was submitted twice: once without and once with morphological expansion. ...
Ranka Stanković, Ivan Obradović, Olivera Kitanović. "GIS Application Improvement with Multilingual Lexical and Terminological Resources" in Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2010, Valetta, Malta, May 2010, Valetta, Malta : European Language Resources Association (2010)
-
Building Terminological Resources in an e-Learning Environment
... content of the database. Some other controlled dictionaries that can be derived from RudOnto are the Geostatistics dictionary, Mine safety protection dictionary, Mineral resource exploitation dictionary, Petroleum exploitation dictionary, but also dictionaries of general terms, namely those not strictly ...
... is still being intensively enlarged and refined. However, it is also already being used, among other things for the production of controlled dictionaries related to planning and management of exploitation, to mine safety protection systems, mining equipment management, human resources management ...
... whereas all others are represented as synonyms. As we have already mentioned, RudOnto is used, among other things, for production of controlled dictionaries. A controlled dictionary is a consistent collection of terms selected for a specific purpose, chosen by its author. For example, a ...
Ranka Stanković, Ivan Obradović, Olivera Kitanović, Ljiljana Kolonja. "Building Terminological Resources in an e-Learning Environment" in Proceedings of the Third International Conference on e-Learning, eLearning-2012, September 2012, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2012)
-
The Many Faces of SrpKor
The acronym SrpKor denotes a family of electronic corpora of contemporary Serbian whose construction began in the late 1970s and which became more widely visible to the interested research community with the publication of its first version on the web in 2002. During this long period, especially before useful textual resources appeared on the web, corpus development consisted of collecting and processing material as well as developing corpus processing methods. Namely, an electronic corpus is not merely a collection of texts in digital form (as is stated, for example, ...
Duško Vitas, Ranka Stanković, Cvetana Krstev. "The Many Faces of SrpKor" in South Slavic Languages in the Digital Environment JuDig Book of Abstracts, University of Belgrade - Faculty of Philology, Serbia, November 21-23, 2024, University of Belgrade - Faculty of Philology (2024)
-
A Mathematical Learning Environment Based on Serbian Language Resources
In recent years, in line with the ever-growing use of information technology, learning environments are changing. The amount of available learning materials in various forms has increased. These new environments demand comprehensive learning systems, which enable management of the learning corpus with special attention paid to relevant lexical resources. In this paper we present the concept of a Mathematical Learning Environment in Serbian (MLES), which is based on a corpus of mathematical materials and various lexical resources, enabling ...
... of mathematical content and provides mechanisms for processing and search of this content. It relies on existing lexical resources, morphological e-dictionaries and WordNet of Serbian, which have been developed within the University of Belgrade Human Language Technology group for several decades ...
... the Latin alphabet. In textual format the corpus contains 1,802,519 simple forms of which 118,027 are different. Existing Serbian morphological e-dictionaries of simple forms (DELAS) and inflected forms (DELAF) contain 135,000 lemmas [11], among which only 65 are marked as belonging to the ...
... as: ENG30-13860281-n implication:4, logical implication:1, conditional relation: ENG30-13859307-n difference:4. The enrichment of morphological dictionaries and SWN should be complemented by content synchronization (entries and literals), as well as domain markers. In the existing dictionary ...
Radojičić Marija, Obradović Ivan, Stanković Ranka, Utvić Miloš, Kaplar Sebastijan. "A Mathematical Learning Environment Based on Serbian Language Resources" in Proceedings of the 7th International Scientific Conference Technics and Informatics in Education, Faculty of Technical Sciences, Čačak (2018)
-
A Description of Morphological Features of Serbian: a Revision using Feature System Declaration
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the problems encountered using Serbian. We have identified four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of support for some types of categories. At the same time, various descriptions often describe exactly the same ...
... description of morphological features MULTEXT-East (Erjavec, 2004) was used in several projects (Kešelj et al., 2004), (Popović, 2009). Serbian morphological dictionaries of simple words and compounds developed in the LADL format (Courtois & Silberztein, 1990) use a different morphological description ...
... Solution Our main goal was to produce a new systematic morphological description of Serbian: (1) that would not deviate much from the traditional description; (2) that would be compatible with already developed morphological dictionaries; (3) that would be compatible with MULTEXT-East description; ...
... attribute. For instance, in our description for verbs, verbal forms are strictly separated from tense, voice and aspect, as usual in LADL morphological dictionaries in general. Constraints specify which forms are used with these verbal features, and which of them are realized as simple or compound. ...
Cvetana Krstev, Ranka Stanković, Vitas Duško. "A Description of Morphological Features of Serbian: a Revision using Feature System Declaration" in Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2010, Valetta, Malta : European Language Resources Association (2010)
-
Improving Document Retrieval in Large Domain Specific Textual Databases Using Lexical Resources
Large collections of textual documents represent an example of big data that requires the solution of three basic problems: the representation of documents, the representation of information needs and the matching of the two representations. This paper outlines the introduction of document indexing as a possible solution to document representation. Documents within a large textual database developed for geological projects in the Republic of Serbia for many years were indexed using methods developed within digital humanities: bag-of-words and named ...
... keywords, abstract, and geographical location. These metadata were used for generating a bag of words for each document with the aid of morphological dictionaries and transducers. Named entities within metadata were also recognized with the help of a rule-based system. Both the bag of words and the ...
... applies lexical resources for Serbian (morphological dictionaries and the WordNet) for text categorization. Mladenović and associates [18] use the same resources for document-level sentiment polarity ...
... in developing an improved solution for searching the textual database of geological projects described in this paper is based on morphological electronic dictionaries and finite-state transducers for Serbian [12]. 3.1 Used Resources Lexical Resources. The resources for natural language processing ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Improving Document Retrieval in Large Domain Specific Textual Databases Using Lexical Resources" in Trans. Computational Collective Intelligence - Lecture Notes in Computer Science 26, Springer (2017). https://doi.org/10.1007/978-3-319-59268-8_8
-
Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities
This paper presents work on the development of the ELEXIS-sr corpus, the Serbian addition to the ELEXIS multilingual annotated corpus, which consists of semantic annotations and a word sense repository. ELEXIS is a parallel multilingual annotated corpus in ten European languages that can be used as a multilingual benchmark for evaluating low- and medium-resourced European languages. The focus of this paper is on multiword expressions and named entities, their recognition in the ELEXIS-sr sentence set, and comparison with annotations in other languages. The first steps are discussed ...
Cvetana Krstev, Ranka Stanković, Aleksandra Marković, Teodora Mihajlov. "Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities" in Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024, Turin, May 25, 2024, ELRA and ICCL (2024)
-
Improvement of geodatabase queries within GeolISS
Ranka Stanković (2008)
... result. WS4LR handles simultaneously several types of resources, one of them being the system of morphological dictionaries of Serbian simple words and compounds in LADL format. Morphological dictionaries in the same format exist for many other languages, including French, English, Greek, Portuguese ...
... substantially improved by using various lexical resources, such as morphological dictionaries and a geological dictionary. These lexical resources used within WS4QE (Workstation for query expansion) enable semantic and morphological expansion of the query, the latter being very important in highly ...
... importance for the quality of results obtained by the query, can be substantially improved by using various lexical resources. Morphological dictionaries enable morphological expansion of the query, very important in highly inflective languages, such as Serbian. The geological dictionary, developed ...
Ranka Stanković. "Improvement of geodatabase queries within GeolISS" in Review of the National Center for Digitization, Beograd : Faculty of Mathematics, Belgrade (2008)
-
Multi-word Expressions for Abusive Speech Detection in Serbian
This paper presents research on refining and improving the Serbian version of the Hurtlex lexicon, a multilingual lexicon of offensive words. We pay special attention to adding multi-word expressions that can be considered offensive, since such lexical entries are very important for achieving good results in a multitude of offensive language detection tasks. Serbian morphological dictionaries are used as a basis for data cleaning and dictionary creation. A connection to other lexical and semantic resources in Serbian is highlighted, and the construction of a system for ...
... abusive, as such lexical entries are very important in obtaining good results in a plethora of abusive language detection tasks. We use Serbian morphological dictionaries as a basis for data cleaning and MWE dictionary creation. A connection to other lexical and semantic resources in Serbian is outlined and ...
... are working on is the first one of its kind. Still, some resources that will facilitate abusive language detection already exist. Serbian Morphological Dictionaries are certainly a staple in processing texts in Serbian (Krstev, 2008). In order to process implicitly abusive language, we need to take into ...
... lemmas, written in both Latin and Cyrillic alphabet. After alphabet unification, 1819 unique lemmas were first analysed using the Serbian Morphological Dictionaries (Krstev, 2008). The manual check-up of unrecognised words followed, resulting in the removal of 803 entries (602 unique). Our next task ...
Ranka Stanković, Jelena Mitrović, Danka Jokić, Cvetana Krstev. "Multi-word Expressions for Abusive Speech Detection in Serbian" in Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, Association for Computational Linguistics (2020)
-
SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian
Ranka Stanković, Branislava Šandrih, Rada Stijović, Cvetana Krstev, Duško Vitas, Aleksandra Marković (2019)
In this paper we present a model for selecting good examples for a dictionary of the Serbian language and the development of the model's initial components. The method used is based on a detailed analysis of various lexical and syntactic features in a corpus compiled from examples from five digitized volumes of the SASA dictionary. The initial feature set was inspired by a similar approach for other languages. The distribution of features of the examples from this corpus is compared with the feature distribution of sentence samples excerpted from corpora containing various texts. The analysis showed that ...
Keywords: Serbian, good dictionary examples, automation of dictionary making, feature extraction, machine learning
... descriptive dictionaries of the Serbian language. The approach was motivated by the need for modernization of the dictionary-making process for the dictionary of the Serbian Academy of Sciences and Arts (SASA), a large monolingual thesaurus of Serbian, as well as for the production of new dictionaries of Serbian ...
... various different goals: speeding up the dictionary-making process, but also the development of a lexical database as the source for building new dictionaries of Serbian. In the e-lexicography era, with the imperatives of faster dictionary-making and “smart le ...
... expert knowledge, is the basis for the improvement of dictionary example selection which will be useful both for the production of different dictionaries of Serbian and the forthcoming volumes of the SASA dictionary. Section 2 describes some steps towards modernization of the dictionary-making ...
Ranka Stanković, Branislava Šandrih, Rada Stijović, Cvetana Krstev, Duško Vitas, Aleksandra Marković. "SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian" in Electronic lexicography in the 21st century. Proceedings of the eLex 2019 conference, Lexical Computing CZ, s.r.o. (2019)
-
E-Connecting Balkan Languages
In this paper we present a versatile language processing tool that can be successfully used for many Balkan languages. This tool relies on several sophisticated textual and lexical resources that were developed for most Balkan languages. These resources are based on several de facto standards in natural language processing.
... fost izgoniţi din Illinois. 2.2 Morphological Dictionaries in LADL Format Morphological dictionaries are a necessary resource in various phases of the automatic analysis of text. The tool WS4LR expects morphological dictionaries to be in the format known as DELAS/DELAF presented ...
... inflection” should be checked. Morphological expansion is performed by Unitex modules that use morphological dictionaries of simple words as well as inflectional transducers. This option works only if a particular query keyword is listed in the morphological dictionary of the corresponding ...
... expansions for those languages as well. For Greek [12] and Romanian [3], morphological dictionaries in LADL format were also developed – however, these resources were not at our disposal so we could not experiment with morphological expansion for these languages. The possibility and the need for ...
Cvetana Krstev, Ranka Stanković, Duško Vitas, Svetla Koeva. "E-Connecting Balkan Languages" in Proceedings of the Workshop on Multilingual resources, technologies and evaluation for Central and Eastern European Languages, 17 September 2009, eds. C. Vertan, S. Piperidis, E. Paskaleva and Milena Slavcheva, Borovets, Bulgaria : Association for Computational Linguistics Stroudsburg, PA, USA (2009)
-
A bilingual digital library for academic and entrepreneurial knowledge management
A generic knowledge management process of organization, storage and retrieval of knowledge can suitably be fitted in a digital library. In the digital and knowledge age digital libraries can be used in knowledge management to handle intellectual assets and support knowledge creation. A multilingual digital library either stores content in more than one language or provides multilingual query access to monolingual content. In Serbia, 18 of the 308 regularly published scientific journals are bilingual, with papers simultaneously being in English ...
... Lexical resources Lexical Resources are used to enhance and refine users’ queries. The query expansion is supported by e-dictionaries (Serbian morphological e-dictionaries), general purpose semantic networks (English and Serbian WordNets) and domain terminological bases and ontologies. A Dictionary ...
... performs semantic and multilingual query expansion. If the user so specifies, Bibliša forwards this query for further morphological expansion, based on morphological e-dictionaries, the system of rules for multi-word inflection, and finite automata and transducers (Krstev et al, 2008; Krstev 2008) ...
... [Figure labels: Browse; Biblisha; XQuery + XML; Lexical resources & web services; Biblimir; RudOnto; GeolISSTerm; E-dictionaries; Grammars; Vebran; Semantic and multilingual query expansion; Morphological query expansion; LeXimir; Keyword query expansion] ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Dalibor Vorkapić. "A bilingual digital library for academic and entrepreneurial knowledge management" in Proceeding of 10th International Forum on Knowledge Asset Dynamics - IFKAD 2015: Culture, Innovation and Entrepreneurship: connecting the knowledge dots, Bari, Italy, 10-12 June 2015, Bari : IFKAD (2015)
-
Using technology for knowledge transfer between academia and enterprises
Ivan Obradović, Ranka Stanković (2014)
... grammars. Bilingual dictionaries in electronic form are one of the simplest multilingual lexical resources. However, for their full functionality in languages with complex morphology, such as Serbian, they need to be coupled with morphological dictionaries. Morphological dictionaries of Serbian simple ...
... International Conference on e-Learning, eLearning-2012, 114-119. Stanković, R., Obradović, I., Krstev, C., & Vitas, D. (2011). Production of morphological dictionaries of multi-word units using a multipurpose tool. In Proceedings of the Computational Linguistics-Applications Conference, CLA '11, 77-84 ...
... representing specific concepts, called synsets, with a semantic network formed on basis of semantic relations between them. Akin to standard dictionaries, each synset word, or literal, is composed of a literal string and a sense tag, representing the sense of the literal string specific to that ...
Ivan Obradović, Ranka Stanković. "Using technology for knowledge transfer between academia and enterprises" in Knowledge and Management Models for Sustainable Growth, Proc. of IFKAD 2014, 9th International Forum on Knowledge Asset Dynamics, 11-13 June 2013, Matera, Italy, Bari : IFKAD (2014)
-
Building learning capacity by blending different sources of knowledge
... resources in general are bilingual dictionaries in electronic form. However, for their full functionality in languages with complex morphology, such as Serbian, they need to be coupled with language specific morphological dictionaries. Morphological dictionaries of Serbian simple words and compounds ...
... for many other languages, including English and Russian, which are also envisaged as OER languages within the BAEKTEL network. Besides morphological dictionaries, for full functionality of the language support system grammars are also needed, and they are implemented by the so called finite state ...
... International Conference on e-Learning, eLearning-2012, 114-119. Stanković, R., Obradović, I., Krstev, C., & Vitas, D. (2011). Production of morphological dictionaries of multi-word units using a multipurpose tool. In Proceedings of the Computational Linguistics-Applications Conference, CLA '11, 77-84 ...
Ivan Obradović, Ranka Stanković, Olivera Kitanović, Dalibor Vorkapić. "Building learning capacity by blending different sources of knowledge" in International Journal of Learning and Intellectual Capital (2016). https://doi.org/10.1504/IJLIC.2016.075698
-
Two approaches to compilation of bilingual multi-word terminology lists from lexical resources
In this paper, we present two approaches and the implemented system for bilingual terminology extraction that rely on an aligned bilingual domain corpus, a terminology extractor for a target language, and a tool for chunk alignment. The two approaches differ in the way terminology for the source language is obtained: the first relies on an existing domain terminology lexicon, while the second one uses a term extraction tool. For both approaches, four experiments were performed with two parameters being ...
Branislava Šandrih, Cvetana Krstev, Ranka Stanković. "Two approaches to compilation of bilingual multi-word terminology lists from lexical resources" in Natural Language Engineering, Cambridge University Press (CUP) (2020). https://doi.org/10.1017/S1351324919000615
-
Developing Students’ Mining and Geology Vocabulary Through Flashcards and L1 in the CLIL Classroom
... minimum autonomy at the tertiary level starts at around 3000 words allowing a learner to read a text without the need to refer constantly to dictionaries or the teacher, we hypothesized that with the use of flashcards and judicious use of L1 in the CLIL environment, our students will have larger ...
... of deep strategies which take more time but ensure greater retention and ease retrieval memory (Nation 2003 22). By making extensive use of dictionaries and exposure to relevant items, this type of learning builds up deeper knowledge, and from a cost vs. benefit view, cost of teaching them is justified ...
... students may study, or, alternatively, flashcards containing the requested term can be given to the students to fill in by using a wide range of dictionaries and appropriate sources (Figure 1). Flashcards of sufficient quality may be added to the central terminology database of the Faculty. ...
Lidija Beko, Ivan Obradović, Ranka Stanković. "Developing Students’ Mining and Geology Vocabulary Through Flashcards and L1 in the CLIL Classroom" in Proceedings of the Second International Conference on Teaching English for Specific Purposes and New Language Learning Technologies, May 22-24, 2015, Niš, Serbia, Faculty of Electronic Engineering, University of Niš, Niš : Faculty of Electronic Engineering (2015)
-
The Use of the Omeka Semantic Platform for the Development of the University of Belgrade, Faculty of Mining and Geology Digital Repository
Under the regulations of the Ministry of Education, Science and Technological Development, a digital repository based on the Omeka S data storage platform has been developed for the Faculty of Mining and Geology. The platform has been upgraded with the required modular extensions, Solr index and automatic OCR. Furthermore, document indexing and search have been fine-tuned with the aid of e-dictionaries of the Serbian language, which has brought about outstanding results in terms of usage facilitation and overall ...
Petar Popović, Mihailo Škorić, Biljana Rujević. "The Use of the Omeka Semantic Platform for the Development of the University of Belgrade, Faculty of Mining and Geology Digital Repository" in Infotheca, Faculty of Philology, University of Belgrade (2021). https://doi.org/10.18485/infotheca.2020.20.1_2.9
-
Sentiment Analysis of Serbian Old Novels
In this paper we present the first study of Sentiment Analysis (SA) of Serbian novels from the 1840-1920 period. The preparation of the sentiment lexicon was based on three existing lexicons: NRC, AFINN and Bing, with additional extensive corrections. The first phase of dataset refinement included filtering out words that are not found in the Serbian morphological dictionary, and in the second, automatic POS tags and lemmas were manually corrected. The polarity lexicon was extracted and transformed into ontolex-lemon and published as initial ...
Ranka Stanković, Miloš Košprdić, Milica Ikonić Nešić, Tijana Radović. "Sentiment Analysis of Serbian Old Novels" in Proceedings of the 2nd Workshop on Sentiment Analysis and Linguistic Linked Data, June 2022, Marseille, France, European Language Resources Association (2022)