Search ⚒ Papers ⚒ DR RGF - RGF Repository

Search

391 items

Part of Speech Tagging for Serbian language using Natural Language Toolkit

Ranka Stanković, Boro Milovanović (2020)

While complex algorithms for NLP (natural language processing) are being developed, basic tasks such as tagging remain very important and still challenging. NLTK (Natural Language Toolkit) is a powerful Python library for developing NLP-based programs. We try to use this library to create a PoS (part-of-speech) tagger for contemporary Serbian. Eleven different models were created using the NLTK tagging API. The best models are transformed with the Brill tagger to improve accuracy. We trained the models on the annotated ...

natural language processing, machine learning, neural networks

... Index Terms—Natural Language Processing; Machine Learning; Neural Network. I. INTRODUCTION In the last couple of years, a big advancement in the field of Natural Language Processing has occurred. There are state-of-the-art language models that perform exceptionally in various language tasks [1-3] ...
... Statistical Part-of-Speech Tagger,” Proc. Sixth Applied Natural Language Processing Conference, Seattle, Washington, USA, 2000 [22] E. Brill, “A simple rule-based part of speech tagger”, Proc. Third conference on Applied natural language processing (ANLC '92), Stroudsburg, Pennsylvania, USA, Mar. 1992 ...
... available at: www.dr.rgf.bg.ac.rs Abstract—While complex algorithms for NLP (Natural language processing) are being developed, base tasks such as tagging remain very important and still challenging. NLTK (Natural Language Toolkit) is a powerful Python library for developing programs based on NLP ...
Ranka Stanković, Boro Milovanović. "Part of Speech Tagging for Serbian language using Natural Language Toolkit" in 7th International Conference on Electrical, Electronic and Computing Engineering IcETRAN 2020, Academic Mind, Belgrade (2020)
An Approach to Efficient Processing of Multi-Word Units

Cvetana Krstev, Ivan Obradović, Ranka Stanković, Duško Vitas (2013)

Efficient processing of Multi-Word Units in the course of development of morphological MWU dictionaries is not easy to achieve, especially when languages with complex morphological structures are concerned, such as Serbian. Manual development of this type of dictionaries is a tedious and extremely slow process. To alleviate this problem we turned to our multipurpose software tool, dubbed LeXimir, in the production of lemmas for e-dictionaries of multi-word units. In addition to that, we developed a procedure aimed at making ...

Natural Language Processing, Grammatical Category, Lexical Representation, MWU, multi-word unit

... and Their Automatic Processing. Bulag — Bulletin de Linguistique Appliquée et Générale 32, 73–94 (2007) 19. Savary, A., Rabiega-Wisniewska, J., Wolinski, M.: Inflection of Polish Multi-Word Proper Names with Morfeusz and Multiflex. In: Aspects of Natural Language Processing, Lecture Notes in Computer ...
... Cvetana Krstev, Ivan Obradović, Ranka Stanković, and Duško Vitas 1 Introduction Morphological electronic dictionaries of Serbian for natural language processing (NLP) are being developed for many years now. Their development follows the methodology and format (known as DELAS/DELAF) presented for ...
... use of finite automata in the lexical representation of natural language. In: Electronic dictionaries and automata in computational linguistics, Lecture Notes in Computer Science, vol. 377, pp. 34–50. Springer (1989) 6. Krstev, C.: Processing of Serbian — Automata, Texts and Electronic Dictionaries ...
Cvetana Krstev, Ivan Obradović, Ranka Stanković, Duško Vitas. "An Approach to Efficient Processing of Multi-Word Units" in Computational Linguistics - Applications, Studies in Computational Intelligence 458 no. 458, Berlin Heidelberg : Springer-Verlag (2013): 109-129. https://doi.org/10.1007/978-3-642-34399-5_6
Terminological and lexical resources used to provide open multilingual educational resources

Biljana Lazić, Danica Seničić, Aleksandra Tomašević, Bojan Zlatić (2016)

Open educational resources (OER) within BAEKTEL (Blending Academic and Entrepreneurial Knowledge in Technology enhanced learning) network will be available in different languages, mostly in the languages of Western Balkans, Russian and English. University of Belgrade (UB) hosts a central repository based on: BAEKTEL Metadata Portal (BMP), terminological web application for management, browse and search of terminological resources, web services for linguistic support (query expansion, information retrieval, OER indexing, etc.), annotation of selected resources and OER repository on local edX ...

open educational resources, lexical resources, natural language processing, terminology

... resources, Natural Language Processing, Terminology 1. INTRODUCTION Natural Language Processing (NLP) has a two-faceted approach to education where one involves e-learning and computer-assisted learning and instruction and the other consists of NLP tools for analysis and use of language by machines ...
... Greenhow, J. Sonnevend and C. Agur, Ed. Cambridge, MA: MIT Press, 2016, pp. 22. [3] D. Litman, “Natural language processing for enhancing teaching and learning,” in Proc. Natural language processing for enhancing teaching and learning, 2016, pp. 4170–4176. [4] T. M. Cabré Castellví, Terminology: ...
... standardisation so more accurate translations are produced. To summarize above mentioned, terminology now constitutes a very important field of Natural Language Processing while the work that has been done in the field of terminology has become an indispensable, widely used resource. The standards ...
Biljana Lazić, Danica Seničić, Aleksandra Tomašević, Bojan Zlatić. "Terminological and lexical resources used to provide open multilingual educational resources" in The Seventh International Conference on eLearning (eLearning-2016), 29-30 September 2016, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2016)
Creation of a Training Dataset for Question-Answering Models in Serbian

Ranka Stanković, Jovana Rađenović, Maja Ristić, Dragan Stankov (2024)

The development and application of artificial intelligence in language technologies have advanced significantly in recent years, especially in the domain of the question answering (QA) task. While existing resources for QA tasks have been developed for the major world languages, Serbian has been relatively neglected in this area. This paper presents an initiative to create a large and diverse dataset for training question-answering models in Serbian, which will contribute to the advancement of language technologies for Serbian. In addition to numerous studies on language models ...

artificial intelligence, natural language processing, language resources, annotated datasets, information extraction, question answering

Ranka Stanković, Jovana Rađenović, Maja Ristić, Dragan Stankov. "Creation of a Training Dataset for Question-Answering Models in Serbian" in South Slavic Languages in the Digital Environment JuDig Book of Abstracts, University of Belgrade - Faculty of Philology, Serbia, November 21-23, 2024, University of Belgrade - Faculty of Philology (2024)
Parallel Bidirectionally Pretrained Taggers as Feature Generators

Ranka Stanković, Mihailo Škorić, Branislava Šandrih Todorović (2022)

In a setting where multiple automatic annotation approaches coexist and advance separately but none completely solve a specific problem, the key might be in their combination and integration. This paper outlines a scalable architecture for Part-of-Speech tagging using multiple standalone annotation systems as feature generators for a stacked classifier. It also explores automatic resource expansion via dataset augmentation and bidirectional training in order to increase the number of taggers and to maximize the impact of the composite system, which ...

annotation, natural language processing, feature extraction, composite structures, part of speech

Ranka Stanković, Mihailo Škorić, Branislava Šandrih Todorović. "Parallel Bidirectionally Pretrained Taggers as Feature Generators" in Applied Sciences, MDPI AG (2022). https://doi.org/10.3390/app12105028
FrameNet Lexical Database: Presenting a Few Frames Within the Risk Domain

Aleksandra Marković, Ranka Stanković, Natalija Tomić, Olivera Kitanović (2021)

The paper gives a brief overview of the theory of frame semantics, on which the FrameNet lexical database is based. The concept of this network is presented, as well as the possibilities of its application. The lexical analysis applied in the FrameNet project is also presented, and the differences between frame-based analysis and word-based analysis are pointed out. Several related frames evoked by words from the risk domain are then shown. The paper also presents the NLTK platform, which can be used to ...

Serbian language, frame semantics, FrameNet, risk scenario, mining corpus, natural language processing

... included. KEYWORDS: Serbian language, frame semantics, FrameNet, risk scenario, mining corpus, natural language processing. PAPER SUBMITTED: 15 July 2021 PAPER ACCEPTED: 6 September 2021 Aleksandra Marković aleksandra.markovic@isj.sanu.ac.rs Institute for Serbian Language, SASA Belgrade, Serbia ...
... Scientific paper 3 NLTK FrameNet Wrappers NLTK (Natural Language Toolkit) is an easy-to-use natural language processing Python suite that accesses continually increasing number of corpora and lexical resources. NLTK offers different types of text processing, amongst which are: classification, tokenization ...
... in lexicography, it is important to list the most frequent collocates of a LU; collocations are crucial not only in language learning, but also in different natural language processing tasks). Using the word sketch and the collocation risk of (ризик од) as a starting point, a detailed view of the co ...
Aleksandra Marković, Ranka Stanković, Natalija Tomić, Olivera Kitanović. "FrameNet Lexical Database: Presenting a Few Frames Within the Risk Domain" in Infotheca, Faculty of Philology, University of Belgrade (2021). https://doi.org/10.18485/infotheca.2021.21.1.1
Knowledge Graphs in the Era of Large Language Models: Opportunities and Challenges

Danka Jokić, Ranka Stanković, Jelena Jaćimović (2024)

The emergence of large language models (LLMs) has significantly influenced the field of artificial intelligence, particularly in natural language processing and text generation. However, a key limitation of these models lies in their lack of structured knowledge and reasoning ability, which hinders their application in the real world, where factual accuracy and context-based reasoning are required. Knowledge graphs, on the other hand, offer an appealing solution. They provide a rich source of structured knowledge by representing entities and their relations in ...

knowledge graphs, large language models, natural language processing, structured knowledge, data quality, explainable artificial intelligence, online content safety

Danka Jokić, Ranka Stanković, Jelena Jaćimović. "Knowledge Graphs in the Era of Large Language Models: Opportunities and Challenges" in South Slavic Languages in the Digital Environment JuDig Book of Abstracts, University of Belgrade - Faculty of Philology, Serbia, November 21-23, 2024, University of Belgrade - Faculty of Philology (2024)
Development of Open Educational Resources (OER) for Natural Language Processing

Cvetana Krstev, Biljana Lazić, Ranka Stanković, Giovanni Schiuma, Miladin Kotorčević (2015)

In this paper we present the development of an online course at the edX BAEKTEL platform named “Lexical Recognition in the Natural Language Processing (NLP)”. It is based on the course of the same name for PhD studies at the University of Belgrade, Faculty of Philology. There are not many courses in Computational Linguistics (CL) on OER platforms, and there is none in Serbian either for CL or NLP. We have developed this course in order to improve this ...

E-Learning, Open Educational Resources, Computational Linguistics, Lexical Resources, edX

... l protection, geology and natural language processing, the last being in the focus of this paper. Why Study Natural Language Processing (NLP) and Computational Linguistics (CL)? Natural language processing is the technology for dealing with human language, as it appears in everyday spoken ...
... LINGUISTICS AND NATURAL LANGUAGE PROCESSING Computational linguistics (CL) is a theoretical discipline between linguistics and computer science concerned with understanding and modelling the written and spoken language from a computational aspect.[3]Natural Language Processing (NLP) develops ...
Cvetana Krstev, Biljana Lazić, Ranka Stanković, Giovanni Schiuma, Miladin Kotorčević. "Development of Open Educational Resources (OER) for Natural Language Processing" in The Sixth International Conference on e-Learning (eLearning-2015), September 2015, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2015)
Extraction of Bilingual Terminology Using Graphs, Dictionaries and GIZA++

Branislava Šandrih, Ranka Stanković (2020)

In science, industry, and many research fields, terminology is developing rapidly. Most often, the language that serves as the lingua franca for most of these fields is English. As a consequence, for many fields domain terms are conceived in English and later translated into other languages. In this paper we present an approach to the automatic extraction of bilingual terminology for the English-Serbian language pair that relies on an aligned bilingual domain corpus, a terminology extractor for the target language, and a tool for chunk alignment. We examine the performance of the method on the domain ...

terminology extraction, terminology validation, GIZA++, graphs, Unitex, text classification

... Improve Machine Translation in a Computer Aided Translation Environment”. Natural Language Engineering Vol. 23, no. 5 (2017): 763–788 Baldwin, Timothy and Su Nam Kim. “Multiword Expressions”. Handbook of Natural Language Processing Vol. 2 (2010): 267–292 Bouamor, Dhouha, Nasredine Semmar and Pierre Z ...
... Translations”. Natural Language Engineering Vol. 23, no. 1 (2017): 31–51 Hamon, T. and N. Grabar. “Adaptation of Cross-lingual Transfer Methods for the Building of Medical Terminology in Ukrainian”. In Proceedings of the 17th International Conference on Intelligent Text Processing and Computational ...
... “A Hybrid Approach to Compiling Bilingual Dictionaries of Medical Terms from Parallel Corpora”. Statistical Language and Speech Processing Vol. 8791 (2014): 57–69 Krstev, Cvetana. Processing of Serbian. Automata, Texts and Electronic Dictionaries. Faculty of Philology of the University of Belgrade, ...
Branislava Šandrih, Ranka Stanković. "Extraction of Bilingual Terminology Using Graphs, Dictionaries and GIZA++" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.6
A Mathematical Learning Environment Based on Serbian Language Resources

Radojičić Marija, Obradović Ivan, Stanković Ranka, Utvić Miloš, Kaplar Sebastijan (2018)

In recent years, in line with ever growing usage of Information technology, the learning environments are changing. The amount of available learning materials in various forms has increased. These new environments demand comprehensive learning systems, which enable management of the learning corpus with special attention paid to relevant lexical resources. In this paper we present the concept of a Mathematical Learning Environment in Serbian (MLES), which is based on a corpus of mathematical materials and various lexical resources, enabling ...

mathematical content, text processing, mathematical formulae

... 380–409. [16] Stanković, R., Obradović, I., Utvić, M., (2014). Developing Termbases for Expert Terminology under the TBX Standard. Natural Language Processing for Serbian - Resources and Applications, University of Belgrade, Faculty of Mathematics pp. 12-26. ...
... as several resources simultaneously [10]. Although the resources and tools have already been successfully used for a number of various language processing related tasks including query expansion, they need further improvement for management, named entity recognition, terminology extraction ...
... multilingual digital libraries of e-journals. Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC), pp. 1710- 1717. [11] Krstev, C., (2008). Processing of Serbian. Automata, Texts and Electronic Dictionaries Search Engine. Faculty of Philology of the ...
Radojičić Marija, Obradović Ivan, Stanković Ranka, Utvić Miloš, Kaplar Sebastijan. "A Mathematical Learning Environment Based on Serbian Language Resources" in Proceedings of the 7th International Scientific Conference Technics and Informatics in Education, Faculty of Technical Sciences, Čačak (2018)
Transformer-Based Composite Language Models for Text Evaluation and Classification

Mihailo Škorić, Miloš Utvić, Ranka Stanković (2023)

Parallel natural language processing systems were previously successfully tested on the tasks of part-of-speech tagging and authorship attribution through mini-language modeling, for which they achieved significantly better results than independent methods in the cases of seven European languages. The aim of this paper is to present the advantages of using composite language models in the processing and evaluation of texts written in arbitrary highly inflective and morphology-rich natural language, particularly Serbian. A perplexity-based dataset, the main asset for the ...

General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)

Mihailo Škorić, Miloš Utvić, Ranka Stanković. "Transformer-Based Composite Language Models for Text Evaluation and Classification" in Mathematics, MDPI AG (2023). https://doi.org/10.3390/math11224660
E-Connecting Balkan Languages

Cvetana Krstev, Ranka Stanković, Duško Vitas, Svetla Koeva (2009)

In this paper we present a versatile language processing tool that can be successfully used for many Balkan languages. This tool relies for its work on several sophisticated textual and lexical resources that were developed for most of Balkan languages. These resources are based on several de facto standards in natural language processing.

Query expansion, e-dictionary, wordnet, proper name, aligned text

... versatile language processing tool that can be successfully used for many Balkan languages. This tool relies for its work on several sophisticated textual and lexical resources that were developed for most of Balkan languages. These resources are based on several de facto standards in natural language ...
... independent both from Serbian, for which they were initially developed, and from English which seems to be in the background of many natural language processing tools. The main presupposition for the usage of these tools for other languages is the existence of textual and lexical resources developed ...
... 2.4 Prolex Database The Prolex project was initiated in 1990s with the study of toponyms in French with aim of appropriately processing proper names in natural language applications [16]. This work has been pursued by development of a Serbian version, which finally led to the design and construction ...
Cvetana Krstev, Ranka Stanković, Duško Vitas, Svetla Koeva. "E-Connecting Balkan Languages" in Proceedings of the Workshop on Multilingual resources, technologies and evaluation for Central and Eastern European Languages, 17 September 2009, eds. C. Vertan, S. Piperidis, E. Paskaleva and Milena Slavcheva, Borovets, Bulgaria : Association for Computational Linguistics Stroudsburg, PA, USA (2009)
Bilingual lexical extraction based on word alignment for improving corpus search

Jelena Andonovski, Branislava Šandrih, Olivera Kitanović (2019)

Library and Information Sciences, Computer Science Applications

Jelena Andonovski, Branislava Šandrih, Olivera Kitanović. "Bilingual lexical extraction based on word alignment for improving corpus search" in The Electronic Library, Emerald (2019). https://doi.org/10.1108/EL-03-2019-0056
An Italian-Serbian Sentence Aligned Parallel Literary Corpus

Saša Moderc, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić (2023)

This article presents the construction and relevance of an Italian-Serbian sentence-aligned parallel corpus, delving into the aligned sentences in order to facilitate effective translation between the two languages. The parallel corpus serves as a valuable resource for language experts, researchers, and language enthusiasts, fostering a deeper understanding of linguistic nuances and cultural expressions. By bridging the gap between Serbian and Italian, this corpus opens new avenues for cross-cultural communication and collaboration, and ultimately contributes to the improvement of language-related ...

Aligned corpus, parallel corpus, Serbian, Italian, literature

Saša Moderc, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić. "An Italian-Serbian Sentence Aligned Parallel Literary Corpus" in Review of the National Center for Digitization, Belgrade : Faculty of Mathematics, University of Belgrade (2023). https://doi.org/10.5281/zenodo.11203388
Managing mining project documentation using human language technology

Aleksandra Tomašević, Ranka Stanković, Miloš Utvić, Ivan Obradović, Božo Kolonja (2018)

Purpose: This paper aims to develop a system, which would enable efficient management and exploitation of documentation in electronic form, related to mining projects, with information retrieval and information extraction (IE) features, using various language resources and natural language processing. Design/methodology/approach: The system is designed to integrate textual, lexical, semantic and terminological resources, enabling advanced document search and extraction of information. These resources are integrated with a set of Web services and applications, for different user profiles and use-cases. Findings: The ...

Digital libraries, Information retrieval, Data mining, Human language technologies, Project documentation

Aleksandra Tomašević, Ranka Stanković, Miloš Utvić, Ivan Obradović, Božo Kolonja. "Managing mining project documentation using human language technology" in The Electronic Library (2018). https://doi.org/10.1108/EL-11-2017-0239
Digital Library From A Domain Of Criminalistics As A Foundation For A Forensic Text Analysis

Dalibor Vorkapić, Aleksandra Tomašević, Miljana Mladenović, Ranka Stanković, Nikola Vulović (2017)

This paper presents a model that enables the collection, preparation, metadata description, management, and exploitation, including full-text search, of documents from the domain of criminalistics written in Serbian. The proposed approach is applied in a web portal that collects various texts from the journal of the Academy of Criminalistic and Police Studies, the Criminal Code of Serbia, the "Tara" and "Reiss" conferences, as well as some doctoral dissertations related to this research field. After text processing, a corpus containing over 5,500 pages of plain text was created and ...

Omeka, Wordnet, full-text search, morphological and semantic text search, query expansion

... LINGUISTICS The linguistic study of forensic texts is a part of the field of Natural Language Processing, which includes text types classification and syntax and semantic analysis of texts written in a natural language. Various texts are subject of the study: Acts of Parliament (or other law-making ...
... Krstev, I. Obradović & D. Vitas Natural Language Processing for Serbian – Resources and Application, 1-11. Matematički fakultet, Beograd. 21 Mladenović, M., Mitrović, J., Krstev, C., & Vitas, D. (2015). Hybrid Sentiment Analysis Framework For A Morphologically Rich Language. Journal of Intelligent Information ...
... not in Serbian language was removed, as well as tables, figures, references and links, as usual preparation for corpus processing. After this preparation, the text collection contained 5,500 pages of plain text, in A4 format, which was used for further text analysis and processing. For digital objects ...
Dalibor Vorkapić, Aleksandra Tomašević, Miljana Mladenović, Ranka Stanković, Nikola Vulović. "Digital Library From A Domain Of Criminalistics As A Foundation For A Forensic Text Analysis" in International Scientific Conference “Archibald Reiss Days” Thematic Conference Proceedings Of International Significance, Belgrade, 7-9 November 2017, Academy Of Criminalistic And Police Studies Belgrade (2017)
Towards Automatic Definition Extraction for Serbian

Ranka Stanković, Cvetana Krstev, Rada Stijović, Mirjana Gočanin, Mihailo Škorić (2021)

The paper presents preliminary results of the automatic extraction of candidates for dictionary definitions from unstructured texts in Serbian, with the aim of accelerating dictionary development. Definitions in the dictionary of the Serbian Academy of Sciences and Arts (SASA) were used to model different types of definitions (descriptive, grammatical, referential, and synonymous), which have different syntactic and lexical characteristics. The research corpus consists of 61,213 noun definitions, which were analyzed using morphological e-dictionaries and local grammars implemented as finite-state transducers in the corpus processing suite with open ...

... Conference on Empirical Methods in Natural Language Processing, pp. 780-790. Tissier, J., Gravier, C., & Habrard, A. (2017). Dict2vec: Learning Word Embeddings using Lexical Dictionaries. In Proceeding of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2017), Sep 2017, Copenhague ...
... definitions into consistent word embeddings. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 1522-1532. Barnbrook, G. (2002). Defining Language, A local grammar of definition sentences, Studies in Corpus Linguistics, (Vol. 11). John Benjamins Publishing ...
... In: 1st Workshop on Recent Advances in Slavonic Natural Language Processing, 2007, pp. 65–70. SASA Dictionary: Речник српскохрватског књижевног и народног језика САНУ, I–XXI [The Dictionary of the Serbo-Croatian Standard and Vernacular Language] (1959–2020). Београд: Институт за српски језик САНУ ...
Ranka Stanković, Cvetana Krstev, Rada Stijović, Mirjana Gočanin, Mihailo Škorić. "Towards Automatic Definition Extraction for Serbian" in Proceedings of the XIX EURALEX Congress of the European Association for Lexicography: Lexicography for Inclusion (Volume 2). 7-9 September (virtual), Democritus University of Thrace (2021)
A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian

Danka Jokić, Ranka Stanković, Cvetana Krstev, Branislava Šandrih (2021)

Abusive speech on social media, including profanity, derogatory speech, and hate speech, has reached pandemic levels. A system able to detect such texts could help make the internet and social media a better, more respectful virtual space. Research and commercial applications in this area have so far focused mainly on English. This paper presents work on building AbCoSER, the first corpus of abusive speech in Serbian. The corpus consists of 6,436 manually annotated ...

abusive language, hate speech, Serbian, Twitter, lexicon, corpus

... arXiv:1709.10159. 38 Anna Schmidt and Michael Wiegand. A survey on hate speech detection using natural language processing. In Proceedings of the fifth international workshop on natural language processing for social media, pages 1–10, 2017. 39 Alessandro Seganti, Helena Sobol, Iryna Orlova, Hannam Kim ...
... with abusive triggers extracted from the AbCoSER dataset. 2012 ACM Subject Classification Computing methodologies → Natural language processing Keywords and phrases abusive language, hate speech, Serbian, Twitter, lexicon, corpus Digital Object Identifier 10.4230/OASIcs.LDK.2021.13 Funding Linked ...
... on a Common Natural Language Processing Paradigm for Balkan Languages, pages 15–22, 2007. LDK 2021 https://www.aclweb.org/anthology/2020.lrec-1.401.pdf https://www.aclweb.org/anthology/2020.globalex-1.1.pdf 13:16 Building Language Resources for ...
Danka Jokić, Ranka Stanković, Cvetana Krstev, Branislava Šandrih. "A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian" in 3rd Conference on Language, Data and Knowledge (LDK 2021), MDPI AG (2021). https://doi.org/10.4230/OASIcs.LDK.2021.13
Two approaches to compilation of bilingual multi-word terminology lists from lexical resources

Branislava Šandrih, Cvetana Krstev, Ranka Stanković (2020)

In this paper, we present two approaches and the implemented system for bilingual terminology extraction that rely on an aligned bilingual domain corpus, a terminology extractor for a target language, and a tool for chunk alignment. The two approaches differ in the way terminology for the source language is obtained: the first relies on an existing domain terminology lexicon, while the second one uses a term extraction tool. For both approaches, four experiments were performed with two parameters being ...

Linguistics and Language, Software, Artificial Intelligence, Language and Linguistics

Branislava Šandrih, Cvetana Krstev, Ranka Stanković. "Two approaches to compilation of bilingual multi-word terminology lists from lexical resources" in Natural Language Engineering, Cambridge University Press (CUP) (2020). https://doi.org/10.1017/S1351324919000615
The Nooj System as Module within an Integrated Language Processing Environment

Ranka Stanković, Duško Vitas, Cvetana Krstev (2008)

NooJ, electronic dictionary, lexical resources

... as the employees' publications. - The Repository is available at: www.dr.rgf.bg.ac.rs The NooJ system as module within an integrated language processing environment Ranka Stanković, ranka@rgf.bg.ac.yu Duško Vitas, vitas@matf.bg.ac.yu Cvetana Krstev, cvetena@matf.bg.ac.yu 1. Introduction ...
... http://www.lisa.org/tmx/ Vitas D., Krstev C., Obradović I., Popović Lj., Pavlović-Lažetić G.: Processing Serbian Written Texts: An Overview of Resources and Basic Tools., Workshop on Balkan Language Resources and Tools, Thessaloniki, Greece, eds, S. Piperidis and V. Karkaletsis, pp. 97-104, ...
Ranka Stanković, Duško Vitas, Cvetana Krstev. "The Nooj System as Module within an Integrated Language Processing Environment" in Proceedings of the 2007 International Nooj Conference, Cambridge Scholars Publishing (2008)
