Search
88 items
-
Речник САНУ као база терминолошких речника (на примеру речника кулинарства)
... extraction of multiword lexical units that are allocated to the frequency of terms that are significantly more frequent in a culinary text than in the corpus of contemporary Serbian language. Using this approach, we were able to identify extremely rich term collection for culinary lexicon contained ... Рада Стијовић, Олга Сабо, Ранка Станковић. "Речник САНУ као база терминолошких речника (на примеру речника кулинарства)" in Словенска терминологија данас, Београд : Српска академија наука и уметности (2017)
-
Automatic construction of a morphological dictionary of multi-word units
The development of a comprehensive morphological dictionary of multi-word units for Serbian is a very demanding task, due to the complexity of Serbian morphology. Manual production of such a dictionary proved to be extremely time-consuming. In this paper we present a procedure that automatically produces dictionary lemmas for a given list of multi-word units. To accomplish this task the procedure relies on data in e-dictionaries of Serbian simple words, which are already well developed. We also offer an evaluation ... electronic dictionary, Serbian, morphology, inflection, multiword units, noun phrases, query expansion ... The calculation is performed on the basis of dictionaries described in [10] and [11] that are part of the standard distribution of Unitex [12], a corpus processing system based on the finite-state technology. Table 1. Initial content of the Serbian m ... Cvetana Krstev, Ranka Stanković, Ivan Obradović, Duško Vitas, Miloš Utvić. "Automatic construction of a morphological dictionary of multi-word units" in Lecture Notes in Computer Science 6233, Advances in Natural Language Processing, Proceedings of the 7th International Conference on NLP, IceTAL 2010, Reykjavik, Iceland, August 2010, Springer (2010): 226-237. https://doi.org/10.1007/978-3-642-14770-8_26
-
Wordnet Development Using a Multifunctional Tool
Ivan Obradović, Ranka Stanković (2007) In this paper we present a multifunctional tool for manipulating heterogeneous language resources. The tool handles electronic dictionaries, wordnets and aligned texts, and provides for their synchronous use in various tasks. We focus here on the description of the possibilities this tool offers in the development of wordnets. Besides the wordnet module which enables parallel handling of two wordnets, other modules, such as the module for morphological dictionaries and the module for aligned texts, as well as available finite ... 
... of aligned parallel texts Parallel texts, which usually originate from a text in one language and its translation in another, are often aligned at a certain level (paragraph, sentence, etc.) by matching the corresponding segments of the original and its translation. Aligned parallel texts are ...
... use aligned texts. If PWN is used for the source synset, then the language of one of the parallel texts must be English. Namely, WS4LR allows the user to search aligned texts using words from both parallel texts. All of the words found in both texts will be highlighted (in blue color) (Figure ...
... words properly fit into the synset. In that case the user might want to observe these words within a context, which can be done by searching a corpus for these words and obtaining concordances. By getting the occurrences of the words within the context, the user will be able to make a better ... Ivan Obradović, Ranka Stanković. "Wordnet Development Using a Multifunctional Tool" in Proceedings of the International Workshop Computer Aided Language Processing (CALP) '2007, Borovets, Bulgaria, September 2007, - (2007)
-
Named Entity Recognition for Distant Reading in ELTeC
Francesca Frontini, Carmen Brando, Joanna Byszuk, Ioana Galleron, Diana Santos, Ranka Stanković (2020) The COST Action "Distant Reading for European Literary History", which started in 2017, has among its main goals the creation of an open-source multilingual collection of European literary texts (ELTeC). In this paper we present the work done on manually annotating a selection of the ELTeC collection for named entities, as well as on evaluating existing named entity recognition tools with respect to their ability to produce such annotations automatically. The final paragraph discusses the points in common between this initiative and CLARIN. ... IN services and tools, with some ideas for possible collaboration. 2 Developing the NE layer of the ELTeC corpus 2.1 Desiderata and annotation set NER is a well known task in NLP, and there are several sets of guidelin ...
... Table 1: Data on the manually NE-annotated corpus. 2.2 Current state of the corpus The NE annotation of the corpus is part of the plan for the so called level 2 annotation, w ... Francesca Frontini, Carmen Brando, Joanna Byszuk, Ioana Galleron, Diana Santos, Ranka Stanković. "Named Entity Recognition for Distant Reading in ELTeC" in CLARIN Annual Conference 2020, Oct 2020, Virtual Event, France, CLARIN (2020)
-
Чији је пример? Анализа лексичких обележја на примерима Речника САНУ
In this paper we ask: can the author of a text be determined if only its lexical features are analyzed? To try to answer this question, we examined the examples within the dictionary entries of individual lexemes in Речник САНУ, recorded in five volumes (namely: I, II, XVIII, XIX and XX). Each example is taken from a source, which is indicated by the abbreviations given in parentheses. Out of more than 5,000 available sources, we opted for ... 
... 1959 and (expanded) 2017. Утвић 2014: Miloš Utvić, The construction of reference corpus of contemporary Serbian [Izgradnja referentnog korpusa savremenog srpskog jezika] (Doctoral dissertation, University of Belgrade). Фекете 1993: ... Бранислава Б. Шандрих, Ранка М. Станковић, Мирјана С. Гочанин. "Чији је пример? Анализа лексичких обележја на примерима Речника САНУ" in Српски језик и његови ресурси, Међународни славистички центар, Филолошки факултет, Универзитет у Београду (2019). https://doi.org/10.18485/msc.2019.48.3.ch13
-
Electronic Dictionaries - from File System to lemon Based Lexical Database
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ... 
... existing SMD to LexInfo, as a catalog of data categories (e.g., to denote gender, number, part of speech, etc.). 3 Unitex is a lexically-based corpus processing suite that offers strong support for finite-state processing using morphological dictionaries – http://unitexgramlab.org/ Figure 1: ...
... existing dictionaries into the lemon-based model is integrated in the existing tool for dictionary management LeXimir, in order to support parallel development for a certain period of time, and to enable smooth transition of development environment. The database contains all currently used markers ... Ranka Stanković, Cvetana Krstev, Biljana Lazić, Mihailo Škorić. "Electronic Dictionaries - from File System to lemon Based Lexical Database" in Proceedings of the 11th International Conference on Language Resources and Evaluation - W23 6th Workshop on Linked Data in Linguistics : Towards Linguistic Data Science (LDL-2018), LREC 2018, Miyazaki, Japan, May 7-12, 2018, European Language Resources Association (ELRA) (2018)
-
A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment
Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others (2020) Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at sense-level for various resources in 15 languages. Moreover, senses are annotated with possible semantic relationships such as broadness, narrowness, relatedness, and equivalence. In comparison to previous datasets for this task, this dataset covers a wide range of languages ... 
... (LTC 2011), pages 126–130. Henrich, V., Hinrichs, E. W., and Suttner, K. (2012). Automatically linking GermaNet to Wikipedia for harvesting corpus examples for Germanet senses. JLCL, 27(1):1–19. Henrich, V., Hinrichs, E., and Barkey, R. (2014). Aligning word senses in GermaNet and the DWDS ... Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others. "A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment" in Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), Marseille, European Language Resources Association (ELRA) (2020)
-
Developing Termbases for Expert Terminology under the TBX Standard
... tegration with cascades for named entity recognition such as mining equipment, specific minerals and the like. Building of an aligned Serbian-English corpus of texts in the area of mining and geology from sources like the bilingual journal “Underground Mining” are underway. The possibility of searching ... Ranka Stanković, Ivan Obradović, and Miloš Utvić. "Developing Termbases for Expert Terminology under the TBX Standard" in Natural Language Processing for Serbian - Resources and Applications, Belgrade : University of Belgrade, Faculty of Mathematics (2014)