Search
753 items
-
Knowledge and Rule-Based Diacritic Restoration in Serbian
In this paper we present a procedure for the restoration of diacritics in Serbian texts written using the degraded Latin alphabet. The procedure relies on the comprehensive lexical resources for Serbian: the morphological electronic dictionaries, the Corpus of Contemporary Serbian and local grammars. Dictionaries are used to identify possible candidates for the restoration, while the data obtained from SrpKor and local grammars assists in making a decision between several candidates in cases of ambiguity. The evaluation results reveal that, depending on the text, accuracy ranges from 95.03% to 99.36%, while the precision (average 98.93%) is always higher than the recall (average 94.94%).... that is precisely formulated terms referring to implied concepts. If an unambiguous and clear name in form of an existing word or a phrase cannot be found, then an ambiguous word can be used for naming and supplied with a “relator” (a brief note in parentheses). The RuThes concepts are not divided ...
... studies, 43(5-6):907–928. Guarino, N. and Welty, C. A. (2009). An overview of ontoclean. In Handbook on ontologies, pages 201–220. Springer. Guarino, N., Oberle, D., and Staab, S. (2009). What is an ontology? In Handbook on ontologies, pages 1–17. Springer. Guarino, N. (1998). Some ontological principles ...
... Processing for Text Analytics The main stages of thesaurus-based document processing include: • Tokenization and lemmatization, that is, the transfer of word forms to dictionary forms (lemmas); • Matching with the thesaurus based on the lemma representation of the document. Multiword terms from a thesaurus ...Cvetana Krstev, Ranka Stanković, Duško Vitas. "Knowledge and Rule-Based Diacritic Restoration in Serbian" in Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27-29, 2018, Sofia, Bulgaria, Sofia : The Institute for Bulgarian Language Prof. Lyubomir Andreychin, Bulgarian Academy of Sciences (2018): 41-51
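The dictionary-based candidate step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the `VARIANTS` table, the `candidates` function and the toy `LEXICON` are hypothetical, only lowercase single letters are handled (the digraph "dj" for "đ" is left out), and the disambiguation step that the paper bases on SrpKor and local grammars is not modelled.

```python
from itertools import product

# Hypothetical mapping from degraded Latin letters to the letters they may stand for.
VARIANTS = {"c": ("c", "č", "ć"), "s": ("s", "š"), "z": ("z", "ž")}

# Toy lexicon standing in for a morphological e-dictionary (assumption).
LEXICON = {"kuca", "kuća", "šuma", "reč", "reci"}


def candidates(word, lexicon):
    """Return, in sorted order, every lexicon word the degraded `word` could represent."""
    # For each character, list the letters it may have been before degradation.
    options = [VARIANTS.get(ch, (ch,)) for ch in word]
    # Build all combinations and keep only those attested in the lexicon.
    restored = {"".join(combo) for combo in product(*options)}
    return sorted(restored & lexicon)


print(candidates("kuca", LEXICON))  # two candidates remain, so disambiguation is needed
print(candidates("suma", LEXICON))  # a single candidate, so restoration is unambiguous
```

When more than one candidate survives, as for "kuca", a further decision step is required; the abstract attributes that step to corpus data and local grammars.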
-
Towards translation of educational resources using GIZA++
... sentence pairs, created during the training process. The frequencies of n-grams in a source language text that co-occur with n-grams in a parallel target language text represent the probability that those source-target paired n-grams will occur again in other texts similar to the parallel corpus. This ...
... parameters can be produced in output: lexical weighting (direct and indirect), word penalty, phrase penalty. Lexical weighting features estimate the probability of a phrase pair or translation rule word-by-word. The word penalty ensures that the translations do not get too long or too short. The phrase ...
... where V is the set of word forms i of a target language for which C(i|y) > 0, C(x) is the frequency of occurrences of a word x in the target language, while C(x|y) represents the frequency of a word x from the target language occurring in the same segment with the chosen word y from the source language ...Ivan Obradović, Dalibor Vorkapić, Ranka Stanković, Nikola Vulović, Miladin Kotorčević. "Towards translation of educational resources using GIZA++" in The Seventh International Conference on e-Learning (eLearning-2016), September 2016, Belgrade : Metropolitan University (2016)
-
Using English Baits to Catch Serbian Multi-Word Terminology
In this paper we present the first results in bilingual terminology extraction. The hypothesis of our approach is that if for a source language domain terminology exists as well as a domain aligned corpus for a source and a target language, then it is possible to extract the terminology for a target language. Our approach relies on several resources and tools: aligned domain texts, domain terminology for a source language, a terminology extractor for a target language, and a ...aligned texts, word alignment, terminology extraction, electronic dictionaries, morphological inflection... extracted forms and (word by word) lemmas.[8] The total number of [6] One class can group MWTs with various syntactic structures (recognized by different graphs, or finite-state automata); all MWTs in one class have the same number and characteristics of components that inflect. [7] A – adjective, N – noun, g – the ...
... http://www.meta-net.eu/whitepapers. Baldwin, T. and Kim, S. N. (2010). Multiword expressions. Handbook of natural language processing, 2:267–292. Bouamor, D., Semmar, N., and Zweigenbaum, P. (2012). Identifying bilingual multi-word expressions for statistical machine translation. In Nicoletta ...
... Stanković, R., Krstev, C., Vitas, D., Vulović, N., and Kitanović, O. (2017). Keyword-Based Search on Bilingual Digital Libraries, pages 112–123. Springer International Publishing, Cham. Tsvetkov, Y. and Wintner, S. (2010). Extraction of multi-word expressions from small parallel corpora. In Pro- ...Cvetana Krstev, Branislava Šandrih, Ranka Stanković. "Using English Baits to Catch Serbian Multi-Word Terminology" in Proceedings of the 11th International Conference on Language Resources and Evaluation, LREC 2018, Miyazaki, Japan, May 7-12, 2018, European Language Resources Association (ELRA) (2018)
-
An Approach to Efficient Processing of Multi-Word Units
Efficient processing of Multi-Word Units in the course of development of morphological MWU dictionaries is not easy to achieve, especially when languages with complex morphological structures are concerned, such as Serbian. Manual development of this type of dictionaries is a tedious and extremely slow process. To alleviate this problem we turned to our multipurpose software tool, dubbed LeXimir, in the production of lemmas for e-dictionaries of multi-word units. In addition to that, we developed a procedure aimed at making ...... CFLX="NC_2XN" CflxGroup="NC_2XN"><Word ID="1" POS="MOT" Flex="false" Sep="!-"/> <Word ID="2" POS="N" Flex="true" Case="1" Num="s"/> ...<Word ID="1" POS="N" Case="˜1"/> <Word ID="2"/> ... In some rules ...
... <Word ID="1" POS="!SDIC" Flex="true" Cond="$PRE" setPOS="A" setFlexCode="A2"/> <Word ID="2" POS="N" Flex="true" Case="1" Num="p"/><Word ID="1" Sufix="ska,ška,čka" setLemma="[B]i" setGramCats="np1gae" /> <Word ID="2" Gen="n" /> ...
... CflxGroup="NC_AXN"><Word ID="1" POS="A" Flex="true" Case="1" Anim="$a" Gen="$g"/> <Word ID="2" POS="N" Flex="true" Case="1" Anim="=$a" Gen="=$g"/> <Word ID="1" Num="s" Cond="$PRE"/> <Word ID="2" Num="s"/> ...Cvetana Krstev, Ivan Obradović, Ranka Stanković, Duško Vitas. "An Approach to Efficient Processing of Multi-Word Units" in Computational Linguistics - Applications, Studies in Computational Intelligence 458 no. 458, Berlin Heidelberg : Springer-Verlag (2013): 109-129. https://doi.org/10.1007/978-3-642-34399-5_6
-
Geochemical characterization of sediments from the archaeological site Vinča – Belo Brdo, Serbia
Gorica Veselinović, Dragana Životić, Kristina Penezić, Milica Kašanin-Grubin, Nevenka Mijatović, Jovana Malbašić, Aleksandra Šajnović (2020)Gorica Veselinović, Dragana Životić, Kristina Penezić, Milica Kašanin-Grubin, Nevenka Mijatović, Jovana Malbašić, Aleksandra Šajnović. "Geochemical characterization of sediments from the archaeological site Vinča – Belo Brdo, Serbia" in CATENA, Elsevier BV (2020). https://doi.org/10.1016/j.catena.2020.104914
-
Production of morphological dictionaries of multi-word units using a multipurpose tool
The development of a comprehensive morphological dictionary of multi-word units for Serbian is a very demanding task, due to the complexity of Serbian morphology. Manual production of such a dictionary proved to be extremely time-consuming. In this paper we present a procedure that automatically produces dictionary lemmas for a given list of multi-word units. To accomplish this task the procedure relies on data in e-dictionaries of Serbian simple words, which are already well developed. We also offer an evaluation ...electronic dictionary, Serbian, morphology, inflection, multi-word units, noun phrases, query expansion... AXN"><Word ID="1" POS="A" Flex="true" Case="1" Anim="$a" Gen="$g"/> <Word ID="2" POS="N" Flex="true" Case="1" Anim="=$a" Gen="=$g"/> <Word ID="1" Num="s" Cond ...
... "AC_A3XN"><Word ID="1" POS="A" Flex="true" Case="1" Num="s" Gen="m"/> <Word ID="2" POS="MOT" Flex="false" Cond="=,kao"/> <Word ID="3" POS="N,A" Flex="true" Case="1" Num="s" Anim="v"/> < ...
... D. Vitas, and M. Utvić, “Automatic Construction of a Morphological Dictionary of Multi-Word Units,” in IceTAL. Reykjavik, Iceland: Springer, August 2010, pp. 226–237. [13] I. Alegria, O. Ansa, X. Artola, N. Ezeiza, K. Nojenola, and R. Urizar, “Representation and Treatment of Multiword Expressions ...Ranka Stanković, Ivan Obradović, Cvetana Krstev, Duško Vitas. "Production of morphological dictionaries of multi-word units using a multipurpose tool" in Proceedings of the Computational Linguistics-Applications Conference, October 2011, Jachranka, Poland, Jachranka, Poland : PTI - Polish Information Processing Society (2011)
-
The use of biological markers in determination of origin and type of organic matter in the Tisza river sediments
Snežana Štrbac, Gordana Gajica, Aleksandra Šajnović, Nebojša Vasić, Ksenija Stojanović, Branimir Jovančićević (2013)The objective of the study was to determine the origin and type of organic matter (OM) of the Tisza recent sediments along the distance of 153 km through the territory of Serbia. For this purpose group organic-geochemical parameters and biomarker compositions were used. All samples contain approximately the same amount of OM, which was deposited under uniform, slightly reducing conditions. Based on the distribution of n-alkanes, the origin and type of OM could not be precisely estimated. However, n-alkane patterns ...... n-C20) + Σodd(n-C17–n-C21) / Σeven(n-C18–n-C22)]; (c) CPI determined for the distribution of n-alkanes C23–C35 (mass chromatogram m/z 71), CPI (C23–C35) = 1/2 [Σodd(n-C23–n-C35) / Σeven(n-C22–n-C34) + Σodd(n-C23–n-C35) / Σeven(n-C24–n-C36)]; (d) Pr/Ph = pristane/phytane. Tricyclic and pentacyclic terpanes. The ...
... n of n-alkanes C16–C35 (mass chromatogram m/z 71), CPI (C16–C35) = 1/2 [Σodd(n-C17–n-C35) / Σeven(n-C16–n-C34) + Σodd(n-C17–n-C35) / Σeven(n-C18–n-C36)]; (b) CPI determined for the distribution of n-alkanes C16–C22 (mass chromatogram m/z 71), CPI (C16–C22) = 1/2 [Σodd(n-C17–n-C21) / Σeven(n-C16– ...
... such as gasoline and diesel, have different n-alkane distributions, which range from n-C6 to n-C12, and n-C12 to n-C25, respectively. Moreover, these derivatives do not contain polycyclic biomarkers of the sterane and terpane types [7,8]. Similarly to n-alkanes, the distributions of polycyclic alkanes ...Snežana Štrbac, Gordana Gajica, Aleksandra Šajnović, Nebojša Vasić, Ksenija Stojanović, Branimir Jovančićević. "The use of biological markers in determination of origin and type of organic matter in the Tisza river sediments" in Journal of Serbian Chemical Society, Beograd : Srpsko hemijsko društvo (2013). https://doi.org/10.2298/JSC130614087S
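The CPI expressions quoted in the snippets above all share one pattern: the sum over odd n-alkanes is divided once by the even homologues one carbon lower and once by those one carbon higher, and the two ratios are averaged. A minimal sketch of that arithmetic follows; the function name `cpi`, its parameters and the abundance values are hypothetical illustrations, not data or code from the paper.

```python
def cpi(abundance, first_odd, last_odd):
    """Carbon preference index over odd n-alkanes first_odd..last_odd.

    `abundance` maps carbon number to peak abundance (for example, from the
    m/z 71 mass chromatogram); missing homologues count as zero.
    """
    odd = sum(abundance.get(n, 0.0) for n in range(first_odd, last_odd + 1, 2))
    # Even homologues shifted one carbon down, e.g. C22-C34 for odd C23-C35.
    even_low = sum(abundance.get(n, 0.0) for n in range(first_odd - 1, last_odd, 2))
    # Even homologues shifted one carbon up, e.g. C24-C36 for odd C23-C35.
    even_high = sum(abundance.get(n, 0.0) for n in range(first_odd + 1, last_odd + 2, 2))
    return 0.5 * (odd / even_low + odd / even_high)


# Hypothetical abundances: every odd homologue twice as abundant as every even one.
sample = {n: (2.0 if n % 2 else 1.0) for n in range(22, 37)}
print(cpi(sample, 23, 35))  # CPI (C23-C35) for the made-up sample
```

Under this reading, CPI (C16–C35) corresponds to `cpi(data, 17, 35)` and CPI (C16–C22) to `cpi(data, 17, 21)`. The sketch does not guard against an empty even-numbered range, which would raise a division-by-zero error.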
-
Multi-word Expressions for Abusive Speech Detection in Serbian
This paper presents research on refining and improving the Serbian version of the Hurtlex dictionary, a multilingual lexicon of offensive words. We pay special attention to adding multi-word expressions (polylexemic units) that can be considered offensive, because such lexical entries are very important for achieving good results in a range of abusive language detection tasks. Serbian morphological dictionaries are used as the basis for data cleaning and dictionary creation. The connection with other lexical and semantic resources for Serbian is highlighted, and the construction of a system for ...... For instance, pukla bruka ‘scandal burst’ has a common verbal phrase structure V N; however, it is a frozen expression mostly used as an interjection. As already mentioned, prior to this task SrpMD contained 79 multi-word entries (noun phrases) from the compiled list of 653 nominal MWEs, however, without ...
... ical variants, e.g. Ekavian or Ijekavian word form (pogaziti reč/riječ ‘trample the word’), or synonyms (lomiti/polomiti/slomiti vrat ‘break a neck’), complements, adjuncts etc. These tables are complemented with finite-state automata (FSA) that deal with word order, model complements, etc. and that ...
... of abusive content similar to (Rezvan et al., 2018) who firstly created an offensive word lexicon and then collected Twitter messages that contain at least one word from the lexicon. As authors noted, presence of a word in a tweet is just an indication of its offensiveness, thus subsequent manual annotation ...Ranka Stanković, Jelena Mitrović, Danka Jokić, Cvetana Krstev. "Multi-word Expressions for Abusive Speech Detection in Serbian" in Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, Association for Computational Linguistics (2020)
-
Two approaches to compilation of bilingual multi-word terminology lists from lexical resources
In this paper, we present two approaches and the implemented system for bilingual terminology extraction that rely on an aligned bilingual domain corpus, a terminology extractor for a target language, and a tool for chunk alignment. The two approaches differ in the way terminology for the source language is obtained: the first relies on an existing domain terminology lexicon, while the second one uses a term extraction tool. For both approaches, four experiments were performed with two parameters being ...Branislava Šandrih, Cvetana Krstev, Ranka Stanković. "Two approaches to compilation of bilingual multi-word terminology lists from lexical resources" in Natural Language Engineering, Cambridge University Press (CUP) (2020). https://doi.org/10.1017/S1351324919000615
-
Rule-based Automatic Multi-word Term Extraction and Lemmatization
In this paper we present a rule-based method for multi-word term extraction that relies on extensive lexical resources in the form of electronic dictionaries and finite-state transducers for modelling various syntactic structures of multi-word terms. The same technology is used for lemmatization of extracted multi-word terms, which is unavoidable for highly inflected languages in order to pass extracted data to evaluators and subsequently to terminological e-dictionaries and databases. The approach is illustrated on a corpus of Serbian texts from ...... distinct multi-word forms were evaluated as proper multi-word units, and among them 97% were associated with correct lemmas. Keywords: term extraction, terminology, multi-word units, lemmatization, finite-state transducers 1. Motivation Various approaches have been proposed for multi-word term (MWT) ...
... method for multi-word term extraction that relies on extensive lexical resources in the form of electronic dictionaries and finite-state transducers for modelling various syntactic structures of multi-word terms. The same technology is used for lemmatization of extracted multi-word terms, which is ...
... this path rejects nouns that are homographous with other PoS word forms in order to avoid false recognitions. Given the high level of homography of word forms in Serbian it is possible that two or more graphs recognize the same word sequence where only one of them is correct. In the case of ...Ranka Stanković, Cvetana Krstev, Ivan Obradović, Biljana Lazić, Aleksandra Trtovac. "Rule-based Automatic Multi-word Term Extraction and Lemmatization" in Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016, Portorož, Slovenia, 23--28 May 2016, European Language Resources Association (2016)
-
Mineral and Thermal Waters of Serbia: Multivariate Statistical Approach to Hydrochemical Characterization
Maja Todorović, Jana Štrbački, Marina Ćuk, Jakov Andrijašević, Jovana Šišović, Petar Papić. "Mineral and Thermal Waters of Serbia: Multivariate Statistical Approach to Hydrochemical Characterization" in Mineral and Thermal Waters of Southeastern Europe, Springer International Publishing (2016). https://doi.org/10.1007/978-3-319-25379-4
-
A survey of greenhouse gases production in central European lignites
Anna Pytlak, Anna Szafranek-Nakonieczna, Weronika Goraj, Izabela Śnieżyńska, Aleksandra Krążała, Artur Banach, Ivica Ristović, Mirosław Słowakiewicz, Zofia Stępniewska (2021)... [Figure: CH4 production (nmol CH4 g-1 dry mass day-1) by sample: KBR, VLN-W1, VLN-W2, KNN-W1, KNN-W2, TCB-W1, TCB-W2, BCB-W1, BCB-W2] ...
... onieczna et al., 2018), microcosms with the aim of determining the biological formation of CH4 and CO2 were prepared in a glove box, under N2. Ten grams of aseptically crushed lignite was placed in dark glass bottles (total capacity 60 cm3) and supplemented with an appropriate volume of a deionised ...
... values for Eh and N-NO3 were 0.49 (p < 0.05), while for N-NH4 0.52 (p < 0.05), suggesting that the more reduced lignites contained higher concentrations of the reduced N form. Because N-NH4 was the dominant form of nitrogen (with concentrations several dozen times higher than those of N-NO3), it may ...Anna Pytlak, Anna Szafranek-Nakonieczna, Weronika Goraj, Izabela Śnieżyńska, Aleksandra Krążała, Artur Banach, Ivica Ristović, Mirosław Słowakiewicz, Zofia Stępniewska. "A survey of greenhouse gases production in central European lignites" in Science of The Total Environment, Elsevier (2021). https://doi.org/10.1016/j.scitotenv.2021.149551
-
Bridging Computational Lexicography and Corpus Linguistics: A Query Extension for OntoLex-FrAC
OntoLex, the dominant community standard for machine-readable lexical resources in the context of RDF, Linked Data and Semantic Web technologies, is currently being extended with a dedicated module for Frequency, Attestation and Corpus-Based Information (OntoLex-FrAC). We propose a new component for OntoLex-FrAC that addresses the incorporation of corpus queries for (a) linking dictionaries with corpus engines, (b) enabling RDF-based web services to dynamically exchange corpus queries and response data, and (c) using conventional query languages to formalize the internal structure of collocations, word sketches and ...standardization, digital lexicography, OntoLex, corpus queries, linked data, Linguistic Linked Open DataChristian Chiarcos, Ranka Stanković, Maxim Ionov, Gilles Sérasset. "Bridging Computational Lexicography and Corpus Linguistics: A Query Extension for OntoLex-FrAC" in Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Turin, 20-25 May 2024, LREC (2024)
-
Keyword Extraction from Parallel Abstracts of Scientific Publications
... sequences which denote the sequence of morphological suffixes of its words [27,29]. Wikipedia is one of the most commonly used semantic sources: using n-grams that appear in Wikipedia article titles as candidates for keywords [22], utilizing Wikipedia as a thesaurus for candidate selection from documents’ ...
... method for multi-word term extraction that relies on lexical resources for modeling various syntactic structures of multi-word terms. It is applied in several domains, also among them is the corpus of Serbian texts from the geology and mining domain containing more than 600,000 simple word forms. Part of ...
... to overcome differences between the inflected forms in the text and the lemmatized keyword forms of the same word. In the text preprocessing stage for the English language we use: (1) Stop-word list - extracted from the Natural Language Toolkit (NLTK) for Python [11], and (2) the Porter stemmer [11] ...Slobodan Beliga, Olivera Kitanović, Ranka Stanković, Sanda Martinčić-Ipšić. "Keyword Extraction from Parallel Abstracts of Scientific Publications" in Semantic Keyword-Based Search on Structured Data Sources - Third International KEYSTONE Conference, IKC 2017 Gdańsk, Poland, September 11–12, 2017 Revised Selected Papers and COST Action IC1302 Reports, Springer (2017)
-
A comparative study of the molecular and isotopic composition of biomarkers in immature oil shale (Aleksinac deposit, Serbia) and its liquid pyrolysis products (open and closed systems)
Gordana Gajica, Aleksandra Šajnović, Ksenija Stojanović, Jan Schwarzbauer, Aleksandar Kostić, Branimir Jovančićević (2021)The molecular and isotopic composition of biomarkers in initial bitumen isolated from immature (0.41% Rr) oil shale samples (Aleksinac deposit) and liquid products obtained by pyrolysis in open (OS) and closed (CS) systems are studied. The influence of pyrolysis type and variations of kerogen type on biomarkers composition and their isotopic signatures in liquid products is determined. The applicability of pyrolysis type, numerous biomarkers and carbon isotopic compositions (δ13C) of n-alkanes in liquid pyrolysates is established. Pyrolysis experiments were ...Oil shale, Aleksinac, organic matter, open and closed pyrolysis systems, biomarkers, carbon isotopic compositionGordana Gajica, Aleksandra Šajnović, Ksenija Stojanović, Jan Schwarzbauer, Aleksandar Kostić, Branimir Jovančićević. "A comparative study of the molecular and isotopic composition of biomarkers in immature oil shale (Aleksinac deposit, Serbia) and its liquid pyrolysis products (open and closed systems)" in Marine and Petroleum Geology, Elsevier BV (2021). https://doi.org/10.1016/j.marpetgeo.2021.105383
-
Late and post-collisional tectonic evolution of the Adria-Europe suture in the Vardar Zone
The Vardar Zone is a product of the Triassic-Jurassic opening of the Neotethys, Jurassic obduction, Late Cretaceous/Paleogene consumption of the oceanic crust and continental collision. During the last process, the Eastern Vardar Zone was thrust over the Central and eventually both onto the Western Vardar Zone. The present paleomagnetic and structural study provided new results from the first two zones in the Belgrade area. The younger set of data, together with published ones from the third zone, provide firm ...Emő Márton, Marinko Toljić, Vesna Cvetkov. "Late and post-collisional tectonic evolution of the Adria-Europe suture in the Vardar Zone" in Journal of Geodynamics, Elsevier BV (2022). https://doi.org/10.1016/j.jog.2021.101880
-
Evidence of Variscan and Alpine tectonics in the structural and thermochronological record of the central Serbo-Macedonian Massif (south-eastern Serbia)
... and complete 40Ar/39Ar data. A brief overview of critical sample information is given in Table 1, while relevant 40Ar/39Ar date spectra, K/Ca diagrams, and isotope correlation plots are shown in Fig. 14. Additionally, the 40Ar/39Ar dates are presented in Fig. 2 along with previously published K/Ar ...
... predominantly shallow dipping [Figure: stereoplots for the western part of the Lower Complex, Vrvi Kobila area and Vranjska Banja area, showing mylonitic lineation, shear-zone foliation and foliation outside the shear zone] ...
... Simplified tectonic map of the study area with arrows representing local directions of tectonic transport. See text for details [Map legend: outcrop-scale high confidence, outcrop-scale low confidence, microstructure, mineral lineation] ...Milorad D. Antić, Alexandre Kounov, Branislav Trivić, Richard Spikings, Andreas Wetzel. "Evidence of Variscan and Alpine tectonics in the structural and thermochronological record of the central Serbo-Macedonian Massif (south-eastern Serbia)" in International Journal of Earth Sciences, Springer Science and Business Media LLC (2016). https://doi.org/10.1007/s00531-016-1380-6
-
Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities
This paper presents the activities on the development of the ELEXIS-sr corpus, the Serbian addition to the ELEXIS multilingual annotated corpus, which consists of semantic annotations and a word sense repository. ELEXIS is a parallel multilingual annotated corpus in ten European languages, which can be used as a multilingual benchmark for the evaluation of low- and medium-resourced European languages. The focus of this paper is on multiword expressions and named entities, their recognition in the ELEXIS-sr sentence set, and comparison with annotations in other languages. The first steps are discussed ...Cvetana Krstev, Ranka Stanković, Aleksandra Marković, Teodora Mihajlov. "Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities" in Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024, Turin, May 25, 2024, ELRA and ICCL (2024)
-
Named Entity Recognition for Distant Reading in ELTeC
Francesca Frontini, Carmen Brando, Joanna Byszuk, Ioana Galleron, Diana Santos, Ranka Stanković (2020)The COST Action “Distant Reading for European Literary History”, which started in 2017, has among its main goals the creation of an open-source multilingual collection of European literary texts (ELTeC). In this paper we present the work carried out on manually annotating a selection of the ELTeC collection for named entities, as well as on evaluating existing named entity recognition tools with respect to their ability to perform such annotations automatically. The final paragraph discusses the points in common between this initiative and CLARIN.... -syntactic and direct speech annotation. At this moment, WG2 carries out the NE annotation for a subset of languages: Czech (cze), German (deu), English (eng), French (fra), Hungar ...
... French: The manual corpus contains many PERS and fewer LOC annotations. However, spaCy-fra annotates too many LOC hence the low precision for this category, and SEM-fra annotates too ...
... utomatic Content Extraction) English Annotation Guidelines for Entities, Version 6.6. Technical report, Linguistic Data Consortium. https://www.ldc.upenn.edu/sites/www.ldc.upenn.edu/files/en ...Francesca Frontini, Carmen Brando, Joanna Byszuk, Ioana Galleron, Diana Santos, Ranka Stanković. "Named Entity Recognition for Distant Reading in ELTeC" in CLARIN Annual Conference 2020, Oct 2020, Virtual Event, France, CLARIN (2020)
-
The Newton method for solving nonlinear equations based on aggregation operators
Nebojša M. Ralević, Dejan Ćebić (2019)Nebojša M. Ralević, Dejan Ćebić. "The Newton method for solving nonlinear equations based on aggregation operators" in XLVI International Symposium on Operational Research SYM-OP-IS, Kladovo, 15-18.9.2019, Универзитет у Београду (2019)