Search
180 items
-
Rule-based Automatic Multi-word Term Extraction and Lemmatization
In this paper we present a rule-based method for multi-word term extraction that relies on extensive lexical resources in the form of electronic dictionaries and finite-state transducers for modelling various syntactic structures of multi-word terms. The same technology is used for lemmatization of extracted multi-word terms, which is unavoidable for highly inflected languages in order to pass extracted data to evaluators and subsequently to terminological e-dictionaries and databases. The approach is illustrated on a corpus of Serbian texts from ...
... them 97% were associated with correct lemmas. Keywords: term extraction, terminology, multi-word units, lemmatization, finite-state transducers 1. Motivation Various approaches have been proposed for multi-word term (MWT) extraction as this problem has been gaining in importance in the field ...
... them, such as indexing or document information retrieval, for term extraction. The current application is developed and tested within a Windows environment, while a corresponding web application, which would offer term extraction from texts in various domains to a wider community of expert users ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Biljana Lazić, Aleksandra Trtovac. "Rule-based Automatic Multi-word Term Extraction and Lemmatization" in Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016, Portorož, Slovenia, 23-28 May 2016, European Language Resources Association (2016)
-
Resource-based WordNet Augmentation and Enrichment
In this paper we present an approach to support production of synsets for Serbian WordNet (SerWN) by adjusting Princeton WordNet (PWN) synsets using several bilingual English-Serbian resources. PWN synset definitions were automatically translated and post-edited, if needed, while candidate literals for Serbian synsets were obtained automatically from a list of translational equivalents compiled from bilingual resources. Preliminary results obtained from a set of 1248 selected PWN synsets show that the produced Serbian synsets contain 4024 literals, out of which 2278 were offered by the system we present in this paper, whereas experts added the remaining 1746. Approximately one half of ...
... aligned term pairs. The structure of lexical entries in GeolISSTerm, Rudonto and Termi is similar. Each term comes with a name, definition, an optional list of synonyms, abbreviations and a bibliographic source. Each term, except the top term in the dictionary tree, has only one hyperonym term, but it ...
... bilingual resource, it is taken into account only if the term is found in the Serbian morphological e-dictionary and if the POS of the term matched the synset POS. 2.2. Bilingual lists In addition to the list of English-Serbian (en-sr) term pairs extracted from aligned PWN and SerWN synsets, we have ...
... format, with 6,939 term entries, and 6,971 aligned pairs of terms. Microsoft language portal has published Microsoft Terminology Collection data in the form of a .tbx (ISO 30042:2008) file containing: Concept ID, Definition, Source term, Source language identifier, Target term, Target language ...
Ranka Stanković, Miljana Mladenović, Ivan Obradović, Marko Vitas, Cvetana Krstev. "Resource-based WordNet Augmentation and Enrichment" in Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27-29, 2018, Sofia, Bulgaria, Sofia : The Institute for Bulgarian Language Prof. Lyubomir Andreychin, Bulgarian Academy of Sciences (2018)
-
Long-term planning methodology for improving wood biomass utilization
The insufficiently developed forest management system is often followed by undeveloped forest resources supply chain and insufficient institutional support. These cause inefficient usage of fuel-wood as well as huge amounts of unused forest residues. In order to achieve optimal and long-term sustainable utilisation of biomass, an original methodology based on the interaction of mathematical optimization and backcasting approach has been developed. Mathematical optimization is used for both generation and consideration of techno-economic parameters of the forest biomass supply chain. ...
Vladimir Vukašinović, Dušan Gordić, Marija Živković, Davor Koncalović, Dubravka Živković. "Long-term planning methodology for improving wood biomass utilization" in Energy, Elsevier BV (2019). https://doi.org/10.1016/j.energy.2019.03.105
-
Keyword Extraction from Parallel Abstracts of Scientific Publications
... publicly available from http://langnet.uniri.hr/resources.html. The previous research [14] for terminology extraction in the Serbian language used the rule-based method for multi-word term extraction that relies on lexical resources for modeling various syntactic structures of multi-word terms. It is applied ...
... (2010) 28. Lopez, P., Romary, L.: HUMB: automatic key term extraction from scientific articles in GROBID. In: Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 248–251 (2010) 29. Nguyen, T.D., Kan, M.-Y.: Keyphrase extraction in scientific publications. In: Goh, D.H.-L., Cao, ...
... human, but in the text the author used a synonym term “velocity” (stemmed to: veloc). Since the word “speed” is not present in the original text, the method will never extract it as a keyword. Similar examples adversely affect the success of extraction and reduce the efficiency of the SBKE method in terms ...
Slobodan Beliga, Olivera Kitanović, Ranka Stanković, Sanda Martinčić-Ipšić. "Keyword Extraction from Parallel Abstracts of Scientific Publications" in Semantic Keyword-Based Search on Structured Data Sources - Third International KEYSTONE Conference, IKC 2017 Gdańsk, Poland, September 11–12, 2017 Revised Selected Papers and COST Action IC1302 Reports, Springer (2017)
-
Extraction of Bilingual Terminology Using Graphs, Dictionaries and GIZA++
In science, industry and many research fields, terminology develops rapidly. Most often, the language that serves as the "lingua franca" for most of these fields is English. As a consequence, in many fields domain terms are conceived in English and later translated into other languages. In this paper we present an approach to automatic extraction of bilingual terminology for the English-Serbian language pair that relies on an aligned bilingual domain corpus, a terminology extractor for the target language and a tool for chunk alignment. We examine the performance of the method on the domain ...
... the extraction of English terms (Input ii) we used the English side of the dictionary LIS-dict in one series of experiments, and term extractor Eng-TE in the other, while the extraction of Serbian terms (Input iii) was done by Serb-TE. With the notation introduced in Section 3, the extraction procedure ...
... showed that a number of new term pairs were retrieved. When LIS-dict was used as a source of English terminology, 364 English terms from the dictionary were linked to new Serbian translations yielding 428 new term pairs. Among all term pairs retrieved using Eng-TE for extraction, 538 were supported by LIS-dict ...
... hypothesis: On the basis of bilingual, aligned, domain-specific textual resources, a terminological list and/or a term extraction tool in a source language, and a system for the extraction of terminology-specific Multi-Words Terms in a target language, it is possible to compile a bilingual aligned t ...
Branislava Šandrih, Ranka Stanković. "Extraction of Bilingual Terminology Using Graphs, Dictionaries and GIZA++" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.6
-
Comparison of sequential and single extraction in order to estimate environmental impact of metals from fly ash
coal fly ash, single-agent extraction, sequential extraction, microwave ovens, ultrasound
Aleksandra Tasić, Ivana Sredović-Ignjatović, Ljubiša Ignjatović, Marija Ilić, Mališa Antić. "Comparison of sequential and single extraction in order to estimate environmental impact of metals from fly ash" in Journal of the Serbian Chemical Society (2016). https://doi.org/10.2298/JSC160307038T
-
Some examples of interactions between certain rare earth elements and soil
... happened in several different phases and the extraction in phases with milder extraction agents had occurred. In case of neodymium, the nature of soil is important in particular, since only in certain extraction phases there came to ion extraction, and also due to slightly different neodymium ...
... stic for humus. The extraction was noticed in ion exchange phase (Phase I) and in extraction phases with acids (Phases IV and V). The results show that in sand soil about 80 % of neodymium ions have been extracted and in the clay type of soil 96 % (Table VI). The extraction from humus soil was ...
... quantities depend on soil type, nature of extraction agent and metallic ion characteristics. Neodymium, first of all, due to its ionic diameter reacts differently with soil. For that reason the extraction happened only with solutions of certain extraction agents depending on their nature and ionic ...
Zlatko Nikolovski, Jelena Isailović, Dejan Jeremić, Sabina Kovač, Ilija Brčeski. "Some examples of interactions between certain rare earth elements and soil" in Journal of the Serbian Chemical Society, National Library of Serbia (2021). https://doi.org/10.2298/JSC211006095N
-
Corpus-based bilingual terminology extraction in the power engineering domain
This paper presents the resources and tools used for the extraction and evaluation of bilingual English-Serbian terminology in the power engineering domain. The resources consist of existing general and domain lexica and a domain parallel corpus; the tools include term extractors for both languages and a tool for aligning segments (chunks) that belong to corpus sentences. The system was tested by varying the matching function that determines the presence of an extracted term in an aligned chunk, ranging from very loose to strict. Evaluation of the results showed that the precision of term extraction ...
Tanja Ivanović, Ranka Stanković, Branislava Šandrih Todorović, Cvetana Krstev. "Corpus-based bilingual terminology extraction in the power engineering domain" in Terminology, John Benjamins Publishing Company (2022). https://doi.org/10.1075/term.20038.iva
-
Two approaches to compilation of bilingual multi-word terminology lists from lexical resources
In this paper, we present two approaches and the implemented system for bilingual terminology extraction that rely on an aligned bilingual domain corpus, a terminology extractor for a target language, and a tool for chunk alignment. The two approaches differ in the way terminology for the source language is obtained: the first relies on an existing domain terminology lexicon, while the second one uses a term extraction tool. For both approaches, four experiments were performed with two parameters being ...
Branislava Šandrih, Cvetana Krstev, Ranka Stanković. "Two approaches to compilation of bilingual multi-word terminology lists from lexical resources" in Natural Language Engineering, Cambridge University Press (CUP) (2020). https://doi.org/10.1017/S1351324919000615
-
Towards Automatic Definition Extraction for Serbian
This paper presents preliminary results of the automatic extraction of candidates for dictionary definitions from unstructured texts in Serbian, with the aim of speeding up dictionary development. Definitions in the dictionary of the Serbian Academy of Sciences and Arts (SANU) were used to model different types of definitions (descriptive, grammatical, referential and synonymous) that have different syntactic and lexical characteristics. The research corpus consists of 61,213 definitions of nouns, which were analyzed using morphological e-dictionaries and local grammars implemented as finite-state transducers in the corpus processing package with open ...
... definition extraction can be formalized in different ways. One of the possibilities is to consider it as a problem of classifying sentences into those that are potential candidates for defining a term and those that are not, namely, as a problem of determining whether a sentence contains a term-definition ...
... <term> defined (as|by) define(s)? <term> as definition of <term> <term> a measure of <term> is DT (that|which|where) <term> comprise(s)? <term> consist(s)? of <term> denote(s)? <term> de ...
... either as a sentence classification task (i.e., containing term-definition pairs or not) or a sequential labelling task (i.e. identifying the boundaries of terms and definitions). The previous work on definition extraction can be classified as follows: 1) the rule-based approach with linguistic rules and ...
Ranka Stanković, Cvetana Krstev, Rada Stijović, Mirjana Gočanin, Mihailo Škorić. "Towards Automatic Definition Extraction for Serbian" in Proceedings of the XIX EURALEX Congress of the European Association for Lexicography: Lexicography for Inclusion (Volume 2). 7-9 September (virtual), Democritus University of Thrace (2021)
-
Terminological and lexical resources used to provide open multilingual educational resources
Open educational resources (OER) within BAEKTEL (Blending Academic and Entrepreneurial Knowledge in Technology enhanced learning) network will be available in different languages, mostly in the languages of Western Balkans, Russian and English. University of Belgrade (UB) hosts a central repository based on: BAEKTEL Metadata Portal (BMP), terminological web application for management, browse and search of terminological resources, web services for linguistic support (query expansion, information retrieval, OER indexing, etc.), annotation of selected resources and OER repository on local edX ...
... for term recognition, extraction and lemmatization. Picture 1 illustrates steps in terminology extraction. Crucial resources are morphological dictionaries and grammars. They are combined with some statistical measures for term extraction. The first step is analysis of terms in existing term base ...
... Automatic term extraction is a process that is meant to facilitate this painstaking task and identify terms less obvious to humans by using computer aided techniques. For now, the automatic extraction is used as a preliminary process, to 5 Available at the address: http://geoliss.mprrpp.gov.rs/term/ h ...
... http://meta.baektel.eu/ identify term candidates, but is expected to replace manual term extraction completely. Due to the rich morphology of Serbian language and the complexity of terms (they are the most often composed of two or more words called multi word units) it is not a simple process. ...
Biljana Lazić, Danica Seničić, Aleksandra Tomašević, Bojan Zlatić. "Terminological and lexical resources used to provide open multilingual educational resources" in The Seventh International Conference on eLearning (eLearning-2016), 29-30 September 2016, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2016)
-
Bilingual lexical extraction based on word alignment for improving corpus search
Jelena Andonovski, Branislava Šandrih, Olivera Kitanović. "Bilingual lexical extraction based on word alignment for improving corpus search" in The Electronic Library, Emerald (2019). https://doi.org/10.1108/EL-03-2019-0056
-
Project REASONING: Characterization and technological procedures for recycling and reusing of the rudnik mine flotation tailings
Vesna Cvetkov, Vladimir Simić, Stefan Petrović, Filip Arnaut, Milena Kostović, Dragan Radulović, Jovica Stojanović, Vladimir Jovanović, Dejan Todorović, Nina Nikolić, Jelena Senćanski, Grozdanka Bogdanović, Dragana Marilović. "Project REASONING: Characterization and technological procedures for recycling and reusing of the rudnik mine flotation tailings" in 5th Congress Geologists of the Republic of North Macedonia, Ohrid, 28-29. 10. 2024, Македонско геолошко друштво (2024)
-
Groundwater management by riverbank filtration and an infiltration channel, the case of Obrenovac, Serbia
Dušan Polomčić, Bojan Hajdin, Zoran Stevanović, Dragoljub Bajić, Katarina Hajdin. "Groundwater management by riverbank filtration and an infiltration channel, the case of Obrenovac, Serbia" in Hydrogeology Journal, Berlin, Heidelberg : Springer, International Association of Hydrogeologists (2013). https://doi.org/10.1007/s10040-013-1025-9
-
Global trend and negative synergy: Climate changes and groundwater over-extraction
Stevanović Zoran. "Global trend and negative synergy: Climate changes and groundwater over-extraction" in Proceedings of the International Conference “Climate Change Impact on Water Resources”, 17-18 Oct.2013, Belgrade, Belgrade:Institute of Wat. Manag. J.Cerni & WSDAC (2013): 42-45
-
Terminology Acquisition and Description Using Lexical Resources and Local Grammars
Acquisition of new terminology from specific domains and its adequate description within terminological dictionaries is a complex task, especially for languages that are morphologically complex such as Serbian. In this paper we present an approach to solving this task semi-automatically on basis of lexical resources and local grammars developed for Serbian. Special attention is given to automatic inflectional class prediction for simple adjectives and nouns and the use of syntactic graphs for extraction of Multi-Word Unit (MWU) candidates for ...
... geochemical research” and a term for technical characteristics of machines “Length of the caterpillar transporting device measured from the vertical excavator rotation axis to the front edge of the caterpillar”. 4.2 Extraction of MWUs from domain texts The extraction of MWUs from a text is preceded ...
... shallow grammar, obtained by an automatic conversion of the lexicon. Przepiorkowski and associates (2007) present results of automatic extraction of term definitions from unstructured texts in Bulgarian, Czech and Polish by use of regular grammars. There are also combinations of the two ap- ...
... ‘environmental pollution’ is. For that reason, the order of term candidate extraction is: 1. AXAXN, 2XAXN, AXN2X, AXN4X, AXN 2. N6X 3. N4X 4. 2XN, N2X, NXN At the end of each round duplicates are eliminated according to the priority and the union of all results is performed. The output of processing ...
Cvetana Krstev, Ranka Stanković, Ivan Obradović, Biljana Lazić. "Terminology Acquisition and Description Using Lexical Resources and Local Grammars" in Proceedings of the 11th Conference on Terminology and Artificial Intelligence, Granada, Spain, 2015, Granada : LexiCon (Universidad de Granada) (2015)
-
Application of contour blasting for the extraction of dimension stone blocks
Kričak Lazar, Negovanović Milanka, Janković Ivan, Zeković D., Mitrović S.. "Application of contour blasting for the extraction of dimension stone blocks" in Proceedings of the 2nd International Conference „Harmony of nature and spirituality in stone“, Kragujevac, Serbia:Stone Studio Association, (2012): 79-85
-
Hydrodynamic analysis of potential groundwater extraction capacity increase: case study of Nelt groundwater source at Dobanovci
Bajić Dragoljub, Polomčić Dušan, Ratković Jelena, Matić Ivan. "Hydrodynamic analysis of potential groundwater extraction capacity increase: case study of Nelt groundwater source at Dobanovci" in Tehnika 4 no. 68, Belgrade:Union of Engineers and Technicians of Serbia (2017): 512-525. https://doi.org/10.5937/tehnika1704512B
-
Using Query Expansion for Cross-Lingual Mathematical Terminology Extraction
Velislava Stoykova, Ranka Stanković. "Using Query Expansion for Cross-Lingual Mathematical Terminology Extraction" in Advances in Intelligent Systems and Computing, Springer International Publishing (2018). https://doi.org/10.1007/978-3-319-91189-2_16
-
Semi-Automatic Extraction of Multiword Terms from Domain-Specific Corpora
Vesna Pajić, Staša Vujičić Stanković, Ranka Stanković, Miloš Pajić. "Semi-Automatic Extraction of Multiword Terms from Domain-Specific Corpora" in The Electronic Library 36 no. 3, Emerald Publishing Limited (2018): 550-567. https://doi.org/10.1108/EL-06-2017-0128