The Dictionary of the Serbian Academy: from the Text to the Lexical Database

Type
Paper in proceedings
Version of the paper
published version
Language
English
Creator
Ranka Stanković, Rada Stijović, Duško Vitas, Cvetana Krstev, Olga Sabo
Source
Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts
Editor
Jaka Čibej, Vojko Gorjanc, Iztok Kosem, Simon Krek
Publisher
Ljubljana : Ljubljana University Press, Faculty of Arts
Publication date
2018
Abstract
In this paper we discuss the project of digitization of the Dictionary of the Serbo-Croatian Standard and Vernacular
Language. Scanning and character recognition were a particular challenge, since various non-standard
character set encodings were used over the almost 60-year-long production of the dictionary. The first
aim of the project was to formalize the micro-structure of the dictionary articles in order to parse the digitized
text and transform it into structured data stored in a relational lexical database. This approach is compatible
with several standard structured forms and ontologies (TEI, LMF, Ontolex, LexInfo). A lexical database model
was designed in compliance with these structured forms, following mostly the lemon model. Mapping of
the lexical entry markers to LexInfo and TEI enabled export of the lexical data to the mentioned formats. A
software solution for the dictionary text analysis, parsing and lexical database population was developed and
tested on the first and the last published volumes of the dictionary (which contain 27,141 articles in total). An
evaluation of the results shows that the developed model and software solution can be successfully used for
the other volumes as well.
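The parse-then-export pipeline outlined in the abstract can be illustrated with a minimal sketch. It assumes a toy article layout (`lemma, pos. 1. sense. 2. sense.`) that is far simpler than the dictionary's real micro-structure; the names `LexicalEntry`, `parse_article`, and `to_tei` are hypothetical and do not come from the authors' software. The export uses only basic element names from the TEI dictionary module (`entry`, `form`, `orth`, `gramGrp`, `pos`, `sense`, `def`).

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Assumed toy layout: "<lemma>, <pos>. 1. <sense>. 2. <sense>."
# The real articles are much richer; this pattern is only an illustration.
ARTICLE_RE = re.compile(r"^(?P<lemma>[^,]+),\s*(?P<pos>\w+)\.\s*(?P<body>.+)$")


@dataclass
class LexicalEntry:
    """Hypothetical structured form of one dictionary article."""
    lemma: str
    pos: str
    senses: list[str] = field(default_factory=list)


def parse_article(text: str) -> LexicalEntry:
    """Split one article line into lemma, part of speech, and numbered senses."""
    match = ARTICLE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized article layout: {text!r}")
    # Sense markers such as "1." and "2." act as separators inside the body.
    parts = re.split(r"\s*\d+\.\s*", match["body"])
    senses = [p.strip().rstrip(".") for p in parts if p.strip()]
    return LexicalEntry(match["lemma"].strip(), match["pos"], senses)


def to_tei(entry: LexicalEntry) -> str:
    """Serialize an entry as a TEI-style <entry> element."""
    root = ET.Element("entry")
    form = ET.SubElement(root, "form", type="lemma")
    ET.SubElement(form, "orth").text = entry.lemma
    gram = ET.SubElement(root, "gramGrp")
    ET.SubElement(gram, "pos").text = entry.pos
    for number, definition in enumerate(entry.senses, start=1):
        sense = ET.SubElement(root, "sense", n=str(number))
        ET.SubElement(sense, "def").text = definition
    return ET.tostring(root, encoding="unicode")


# Invented sample article, not taken from the dictionary.
entry = parse_article("kuća, n. 1. building for living in. 2. household.")
print(to_tei(entry))
```

In a setup like this, the intermediate `LexicalEntry` object plays the role of a database row, so the same parsed data could be written to relational tables or serialized to other formats without re-parsing the text.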
Start page
941
End page
949
ISBN
978-961-06-0097-8
Subject
computer lexicography, lexical database, language resources, dictionary, Serbian language
Broader category of the paper
M30
Narrower category of the paper
M33
Rights
Open access
License
Creative Commons – Attribution-NonCommercial-No Derivative Works 4.0 International
Format
.pdf
Object collections
Ranka Stanković
Radovi istraživača (Researchers' papers)
Media
pdf

Ranka Stanković, Rada Stijović, Duško Vitas, Cvetana Krstev, Olga Sabo. "The Dictionary of the Serbian Academy: from the Text to the Lexical Database" in Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts, Ljubljana : Ljubljana University Press, Faculty of Arts (2018)