Basic document metadata record
Open database of polysemous senses of 308 Serbian polysemous nouns, verbs, and adjectives
dc.creator | Mišić, Ksenija | |
dc.creator | Anđelić, Sara | |
dc.creator | Ilić, Lenka | |
dc.creator | Osmani, Dajana | |
dc.creator | Manojlović, Milica | |
dc.creator | Filipović Đurđević, Dušica | |
dc.date.accessioned | 2023-11-02T16:46:40Z | |
dc.date.available | 2023-11-02T16:46:40Z | |
dc.date.issued | 2023 | |
dc.identifier.uri | http://reff.f.bg.ac.rs/handle/123456789/5127 | |
dc.description.abstract | The majority of words can denote multiple related objects/phenomena, i.e. they can have multiple related senses (so-called polysemes). Understanding this linguistic phenomenon is therefore of high importance both for linguistic inquiry and for psychological studies of cognitive mechanisms. Previous research demonstrated that, in addition to the number of senses, processing is also influenced by the balance of sense probabilities (Filipović Đurđević & Kostić, 2021). However, resources for the study of lexical ambiguity are very sparse (e.g. a database of 150 polysemous Serbian nouns; Filipović Đurđević & Kostić, 2017). Additionally, most of these effects were demonstrated either within a single part-of-speech category (typically nouns) or for ambiguous words whose senses span several parts of speech (e.g. a record / to record; as pointed out by Eddington & Tokowicz, 2015). Therefore, the goal of this paper is to present a new open database containing raw and categorized native speakers’ semantic intuitions for 308 Serbian polysemous nouns (100), verbs (100), and adjectives (108), together with multiple quantifications representing an array of ambiguity-level indices. For each of the polysemous words, we collected semantic intuitions of native speakers by using the total meaning metric (Azuma, 1997). We then categorized the collected descriptions by using three strategies: a) relying solely on semantic intuition, b) relying solely on dictionary descriptions, and c) combining semantic intuitions and dictionary descriptions. Within each strategy, we also monitored and investigated the effect of the coder (the researcher performing the categorization) in order to explore the robustness of each approach. We then generated the sense probability distribution for each word by counting the response frequencies across the created categories. 
In order to quantify the level of ambiguity, we calculated the number of senses, redundancy, and entropy of the obtained sense probability distributions (Shannon, 1948; Filipović Đurđević & Kostić, 2017). Each measure, within each approach, was also corrected for the effects of idiosyncratic senses, reflexive verbs, etc. This database will be openly available and will provide a useful resource in ambiguity research. In the future, this database should be expanded with measures from word embeddings (e.g. BERT; Wiedemann et al., 2019) that separate different word senses. This would allow the level of ambiguity to be quantified on large-scale samples of text, which may yield more precise estimates of sense numbers and sense probabilities and would allow the counting-of-senses approach to be abandoned (as suggested by Filipović Đurđević et al., 2009). Adding these measures to the database, and thereby enabling comparison with the existing measures, may provide an additional validation point for measures derived from human participants. | sr |
dc.language.iso | en | sr |
dc.publisher | Faculty of Philosophy in Novi Sad | sr |
dc.relation | info:eu-repo/grantAgreement/MESTD/inst-2020/200163/RS// | sr |
dc.rights | openAccess | sr |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
dc.source | Book of abstracts, 10th Novi Sad workshop on Psycholinguistic, neurolinguistic, and clinical linguistic research, April 22, Faculty of Philosophy, University of Novi Sad | sr |
dc.subject | open database | sr |
dc.subject | polysemous nouns | sr |
dc.subject | polysemous verbs | sr |
dc.subject | polysemous adjectives | sr |
dc.title | Open database of polysemous senses of 308 Serbian polysemous nouns, verbs, and adjectives | sr |
dc.type | conferenceObject | sr |
dc.rights.license | BY | sr |
dc.citation.spage | 27 | |
dc.identifier.fulltext | http://reff.f.bg.ac.rs/bitstream/id/12670/bitstream_12670.pdf | |
dc.identifier.rcub | https://hdl.handle.net/21.15107/rcub_reff_5127 | |
dc.type.version | publishedVersion | sr |
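
The abstract describes deriving a sense probability distribution for each word from response frequencies and then computing the number of senses, entropy, and redundancy. The following is a minimal sketch of that calculation, not the authors' implementation. It assumes that entropy is Shannon entropy in bits and that redundancy is defined as 1 - H / log2(N), the usual information-theoretic form; the abstract does not state the exact formula. The function name `ambiguity_measures`, the input format (one category label per participant response), and the convention of returning redundancy 0.0 for a single-sense word are all hypothetical choices made for illustration. The corrections mentioned in the abstract (idiosyncratic senses, reflexive verbs) are not modelled.

```python
import math
from collections import Counter


def ambiguity_measures(sense_labels):
    """Compute ambiguity indices from categorized responses for one word.

    sense_labels: iterable of category labels, one per response
    (hypothetical input format, assumed for this sketch).
    """
    counts = Counter(sense_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one categorized response is required")

    # Sense probability distribution: relative frequency of each category.
    probabilities = {sense: n / total for sense, n in counts.items()}
    n_senses = len(probabilities)

    # Shannon entropy in bits.
    entropy = -sum(p * math.log2(p) for p in probabilities.values())

    # Redundancy, ASSUMED here to be 1 - H / H_max with H_max = log2(N).
    # For a single sense H_max is 0, so 0.0 is returned by convention.
    max_entropy = math.log2(n_senses) if n_senses > 1 else 0.0
    redundancy = 1.0 - entropy / max_entropy if max_entropy > 0 else 0.0

    return {
        "probabilities": probabilities,
        "n_senses": n_senses,
        "entropy": entropy,
        "redundancy": redundancy,
    }


if __name__ == "__main__":
    # Invented toy responses, not data from the database.
    responses = ["sense_a", "sense_a", "sense_a", "sense_b"]
    print(ambiguity_measures(responses))
```

Under these assumptions, a word whose senses are equally probable has maximal entropy and zero redundancy, while a word dominated by one sense has lower entropy and higher redundancy, which is the "balance of sense probabilities" the abstract refers to.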