{"id":16789,"date":"2024-10-01T10:56:18","date_gmt":"2024-10-01T08:56:18","guid":{"rendered":"https:\/\/is.ijs.si\/?p=16789"},"modified":"2025-03-26T09:02:25","modified_gmt":"2025-03-26T08:02:25","slug":"sarcasm-detection-in-a-less-resourced-language","status":"publish","type":"post","link":"https:\/\/is.ijs.si\/?p=16789","title":{"rendered":"Sarcasm Detection in a Less-Resourced Language"},"content":{"rendered":"\n\n\n<p>Lazar \u0110okovi\u0107 and Marko Robnik-\u0160ikonja<\/p>\n<p>Abstract<br \/>The sarcasm detection task in natural language processing aims<br \/>to classify whether an utterance is sarcastic. It is related<br \/>to sentiment analysis since sarcasm often inverts surface sentiment. Because sarcastic sentences are highly dependent on context and<br \/>are often accompanied by various non-verbal cues, the task<br \/>is challenging. Most related work focuses on high-resourced<br \/>languages like English. To build a sarcasm detection dataset for<br \/>a less-resourced language, such as Slovenian, we leverage two<br \/>modern techniques: a medium-size transformer model<br \/>specialized for machine translation and a very large generative language model.<br \/>We explore the viability of translated datasets and how the size of<br \/>a pretrained transformer affects its ability to detect sarcasm. We<br \/>train ensembles of detection models and evaluate their performance. The results show that larger models generally outperform<br \/>smaller ones and that ensembling can slightly improve sarcasm<br \/>detection performance. 
Our best ensemble approach achieves an<br \/>F1-score of 0.765, which is close to the annotators\u2019 agreement in the<br \/>source language.<\/p>\n<p>\u00a0<\/p>\n\n\n\n<div data-wp-interactive=\"core\/file\" class=\"wp-block-file\"><object data-wp-bind--hidden=\"!state.hasPdfPreview\" hidden class=\"wp-block-file__embed\" data=\"https:\/\/is.ijs.si\/wp-content\/uploads\/2024\/10\/SCAI_2024_paper_4212.pdf\" type=\"application\/pdf\" style=\"width:100%;height:600px\" aria-label=\"Embed of SCAI_2024_paper_4212.\"><\/object><a id=\"wp-block-file--media-1bd793e3-18c1-43a4-a8d0-ee479ce5f2e4\" href=\"https:\/\/is.ijs.si\/wp-content\/uploads\/2024\/10\/SCAI_2024_paper_4212.pdf\">SCAI_2024_paper_4212<\/a><a href=\"https:\/\/is.ijs.si\/wp-content\/uploads\/2024\/10\/SCAI_2024_paper_4212.pdf\" class=\"wp-block-file__button wp-element-button\" download aria-describedby=\"wp-block-file--media-1bd793e3-18c1-43a4-a8d0-ee479ce5f2e4\">Download<\/a><\/div>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":29,"featured_media":24966,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[105,102],"tags":[],"class_list":["post-16789","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-doi-skui-2024","category-papers"],"_links":{"self":[{"href":"https:\/\/is.ijs.si\/index.php?rest_route=\/wp\/v2\/posts\/16789","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/is.ijs.si\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/is.ijs.si\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/is.ijs.si\/index.php?rest_route=\/wp\/v2\/users\/29"}],"replies":[{"embeddable":true,"href":"https:\/\/is.ijs.si\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=16789"}],"version-history":[{"count":1,"href":"https:\/\/is.ijs.si\/index.php?rest_route=\/wp\/v2\/posts\/16789\/revisions"}],"predec
essor-version":[{"id":16791,"href":"https:\/\/is.ijs.si\/index.php?rest_route=\/wp\/v2\/posts\/16789\/revisions\/16791"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/is.ijs.si\/index.php?rest_route=\/wp\/v2\/media\/24966"}],"wp:attachment":[{"href":"https:\/\/is.ijs.si\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=16789"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/is.ijs.si\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=16789"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/is.ijs.si\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=16789"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}