Georgios Aivatoglou, Alexia Fytili, Georgios Arampatzis, Dimitrios Zaikis, Stylianou Nikolaos, and Ioannis Vlahavas. “End-to-end Aspect Extraction and Aspect-Based Sentiment Analysis Framework for Low-Resource Languages.” In: Intelligent Systems Conference (IntelliSys) 2023. Lecture Notes in Networks and Systems. Springer Nature Switzerland, 2023.

Author(s): Georgios Aivatoglou, Alexia Fytili, Georgios Arampatzis, Dimitrios Zaikis, Stylianou Nikolaos, and Ioannis Vlahavas

Keywords: natural language processing; media analysis; low resource languages

Abstract: Due to the increasing volume of user-generated content on the web, the vast majority of businesses and organizations have turned to sentiment analysis to gain insights about their customers. Sentiment analysis is a Natural Language Processing task that aims to extract information about the human emotional state. It can be performed at three levels: the document level, the sentence level, or the aspect/feature level. Since the document and sentence levels can be too generic for estimating opinions about specific attributes of a product or service, aspect-based sentiment analysis has become the norm for exploiting user-generated data. However, most human languages, with the exception of English, are considered low-resource languages due to the limited resources available, which makes automating information extraction tasks challenging. Accordingly, in this work, we propose a methodology for automatic aspect extraction and sentiment classification on Greek texts that can potentially be generalized to other low-resource languages. For the purpose of this study, a new dataset was created consisting of social media posts written in Greek and collected from Twitter, Facebook and YouTube. We further propose Transformer-based Deep Learning architectures that automatically extract the key aspects from texts and then classify them according to the author's intent into three pre-defined classification categories. The proposed methodology achieved relatively high F1-macro scores on all classes, underscoring its importance for aspect extraction and sentiment classification in low-resource languages.