Appears in Collections: Computing Science and Mathematics eTheses
Title: Towards Arabic textual and multi-modal sentiment analysis
Author(s): Alqarafi, Abdulrahman
Supervisor(s): Swingler, Kevin
Keywords: Sentiment Analysis; Machine Learning; Multi-modal Sentiment Analysis; Word Embedding
Issue Date: 1-Sep-2020
Publisher: University of Stirling
Citation: 1. A. Alqarafi, A. Adeel, A. Hawalah, K. Swingler, and A. Hussain, “A semi-supervised corpus annotation for Saudi sentiment analysis using twitter,” in International Conference on Brain Inspired Cognitive Systems. Springer, 2018, pp. 589–596.
2. A. S. Alqarafi, A. Adeel, M. Gogate, K. Dashtipour, A. Hussain, and T. Durrani, “Toward’s Arabic multi-modal sentiment analysis,” in International Conference in Communications, Signal Processing, and Systems. Springer, 2017, pp. 2378–2386.
Abstract: Sentiment Analysis (SA) is the process of classifying the sentiment in given media such as text, audio, or video. Classifying the sentiment from several combined modalities is known as multi-modal sentiment analysis. People use the Internet to express and share opinions, facts, and sentiments about products and services. Social media applications such as Facebook and Twitter have become popular information-sharing platforms. These expressed opinions influence individuals when they make decisions such as which products to purchase, which movies to watch, or which political parties to vote for. Organizations, providers, and governments can obtain crucial information from the responses of consumers regarding their products or services, which helps them continuously improve those products and services. Analysing this kind of information automatically can allow them to adapt their products or services to the desires of particular groups, thereby improving consumer satisfaction and increasing sales. Until recently, most user-contributed content has been text. However, as the number of smartphones increases, individuals also share videos on online video platforms. Like text, these videos hold opinions and sentiments that can be analysed. Most previous work on sentiment analysis, using one or more modes, has concentrated on English and on other languages such as Spanish, Chinese, and French. This research aims to investigate sentiment analysis for Arabic, a low-resource language, focusing on both uni-modal sentiment analysis (text only) and multi-modal sentiment analysis. For textual sentiment analysis, we examine several aspects and systematically compare the results with previously built Arabic benchmark sentiment analysis models. This includes building CNN models as well as comparing several word embeddings that are used as input to the models.
We explored the feasibility of using sentiment embeddings, which have not previously been built for Arabic models, and found that they increased sentiment classification accuracy compared with traditional word embeddings. Despite the challenges that the test sets contain, such as the use of both Modern Standard Arabic (MSA) and several dialects, the proposed CNN model and the sentiment embedding models showed an increase in sentiment classification accuracy compared with state-of-the-art sentiment analysis systems for Arabic. Additionally, the thesis investigated the tuning of several CNN hyper-parameters and their impact on model performance for Arabic, and we introduced a guideline for tuning these hyper-parameters. This guideline reveals that certain hyper-parameters, such as the kernel size, are especially crucial to tune for Arabic. For Arabic, the use of audio and visual cues alongside text for sentiment analysis, including the efficacy and feasibility of this approach, is not well studied. There is no published framework that extracts features from different modalities and fuses them for sentiment classification. In this thesis, we propose a multi-modal framework for Arabic sentiment analysis that combines the features extracted from text, audio, and video. We report extensive experiments and investigations of all the modalities and their impact on sentiment classification. We also compared hand-crafted machine learning approaches, which have been widely studied for Arabic sentiment analysis, with deep learning approaches, which have received limited attention. In contrast with previous work, we benchmark the trained models against several independent test sets.
Despite the challenges of Arabic, which include the type of data (whether from Twitter or Facebook) and the noise it contains, as well as the mix of Modern Standard Arabic and the dialectal variations addressed in the literature, we showed that our proposed models achieve sentiment classification scores comparable to state-of-the-art Arabic sentiment classification models. The thesis also investigates the feasibility of combining audio and video features alongside text by first constructing a benchmark dataset, the Arabic Multi-Modal Dataset (AMMD), and then applying different classifiers for feature extraction, feature fusion, and classification.
Type: Thesis or Dissertation

Files in This Item:
File  Description  Size  Format
Abdulrahman Alqarafi FINAL THESIS MAY2.pdf  Final Thesis 2146.66 MB  Adobe PDF  Under Embargo until 2030-01-02

Note: If any of the files in this item are currently embargoed, you can request a copy directly from the author by clicking the padlock icon above. However, this facility is dependent on the depositor still being contactable at their original email address.

This item is protected by original copyright

Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.

The metadata of the records in the Repository are available under the CC0 public domain dedication: No Rights Reserved

If you believe that any material held in STORRE infringes copyright, please contact providing details and we will remove the Work from public display in STORRE and investigate your claim.