Please use this identifier to cite or link to this item: http://hdl.handle.net/1893/6497
Full metadata record
DC Field Value Language
dc.contributor.advisor Amir, Hussain -
dc.contributor.advisor Catherine, Havasi -
dc.contributor.advisor Chris, Eckl -
dc.contributor.author Erik, Cambria -
dc.date.accessioned 2012-05-21T09:14:32Z -
dc.date.available 2012-05-21T09:14:32Z -
dc.date.issued 2011-12-16 -
dc.identifier.uri http://hdl.handle.net/1893/6497 -
dc.description.abstract The ways people express their opinions and sentiments have radically changed in the past few years thanks to the advent of social networks, web communities, blogs, wikis and other online collaborative media. The distillation of knowledge from this huge amount of unstructured information can be a key factor for marketers who want to create an image or identity in the minds of their customers for their product, brand, or organisation. These online social data, however, remain hardly accessible to computers, as they are specifically meant for human consumption. The automatic analysis of online opinions, in fact, involves a deep understanding of natural language text by machines, from which we are still very far. Hitherto, online information retrieval has been mainly based on algorithms relying on the textual representation of web-pages. Such algorithms are very good at retrieving texts, splitting them into parts, checking the spelling and counting their words. But when it comes to interpreting sentences and extracting meaningful information, their capabilities are known to be very limited. Existing approaches to opinion mining and sentiment analysis, in particular, can be grouped into three main categories: keyword spotting, in which text is classified into categories based on the presence of fairly unambiguous affect words; lexical affinity, which assigns arbitrary words a probabilistic affinity for a particular emotion; statistical methods, which calculate the valence of affective keywords and word co-occurrence frequencies on the basis of a large training corpus. Early works aimed to classify entire documents as containing overall positive or negative polarity, or rating scores of reviews. Such systems were mainly based on supervised approaches relying on manually labelled samples, such as movie or product reviews where the opinionist’s overall positive or negative attitude was explicitly indicated.
However, opinions and sentiments do not occur only at document level, nor are they limited to a single valence or target. Contrary or complementary attitudes toward the same topic or multiple topics can be present across the span of a document. In more recent works, text analysis granularity has been taken down to segment and sentence level, e.g., by using the presence of opinion-bearing lexical items (single words or n-grams) to detect subjective sentences, or by exploiting association rule mining for a feature-based analysis of product reviews. These approaches, however, are still far from being able to infer the cognitive and affective information associated with natural language as they mainly rely on knowledge bases that are still too limited to efficiently process text at sentence level. In this thesis, common sense computing techniques are further developed and applied to bridge the semantic gap between word-level natural language data and the concept-level opinions conveyed by these. In particular, the ensemble application of graph mining and multi-dimensionality reduction techniques on two common sense knowledge bases was exploited to develop a novel intelligent engine for open-domain opinion mining and sentiment analysis. The proposed approach, termed sentic computing, performs a clause-level semantic analysis of text, which allows the inference of both the conceptual and emotional information associated with natural language opinions and, hence, a more efficient passage from (unstructured) textual information to (structured) machine-processable data. The engine was tested on three different resources, namely a Twitter hashtag repository, a LiveJournal database and a PatientOpinion dataset, and its performance was compared both with results obtained using standard sentiment analysis techniques and with those obtained using different state-of-the-art knowledge bases such as Princeton’s WordNet, MIT’s ConceptNet and Microsoft’s Probase.
Unlike most currently available opinion mining services, the developed engine does not base its analysis on a limited set of affect words and their co-occurrence frequencies, but rather on common sense concepts and the cognitive and affective valence conveyed by these. This allows the engine to be domain-independent and, hence, to be embedded in any opinion mining system for the development of intelligent applications in multiple fields such as Social Web, HCI and e-health. Looking ahead, the combined novel use of different knowledge bases and of common sense reasoning techniques for opinion mining proposed in this work will eventually pave the way for the development of more bio-inspired approaches to the design of natural language processing systems capable of handling knowledge, retrieving it when necessary, making analogies and learning from experience. en_GB
dc.language.iso en en_GB
dc.publisher University of Stirling en_GB
dc.subject AI en_GB
dc.subject NLP en_GB
dc.subject opinion mining en_GB
dc.subject sentiment analysis en_GB
dc.subject KR en_GB
dc.subject HCI en_GB
dc.subject.lcsh Computational linguistics. en_GB
dc.subject.lcsh Human-machine systems en_GB
dc.subject.lcsh Human-computer interaction. en_GB
dc.title Application of Common Sense Computing for the Development of a Novel Knowledge-Based Opinion Mining Engine en_GB
dc.type Thesis or Dissertation en_GB
dc.type.qualificationlevel Doctoral en_GB
dc.type.qualificationname Doctor of Philosophy en_GB
dc.rights.embargodate 2012-12-31 -
dc.rights.embargoreason I require time to write articles for publication from my thesis en_GB
dc.author.email cambria.erik@gmail.com en_GB
dc.contributor.affiliation School of Natural Sciences en_GB
dc.contributor.affiliation Computing Science and Mathematics en_GB
Appears in Collections: Computing Science and Mathematics eTheses

Files in This Item:
File Description Size Format
thesis.pdf 7.14 MB Adobe PDF


This item is protected by original copyright
Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.

The metadata of the records in the Repository are available under the CC0 public domain dedication: No Rights Reserved https://creativecommons.org/publicdomain/zero/1.0/

If you believe that any material held in STORRE infringes copyright, please contact library@stir.ac.uk providing details and we will remove the Work from public display in STORRE and investigate your claim.