Project File Details


Original Author (Copyright Owner): 


File Type: MS Word (DOC) & PDF

File Size: 737KB

Number of Pages: 55

 

ABSTRACT

Sentiment analysis has proven to be one of the most challenging tasks in natural language processing
(NLP). Many AI systems have been developed that can detect the polarity of a sentence (its degree of
positivity, neutrality, or negativity), but more information, such as the emotion of the author, can also
be detected. Our task here is to build an artificial agent, or AI system, that is capable of detecting both
the polarity of a document and the emotion of its author. Generally speaking, sentiment analysis detects
the polarity of an opinion about the object or subject under discussion, whereas emotion detection can
identify the particular “mood” of the author. Social platforms such as Facebook, Twitter, IMDB, or the
comment sections of online newspapers, to name a few, can provide a huge corpus of content that is
usually unlabeled or extremely sparsely labelled. Using modern machine learning tools and the available
computational power, an agent can analyze this content, usually given as a list of messages, and detect
the general emotion “within the message”. It must also be capable of identifying subjects with an
unstable or “chaotic” mood that might require attention.
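The idea of scoring a list of messages and flagging authors whose mood swings widely can be illustrated with a minimal sketch. It is not the system built in this work, which relies on VaderSentiment, TextBlob, and SpaCy. The toy lexicon, the function names `polarity` and `mood_report`, and the instability threshold of 0.5 are all assumptions made purely for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical toy lexicon (word -> polarity in [-1, 1]); not taken from the project.
TOY_LEXICON = {
    "love": 0.8, "great": 0.7, "happy": 0.6, "good": 0.5,
    "bad": -0.5, "sad": -0.6, "awful": -0.7, "hate": -0.8,
}


def polarity(message):
    """Average lexicon score of the known words; 0.0 (neutral) if none match."""
    words = [w.strip(".,!?;:\"'").lower() for w in message.split()]
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return mean(scores) if scores else 0.0


def mood_report(messages, instability_threshold=0.5):
    """Summarise polarity per author.

    messages: iterable of (author, text) pairs.
    Returns {author: (mean_polarity, spread, is_unstable)}, where spread is the
    population standard deviation of that author's message polarities and
    is_unstable is True when the spread exceeds the (assumed) threshold.
    """
    by_author = defaultdict(list)
    for author, text in messages:
        by_author[author].append(polarity(text))

    report = {}
    for author, scores in by_author.items():
        spread = pstdev(scores)
        report[author] = (mean(scores), spread, spread > instability_threshold)
    return report


if __name__ == "__main__":
    sample = [
        ("alice", "I love this, great day!"),
        ("alice", "I hate this awful day."),
        ("bob", "Good movie."),
        ("bob", "Happy ending."),
    ]
    for author, (avg, spread, unstable) in mood_report(sample).items():
        print(author, round(avg, 2), round(spread, 2), unstable)
```

A word-list scorer like this ignores negation, intensity, and context, which is one reason a practical system would use a curated lexicon or a trained classifier instead.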
Our aim is to detect the underlying emotion as well as the polarity of a document. In this
process we will make use of open source libraries and publicly available data sets. The program will be
able to run locally, both in the geographic and in the cultural sense, and to analyze the results that were
obtained.
This work is the result of my own activity. I have neither given nor received unauthorized assistance on
this work.

 

 

TABLE OF CONTENTS

1 Introduction
2 Problem Statement
2.1 Objectives
2.2 Aim
2.3 Significance of the Study
3 Research methodology
3.1 Sentiment Analysis process
3.1.1 Data Collection
3.1.2 Text Preparation
3.1.3 Sentiment Detection
3.1.4 Sentiment Classification
3.1.5 Presentation of Output
3.2 Limitations
3.2.1 Delimitation of the study
3.3 Literature Review
3.3.1 Overview
3.3.2 Sentiment Analysis
3.3.3 Related work
3.3.4 Computation Science Techniques
3.3.5 Machine Learning Types
3.3.6 Examples of Machine Learning Algorithms
3.3.7 Maximum Entropy (supervised)
3.3.8 Support Vector Machines (supervised)
3.3.9 K-Means Clustering (unsupervised)
3.4 Studies on Sentiment Analysis Using the Naïve Bayes Classifier
3.4.1 Choosing the training set
3.4.2 Results
3.4.3 Strengths and Challenges of this Method
3.4.4 Studies on Term Frequency–Inverse Document Frequency
3.4.5 Studies Using Lexicon-Driven Methods
3.4.6 Studies Using Graph-Based Label Propagation
3.4.7 Studies on Predicting Users Actions from Sentiment
3.4.8 Non-Hierarchical Action Classifier
3.4.9 Hierarchical Classifier
3.5 Challenges or problems faced with the current systems
3.6 Analysis of the proposed system
3.7 Chapter Conclusion
4 Proposed Sentiment Analysis System
4.1 Methodology
4.1.1 Introduction
4.2 Lexicon (VaderSentiment)
4.2.1 Subjectivity and Objectivity (TextBlob)
4.2.2 Multi-label convolutional neural network text classifier (Spacy)
4.2.3 Training Dataset (ISEAR)
4.2.4 Streaming twitter (tweepy)
4.2.5 Text Processing document
4.2.6 Confusion matrix
5 Implementing Sentiment analysis
5.1 Introduction
5.1.1 Implementing (VaderSentiment and TextBlob)
5.1.2 Training SpaCy Text classifier
5.1.3 Streaming twitter API (Tweepy)
5.1.4 Evaluating SpaCy classification model
5.2 System design
5.2.1 Architectural Design
5.2.2 Sentiment Analysis (detection and classification)
5.2.3 Sequence Diagram
5.2.4 Use Case Diagrams
5.3 Development Tools
5.3.1 SpaCy
5.3.2 VaderSentiment
5.3.3 TextBlob
5.3.4 Jupyter Notebook
5.3.5 MatplotLib
5.4 Creating Twitter Application
5.5 Implications of findings
6 Recommendations, future work and conclusion
6.1 Recommendations
6.2 Future Work
7 Results and analysis
7.1 Data visualization
7.1.1 matplotlib and seaborn
8 Conclusion

 

CHAPTER ONE

 

Introduction
Sentiment analysis uses computational techniques to study people’s emotions and opinions on
given topics. In recent years the field has attracted a lot of attention from both academia and industry:
it poses many challenging research problems but has a wide range of applications.
Whenever we want to make a decision, we take the opinions of others into consideration; this
is what makes opinions important. Both individuals and organizations who want to know the opinions of
others benefit from sentiment analysis.
Prior to the web, no computational study of people’s opinions was being done. Opinionated text
did not exist in abundance. To get people’s opinions, one would typically use techniques such
as surveys or questionnaires, or simply ask friends or family members. Organizations that
wanted opinions about their services or products would typically use the same methods.
But due to the explosive growth of social media websites and mobile applications, opinionated content
on the web has grown rapidly. People can now share their opinions about almost anything
on blogs, comment sections, and social websites [Liu, 2010].
