With the emergence of social media, a large amount of structured and unstructured information is shared through sources such as Twitter and Facebook, and much of this data reflects user sentiment. The ability to process this information has become important for taking a deep dive into brand perception. Text analysis is a machine learning technique that derives high-quality information from text, used here to mine customer perception of a particular brand. In this article, we see how to study brand perception using Twitter sentiment analysis.
Brand perception is the result of a consumer’s experiences with a brand. The purpose of this article is to gauge the image and awareness of the brand among consumers: what they really think and feel about #Nike as a brand. For the study, we collected data from Twitter for brand perception and sentiment analysis.
Just Do It is a trademark of the shoe company Nike and one of the core components of Nike’s brand. The slogan was coined in 1988 at an advertising agency meeting.
PRE-PROCESSING AND CLEANING
Pre-processing the text can dramatically improve the Bag of Words method. The first step is creating a corpus, which in simple terms is a collection of text documents. Once the corpus is created, we are ready for pre-processing. First, let us remove punctuation. The basic approach is to remove everything that isn’t a standard number or letter.
In our case, we will remove all punctuation. Next, we convert the words to lowercase so that the same word is not counted twice because of differences in case. Another pre-processing task is to remove meaningless terms to improve our ability to understand sentiments. Transformations on text are done via the tm_map() function. Basically, each transformation works on a single text document, and tm_map() just applies it to all documents in a corpus.
- Remove URL: This removes URLs from the corpus documents.
- Remove punctuation: An external function is built and passed to tm_map() to transform the text corpus, removing unnecessary punctuation.
- Strip white-space: This trims extra white-space from the text corpus.
- Remove the @ (usernames): A few words in the corpus may be mail IDs or words starting with @; these are removed.
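The cleaning steps listed above can be sketched as plain functions on character vectors. This is a minimal base R sketch, not the article's original code: the helper names (removeURL, removeUsername, removePunct, squeezeSpace, cleanTweet), the regular expressions, and the sample tweets are all assumptions made for illustration. With tm, each helper could be wrapped in content_transformer() and passed to tm_map().

```r
# Hypothetical helpers; each takes and returns a character vector.
removeURL      <- function(x) gsub("http[^[:space:]]*", "", x)        # drop http/https links
removeUsername <- function(x) gsub("@[[:alnum:]_]+", "", x)           # drop @mentions
removePunct    <- function(x) gsub("[^[:alnum:][:space:]]", "", x)    # keep letters, digits, spaces
squeezeSpace   <- function(x) trimws(gsub("[[:space:]]+", " ", x))    # collapse and trim spaces

# URLs and usernames are removed before punctuation, because stripping
# punctuation first would destroy the "://" and "@" the patterns rely on.
cleanTweet <- function(x) {
  x <- removeURL(x)
  x <- removeUsername(x)
  x <- removePunct(x)
  squeezeSpace(x)
}

# Made-up sample tweets
tweets <- c("Love the new ad! @Nike #JustDoIt https://t.co/abc123",
            "  Not sure about this...   @someone  ")
cleaned <- cleanTweet(tweets)
print(cleaned)
```

The order of the steps matters more than the exact patterns, which would likely need tuning for real tweet data.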
The process of normalization involves transforming text uniformly.
# convert text to lowercase (stri_trans_tolower() is from the stringi package)
corpus <- tm_map(corpus, content_transformer(stri_trans_tolower))
writeLines(strwrap(corpus[]$content, 60))
Stopwords are common words that carry little meaning for the analysis. Looking at the output of stopwords("english") shows which words get removed.
# remove English stopwords (the language name is given in lowercase)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
inspect(corpus)
# remove single letter words
removeSingle <- function(x) gsub(" . ", " ", x)
corpus <- tm_map(corpus, content_transformer(removeSingle))
writeLines(strwrap(corpus[]$content, 60))
Once we have pre-processed our data, we are ready to extract the word frequencies in our Twitter data. The tm package provides a function called TermDocumentMatrix() that generates a matrix where the rows correspond to terms (the words in the tweets) and the columns correspond to documents, in our case tweets. The values in the matrix are the counts of how many times each word appears in each document. The transposed layout, with documents as rows and terms as columns, is produced by DocumentTermMatrix(). The TermDocumentMatrix() function from the tm text mining package is used as follows:
dtm <- TermDocumentMatrix(corpus)
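To make the layout of such a matrix concrete, here is a small base R sketch that builds a term-by-document count matrix by hand. It only illustrates the idea and is not how tm implements TermDocumentMatrix(); the three sample documents and the variable names are made up.

```r
# Made-up, already cleaned documents
docs <- c("nike just do it",
          "just do it nike nike",
          "believe in something")

tokens <- strsplit(docs, " ", fixed = TRUE)   # one character vector of words per document
terms  <- sort(unique(unlist(tokens)))        # the vocabulary

# Count every vocabulary term in every document.
# vapply() returns one column per document, so rows are terms.
tdm <- vapply(tokens,
              function(tk) as.integer(table(factor(tk, levels = terms))),
              integer(length(terms)))
dimnames(tdm) <- list(Terms = terms, Docs = paste0("doc", seq_along(docs)))

print(tdm)
```

Each cell holds the count of one term in one document, which is the same information that as.matrix() exposes for a tm term-document matrix.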
Most Frequent Terms
Based on the TermDocumentMatrix() output, we sorted the keywords by frequency. The most frequent word tweeted by users is justdoit, whereas trump is the least frequent word in the corpus.
Words with minimum frequencies of 55 and 85 are shown in the frequency count plots below.
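Sorting terms by frequency and applying a minimum-frequency cutoff can be sketched as follows. The counts are invented, and the threshold of 2 stands in for the much larger cutoffs that a real corpus would need; tdm, termFreq, minFreq and frequent are illustrative names.

```r
# Toy term-by-document counts (made-up numbers)
tdm <- matrix(c(3, 0, 2,
                1, 1, 0,
                0, 1, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("justdoit", "nike", "trump"),
                              c("doc1", "doc2", "doc3")))

# Total occurrences of each term across all documents, highest first
termFreq <- sort(rowSums(tdm), decreasing = TRUE)

# Keep only terms that occur at least minFreq times (hypothetical threshold)
minFreq  <- 2
frequent <- termFreq[termFreq >= minFreq]
print(frequent)

# barplot(frequent, las = 2)  # uncomment to draw a frequency count plot
```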
Here we try to find the associations between keywords. The word nike is strongly correlated with the terms believe, Christ, Jesus, Kaepernick, dems and run.
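One way to read "association" is as the correlation between two terms' counts across documents. The base R sketch below assumes that interpretation; the function assocWith(), the toy matrix, and the 0.5 cutoff are hypothetical and chosen only for illustration.

```r
# Toy term-by-document counts (made-up numbers): rows are terms
tdm <- matrix(c(1, 1, 0, 0, 1,
                1, 1, 0, 0, 1,
                0, 0, 1, 1, 0,
                1, 0, 0, 1, 1),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("nike", "believe", "run", "shoes"),
                              paste0("doc", 1:5)))

# Correlate one term's counts with every other term's counts and
# keep the terms whose correlation is at least corlimit.
assocWith <- function(m, term, corlimit) {
  r <- cor(t(m))[term, ]          # cor() works on columns, so transpose first
  r <- r[names(r) != term]        # drop the term's correlation with itself
  r <- sort(r, decreasing = TRUE)
  r[r >= corlimit]
}

assoc <- assocWith(tdm, "nike", 0.5)
print(assoc)
```

In this toy data, believe appears in exactly the same documents as nike, so it is the only term that passes the cutoff.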
Since many similar tweets are generated with #nike, it is challenging to draw meaningful interpretations from the huge volume of data being processed. We therefore try to cluster similar tweets together. Hierarchical clustering attempts to build clusters at different levels.
The R function hclust() performs hierarchical clustering using the agglomerative method. To do this, the corpus is converted to a document matrix. We used Ward’s method for hierarchical clustering. The dendrogram presented below shows the results of the hierarchical clustering.
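A minimal sketch of this clustering step on a toy matrix is shown below. The article does not say which Ward variant or distance measure was used, so "ward.D2" and Euclidean distance are assumptions here, as are the sample counts and the choice of two clusters.

```r
# Toy document-by-term counts (made-up numbers): rows are tweets
m <- matrix(c(2, 1, 0, 0,
              1, 2, 0, 0,
              0, 0, 2, 1,
              0, 0, 1, 2),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("tweet", 1:4),
                            c("nike", "justdoit", "kaepernick", "protest")))

d   <- dist(m)                         # Euclidean distance between tweets
fit <- hclust(d, method = "ward.D2")   # agglomerative clustering, Ward's criterion

groups <- cutree(fit, k = 2)           # cut the tree into two clusters
print(groups)

# plot(fit)  # uncomment to draw the dendrogram
```

Tweets that share vocabulary end up in the same group; cutting the tree at a different k gives a coarser or finer grouping.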
TWITTER SENTIMENT ANALYSIS
Sentiment analysis and opinion mining is the field of study that analyzes people’s opinions, sentiments, evaluations, attitudes, and emotions from written language. It is one of the most active research areas in natural language processing and artificial intelligence, and it is also widely studied in data mining, Web mining, and text mining.
Sentiments are categorized as Positive, Anticipation, Fear, Joy, Surprise and Negative.
This suggests that, out of 2,000 text records, the score for positive tweets is higher than for the other sentiments, at about 1,400, while about 1,300 keywords were scored as negative. A few sentiments could not be categorized.
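The article does not state which lexicon or package produced these scores. As an illustration of lexicon-based scoring in general, the base R sketch below counts words against a tiny hand-made lexicon; the lexicon entries, the scoreTweets() function, and the sample tweets are all hypothetical, and a real analysis would rely on a full emotion lexicon instead.

```r
# Tiny hypothetical lexicon: each word is mapped to one sentiment category
lexicon <- data.frame(
  word      = c("love", "great", "hope", "afraid", "boycott", "hate", "wow"),
  sentiment = c("positive", "positive", "anticipation", "fear",
                "negative", "negative", "surprise"),
  stringsAsFactors = FALSE
)
categories <- c("positive", "anticipation", "fear", "joy", "surprise", "negative")

# Count how many words across all tweets fall into each category.
# Words missing from the lexicon are ignored.
scoreTweets <- function(tweets) {
  words <- unlist(strsplit(tolower(tweets), "[^a-z]+"))
  hits  <- lexicon$sentiment[match(words, lexicon$word)]
  tab   <- table(factor(hits, levels = categories))
  setNames(as.integer(tab), categories)
}

# Made-up sample tweets
tweets <- c("Love the new ad, great move",
            "I hate this, boycott now",
            "Wow, hope it works")
scores <- scoreTweets(tweets)
print(scores)

# barplot(scores, las = 2)  # uncomment to plot the score per category
```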
Nike launched a “Just Do It” campaign for its 30th anniversary with controversial ex-NFL star Colin Kaepernick. The controversy around Kaepernick relates to his decision to kneel during the American national anthem played at the start of NFL games to protest police brutality against people of color. Kaepernick has been without an NFL contract for the past two seasons but is still seen as a polarizing figure. In launching this new campaign, Nike risks alienating a huge segment of its U.S. consumer base, perhaps as much as half. Why would they do that? Perhaps they think it will tighten the tribe with millennials, who tend to be involved in protest movements, particularly when political leaders and other authority figures are not aligned with their feelings and values.
So this campaign has scattered the parts of the Nike tribe made up of loyal American patriots and people who serve or have served in our armed forces, government, or institutions that rely closely on a healthy government and national reputation. These people see not standing for the national anthem at a sporting event as an outward sign of disrespect for the idea of America and all the sacrifices made in the name of the nation. They see the gestures taken by Colin Kaepernick as a sign of questionable character. They see his public gestures as inappropriate and out of place.
This latest version of “Just Do It” has generated a level of social debate that, over time, will foster greater social understanding of the risk Nike has taken with this campaign. For many urban and minority professional athletes, this campaign will draw them closer to the brand. They like that Nike is supporting individual athlete rights, acts of moral conscience, conviction and protest. For the league or the nation to criticize their freedom of speech or expression is viewed dimly. But the fact that we can have a dialog about the pros and cons of such an event and moment in our history speaks loudly about American values and human values. The current debate would probably make the founders of the United States proud.