The Olympics are seen as the pinnacle of international sporting achievement. Social media platforms such as Twitter give people around the world an effectively unlimited space to share sentiments, especially during sporting events. In this study, 84,832 unique tweets related to the Tokyo 2020 Olympics were scraped, and the most popular, effective, and efficient machine learning and deep learning techniques for sentiment analysis of those tweets were explored and analyzed. The results of 25 different feature extraction, machine learning, and deep learning pipelines were analyzed to assess the best-suited framework for the task. The best machine learning combination was Random Forest with either Bag of Words or Term Frequency-Inverse Document Frequency (TF-IDF), with an accuracy score of 0.963, and the best deep learning combination was Bidirectional Long Short-Term Memory (BiLSTM) with GloVe, with an accuracy score of 0.975.
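As a rough illustration of the two feature extraction schemes named above, the following minimal sketch builds Bag of Words and TF-IDF vectors in plain Python. The tokenizer, the `tf * log(N / df)` weighting, the function names, and the toy tweets are all assumptions made for illustration; the preprocessing and exact weighting used in the study are not stated here, and library implementations typically apply additional smoothing and normalization.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of documents."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors) using tf * log(N / df)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    vectors = []
    for row in counts:
        total = sum(row) or 1  # guard against empty documents
        vectors.append(
            [(c / total) * math.log(n_docs / df[j]) for j, c in enumerate(row)]
        )
    return vocab, vectors


# Hypothetical toy tweets, not taken from the study's dataset.
tweets = [
    "great race great finish",
    "terrible race",
    "great games",
]
vocab, bow = bag_of_words(tweets)
_, weights = tf_idf(tweets)
```

Either set of vectors could then be passed, together with sentiment labels, to a classifier such as a random forest; terms that appear in fewer tweets receive larger TF-IDF weights than terms shared across many tweets.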
