Sentiment Analysis

1297 papers with code • 39 benchmarks • 93 datasets

Sentiment Analysis is the task of classifying the polarity of a given text. For instance, a text-based tweet can be categorized into either "positive", "negative", or "neutral". Given the text and accompanying labels, a model can be trained to predict the correct sentiment.

Sentiment analysis techniques can be categorized into machine learning approaches, lexicon-based approaches, and hybrid methods. Subcategories of research in sentiment analysis include multimodal sentiment analysis, aspect-based sentiment analysis, fine-grained opinion analysis, and language-specific sentiment analysis.

More recently, deep learning models such as RoBERTa and T5 have been used to train high-performing sentiment classifiers, which are evaluated using metrics like F1, recall, and precision. Benchmark datasets such as SST, GLUE, and IMDB movie reviews are used to evaluate sentiment analysis systems.
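As a minimal sketch of how precision, recall, and F1 can be computed for a single sentiment class, consider the following. The label lists and the function name `precision_recall_f1` are hypothetical and made up for illustration; they do not come from any of the benchmarks named above.

```python
def precision_recall_f1(gold, predicted, target):
    """Compute precision, recall, and F1 for one target label."""
    pairs = list(zip(gold, predicted))
    # True positives: the model said `target` and the gold label agrees.
    tp = sum(1 for g, p in pairs if g == target and p == target)
    # False positives: the model said `target` but the gold label differs.
    fp = sum(1 for g, p in pairs if g != target and p == target)
    # False negatives: the gold label is `target` but the model missed it.
    fn = sum(1 for g, p in pairs if g == target and p != target)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical labels, for illustration only.
gold = ["positive", "negative", "neutral", "positive", "negative", "positive"]
predicted = ["positive", "negative", "positive", "positive", "neutral", "negative"]

precision, recall, f1 = precision_recall_f1(gold, predicted, "positive")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In a multi-class setting such as positive/negative/neutral, these per-class values are usually averaged across classes; the averaging scheme is a design choice that this sketch leaves out.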

Further readings:

  • Sentiment Analysis Based on Deep Learning: A Comparative Study

Most implemented papers

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.

Convolutional Neural Networks for Sentence Classification

We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks.

Universal Language Model Fine-tuning for Text Classification

Inductive transfer learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch.

Bag of Tricks for Efficient Text Classification

facebookresearch/fastText • EACL 2017

This paper explores a simple and efficient baseline for text classification.

RoBERTa: A Robustly Optimized BERT Pretraining Approach

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

A Structured Self-attentive Sentence Embedding

This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).

Deep contextualized word representations

We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy).

Well-Read Students Learn Better: On the Importance of Pre-training Compact Models

Recent developments in natural language representations have been accompanied by large and expensive models that leverage vast amounts of general-domain text through self-supervised pre-training.

Domain-Adversarial Training of Neural Networks

Our approach is directly inspired by the theory on domain adaptation suggesting that, for effective domain transfer to be achieved, predictions must be made based on features that cannot discriminate between the training (source) and test (target) domains.

Sentiment analysis, or opinion mining, is the process of analyzing large volumes of text to determine whether it expresses a positive sentiment, a negative sentiment or a neutral sentiment.

Companies now have access to more data about their customers than ever before, presenting both an opportunity and a challenge: analyzing the vast amounts of textual data available and extracting meaningful insights to guide their business decisions.

From emails and tweets to online survey responses, chats with customer service representatives and reviews, the sources available to gauge customer sentiment are seemingly endless. Sentiment analysis systems help companies better understand their customers, deliver stronger customer experiences and improve their brand reputation.

With more ways than ever for people to express their feelings online, organizations need powerful tools to monitor what’s being said about them and their products and services in near real time. As companies adopt sentiment analysis and begin using it to analyze more conversations and interactions, it will become easier to identify customer friction points at every stage of the customer journey.

Deliver more objective results from customer reviews

The latest artificial intelligence (AI) sentiment analysis tools help companies filter reviews and net promoter scores (NPS) for personal bias and get more objective opinions about their brand, products and services. For example, if a customer expresses a negative opinion along with a positive opinion in a review, a human assessing the review might label it negative before reaching the positive words. AI-enhanced sentiment classification helps sort and classify text in an objective manner, so this doesn’t happen, and both sentiments are reflected.  

Achieve greater scalability of business intelligence programs

Sentiment analysis enables companies with vast troves of unstructured data to analyze and extract meaningful insights from it quickly and efficiently. With the amount of text generated by customers across digital channels, it’s easy for human teams to get overwhelmed with information. Strong, cloud-based, AI-enhanced customer sentiment analysis tools help organizations deliver business intelligence from their customer data at scale, without expending unnecessary resources.

Perform real-time brand reputation monitoring

Modern enterprises need to respond quickly in a crisis. Opinions expressed on social media, whether true or not, can destroy a brand reputation that took years to build. Robust, AI-enhanced sentiment analysis tools help executives monitor the overall sentiment surrounding their brand so they can spot potential problems and address them swiftly.

Sentiment analysis uses natural language processing (NLP) and machine learning (ML) technologies to train computer software to analyze and interpret text in a way similar to humans. The software uses one of two approaches, rule-based or ML, or a combination of the two known as hybrid. Each approach has its strengths and weaknesses: a rule-based approach can deliver results in near real time, while ML-based approaches are more adaptable and can typically handle more complex scenarios.

Rule-based sentiment analysis

In the rule-based approach, software is trained to classify certain keywords in a block of text based on groups of words, or lexicons, that describe the author’s intent. For example, words in a positive lexicon might include “affordable,” “fast” and “well-made,” while words in a negative lexicon might feature “expensive,” “slow” and “poorly made.” The software then scans the text for the words in either the positive or negative lexicon and tallies up a total sentiment score based on the volume of words used and the sentiment score of each category.
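The tallying step can be sketched in a few lines of Python. This is a minimal illustration rather than a production scorer: the two lexicons contain only the example words from the paragraph above, every hit is assumed to weigh +1 or -1, and the names `count_hits`, `lexicon_score` and `classify` are hypothetical.

```python
import re

# Toy lexicons built from the example words above; real lexicons are far larger.
POSITIVE = {"affordable", "fast", "well-made"}
NEGATIVE = {"expensive", "slow", "poorly made"}


def count_hits(text, lexicon):
    """Count whole-term occurrences of every lexicon entry in the text."""
    lowered = text.lower()
    hits = 0
    for term in lexicon:
        # Lookarounds stop "fast" from matching inside "breakfast".
        pattern = r"(?<![\w-])" + re.escape(term) + r"(?![\w-])"
        hits += len(re.findall(pattern, lowered))
    return hits


def lexicon_score(text):
    """Positive hits minus negative hits (assumed weight of 1 per hit)."""
    return count_hits(text, POSITIVE) - count_hits(text, NEGATIVE)


def classify(text):
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = "The delivery was fast and the case is well-made, but the charger is expensive."
print(lexicon_score(review), classify(review))
```

Because the score depends only on which lexicon words appear, a scorer like this is quick to run but cannot account for word order or context.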

Machine learning sentiment analysis

With a machine learning (ML) approach, an algorithm is used to train software to gauge sentiment in a block of text using words that appear in the text as well as the order in which they appear. Developers use sentiment analysis algorithms to teach software how to identify emotion in text similarly to the way humans do. ML models continue to “learn” from the data they are fed, hence the name “machine learning”. Here are a few of the most commonly used classification algorithms:

Linear regression: A statistical algorithm that predicts a value (Y) from a set of features (X).

Naive Bayes: An algorithm that uses Bayes’ theorem to categorize words in a block of text.

Support vector machines: A fast and efficient classification algorithm used to solve two-group classification problems.

Deep learning (DL): An advanced machine learning technique built on artificial neural networks with multiple layers, loosely inspired by the way the human brain processes information.
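To make one of these algorithms concrete, here is a minimal sketch of a multinomial Naive Bayes sentiment classifier written with only the Python standard library. The four training sentences are invented for illustration, whitespace tokenization and add-one (Laplace) smoothing are assumptions, and the class name `NaiveBayesSentiment` is hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior: share of training documents carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                if token not in self.vocab:
                    continue  # skip words never seen during training
                count = self.word_counts[label][token]
                # Log likelihood with add-one smoothing.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training data, for illustration only.
texts = ["fast and affordable", "well made and fast",
         "slow and expensive", "poorly made and slow"]
labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayesSentiment().fit(texts, labels)
print(model.predict("fast and well made"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied together.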

The hybrid approach

A hybrid approach to text analysis combines both ML and rule-based capabilities to optimize accuracy and speed. While highly accurate, this approach requires more resources, such as time and technical capacity, than the other two.

In addition to the different approaches used to build sentiment analysis tools, there are also different types of sentiment analysis that organizations turn to depending on their needs. The three most popular types (emotion-based, fine-grained and aspect-based sentiment analysis, or ABSA) all rely on the underlying software’s capacity to gauge polarity, the overall feeling that is conveyed by a piece of text.

Generally speaking, a text’s polarity can be described as either positive, negative or neutral, but by categorizing the text even further, for example into subgroups such as “extremely positive” or “extremely negative,” some sentiment analysis models can identify more subtle and complex emotions. The polarity of a text is the most commonly used metric for gauging textual emotion and is expressed by the software as a numerical rating on a scale of zero to 100. Zero represents a neutral sentiment and 100 represents the most extreme sentiment.

Here are the three most widely used types of sentiment analysis:

Fine-grained (graded)

Fine-grained, or graded, sentiment analysis is a type of sentiment analysis that groups text into different emotions and the level of emotion being expressed. The emotion is then graded on a scale of zero to 100, similar to the way consumer websites deploy star-ratings to measure customer satisfaction.
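One way to picture graded polarity is as a mapping from a model's raw score to an intensity and a label. The sketch below assumes the underlying model emits a signed polarity between -1.0 and 1.0; that input range, the cut-off values of 20 and 70, and the function name `grade` are all assumptions chosen for illustration.

```python
def grade(polarity):
    """Map a signed polarity in [-1.0, 1.0] to (intensity 0-100, graded label)."""
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must lie in [-1.0, 1.0]")

    # Intensity follows the convention above: 0 is neutral, 100 is most extreme.
    intensity = round(abs(polarity) * 100)

    if intensity < 20:  # assumed threshold for "neutral"
        label = "neutral"
    else:
        strength = "extremely " if intensity >= 70 else ""  # assumed threshold
        direction = "positive" if polarity > 0 else "negative"
        label = strength + direction
    return intensity, label


for value in (0.85, -0.4, 0.05):
    print(value, grade(value))
```

The thresholds determine how many grades exist and where they begin, so in practice they would be tuned against labeled examples rather than fixed by hand.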

Aspect-based (ABSA)

Aspect-based sentiment analysis (ABSA) narrows the scope of what’s being examined in a body of text to a single aspect of a product, service or customer experience a business wishes to analyze. For example, a budget travel app might use ABSA to understand how intuitive a new user interface is or to gauge the effectiveness of a customer service chatbot. ABSA can help organizations better understand how their products are succeeding or falling short of customer expectations.

Emotional detection

Emotional detection sentiment analysis seeks to understand the psychological state of the individual behind a body of text, including their frame of mind when they were writing it and their intentions. It is more complex than either fine-grained or ABSA and is typically used to gain a deeper understanding of a person’s motivation or emotional state. Rather than using polarities, like positive, negative or neutral, emotional detection can identify specific emotions in a body of text such as frustration, indifference, restlessness and shock.

Organizations conduct sentiment analysis for a variety of reasons. Here are some of the most popular use cases.  

Support teams use sentiment analysis to deliver more personalized responses to customers that accurately reflect the mood of an interaction. AI-based chatbots that use sentiment analysis can spot problems that need to be escalated quickly and prioritize customers in need of urgent attention. ML algorithms deployed on customer support forums help rank topics by level-of-urgency and can even identify customer feedback that indicates frustration with a particular product or feature. These capabilities help customer support teams process requests faster and more efficiently and improve customer experience.

By using sentiment analysis to conduct social media monitoring, brands can better understand what is being said about them online and why. For example, is a new product launch going well? Monitoring sales is one way to know, but it will only show stakeholders part of the picture. Using sentiment analysis on customer review sites and social media to identify the emotions being expressed about the product will enable a far deeper understanding of how it is landing with customers.

By turning sentiment analysis tools on the market in general and not just on their own products, organizations can spot trends and identify new opportunities for growth. Maybe a competitor’s new campaign isn’t connecting with its audience the way they expected, or perhaps someone famous has used a product in a social media post, increasing demand. Sentiment analysis tools can help spot trends in news articles, online reviews and on social media platforms, and alert decision makers in real time so they can take action.

While sentiment analysis and the technologies underpinning it are growing rapidly, it is still a relatively new field. According to “Sentiment Analysis,” by Bing Liu (2020), the term has only been widely used since 2003 (see note 1). There is still much to be learned and refined. Here are some of the most common drawbacks and challenges.

Lack of context

Context is a critical component for understanding what emotion is being expressed in a block of text and one that frequently causes sentiment analysis tools to make mistakes. On a customer survey, for example, a customer might give two answers to the question: “What did you like about our app?” The first answer might be “functionality” and the second, “UX.” If the question being asked was different, for example, “What didn’t you like about our app?” it changes the meaning of the customer’s response without changing the words themselves. To correct this problem, the algorithm would need to be given the original context of the question the customer was responding to, a time-consuming tactic known as pre- or post-processing.

Use of irony and sarcasm

Regardless of the level or extent of its training, software has a hard time correctly identifying irony and sarcasm in a body of text. This is because often when someone is being sarcastic or ironic it’s conveyed through their tone of voice or facial expression and there is no discernable difference in the words they’re using. For example, when analyzing the phrase, “Awesome, another thousand-dollar parking ticket—just what I need,” a sentiment analysis tool would likely mistake the nature of the emotion being expressed and label it as positive because of the use of the word “awesome”.

Negation

Negation is when a negative word is used to convey a reversal of meaning in a sentence. For example, consider the sentence, “I wouldn’t say the shoes were cheap.” What’s being expressed is that the shoes were probably expensive, or at least moderately priced, but a sentiment analysis tool would likely miss this subtlety.
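A common heuristic for handling negation in a lexicon-based scorer is to flip the polarity of a sentiment word when a negating word appears shortly before it. The sketch below illustrates that heuristic only; the toy lexicon (which treats “cheap” as positive in the sense of affordable), the negator list, the five-token window and the function name `score_with_negation` are all assumptions.

```python
import re

# Toy lexicon with assumed polarities; "cheap" is treated as positive (affordable).
LEXICON = {"cheap": 1, "good": 1, "expensive": -1, "bad": -1}
NEGATORS = {"not", "no", "never", "wouldn't", "isn't", "wasn't", "don't"}


def score_with_negation(text, window=5):
    """Sum word polarities, flipping a word if a negator precedes it closely."""
    # Normalize curly apostrophes so "wouldn’t" matches the negator list.
    normalized = text.lower().replace("\u2019", "'")
    tokens = re.findall(r"[a-z']+", normalized)

    score = 0
    for i, token in enumerate(tokens):
        if token not in LEXICON:
            continue
        value = LEXICON[token]
        preceding = tokens[max(0, i - window):i]
        if any(word in NEGATORS for word in preceding):
            value = -value  # reverse polarity inside the negation window
        score += value
    return score


print(score_with_negation("I wouldn't say the shoes were cheap."))
print(score_with_negation("The shoes were cheap."))
```

A fixed window is crude: it can flip words the negation does not actually govern, which is one reason negation remains difficult for simple tools.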

Idiomatic language

Idiomatic language, such as the common English phrases “Let’s not beat around the bush” or “Break a leg,” frequently confounds sentiment analysis tools and the ML algorithms that they’re built on. When phrases like these are used on social media channels or in product reviews, sentiment analysis tools will either misidentify them (“break a leg,” for example, could be read as something painful or sad) or miss them completely.

Organizations who decide they want to deploy sentiment analysis to better understand their customers have two options for how they can go about it: either purchase an existing tool or build one of their own.

Businesses opting to build their own tool typically use an open-source library in a common coding language such as Python or Java. These libraries are useful because their communities are steeped in data science. Still, organizations looking to take this approach will need to make a considerable investment in hiring a team of engineers and data scientists.

Acquiring an existing software as a service (SaaS) sentiment analysis tool requires less initial investment and allows businesses to deploy a pre-trained machine learning model rather than create one from scratch. SaaS sentiment analysis tools can be up and running with just a few simple steps and are a good option for businesses who aren’t ready to make the investment necessary to build their own.

Today’s most effective customer support sentiment analysis solutions use the power of AI and ML to improve customer experiences. IBM watsonx Assistant is a market leading, conversational artificial intelligence platform powered by large language models (LLMs) that enables organizations to build AI-powered voice agents and chatbots that deliver superior automated self-service support to their customers on a simple, easy-to-use interface.

1 Liu, Bing. “Sentiment Analysis,” second edition. Cambridge University Press, September 23, 2020.

Sentiment Analysis: Recently Published Documents

Aspect-based Sentiment Analysis using Dependency Parsing

In this paper, an aspect-based Sentiment Analysis (SA) system for Hindi is presented. The proposed system assigns a separate sentiment to each aspect of a sentence and also evaluates the overall sentiment expressed in the sentence. In this work, a Hindi Dependency Parser (HDP) is used to determine the association between an aspect word and a sentiment word (using Hindi SentiWordNet), based on the idea that closely connected words come together to express a sentiment about a certain aspect. By generating a dependency graph, the system assigns each sentiment to the aspect at minimum distance from it and computes the overall polarity of the sentence. The system achieves an accuracy of 83.2% on a corpus of movie reviews, and its results are compared with baselines as well as existing works on SA. The results indicate that the proposed system has the potential to be used in emerging applications like SA of product reviews, social media analysis, etc.

Sentiment Analysis Applied to News from the Brazilian Stock Market

TRG-DAtt: The Target Relational Graph and Double Attention Network Based Sentiment Analysis and Prediction for Supporting Decision Making

The management of public opinion and the use of big data monitoring to accurately judge and verify all kinds of information are valuable aspects in the enterprise management decision-making process. The sentiment analysis of reviews is a key decision-making tool for e-commerce development. Most existing review sentiment analysis methods involve sequential modeling but do not focus on the semantic relationships. However, Chinese semantics are different from English semantics in terms of the sentence structure. Irrelevant contextual words may be incorrectly identified as cues for sentiment prediction. The influence of the target words in reviews must be considered. Thus, this paper proposes the TRG-DAtt model for sentiment analysis based on target relational graph (TRG) and double attention network (DAtt) to analyze the emotional information to support decision making. First, dependency tree-based TRG is introduced to independently and fully mine the semantic relationships. We redefine and constrain the dependency and use it as the edges to connect the target and context words. Second, we design dependency graph attention network (DGAT) and interactive attention network (IAT) to form the DAtt and obtain the emotional features of the target words and reviews. DGAT models the dependency of the TRG by aggregating the semantic information. Next, the target emotional enhancement features obtained by the DGAT are input to the IAT. The influence of each target word on the review can be obtained through the interaction. Finally, the target emotional enhancement features are weighted by the impact factor to generate the review's emotional features. In this study, extensive experiments were conducted on the car and Meituan review data sets, which contain consumer reviews on cars and stores, respectively. The results demonstrate that the proposed model outperforms the existing models.

A Comprehensive Guideline for Bengali Sentiment Annotation

Sentiment Analysis (SA) is a Natural Language Processing (NLP) and Information Extraction (IE) task that primarily aims to obtain the writer’s feelings, expressed as positive or negative, by analyzing a large number of documents. SA is also widely studied in the fields of data mining, web mining, text mining, and information retrieval. The fundamental task in sentiment analysis is to classify the polarity of a given content as Positive, Negative, or Neutral. Although extensive research has been conducted in this area of computational linguistics, most of the research work has been carried out in the context of the English language. However, Bengali sentiment expression has varying degrees of sentiment labels, which can be plausibly distinct from the English language. Therefore, sentiment assessment of the Bengali language is undeniably important to develop and execute properly. In sentiment analysis, the prediction potential of an automatic model is completely dependent on the quality of dataset annotation. Bengali sentiment annotation is a challenging task due to the diversified structures (syntax) of the language and its different degrees of innate sentiments (i.e., weakly and strongly positive/negative sentiments). Thus, in this article, we propose a novel and precise guideline for researchers, linguistic experts, and referees to annotate Bengali sentences immaculately, with a view to building effective datasets for automatic sentiment prediction.

Employee Sentiment Analysis Towards Remote Work during COVID-19 Using Twitter Data

Topic Modelling and Sentiment Analysis of Global Warming Tweets

With the increasing extreme weather events and various disasters, people are paying more attention to environmental issues than ever, particularly global warming. Public debate on it has grown on various platforms, including newspapers and social media. This paper examines the topics and sentiments of the discussion of global warming on Twitter over a span of 18 months using two big data analytics techniques—topic modelling and sentiment analysis. There are seven main topics concerning global warming frequently debated on Twitter: factors causing global warming, consequences of global warming, actions necessary to stop global warming, relations between global warming and Covid-19; global warming’s relation with politics, global warming as a hoax, and global warming as a reality. The sentiment analysis shows that most people express positive emotions about global warming, though the most evoked emotion found across the data is fear, followed by trust. The study provides a general and critical view of the public’s principal concerns and their feelings about global warming on Twitter.

Transparent Aspect-Level Sentiment Analysis Based on Dependency Syntax Analysis and Its Application on COVID-19

Aspect-level sentiment analysis identifies fine-grained emotion for target words. There are three major issues in current models of aspect-level sentiment analysis. First, few models consider the natural language semantic characteristics of the texts. Second, many models consider the location characteristics of the target words, but ignore the relationships among the target words and among the overall sentences. Third, many models lack transparency in data collection, data processing, and results generating in sentiment analysis. In order to resolve these issues, we propose an aspect-level sentiment analysis model that combines a bidirectional Long Short-Term Memory (LSTM) network and a Graph Convolutional Network (GCN) based on Dependency syntax analysis (Bi-LSTM-DGCN). Our model integrates the dependency syntax analysis of the texts, and explicitly considers the natural language semantic characteristics of the texts. It further fuses the target words and overall sentences. Extensive experiments are conducted on four benchmark datasets, i.e., Restaurant14, Laptop, Restaurant16, and Twitter. The experimental results demonstrate that our model outperforms other models like Target-Dependent LSTM (TD-LSTM), Attention-based LSTM with Aspect Embedding (ATAE-LSTM), LSTM+SynATT+TarRep and Convolution over a Dependency Tree (CDT). Our model is further applied to aspect-level sentiment analysis on “government” and “lockdown” of 1,658,250 tweets about “#COVID-19” that we collected from March 1, 2020 to July 1, 2020. The experimental results show that Twitter users’ positive and negative sentiments fluctuated over time. Through the transparency analysis in data collection, data processing, and results generating, we discuss the reasons for the evolution of users’ emotions over time based on the tweets and on our models.

Aspect Based Sentiment Analysis of Unlabeled Reviews Using Linguistic Rule Based LDA

In this digital era, people are very keen to share their feedback about any product, service, or current issue on social networks and other platforms. A careful analysis of this feedback can give a clear picture of what people think about a particular topic. This work proposes an almost unsupervised Aspect Based Sentiment Analysis approach for textual reviews. Latent Dirichlet Allocation, along with linguistic rules, is used for aspect extraction. Aspects are ranked based on their probability distribution values and then clustered into predefined categories using frequent terms with domain knowledge. The SentiWordNet lexicon is used for sentiment scoring and classification. The experiment with two popular datasets shows the superiority of our strategy as compared to existing methods. It achieves 85% average accuracy when tested on manually labeled data.

Measuring Citizen Satisfaction with E-Government Services by Using Sentiment Analysis Technology

Sentiment Analysis: Decoding Emotions for Research

Introduction

  • What is sentiment analysis?
  • What is an example of sentiment analysis?
  • Why is sentiment analysis important?
  • How do you collect sentiments?
  • How do you analyze sentiments?
  • What are the current challenges for sentiment analysis?

Sentiment analysis is the process of determining whether textual data contains a positive sentiment or a negative sentiment. Researchers use sentiment analysis tools to provide additional clarity and context to the messages conveyed in words to deliver more meaningful insights.

In this article, we'll look at the importance of sentiments, how researchers analyze sentiments, and what strategies and tools can help you in your research.

Sentiment analysis is a subset of natural language processing (NLP) that focuses on extracting and understanding the emotional content from data. The primary objective is to classify the polarity of a text as positive, negative, or neutral. This classification is essential for understanding customer sentiment, gauging public opinion, and conducting in-depth research on various topics.

At its core, a sentiment analysis system employs machine learning techniques and algorithms to dissect the language used in text data from many sources, such as:

  • written feedback
  • news articles
  • survey records
  • social media posts

One of the most refined forms of this method is aspect-based sentiment analysis. Rather than merely classifying the overall sentiment of a document, this kind of analysis pinpoints specific topics or aspects within the text and evaluates the sentiment towards each. Such sentiment analysis technologies with natural language processing can also be used for opinion mining.

A simple example

Consider a product review that states, "The camera on this phone is excellent, but the battery life is short." A sentiment analysis model would recognize the positive sentiment towards the camera and the negative sentiment towards the battery life, rather than giving a blanket sentiment score.
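A very small sketch of this per-aspect behavior is shown below. It assumes that splitting the review into clauses and matching keywords is enough, which holds for this example but not in general; the aspect keywords, the toy opinion lexicon (where "short" is negative only because the aspect is battery life) and the function name `aspect_sentiments` are hypothetical.

```python
import re

# Hypothetical aspect keywords and a toy opinion lexicon for phone reviews.
ASPECTS = {"camera": "camera", "battery": "battery life"}
OPINIONS = {"excellent": 1, "great": 1, "short": -1, "poor": -1}


def aspect_sentiments(review):
    """Return {aspect: 'positive' | 'negative'} using clause-level matching."""
    results = {}
    # Split on punctuation and the contrastive "but" so that each clause
    # ideally holds one aspect and one opinion.
    clauses = re.split(r"[,;.]|\bbut\b", review.lower())
    for clause in clauses:
        words = re.findall(r"[a-z]+", clause)
        score = sum(OPINIONS.get(word, 0) for word in words)
        if score == 0:
            continue  # no opinion words (or they cancel out) in this clause
        for keyword, aspect in ASPECTS.items():
            if keyword in words:
                results[aspect] = "positive" if score > 0 else "negative"
    return results


review = "The camera on this phone is excellent, but the battery life is short."
print(aspect_sentiments(review))
```

A blanket scorer would add +1 and -1 and report the review as neutral, which is exactly the loss of detail that aspect-level analysis avoids.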

Sentiment analysis tools are varied, ranging from simple models that identify positive and negative terms to sophisticated sentiment analysis models that rely on machine learning and data scientists for insightful sentiment analysis. Such tools work by assigning a sentiment score to words or phrases, often based on their context. The result? A sentiment analysis solution that deciphers the nuances of human language, turning unstructured data into actionable insights.

Ultimately, an accurate sentiment analysis bridges the gap between the vast world of text-based data and the need to understand the underlying emotions and opinions it contains. Whether you're a researcher looking to perform sentiment analysis on news articles or a business keen on understanding customer feedback, sentiment analysis is a pivotal tool in today's data-driven world.

Sentiment analysis has practical applications across diverse fields. From businesses striving to enhance their products to researchers aiming to grasp public sentiment on various issues, the power of sentiment analysis is evident.

By examining specific sectors, we can better understand the profound impact this analysis has on our decision-making processes and the vast potential it holds in shaping perceptions.

Market research

Conducting market research often consists of analyzing sentiment to gauge public reactions to a product or service. Using sentiment analysis tools, companies can sift through survey responses and online reviews, identifying patterns that might not be immediately apparent.

For example, if a new beverage receives predominantly positive reviews for its taste but negative comments about its packaging, this analytical approach can highlight these specific sentiments, guiding the company in refining its offering.

Customer feedback

Customer feedback is a goldmine of sentiment analysis datasets for businesses aiming to improve their services. By implementing a sentiment analysis system, companies can categorize feedback as positive, negative, or neutral, making it easier to prioritize areas for improvement.

Suppose a hotel chain discovers that a significant number of negative words in customer reviews pertain to room cleanliness. In that case, they can take immediate measures to address this concern, enhancing the overall guest experience.


Social media platforms

Social media is awash with opinions and feedback. By employing models for the analysis of sentiments, businesses and researchers can tap into real-time feelings of the masses.

For instance, if a celebrity endorses a brand and sentiment analysis reflects a surge in positive words associated with that brand, this suggests that the endorsement had a favorable impact. Conversely, if a political figure makes a statement and the analysis indicates a spike in negative words related to the topic, it provides insights into public opinion.


Sentiment analysis has rapidly become a crucial tool in today's digital age, helping businesses, researchers, and individuals decode the emotions hidden within vast amounts of data. But why has it garnered such significance?

The reasons are manifold, but they all converge on the idea that understanding sentiment offers a deeper, more nuanced view of human reactions and opinions.

Sentiment analysis use cases & applications

The applications of sentiment analysis are diverse and expansive. For instance, in the realm of politics, sentiment analysis can be used to gauge public opinion on policies or candidates, offering insights that can guide campaign strategies.

In the healthcare sector, sentiment analysis can capture patient feedback, allowing providers to fine-tune their services and improve patient experiences.

Moreover, educators can use sentiment analysis to understand student feedback, making curriculum adjustments that align with student needs and preferences.


Benefits of sentiment analysis

Beyond its various applications, the benefits of sentiment analysis are profound. Firstly, it offers an efficient way to process large volumes of unstructured data, turning it into actionable insights. Businesses, for example, can use sentiment analysis to get ahead of potential public relations crises by identifying negative sentiments early.

Furthermore, automated systems can circumvent the time-consuming task of manually reviewing each piece of feedback. This not only saves time but also reduces the risk of human bias.

Most significantly, by understanding both positive and negative phrases and their context, organizations can better align their strategies and offerings with their audience's true feelings and needs.


Collecting sentiments involves gathering data from various sources to be analyzed for emotional content. This task, while seemingly straightforward, requires a systematic approach to ensure that the data obtained is both relevant and of high quality.

One of the primary sources for sentiment collection is social media platforms. Platforms like Twitter, Facebook, and Instagram are brimming with user-generated content that reflects public opinion on a vast array of topics. By utilizing specialized web scraping tools or APIs provided by these platforms, one can amass large datasets of posts, comments, and reviews to analyze.


Customer reviews on e-commerce websites, such as Amazon or Yelp, are another treasure trove of sentiments. These reviews often provide detailed insights into customer sentiment about products, services, and overall brand perception. Similarly, survey responses, when designed with open-ended questions, can provide valuable data that captures the sentiments of the respondents.

In the news and media sector, news articles and op-eds are rich sources of sentiment. Collecting sentiments from these sources can help gauge public sentiment on current events, governmental decisions, or societal issues.

Forums and online communities, like Reddit or specialized industry forums, offer another avenue. Here, users often engage in in-depth discussions, providing nuanced views that are ripe for sentiment analysis.

However, while collecting sentiments, it's essential to consider privacy and ethical guidelines. Ensuring that data is anonymized and devoid of personally identifiable information is crucial. Moreover, always be aware of terms of service when extracting data from online platforms, as some might have restrictions on data scraping.

Analyzing sentiments is a multifaceted process that goes beyond merely identifying positive or negative words. It examines the context, nuances, and the intricate elements of human language. With advancements in machine learning and data science, this analysis has become more refined and precise.

Sentiment scores

At the foundation of this analytical approach lies the sentiment score. This score is usually a numerical value assigned to a piece of text, indicating its overall sentiment. For instance, a system to analyze sentiment might assign values on a scale from -1 (negative) to 1 (positive), with 0 representing a neutral sentiment. Sentiment scores provide a quick overview, enabling researchers and businesses to categorize large datasets swiftly.
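As a minimal sketch of how such a score might be turned into a category, the snippet below maps a value in [-1, 1] to a label. The function name `label_from_score` and the width of the neutral band (0.05) are assumptions made for illustration; real tools choose their own thresholds.

```python
def label_from_score(score: float, neutral_band: float = 0.05) -> str:
    """Map a sentiment score in [-1, 1] to a coarse label.

    neutral_band is a hypothetical threshold: scores whose absolute
    value is at most this number are treated as neutral.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must lie in [-1, 1]")
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


scores = [0.8, -0.4, 0.0, 0.03]
labels = [label_from_score(s) for s in scores]
print(labels)
```

A wider neutral band sends more borderline texts to the neutral category, which can be useful when the scores themselves are noisy.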

Sentiment analysis algorithms

Sentiment analysis can be powered by machine learning algorithms, natural language processing toolkits, or artificial neural networks. Approaches range from simple rule-based algorithms, which identify sentiments based on predefined lists of positive and negative words, to more complex machine learning techniques. Machine learning-based sentiment analysis models, especially those utilizing deep learning, consider the broader context in which words are used, leading to more advanced sentiment analysis.

Sentiment analysis tools

There's a plethora of tools available, each tailored for different requirements. Some tools are designed for specific industries, while others are more general-purpose. Many of these tools leverage advanced models, making it easier for users without a deep technical background to extract meaningful insights from textual data. The qualitative data analysis software ATLAS.ti, for example, includes a sentiment analysis tool to automatically code data.

Sentiment analysis, despite its transformative potential and growing adoption, is not without its share of challenges. The intricacies of language and emotion often pose complexities that even the most advanced systems can find challenging to navigate.

Sarcasm and irony: One of the most significant challenges is detecting sarcasm and irony. A statement like "Oh, great! Another flat tire!" may be classified as positive by rudimentary analysis models because of the word "great." However, the context clearly indicates a negative sentiment.

Cultural nuances: Cultural and regional variations in language can affect sentiment interpretation. A word or phrase that's considered positive in one culture might be neutral or even negative in another. Without a culturally-aware model, these nuances can easily be missed.

Short and ambiguous texts: Platforms like Twitter, with their character limitations, often contain short and sometimes ambiguous messages. Without ample context, determining the sentiment of such messages can be tricky.

Polysemy: Words with multiple meanings, based on context, can pose challenges. For instance, the word "light" can be positive when referring to a "light meal" but negative when talking about "light rain" during a planned outdoor event.

Emotionally complex statements: Some statements might contain mixed emotions, making them hard to classify. For example, "I love how this camera captures colors, but its weight is a bit much for me." This statement contains both positive and negative sentiments about the same product.

Evolution of language: Language is dynamic. New words, slang, and expressions constantly emerge, especially on digital platforms. Keeping sentiment analysis tools updated to recognize and correctly interpret these new terms is a continual challenge.

Addressing these challenges requires a combination of improved algorithms, larger and more diverse training datasets, and a deeper understanding of linguistics and cultural contexts. As technology advances and sentiment analysis solutions become more sophisticated, the hope is that these challenges will diminish, leading to even more accurate and insightful outcomes.


Sentiment Analysis: A Definitive Guide

What is sentiment analysis, sentiment analysis examples.

  • How Does It Work?
  • Sentiment Analysis Tools

Emojis as representations of sentiments: positive, neutral, and negative to show how sentiment analysis works. Text reads 'my experience so far has been fantastic!' (positive), 'The product is ok, I guess' (neutral), and 'your support team is useless' (negative).

Sentiment analysis (or opinion mining) is a natural language processing (NLP) technique used to determine whether data is positive, negative or neutral. Sentiment analysis is often performed on textual data to help businesses monitor brand and product sentiment in customer feedback, and understand customer needs.


Learn more about how sentiment analysis works, its challenges, and how you can use sentiment analysis to improve processes, decision-making, customer satisfaction and more.

Once you’re familiar with the basics, get started with easy-to-use sentiment analysis tools that are ready to use right off the bat.

Types of Sentiment Analysis

Why is sentiment analysis important.

  • Sentiment Analysis Examples & Break Down of Trustpilot Reviews

How Does Sentiment Analysis Work?

Sentiment analysis challenges.

  • Sentiment Analysis Applications

Sentiment Analysis Tools & Tutorials

Sentiment analysis research & courses.

The Basics of Sentiment Analysis

Sentiment analysis is the process of detecting positive or negative sentiment in text. It’s often used by businesses to detect sentiment in social data, gauge brand reputation, and understand customers.

Sentiment analysis focuses on the polarity of a text (positive, negative, neutral) but it also goes beyond polarity to detect specific feelings and emotions (angry, happy, sad, etc.), urgency (urgent, not urgent) and even intentions (interested vs. not interested).

Depending on how you want to interpret customer feedback and queries, you can define and tailor your categories to meet your sentiment analysis needs. In the meantime, here are some of the most popular types of sentiment analysis:

Graded Sentiment Analysis

If polarity precision is important to your business, you might consider expanding your polarity categories to include different levels of positive and negative:

  • Very positive
  • Positive
  • Neutral
  • Negative
  • Very negative

This is usually referred to as graded or fine-grained sentiment analysis, and could be used to interpret 5-star ratings in a review, for example:

  • Very Positive = 5 stars
  • Very Negative = 1 star
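To make the star-rating interpretation concrete, here is a small sketch of a lookup from stars to graded labels. Only the 5-star and 1-star mappings come from the text above; the labels assigned to 2, 3, and 4 stars are an assumption, and the name `grade` is hypothetical.

```python
# 5 and 1 follow the text; 4, 3 and 2 are assumed intermediate grades.
STAR_TO_LABEL = {
    5: "very positive",
    4: "positive",
    3: "neutral",
    2: "negative",
    1: "very negative",
}


def grade(stars: int) -> str:
    """Return the graded sentiment label for a 1-5 star rating."""
    try:
        return STAR_TO_LABEL[stars]
    except KeyError:
        raise ValueError("stars must be an integer from 1 to 5") from None


print([grade(s) for s in (5, 3, 1)])
```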

Emotion detection

Emotion detection sentiment analysis allows you to go beyond polarity to detect emotions, like happiness, frustration, anger, and sadness.

Many emotion detection systems use lexicons (i.e. lists of words and the emotions they convey) or complex machine learning algorithms.

One of the downsides of using lexicons is that people express emotions in different ways. Some words that typically express anger, like "bad" or "kill" (e.g. "your product is so bad" or "your customer support is killing me") might also express happiness (e.g. "this is bad ass" or "you are killing it").

Aspect-based Sentiment Analysis

Usually, when analyzing sentiments of texts you’ll want to know which particular aspects or features people are mentioning in a positive, neutral, or negative way.

That's where aspect-based sentiment analysis can help. For example, given the product review "The battery life of this camera is too short", an aspect-based classifier would be able to determine that the sentence expresses a negative opinion about the battery life of the product in question.

Multilingual sentiment analysis

Multilingual sentiment analysis can be difficult. It involves a lot of preprocessing and resources. Most of these resources are available online (e.g. sentiment lexicons), while others need to be created (e.g. translated corpora or noise detection algorithms), but you’ll need to know how to code to use them.

Alternatively, you could detect language in texts automatically with a language classifier, then train a custom sentiment analysis model to classify texts in the language of your choice.

Since humans express their thoughts and feelings more openly than ever before, sentiment analysis is fast becoming an essential tool to monitor and understand sentiment in all types of data.

Automatically analyzing customer feedback, such as opinions in survey responses and social media conversations, allows brands to learn what makes customers happy or frustrated, so that they can tailor products and services to meet their customers’ needs.

For example, using sentiment analysis to automatically analyze 4,000+ open-ended responses in your customer satisfaction surveys could help you discover why customers are happy or unhappy at each stage of the customer journey.

Maybe you want to track brand sentiment so you can detect disgruntled customers immediately and respond as soon as possible. Maybe you want to compare sentiment from one quarter to the next to see if you need to take action. Then you could dig deeper into your qualitative data to see why sentiment is falling or rising.

The overall benefits of sentiment analysis include:

  • Sorting Data at Scale

Can you imagine manually sorting through thousands of tweets, customer support conversations, or surveys? There’s just too much business data to process manually. Sentiment analysis helps businesses process huge amounts of unstructured data in an efficient and cost-effective way.

  • Real-Time Analysis

Sentiment analysis can identify critical issues in real time. For example, is a PR crisis on social media escalating? Is an angry customer about to churn? Sentiment analysis models can help you immediately identify these kinds of situations, so you can take action right away.

  • Consistent criteria

It’s estimated that people only agree around 60-65% of the time when determining the sentiment of a particular text. Tagging text by sentiment is highly subjective, influenced by personal experiences, thoughts, and beliefs.

By using a centralized sentiment analysis system, companies can apply the same criteria to all of their data, helping them improve accuracy and gain better insights.

The applications of sentiment analysis are endless. So, to help you understand how sentiment analysis could benefit your business, let’s take a look at some examples of texts that you could analyze using sentiment analysis.

Then, we’ll jump into a real-world example of how Chewy, a pet supplies company, was able to gain a much more nuanced (and useful!) understanding of their reviews through the application of sentiment analysis.

To understand the goal and challenges of sentiment analysis, here are some examples:

Basic examples of sentiment analysis data

  • Netflix has the best selection of films
  • Hulu has a great UI
  • I dislike the new crime series
  • I hate waiting for the next series to come out

More challenging examples of sentiment analysis

  • I do not dislike horror movies. (phrase with negation)
  • Disliking horror movies is not uncommon. (negation, inverted word order)
  • Sometimes I really hate the show. (adverbial modifies the sentiment)
  • I love having to wait two months for the next series to come out! (sarcasm)
  • The final episode was surprising with a terrible twist at the end. (negative term used in a positive way)
  • The film was easy to watch but I would not recommend it to my friends. (difficult to categorize)
  • I LOL’d at the end of the cake scene. (often hard to understand new terms)

Now, let’s take a look at some real reviews on Trustpilot and see how MonkeyLearn’s sentiment analysis tools fare when it comes to recognizing and categorizing sentiment.

Case Study: Sentiment analysis on TrustPilot Reviews

Chewy is a pet supplies company – an industry with no shortage of competition, so providing a superior customer experience (CX) to their customers can be a massive difference maker.

For this reason, online reviews can be an extremely valuable source of information to gain customer insights to improve their CX. Chewy has thousands of reviews on TrustPilot. This is what their review archive looks like:

The overall star rating of Chewy reviews on Trustpilot: 3.6 star rating from over 9,000 reviews, with 82% leaving a score of 'excellent'.

Via TrustPilot

It is easy to draw a general conclusion about Chewy’s relative success from this alone - 82% of responses being excellent is a great starting place.

But TrustPilot’s results alone fall short if Chewy’s goal is to improve its services. This perfunctory overview fails to provide actionable insight, the cornerstone and end goal of effective sentiment analysis.

If Chewy wanted to unpack the what and why behind their reviews, in order to further improve their services, they would need to analyze each and every negative review at a granular level.

But with sentiment analysis tools, Chewy could plug in their 5,639 (at the time) TrustPilot reviews to gain instant sentiment analysis insights.

We uploaded and analyzed Chewy’s reviews to MonkeyLearn’s all-in-one data analysis and visualization studio to generate the following dashboard:

MonkeyLearn data visualization dashboard showing how reviews have been filtered by topic and sentiment. An example of aspect-based sentiment analysis in the form of graphs, pie charts, word clouds, and tagged data.

Chewy TrustPilot Reviews Sample

Feel free to click this link to peruse the results at your leisure - as this sample dashboard is a public demo, you can click through and explore the inputs and filters at work yourself.

While there is a ton more to explore, in this breakdown we are going to focus on four sentiment analysis data visualization results that the dashboard has visualized for us.

  • Overall Sentiment
  • Sentiment over Time
  • Sentiment by Rating
  • Sentiment by Topic

1. Overall sentiment

We’ll begin by pulling the relevant graphic from the above dashboard. 

Overall sentiment of Chewy's reviews, split by positive (38.2%), negative (40.8%), and neutral (21%) sentiment in a pie chart.

You’ll notice that these results are very different from TrustPilot’s overview (82% excellent, etc). This is because MonkeyLearn’s sentiment analysis AI performs advanced sentiment analysis, parsing through each review sentence by sentence, word by word. 

What you are left with is an accurate assessment of everything customers have written, rather than a simple tabulation of stars. This analysis can point you towards friction points much more accurately and in much more detail. 

Read up on the mechanics of how sentiment analysis works below.

2. Sentiment over time

Here’s our handy-dandy sentiment over time graph, blown up:


This data visualization sample is classic temporal datavis, a datavis type that tracks results and plots them over a period of time.

This graph expands on our Overall Sentiment data - it tracks the overall proportion of positive, neutral, and negative sentiment in the reviews from 2016 to 2021.

This graph shows the gradual change in the content of their written reviews over this five-year period. For instance, negative responses went down from 2019-2020, then jumped back up to previous levels in 2021.
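If you want to build this kind of view from your own data, the following sketch computes the share of each sentiment label per year from already-classified reviews. The function name `sentiment_share_by_year` and the sample rows are invented for illustration and are not Chewy's data.

```python
from collections import Counter, defaultdict

LABELS = ("positive", "neutral", "negative")


def sentiment_share_by_year(reviews):
    """Return {year: {label: share}} from (year, label) pairs."""
    counts = defaultdict(Counter)
    for year, label in reviews:
        counts[year][label] += 1
    shares = {}
    for year in sorted(counts):
        total = sum(counts[year].values())
        # A Counter returns 0 for labels that never occurred that year.
        shares[year] = {label: counts[year][label] / total for label in LABELS}
    return shares


# Hypothetical, already-labeled reviews.
sample = [
    (2019, "negative"), (2019, "positive"),
    (2020, "positive"), (2020, "positive"),
    (2020, "neutral"), (2020, "negative"),
]
shares = sentiment_share_by_year(sample)
for year, share in shares.items():
    print(year, share)
```

Plotting each label's share against the year gives the stacked temporal view described above.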

3. Sentiment by rating

The number of reviews and proportion of sentiment broken down by rating.

Now we jump to something that anchors our text-based sentiment to TrustPilot’s earlier results.

By taking each TrustPilot category from 1-Bad to 5-Excellent and breaking down the text of the written reviews by score, you can derive the above graphic.

Looking at the results, and courtesy of taking a deeper look at the reviews via sentiment analysis, we can draw a couple interesting conclusions right off the bat.

  • TrustPilot’s results aren’t useless - the better reviews have higher proportions of positive sentiment and the worse reviews have more negative sentiment. But all reviews contain a little bit of all types of sentiment - we’ve learned that our reviews are nuanced and thus likely have even more hidden insight for us!
  • Our reviews are polarized. The ratings cluster at the two extremes, 5 and 1.

These quick takeaways point us towards goldmines for future analysis. Namely, the positive sentiment sections of negative reviews and the negative section of positive ones, and the 2 - 4 reviews (why do they feel the way they do, how could we improve their scores?). 

4. Sentiment by Topic

Number of reviews and proportion of sentiment broken down by topics: customer support, shipping, product, pricing, and website

Finally, we can take a look at Sentiment by Topic to begin to illustrate how sentiment analysis can take us even further into our data.

The above chart applies product-linked text classification in addition to sentiment analysis to pair a given sentiment with product- or service-specific features. This is known as aspect-based sentiment analysis.

This means we can know how our customers feel about what, helping us zero in and fix specific pain points or issues. 
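To give a rough feel for the idea, here is a deliberately simple sketch that pairs a topic with a sentiment for each sentence of a review. It uses keyword matching and a tiny word list, both invented for this example; it is not how MonkeyLearn's models work, and a production system would use trained classifiers for both steps.

```python
import re

# Hypothetical keyword lists, chosen only for this illustration.
TOPIC_KEYWORDS = {
    "shipping": {"shipping", "delivery", "arrived"},
    "customer support": {"support", "agent", "service"},
    "pricing": {"price", "prices", "expensive", "cheap"},
}
POSITIVE = {"fast", "great", "helpful", "good"}
NEGATIVE = {"slow", "rude", "expensive", "late", "bad"}


def aspect_sentiment(review):
    """Return (topic, sentiment) pairs, one per topic mentioned in a sentence."""
    results = []
    for sentence in re.split(r"[.!?]+", review.lower()):
        words = set(re.findall(r"[a-z']+", sentence))
        if not words:
            continue
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        for topic, keywords in TOPIC_KEYWORDS.items():
            if words & keywords:
                results.append((topic, label))
    return results


pairs = aspect_sentiment("Shipping was fast. The support agent was rude!")
print(pairs)
```

Counting these pairs per topic yields the kind of breakdown shown in the chart.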

These are all great jumping off points designed to visually demonstrate the value of sentiment analysis - but they only scratch the surface of its true power.

Read on for a step-by-step walkthrough of how sentiment analysis works.

How Does Sentiment Analysis Work?

Sentiment analysis, otherwise known as opinion mining, uses natural language processing (NLP) and machine learning algorithms to automatically determine the emotional tone behind online conversations.

There are different algorithms you can implement in sentiment analysis models, depending on how much data you need to analyze, and how accurate you need your model to be. We’ll go over some of these in more detail, below.

Sentiment analysis algorithms fall into one of three buckets:

  • Rule-based systems perform sentiment analysis based on a set of manually crafted rules.
  • Automatic systems rely on machine learning techniques to learn from data.
  • Hybrid systems combine both rule-based and automatic approaches.

Rule-based Approaches

Usually, a rule-based system uses a set of human-crafted rules to help identify subjectivity, polarity, or the subject of an opinion.

These rules may include various NLP techniques developed in computational linguistics, such as:

  • Stemming, tokenization, part-of-speech tagging and parsing.
  • Lexicons (i.e. lists of words and expressions).

Here’s a basic example of how a rule-based system works:

  • Defines two lists of polarized words (e.g. negative words such as "bad", "worst", "ugly", etc. and positive words such as "good", "best", "beautiful", etc.).
  • Counts the number of positive and negative words that appear in a given text.
  • If the number of positive word appearances is greater than the number of negative word appearances, the system returns a positive sentiment, and vice versa. If the numbers are even, the system will return a neutral sentiment.
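The three steps above can be sketched in a few lines of Python. The word lists reuse the examples from the text; the function name `rule_based_sentiment` and the tokenizing regular expression are assumptions made for this illustration.

```python
import re

# Step 1: two lists of polarized words (taken from the examples above).
POSITIVE_WORDS = {"good", "best", "beautiful"}
NEGATIVE_WORDS = {"bad", "worst", "ugly"}


def rule_based_sentiment(text):
    """Classify text by comparing counts of positive and negative words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Step 2: count appearances of each kind of word.
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    # Step 3: compare the counts; a tie is treated as neutral.
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"


print(rule_based_sentiment("The best and most beautiful UI"))
print(rule_based_sentiment("Ugly design, bad support, good price"))
print(rule_based_sentiment("This is not good"))
```

The last call returns "positive" even though the sentence is negative, because the counter sees "good" and ignores "not". That is exactly the weakness of ignoring word order.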

Rule-based systems are very naive since they don't take into account how words are combined in a sequence. Of course, more advanced processing techniques can be used, and new rules added to support new expressions and vocabulary. However, adding new rules may affect previous results, and the whole system can get very complex. Since rule-based systems often require fine-tuning and maintenance, they’ll also need regular investments.

Automatic Approaches

Automatic methods, contrary to rule-based systems, don't rely on manually crafted rules, but on machine learning techniques. A sentiment analysis task is usually modeled as a classification problem, whereby a classifier is fed a text and returns a category, e.g. positive, negative, or neutral.

Here’s how a machine learning classifier can be implemented:


The Training and Prediction Processes

In the training process (a), our model learns to associate a particular input (i.e. a text) with the corresponding output (tag) based on the labeled samples used for training. The feature extractor transforms the text input into a feature vector. Pairs of feature vectors and tags (e.g. positive, negative, or neutral) are fed into the machine learning algorithm to generate a model.

In the prediction process (b), the feature extractor is used to transform unseen text inputs into feature vectors. These feature vectors are then fed into the model, which generates predicted tags (again, positive, negative, or neutral).

Feature Extraction from Text

The first step in a machine learning text classifier is to transform the text into a numerical representation, a step known as feature extraction or text vectorization. The classical approach has been bag-of-words or bag-of-ngrams with their frequencies.

More recently, new feature extraction techniques have been applied based on word embeddings (also known as word vectors). This kind of representation makes it possible for words with similar meaning to have a similar representation, which can improve the performance of classifiers.
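As a sketch of the classical bag-of-ngrams approach mentioned above, the code below counts unigrams and bigrams and turns each text into a vector of counts over a shared vocabulary. The helper names `ngram_counts` and `vectorize` are hypothetical, and the tokenizer is a simple regular expression chosen for brevity.

```python
import re
from collections import Counter


def ngram_counts(text, n_max=2):
    """Count every n-gram of length 1..n_max in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def vectorize(text, vocabulary, n_max=2):
    """Represent the text as a list of counts aligned with the vocabulary."""
    counts = ngram_counts(text, n_max)
    return [counts[term] for term in vocabulary]


docs = ["not good at all", "good good service"]
# The vocabulary is the sorted set of all n-grams seen in the documents.
vocabulary = sorted(set().union(*(ngram_counts(d) for d in docs)))
vectors = [vectorize(d, vocabulary) for d in docs]
print(vocabulary)
print(vectors)
```

Including bigrams lets the representation keep "not good" as a single feature, something a pure bag-of-words would lose.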

Classification Algorithms

The classification step usually involves a statistical model like Naïve Bayes, Logistic Regression, Support Vector Machines, or Neural Networks:

  • Naïve Bayes: a family of probabilistic algorithms that uses Bayes’s Theorem to predict the category of a text.
  • Logistic Regression: a well-known statistical model that predicts the probability of a category (Y) given a set of features (X).
  • Support Vector Machines: a non-probabilistic model which uses a representation of text examples as points in a multidimensional space. Examples of different categories (sentiments) are mapped to distinct regions within that space. Then, new texts are assigned a category based on similarities with existing texts and the regions they’re mapped to.
  • Deep Learning: a diverse set of algorithms that attempt to mimic the human brain, by employing artificial neural networks to process data.
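To show how one of these classifiers fits the training and prediction processes, here is a minimal multinomial Naïve Bayes written in plain Python. It is a teaching sketch under simple assumptions (word counts as features, add-one smoothing, a four-sentence made-up training set); the class and method names are hypothetical, and real projects would normally rely on an established library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    def fit(self, texts, labels):
        """Training: count documents per class and words per class."""
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        """Prediction: pick the class with the highest log-probability."""
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            logp = math.log(n_docs / total_docs)  # class prior
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                # Add-one (Laplace) smoothing avoids zero probabilities.
                logp += math.log((count + 1) / (total_words + len(self.vocab)))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Made-up training data for illustration only.
train_texts = [
    "great product, love it",
    "fantastic support",
    "terrible service",
    "useless and slow support",
]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("love the fantastic product"))
print(model.predict("terrible and slow"))
```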

Hybrid Approaches

Hybrid systems combine the desirable elements of rule-based and automatic techniques into one system. One huge benefit of these systems is that results are often more accurate.

Sentiment analysis is one of the hardest tasks in natural language processing because even humans struggle to analyze sentiments accurately.

Data scientists are getting better at creating more accurate sentiment classifiers, but there’s still a long way to go. Let’s take a closer look at some of the main challenges of machine-based sentiment analysis:

  • Subjectivity & Tone
  • Context & Polarity
  • Irony & Sarcasm
  • Comparisons
  • Emojis
  • Defining Neutral
  • Human Annotator Accuracy

Subjectivity and Tone

There are two types of text: subjective and objective. Objective texts do not contain explicit sentiments, whereas subjective texts do. Say, for example, you intend to analyze the sentiment of the following two texts:

The package is nice.

The package is red.

Most people would say that sentiment is positive for the first one and neutral for the second one, right? All predicates (adjectives, verbs, and some nouns) should not be treated the same with respect to how they create sentiment. In the examples above, "nice" is more subjective than "red".

Context and Polarity

All utterances are uttered at some point in time, in some place, by and to some people, you get the point. All utterances are uttered in context. Analyzing sentiment without context gets pretty difficult. However, machines cannot learn about contexts if they are not mentioned explicitly. One of the problems that arise from context is changes in polarity . Look at the following responses to a survey:

Everything about it.

Absolutely nothing!

Imagine the responses above come from answers to the question "What did you like about the event?" The first response would be positive and the second one would be negative, right? Now, imagine the responses come from answers to the question "What did you DISlike about the event?" The negative in the question will make sentiment analysis change altogether.

A good deal of preprocessing or postprocessing will be needed if we are to take into account at least part of the context in which texts were produced. However, how to preprocess or postprocess data in order to capture the bits of context that will help analyze sentiment is not straightforward.
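One very rough form of postprocessing is sketched below: if the survey question is negatively framed, flip the polarity assigned to the answer. This assumes the answer was labeled as if it responded to a positively framed question, and the cue list and function name are invented for the example. It captures only this one kind of context and is not a general solution.

```python
FLIP = {"positive": "negative", "negative": "positive", "neutral": "neutral"}

# Hypothetical cues that mark a negatively framed question.
NEGATIVE_QUESTION_CUES = ("dislike", "hate", "worst")


def adjust_for_question(answer_label, question):
    """Flip the answer's polarity when the question is negatively framed."""
    lowered = question.lower()
    if any(cue in lowered for cue in NEGATIVE_QUESTION_CUES):
        return FLIP[answer_label]
    return answer_label


# "Everything about it." labeled positive relative to a "like" question.
print(adjust_for_question("positive", "What did you like about the event?"))
print(adjust_for_question("positive", "What did you DISlike about the event?"))
```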

Irony and Sarcasm

When it comes to irony and sarcasm, people express their negative sentiments using positive words, which can be difficult for machines to detect without having a thorough understanding of the context of the situation in which a feeling was expressed.

For example, look at some possible answers to the question, "Did you enjoy your shopping experience with us?"

Yeah, sure. So smooth!

Not one, but many!

What sentiment would you assign to the responses above? The first response with an exclamation mark could be negative, right? The problem is there is no textual cue that will help a machine learn, or at least question that sentiment since "yeah" and "sure" often belong to positive or neutral texts.

How about the second response? In this context, sentiment is positive, but we’re sure you can come up with many different contexts in which the same response can express negative sentiment.

How to treat comparisons in sentiment analysis is another challenge worth tackling. Look at the texts below:

This product is second to none.

This is better than older tools.

This is better than nothing.

The first comparison doesn’t need any contextual clues to be classified correctly. It’s clear that it’s positive.

The second and third texts are a little more difficult to classify, though. Would you classify them as neutral, positive, or even negative? Once again, context can make a difference. For example, if the ‘older tools’ in the second text were considered useless, then the second text is pretty similar to the third text.

According to Guibon et al., there are two types of emojis. Western emojis (e.g. :D) are encoded in only one or two characters, whereas Eastern emojis (e.g. ¯\_(ツ)_/¯) are a longer combination of characters of a vertical nature. Emojis play an important role in the sentiment of texts, particularly in tweets.

You’ll need to pay special attention to character-level, as well as word-level, when performing sentiment analysis on tweets. A lot of preprocessing might also be needed. For example, you might want to preprocess social media content and transform both Western and Eastern emojis into tokens and whitelist them (i.e. always take them as a feature for classification purposes) in order to help improve sentiment analysis performance.
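As an illustration of that preprocessing step, the sketch below replaces a handful of emojis with placeholder tokens before splitting the text into words, so a classifier can always treat them as features. The mapping and the token names such as `EMO_SHRUG` are made up for this example; a real pipeline would use a much longer list.

```python
import re

# Hypothetical mapping from emoji strings to placeholder tokens.
EMOTICON_TOKENS = {
    ":D": "EMO_LAUGH",
    ":)": "EMO_SMILE",
    ":(": "EMO_SAD",
    "¯\\_(ツ)_/¯": "EMO_SHRUG",
}

# Try longer emojis first so they are not split into shorter matches.
_PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICON_TOKENS, key=len, reverse=True))
)


def tokenize_with_emoticons(text):
    """Replace known emojis with tokens, then split on whitespace."""
    replaced = _PATTERN.sub(lambda m: f" {EMOTICON_TOKENS[m.group(0)]} ", text)
    return replaced.split()


print(tokenize_with_emoticons("Great match :D"))
print(tokenize_with_emoticons("no idea ¯\\_(ツ)_/¯"))
```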

Here’s a quite comprehensive list of emojis and their unicode characters that may come in handy when preprocessing.

Defining what we mean by neutral is another challenge to tackle in order to perform accurate sentiment analysis. As in all classification problems, defining your categories (and, in this case, the neutral tag) is one of the most important parts of the problem. What you mean by neutral, positive, or negative does matter when you train sentiment analysis models. Since tagging data requires that tagging criteria be consistent, a good definition of the problem is a must.

Here are some ideas to help you identify and define neutral texts:

  • Objective texts. So-called objective texts do not contain explicit sentiments, so you should include those texts in the neutral category.
  • Irrelevant information. If you haven’t preprocessed your data to filter out irrelevant information, you can tag it neutral. However, be careful! Only do this if you know how this could affect overall performance. Sometimes, you will be adding noise to your classifier and performance could get worse.
  • Texts containing wishes. Some wishes, like "I wish the product had more integrations", are generally neutral. However, those including comparisons, like "I wish the product were better", are pretty difficult to categorize.

Sentiment analysis is a tremendously difficult task even for humans. On average, inter-annotator agreement (a measure of how well two or more human labelers can make the same annotation decision) is pretty low when it comes to sentiment analysis. And since machines learn from labeled data, sentiment analysis classifiers might not be as precise as other types of classifiers.
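To make inter-annotator agreement concrete, here is a minimal sketch of Cohen's kappa, one common agreement measure for two annotators. The article does not say which measure it has in mind, so the choice of kappa is an assumption. It compares the observed agreement with the agreement expected by chance, given how often each annotator uses each label.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)
```

A value of 1 means perfect agreement and a value near 0 means agreement no better than chance. The labels you train on can only be as consistent as your annotators are.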

Still, sentiment analysis is worth the effort, even if your sentiment analysis predictions are wrong from time to time. By using MonkeyLearn’s sentiment analysis model , you can expect correct predictions about 70-80% of the time you submit your texts for classification.

If you are new to sentiment analysis, then you’ll quickly notice improvements. For typical use cases, such as ticket routing, brand monitoring, and VoC analysis , you’ll save a lot of time and money on tedious manual tasks.

Sentiment Analysis Use Cases & Applications

The applications of sentiment analysis are endless and can be applied to any industry, from finance and retail to hospitality and technology. Below, we’ve listed some of the most popular ways that sentiment analysis is being used in business:

  • Social media monitoring
  • Brand monitoring
  • Voice of customer (VoC)
  • Customer service
  • Market research

Sentiment analysis is used in social media monitoring , allowing businesses to gain insights about how customers feel about certain topics, and detect urgent issues in real time before they spiral out of control.

On the fateful evening of April 9th, 2017, United Airlines forcibly removed a passenger from an overbooked flight. The nightmarish incident was filmed by other passengers on their smartphones and posted immediately. One of the videos, posted to Facebook, was shared more than 87,000 times and viewed 6.8 million times by 6pm on Monday, just 24 hours later.

The fiasco was only magnified by the company’s dismissive response. On Monday afternoon, United’s CEO tweeted a statement apologizing for “having to re-accommodate customers.”

This is exactly the kind of PR catastrophe you can avoid with sentiment analysis. It’s an example of why it’s important to care not only about whether people are talking about your brand, but also about how they’re talking about it. More mentions don't equal positive mentions.

Brands of all shapes and sizes have meaningful interactions with customers, leads, even their competition, all across social media. By monitoring these conversations you can understand customer sentiment in real time and over time, so you can detect disgruntled customers immediately and respond as soon as possible.

Most marketing departments are already tuned into online mentions as far as volume – they measure more chatter as more brand awareness. But businesses need to look beyond the numbers for deeper insights.

Brands have a wealth of information available not only on social media, but across the internet: on news sites, blogs, forums, product reviews, and more. Again, we can look at not just the volume of mentions, but the individual and overall quality of those mentions.

In our United Airlines example, for instance, the flare-up started on the social media accounts of just a few passengers. Within hours, it was picked up by news sites and spread like wildfire across the US, then to China and Vietnam, as United was accused of racial profiling against a passenger of Chinese-Vietnamese descent. In China, the incident became the number one trending topic on Weibo, a microblogging site with almost 500 million users.

And again, all of this happened within mere hours of the incident.

Brand monitoring offers a wealth of insights from conversations happening about your brand from all over the internet. Analyze news articles, blogs, forums, and more to gauge brand sentiment , and target certain demographics or regions, as desired. Automatically categorize the urgency of all brand mentions and route them instantly to designated team members.

Get an understanding of customer feelings and opinions, beyond mere numbers and statistics. Understand how your brand image evolves over time, and compare it to that of your competition. You can tune into a specific point in time to follow product releases, marketing campaigns, IPO filings, etc., and compare them to past events.

Real-time sentiment analysis allows you to identify potential PR crises and take immediate action before they become serious issues. Or identify positive comments and respond directly, to use them to your benefit.

Example: Expedia Canada

Around Christmas time, Expedia Canada ran a classic “escape winter” marketing campaign. All was well, except for the screeching violin they chose as background music. Understandably, people took to social media, blogs, and forums. Expedia noticed right away and removed the ad.

Then, they created a series of follow-up spin-off videos: one showed the original actor smashing the violin; another invited a real negative Twitter user to rip the violin out of the actor’s hands on screen. Though their original campaign was a flop, Expedia were able to redeem themselves by listening to their customers and responding.

Sentiment analysis allows you to automatically monitor all chatter around your brand and detect and address this type of potentially-explosive scenario while you still have time to defuse it.

Voice of Customer (VoC)

Social media and brand monitoring offer us immediate, unfiltered, and invaluable information on customer sentiment , but you can also put this analysis to work on surveys and customer support interactions.

Net Promoter Score (NPS) surveys are one of the most popular ways for businesses to gain feedback with the simple question: Would you recommend this company, product, and/or service to a friend or family member? These result in a single score on a number scale.

Businesses use these scores to identify customers as promoters, passives, or detractors. The goal is to identify overall customer experience , and find ways to elevate all customers to “promoter” level, where they, theoretically, will buy more, stay longer, and refer other customers.
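The scoring described above can be sketched in a few lines. The article does not spell out the cut-offs, so this example assumes the commonly used convention: answers on a 0 to 10 scale, with 9 and 10 counted as promoters, 7 and 8 as passives, and 0 to 6 as detractors. The score is the percentage of promoters minus the percentage of detractors.

```python
def nps_category(score):
    """Classify one 0-10 survey answer (thresholds are the usual convention, assumed here)."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "passive"
    return "detractor"


def net_promoter_score(scores):
    """Return the NPS (from -100 to 100) for a list of 0-10 survey answers."""
    if not scores:
        raise ValueError("need at least one survey answer")
    categories = [nps_category(s) for s in scores]
    promoters = categories.count("promoter")
    detractors = categories.count("detractor")
    return 100 * (promoters - detractors) / len(categories)
```

This single number says nothing about why customers answered the way they did, which is where the open-ended follow-up question comes in.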

Numerical (quantitative) survey data is easily aggregated and assessed. But the next question in NPS surveys, asking why survey participants left the score they did, seeks open-ended responses, or qualitative data.

Open-ended survey responses were previously much more difficult to analyze, but with sentiment analysis these texts can be classified into positive and negative (and everywhere in between) offering further insights into the Voice of Customer (VoC) .

Sentiment analysis can be used on any kind of survey – quantitative and qualitative – and on customer support interactions, to understand the emotions and opinions of your customers. Tracking customer sentiment over time adds depth to help understand why NPS scores or sentiment toward individual aspects of your business may have changed.

You can use it on incoming surveys and support tickets to detect customers who are ‘strongly negative’ and target them immediately to improve their service. Zero in on certain demographics to understand what works best and how you can improve.

Real-time analysis allows you to see shifts in VoC right away and understand the nuances of the customer experience over time beyond statistics and percentages.

Discover how we analyzed the sentiment of thousands of Facebook reviews , and transformed them into actionable insights.

Example: McKinsey City Voices project

In Brazil, federal public spending rose by 156% from 2007 to 2015, while satisfaction with public services steadily decreased. Unhappy with this counterproductive progress, the Urban Planning Department recruited McKinsey to help them focus on user experience, or “citizen journeys,” when delivering services. This citizen-centric style of governance has led to the rise of what we call Smart Cities.

McKinsey developed a tool called City Voices, which conducts citizen surveys across more than 150 metrics, and then runs sentiment analysis to help leaders understand how constituents live and what they need, in order to better inform public policy. By using this tool, the Brazilian government was able to uncover the most urgent needs – a safer bus system, for instance – and improve them first.

If this can be successful on a national scale, imagine what it can do for your company.

We already looked at how we can use sentiment analysis in terms of the broader VoC, so now we’ll dial in on customer service teams.

We all know the drill: stellar customer experiences mean a higher rate of returning customers. Leading companies know that how they deliver is just as important as, if not more important than, what they deliver. Customers expect their experience with companies to be immediate, intuitive, personal, and hassle-free. If not, they’ll leave and do business elsewhere. Did you know that one in three customers will leave a brand after just one bad experience?

You can use sentiment analysis and text classification to automatically organize incoming support queries by topic and urgency to route them to the correct department and make sure the most urgent are handled right away.

Analyze customer support interactions to ensure your employees are following appropriate protocol. Increase efficiency, so customers aren’t left waiting for support. Decrease churn rates; after all it’s less hassle to keep customers than acquire new ones.

Discover how we analyzed customer support interactions on Twitter .

Sentiment analysis empowers all kinds of market research and competitive analysis. Whether you’re exploring a new market, anticipating future trends, or seeking an edge on the competition, sentiment analysis can make all the difference.

You can analyze online reviews of your products and compare them to your competition. Maybe your competitor released a new product that landed as a flop. Find out what aspects of the product performed most negatively and use it to your advantage.

Follow your brand and your competition in real time on social media. Locate new markets where your brand is likely to succeed. Uncover trends just as they emerge, or follow long-term market leanings through analysis of formal market reports and business journals.

You’ll tap into new sources of information and be able to quantify otherwise qualitative information. With social data analysis you can fill in gaps where public data is scarce, like emerging markets.

Discover how to analyze the sentiment of hotel reviews on TripAdvisor or perform sentiment analysis on Yelp restaurant reviews .

Sentiment Analysis Resources

Sentiment analysis is a vast topic, and it can be intimidating to get started. Luckily, there are many useful resources, from helpful tutorials to all kinds of free online tools, to help you take your first steps.

Free Online Sentiment Analysis Tools

A good start to your journey is to simply play around with a sentiment analysis tool. A little first-hand experience will help you understand how it works.

Next, to take your sentiment analysis further, you’ll want to try out MonkeyLearn’s sentiment analysis and keyword template. First, you’ll need to sign up, then walk through the following steps:

1. Choose Keyword + Sentiment Analysis template

2. Upload your data

If you don't have a CSV, you can use our sample dataset.

3. Match the CSV columns to the dashboard fields

In this template, there is only one field: text. If you have more than one column in your dataset, choose the column that has the text you would like to analyze.

4. Name your workflow

5. Wait for your data to import

6. Explore your dashboard!

  • Filter by sentiment or keyword.
  • Share via email with other coworkers.

Open Source vs SaaS (Software as a Service) Sentiment Analysis Tools

When it comes to sentiment analysis (and text analysis in general), you have two choices: build your own solution or buy a tool .

Open source libraries in languages like Python and Java are particularly well positioned to build your own sentiment analysis solution because their communities lean more heavily toward data science, like natural language processing and deep learning for sentiment analysis . But you’ll need a team of data scientists and engineers on board, huge upfront investments, and time to spare.

SaaS tools offer the option to implement pre-trained sentiment analysis models immediately or custom-train your own, often in just a few steps. These tools are recommended if you don’t have a data science or engineering team on board, since they can be implemented with little or no code and can save months of work and money (upwards of $100,000).

Another key advantage of SaaS tools is that you don't even need to know how to code; they provide integrations with third-party apps, like MonkeyLearn’s Zendesk, Excel and Zapier Integrations .

If you want to get started with these out-of-the-box tools, check out this guide to the best SaaS tools for sentiment analysis , which also come with APIs for seamless integration with your existing tools.

Or start learning how to perform sentiment analysis using MonkeyLearn’s API and the pre-built sentiment analysis model, with just six lines of code. Then, train your own custom sentiment analysis model using MonkeyLearn’s easy-to-use UI.

  • Tutorial on sentiment analysis in python using MonkeyLearn’s API.

If you’re still convinced that you need to build your own sentiment analysis solution, check out these tools and tutorials in various programming languages:

Sentiment Analysis Python

  • Scikit-learn is the go-to library for machine learning and has useful tools for text vectorization. Training a classifier on top of vectorizations, like frequency or tf-idf text vectorizers is quite straightforward. Scikit-learn has implementations for Support Vector Machines, Naïve Bayes, and Logistic Regression, among others.
  • NLTK has been the traditional NLP library for Python. It has an active community and offers the possibility to train machine learning classifiers.
  • SpaCy is an NLP library with a growing community. Like NLTK, it provides a strong set of low-level functions for NLP and support for training text classifiers.
  • TensorFlow, developed by Google, provides a low-level set of tools to build and train neural networks. There's also support for text vectorization, both with traditional word-frequency methods and with more advanced word embeddings.
  • Keras provides useful abstractions to work with multiple neural network types, like recurrent neural networks (RNNs) and convolutional neural networks (CNNs), and to easily stack layers of neurons. Keras can be run on top of TensorFlow or Theano. It also provides useful tools for text classification.
  • PyTorch is a recent deep learning framework backed by some prestigious organizations like Facebook, Twitter, Nvidia, Salesforce, Stanford University, University of Oxford, and Uber. It has quickly developed a strong community.

Tutorials to try out:

Python web scraping and sentiment analysis: this tutorial provides a step-by-step guide on how to analyze the top 100 subreddits by sentiment. It explains how to use Beautiful Soup, one of the most popular Python libraries for web scraping, to collect the names of the top subreddits (such as /r/funny, /r/AskReddit and /r/todayilearned).

Using the PRAW library, it demonstrates how to interact with the Reddit API and extract the comments from these subreddits. Then, you learn how to use TextBlob to perform sentiment analysis on the extracted comments. Code: https://github.com/jg-fisher/redditSentiment

Twitter sentiment analysis using Python and NLTK: this step-by-step guide shows you how to train your first sentiment classifier. The author uses the Natural Language Toolkit (NLTK) to train a classifier on tweets.

Making Sentiment Analysis Easy with Scikit-learn : This tutorial explains how to train a logistic regression model for sentiment analysis.

Sentiment Analysis Java

Java is another programming language with a strong community around data science with remarkable data science libraries for NLP.

  • OpenNLP : a toolkit that supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, language detection and coreference resolution.
  • Stanford CoreNLP : a Java suite of core NLP tools provided by The Stanford NLP Group.
  • LingPipe: a Java toolkit for processing text using computational linguistics. LingPipe is often used for text classification and entity extraction.
  • Weka : a set of tools created by The University of Waikato for data pre-processing, classification, regression, clustering, association rules, and visualization.

Sentiment analysis research and courses

After learning the basics of sentiment analysis, and understanding how it can help you, you might want to delve further into the topic:

Sentiment Analysis Papers

The literature around sentiment analysis is massive; there are more than 55,700 scholarly articles, papers, theses, books, and abstracts out there.

The following are the most frequently cited and read papers in the sentiment analysis community in general:

  • Opinion mining and sentiment analysis (Pang and Lee, 2008)
  • Recognizing contextual polarity in phrase-level sentiment analysis (Wilson, Wiebe and Hoffmann, 2005).
  • A survey of opinion mining and sentiment analysis (Liu and Zhang, 2012)
  • Sentiment analysis and opinion mining (Liu, 2012)
  • How to Perform Text Mining with Sentiment Analysis

Sentiment Analysis Books

Bing Liu is a thought leader in the field of machine learning and has written a book about sentiment analysis and opinion mining.

Useful for those starting research on sentiment analysis, Liu does a wonderful job of explaining sentiment analysis in a way that is highly technical, yet understandable. In the book, he covers different aspects of sentiment analysis including applications, research, sentiment classification using supervised and unsupervised learning, sentence subjectivity, aspect-based sentiment analysis, and more.

For those who want to learn about deep-learning based approaches for sentiment analysis, a relatively new and fast-growing research area, take a look at Deep-Learning Based Approaches for Sentiment Analysis .

Sentiment Analysis Courses and Lectures

Another good way to go deeper with sentiment analysis is mastering your knowledge and skills in natural language processing (NLP), the computer science field that focuses on understanding ‘human’ language.

By combining machine learning, computational linguistics, and computer science, NLP allows a machine to understand natural language including people's sentiments, evaluations, attitudes, and emotions from written language.

There are a large number of courses, lectures, and resources available online, but the essential NLP course is the Stanford Coursera course by Dan Jurafsky and Christopher Manning . By taking this course, you will get a step-by-step introduction to the field by two of the most reputable names in the NLP community.

If you want a more hands-on course, you should enroll in the Data Science: Natural Language Processing (NLP) in Python course on Udemy. This course gives you a good introduction to NLP and what it can do, and it also has you build different projects in Python, including a spam detector, a sentiment analyzer, and an article spinner. Most of the lectures are really short (~5 minutes) and the course strikes the right balance between practical and theoretical content.

Sentiment Analysis Datasets

The key to mastering sentiment analysis is working on different datasets and experimenting with different approaches. First, you’ll need to get your hands on a dataset to use in your experiments.

The following are some of our favorite sentiment analysis datasets for experimenting with sentiment analysis and a machine learning approach. They’re open and free to download:

  • Product reviews : this dataset consists of a few million Amazon customer reviews with star ratings, super useful for training a sentiment analysis model.
  • Restaurant reviews: this dataset consists of 5.2 million Yelp reviews with star ratings.
  • Movie reviews : this dataset consists of 1,000 positive and 1,000 negative processed reviews. It also provides 5,331 positive and 5,331 negative processed sentences / snippets.
  • Fine food reviews : this dataset consists of ~500,000 food reviews from Amazon. It includes product and user information, ratings, and a plain text version of every review.
  • Twitter airline sentiment on Kaggle : this dataset consists of ~15,000 labeled tweets (positive, neutral, and negative) about airlines.
  • First GOP Debate Twitter Sentiment: this dataset consists of ~14,000 labeled tweets (positive, neutral, and negative) about the first GOP debate of the 2016 presidential race.

If you are interested in a rule-based approach, the following is a varied list of sentiment analysis lexicons that will come in handy. These lexicons provide dictionaries of words with labels specifying their sentiments across different domains, which makes them really useful for identifying the sentiment of texts:

  • Sentiment Lexicons for 81 Languages : this dataset contains both positive and negative sentiment lexicons for 81 languages.
  • SentiWordNet : this dataset contains about 29,000 words with a sentiment score between 0 and 1.
  • Opinion Lexicon for Sentiment Analysis : this dataset provides a list of 4,782 negative words and 2,005 positive words in English.
  • Wordstat Sentiment Dictionary: this dataset includes ~4,800 positive and ~9,000 negative words.
  • Emoticon Sentiment Lexicon : this dataset contains a list of 477 emoticons labeled as positive, neutral, or negative.

Parting words

Sentiment analysis can be applied to countless aspects of business, from brand monitoring and product analytics, to customer service and market research. By incorporating it into their existing systems and analytics, leading brands (not to mention entire cities) are able to work faster, with more accuracy, toward more useful ends.

Sentiment analysis has moved beyond merely an interesting, high-tech whim, and will soon become an indispensable tool for all companies of the modern age. Ultimately, sentiment analysis enables us to glean new insights, better understand our customers, and empower our own teams more effectively so that they do better and more productive work.

MonkeyLearn is an online platform that makes it easy to perform text analytics with machine learning and data visualization tools.

If you need help building a sentiment analysis system for your business, visit MonkeyLearn Studio and request a demo .

Sentiment Analysis Projects & Topics For Beginners [2024]

Are you studying sentiment analysis and want to test your knowledge? If you are, then you’ve come to the right place. In this article, we’re discussing sentiment analysis project ideas with which you can test your knowledge and showcase your understanding.

We know how tricky it is to find great project ideas. We also know how beneficial it is to complete projects. With projects, you can strengthen your knowledge, enhance your portfolio, and bag better roles. 

The following article will talk about some of the best Sentiment Analysis Python project ideas. It will also shed some light on the various types, importance, and applications of Sentiment analysis in today’s world. By the end of it, you’ll be encouraged to work on prominent sentiment analysis and capstone project ideas.

So without further ado, let’s get started. 

What is Sentiment Analysis?

Sentiment analysis is a kind of data mining where you measure the inclination of people’s opinions by using NLP (natural language processing), text analysis, and computational linguistics. We perform sentiment analysis mostly on public reviews, social media platforms, and similar sites. The following are the main types of sentiment analysis:

Fine-grained

Fine-grained sentiment analysis gives a more precise picture of public opinion about a subject. It classifies its results into categories such as Very Negative, Negative, Neutral, Positive, and Very Positive.
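One simple way to picture this is to map a numeric polarity score onto the five categories. The sketch below assumes a score between -1 and 1 (as produced by many sentiment tools) and uses hypothetical, evenly spaced thresholds. Neither the score range nor the cut-offs come from the article.

```python
def fine_grained_label(score):
    """Map a polarity score in [-1, 1] to one of five sentiment classes."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be between -1 and 1")
    # Thresholds are illustrative assumptions, not a standard.
    if score <= -0.6:
        return "Very Negative"
    if score <= -0.2:
        return "Negative"
    if score < 0.2:
        return "Neutral"
    if score < 0.6:
        return "Positive"
    return "Very Positive"
```

In practice you would tune the thresholds on labeled data, or train a classifier that predicts the five classes directly.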

Detecting Emotion

This kind of sentiment analysis identifies emotions such as anger, happiness, sadness, and others. Many times, you’ll use lexicons to recognize emotions. However, lexicons have drawbacks too, and in those cases, you’d need to use ML algorithms . 

Based on Aspect

In aspect-based sentiment analysis, you look at the specific aspect of the thing people are talking about. Suppose you have reviews of a smartphone. You might want to see what people are saying about its battery life or its screen size.
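A very rough sketch of the idea: split a review into sentences, find the sentences that mention an aspect, and score only those. The aspect keywords and the tiny word lists are hypothetical placeholders. Real systems use trained models or much larger lexicons.

```python
import re

# Hypothetical aspect keywords and sentiment words, for illustration only.
ASPECTS = {"battery": ["battery", "charge"], "screen": ["screen", "display"]}
POSITIVE = {"great", "good", "love", "bright"}
NEGATIVE = {"bad", "poor", "terrible", "dim"}


def aspect_sentiment(review):
    """Return {aspect: score}, scoring each sentence that mentions the aspect."""
    results = {}
    for sentence in re.split(r"[.!?]+", review.lower()):
        words = re.findall(r"[a-z']+", sentence)
        score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
        for aspect, keywords in ASPECTS.items():
            if any(k in words for k in keywords):
                results[aspect] = results.get(aspect, 0) + score
    return results
```

For the smartphone example, this lets a single review count as positive about the battery and negative about the screen at the same time.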

Multilingual

Sometimes organizations need to analyze the text of different languages. This form of sentiment analysis is considerably challenging and requires a lot of effort because you’d need many resources. 

Sentiment analysis has many applications in various industries. As it helps in understanding public opinion, companies use sentiment analysis in doing market research and figuring out if their customers like a particular product (or service) or not. Then, according to the findings of the sentiment analysis, the organization can modify the respective product or service and achieve better results. 

All in all, it helps companies in understanding their customers better. Companies can serve their customers better when they know where they lag and where they excel. 

Why Is Sentiment Analysis Required?

Before we delve into the various sentiment analysis project ideas, such as the Twitter sentiment analysis project or sentiment analysis of IMDb reviews, let’s take a look at some of the reasons why sentiment analysis is important.

In this technology-driven world, the majority of the data that we come across is unstructured. Whether it is in the form of emails, texts, or documents, this data needs to be properly structured and then analyzed further. This is where sentiment analysis comes into play. It not only helps to store data in an efficient and cost-friendly manner, but it can also help you solve certain issues in real time.

Various Approaches Used In Sentiment Analysis

Broadly, there are three main approaches to sentiment analysis. They are:

  • Rule-based approach- Unlike the other approaches, the rule-based approach is quite easy to comprehend. It basically counts the number of negative and positive words present in the text. If the number of positive words is greater than the number of negative words, then the sentiment is positive, and vice versa.
  • Automatic Approach- In this approach, a machine learning model is first trained on a labeled data set and then used to predict the sentiment of new text. Features, such as words, are extracted from the text and passed to a classifier. Commonly used techniques include Logistic Regression, Support Vector Machines, and Naive Bayes, among others.
  • Hybrid Approach- As the name suggests, this approach is basically an amalgamation of both the rule-based approach and the automatic approach. It can deliver more accurate results than either approach on its own.
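The rule-based approach in the list above can be sketched directly: count positive and negative words and compare the totals. The two word lists are tiny hypothetical stand-ins for a real sentiment lexicon, and returning "neutral" on a tie is an added assumption.

```python
import re

# Tiny hypothetical lexicon; a real one contains thousands of words.
POSITIVE_WORDS = {"good", "great", "love", "excellent", "happy"}
NEGATIVE_WORDS = {"bad", "terrible", "hate", "awful", "sad"}


def rule_based_sentiment(text):
    """Label text by comparing counts of positive and negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(w in POSITIVE_WORDS for w in words)
    negatives = sum(w in NEGATIVE_WORDS for w in words)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"  # tie or no sentiment words found (an assumption)
```

The sketch also shows the main weakness of pure word counting: a phrase like "not good" still counts as positive, which is one reason automatic and hybrid approaches exist.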

Applications of Sentiment Analysis

There is a wide range of applications for sentiment analysis. Some of them have been discussed in the following list. 

  • Social Media- The comments on popular social media sites such as Instagram, Facebook, and Twitter are analyzed and then furthermore categorized into different segments, such as positive, negative, and neutral. 
  • Customer Service- A good example is the review section of the Google Play Store, where reviews rated from 1 to 5 can be analyzed with the help of the various sentiment analysis approaches.
  • Marketing Sector- The marketing industry has benefited a lot from sentiment analysis. It has helped brand owners to understand the review of a product or service, and whether it has been categorized as good or bad by the consumers. 

In the following points, we’ve discussed some prominent sentiment analysis project ideas. Pick one according to your interests and expertise:

Sentiment Analysis Project Ideas

The following are our sentiment analysis projects. Our list has projects for all skill levels so that you can choose comfortably:

1. Analyze Amazon Product Reviews

Amazon is the biggest e-commerce store on the planet. This means it also has one of the largest product selections available. Many times, companies want to understand the public opinion on their product and figure out what’s responsible for the same. For that purpose, they perform sentiment analysis on their product reviews. 

It helps them in recognizing the primary issues with their products (if there are any). Some products have thousands of reviews on Amazon while some others only have a few hundred. 

It is one of the most popular sentiment analysis projects because the demand for such expertise is very high. Companies want experts to analyze their product reviews for market research.

You can get the dataset for this project here: Amazon Product Reviews Dataset .

Working on this project will make you familiar with many aspects of sentiment analysis. If you’re a beginner, you can start with a small product and analyze reviews of the same. On the other hand, if you’re looking for a challenge, you can take a popular product and analyze its reviews. 

2. Rotten Tomatoes and Their Reviews

Rotten Tomatoes is a review site where you’ll find an aggregate of critics’ opinions on movies and shows. You can find reviews on nearly every show, TV series, or drama there. Admittedly, it’s also a great place to get data from. 

You can perform sentiment analysis on the reviews present on this site as a part of your sentiment analysis projects. The entertainment sector takes critic reviews very seriously. By analyzing critic reviews, a production company can understand why its particular title succeeded (or failed). Critic reviews influence the commercial success of a title considerably as well. 

With sentiment analysis, you can figure out what’s the general opinion of critics on a particular movie or show. This project is an excellent way for you to figure out how sentiment analysis can help entertainment companies such as Netflix. 

You can get the dataset for this project here: Rotten Tomatoes dataset . 

3. Twitter Sentiment Analysis

Twitter is a great place for performing sentiment analysis. You can get public opinion on any topic through this platform. This is one of the intermediate-level sentiment analysis project ideas. You should have some experience in performing opinion mining (another name for sentiment analysis) before you work on this task. As it’s a popular project idea, we’ve discussed it in a little more detail:

Prerequisites

You should have a basic knowledge of programming. You should be familiar with either Python or R (it’d be great if you’re familiar with both). However, it’s not necessary to have expert-level knowledge of programming. Apart from programming, you should also know how to split datasets and how to work with RESTful APIs, because you’ll have to use the Twitter API here. You should also be familiar with the Naive Bayes Classifier, as we’ll be using it to classify our data later in the project. 

This project isn’t easy, and it’ll take a little time (downloading data from Twitter takes hours). 

Working on the Project

First, you’ll need to get authorized credentials from Twitter to use the Twitter API. It takes some time to authorize a Twitter Developer Account, but once you have it, you can go to your dashboard and ‘Create an app’.

After you have the necessary credentials, you can create the function and build a test set. Twitter limits the number of requests one can make through its API for security reasons. The ceiling is 180 requests in 15 minutes. You can keep the test set at 100 tweets.
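One way to stay under the request ceiling is a small sliding-window limiter. The sketch below is only an illustration and is not part of the Twitter API: the class `SlidingWindowLimiter` and its method names are hypothetical, and the default figures (180 requests per 15 minutes) are the ones quoted above.

```python
from collections import deque


class SlidingWindowLimiter:
    """Track request timestamps and report how long to wait before the next call."""

    def __init__(self, max_requests=180, window_seconds=15 * 60):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def wait_time(self, now):
        """Return the seconds to wait at time `now` before another request is allowed."""
        # Forget requests that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            return 0.0
        # The oldest request still in the window decides when a slot frees up.
        return self.window_seconds - (now - self._timestamps[0])

    def record(self, now):
        """Register that a request was sent at time `now`."""
        self._timestamps.append(now)
```

In a real script you would pass `time.monotonic()` as `now`, sleep for the returned number of seconds, and call `record` right after each request.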

After creating the test set, you’ll have to build the training set by using Twitter API, which is the hardest part of this project. Make sure that you save the tweets you gather from the API in a CSV file for future use. 
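Saving the gathered tweets can be sketched with Python's standard `csv` module. The column names (`id`, `text`, `label`) and the two sample rows are assumptions made up for illustration; the in-memory buffer stands in for a real file.

```python
import csv
import io

FIELDNAMES = ["id", "text", "label"]  # assumed columns, adjust to your data


def write_tweets_csv(tweets, file_obj):
    """Write tweet dicts to an open text file object as CSV with a header row."""
    writer = csv.DictWriter(file_obj, fieldnames=FIELDNAMES)
    writer.writeheader()
    for tweet in tweets:
        writer.writerow({name: tweet.get(name, "") for name in FIELDNAMES})


def read_tweets_csv(file_obj):
    """Read the rows back as a list of dicts."""
    return [dict(row) for row in csv.DictReader(file_obj)]


sample = [
    {"id": "1", "text": "Loving the new update, works great", "label": "positive"},
    {"id": "2", "text": "Worst service ever, totally useless", "label": "negative"},
]
# In practice: open("tweets.csv", "w", newline="", encoding="utf-8")
buffer = io.StringIO()
write_tweets_csv(sample, buffer)
buffer.seek(0)
rows = read_tweets_csv(buffer)
```

The `csv` module quotes fields that contain commas, so tweet text survives the round trip unchanged.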

After preparing the training set, you only have to preprocess the tweets present in the datasets. Remember that a text-only approach like this one ignores emojis, images, and other non-textual components when it determines polarity. To include pictures and other parts in your sentiment analysis, you’ll have to use Deep Learning. Make sure that you remove repeated characters and typos from your data. Data cleaning is vital to get the best results possible.
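A minimal sketch of this kind of tweet cleaning, using only regular expressions, might look like the following. The function name `clean_tweet` and the exact rules (which patterns to strip, squeezing letter runs to two) are assumptions for illustration; it does not correct typos.

```python
import re


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, '#' signs, and repeated letters."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)                    # remove @mentions
    text = text.replace("#", " ")                        # keep the hashtag word, drop '#'
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)           # 'soooo' -> 'soo'
    text = re.sub(r"[^a-z\s]", " ", text)                # drop digits, punctuation, emojis
    return re.sub(r"\s+", " ", text).strip()             # collapse whitespace
```

For example, `clean_tweet("@uber The driver was soooo late!!! http://t.co/abc #fail")` gives `"the driver was soo late fail"`.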

After cleaning the data, you can use the Naive Bayes Classifier for analyzing the dataset available. Finally, you’ll have to test your model and see if it’s producing the desired results or not. 
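To make the classification step concrete, here is a small from-scratch sketch of a multinomial Naive Bayes classifier with add-one smoothing. The class name and the four training sentences are made up for illustration; in a real project you would more likely use a library implementation and a much larger training set.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label, doc_count in self.class_counts.items():
            # log P(label) + sum of log P(word | label)
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in text.lower().split():
                if token not in self.vocabulary:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this example.
train_texts = [
    "great ride friendly driver",
    "love this app great price",
    "terrible service rude driver",
    "app crashed terrible experience",
]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
```

Calling `model.predict("great price")` on this toy model returns `"positive"`, and `model.predict("terrible rude driver")` returns `"negative"`.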

This Twitter sentiment analysis project can help you gain practical experience in handling real-world data, applying sentiment analysis techniques, and interpreting results, making it a valuable project for those looking to enhance their skills in data science and natural language processing. 

As you may have realized, this project will take some effort. But performing sentiment analysis on Twitter is a great way to test your knowledge of this subject. It’ll be a great addition to your portfolio (or CV) as well. 

Read more: Sentiment Analysis Using Python: A Hands-on Guide

4. Reviews of Scientific Papers

If you’re interested in using knowledge of machine learning and data science for research purposes, then this project is perfect for you. You can perform sentiment analysis on reviews of scientific papers and understand what leading experts think about a particular topic. Such findings can help guide your own research on that topic. 

Here’s the dataset so you can get started on this project: Machine Learning Dataset . The dataset we’ve shared here has N = 405 instances. And it’s stored in JSON format. Working on this project will make you familiar with the applications of machine learning in scientific research. The dataset has some reviews in Spanish and some in English.

5. Analyze IMDb Reviews

IMDb is an entertainment review website where people leave their opinions on different movies and shows. You can perform sentiment analysis on the reviews present there as well. Just like the Rotten Tomatoes project we discussed previously, this one will help you learn about the applications of data science and machine learning in the entertainment industry.

Reviews of shows and movies help production companies in understanding why their title failed (or succeeded). 

The dataset for this project is quite old and small. But it’s an excellent way for a beginner to test his/her skills on a new dataset. Here’s a link to the dataset: IMDb reviews dataset .

6. Analyze a Company’s Reputation (News + Social Media)

You can pick a company you like and perform a detailed sentiment analysis on it. You can also choose a trending topic and cover it in your sentiment analysis for a more precise result. We can discuss the example of Uber here. They are one of the most prominent startups in the world and have a global customer base. You can perform a sentiment analysis to understand public opinion on this company.

To find the public opinion on Uber, we’ll first start by getting data from the relevant sources, which in this case are Uber’s Facebook page and Twitter page. By analyzing the conversations between the users there, we can figure out the overall brand perception in the market. You’ll need categories to separate different datasets. In this example, you can use Payment, Service, Cancel, Safety, and Price. 
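One simple way to sort comments into these categories is keyword matching. The sketch below is a rough illustration: the keyword sets are invented for this example and would need tuning on real data, and a comment can land in several categories or in none.

```python
# Hypothetical keyword sets for the five categories named above.
CATEGORY_KEYWORDS = {
    "Payment": {"payment", "card", "charged", "refund"},
    "Service": {"service", "driver", "support", "app"},
    "Cancel": {"cancel", "cancelled", "cancellation"},
    "Safety": {"safe", "safety", "unsafe", "accident"},
    "Price": {"price", "fare", "expensive", "cheap"},
}


def categorize(comment):
    """Return every category whose keywords appear in the comment (may be empty)."""
    words = set(comment.lower().replace(",", " ").replace(".", " ").split())
    return [name for name, keywords in CATEGORY_KEYWORDS.items() if words & keywords]
```

For instance, `categorize("The fare was cheap, but the driver cancelled.")` returns `["Service", "Cancel", "Price"]`.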

Now that we know what we want to work on and where we have to go, we can get started.

Sentiment Analysis on Facebook

We’ll first begin with their Facebook page. It has more than 30,000 comments, and after we performed the analysis under the categories we mentioned previously (Payment, Service, Cancel, Safety, and Price), we found that most of the positive comments were about the Price section. On the other hand, the category with the highest percentage of negative feedback was Service. However, while performing this analysis, we also kept in mind that Facebook’s comments are filled with spam, suggestions, news, and various other pieces of information. 

For sentiment analysis, we only have to look at opinions. 

So, we removed all the unnecessary categories, and as expected, our results changed. Now, negative comments held a majority in all sections, and their ratio in respective categories changed. In Price related comments, the percentage of negative comments rose by 20%. 

That’s why it’s essential to perform data cleaning. It helps you get accurate results. 
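The effect of cleaning on the per-category figures can be sketched as follows. The record layout (`category`, `sentiment`, `kind`) and the four sample records are invented for illustration and are not the figures from the analysis above.

```python
from collections import defaultdict


def negative_share(records):
    """Return the percentage of negative records per category."""
    totals, negatives = defaultdict(int), defaultdict(int)
    for record in records:
        totals[record["category"]] += 1
        if record["sentiment"] == "negative":
            negatives[record["category"]] += 1
    return {category: 100.0 * negatives[category] / totals[category] for category in totals}


def opinions_only(records):
    """Keep records tagged as opinions; spam, news, and suggestions are dropped."""
    return [record for record in records if record["kind"] == "opinion"]


# Toy records, made up for this example.
records = [
    {"category": "Price", "sentiment": "positive", "kind": "spam"},
    {"category": "Price", "sentiment": "positive", "kind": "opinion"},
    {"category": "Price", "sentiment": "negative", "kind": "opinion"},
    {"category": "Price", "sentiment": "positive", "kind": "news"},
]
before = negative_share(records)                 # all comments
after = negative_share(opinions_only(records))   # opinions only
```

On this toy data the negative share for Price moves from 25% to 50% once the non-opinion comments are removed, which is the kind of shift described above.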

Sentiment Analysis on Twitter

We’ve already discussed the sentiment analysis of tweets in this article. So we’ll follow a similar approach here and analyze people’s tweets where they tag Uber or reply to their tweets. Here, the category with the highest percentage of positive tweets was Payment, and the second-highest was Safety. This also shows how different social media give different results. 

However, we would have to perform data cleaning here as well. For that purpose, we’ll remove tweets with unrelated intent (spam, news, marketing, etc.). You’d notice how much the percentage of different categories changes here too. 

In our case, Payment saw a decline of 12% in its share of positive tweets and Safety became the category with the highest percentage of positive responses. Apart from that, Safety lost around 2-4% in its share of positive tweets. With this data, you can also find out what are the most popular topics among people when they talk about Uber on these platforms. 

So, on Twitter, we found that the most popular categories were Payment, Cancel, and Service. 

You should know that brands take this data very seriously. It helps them figure out what problems they need to work on and how they can solve the same. These tweets are, after all, feedback of customers. In this case, Uber can use the findings of these tweets to understand which parts of its services have faults and how they can fix them. 

Sentiment Analysis of News

To understand the public opinion on any organization, you’ll have to analyze the news about it as well. In our example, we’ll check the news articles about Uber. After we analyze the content present in those news articles, we’ll segregate our findings in the categories mentioned above (Payment, Service, Cancel, Safety, and Price). 

Apart from that, we’ll also classify different articles according to their popularity. The more popular an article is, the more it’ll affect public opinion. You can measure the popularity of every article by the number of shares it has. An article with more shares would undoubtedly be more popular than one with fewer shares. 
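One hedged way to let popular articles count for more is a share-weighted average of article sentiment scores. The function name, the score scale (-1 to +1), and the sample numbers below are assumptions for illustration.

```python
def weighted_sentiment(articles):
    """Average article sentiment scores, weighting each article by its share count.

    `articles` is a list of (sentiment_score, share_count) pairs.
    """
    total_shares = sum(shares for _, shares in articles)
    if total_shares == 0:
        return 0.0  # nothing was shared, so there is no weighted signal
    return sum(score * shares for score, shares in articles) / total_shares


# Toy data: one widely shared negative article, one barely shared positive one.
articles = [(-1.0, 900), (1.0, 100)]
overall = weighted_sentiment(articles)
```

A plain average of these two scores would be 0, while the share-weighted value is -0.8, reflecting that the negative article reached far more readers.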

Also Read: Top 4 Data Analytics Project Ideas: Beginner to Expert Level 

The Results

In our example, we looked at Uber and the public opinion on this company. After we’ve analyzed Facebook, Twitter, and news, we’d know whether the general sentiment on Uber is positive, negative, or neutral. 

You can follow this approach to create sentiment analysis project ideas. You can start with a small company that doesn’t have a high online presence and perform sentiment analysis on multiple channels to understand if it’s perceived positively or negatively. If you want to increase the challenge, you can make it more complicated and perform analysis for a major company (like we did in our example).

7. Hate Speech Detection Model

Apart from the Twitter sentiment analysis project, hate speech detection is another very interesting area to explore in sentiment analysis with Python. Hate speech refers to any kind of communication or language used against a person or a group based on their sexuality, race, color, or religion, among other factors. This includes all kinds of verbal, non-verbal, written, or behavioral communication. The main task of a hate speech detection model is to identify and classify hate speech in a given text. This can be achieved by training the model on labeled data, in the same way a sentiment classifier is trained. 

Along with these, you can further explore many sentiment analysis and capstone project ideas following your enrollment in relevant degree or certificate programs, such as the ones offered on upGrad.

8. Education Course Reviews

Analyzing sentiments in reviews for online courses or educational resources can be a valuable sentiment analysis project for final-year students. It involves delving into student perspectives on learning experiences. By collecting and processing diverse datasets from platforms like upGrad, this project can help young learners understand the concept better. 

Most importantly, this is a sentiment analysis project using Machine Learning and NLP. Students can learn how to employ sentiment analysis techniques, such as Natural Language Processing and Machine Learning, to categorize reviews as positive, negative, or neutral. The analysis extends to identifying common themes and topics within the reviews, shedding light on aspects like engaging content or challenges with instructional clarity. A noteworthy aspect of the project involves exploring potential correlations between sentiment expressions and academic performance metrics, such as grades or completion rates. 

The findings can contribute meaningful insights into the factors influencing student satisfaction and success in online education. Ethical considerations, including data privacy, are essential throughout the project, and the results can inform recommendations for improving the quality of online courses and educational resources. 

All in all, this project on sentiment analysis supported by visualizations and comprehensive reporting provides a nuanced understanding of the student experience in virtual learning environments.

9. Social Media Influencer Impact

Analyzing sentiments in social media posts related to influencers involves studying the emotions and opinions expressed by users regarding specific influencers. This is a great example of a sentiment analysis NLP project that entails collecting a dataset of social media posts, comments, and mentions related to various influencers across social media platforms. NLP techniques are then applied to analyze the textual content, categorizing sentiments as positive, negative, or neutral.

The impact on followers is a key aspect of sentiment analysis projects like these. By correlating sentiment analysis results with engagement metrics, such as likes, shares, and comments, researchers can gauge how influencers affect their audience. It’s essential to explore the reasons behind sentiment shifts, such as the influencer’s content, behavior, or external factors.

This project on sentiment analysis has practical applications for marketing and brand management. Businesses and influencers can use the findings to understand their online presence, identify areas for improvement, and tailor content to better resonate with their audience. 

10. Sports Match Analysis

This sentiment analysis project begins by collecting a dataset of social media content related to specific sports matches, encompassing platforms like Twitter, Facebook, or dedicated sports forums. Natural Language Processing (NLP) techniques are then applied to analyze the textual content, categorizing sentiments as positive, negative, or neutral.

The analysis extends to exploring fan reactions, team sentiments, and the overall sentiment landscape during different phases of a sports match. Researchers may investigate how events within a match, such as goals, controversial plays, or game-changing moments, influence public opinions. For instance, spikes in positive sentiment may occur when a team scores, while negative sentiments may arise in response to referee decisions or unfavorable outcomes.

Visualizations, such as sentiment trend graphs or heatmaps, can be used to illustrate the ebb and flow of sentiments over time. Learners working on this sentiment analysis project can gain the knowledge and hands-on experience needed to contribute to social media strategies for sports marketing and fan engagement. Their findings can even inform sports commentators and analysts about the impact of events on public sentiment.
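The data behind a sentiment trend graph can be sketched by grouping posts into time buckets and averaging their scores. The input layout (match minute, score) and the 15-minute bucket size are assumptions for illustration.

```python
from collections import defaultdict


def sentiment_trend(posts, bucket_minutes=15):
    """Average sentiment score per time bucket; `posts` holds (minute, score) pairs."""
    buckets = defaultdict(list)
    for minute, score in posts:
        bucket_start = (minute // bucket_minutes) * bucket_minutes
        buckets[bucket_start].append(score)
    return {start: sum(scores) / len(scores) for start, scores in sorted(buckets.items())}


# Toy posts: (minute of the match, sentiment score of the post).
posts = [(3, 1), (10, 1), (17, -1), (20, 1), (44, -1)]
trend = sentiment_trend(posts)
```

The resulting dictionary maps each bucket's starting minute to its average score, ready to be plotted as a line chart.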

11. Movie Trailer Reactions

Analyzing sentiments expressed in comments or social media posts related to movie trailers offers a captivating project that provides valuable insights into audience anticipation and excitement for upcoming films. This sentiment analysis using Machine Learning project involves collecting and preprocessing textual data from various platforms, such as YouTube or Twitter, where movie trailers are shared. By applying sentiment analysis techniques and potentially incorporating emotion analysis, the goal is to categorize audience reactions as positive, negative, or neutral. 

Word clouds or sentiment distribution charts are the key visualizations of this project. They can then be used to present a comprehensive overview of audience sentiments. For a more advanced project, predictive modeling may be explored to estimate a movie’s potential success based on the sentiments expressed in the trailer reactions. 

A report from this machine learning sentiment analysis project offers filmmakers and movie studios actionable insights into the effectiveness of their promotional campaigns, audience engagement levels, and trends within the film industry, making it a valuable endeavor for learners interested in the intersection of data analysis and the entertainment industry.

12. Financial News Sentiment Analysis

Analyzing sentiments in financial news articles and social media posts related to the stock market or specific companies constitutes a dynamic project with multifaceted components. The project involves collecting a diverse dataset encompassing financial news and social media content, followed by meticulous text preprocessing to ensure data quality.

Employing sentiment analysis techniques, such as natural language processing or machine learning models, allows the categorization of sentiments into positive, negative, or neutral, providing insights into the overall sentiment landscape surrounding particular companies or the stock market. The analysis extends to examining the impact of sentiments on market trends and investor behavior, exploring correlations between sentiment trends and stock price movements, and identifying influential events in the financial world.

Additionally, the project delves into the influence of social media sentiments on investor decisions and may even include building predictive models for estimating stock price movements based on sentiment analysis results.
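A first look at the link between sentiment and price movements is often a simple Pearson correlation between a daily sentiment score and the daily return. The sketch below computes it by hand; the function name and input layout are assumptions, and a strong correlation on its own would not show that sentiment drives prices.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)
```

You would call it as `pearson(daily_sentiment, daily_returns)`, where both lists are aligned by date.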

Ultimately, this sentiment analysis project with source code offers practical applications for investors, financial analysts, and companies, providing a nuanced understanding of how sentiments shape the complex landscape of financial markets.

Why Should Learners Take Up Sentiment Analysis Projects?

Taking up a project on sentiment analysis can be highly beneficial to both beginners and final-year students. These projects empower them with practical skills, industry relevance, and the ability to make a worthy impact, fostering a holistic learning experience. Here are a few reasons why you should give these projects a go:

  • Practical application of skills: A sentiment analysis project provides learners with a hands-on opportunity to apply theoretical concepts and skills learned in areas such as natural language processing, machine learning, and data analysis. Engaging in a real-world project allows learners to bridge the gap between theory and practical implementation.
  • Skill development: Working on a sentiment analysis project with source code can help you develop a diverse set of skills, including data collection, data preprocessing, machine learning model implementation, and result interpretation. These skills are highly transferable and applicable in various domains.
  • Understanding data context: Sentiment analysis projects often involve analyzing text data from diverse sources. For example, a sentiment analysis Python project can help students learn the nuances of language, context, and cultural variation present in real-world data. Understanding these complexities is crucial for accurate sentiment analysis.
  • Problem-solving and critical thinking: You will find numerous sentiment analysis projects with source code that will require you to formulate research questions, design methodologies, and make decisions on data preprocessing and model selection. Engaging in such projects enhances problem-solving skills and encourages critical thinking.
  • Portfolio building: Completing sentiment analysis projects in their final year can help advanced learners who are ready to step into their professional lives build a strong portfolio showcasing their practical skills. A portfolio is a valuable asset when applying for jobs or pursuing further education, as it demonstrates hands-on experience and the ability to work on real-world problems.
  • Industry relevance: Sentiment analysis is widely used across industries for customer feedback analysis, market research, and brand management. By working on a sentiment analysis Python project, NLP project, ML project, and the like, you can gain insights into industry-relevant applications of your skills, making you more attractive to potential employers.
  • Stay updated with technology: The field of sentiment analysis is dynamic, with ongoing advancements in techniques and tools. Engaging in a sentiment analysis NLP project, or any sentiment analysis project using Machine Learning or Python, can help learners keep abreast of the latest developments in the field and encourage a mindset of continuous learning.
  • Communication and presentation skills: Summarizing and presenting the findings of sentiment analysis projects require effective communication skills. Learners have the opportunity to practice articulating complex technical concepts in a clear and concise manner.
  • Personal interest and motivation: Sentiment analysis projects can be chosen based on personal interests, making the learning process more engaging and motivating. Learners are more likely to invest time and effort when working on projects that align with their passions.
  • Contribution to knowledge: Sentiment analysis projects can contribute to the broader understanding of sentiment patterns in various domains. Learners have the opportunity to make meaningful contributions to research and gain a sense of accomplishment.

Final Thoughts

Sentiment Analysis is an essential topic in machine learning. It has numerous applications in multiple fields. If you want to learn more about this topic, then you can head to our blog and find many new resources.

On the other hand, if you want a comprehensive and structured learning experience and are interested in learning more about machine learning, check out IIIT-B & upGrad’s Executive PG Programme in Machine Learning & AI, which is designed for working professionals and offers 450+ hours of rigorous training, 30+ case studies & assignments, IIIT-B Alumni status, 5+ practical hands-on capstone projects & job assistance with top firms.

Profile

Pavan Vadapalli

Frequently Asked Questions (FAQs)

Sentiment analysis is becoming a crucial tool for monitoring and understanding client sentiment as they share their opinions and emotions more openly than ever before. Brands can know what makes clients satisfied or frustrated by automatically evaluating customer feedback, such as comments in survey replies and social media dialogues. This allows them to customize products and services to match their customers' demands. For example, employing sentiment analysis to examine 4,000+ surveys about your business could help you figure out if customers like your pricing and customer service.

Even humans struggle to effectively interpret sentiments, making sentiment analysis one of the most difficult tasks in NLP. Every utterance is made at some moment in time, in some location, by and to some people, and so on. All statements are made in context. People convey their negative attitudes using positive phrases in irony and sarcasm, which can be difficult for machines to recognize without a detailed knowledge of the situation in which an emotion was expressed. Another difficulty worth tackling in sentiment analysis is how to handle comparisons. Yet another issue to overcome in order to undertake effective sentiment analysis is defining what we mean by neutral.

When working on a classification problem, it's critical to pick the test and training corpora wisely. Domain knowledge is required for a set of features to act in the classification process. In most data science situations, using a classification method on a cleaned corpus rather than a noisy corpus is advised. Keywords that appear infrequently in the corpus do not usually have a role in text classification. These infrequent features can be removed, resulting in improved model performance. It's generally a good idea to reduce terms to their simplest versions. Lemmatization is the name for this method.
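Removing infrequent keywords can be sketched with a simple frequency threshold. The function name and the `min_count` default below are assumptions for illustration; the right threshold depends on the size of your corpus.

```python
from collections import Counter


def drop_rare_terms(documents, min_count=2):
    """Remove tokens that occur fewer than `min_count` times across all documents.

    `documents` is a list of token lists.
    """
    counts = Counter(token for doc in documents for token in doc)
    return [[token for token in doc if counts[token] >= min_count] for doc in documents]


# Toy corpus: 'xyzzy', 'driver', and 'bad' each occur only once.
docs = [["good", "ride", "xyzzy"], ["good", "driver"], ["bad", "ride"]]
filtered = drop_rare_terms(docs)
```

Note that on a corpus this tiny the filter also drops meaningful words such as "bad", which is why the threshold is usually tuned on a realistically sized dataset.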

A Complete Guide to Sentiment Analysis

“That movie was a colossal disaster… I absolutely hated it! Waste of time and money #skipit”

“Have you seen the new season of XYZ? It is so good!”

“You should really check out this new app, it’s awesome! And it makes your life so convenient.”

By reading these comments, can you figure out what the emotions behind them are?

They may seem obvious to you because we, as humans, are capable of discerning the complex emotional sentiments behind the text.

Not only have we been educated to understand the meanings, intentions, and grammar behind each of these particular sentences, but we’ve also personally felt many of these emotions before and, from our own experiences, can conjure up the deeper meaning behind these words.

Moreover, we’re also extremely familiar with the real-world objects that the text is referring to.

This doesn’t apply to machines, but they do have other ways of determining positive and negative sentiments! How do they do this, exactly? By using sentiment analysis. In this article, we will discuss how a computer can decipher emotions by using sentiment analysis methods, and what the implications of this can be. If you want to skip ahead to a certain section, simply use the clickable menu:

  • What is sentiment analysis?
  • How does sentiment analysis work?
  • Sentiment analysis use cases
  • Machine learning and sentiment analysis
  • Advantages of sentiment analysis
  • Disadvantages of sentiment analysis
  • Key takeaways and next steps

1. What is sentiment analysis?

With computers getting smarter and smarter, surely they’re able to decipher and discern between the wide range of different human emotions, right?

Wrong—while they are intelligent machines, computers can neither see nor feel any emotions, with the only input they receive being in the form of zeros and ones—or what’s more commonly known as binary code.

However, on the other hand, computers excel at the one thing that humans struggle with: processing large amounts of data quickly and effectively. So, theoretically, if we could teach machines how to identify the sentiments behind the plain text, we could analyze and evaluate the emotional response to a certain product by analyzing hundreds of thousands of reviews or tweets.

This would, in turn, provide companies with invaluable feedback and help them tailor their next product to better suit the market’s needs. So, what kind of process is this? Sentiment analysis!

Sentiment analysis, also known as opinion mining , is the process of determining the emotions behind a piece of text. Sentiment analysis aims to categorize the given text as positive, negative, or neutral.

Furthermore, it then identifies and quantifies subjective information about those texts with the help of:

  • natural language processing (NLP)
  • text analysis
  • computational linguistics
  • machine learning

2. How does sentiment analysis work?

There are two main methods for sentiment analysis: machine learning and lexicon-based.

The machine learning method leverages human-labeled data to train the text classifier, making it a supervised learning method.

The lexicon-based approach breaks down a sentence into words and scores each word’s semantic orientation based on a dictionary. It then adds up the various scores to arrive at a conclusion.

In this example, we will look at how sentiment analysis works using a simple lexicon-based approach. We’ll take the following comment as our test data:

“That movie was a colossal disaster… I absolutely hated it! Waste of time and money #skipit”

Step 1: Cleaning

The initial step is to remove special characters and numbers from the text. In our example, we’ll remove the ellipsis, the exclamation mark, and the hash sign from the comment above.

That movie was a colossal disaster I absolutely hated it Waste of time and money skipit
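The cleaning step can be sketched with a regular expression that keeps only letters and whitespace. The function name `clean_text` is a hypothetical helper, not something from a particular library.

```python
import re


def clean_text(text):
    """Remove everything except letters and whitespace, then collapse extra spaces."""
    letters_only = re.sub(r"[^A-Za-z\s]", "", text)
    return re.sub(r"\s+", " ", letters_only).strip()


comment = "That movie was a colossal disaster… I absolutely hated it! Waste of time and money #skipit"
cleaned_comment = clean_text(comment)
```

Here `cleaned_comment` matches the cleaned sentence shown above.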

Step 2: Tokenization

Tokenization is the process of breaking down a text into smaller chunks called tokens, which are either individual words or short sentences.

Breaking down a paragraph into sentences is known as sentence tokenization , and breaking down a sentence into words is known as word tokenization .

[ ‘That’, ‘movie’, ‘was’, ‘a’, ‘colossal’, ‘disaster’, ‘I’, ‘absolutely’, ‘hated’, ‘it’,  ‘Waste’, ‘of’, ‘time’, ‘and’, ‘money’, ‘skipit’ ]
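Both kinds of tokenization can be approximated with the standard library. The helpers `split_words` and `split_sentences` below are hypothetical names for a minimal sketch; dedicated NLP libraries handle abbreviations, contractions, and other edge cases that these do not.

```python
import re


def split_words(sentence):
    """Word tokenization: split an already cleaned sentence on whitespace."""
    return sentence.split()


def split_sentences(paragraph):
    """Sentence tokenization: split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


tokens = split_words(
    "That movie was a colossal disaster I absolutely hated it Waste of time and money skipit"
)
sentences = split_sentences("I hated it! Waste of time. Skip it")
```

`tokens` holds the 16 words listed above, and `sentences` holds three short sentences.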

Step 3: Part-of-speech (POS) tagging

Part-of-speech tagging is the process of tagging each word with its grammatical group, categorizing it as a noun, pronoun, verb, adjective, adverb, and so on, depending on its context.

This transforms each token into a tuple of the form (word, tag). POS tagging is used to preserve the context of a word.

[ (‘That’, ‘DT’),
  (‘movie’, ‘NN’),
  (‘was’, ‘VBD’),
  (‘a’, ‘DT’),
  (‘colossal’, ‘JJ’),
  (‘disaster’, ‘NN’),
  (‘I’, ‘PRP’),
  (‘absolutely’, ‘RB’),
  (‘hated’, ‘VBD’),
  (‘it’, ‘PRP’),
  (‘Waste’, ‘NN’),
  (‘of’, ‘IN’),
  (‘time’, ‘NN’),
  (‘and’, ‘CC’),
  (‘money’, ‘NN’),
  (‘skipit’, ‘NN’) ]
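The (word, tag) structure can be illustrated with a very simple lookup tagger that falls back to a default tag. This is a deliberately naive sketch: the lookup table is hand-written from the tags above, the function name is hypothetical, and unlike a real POS tagger it ignores context entirely.

```python
# Hand-written tag table, copied from the example output above.
TAG_LOOKUP = {
    "that": "DT", "a": "DT", "movie": "NN", "was": "VBD", "colossal": "JJ",
    "disaster": "NN", "i": "PRP", "it": "PRP", "absolutely": "RB", "hated": "VBD",
    "waste": "NN", "of": "IN", "time": "NN", "and": "CC", "money": "NN",
}


def lookup_tag(tokens, default_tag="NN"):
    """Tag each token from the table; unknown words get `default_tag`."""
    return [(token, TAG_LOOKUP.get(token.lower(), default_tag)) for token in tokens]


tagged = lookup_tag(["I", "absolutely", "hated", "it", "skipit"])
```

Trained taggers from NLP libraries use the surrounding words to choose between tags, which a fixed table like this cannot do.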

Step 4: Removing stop words

Stop words are words like ‘have,’ ‘but,’ ‘we,’ ‘he,’ ‘into,’ ‘just,’ and so on. These words carry information of little value and are generally considered noise, so they are removed from the data.

[ ‘movie’, ‘colossal’, ‘disaster’, ‘absolutely’, ‘hated’, ‘Waste’, ‘time’, ‘money’, ‘skipit’ ]
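Stop word removal is a simple filter against a word list. The small `STOP_WORDS` set below is an assumption put together for this example; NLP libraries ship much longer lists.

```python
# A tiny, hand-picked stop word list for illustration only.
STOP_WORDS = {
    "that", "was", "a", "i", "it", "of", "and",
    "have", "but", "we", "he", "into", "just",
}


def remove_stop_words(tokens):
    """Drop tokens whose lowercase form is in the stop word list."""
    return [token for token in tokens if token.lower() not in STOP_WORDS]


tokens = ["That", "movie", "was", "a", "colossal", "disaster", "I", "absolutely",
          "hated", "it", "Waste", "of", "time", "and", "money", "skipit"]
content_words = remove_stop_words(tokens)
```

`content_words` now contains the nine remaining tokens shown above.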

Step 5: Stemming

Stemming is a process of linguistic normalization which removes the suffix of each of these words and reduces them to their base word. For example, loved is reduced to love, wasted is reduced to waste. Here, hated is reduced to hate.

[ ‘movie’, ‘colossal’, ‘disaster’, ‘absolutely’, ‘hate’, ‘Waste’, ‘time’, ‘money’, ‘skipit’ ]
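To show the idea of suffix stripping, here is a toy stemmer with just two rules. It is a rough sketch and not the Porter algorithm: the function name is hypothetical, it only handles ‘ing’ and ‘ed’, and it restores a final ‘e’ only after ‘at’, ‘bl’, or ‘iz’, so it turns ‘loved’ into ‘lov’ rather than ‘love’. Production stemmers use many more rules.

```python
VOWELS = set("aeiou")


def simple_stem(word):
    """Strip a trailing 'ing' or 'ed' when the remaining stem still contains a vowel."""
    for suffix in ("ing", "ed"):
        if word.endswith(suffix) and not word.endswith("eed"):
            stem = word[: -len(suffix)]
            if len(stem) >= 3 and VOWELS & set(stem.lower()):
                # Repair rule: 'hat' -> 'hate', 'troubl' -> 'trouble', 'siz' -> 'size'
                if stem.endswith(("at", "bl", "iz")):
                    stem += "e"
                return stem
    return word  # no rule applied


words = ["movie", "colossal", "disaster", "absolutely", "hated",
         "Waste", "time", "money", "skipit"]
stems = [simple_stem(word) for word in words]
```

On our example only ‘hated’ changes, becoming ‘hate’.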

Step 6: Final Analysis

In a lexicon-based approach, the remaining words are compared against the sentiment libraries, and the scores obtained for each token are added or averaged.

Sentiment libraries are a list of predefined words and phrases which are manually scored by humans. For example, ‘worst’ is scored -3, and ‘amazing’ is scored +3. 

With a basic dictionary, our example comment will be turned into:

movie = 0, colossal = 0, disaster = -2, absolutely = 0, hate = -2, waste = -1, time = 0, money = 0, skipit = 0

This makes the overall score of the comment -5 , classifying the comment as negative.
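The scoring step can be sketched as a dictionary lookup followed by a sum. The `LEXICON` below contains only the word scores mentioned in this section, and the function names and the zero threshold for "neutral" are assumptions for illustration.

```python
# Word scores taken from the examples in this section; unknown words score 0.
LEXICON = {"disaster": -2, "hate": -2, "waste": -1, "worst": -3, "amazing": 3}


def lexicon_score(tokens):
    """Add up the lexicon score of every token (case-insensitive)."""
    return sum(LEXICON.get(token.lower(), 0) for token in tokens)


def classify(score):
    """Map a total score to a polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


stems = ["movie", "colossal", "disaster", "absolutely", "hate",
         "Waste", "time", "money", "skipit"]
total = lexicon_score(stems)
label = classify(total)
```

For our comment, `total` is -5 and `label` is "negative", matching the result above.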

3. Sentiment analysis use cases

Sentiment analysis is used to swiftly glean insights from enormous amounts of text data, with applications in fields ranging from politics and finance to retail, hospitality, and healthcare. For instance, consider its usefulness in the following scenarios:

  • Brand reputation management: Sentiment analysis allows you to track all the online chatter about your brand and spot potential PR disasters before they become major concerns.
  • Voice of the customer: The “voice of the customer” refers to the feedback and opinions you get from your clients all over the world. You can improve your product and meet your clients’ needs with the help of this feedback and sentiment analysis.
  • Voice of the employee: Employee satisfaction can be measured for your company by analyzing reviews on sites like Glassdoor, allowing you to determine how to improve the work environment you have created.
  • Market research: You can analyze and monitor internet reviews of your products and those of your competitors to see how the public differentiates between them, helping you glean indispensable feedback and refine your products and marketing strategies accordingly. Furthermore, sentiment analysis in market research can help you anticipate future trends and thus gain a first-mover advantage.

Other applications for sentiment analysis could include:

  • Customer support
  • Social media monitoring
  • Voice assistants & chatbots
  • Election polls
  • Customer experience about a product
  • Stock market sentiment and market movement
  • Analyzing movie reviews

4. Machine learning and sentiment analysis

Sentiment analysis tasks are typically treated as classification problems in the machine learning approach.

Data analysts use historical textual data, manually labeled as positive, negative, or neutral, as the training set. They then perform feature extraction on this labeled dataset and use it to train the model to recognize the relevant patterns. The trained model can then predict the sentiment of a fresh piece of text.

Naive Bayes, logistic regression, support vector machines, and neural networks are some of the classification algorithms commonly used in sentiment analysis tasks. The high accuracy of prediction is one of the key advantages of the machine learning approach.
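To make the train-then-predict workflow concrete, here is a minimal sketch of a Naive Bayes classifier written from scratch with add-one smoothing. The training sentences are hypothetical and far too few for real use, and the function names are illustrative rather than any library's API; in practice you would more likely reach for an existing machine learning library.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)   # label -> word -> count
    label_counts = Counter()             # label -> number of documents
    vocab = set()
    for text, label in samples:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab


def predict(text, model):
    """Return the label with the highest log posterior (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical hand-labeled training set, for illustration only.
training = [
    ("loved this movie", "positive"),
    ("great acting and great story", "positive"),
    ("hated this movie", "negative"),
    ("boring story and terrible acting", "negative"),
]
model = train_naive_bayes(training)
print(predict("great movie", model))     # positive
print(predict("terrible movie", model))  # negative
```

Here the "feature extraction" step is simply splitting text into lowercase words; richer features would slot into the same train-and-predict structure.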

5. Advantages of sentiment analysis

Considering large amounts of data on the internet are entirely unstructured, data analysts need a way to evaluate this data.

With regard to sentiment analysis, data analysts want to extract and identify emotions, attitudes, and opinions from their sample sets. Reading and assigning a rating to a large number of reviews, tweets, and comments is not an easy task, but with the help of sentiment analysis, this can be accomplished quickly.

Another unparalleled feature of sentiment analysis is its ability to quickly analyze data such as new product launches or new policy proposals in real time. Thus, sentiment analysis can be a cost-effective and efficient way to gauge and accordingly manage public opinion.

6. Disadvantages of sentiment analysis

Sentiment analysis, as fascinating as it is, is not without its flaws.

Human language is nuanced and often far from straightforward. Machines might struggle to identify the emotions behind an individual piece of text despite their extensive grasp of past data. Some situations where sentiment analysis might fail are:

  • Sarcasm, jokes, irony. These things generally don’t follow a fixed set of rules, so they might not be correctly classified by sentiment analytics systems.
  • Nuance. Words can have multiple meanings and connotations, which are entirely subject to the context they occur in.
  • Multipolarity. When the given text is positive in some parts and negative in others.
  • Negation detection. It can be challenging for the machine because the function and the scope of the word ‘not’ in a sentence is not definite; moreover, suffixes and prefixes such as ‘non-,’ ‘dis-,’ ‘-less’ etc. can change the meaning of a text.
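The negation problem is easy to demonstrate. The sketch below flips the sign of any scored word that appears within a fixed window after a negator. The lexicon, the negator list, and the window size of 2 are all assumptions chosen for illustration, and the last example shows how quickly such a heuristic misjudges the scope of "not".

```python
# Hypothetical mini-lexicon and negator list for illustration only.
LEXICON = {"good": 2, "happy": 2, "bad": -2, "boring": -1}
NEGATORS = {"not", "never", "no"}


def score_with_negation(text, window=2):
    """Sum lexicon scores, flipping the sign of any scored word that
    appears within `window` tokens after a negator."""
    tokens = text.lower().split()
    total = 0
    flip_remaining = 0  # how many upcoming tokens are still negated
    for token in tokens:
        if token in NEGATORS:
            flip_remaining = window
            continue
        value = LEXICON.get(token, 0)
        if flip_remaining > 0:
            value = -value
            flip_remaining -= 1
        total += value
    return total


print(score_with_negation("the film was good"))              # 2
print(score_with_negation("the film was not good"))          # -2
print(score_with_negation("not only good but also happy"))   # 0
```

The third sentence is clearly positive, yet the fixed window treats "not only good" as a negation and the scores cancel out to 0.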

7. Key takeaways and next steps

In this article, we examined the science and nuances of sentiment analysis. While sentiment analysis is a method that’s nowhere near perfect, as more data is generated and fed into machines, they will continue to get smarter and improve the accuracy with which they process that data.

All in all, sentiment analysis has a wide range of use cases and is an indispensable tool for companies that hope to leverage the power of data to make optimal decisions.



Sentiment analysis and how to leverage it

From survey results and customer reviews to social media mentions and chat conversations, today’s businesses have access to data from numerous sources. But how can teams turn all of that data into meaningful insights? Find out how sentiment analysis can help.

When it comes to branding, simply having a great product or service is not enough. In order to determine the true impact of a brand, organizations must leverage data from across customer feedback channels to fully understand the market perception of their offerings.

Quantitative feedback available via metrics such as net promoter scores can provide some information about brand performance, but qualitative feedback in the form of unstructured data provides more nuanced insight into how people actually “feel” about your brand.

Sifting through textual data, however, can be extremely time-consuming. Whether analyzing solicited feedback via channels such as surveys or examining unsolicited feedback found on social media, online forums, and more, it’s impossible to comprehensively identify and integrate data on brand sentiment when relying solely on manual processes.

Leveraging an omnichannel analytics platform allows teams to collect all of this information and aggregate it into a complete view. Once obtained, there are many ways to analyze and enrich the data, one of which involves conducting sentiment analysis. Sentiment analysis can be used to improve customer experience through direct and indirect interactions with your brand. Let’s consider the definition of sentiment analysis, how it works and when to use it.


Sentiment refers to the positivity or negativity expressed in text. Sentiment analysis provides an effective way to evaluate written or spoken language to determine if the expression is favorable, unfavorable, or neutral, and to what degree. Because of this, it gives a useful indication of how the customer felt about their experience.

If you’ve ever left an online review, made a comment about a brand or product online, or answered a large-scale market research survey, there’s a chance your responses have been through sentiment analysis.

Sentiment analysis is part of the greater umbrella of text mining, also known as text analysis. This type of analysis extracts meaning from many sources of text, such as surveys, reviews, public social media, and even articles on the Web. A score is then assigned to each clause based on the sentiment expressed in the text, for example, -1 for negative sentiment and +1 for positive sentiment. This is done using natural language processing (NLP).


Today’s algorithm-based sentiment analysis tools can handle huge volumes of customer feedback consistently and accurately. As a type of text analysis, sentiment analysis reveals how positive or negative customers feel about topics ranging from your products and services to your location, your advertisements, or even your competitors.

Accurate sentiment analysis can be difficult to conduct, so what’s the benefit? Why do we use an AI-powered tool to categorize natural language feedback rather than our human brains?

Mostly, it’s a question of scale. Sentiment analysis is helpful when you have a large volume of text-based information that you need to generalize from.

For example, let’s say you work on the marketing team at a major motion picture studio, and you just released a trailer for a movie that got a huge volume of comments on Twitter.

You can read some – or even a lot – of the comments, but you won’t be able to get an accurate picture of how many people liked or disliked it unless you look at every last one and make a note of whether it was positive, negative or neutral. That would be prohibitively expensive and time-consuming, and the results would be prone to a degree of human error.

On top of that, you’d have a risk of bias coming from the person or people going through the comments. They might have certain views or perceptions that color the way they interpret the data, and their judgment may change from time to time depending on their mood, energy levels, and other normal human variations.

On the other hand, sentiment analysis tools provide a comprehensive, consistent overall verdict with a simple button press.

From there, it’s up to the business to determine how they’ll put that sentiment into action.

Sentiment analysis is critical because it helps provide insight into how customers perceive your brand.

Customer feedback – whether that’s via social media, the website, conversations with service agents, or any other source – contains a treasure trove of useful business information, but it isn’t enough to know what customers are talking about. Knowing how they feel will give you the most insight into how their experience was. Sentiment analysis is one way to understand those experiences.

Sometimes known as “opinion mining,” sentiment analysis can let you know if there has been a change in public opinion toward any aspect of your business. Peaks or valleys in sentiment scores give you a place to start if you want to make product improvements, train sales reps or customer care agents, or create new marketing campaigns.

We live in a world where huge amounts of written information are produced and published every moment, thanks to the internet, news articles, social media, and digital communications. Sentiment analysis can help companies keep track of how their brands and products are perceived, both at key moments and over a period of time.

It can also be used in market research, PR, marketing analysis, reputation management, stock analysis and financial trading, customer experience, product design, and many more fields.

Here are a few scenarios where sentiment analysis can save time and add value:

  • Social media listening – in day-to-day monitoring, or around a specific event such as a product launch
  • Analyzing survey responses for a large-scale research program
  • Processing employee feedback in a large organization
  • Identifying very unhappy customers so you can offer closed-loop follow-up
  • Seeing where sentiment trends are clustered in particular groups or regions
  • Competitor research – checking your approval levels against comparable businesses


Not all sentiment analysis is done the same way. There are different ways to approach it and a range of different algorithms and processes that can be used to do the job depending on the context of use and the desired outcome.

Basic sub-types of sentiment analysis include:

  • Detecting sentiment: This means parsing through text and sorting opinionated data (such as “I love this!”) from objective data (like “the restaurant is located downtown”).
  • Categorizing sentiment: This means detecting whether the sentiment is positive, negative, or neutral. Your tools may also add weighting to these categories, e.g., very positive, positive, neutral, somewhat negative, negative.
  • Clause-level analysis: Sometimes, the text contains mixed or ambivalent opinions, for example, “staff was very friendly but we waited too long to be served”. Being able to score feedback at the clause level indicates when there are both good and bad opinions expressed in one place, and can be useful in case the positives and negatives within a text cancel each other out and return a misleading neutral sentiment.

In addition, you can choose whether to view the results of sentiment analysis at:

  • Document-level (useful for professional reviews or press coverage)
  • Sentence level (for short comments and evaluations)
  • Sub-sentence level (for picking out the meaning in phrases or short clauses within a sentence)

Sentiment analysis is a powerful tool that offers a number of advantages, but like any research method, it has some limitations.

Advantages of sentiment analysis:

  • Accurate, unbiased results
  • Enhanced insights
  • More time and energy available for staff to do higher-level tasks
  • Consistent measures you can use to track sentiment over time

Disadvantages of sentiment analysis:

  • Best for large and numerous data sets. To get real value out of sentiment analysis tools, you need to be analyzing large quantities of textual data on a regular basis.
  • Sentiment analysis is still a developing field, and the results are not always perfect. You may still need to sense-check and manually correct results occasionally.

Sentiment analysis uses machine learning, statistics, and natural language processing (NLP) to find out how people think and feel on a macro scale. Sentiment analysis tools take written content and process it to unearth the positivity or negativity of the expression.

This is done in a couple of ways:

  • Rule-based sentiment analysis: This method uses a lexicon, or word-list, where each word is given a score for sentiment, for example “great” = 0.9, “lame” = -0.7, “okay” = 0.1. Sentences are assessed for overall positivity or negativity using these weightings. Rule-based systems usually require additional finessing to account for sarcasm, idioms, and other verbal anomalies.
  • Machine learning-based sentiment analysis: A computer model is given a training set of natural language feedback, manually tagged with sentiment labels. It learns which words and phrases have a positive sentiment or a negative sentiment. Once trained, it can then be used on new data sets.

In some cases, the best results come from combining the two methods.


Developing sentiment analysis tools is technically an impressive feat, since human language is grammatically intricate, heavily context-dependent, and varies a lot from person to person. If you say “I loved it,” another person might say “I’ve never seen better,” or “Leaves its rivals in the dust”. The challenge for an AI tool is to recognize that all these sentences mean the same thing.

Another challenge is to decide how language is interpreted since this is very subjective and varies between individuals. What sounds positive to one person might sound negative or even neutral to someone else. In designing algorithms for sentiment analysis, data scientists must think creatively in order to build useful and reliable tools.

Getting the correct sentiment classification

Sentiment classification requires your sentiment analysis tools to be sophisticated enough to understand not only when a data snippet is positive or negative, but how to extrapolate sentiment even when both positive and negative words are used. On top of that, it needs to be able to understand context and complications such as sarcasm or irony.

Human beings are complicated, and how we express ourselves can be similarly complex. Many types of sentiment analysis tools use a simple view of polarity (positive/neutral/negative), which means much of the meaning behind the data is lost.

Let’s see an example:

“I hated the setup process, but the product was easy to use so in the end, I think my purchase was worth it.”

A less sophisticated sentiment analysis tool might see the sentiment expressed here as “neutral” because the positive – “the product was easy to use so, in the end, I think my purchase was worth it” – and negative-tagged sentiments – “I hated the setup process” – cancel each other out.

However, polarity isn’t so cut-and-dry as being one or the other here. The final part – “in the end, I think my purchase was worth it” – means that as a human analyzing the text, we can see that generally, this customer felt mostly positive about the experience. That’s why a scale from positive to negative is needed, and why a sentiment analysis tool adds weighting along a scale of 1-11.
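The cancelling-out effect described above can be reproduced with a small sketch. The word scores and the splitting rule (commas plus the conjunctions "but" and "so") are assumptions chosen for this one sentence, not how any particular tool segments clauses.

```python
import re

# Hypothetical word scores chosen for this example only.
LEXICON = {"hated": -2, "easy": 1, "worth": 1}


def score(text):
    """Sum lexicon scores over the lowercase words in `text`."""
    return sum(LEXICON.get(w, 0) for w in re.findall(r"[a-z']+", text.lower()))


def clause_scores(text):
    """Split on commas and the words 'but' and 'so', then score each clause."""
    clauses = re.split(r",|\bbut\b|\bso\b", text)
    return [(c.strip(), score(c)) for c in clauses if c.strip()]


review = ("I hated the setup process, but the product was easy to use "
          "so in the end, I think my purchase was worth it.")

print(score(review))  # 0
for clause, value in clause_scores(review):
    print(value, clause)
```

Scored as a whole, the review comes out at 0 and looks neutral, while the clause-level view keeps the negative opening and the positive clauses that follow visible separately.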


Scores are assigned with attention to grammar, context, industry, and source, and Qualtrics gives users the ability to adjust the sentiment scores to be even more business-specific.

Understanding context

Context is key for a sentiment analysis model to be correct. This means your sentiment scoring tool needs to know not only that “happy” is positive and “not happy” is not, but also how to interpret context-dependent words correctly.

As human beings, we know customers are pleased when they mention how “thin” their new laptop is, but that they’re complaining when they talk about the “thin” walls in your hotel. We understand that context.

Obviously, a tool that flags “thin” as negative sentiment in all circumstances is going to lose accuracy in its sentiment scores. The context is important.

This is where training natural language processing (NLP) algorithms comes in. Natural language processing is a way of mimicking the human understanding of language, meaning context becomes more readily understood by your sentiment analysis tool.

Sentiment analysis algorithms are trained using this system over time, using deep learning to understand instances with context and apply that learning to future data. This is why a sophisticated sentiment analysis tool can help you to not only analyze vast volumes of data more quickly but also discern what context is common or important to your customers.

In a world of endless opinions on the Web, how people “feel” about your brand can be important for measuring the customer experience.

Consumers desire likable brands that understand them; brands that provide memorable on-and-offline experiences. The more in-tune a consumer feels with your brand, the more likely they’ll share feedback, and the more likely they’ll buy from you too. According to our Consumer trends research, 62% of consumers said that businesses need to care more about them, and 60% would buy more as a result.

But the opposite is true as well. As a matter of fact, 71 percent of Twitter users will take to the social media platform to voice their frustrations with a brand.

These conversations, both positive and negative, should be captured and analyzed to improve the customer experience. Sentiment analysis can help.

1. Text analysis for surveys

Surveys are a great way to connect with customers directly, and they’re also ripe with constructive feedback. The feedback within survey responses can be quickly analyzed for sentiment scores.

For the survey itself, consider questions that will generate qualitative customer experience metrics. Some examples include:

  • What was your most recent experience like?
  • How much better (or worse) was your experience compared to your expectations?
  • What is something you would have changed about your experience?

Remember, the goal here is to acquire honest textual responses from your customers so the sentiment within them can be analyzed. Another tip is to avoid close-ended questions that only generate “yes” or “no” responses. These types of questions won’t serve your analysis well.

Next, use a text analysis tool to break down the nuances of the responses. TextiQ is a tool that will not only provide sentiment scores but extract key themes from the responses.

After the sentiment is scored from survey responses, you’ll be able to address some of the more immediate concerns your customers have during their experiences.

Another great place to find text feedback is through customer reviews.

2. Text analysis for customer reviews

Did you know that 72 percent of customers will not take action until they’ve read reviews on a product or service? An astonishing 95 percent of customers read reviews prior to making a purchase. In today’s feedback-driven world, the power of customer reviews and peer insight is undeniable.

Review sites like G2 are common first-stops for customers looking for honest feedback on products and services. This feedback, like that in surveys, can be analyzed.

The benefit of customer reviews compared to surveys is that they’re unsolicited, which often leads to more honest and in-depth feedback.

To improve the customer experience, you can take the sentiment scores from customer reviews – positive, negative, and neutral – and identify gaps and pain points that may not have been addressed in the surveys. Remember, negative feedback is just as beneficial to your business as positive feedback, if not more so.

3. Text analysis for social media

Another way to acquire textual data is through social media analysis.

Monitoring tools ingest publicly available social media data on platforms such as Twitter and Facebook for brand mentions and assign sentiment scores accordingly. This has its upsides as well considering users are highly likely to take their uninhibited feedback to social media.

Regardless, a staggering 70 percent of brands don’t bother with feedback on social media. Because social media is an ocean of big data just waiting to be analyzed, brands could be missing out on some important information.

When choosing sentiment analysis technologies, bear in mind how you will use them. There are a number of options out there, from open-source solutions to in-built features within social listening tools. Some of them are limited in scope, while others are more powerful but require a high level of user knowledge.

Text iQ is a natural language processing tool within the Experience Management Platform™ that allows you to carry out sentiment analysis online using just your browser. It’s fully integrated, meaning that you can view and analyze your sentiment analysis results in the context of other data and metrics, including those from third-party platforms.

Like all our tools, it’s designed to be straightforward, clear, and accessible to those without specialized skills or experience, so there’s no barrier between you and the results you want to achieve.

When it comes to understanding the customer experience, the key is to always be on the lookout for customer feedback. Sentiment analysis is not a one-and-done effort and requires continuous monitoring. By reviewing your customers’ feedback on your business regularly, you can proactively get ahead of emerging trends and fix problems before it’s too late. Acquiring feedback and analyzing sentiment can provide businesses with a deep understanding of how customers truly “feel” about their brand. When you’re able to understand your customers, you’re able to provide a more robust customer experience.


10 Sentiment Analysis Project Ideas with Source Code [2024]

Explore some of the best sentiment analysis project ideas for the final year project using machine learning with source code for practice.


Emotions are essential, not only in personal life but in business as well. How your customers and target audience feel about your products or brand provides you with the context necessary to evaluate and improve the product, business, marketing, and communications strategy. Sentiment analysis or opinion mining helps researchers and companies extract insights from user-generated social media and web content.


Irrespective of the industry or vertical, it has become imperative for brands to understand consumers’ feelings about the brand and its products. With cut-throat competition in the NLP and ML industry for high-paying jobs, a boring cookie-cutter resume might just not be enough. Instead, working on a sentiment analysis project with real datasets will help you stand out in job applications and improve your chances of receiving a call back from your dream company.


Building a portfolio of projects will give you the hands-on experience and skills required for performing sentiment analysis. In this blog, you’ll learn more about the benefits of sentiment analysis and ten project ideas divided by difficulty level. 

Table of Contents

  • What is sentiment analysis
  • Beginner level sentiment analysis project ideas
  • Intermediate level sentiment analysis project ideas
  • Advanced sentiment analysis project ideas
  • Top sentiment analysis project ideas with source code using machine learning

Top Sentiment Analysis Project Ideas

Let's put first things first and understand what exactly sentiment analysis is and how it benefits businesses.


Sentiment analysis is used to analyze raw text to derive objective, quantitative results using natural language processing, machine learning, and other data analytics techniques. It is used to detect positive or negative sentiment in text, and businesses often use it to gauge brand reputation among their customers.

There are various types of sentiment analysis where the models focus on feelings and emotions, urgency, even intentions, and polarity. The most popular types of sentiment analysis are:

  • Fine-grained sentiment analysis
  • Emotion detection
  • Aspect-based sentiment analysis
  • Multilingual sentiment analysis

Sentiment analysis is critical because it helps businesses to understand the emotion and sentiments of their customers. Companies analyze customers’ sentiment through social media conversations and reviews so they can make better-informed decisions. The Global Sentiment Analysis Software Market is projected to reach US$4.3 billion by the year 2027. Between 2017 and 2023, the global sentiment analysis market will increase by a CAGR of 14%. 

The overall benefits of sentiment analysis include:

Sorting Data at Scale: With sentiment analysis, companies don't have to manually sort through customer support conversations, thousands of tweets, and surveys. Sentiment analysis helps businesses process vast amounts of data efficiently.

Real-Time Analysis: It helps to identify critical issues in real time. For example, is a crisis on social media escalating? Is there an angry customer about to churn? With sentiment analysis models, businesses can immediately identify customer pain points and take action right away.

Consistent Criteria: Tagging text by sentiment is highly subjective, influenced by personal experiences, thoughts, and beliefs. A centralized sentiment analysis system applies the same criteria to all data, which can improve accuracy and deliver better insights.


1. Amazon Product Reviews

The first beginner-friendly Sentiment Analysis project idea is about evaluating Amazon product reviews. Amazon is one of the biggest e-commerce stores, and it also has a wide product selection. When companies want to understand public opinion, performing sentiment analysis helps them recognize what customers like about their products. It also helps to figure out the primary issues with their products. 

The sentiment analysis for this project begins with scraping the raw alphanumeric review text for various products from Amazon. These reviews then go through a text processing stage that primarily uses TF-IDF (Term Frequency-Inverse Document Frequency) to convert the text into numerical feature vectors. Classification models like SVM (Support Vector Machines) can then label a given sentence ‘Positive’ or ‘Negative.’
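As a rough sketch of what the TF-IDF step computes, the function below uses one common variant, tf = count / document length and idf = log(N / document frequency). The review snippets are hypothetical, and library implementations typically add smoothing and normalization on top of this, so their numbers will differ.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Uses tf = count / document length and idf = log(N / document frequency),
    one common variant among several.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical review snippets, not taken from any real dataset.
reviews = [
    "battery life is great",
    "battery died after a week",
    "great screen and great battery",
]
for vector in tf_idf(reviews):
    print(vector)
```

Because "battery" appears in every snippet, its weight is 0 everywhere, while rarer words such as "died" or "screen" receive positive weights. These per-document weight vectors are what a classifier like an SVM would take as input.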

Expertise in this project is in demand since companies want experts to use sentiment analysis to analyze their product reviews for market research. A beginner can start with less popular products, whereas people seeking a challenge can pick a popular product and analyze its reviews. 

The dataset for Amazon Product Reviews: Amazon Product Reviews Dataset.


2. Rotten Tomatoes Movie Reviews

Rotten Tomatoes is a movie and TV show review site where critics and fans leave reviews. The platform has reviews of nearly every TV series, show, or drama in most languages. It's a substantial dataset source for performing sentiment analysis on the reviews.

Movie review analysis is a classic multi-class classification problem, since a review can fall into one of several sentiment classes: negative, somewhat negative, neutral, fairly positive, and positive. Since a movie review can contain extra characters like emojis and special characters, the extracted data must go through data normalization. Text processing stages like tokenization and bag of words (the number of occurrences of each word within the text) can be performed using the NLTK (Natural Language Toolkit) library.
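Tokenization and bag of words can be illustrated with the standard library alone. The sketch below is an assumption-laden stand-in, not NLTK's tokenizer: the regular expression, the function names, and the sample review are all made up for the example, and NLTK's own tokenizers treat punctuation and contractions differently.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters, digits, and apostrophes.

    Emojis and other special characters are dropped, which is a crude
    form of the normalization step described above.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def bag_of_words(text):
    """Count how often each token occurs in the text."""
    return Counter(tokenize(text))


# Hypothetical review text for illustration.
review = "Great cast, great score... but the plot?! Not great :("
print(tokenize(review))
print(bag_of_words(review)["great"])  # 3
```

Note that a bag of words discards word order, so "Not great" contributes the same "great" count as a genuinely positive mention.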

The entertainment industry takes critic reviews seriously, and these reviews help production houses understand why their series or movie succeeded (or failed). Critic reviews also influence the commercial success of a drama or movie, since people check reviews before booking their movie tickets. With sentiment analysis, production houses can figure out the general opinion of critics. You can use one of two Rotten Tomatoes datasets for this project: the Rotten Tomatoes dataset or Kaggle's dataset.


3. Analyze IMDb Reviews

The next idea on our list is a machine learning sentiment analysis project. Like Rotten Tomatoes, IMDb is an entertainment review website where people leave their opinions on various movies and TV series. You can perform sentiment analysis on the reviews to find what viewers liked/disliked about the show. This beginner-friendly sentiment analysis project will help you learn about data science and machine learning applications in the entertainment industry.

A movie review in any language generally contains many common words (articles, prepositions, pronouns, conjunctions, etc.). These frequently repeated words are called stopwords, and they add little information to the text. NLP libraries like spaCy can efficiently remove stopwords from reviews during text processing. This reduces the size of the dataset and can improve model performance because the remaining data consists mostly of meaningful words.
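As a rough illustration of stopword removal, the sketch below filters tokens against a tiny hand-picked list. The `STOPWORDS` set and the `remove_stopwords` helper are assumptions for this example, not spaCy's list or API.

```python
# A tiny, hand-picked stopword list for illustration only.
# Negations such as "not" are deliberately left out of it.
STOPWORDS = {"a", "an", "the", "is", "was", "and", "or", "of", "in", "to", "it", "this"}

def remove_stopwords(text):
    """Return the lowercase tokens of `text` that are not stopwords."""
    return [tok for tok in text.lower().split() if tok not in STOPWORDS]

# Hypothetical review sentence.
tokens = remove_stopwords("The plot was slow and the ending is not satisfying")
```

Keeping "not" matters for sentiment work: dropping it would turn "not satisfying" into "satisfying". Many default stopword lists do include negations, so it is worth checking a library's list before applying it to sentiment data.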

These results are useful for production companies to understand why their title succeeded or failed. Beginners can use the small IMDb reviews dataset to test their skills. You can use the IMDb Dataset of 50k movie reviews for an advanced take on the same project.


4. Reviews of Scientific Papers

Sentiment analysis of citation contexts in research/review papers is an unexplored field, primarily because of the existing myth that most research papers have a positive citation. Additionally, negative citations are hardly explicit, and the criticisms are often veiled. There is a lack of explicit sentiment expressions, and it poses a significant challenge for successful polarity identification. 

Deriving sentiment from research papers requires both fundamental and intricate analysis. In such cases, unsupervised topic modeling with Latent Dirichlet Allocation (LDA) can be used to group research papers into different classes based on their abstracts. LDA is a probabilistic model that represents each document in a collection as a mixture of topics, where each topic is a distribution over words.

Sentiment analysis of citations in scientific papers and articles is an exciting project idea that can help researchers figure out what experts in their fields think about a topic. Here’s the scientific paper dataset to get started on this project: Machine Learning Dataset . It has N = 405 instances and is stored in JSON format. 

5. Track Customer Sentiment Over Time

As the business changes, so do customer interests and sentiments. When businesses start a new product line or change the prices of their products, it will affect customer sentiment. Tracking customer sentiment over time will help you measure and understand it. A change in sentiment score indicates if your changes emotionally resonate with the customers. Tracking both positive and negative sentiments will help companies improve products and fix blunders. 
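One simple way to track sentiment over time is to average per-review sentiment scores by month and watch how the average moves after a product or pricing change. The sketch below assumes each review already has a score in [-1, 1] from some sentiment model; the `monthly_average` helper and the records are hypothetical.

```python
from collections import defaultdict
from datetime import date

def monthly_average(records):
    """Average sentiment per (year, month).

    `records` is an iterable of (date, score) pairs, where score is
    assumed to lie in [-1, 1].
    """
    buckets = defaultdict(list)
    for day, score in records:
        buckets[(day.year, day.month)].append(score)
    return {month: sum(s) / len(s) for month, s in sorted(buckets.items())}

# Hypothetical scored reviews, made up for illustration.
records = [
    (date(2024, 1, 5), 0.6),
    (date(2024, 1, 20), 0.2),
    (date(2024, 2, 3), -0.4),
    (date(2024, 2, 18), -0.2),
]
trend = monthly_average(records)
```

Here the average drops from January to February, which is the kind of shift worth investigating against whatever changed in the business during that period.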

Learners can use open-source libraries like TensorFlow Hub for text processing on the raw text, such as removing punctuation and splitting the text on spaces. You can use the deep neural network (DNN) classifier model from the TensorFlow Estimator API to better understand customer sentiment. A DNN classifier stacks several layers of interconnected units (perceptrons), which lets it learn more complex patterns than a single-layer model.

You can use the Predicting Customer Satisfaction dataset or pick a dataset from data.world . 


6. Customer Feedback Project

The next sentiment analysis example project is about gaining insights from customer feedback. If a business offers services and asks users to leave feedback on its forum or by email, this project can help determine their satisfaction with those services. It can also gauge employees' satisfaction with the company and its processes. With suitable models, sentiment analysis can go beyond simple sentences to detect sarcasm, interpret common chat acronyms (LOL, ROFL, etc.), and handle common mistakes like misused and misspelled words.

Understanding sentiments of customer feedback involves text-processing techniques like part-of-speech tagging and lemmatization (transforming a word to its root form). These transformations help developers to clean data for feature engineering . Learners can perform such analysis in a short duration while using fewer resources with cloud services tools like IBM Watson or Amazon Comprehend.

You can use the Customer Feedback Dataset for this project. 


7. Analyze a Company’s Reputation (News + Social Media)

For this intermediate sentiment analysis project, you can pick any company to perform a detailed opinion analysis. Sentiment analysis will help you to understand public opinion on the company and its products. 

To find the public opinion on any company, start by collecting data from relevant sources, like its Facebook and Twitter pages. Analyze the conversations between users to find the overall brand perception in the market. For a more detailed analysis, you can scrape data from various review sites.

Python provides many scraping libraries like ‘Beautiful Soup’ to collect data from websites. This data can then be converted into a dataframe using the Pandas library. To perform NLP operations on a dataframe, the Gensim library can be used to carry out N-gram analysis in addition to basic text processing. N-gram analysis helps you understand meaning in context by treating sequences of two or more adjacent words as single units. A sequence of two words is termed a ‘bigram,’ and a sequence of three words a ‘trigram.’ This analysis considers the association of words to understand the actual sentiment of the text. For instance, if bigram analysis is performed on the text “battery performance is not good,” the bigram “not good” reflects a negative sentiment.
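The N-gram idea can be illustrated with a few lines of plain Python. The `ngrams` helper below is a hypothetical name for this sketch and is not part of Gensim or any other library mentioned above.

```python
def ngrams(tokens, n):
    """Return all runs of n adjacent tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "battery performance is not good".split()
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
```

A unigram model would see the positive word "good" in isolation, while the bigram `("not", "good")` keeps the negation attached to it.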


8. Twitter Sentiment Analysis

In the first advanced sentiment analysis project, you'll learn how to make a Twitter sentiment analysis project using Python. Twitter helps corporations, businesses, and governments to get public opinion on any trending topic. For this Twitter sentiment analysis Python project, you should have some basic or intermediate experience in performing opinion mining. 

You must also have some experience with RESTful APIs since Twitter API is required to extract data. The project also uses the Naive Bayes Classifier to classify the data later in the project. It's a time-consuming project but will show your expertise in opinion mining. 

Start by getting authorized credentials from Twitter, create the data-collection function, and build your first test set using the Twitter API. Strip non-textual components from the tweets: unless you use deep learning to process them, they won't affect the polarity of the sentiment analysis. Remove repeated characters and fix typos, since data cleaning is vital to getting the best results. Use the Naive Bayes Classifier to analyze the sentiment. Finally, test your model and see whether it's producing the desired results.
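To show what the Naive Bayes step does, here is a minimal multinomial Naive Bayes sketch with add-one (Laplace) smoothing, written from scratch. The `NaiveBayes` class and the four training tweets are assumptions made for illustration; a real project would fetch tweets through the Twitter API and would likely use an established library implementation.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens (illustrative sketch)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # per-class token counts
        self.class_counts = Counter(labels)      # documents per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            # Add-one smoothing: every vocabulary word gets one extra count.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in text.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)

# Hypothetical labelled tweets, made up for illustration.
train_texts = [
    "love this phone",
    "great camera love it",
    "terrible battery",
    "hate the terrible screen",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("love the camera")
```

Working in log space avoids numeric underflow when multiplying many small probabilities, and the smoothing keeps a single unseen word-class pair from driving a class probability to zero.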

Performing sentiment analysis on tweets is a fantastic way to test your knowledge of this subject. It'll be a great addition to your data science portfolio (or CV) as well. 


9. Sentiment Analysis Based on News Topics during COVID-19

It's been over a year since the first lockdown in many countries worldwide because of the COVID-19 pandemic. The pandemic not only endangered our physical health; social distancing also posed a significant threat to our emotional stability. With a sentiment analysis project on COVID-19 news, you can understand how others responded to the pandemic and to misinformation.

TextBlob sentiment analysis is helpful for exploring public sentiment in a research project. You can use Twitter, Facebook, or LinkedIn to gather user-generated content reflecting the public's reactions to the pandemic. For a more advanced approach, you can compare public opinion from January 2020 to December 2020 with public opinion from January 2021 to October 2021.

10. Toxic Comments Classification

Forums, blogs, discussion boards, and websites like Wikipedia all encourage people to share their knowledge. However, not every user takes part appropriately. Some see these platforms as an avenue to vent their insecurity, rage, and prejudices on social issues, organizations, and the government. Platforms like Wikipedia that run on user-generated content depend on user discussion to curate and approve content. Maintaining positivity requires the community to flag and remove harmful content quickly.

For the next advanced level sentiment analysis project, you can create a classifier model to predict if the input text is inappropriate (toxic). Use the Toxic Comment Classification Challenge dataset for this project.

Over the years, analyses were mostly limited to structured data within organizations. However, companies now realize the benefits of unstructured data for generating insights that could enhance their business operations. Consequently, there is a rising demand for professionals who can perform various NLP-based analyses, including sentiment analysis, to help companies make informed decisions. Gaining expertise by completing the projects listed above can differentiate you in the competitive data science industry and lead to better job opportunities.



Survey on sentiment analysis: evolution of research methods and topics

Jingfeng Cui

1 Institute of High Performance Computing, A*STAR, 1 Fusionopolis Way, Singapore, 138632 Singapore

2 School of Information Management, Nanjing Agricultural University, 1 Weigang, Nanjing, 210095 China

Zhaoxia Wang

3 School of Computing and Information Systems, Singapore Management University, 80 Stamford Rd, Singapore, 178902 Singapore

Seng-Beng Ho

Erik Cambria

4 School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore, 639798 Singapore

Associated Data

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Sentiment analysis, one of the research hotspots in the natural language processing field, has attracted the attention of researchers, and research papers on the field are increasingly published. Many literature reviews on sentiment analysis involving techniques, methods, and applications have been produced using different survey methodologies and tools, but there has not been a survey dedicated to the evolution of research methods and topics of sentiment analysis. There have also been few survey works leveraging keyword co-occurrence on sentiment analysis. Therefore, this study presents a survey of sentiment analysis focusing on the evolution of research methods and topics. It incorporates keyword co-occurrence analysis with a community detection algorithm. This survey not only compares and analyzes the connections between research methods and topics over the past two decades but also uncovers the hotspots and trends over time, thus providing guidance for researchers. Furthermore, this paper presents broad practical insights into the methods and topics of sentiment analysis, while also identifying technical directions, limitations, and future work.

Introduction

Web 2.0 has driven the proliferation of user-generated content on the Internet. This content is closely related to the lives, emotions, and opinions of users. Therefore, analysis of this user-generated data is beneficial for monitoring public opinion and assisting in making decisions. Sentiment analysis, as one of the most popular applications of text-based analytics, can be used to mine people’s attitudes, emotions, appraisals, and opinions about issues, entities, topics, events, and products (Cambria et al. 2022a , b , c , d ; Injadat et al. 2016 ; Jiang et al. 2017 ; Liang et al. 2022 ; Oueslati et al. 2020 ; Piryani et al. 2017 ). Sentiment analysis can help us interpret emotions in unstructured texts as positive, negative, or neutral, and even calculate how strong or weak the emotions are. Today, sentiment analysis is widely used in various fields, such as business, finance, politics, education, and services. This analytical technique has gained broad acceptance not only among researchers but also among governments, institutions, and companies (Khatua et al. 2020 ; Liu et al. 2012 ; Sánchez-Rada and Iglesias 2019 ; Wang et al. 2020b ). It helps policy leaders, businessmen, and service people make better decisions.

The majority of user-generated content is unstructured text, which greatly increases the difficulty of sentiment analysis. Since 2000, researchers have been exploring techniques and methods to enhance the accuracy of such analysis. The popularity of social media platforms has brought people around the world closer together. With the continuous advancement of technology, the research topics, application fields, and core methods and technologies of sentiment analysis are also constantly changing.

Comparing and analyzing papers from specific disciplines can help researchers gain a comprehensive understanding of the field. There have been many surveys on sentiment analysis (Nair et al. 2019 ; Obiedat et al. 2021 ; Raghuvanshi and Patil 2016 ). However, there is a lack of adequate discussion on the connections between research methods and topics in the field, as well as on their evolution over time. In 1983, Callon et al. proposed co-word analysis (Callon et al. 1983 ). It can effectively reflect the correlation strength of information items in text data. Co-word analysis based on the frequency of co-occurrence of keywords used to describe papers can reveal the core contents of the research in specific fields. An evolutionary analysis of the associations between core contents is helpful for a comprehensive understanding of the research hotspots and frontiers in the field (Deng et al. 2021 ). It can provide guidance for researchers, especially those who are new to the field, and help them determine research directions, avoid repetitive research, and better discover and grasp the research trends in this field (Wang et al. 2012 ). To fill in the gap in existing research, we conduct keyword co-occurrence analysis and evolution analysis with informetric tools to explore the research hotspots and trends of sentiment analysis.
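The core counting step behind keyword co-occurrence (co-word) analysis can be sketched as follows. This is only an illustration under simple assumptions (lowercased keywords, each unordered pair counted once per paper); the `cooccurrence` helper and the keyword lists are hypothetical and do not reproduce the informetric tools used in this survey.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(keyword_lists):
    """Count how many papers each unordered keyword pair appears in."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique = sorted(set(k.lower() for k in keywords))
        pairs.update(combinations(unique, 2))
    return pairs

# Hypothetical keyword lists for three papers.
papers = [
    ["sentiment analysis", "deep learning", "twitter"],
    ["sentiment analysis", "deep learning", "LSTM"],
    ["sentiment analysis", "lexicon"],
]
counts = cooccurrence(papers)
```

The resulting pair counts can serve as edge weights in a keyword network, on which a community detection algorithm can then group closely related keywords.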

The main contributions of this survey are as follows:

  • Using keyword co-occurrence analysis and informetric tools, the paper presents a survey on sentiment analysis that explores and uncovers useful information about the field.
  • A keyword co-occurrence network is constructed by combining the paper title, abstract, and author keywords. Through the keyword co-occurrence network and community detection algorithm, the research methods and topics in the field of sentiment analysis, along with their evolution in the past two decades, are discussed.
  • The paper summarizes the research hotspots and trends in sentiment analysis. It also highlights practical implications and technical directions.

The remainder of this paper is organized as follows: In Sect.  2 , we summarize and analyze the existing surveys on sentiment analysis and present the research purpose and methodologies of this paper. Section  3 details the survey methodology, including the collection and processing of scientific publications, visualization, and analysis using different methods and tools. In Sect.  4 , we analyze the results obtained from the keyword co-occurrence analysis and evolution analysis, along with the research hotspots and trends in sentiment analysis identified through the analysis results. Finally, in Sect.  5 , we summarize the research conclusions as well as the practical implications and technical directions of sentiment analysis. We also clarify the limitations of this paper and make suggestions for future work.

Existing surveys on sentiment analysis

Sentiment analysis is a concept encompassing many tasks, such as sentiment extraction, sentiment classification, opinion summarization, review analysis, sarcasm detection, and emotion detection. Since the 2000s, sentiment analysis has become a popular research field in natural language processing (Hussein 2018). In the existing surveys, the researchers mainly conducted specific analyses of the tasks, technologies, methods, analysis granularity, and application fields involved in the sentiment analysis process.

Surveys on contents and topics of sentiment analysis

When research on sentiment analysis was still in its infancy, the contents and topics of surveys mainly focused on sentiment analysis tasks, analysis granularity, and application areas. Kumar et al. reviewed the basic terms, tasks, and levels of granularity related to sentiment analysis (Kumar and Sebastian 2012). They also discussed some key feature selection techniques and the applications of sentiment analysis in business, politics, recommender systems, and other fields. Nassirtoussi et al. explored the application of sentiment analysis in market prediction (Nassirtoussi et al. 2014). Medhat et al. analyzed the improvement of the algorithms proposed in 2010–2013 and their application fields (Medhat et al. 2014). Ravi et al. analyzed the papers related to opinion mining and sentiment analysis from 2002 to 2015. Their study mainly discussed the necessary tasks, methods, applications, and unsolved problems in the field of sentiment analysis (Ravi and Ravi 2015).

Existing surveys of the applications of sentiment analysis have focused more on the domains of market research, medicine, and social media in recent years. Rambocas et al. examined the application of sentiment analysis in marketing research from three main perspectives, including the unit of analysis, sampling design, and methods used in sentiment detection and statistical analysis (Rambocas and Pacheco 2018 ). Cheng et al. summarized techniques based on semantic, sentiment, and event extraction, as well as hybrid methods employed in stock forecasting (Cheng et al. 2022 ). Yue et al. categorized and compared a large number of techniques and approaches in the social media domain. That study also introduced different types of data and advanced research tools, and discussed their limitations (Yue et al. 2019 ). In the context of the COVID-19 epidemic, Alamoodi et al. reviewed and analyzed articles on the occurrence of different types of infectious diseases in the past 10 years. They reviewed the applications of sentiment analysis from the identified 28 articles, summarizing the adopted techniques such as dictionary-based models, machine learning models, and mixed models (Alamoodi et al. 2021b ); Alamoodi et al. also conducted a review of the applications of sentiment analysis for vaccine hesitancy (Alamoodi et al. 2021a ). Researchers also reviewed the application of sentiment analysis in the fields of election prediction (Brito et al. 2021 ), education (Kastrati et al. 2021 ; Zhou and Ye 2020 ) and service industries (Adak et al. 2022 ).

Quite a number of surveys have investigated sentiment analysis in non-English languages, including Chinese (Peng et al. 2017), Arabic (Al-Ayyoub et al. 2019; Boudad et al. 2018; Nassif et al. 2021; Oueslati et al. 2020), Urdu (Khattak et al. 2021), Spanish (Angel et al. 2021), and Portuguese (Pereira 2021). They mainly reviewed the classification frameworks of the sentiment analysis process, supported language resources (dictionaries, natural language processing tools, corpora, ontologies, etc.), and deep learning models used (CNN, RNN, and transfer learning) for each of the languages involved.

Surveys on methods of sentiment analysis

Before machine learning technology became mature, researchers were particularly concerned about feature extraction methods. For example, Feldman summarized methods for extracting preferred entities from indirect opinions and methods for dictionary acquisition (Feldman 2013). Asghar et al. reviewed the natural language processing techniques for extracting features based on part of speech and term position; statistical techniques for extracting features based on word frequency and decision tree model; and techniques for combining part of speech tagging, syntactic feature analysis, and dictionaries (Asghar et al. 2014). Koto et al. discussed the best features for Twitter sentiment analysis prior to 2014 by comparing 9 feature sets (Koto and Adriani 2015). They found that the best features at that time for sentiment analysis of Twitter texts were AFINN (a list of English terms used for sentiment analysis manually rated by Finn Årup Nielsen) (Nielsen 2011) and Senti-Strength (Thelwall et al. 2012). Taboada sorted out the characteristics of words, phrases, and sentence patterns in sentiment analysis from the perspective of linguistics (Taboada 2016). Besides, Schouten and Frasincar conducted a comprehensive and in-depth critical evaluation of 15 sentiment analysis web tools (Schouten and Frasincar 2015). Medhat et al. (2014) and Ravi et al. (Ravi and Ravi 2015) also analyzed the early algorithms for sentiment analysis.

In the study by Schouten et al., the authors focused on aspect-level sentiment analysis, combing the techniques of aspect-level sentiment analysis before 2014, such as frequency-based, syntax-based, supervised machine learning, unsupervised machine learning, and hybrid approaches. They concluded that the latest technology was moving beyond the early stages (Schouten and Frasincar 2015 ). As research into sentiment analysis became more and more popular and there was important progress made in the development of deep learning technologies, researchers started to pay more attention to the techniques and methods of sentiment analysis. Deep learning methods in particular became the focus of discussions among researchers.

Prabha et al. analyzed various deep learning methods used in different applications at the level of sentence and aspect/object sentiment analysis, including Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Long Short-term Memory (LSTM) (Prabha and Srikanth 2019 ). They discussed the advantages and disadvantages of these methods and their performance parameters. Ain et al. introduced deep learning techniques such as Deep Neural Network (DNN), CNN and Deep Belief Network (DBN) to solve sentiment analysis tasks like sentiment classification, cross-lingual problems, and product review analysis (Ain et al. 2017 ). Zhang et al. investigated deep learning and machine learning techniques for sentiment analysis in the contexts of aspect extraction and categorization, opinion expression extraction, opinion holder extraction, sarcasm analysis, multimodal data, etc. (Zhang et al. 2018 ). Habimana et al. compared the performance of deep learning methods on specific datasets and proposed that performance could be improved using models including Bidirectional Encoder Representations from Transformers (BERT), sentiment-specific word embedding models, cognitive-based attention models, and commonsense knowledge (Habimana et al. 2020 ). Wang et al. reviewed and discussed existing analytical models for sentiment classification and proposed a computational emotion-sensing model (Wang et al. 2020b ).

Some researchers also discussed web tools (Zucco et al. 2020 ), fuzzy logic algorithms (Serrano-Guerrero et al. 2021 ), transformer models (Acheampong et al. 2021 ), and sequential transfer learning (Chan et al. 2022 ) for sentiment analysis.

Overall survey methodology

With the increase in the popularity of sentiment analysis research, more related research results began to accumulate. Researchers needed to systematically organize and analyze results from a large number of publications to perform literature reviews. They used different survey methodologies to conduct surveys of a large number of papers.

Content analysis is a powerful approach to characterizing the contents of each study by carefully reading its content and manually identifying, coding, and organizing key information in it. A literature review is formed as a result of the repeated use of this approach (Elo and Kyngäs 2008 ; Stemler 2000 ). Content analysis has been used for different studies and systematic reviews (Qazi et al. 2015 , 2017 ). For example, Birjali et al. have studied the most commonly used classification techniques in sentiment analysis from a large amount of literature and introduced the application areas and sentiment classification processes, including preprocessing and feature selection (Birjali et al. 2021 ). They conducted a comprehensive analysis of the papers, discovering that supervised machine learning algorithms are the most commonly used techniques in the field. A complete review of methods and evaluation for sentiment analysis tasks and their applications was conducted by Wankhade et al. ( 2022 ). They compared the strengths and weaknesses of the methods, and discussed the future challenges of sentiment analysis in terms of both the methods and the forms of the data. Although this method can review the research contents and penetrate into the cores of the papers most systematically, it requires a considerable amount of manpower and time for in-depth literature reading.

The systematic literature review guideline proposed by Kitchenham and Charters has gradually attracted the attention of researchers (Kitchenham 2004; Kitchenham and Charters 2007; Sarsam et al. 2020). This review process is divided into six stages: research question definition, search strategy formulation, inclusion and exclusion criteria definition, quality assessment, data extraction, and data synthesis. By using this standard process, researchers can eliminate a large number of retrieved papers and then conduct further analysis and research on the small number that remain. Kumar et al. reviewed context-based sentiment analysis in social multimedia between 2006 and 2018. From the 573 papers retrieved in the initial search, they finally selected 37 papers to use in discussing sentiment analysis techniques (Kumar and Garg 2020). This approach was also used by Kumar et al. in their research on sentiment analysis on Twitter using soft computing techniques. They selected 60 articles out of 502 for follow-up analysis (Kumar and Jaiswal 2020). Zunic et al. selected 86 papers from 299 papers retrieved in the period 2011–2019 to discuss the application of sentiment analysis techniques in the field of health and well-being (Zunic et al. 2020); Ligthart et al. followed Kitchenham’s guideline and identified 14 secondary studies. They provided an overview of specific sentiment analysis tasks and of the features and methods required for different tasks (Ligthart et al. 2021). Obiedat (Obiedat et al. 2021), Angel (Angel et al. 2021), and Lin (Lin et al. 2022) also all followed this guideline to select literature for further analysis. This method can reduce the amount of literature that requires in-depth reading, but in the case of a large amount of literature, more effort is still required to search and screen the material than in traditional literature review methods (Kitchenham and Charters 2007).

There are also a few authors who have used informetric methods to review papers. Piryani et al. conducted an informetric analysis of research on opinion mining and sentiment analysis from 2000 to 2015 (Piryani et al. 2017). The authors used social network analysis, literature co-citation analysis, and other methods in the paper. They analyzed publication growth rates; the most productive countries, institutions, journals, and authors; and topic density maps and keyword bursts, among other elements. To a certain extent, they interpreted core authors, core papers, areas of research focus in this field, and the current state of national cooperation. In order to explore the application of sentiment analysis in building smart societies, Verma collected 353 papers published between 2010 and 2021 (Verma 2022). Using a topic analysis perspective combined with the Louvain algorithm, the author identified four sub-topics in the research field. Similarly, Mäntylä et al. employed LDA techniques and manual classification to explore the topic structures of sentiment analysis articles (Mäntylä et al. 2018). The informetric methods use natural language processing technologies to intuitively conduct topic mining and analysis of a large number of papers. Through topic clustering, the literature is organized and analyzed, which reduces the time researchers spend on reading the literature in depth. These methods are suitable for exploring research topics and trends in the field.

Summary of advantages and disadvantages of the existing surveys

In the following, we discuss the advantages and disadvantages of the existing surveys from a number of different points of view.

From the point of view of the contents and topics of sentiment analysis

As summarized in Table 1, the researchers organized the literature and conducted in-depth investigations of the contents and topics of sentiment analysis. They reviewed the tasks of sentiment analysis (e.g., different text granularity, opinion mining, spam review detection, and emotion detection), the application areas of sentiment analysis (e.g., market, medicine, social media, and election prediction), and different languages for sentiment analysis, such as Chinese, Spanish, and Arabic (Adak et al. 2022; Al-Ayyoub et al. 2019; Alamoodi et al. 2021a, b; Alonso et al. 2021; Angel et al. 2021; Boudad et al. 2018; Brito et al. 2021; Cheng et al. 2022; Hussain et al. 2019; Kastrati et al. 2021; Khattak et al. 2021; Koto and Adriani 2015; Kumar and Sebastian 2012; Ligthart et al. 2021; Medhat et al. 2014; Nassif et al. 2021; Nassirtoussi et al. 2014; Oueslati et al. 2020; Peng et al. 2017; Pereira 2021; Rambocas and Pacheco 2018; Ravi and Ravi 2015; Schouten and Frasincar 2015; Sharma and Jain 2020; Yue et al. 2019; Zhou and Ye 2020). They summarized the methods and application prospects of sentiment analysis under different contents and topics. As the field has grown, new topics have emerged, and knowledge from other fields has been gradually integrated into it. In recent years, the popularity of social media has aroused increasing interest in sentiment analysis research, and the number of papers published, especially those related to different topics of sentiment analysis, has grown rapidly. However, the existing surveys cover a short time range, and there has not been a survey dedicated to the evolution of research contents or topics of sentiment analysis. There have also been few survey works analyzing the connections between topics and methods, or their evolution (e.g., how the contents and topics of sentiment analysis have changed over time).

Advantages and disadvantages of the existing surveys

From the point of view of the methods of sentiment analysis

Some researchers reviewed different techniques and methods of sentiment analysis in different application areas and tasks. They analyzed and discussed sentiment analysis methods based on lexicons, rules, part of speech, term position, statistical techniques, supervised and unsupervised machine learning methods, as well as deep learning methods like LSTM, CNN, RNN, DNN, DBN, BERT, and other hybrid approaches (Acheampong et al. 2021; Ain et al. 2017; Alamoodi et al. 2021b; Asghar et al. 2014; Chan et al. 2022; Cheng et al. 2022; Feldman 2013; Habimana et al. 2020; Koto and Adriani 2015; Kumar and Sebastian 2012; Medhat et al. 2014; Prabha and Srikanth 2019; Ravi and Ravi 2015; Schouten and Frasincar 2015; Serrano-Guerrero et al. 2021; Taboada 2016; Wang et al. 2020b; Yue et al. 2019; Zhang et al. 2018; Zucco et al. 2020). These researchers also compared the advantages and disadvantages of each method. As summarized in Table 1, even though existing surveys analyze the techniques and methods of sentiment analysis and provide good insights, there has not been a survey that analyzes the evolution of research methods over time. There have also been few survey works that focus on the connections between topics and methods of sentiment analysis, and their evolution over time.

From the point of view of the overall survey methodology

The survey methods used have mainly been the content analysis method, Kitchenham and Charters' guideline, and the informetric methods. As summarized in Table 1, the content analysis method can effectively analyze the contents of research papers in depth, but it does not address the issue of the evolution of the research methods and topics (Bengtsson 2016; Birjali et al. 2021; Elo and Kyngäs 2008; Krippendorff 2018; Qazi et al. 2015, 2017; Wankhade et al. 2022). Although the number of papers that need to be read in depth can be reduced by following Kitchenham and Charters' guideline, more effort is needed to search and screen literature than in traditional literature review methods (Angel et al. 2021; Kitchenham 2004; Kitchenham and Charters 2007; Kumar and Garg 2020; Ligthart et al. 2021; Lin et al. 2022; Obiedat et al. 2021; Sarsam et al. 2020; Zunic et al. 2020). The informetric methods are best suited to investigating the research methods and topics of sentiment analysis (Bar-Ilan 2008; Mäntylä et al. 2018; Piryani et al. 2017; Santos et al. 2019; Verma 2022). There are three surveys using informetric techniques and tools that are well suited for analysis of a large number of papers over many years (Mäntylä et al. 2018; Piryani et al. 2017; Verma 2022). However, the evolution of research methods and topics of sentiment analysis over time has not been studied with informetric methods. There have also been few survey works that leverage keyword co-occurrence analysis and community detection to analyze the connections between research methods and topics, and their evolution over time.

Therefore, to address the gaps in the existing surveys, this study presents a survey on the research methods and topics, and their evolution over time. It combines keyword co-occurrence analysis and informetric analysis tools to reveal the methods and topics of sentiment analysis and their evolution in this field from 2002 to 2022.

The following section, Sect.  3 , describes our proposed survey methodology in detail.

The proposed survey methodology

This section describes our proposed survey methodology, including the collection of scientific publications, the processing of scientific publications, and visualization and analysis using different methods and tools. The overall scheme of this survey (Fig. 2) is also presented at the end of Sect. 3 to visualize and summarize the proposed survey methodology.

Graphical representation of the overall scheme of this survey. Module A: Collection of scientific publications; Module B: Processing of scientific publications; Module C: Visualization and analysis using different methods and tools; Module D: Result analysis and discussions considering various aspects

Collection of scientific publications

We collected research data from the Web of Science platform. We used keywords such as "sentiment analysis," "sentiment mining," and "sentiment classification" to search for relevant papers as data samples. In examining the retrieved papers, we found that some paper topics, paper types, and publication journals were not related to sentiment analysis, so we excluded them. The papers we included were mainly related to the sentiment analysis of texts. We excluded papers on sentiment analysis related to image processing, video processing, speech processing, biological signal processing, etc. Therefore, the retrieval strategy was as follows:

Topic Search (TS) = ("sentiment analy*" OR "sentiment mining" OR "sentiment classification") AND Abstract (AB) = "sentiment" NOT TS = ("face image*" OR "speech recognition" OR "speech emotion" OR "physiological signal*" OR "music emotion*" OR "facial feature extraction" OR "video emotion" OR "electroencephalography" OR "biosignal*" OR "image process*") NOT Title = ("facial" OR "speech" OR "sound*" OR "face" OR "dance" OR "temperature" OR "image*" OR "spoken" OR "electroencephalography" OR "EEG" OR "biosignal*" OR "voice*") NOT AB = "facial".

The results in conferences are given the same relevance as journal papers. We chose four databases in the Web of Science: two conference citation databases (Conference Proceedings Citation Index—Social Sciences & Humanities [CPCI-SSH], and Conference Proceedings Citation Index—Science [CPCI-S]), and two journal citation databases (Science Citation Index Expanded [SCI-Expanded] and Social Sciences Citation Index [SSCI]). Given the various forms of words such as "analyzing" and "analysis," a truncated search technique (marked with an asterisk) was used to prevent the omission of relevant papers. The time frame of the retrieved papers was from January 2002 to January 2022, and the publication types of the papers included "article," "conference paper," "review," and "edited material." A total of 9,714 papers were obtained from the four databases above. These included 3,809 articles, 5,633 proceeding papers, 267 reviews, and 5 pieces of editorial material from 2002 to 2022. Overall, there were 104 papers from January 2022. The number of papers each year from 2002 to 2021 is shown in Fig.  1 .
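The truncated search described above can be illustrated with a small sketch. It only models how a trailing asterisk lets one term match several word forms; it does not reproduce the Web of Science query engine, and the names `wildcard_to_regex`, `matches_any`, and `INCLUDE_TERMS` are hypothetical.

```python
import re
from typing import Iterable


def wildcard_to_regex(term: str):
    """Compile a search term whose '*' stands for any run of word characters."""
    # Escape the term, then turn each escaped asterisk into a word-character wildcard.
    pattern = re.escape(term).replace(r"\*", r"\w*")
    return re.compile(r"\b" + pattern + r"\b", re.IGNORECASE)


def matches_any(text: str, terms: Iterable[str]) -> bool:
    """Return True if at least one term (with wildcards) occurs in the text."""
    return any(wildcard_to_regex(term).search(text) for term in terms)


# Inclusion terms taken from the topic search in the retrieval strategy.
INCLUDE_TERMS = ["sentiment analy*", "sentiment mining", "sentiment classification"]

print(matches_any("A study on sentiment analyzing of tweets", INCLUDE_TERMS))
print(matches_any("Speech recognition for robots", INCLUDE_TERMS))
```

Under these assumptions, "sentiment analy*" covers both "sentiment analysis" and "sentiment analyzing", which is the purpose of the truncation.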

The number of papers each year from 2002 to 2021

Processing of scientific publications

In this process, our purpose was to extract the key contents of the papers, which are used to analyze the research methods and topics in the field of sentiment analysis. Because they are few in number, the author keywords of a paper often cannot fully represent its key content. We found that adding the title and abstract could better reflect the core information. Therefore, we combined the title, abstract, and author keywords of each paper and used KeyBERT 1 to extract keywords that represented the main research methods and topics of the paper. KeyBERT is a keyword extraction technique that uses BERT embeddings to create keywords and key phrases that most closely resemble the document content (Grootendorst and Warmerdam 2021). The specific keyword extraction process was as follows:

First, we used KeyBERT to extract 8 keywords and eliminated keywords with a weight lower than 0.3. We then combined the extracted keywords with the author keywords and removed duplicates. After that, we standardized the whole collection of keywords and merged synonyms. Finally, we counted the number of keywords and removed meaningless terms like "sentiment analysis," "sentiment classification," and "sentiment mining."
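The post-processing steps above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the extractor has already returned `(keyword, score)` pairs, it applies the clean-up per paper rather than on the whole collection, and the function name, the synonym map, and the sample data are hypothetical.

```python
def merge_keywords(scored, author_keywords, synonyms, stop_terms,
                   top_n=8, min_score=0.3):
    """Combine extracted and author keywords into one cleaned list for a paper."""
    # Keep the top_n highest-scoring candidates, then drop those below min_score.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
    kept = [kw for kw, score in ranked if score >= min_score]

    merged, seen = [], set()
    for kw in kept + list(author_keywords):
        norm = kw.strip().lower()          # standardize spelling
        norm = synonyms.get(norm, norm)    # merge synonyms into one form
        if norm in stop_terms or norm in seen:
            continue                       # skip generic terms and duplicates
        seen.add(norm)
        merged.append(norm)
    return merged


# Hypothetical extractor output and author keywords for one paper.
scored = [("twitter", 0.62), ("sentiment analysis", 0.71),
          ("tweets", 0.25), ("support vector machines", 0.41)]
author = ["Twitter", "SVM"]
synonyms = {"support vector machines": "svm"}
stop_terms = {"sentiment analysis", "sentiment classification", "sentiment mining"}

print(merge_keywords(scored, author, synonyms, stop_terms))
```

The score threshold and the number of extracted keywords default to the values stated in the text (0.3 and 8); everything else is an assumption made for the example.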

After statistical analysis, we obtained 41,827 distinct keywords with a total word frequency of 88,104 across the 9,714 papers. We found that most of the keywords with a word frequency below 10 were not representative of the research contents of sentiment analysis. As a result, a total of 685 representative keywords were retained for subsequent analysis. These keywords appeared a total of 30,801 times. Table 2 shows the 50 keywords with the highest word frequency.

Keywords with word frequency in the top 50

High-frequency keywords generally represent research hotspots. We therefore extracted high-frequency keywords to serve as the basis for the subsequent analysis. We found that most of the keywords with a word frequency of 18 or lower, such as "ranking," "mask," "experience," "affect," and "online forum," were not relevant to sentiment analysis. Therefore, the keywords with a word frequency higher than 18 were retained for analysis. These keywords appeared 25,429 times in the collected data, accounting for close to 83% of the 30,801 occurrences of the representative keywords. We obtained 275 keywords, which were used to analyze the main methods and topics of sentiment analysis.
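The frequency-based filtering can be sketched in a few lines. The helper `keep_frequent` and the toy input are hypothetical; the two totals at the end are the figures reported in the text, used only to check the stated share.

```python
from collections import Counter


def keep_frequent(keyword_lists, min_count):
    """Count keywords over all papers and keep those seen at least min_count times."""
    counts = Counter(kw for keywords in keyword_lists for kw in keywords)
    return {kw: n for kw, n in counts.items() if n >= min_count}


# Toy input: one keyword list per paper (hypothetical data).
papers = [["twitter", "lstm"], ["twitter", "svm"], ["twitter", "lstm"]]
print(keep_frequent(papers, min_count=2))

# Share of high-frequency keyword occurrences among the occurrences of the
# 685 representative keywords, using the totals reported in the text.
share = 25429 / 30801
print(f"{share:.1%}")
```

In the setting described above, "word frequency higher than 18" would correspond to `min_count=19`.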

Visualization and analysis using different methods and tools

Analytical methods

Keywords are the core natural language vocabulary to express the subject, content, ideas, and research methods of the literature (You et al. 2021 ). Keywords represent the topics of the domain, and cluster analysis of these words can reflect the structure and association of topics. Keyword co-occurrence analysis counts the number of occurrences of a set of keywords in the same document. The strength and number of associations between research contents can be obtained through keyword co-occurrence analysis. Dividing research methods and topics into sub-communities helps researchers to analyze hotspots and trends in methods and topics, as well as to obtain sub-fields of sentiment analysis research (Ding et al. 2001 ).
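Keyword co-occurrence counting, as described above, can be sketched with the standard library. This is an illustration of the counting idea only and not a description of how BibExcel computes it; the function name and the sample documents are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count, for every keyword pair, the number of documents containing both."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so that each pair has one canonical orientation.
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs


docs = [
    ["twitter", "lstm", "covid-19"],
    ["twitter", "lstm"],
    ["lstm", "svm"],
]
counts = cooccurrence_counts(docs)
print(counts[("lstm", "twitter")])
```

The resulting pair counts can serve as edge weights of a keyword co-occurrence network, with keywords as nodes.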

Visualization and analysis tools

BibExcel 2 is a software tool for analyzing bibliographic data or any text-based data formatted in a similar way (Persson 2017 ). The tool generates structured data files that can be read by Excel for subsequent processing (Persson et al. 2009 ). Our processing steps are as follows. First, we imported the standardized bibliographic data into BibExcel. This tool can help structure the data. Second, we checked and corrected the data and used BibExcel to count the number of co-occurrences of keywords.

We then used Pajek 3 software to visualize the keyword co-occurrence network and divide it into sub-communities. Pajek is a tool for analyzing large and complex networks (Batagelj and Andrej 2022; Batagelj and Mrvar 1998). It can calculate certain indicators to reveal the state and properties of the network involved. In addition, Pajek’s Louvain community detection algorithm can help divide the keyword co-occurrence network into sub-communities, which represent sub-fields of sentiment analysis (Blondel et al. 2008; Leydesdorff et al. 2014; Rotta and Noack 2011). The Louvain community detection algorithm unfolds a complete hierarchical community structure for the network. It has an advantage in subdividing different areas of study: multiple knowledge structures and details can be shown in one network (Deng et al. 2021).
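The Louvain method searches for a partition with high modularity, so a short sketch of the modularity score may help to show what "dividing into sub-communities" optimizes. This is not Pajek's implementation and not the full Louvain procedure; it only evaluates a given partition of a weighted, undirected network, and all names and data are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity of a partition of a weighted, undirected graph.

    edges: dict mapping (u, v) to a positive weight, each edge listed once.
    community_of: dict mapping each node to its community label.
    """
    total = sum(edges.values())      # total edge weight m
    internal = defaultdict(float)    # weight of edges inside each community
    degree = defaultdict(float)      # summed node degrees per community
    for (u, v), w in edges.items():
        degree[community_of[u]] += w
        degree[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    # Q = sum over communities of (internal / m) - (degree / 2m)^2
    return sum(internal[c] / total - (degree[c] / (2 * total)) ** 2
               for c in degree)


# Two triangles joined by a single edge (toy network).
edges = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1,
         ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1,
         ("c", "d"): 1}
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}

print(modularity(edges, split) > modularity(edges, merged))
```

In this toy network, separating the two triangles scores higher than placing all nodes in one community, which is the kind of improvement a modularity-based method looks for.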

After that, we applied VOSviewer 4 to optimize the visualization of sub-communities (Van Eck and Waltman 2010 ; VOSviewer 2021 ; Perianes-Rodriguez et al. 2016 ; Waltman and Van Eck 2013 ; Waltman et al. 2010 ). VOSviewer can help display the core keywords in each sub-community and the correlation between keywords. It can also reflect the closeness of the association between sub-communities. Finally, we used Excel to count the frequency of keywords for each year and to map the evolution of research methods and topics in the field of sentiment analysis.

Graphical representation of the overall scheme of this survey

This paper proposes and conducts a new research survey on sentiment analysis. The graphical representation of the overall scheme of this survey is shown in Fig.  2 . The main scheme includes four modules: Module A, Collection of scientific publications; Module B, Processing of scientific publications; Module C, Visualization and analysis through different methods and tools, and Module D, Result analysis and discussions based on various aspects.

In Module A, scientific publications are collected from the Web of Science (WOS) platform, as has been detailed in Sect.  3.1 Collection of scientific publications above. Module B, Processing of scientific publications, has been detailed in Sect.  3.2 above. It performs a data processing procedure to obtain key information, which includes all the representative keywords and high-frequency keywords. The title, abstract and keywords of the papers are used to extract such key information using KeyBERT (Grootendorst and Warmerdam 2021 ). Such key information is analyzed and visualized through different methods, including different visualization tools, as introduced in Sect.  3.3 (Module C), Visualization and analysis using different methods and tools, above.

In Module C, the number of co-occurrences of keywords is obtained using BibExcel (Persson 2017), and the co-occurrences of keywords are analyzed and visualized using Pajek (Blondel et al. 2008; Leydesdorff et al. 2014; Rotta and Noack 2011) and VOSviewer (Van Eck and Waltman 2010; VOSviewer 2021; Perianes-Rodriguez et al. 2016; Waltman and Van Eck 2013; Waltman et al. 2010). The keyword community network and the keyword community evolution are analyzed and visualized using these tools, as described in Sect. 3.3 (Module C), Visualization and analysis using different methods and tools. Based on the visualization and analysis results obtained in Module C, Module D, Result analysis and discussions, is detailed in Sect. 4.

In the following section, Sect.  4 (Module D), results are analyzed and discussed considering various aspects, including the research methods and topics of sentiment analysis in each community, the evolution of research methods and topics along with the research hotspots and trends over time.

Results and analysis through various aspects

Research methods and topics of sentiment analysis

Overall characteristic analysis

The high-frequency keywords are presented in Table 2. These keywords can be regarded as the main research contents in the field of sentiment analysis. "Twitter" ranks at the top. It is followed by "opinion mining," "natural language processing," "machine learning," and so on. The high-frequency keywords cover the topics of the studies, the contents of the studies, and the techniques and methods used. Based on these keywords, we used Pajek’s Louvain method to construct a keyword co-occurrence network to represent the research methods and topics, as shown in Fig. 3. The keyword co-occurrence network is divided into six communities. The research methods and topics of the six communities include social media platforms (C1), machine learning methods (C2), natural language processing and deep learning methods (C3), opinion mining and text mining (C4), Arabic sentiment analysis (C5), and others, such as domain sentiment analysis and transfer learning (C6).

Keyword community network

In Fig. 3, the size of a node represents the frequency of the keyword. The thickness of the line between two nodes represents the number of co-occurrences of the two keywords. The top 20 keywords in each community are sorted in descending order, as shown in Table 3. The keyword co-occurrence network features of the six sub-communities are described in Table 4. The number of nodes shows the number of keywords in each community, and the number of links shows the correlations between the keywords.

The top 20 keywords in each community

Global network characteristics of sub-communities

As shown in Table 4, the number of links between sub-communities indicates a strong correlation between them, especially between C3 and C4, which are connected by 1306 lines. The reason may be that the research methods of C4 focus on "opinion mining" and "text mining," while those of C3 focus on "natural language processing" and "deep learning," and C3 provides more technical support for C4 research. In C5 and C6, the research methods and topics are scattered. These communities have few internal links, but their connections with C3 and C4 are relatively strong. The contents of C5 and C6 may include some emerging research methods and topics. We present a specific analysis of the methods and topics of each sub-community in the next subsection.

Analysis on research methods and topics of sub-communities

Analysis on research methods and topics of the C1 community

Figure  4 shows the keyword co-occurrence network of the C1 community. The research methods and topics of the C1 community focus on three areas: "social media," "topic models," and "covid-19." In the context of big data, web 2.0 technology provides users with a way to express reviews and opinions of services, events, and people. Various social media platforms, such as Twitter, YouTube, and Weibo, have a large amount of users’ emotional data (Momtazi 2012 ). Compared to traditional news media, information on social media spreads more quickly, and people are able to express their feelings more freely. It is important to analyze the emotions generated by the information shared and published on social media (Abdullah and Zolkepli 2017 ; Wang et al. 2014 ). Researchers have been extracting text data from social media platforms for years to detect unexpected events (Bai and Yu 2016 ; Preethi et al. 2015 ), improve the quality of products (Abrahams et al. 2012 ; Isah et al. 2014 ; Myslin et al. 2013 ), understand the direction of public opinion (Fink et al. 2013 ; Groshek and Al-Rawi 2013 ), and so on.

The keyword co-occurrence network for the C1 community

Users’ sentiments are often associated with the topics, and the accuracy of sentiment analysis can be improved through the introduction of topic models (Li et al. 2010 ). Among them, the Latent Dirichlet Allocation (LDA) method is cited most frequently. Previous studies found that the LDA method can be effective in subdividing topics and identifying the sentiments of the contents. This method is quite general, and there are also many improved models based on this one that can be applied to any type of web text, helping to enhance the accuracy of sentiment polarity calculation (Chen et al. 2019 ; Liu et al. 2020 ).

As the COVID-19 pandemic has unfolded, a large number of individuals, media outlets, and governments have been publishing news and opinions about the COVID-19 crisis on social media platforms. This has resulted in many sentiment analysis studies that focus on COVID-19-related texts and explore the impact of the epidemic on people’s lives (Sari and Ruldeviyani 2020; Wang et al. 2020a), physical health (Berkovic et al. 2020; Binkheder et al. 2021), mental health (Yin et al. 2020), and so on. Therefore, we can see many related keywords, such as "infodemiology," "healthcare," and "mental health."

Analysis on research methods and topics of the C2 community

The contents of the C2 community mainly focus on "machine learning," "text classification," "feature extraction," and "stock market" (see Fig.  5 ). Most keywords are related to the research methods of sentiment analysis. Machine learning approaches have expanded from topic recognition to more challenging tasks such as sentiment classification. It is very important to explore and compare machine learning methods applied to sentiment classification (Li and Sun 2007 ). Methods like Support Vector Machine (SVM) and Naive Bayes models are widely used (Altrabsheh et al. 2013 ; Dereli et al. 2021 ; Shofiya and Abidi 2021 ; Tan et al. 2009 ; Wang and Lin 2020 ) and are used as benchmarks for the comparisons of models proposed by many researchers (Kumar et al. 2021 ; Sadamitsu et al. 2008 ; Waila et al. 2012 ; Zhang et al. 2019 ). Many algorithms, such as random forest (Al Amrani et al. 2018 ; Fitri et al. 2019 ; Sutoyo et al. 2022 ), tf-idf (Arafin Mahtab et al. 2018 ; Awan et al. 2021 ; Dey et al. 2017 ), logistic regression (Prabhat and Khullar 2017 ; Qasem et al. 2015 ; Sutoyo et al. 2022 ), and n-gram (Ikram and Afzal 2019 ; Singh and Kumari 2016 ; Xiong et al. 2021 ) are used to enhance the accuracy of machine learning, as shown in Fig.  5 .

The keyword co-occurrence network for the C2 community

The trading volume and asset prices of financial commodities or financial instruments are influenced by a variety of factors in the online environment. Machine learning and sentiment analysis are powerful tools that can help gather vast amounts of useful information to predict financial risk effectively (Li et al. 2009 ). Research on the relationship between public sentiment and stock prices has always been the focus of many scholars (Smailović et al. 2014 ; Xing et al. 2018 ). They have used machine learning methods to explore the influence of sentiments on stock prices through sentiment analysis of news articles, and then predicted the trend changes in the stock market (Ahuja et al. 2015 ; Januário et al. 2022 ; Maqsood et al. 2020 ; Picasso et al. 2019 ).

Analysis on research methods and topics of the C3 community

The contents of the C3 community also mainly focus on the methods for sentiment analysis, like "natural language processing", "deep learning," "aspect-based sentiment analysis," and "task analysis" (Fig.  6 ). Sentiment analysis is a sub-field of natural language processing (Nicholls and Song 2010 ), and natural language processing techniques have been widely used in sentiment analysis. Using natural language processing technology can help to better parse text features, such as part-of-speech tagging, word sense disambiguation, keyword extraction, inter-word dependency recognition, semantic parsing, and dictionary construction (Abbasi et al. 2011 ; Syed et al. 2010 ; Trilla and Alías 2009 ). With the rise of deep learning technology, researchers began to introduce it to sentiment analysis. Neural network models like LSTM (Al-Dabet et al. 2021 ; Al-Smadi et al. 2019 ; Li and Qian 2016 ; Schuller et al. 2015 ; Tai et al. 2015 ), CNN (Cai and Xia 2015 ; Jia and Wang 2022 ; Ouyang et al. 2015 ), RNN (Hassan and Mahmood 2017 ; Tembhurne and Diwan 2021 ; You et al. 2016 ), and some combination of these, as well as other models (An and Moon 2022 ; Li et al. 2022 ; Liu et al. 2020a ; Salur and Aydin 2020 ; Zhao et al. 2021 ), have received significant attention.

The keyword co-occurrence network for the C3 community

Sentiment analysis granularity is subdivided into document level, sentence level, and aspect level. Document-level sentiment analysis takes the entire document as a unit, on the premise that the document has a clear attitude orientation; that is, the point of view needs to be clear (Shirsat et al. 2018; Wang and Wan 2011). Sentence-level sentiment analysis performs sentiment analysis on each sentence of the document individually (Arulmurugan et al. 2019; Liu et al. 2009; Nejat et al. 2017). Aspect-based analysis is a fundamental and significant task in sentiment analysis. The aim of aspect-level sentiment analysis is to separately summarize positive and negative views about different aspects of a product or entity, although overall sentiment toward a product or entity may tend to be positive or negative (Rao et al. 2021; Thet et al. 2010). Aspect-level sentiment analysis facilitates a more fine-grained analysis of sentiment than either document-level or sentence-level analysis (Liang et al. 2022; Wang et al. 2020c). The traditional levels of analysis, such as sentence-level analysis, can only calculate the overall sentiment polarity of paragraphs or sentences (Wang et al. 2016; Zhang et al. 2021). In recent years, aspect-level analysis has become more and more popular, and with the application of deep learning technology, it has become better at capturing the semantic relationship between aspect terms and words in a more quantifiable way (Huang et al. 2018). The process of sentiment analysis involves the coordination of multiple tasks, and the subtasks include feature extraction (Bouktif et al. 2020; Lin et al. 2020), context analysis (Yu et al. 2019; Zuo et al. 2020), and the application of some analytical models (Tan et al. 2020).

Analysis on research methods and topics of the C4 community

The C4 community, which is the largest of the six sub-communities, mainly contains keywords related to the research methods and topics of "opinion mining" and "user review" (Fig. 7). With the popularity of platforms like online review sites and personal blogs on the Internet, opinions and user reviews are readily available on the web. Opinion mining has always been a hot field of research (Khan et al. 2009; Poria et al. 2016). From Table 4, we can see that the link between C3 and C4 has 1306 lines. In opinion mining, researchers use many text mining methods to discover users’ opinions on goods or services, and then help improve the quality of the corresponding products or services (Da’u et al. 2020; Lo and Potdar 2009; Martinez-Camara et al. 2011). In addition, scholars have found that the consideration of user opinions can help improve the overall quality of recommender systems (Artemenko et al. 2020; Da’u et al. 2020; Garg 2021; Malandri et al. 2022). Therefore, "recommendation system" has a strong correlation with "opinion mining."

The keyword co-occurrence network for the C4 community

Evaluation metrics for quantifying the existing approaches are also a popular topic related to opinion mining. There is a keyword named "performance sentiment" in the C4 community. Precision, recall, accuracy and F1-score are the most commonly used evaluation metrics (Dangi et al. 2022 ; Jain et al. 2022 ; JayaLakshmi and Kishore 2022 ; Li et al. 2017 ; Wang et al. 2021 ; Yi and Niblack 2005 ). Some researchers have also used runtimes to calculate the model efficiency (Abo et al. 2021 ; Ferilli et al. 2015 ), p-value to statistically evaluate the relationship or difference between two samples of classification results (JayaLakshmi and Kishore 2022 ; Salur and Aydin 2020 ), paired sample t-tests to verify that the results are not obtained by chance (Nhlabano and Lutu 2018 ), and standard deviation to measure the stability of the model (Chang et al. 2020 ). There have also been researchers who have used G-mean (Wang et al. 2021 ), Pearson Correlation Coefficient (Corr) (Yang et al. 2022 ), Mean Absolute Error (MAE) (Yang et al. 2022 ), Normalized Information Transfer (NIT) and Entropy-Modified Accuracy (EMA) (Valverde-Albacete et al. 2013 ), Mean Squared Error (MSE) (Mao et al. 2022 ), Hamming loss (Liu and Chen 2015 ), Area Under the Curve (AUC) (Abo et al. 2021 ), sensitivity and specificity (Thakur and Deshpande 2019 ), etc.
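For reference, the four most commonly used metrics named above can be computed from the confusion counts of a binary classifier. The sketch below uses the standard definitions; the function name, the label strings, and the sample predictions are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive="positive"):
    """Precision, recall, F1-score, and accuracy for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    # Guard against empty denominators instead of raising ZeroDivisionError.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


y_true = ["positive", "positive", "negative", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "positive", "positive"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

For multi-class sentiment labels, these per-class values are typically averaged, and the choice of averaging scheme is left open here.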

Analysis on research methods and topics of the C5 & C6 communities

Both sub-communities C5 (Fig.  8 ) and C6 (Fig.  9 ) are small in size. The C5 community has 25 nodes and the C6 community has 41 nodes. The core content of the C5 community is "Arabic sentiment analysis." Before 2011, most resources and systems built in the field of sentiment analysis were tailored to English and other Indo-European languages. It is increasingly necessary to design sentiment analysis systems for other languages (Korayem et al. 2012 ), and researchers are increasingly interested in the study of tweets and texts in the Arabic language (Heikal et al. 2018 ; Khasawneh et al. 2013 ; Oueslati et al. 2020 ). They use technologies such as named entity recognition (Al-Laith and Shahbaz 2021 ), deep learning (Al-Ayyoub et al. 2018 ; Heikal et al. 2018 ), and corpus construction (Alayba et al. 2018 ) to enhance the accuracy of sentiment analysis.

The keyword co-occurrence network for the C5 community

The keyword co-occurrence network for the C6 community

The contents of the C6 community are not very concentrated. From the size of the circles, we can see that the keywords "domain adaptation" (Blitzer et al. 2007; Glorot et al. 2011), "domain sentiment," and "cross-domain" appear more frequently. Cross-domain sentiment classification is intended to address the lack of large amounts of labeled data (Du et al. 2020a). It has attracted much attention (Du et al. 2020b; Hao et al. 2019; Yang et al. 2020b). Advances in communication technology have provided valuable interactive resources for people in different regions, and the processing of multilingual user comments has gradually become a key challenge in natural language processing (Martinez-Garcia et al. 2021). Therefore, some keywords related to "lingual" have appeared. Other keywords, such as "transfer learning," "active learning," and "semi-supervised learning," are mainly related to sentiment analysis technologies.

Evolution of research methods and topics of sentiment analysis

Overall evolution analysis

Annual changes in keyword frequency in sentiment analysis research can reflect the evolution of research methods and topics in this field. Based on the keyword community network (Fig.  3 ), we counted the frequency of keywords in each sub-community for each year. The keyword community evolution diagram is shown in Fig.  10 . Since there were fewer papers published before 2006, we combined the occurrences of keywords from 2002 to 2006. We can see that the C1 community and the C3 community have shown a significant growth trend. The C2 community was in a state of growth until 2019, and the frequency of keywords decreased year by year after 2019. The frequency of C4 community keywords continued to increase until 2018 and declined after 2018. The number of keywords in the C5 community and in the C6 community both had a slow growth trend, but the trend was not obvious.
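The yearly counting per sub-community can be sketched as follows. The text states that Excel was used for this step, so the Python below is only an equivalent illustration; the function name, the keyword-to-community map, and the sample papers are hypothetical.

```python
from collections import Counter


def community_counts_by_year(papers, community_of, merge_until=2006):
    """Count keyword occurrences per (period, community).

    papers: iterable of (publication_year, keyword_list) pairs.
    community_of: dict mapping a keyword to its community label.
    Years up to merge_until are pooled into a single period, as in the text.
    """
    counts = Counter()
    for year, keywords in papers:
        period = f"2002-{merge_until}" if year <= merge_until else str(year)
        for kw in keywords:
            community = community_of.get(kw)
            if community is not None:   # ignore keywords outside the network
                counts[(period, community)] += 1
    return counts


community_of = {"twitter": "C1", "covid-19": "C1", "svm": "C2",
                "lstm": "C3", "opinion mining": "C4"}
papers = [
    (2004, ["opinion mining"]),
    (2006, ["opinion mining", "svm"]),
    (2020, ["covid-19", "twitter"]),
    (2021, ["covid-19", "lstm"]),
]
trend = community_counts_by_year(papers, community_of)
print(trend[("2002-2006", "C4")], trend[("2020", "C1")])
```

Plotting these counts per community against the period would give a diagram of the kind described above.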

Keyword community evolution diagram

Evolution analysis of sub-communities

We selected the high-frequency keywords under each category and plotted the change of word frequency in each year, as shown in Figs. 11 and 12. In the C1 community, "social medium," "Twitter," "social network," "covid-19," "Latent Dirichlet Allocation," "topic model," and "text analysis" all had significant increases in word frequency, and the growth trend in 2021 was obvious. "Covid-19" first appeared in 2020, and its word frequency increased rapidly in 2021. Social media platforms have always been the focus of researchers’ attention. Under the influence of COVID-19, more people express their emotions, stress, and thoughts through social media platforms. Sentiment analysis on data from social media platforms related to COVID-19 has become a hot topic (Boon-Itt and Skunkan 2020). We believe that due to the impact of COVID-19, the widespread use of social platforms in 2020–2021 has led to a surge in the number of C1-related keywords.

Fig. 11 C1, C2, C5, C6 communities: High-frequency keyword evolution diagram

Fig. 12 C3, C4 communities: High-frequency keyword evolution diagram

The C2 community focuses on "machine learning" methods, and the C3 community focuses on "deep learning" and "natural language processing" methods. The keywords in the two communities mainly relate to the techniques and methods of sentiment analysis. Before 2016 (Fig. 10), the frequency of keywords in the C2 community was higher than that in the C3 community; from 2016 onward, keywords in the C3 community gradually accounted for a larger proportion of the total. This reflects the fact that deep learning-related technologies and methods have become a research hotspot, while the attention given to SVM, Naive Bayes, supervised learning, and other machine learning techniques has declined. In addition to deep learning models such as Bi-LSTM, Long Short-term Memory, and recurrent neural network in the C3 community, the number of "aspect based" and "feature extraction" keywords has also been growing, which shows that researchers now pay more attention to aspect-level text granularity in sentiment analysis.

Among the keywords in the C4 community, the word frequency of "opinion mining" has decreased since 2018. This suggests that researchers have begun to pay less attention to sentiment analysis of opinions on product or service quality, while still maintaining a certain degree of attention to "user review" and "online review." In addition, the frequency of the keywords "sentiment lexicon" and "lexicon-based" has declined. This may be because, given the widespread application of deep learning technology in recent years, the lexicon-based method requires comparatively more time and higher labor costs (Kaity and Balakrishnan 2020). However, its accuracy still attracts attention due to the high involvement of experts, especially in non-English languages (Bakar et al. 2019; Kydros et al. 2021; Piryani et al. 2020; Tammina 2020; Xing et al. 2019; Yurtalan et al. 2019).

The high-frequency keywords in the C5 and C6 communities are "Arabic language," "Arabic sentiment analysis," and "transfer learning." Arabic has 30 variants, including the official Modern Standard Arabic (MSA) (ISO 639–3 2017). Arabic dialects are becoming increasingly popular as the language of informal communication on blogs, forums, and social media networks (Lulu and Elnagar 2018 ). This makes them challenging languages for natural language processing and sentiment analysis (Alali et al. 2019 ; Elshakankery and Ahmed 2019 ; Sayed et al. 2020 ). Transfer learning can solve the problem by leveraging knowledge obtained from a large-scale source domain to enhance the classification performance of target domains (Heaton 2018 ). In recent years, based on the success of deep learning technology, this method has gradually attracted attention.

Research hotspots and trends

Through the analysis in Sects. 4.1 and 4.2, we found that the research methods and topics of sentiment analysis are constantly changing. The keyword topic heat map is shown in Fig. 13. From this map, we can see that in the past two decades, research hotspots have included social media platforms (such as "social medium," "social network," and "Twitter"); sentiment analysis techniques and methods (such as "machine learning," "svm," "natural language processing," "deep learning," "aspect-based," "text mining," and "sentiment lexicon"); mining of user comments or opinions (e.g., "opinion mining," "user review," and "online review"); and sentiment analysis for non-English languages (e.g., "Arabic sentiment analysis" and "Arabic language").

Fig. 13 Keyword topic heat map

With the popularity of digitization, a large amount of user-generated content has appeared on the Internet, where users express their opinions and comments on different topics such as news, events, activities, products, and services through social media. This is especially so in the case of the Twitter mobile platform, launched in 2006, which has become the most popular social channel (Kumar and Jaiswal 2020). However, online text data is mostly unstructured. To analyze users’ sentiments accurately, research methods for sentiment analysis, such as natural language processing technology and automatic sentiment analysis models, have become the focus of researchers’ work. From Fig. 11, we can see that early technologies and methods were dominated by machine learning and that SVM and Naive Bayes have long been favored by researchers. This has also been confirmed in studies by Neha Raghuvanshi (Raghuvanshi and Patil 2016), Harpreet Kaur (Kaur et al. 2017), and Marouane Birjali (Birjali et al. 2021). With the improvement of neural network and artificial intelligence technology, deep learning technology has been widely used in sentiment analysis and has produced good results (Basiri et al. 2021; Ma et al. 2018; Prabha and Srikanth 2019; Yuan et al. 2020). However, deep learning technology still has room for improvement, and hybrid methods combining sentiment dictionary and semantic analysis are gradually becoming a trend (Prabha and Srikanth 2019; Yang et al. 2020a).

The granularity of sentiment analysis ranges from the early text level to the sentence level and finally to the aspect level, which is currently gaining strong attention. The granularity of sentiment analysis is gradually being refined, but the method is immature at present, and further research work in the future is needed (Agüero-Torales et al. 2021 ; Li et al. 2020 ; Trisna and Jie 2022 ).

Early sentiment analysis was mainly in the English language. In recent years, non-English languages such as Chinese (Lai et al. 2020 ; Peng et al. 2018 ), French (Apidianaki et al. 2016 ; Pecore and Villaneau 2019 ), Spanish (Chaturvedi et al. 2016 ; Plaza-del-Arco et al. 2020 ), Russian (Smetanin 2020 ), and Arabic (Alhumoud and Al Wazrah 2022 ; Ombabi et al. 2020 ) have attracted more and more attention. Furthermore, cross-domain sentiment analysis technology is in urgent need of research and discussion by researchers (Liu et al. 2019 ; Singh et al. 2021 ).

Conclusion and future work

Judging from the increasing number of papers related to sentiment analysis research every year, sentiment analysis has been on the rise. Although there are many surveys on sentiment analysis research, there has not been a survey dedicated to the evolution of research methods and topics of sentiment analysis. This paper has used keyword co-occurrence analysis and informetric tools to enrich the perspectives and methods of previous studies. Its aims have been to outline the evolution of research methods and tools, research hotspots, and trends, and to provide guidance for researchers.

By adopting keyword co-occurrence analysis and community detection methods, we analyzed the research methods and topics of sentiment analysis, as well as their connections and evolution trends, and summarized the research hotspots and trends in sentiment analysis. We found that research hotspots include social media platforms, sentiment analysis techniques and methods, mining of user comments or opinions, and sentiment analysis for non-English languages. Moreover, deep learning technology, hybrid methods combining sentiment dictionary and semantic analysis, fine-grained sentiment analysis methods, non-English language analysis methods, and cross-domain sentiment analysis techniques have gradually become research trends.
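
As a rough illustration of the keyword co-occurrence step, the sketch below counts how often two keywords are attached to the same paper. It is a simplified example under stated assumptions rather than the tooling used in this study: the function name `cooccurrence_counts` and the keyword lists are hypothetical, and the community detection step that would run on the resulting weighted edges is not shown.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count how often each unordered keyword pair appears in the same paper."""
    pair_counts = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique = sorted({kw.lower() for kw in keywords})
        for a, b in combinations(unique, 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Toy keyword lists, invented for illustration only.
papers = [
    ["sentiment analysis", "Twitter", "deep learning"],
    ["sentiment analysis", "deep learning"],
    ["Twitter", "sentiment analysis"],
]

edges = cooccurrence_counts(papers)
for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: {weight}")
```

Each counted pair can be read as a weighted edge of a keyword network, which is the kind of input a community detection algorithm partitions into sub-communities.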

Practical implications and technical directions of sentiment analysis

Sentiment analysis has a wide range of application targets, such as e-commerce platforms, social platforms, public opinion platforms, and customer service platforms. Years of development have led to many related tasks in sentiment analysis, such as sentiment analysis of different text granularity, sentiment recognition, opinion mining, dialogue sentiment analysis, irony recognition, false information detection, etc. Such analysis can help structure user reviews, support product improvement decisions, discover public opinion hotspots, identify public positions, investigate user satisfaction with products, and so on. As long as user-generated content is involved, sentiment analysis technology can be used to mine the emotions of human actors associated with the content. The improvement of sentiment analysis technology can help machines better understand the thoughts and opinions of users, make machines more intelligent, and make better decisions for policy leaders, businessmen, and service people. However, most of the current sentiment analysis methods are based on sentiment dictionaries, sentiment rules, statistics-based machine learning models, neural network-based deep learning models, and pre-training models, and have yet to achieve true language understanding in the sense of comprehension at the deep semantic level, though this does not prevent them from being useful in certain practical applications.

As an important task in natural language understanding, sentiment analysis has received extensive attention from academia and industry. Coarse-grained sentiment analysis is increasingly unable to meet people's decision-making needs, and for aspect-level sentiment analysis and complex tasks, pure machine learning is still unable to flexibly achieve true language understanding. Once the scene or domain changes, problems such as the domain incompatibility of the sentiment dictionary and the low transfer effect of the model involved keep appearing. At present, the accuracy of sentiment analysis provided by machines is far less than that of humans. To achieve human-like performance for machines, we believe that it is necessary to incorporate human commonsense knowledge and domain knowledge, as well as grounded definitions of concepts, in order for machines to understand natural language at a deeper level. These, combined with rules for affective reasoning to supplement interpretable information, will be effective in improving the performance of sentiment analysis. Future research in this direction can be strengthened to achieve true language understanding in machines.

Limitations and future work

There are some research limitations in this paper. First, we only studied papers written in English and searched from the Web of Science platform. We believe there are papers in other languages or other databases (e.g., Scopus, PubMed, Sci-hub, etc.) that also involve sentiment analysis but that were not included in our study. In addition, the keywords we chose to search in the Web of Science were mainly "sentiment analysis," "sentiment mining," and "sentiment classification." There may be papers related to our research topic that do not have these keywords. To track developments in sentiment analysis research, future studies could replicate this work by employing more precise keywords and using different literature databases.

Second, we selected the main high-frequency keywords for analysis, and some important low-frequency keywords may have been ignored. In future work, we can analyze the changes in each keyword in detail from the perspective of time and obtain more comprehensive analysis results.

Third, the results show that the themes of sentiment analysis cover many fields, such as computer science, linguistics, and electrical engineering, which indicates the trend of interdisciplinary research. Therefore, future work should apply co-citation and diversity measures to explore the interdisciplinary nature of sentiment analysis research.

Acknowledgements

The authors would like to thank the China Scholarship Council (CSC No. 202106850069) for its support for the visiting study.

This work has not received any funding.

Data availability

Declarations.

The authors declare that they have no conflict of interest or competing interest in this article.

This article does not contain any studies with human participants or animals performed by any of the authors.

1 https://github.com/MaartenGr/KeyBERT .

2 https://homepage.univie.ac.at/juan.gorraiz/bibexcel/ .

3 http://mrvar.fdv.uni-lj.si/pajek/ .

4 https://www.vosviewer.com/ .

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Jingfeng Cui, Email: 2019214005@njau.edu.cn.

Zhaoxia Wang, Email: zxwang@smu.edu.sg.

Seng-Beng Ho, Email: hosb@ihpc.a-star.edu.sg.

Erik Cambria, Email: cambria@ntu.edu.sg.

  • Abbasi A, France S, Zhang Z, Chen H. Selecting attributes for sentiment classification using feature relation networks. IEEE Trans Knowl Data Eng. 2011;23(3):447–462. doi: 10.1109/TKDE.2010.110.
  • Abdullah NSD, Zolkepli IA (2017) Sentiment analysis of online crowd input towards Brand Provocation in Facebook, Twitter, and Instagram. In: Proceedings of the international conference on big data and internet of thing, association for computing machinery, pp 67–74. 10.1145/3175684.3175689
  • Abo MEM, Idris N, Mahmud R, Qazi A, Hashem IAT, Maitama JZ, et al. A multi-criteria approach for Arabic dialect sentiment analysis for online reviews: exploiting optimal machine learning algorithm selection. Sustainability. 2021;13(18):10018. doi: 10.3390/su131810018.
  • Abrahams AS, Jiao J, Wang GA, Fan W. Vehicle defect discovery from social media. Decis Support Syst. 2012;54(1):87–97. doi: 10.1016/j.dss.2012.04.005.
  • Acheampong FA, Nunoo-Mensah H, Chen W. Transformer models for text-based emotion detection: a review of BERT-based approaches. Artif Intell Rev. 2021;54(8):5789–5829. doi: 10.1007/s10462-021-09958-2.
  • Adak A, Pradhan B, Shukla N. Sentiment analysis of customer reviews of food delivery services using deep learning and explainable artificial intelligence: systematic review. Foods. 2022;11(10):1500. doi: 10.3390/foods11101500.
  • Agüero-Torales MM, Salas JIA, López-Herrera AG. Deep learning and multilingual sentiment analysis on social media data: an overview. Appl Soft Comput. 2021;107:107373. doi: 10.1016/j.asoc.2021.107373.
  • Ahuja R, Rastogi H, Choudhuri A, Garg B (2015) Stock market forecast using sentiment analysis. In: 2015 2nd International conference on computing for sustainable global development, INDIACom 2015, Bharati Vidyapeeth, New Delhi, pp 1008–1010. 10.48550/arXiv.2204.05783
  • Ain QT, Ali M, Riaz A, Noureen A, Kamranz M, Hayat B, et al. Sentiment analysis using deep learning techniques: a review. Int J Adv Comput Sci Appl. 2017;8(6):424–433. doi: 10.14569/ijacsa.2017.080657.
  • Al-Ayyoub M, Nuseir A, Alsmearat K, Jararweh Y, Gupta B. Deep learning for Arabic NLP: a survey. J Comput Sci. 2018;26:522–531. doi: 10.1016/j.jocs.2017.11.011.
  • Al-Ayyoub M, Khamaiseh AA, Jararweh Y, Al-Kabi MN. A comprehensive survey of Arabic sentiment analysis. Inf Process Manag. 2019;56(2):320–342. doi: 10.1016/j.ipm.2018.07.006.
  • Al-Dabet S, Tedmori S, AL-Smadi M. Enhancing Arabic aspect-based sentiment analysis using deep learning models. Comput Speech Lang. 2021;69:1224. doi: 10.1016/j.csl.2021.101224.
  • Al-Laith A, Shahbaz M. Tracking sentiment towards news entities from Arabic news on social media. Futur Gener Comput Syst. 2021;118:467–484. doi: 10.1016/j.future.2021.01.015.
  • Al-Smadi M, Talafha B, Al-Ayyoub M, Jararweh Y. Using long short-term memory deep neural networks for aspect-based sentiment analysis of Arabic reviews. Int J Mach Learn Cybern. 2019;10(8):2163–2175. doi: 10.1007/s13042-018-0799-4.
  • Alali M, Sharef NM, Murad MAA, Hamdan H, Husin NA. Narrow convolutional neural network for Arabic dialects polarity classification. IEEE Access. 2019;7:96272–96283. doi: 10.1109/ACCESS.2019.2929208.
  • Alamoodi AH, Zaidan BB, Al-Masawa M, Taresh SM, Noman S, Ahmaro IYY, et al. Multi-perspectives systematic review on the applications of sentiment analysis for vaccine hesitancy. Comput Biol Med. 2021;139:4957. doi: 10.1016/j.compbiomed.2021.104957.
  • Alamoodi AH, Zaidan BB, Zaidan AA, Albahri OS, Mohammed KI, Malik RQ, et al. Sentiment analysis and its applications in fighting COVID-19 and infectious diseases: a systematic review. Expert Syst Appl. 2021;167:114155. doi: 10.1016/j.eswa.2020.114155.
  • Alayba AM, Palade V, England M, Iqbal R (2018) Improving sentiment analysis in arabic using word representation. In: 2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR), IEEE, pp 13–18. 10.1109/ASAR.2018.8480191
  • Alhumoud SO, Al Wazrah AA. Arabic sentiment analysis using recurrent neural networks: a review. Artif Intell Rev. 2022;55(1):707–748. doi: 10.1007/s10462-021-09989-9.
  • Alonso MA, Vilares D, Gómez-Rodríguez C, Vilares J. Sentiment analysis for fake news detection. Electronics. 2021;10(11):1348. doi: 10.3390/electronics10111348.
  • Altrabsheh N, Gaber MM, Cocea M (2013) SA-E: sentiment analysis for education. In: The 5th KES International Conference on Intelligent Decision Technologies (KES-IDT), Sesimbra, Portugal, pp 353–362. 10.3233/978-1-61499-264-6-353
  • Al Amrani Y, Lazaar M, El Kadirp KE. Random forest and support vector machine based hybrid approach to sentiment analysis. Procedia Comput Sci. 2018;127:511–520. doi: 10.1016/j.procs.2018.01.150.
  • An H, Moon N. Design of recommendation system for tourist spot using sentiment analysis based on CNN-LSTM. J Ambient Intell Hum Comput. 2022;13:1653–1663. doi: 10.1007/s12652-019-01521-w.
  • Angel SO, Negron APP, Espinoza-Valdez A. Systematic literature review of sentiment analysis in the spanish language. Data Technol Appl. 2021;55(4):461–479. doi: 10.1108/DTA-09-2020-0200.
  • Apidianaki M, Tannier X, Richart C (2016) Datasets for aspect-based sentiment analysis in French. In: Proceedings of the tenth international conference on language resources and evaluation (LREC’16), Portorož, Slovenia: European Language Resources Association (ELRA), pp 1122–1126. https://aclanthology.org/L16-1179
  • Arafin Mahtab S, Islam N, Mahfuzur Rahaman M (2018) Sentiment analysis on Bangladesh cricket with support vector machine. In: 2018 International conference on Bangla Speech and language processing (ICBSLP), IEEE, pp 1–4. 10.1109/ICBSLP.2018.8554585
  • Artemenko O, Pasichnyk V, Kunanets N, Shunevych K (2020) Using sentiment text analysis of user reviews in social media for E-Tourism mobile recommender systems. In: COLINS, CEUR-WS, Aachen, pp 259–271. http://ceur-ws.org/Vol-2604/paper20.pdf
  • Arulmurugan R, Sabarmathi KR, Anandakumar H. Classification of sentence level sentiment analysis using cloud machine learning techniques. Clust Comput. 2019;22(1):1199–1209. doi: 10.1007/s10586-017-1200-1.
  • Asghar MZ, Khan A, Ahmad S, Kundi FM. A review of feature selection techniques in sentiment analysis. J Basic Appl Sci Res. 2014;4(3):181–186. doi: 10.3233/IDA-173763.
  • Awan MJ, Yasin A, Nobanee H, Ali AA, Shahzad Z, Nabeel M, et al. Fake news data exploration and analytics. Electronics. 2021;10(19):2326. doi: 10.3390/electronics10192326.
  • Bai H, Yu G. A Weibo-based approach to disaster informatics: incidents monitor in post-disaster situation via weibo text negative sentiment analysis. Nat Hazards. 2016;83(2):1177–1196. doi: 10.1007/s11069-016-2370-5.
  • Bakar MFRA, Idris N, Shuib L (2019) An enhancement of Malay social media text normalization for Lexicon-based sentiment analysis. In: 2019 International conference on Asian language processing (IALP), IEEE, pp 211–215. 10.1109/IALP48816.2019.9037700
  • Bar-Ilan J. Informetrics at the beginning of the 21st century—a review. J Informet. 2008;2(1):1–52. doi: 10.1016/j.joi.2007.11.001.
  • Basiri ME, Nemati S, Abdar M, Cambria E, Acharya UR. ABCDM: an attention-based bidirectional CNN-RNN deep model for sentiment analysis. Futur Gener Comput Syst. 2021;115:279–294. doi: 10.1016/j.future.2020.08.005.
  • Batagelj V, Andrej M (2022) Pajek [Software]. http://mrvar.fdv.uni-lj.si/pajek/
  • Batagelj V, Mrvar A (1998) Pajek-program for large network analysis eds. M. Jünger and P Mutzel. Connections 21(2): 47–57. http://vlado.fmf.uni-lj.si/pub/networks/doc/pajek.pdf
  • Bengtsson M. How to plan and perform a qualitative study using content analysis. NursingPlus Open. 2016;2:8–14. doi: 10.1016/j.npls.2016.01.001.
  • Berkovic D, Ackerman IN, Briggs AM, Ayton D. Tweets by people with arthritis during the COVID-19 pandemic: content and sentiment analysis. J Med Internet Res. 2020;22(12):e24550. doi: 10.2196/24550.
  • Binkheder S, Aldekhyyel RN, Almogbel A, Al-Twairesh N, Alhumaid N, Aldekhyyel SN, et al. Public perceptions around Mhealth applications during Covid-19 pandemic: a network and sentiment analysis of tweets in Saudi Arabia. Int J Environ Res Public Health. 2021;18(24):1–22. doi: 10.3390/ijerph182413388.
  • Birjali M, Kasri M, Beni-Hssane A. A comprehensive survey on sentiment analysis: approaches, challenges and trends. Knowl Based Syst. 2021;226:107134. doi: 10.1016/j.knosys.2021.107134.
  • Blitzer J, Dredze M, Pereira F (2007) Biographies, bollywood, boom-boxes and blenders: domain adaptation for sentiment classification. In: 45th Annual Meeting of the association of computational linguistics, association for computational linguistics, pp 440–447. 10.1287/ijoc.2013.0585
  • Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J Stat Mech Theory Exp. 2008;2008(10):P10008. doi: 10.1088/1742-5468/2008/10/P10008.
  • Boon-Itt S, Skunkan Y. Public perception of the COVID-19 pandemic on Twitter: sentiment analysis and topic modeling study. JMIR Public Health Surv. 2020;6(4):1978. doi: 10.2196/21978.
  • Boudad N, Faizi R, Thami ROH, Chiheb R. Sentiment analysis in Arabic: a review of the literature. Ain Shams Eng J. 2018;9(4):2479–2490. doi: 10.1016/j.asej.2017.04.007.
  • Bouktif S, Fiaz A, Awad M. Augmented textual features-based stock market prediction. IEEE Access. 2020;8:40269–40282. doi: 10.1109/ACCESS.2020.2976725.
  • Brito KDS, Filho RLCS, Adeodato PJL. A systematic review of predicting elections based on social media data: research challenges and future directions. IEEE Trans Comput Soc Syst. 2021;8(4):819–843. doi: 10.1109/TCSS.2021.3063660.
  • Cai G, Xia B (2015) Convolutional neural networks for multimedia sentiment analysis. In: Natural Language Processing and Chinese Computing, Springer, Cham, p 159–167. 10.1007/978-3-319-25207-0_14
  • Callon M, Courtial J-P, Turner WA, Bauin S. From translations to problematic networks: an introduction to co-word analysis. Soc Sci Inf. 1983;22(2):191–235. doi: 10.1177/053901883022002003.
  • Cambria E, Liu Q, Decherchi S, Xing F, Kwok K (2022a) SenticNet 7: a commonsense-based neurosymbolic AI Framework for Explainable Sentiment Analysis. In: LREC, Marseille: European Language Resources Association (ELRA), pp 3829–3839. https://sentic.net/senticnet-7.pdf
  • Cambria E, Dragoni M, Kessler B, Donadello I. Ontosenticnet 2: enhancing reasoning within sentiment analysis. IEEE Intell Syst. 2022;37(2):103–110. doi: 10.1109/MIS.2021.3093659.
  • Cambria E, Kumar A, Al-Ayyoub M, Howard N. Guest editorial: explainable artificial intelligence for sentiment analysis. Knowl Based Syst. 2022;238(3):107920. doi: 10.1016/j.knosys.2021.107920.
  • Cambria E, Xing F, Thelwall M, Welsch R. Sentiment analysis as a multidisciplinary research area. IEEE Trans Artif Intell. 2022;3(2):1–4.
  • Chan JYL, Bea KT, Leow SMH, Phoong SW, Cheng WK. State of the art: a review of sentiment analysis based on sequential transfer learning. Artif Intell Rev. 2022. doi: 10.1007/s10462-022-10183-8.
  • Chang J-R, Liang H-Y, Chen L-S, Chang C-W. Novel feature selection approaches for improving the performance of sentiment classification. J Ambient Intell Hum Comput. 2020. doi: 10.1007/s12652-020-02468-z.
  • Chaturvedi I, Cambria E, Vilares D (2016) Lyapunov filtering of objectivity for Spanish sentiment model. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN), IEEE, pp 4474–4481. 10.1109/IJCNN.2016.7727785
  • Chen Z, Teng S, Zhang W, Tang H, Zhang Z, He J, et al (2019) LSTM sentiment polarity analysis based on LDA clustering. In: Communications in Computer and Information Science, Springer, Singapore, pp 342–355. 10.1007/978-981-13-3044-5_25
  • Cheng WK, Bea KT, Leow SMH, Chan JY-L, Hong Z-W, Chen Y-L. A review of sentiment, semantic and event-extraction-based approaches in stock forecasting. Mathematics. 2022;10(14):2437. doi: 10.3390/math10142437.
  • Da’u A, Salim N, Rabiu I, Osman A. Recommendation System Exploiting Aspect-Based Opinion Mining with Deep Learning Method. Inf Sci. 2020;512:1279–1292. doi: 10.1016/j.ins.2019.10.038.
  • Dangi D, Bhagat A, Dixit DK. Sentiment analysis of social media data based on chaotic coyote optimization algorithm based time weight-adaboost support vector machine approach. Concurr Comput. 2022;34(3):6581. doi: 10.1002/cpe.6581.
  • Deng S, Xia S, Hu J, Li H, Liu Y. Exploring the topic structure and evolution of associations in information behavior research through co-word analysis. J Librariansh Inf Sci. 2021;53(2):280–297. doi: 10.1177/0961000620938120.
  • Dereli T, Eligüzel N, Çetinkaya C. Content analyses of the international federation of Red Cross and Red Crescent Societies (Ifrc) based on machine learning techniques through Twitter. Nat Hazards. 2021;106(3):2025–2045. doi: 10.1007/s11069-021-04527-w.
  • Dey A, Jenamani M, Thakkar JJ (2017) Lexical Tf-Idf: An n-Gram Feature Space for Cross-Domain Classification of Sentiment Reviews. In: International Conference on Pattern Recognition and Machine Intelligence, Springer, Cham, pp 380–386. 10.1007/978-3-319-69900-4_48
  • Ding Y, Chowdhury GG, Foo S. Bibliometric cartography of information retrieval research by using co-word analysis. Inf Process Manag. 2001;37(6):817–842. doi: 10.1016/S0306-4573(00)00051-0.
  • Du C, Sun H, Wang J, Qi Q, Liao J (2020a) Adversarial and domain-aware BERT for cross-domain sentiment analysis. In: Proceedings of the 58th Annual meeting of the association for computational linguistics, association for computational linguistics, pp 4019–4028. 10.18653/v1/2020.acl-main.370
  • Du Y, He M, Wang L, Zhang H. Wasserstein based transfer network for cross-domain sentiment classification. Knowl Based Syst. 2020;204:6162. doi: 10.1016/j.knosys.2020.106162.
  • Elo S, Kyngäs H. The qualitative content analysis process. J Adv Nurs. 2008;62(1):107–115. doi: 10.1111/j.1365-2648.2007.04569.x.
  • Elshakankery K, Ahmed MF. HILATSA: a hybrid incremental learning approach for arabic tweets sentiment analysis. Egypt Inform J. 2019;20(3):163–171. doi: 10.1016/j.eij.2019.03.002.
  • Feldman R. Techniques and applications for sentiment analysis. Commun ACM. 2013;56(4):82–89. doi: 10.1145/2436256.2436274.
  • Ferilli S, De Carolis B, Esposito F, Redavid D (2015) Sentiment analysis as a text categorization task: a study on feature and algorithm selection for Italian language. In: 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA), IEEE, pp 1–10. 10.1109/DSAA.2015.7344882
  • Fink C, Bos N, Perrone A, Liu E, Kopecky J (2013) Twitter, public opinion, and the 2011 Nigerian Presidential Election. In: 2013 International conference on social computing, IEEE, pp 311–320. 10.1109/SocialCom.2013.50
  • Fitri VA, Andreswari R, Hasibuan MA. Sentiment analysis of social media Twitter with case of anti-LGBT campaign in Indonesia using Naïve Bayes, Decision Tree, and Random Forest Algorithm. Procedia Comput Sci. 2019;161:765–772. doi: 10.1016/j.procs.2019.11.181.
  • Garg S (2021) Drug recommendation system based on sentiment analysis of drug reviews using machine learning. In: 2021 11th International conference on cloud computing, data science & engineering (confluence), IEEE, pp 175–181. 10.1109/Confluence51648.2021.9377188
  • Glorot X, Bordes A, Bengio Y (2011) Domain adaptation for large-scale sentiment classification: a deep learning approach. In: 28th International Conference on Machine Learning, International Machine Learning Society (IMLS), pp 513–520. https://dl.acm.org/doi/10.5555/3104482.3104547
  • Grootendorst M, Warmerdam VD (2021) MaartenGr/KeyBERT (Version 0.5) [Computer program]. 10.5281/ZENODO.5534341.
  • Groshek J, Al-Rawi A. Public sentiment and critical framing in social media content during the 2012 US Presidential Campaign. Soc Sci Comput Rev. 2013;31(5):563–576. doi: 10.1177/0894439313490401.
  • Habimana O, Li Y, Li R, Gu X, Yu G. Sentiment analysis using deep learning approaches: an overview. Sci China Inf Sci. 2020;63(1):1–36. doi: 10.1007/s11432-018-9941-6.
  • Hao Y, Mu T, Hong R, Wang M, Liu X, Goulermas JY. Cross-domain sentiment encoding through stochastic word embedding. IEEE Trans Knowl Data Eng. 2019;32(10):1909–1922. doi: 10.1109/TKDE.2019.2913379.
  • Hassan A, Mahmood A (2017) Efficient deep learning model for text classification based on recurrent and convolutional layers. In: 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), IEEE, pp 1108–1113. 10.1109/ICMLA.2017.00009
  • Heaton J (2018). Ian Goodfellow, Yoshua Bengio, and Aaron Courville: Deep Learning. Genetic Programming and Evolvable Machines 19: 305–307. 10.1007/s10710-017-9314-z
  • Heikal M, Torki M, El-Makky N. Sentiment analysis of Arabic tweets using deep learning. Procedia Comput Sci. 2018;142:114–122. doi: 10.1016/j.procs.2018.10.466.
  • Huang B, Ou Y, Carley KM (2018) Aspect level sentiment classification with attention-over-attention neural networks. In: International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, Springer, Cham, pp 197–206. 10.1007/978-3-319-93372-6_22
  • Hussain N, Mirza HT, Rasool G, Hussain I, Kaleem M. Spam review detection techniques: a systematic literature review. Appl Sci. 2019; 9 (5):987. doi: 10.3390/app9050987. [ CrossRef ] [ Google Scholar ]
  • Hussein DMEDM. A survey on sentiment analysis challenges. J King Saud Univ. 2018; 30 (4):330–338. doi: 10.1016/j.jksues.2016.04.002. [ CrossRef ] [ Google Scholar ]
  • Ikram MT, Afzal MT. Aspect based citation sentiment analysis using linguistic patterns for better comprehension of scientific knowledge. Scientometrics. 2019; 119 (1):73–95. doi: 10.1007/s11192-019-03028-9. [ CrossRef ] [ Google Scholar ]
  • Injadat MN, Salo F, Nassif AB. Data mining techniques in social media: a survey. Neurocomputing. 2016; 214 :654–670. doi: 10.1016/j.neucom.2016.06.045. [ CrossRef ] [ Google Scholar ]
  • Isah H, Trundle P, Neagu D (2014) Social media analysis for product safety using text mining and sentiment analysis. In: 2014 14th UK Workshop on Computational Intelligence (UKCI), IEEE, pp 1–7. 10.1109/UKCI.2014.6930158
  • ISO 639-3 (2017) Registration Authority. https://iso639-3.sil.org/
  • Jain DK, Boyapati P, Venkatesh J, Prakash M. An intelligent cognitive-inspired computing with big data analytics framework for sentiment analysis and classification. Inf Process Manag. 2022; 59 (1):2758. doi: 10.1016/j.ipm.2021.102758. [ CrossRef ] [ Google Scholar ]
  • Januário BA, de Carosia AEO, da Silva AEA, Coelho GP. Sentiment analysis applied to news from the Brazilian stock market. IEEE Latin Am Trans. 2022; 20 (3):512–518. doi: 10.1109/TLA.2022.9667151. [ CrossRef ] [ Google Scholar ]
  • JayaLakshmi ANM, Kishore KVK. Performance evaluation of DNN with other machine learning techniques in a cluster using apache spark and MLlib. J King Saud Univ. 2022; 34 (1):1311–1319. doi: 10.1016/j.jksuci.2018.09.022. [ CrossRef ] [ Google Scholar ]
  • Jia X, Wang L. Attention enhanced capsule network for text classification by encoding syntactic dependency trees with graph convolutional neural network. PeerJ Comput Sci. 2022; 7 :e831. doi: 10.7717/PEERJ-CS.831. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Jiang D, Luo X, Xuan J, Xu Z. Sentiment computing for the news event based on the social media big data. IEEE Access. 2017; 5 :2373–2382. doi: 10.1109/ACCESS.2016.2607218. [ CrossRef ] [ Google Scholar ]
  • Kaity M, Balakrishnan V. Sentiment Lexicons and non-English languages: a survey. Knowl Inf Syst. 2020; 62 (12):4445–4480. doi: 10.1007/s10115-020-01497-6. [ CrossRef ] [ Google Scholar ]
  • Kastrati Z, Dalipi F, Imran AS, Nuci KP, Wani MA. Sentiment analysis of students’ feedback with Nlp and deep learning: a systematic mapping study. Appl Sci. 2021; 11 (9):3986. doi: 10.3390/app11093986. [ CrossRef ] [ Google Scholar ]
  • Kaur H, Mangat V, Nidhi (2017) A survey of sentiment analysis techniques. In: 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud)(I-SMAC), IEEE, pp 921–925. 10.1109/I-SMAC.2017.8058315
  • Khan K, Baharudin BB, Khan A (2009) Mining opinion from text documents: a survey. In: 2009 3rd IEEE International conference on digital ecosystems and technologies, IEEE, pp 217–222. 10.4304/jetwi.5.4.343-353
  • Khasawneh RT, Wahsheh HA, Al-Kabi MN, Alsmadi IM (2013) Sentiment analysis of Arabic social media content: a comparative study. In: 8th International Conference for Internet Technology and Secured Transactions (ICITST-2013), IEEE, pp 101–106. 10.1109/ICITST.2013.6750171
  • Khattak A, Asghar MZ, Saeed A, Hameed IA, Asif Hassan S, Ahmad S. A survey on sentiment analysis in Urdu: a resource-poor language. Egypt Inform J. 2021; 22 (1):53–74. doi: 10.1016/j.eij.2020.04.003. [ CrossRef ] [ Google Scholar ]
  • Khatua A, Khatua A, Cambria E. Predicting political sentiments of voters from Twitter in multi-party contexts. Appl Soft Comput J. 2020; 97 :106743. doi: 10.1016/j.asoc.2020.106743. [ CrossRef ] [ Google Scholar ]
  • Kitchenham B. Procedures for performing systematic reviews, version 1.0. Empir Softw Eng. 2004; 33 (2004):1–26. [ Google Scholar ]
  • Kitchenham B, Charters SM. Guidelines for performing systematic literature reviews in software engineering. Tech Rep. 2007; 5 :1–57. [ Google Scholar ]
  • Korayem M, Crandall D, Abdul-Mageed M (2012) Subjectivity and sentiment analysis of Arabic: a survey. In: International conference on advanced machine learning technologies and applications, Springer, Berlin, Heidelberg, p 128–139. 10.1007/978-3-642-35326-0_14
  • Koto F, Adriani M (2015) A comparative study on Twitter sentiment analysis: Which Features Are Good? In: International conference on applications of natural language to information systems, Springer, Cham, p 453–457. 10.1007/978-3-319-19581-0_46
  • Krippendorff K (2018) Content analysis: an introduction to its methodology. Sage publications.
  • Kumar A, Garg G. Systematic literature review on context-based sentiment analysis in social multimedia. Multimed Tools Appl. 2020; 79 (21):15349–15380. doi: 10.1007/s11042-019-7346-5. [ CrossRef ] [ Google Scholar ]
  • Kumar A, Jaiswal A. Systematic literature review of sentiment analysis on twitter using soft computing techniques. Concurr Comput. 2020; 32 (1):e5107. doi: 10.1002/cpe.5107. [ CrossRef ] [ Google Scholar ]
  • Kumar A, Sebastian TM. Sentiment analysis: a perspective on its past, present and future. Int J Intell Syst Appl. 2012; 4 (10):1–14. doi: 10.5815/ijisa.2012.10.01. [ CrossRef ] [ Google Scholar ]
  • Kumar A, Narapareddy VT, Gupta P, Srikanth VA, Neti LB, Malapati A (2021) Adversarial and auxiliary features-aware BERT for sarcasm detection. In: 8th ACM IKDD CODS and 26th COMAD, association for computing machinery, p 163–170. 10.1145/3430984.3431024
  • Kydros D, Argyropoulou M, Vrana V. A content and sentiment analysis of Greek tweets during the pandemic. Sustainability (switzerland) 2021; 13 (11):6150. doi: 10.3390/su13116150. [ CrossRef ] [ Google Scholar ]
  • Lai Y, Zhang L, Han D, Zhou R, Wang G. Fine-grained emotion classification of chinese microblogs based on graph convolution networks. World Wide Web. 2020; 23 (5):2771–2787. doi: 10.1007/s11280-020-00803-0. [ CrossRef ] [ Google Scholar ]
  • Leiden University's Centre for Science and Technology Studies (CWTS) (2021) VOSviewer (Version 1.6.17)[Software]. https://www.vosviewer.com/
  • Leydesdorff L, Park HW, Wagner C. International co-authorship relations in the social science citation index: is internationalization leading the network? J Assoc Inf Sci Technol. 2014; 65 (10):2111–2126. doi: 10.48550/arXiv.1305.4242. [ CrossRef ] [ Google Scholar ]
  • Li D, Qian J (2016) Text sentiment analysis based on long short-term memory. In: 2016 First IEEE International Conference on Computer Communication and the Internet (ICCCI), IEEE, pp 471–475. 10.1109/CCI.2016.7778967
  • Li F, Huang M, Zhu X (2010) Sentiment analysis with global topics and local dependency. In: Proceedings of the AAAI Conference on Artificial Intelligence, Atlanta, Georgia, USA: AAAI Press, Palo Alto, California USA, pp 1371–1376. 10.1609/aaai.v24i1.7523
  • Li J, Sun M (2007) Experimental study on sentiment classification of chinese review using machine learning techniques. In: 2007 International Conference on Natural Language Processing and Knowledge Engineering, IEEE, pp 393–400. 10.1109/NLPKE.2007.4368061
  • Li N, Liang X, Li X, Wang C, Wu DD. Network environment and financial risk using machine learning and sentiment analysis. Hum Ecol Risk Assess. 2009; 15 (2):227–252. doi: 10.1080/10807030902761056. [ CrossRef ] [ Google Scholar ]
  • Li W, Zhu L, Shi Y, Guo K, Cambria E. User reviews: sentiment analysis using Lexicon integrated two-channel CNN–LSTM family models. Appl Soft Comput J. 2020; 94 :6435. doi: 10.1016/j.asoc.2020.106435. [ CrossRef ] [ Google Scholar ]
  • Li W, Shao W, Ji S, Cambria E. BiERU: bidirectional emotional recurrent unit for conversational sentiment analysis. Neurocomputing. 2022; 467 :73–82. doi: 10.1016/j.neucom.2021.09.057. [ CrossRef ] [ Google Scholar ]
  • Li Y, Pan Q, Yang T, Wang S, Tang J, Cambria E. Learning word representations for sentiment analysis. Cogn Comput. 2017; 9 (6):843–851. doi: 10.1007/s12559-017-9492-2. [ CrossRef ] [ Google Scholar ]
  • Liang B, Su H, Gui L, Cambria E, Xu R. Aspect-based sentiment analysis via affective knowledge enhanced graph convolutional networks. Knowl Based Syst. 2022; 235 :107643. doi: 10.1016/j.knosys.2021.107643. [ CrossRef ] [ Google Scholar ]
  • Ligthart A, Catal C, Tekinerdogan B. Systematic reviews in sentiment analysis: a tertiary study. Artif Intell Rev. 2021; 54 (7):4997–5053. doi: 10.1007/s10462-021-09973-3. [ CrossRef ] [ Google Scholar ]
  • Lin B, Cassee N, Serebrenik A, Bavota G, Novielli N, Lanza M. Opinion mining for software development: a systematic literature review. ACM Trans Softw Eng Methodol. 2022; 31 (3):1–41. doi: 10.1145/3490388. [ CrossRef ] [ Google Scholar ]
  • Lin Y, Li J, Yang L, Xu K, Lin H. Sentiment analysis with comparison enhanced deep neural network. IEEE Access. 2020; 8 :78378–78384. doi: 10.1109/ACCESS.2020.2989424. [ CrossRef ] [ Google Scholar ]
  • Liu F, Zheng J, Zheng L, Chen C. Combining attention-based bidirectional gated recurrent neural network and two-dimensional convolutional neural network for document-level sentiment classification. Neurocomputing. 2020; 371 :39–50. doi: 10.1016/j.neucom.2019.09.012. [ CrossRef ] [ Google Scholar ]
  • Liu L, Nie X, Wang H (2012) Toward a fuzzy domain sentiment ontology tree for sentiment analysis. In: 2012 5th International congress on image and signal processing, IEEE, pp 1620–1624. 10.1109/CISP.2012.6469930
  • Liu R, Shi Y, Ji C, Jia M. A survey of sentiment analysis based on transfer learning. IEEE Access. 2019; 7 :85401–85412. doi: 10.1109/ACCESS.2019.2925059. [ CrossRef ] [ Google Scholar ]
  • Liu S, Lee K, Lee I. Document-level multi-topic sentiment classification of email data with BiLSTM and data augmentation. Knowl Based Syst. 2020; 197 :105918. doi: 10.1016/j.knosys.2020.105918. [ CrossRef ] [ Google Scholar ]
  • Liu SM, Chen JH. A multi-label classification based approach for sentiment classification. Expert Syst Appl. 2015; 42 (3):1083–1093. doi: 10.1016/j.eswa.2014.08.036. [ CrossRef ] [ Google Scholar ]
  • Liu X, Zeng D, Li J, Wang F-Y, Zuo W. Sentiment analysis of Chinese documents: from sentence to document level. J Am Soc Inform Sci Technol. 2009; 60 (12):2474–2487. doi: 10.1002/asi.21206. [ CrossRef ] [ Google Scholar ]
  • Lo YW, Potdar V (2009) A review of opinion mining and sentiment classification framework in social networks. In: 2009 3rd IEEE International conference on digital ecosystems and technologies, IEEE, pp 396–401. 10.1109/DEST.2009.5276705
  • Lulu L, Elnagar A. Automatic arabic dialect classification using deep learning models. Procedia Comput Sci. 2018; 142 :262–269. doi: 10.1016/j.procs.2018.10.489. [ CrossRef ] [ Google Scholar ]
  • Ma Y, Peng H, Cambria E (2018) Targeted aspect-based sentiment analysis via embedding commonsense knowledge into an attentive LSTM. In: 32nd AAAI conference on artificial intelligence, New Orleans, Louisiana, USA: AAAI Press, Palo Alto, California USA, pp 5876–5883. 10.1609/aaai.v32i1.12048
  • Malandri L, Porcel C, Xing F, Serrano-Guerrero J, Cambria E. Soft computing for recommender systems and sentiment analysis. Appl Soft Comput. 2022 doi: 10.1016/j.asoc.2021.108246. [ CrossRef ] [ Google Scholar ]
  • Mäntylä MV, Graziotin D, Kuutila M. The evolution of sentiment analysis-a review of research topics, venues and top cited papers. Comput Sci Rev. 2018; 27 :16–32. doi: 10.1016/j.cosrev.2017.10.002. [ CrossRef ] [ Google Scholar ]
  • Mao Y, Zhang Y, Jiao L, Zhang H. Document-level sentiment analysis using attention-based bi-directional long short-term memory network and two-dimensional convolutional neural network. Electronics. 2022; 11 (12):1906. doi: 10.3390/electronics11121906. [ CrossRef ] [ Google Scholar ]
  • Maqsood H, Mehmood I, Maqsood M, Yasir M, Afzal S, Aadil F, et al. A local and global event sentiment based efficient stock exchange forecasting using deep learning. Int J Inf Manag. 2020; 50 :432–451. doi: 10.1016/j.ijinfomgt.2019.07.011. [ CrossRef ] [ Google Scholar ]
  • Martinez-Camara E, Martin-Valdivia MT, Urena-Lopez LA (2011) Opinion classification techniques applied to a Spanish Corpus. In: International conference on application of natural language to information systems, Springer, Berlin, Heidelberg, pp 169–176. 10.1007/978-3-642-22327-3_17
  • Martinez-Garcia A, Badia T, Barnes J (2021) Evaluating morphological typology in zero-shot cross-lingual transfer. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing, association for computational linguistics, pp 3136–3153. 10.18653/v1/2021.acl-long.244
  • Medhat W, Hassan A, Korashy H. Sentiment analysis algorithms and applications: a survey. Ain Shams Eng J. 2014; 5 (4):1093–1113. doi: 10.1016/j.asej.2014.04.011. [ CrossRef ] [ Google Scholar ]
  • Momtazi S (2012) Fine-grained German sentiment analysis on social media. In: Proceedings of the 8th International conference on language resources and evaluation (LREC’12), European Language Resources Association (ELRA), pp 1215–1220. http://www.lrec-conf.org/proceedings/lrec2012/pdf/999_Paper.pdf
  • Myslin M, Zhu SH, Chapman W, Conway M. Using Twitter to examine smoking behavior and perceptions of emerging tobacco products. J Med Int Res. 2013; 15 (8):174. doi: 10.2196/jmir.2534. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Nair RR, Mathew J, Muraleedharan V, Deepa Kanmani S (2019) Study of machine learning techniques for sentiment analysis. In: 2019 3rd International Conference on Computing Methodologies and Communication (ICCMC), IEEE, pp 978–984. 10.1109/ICCMC.2019.8819763
  • Nassif AB, Elnagar A, Shahin I, Henno S. Deep learning for Arabic subjective sentiment analysis: challenges and research opportunities. Appl Soft Comput. 2021; 98 :6836. doi: 10.1016/j.asoc.2020.106836. [ CrossRef ] [ Google Scholar ]
  • Nassirtoussi AK, Aghabozorgi S, Wah TY, Ngo DCL. Text mining for market prediction: a systematic review. Expert Syst Appl. 2014; 41 (16):7653–7670. doi: 10.1016/j.eswa.2014.06.009. [ CrossRef ] [ Google Scholar ]
  • Nejat B, Carenini G, Ng R (2017) Exploring joint neural model for sentence level discourse parsing and sentiment analysis. In: Proceedings of the 18th annual sigdial meeting on discourse and dialogue, association for computational linguistics, pp 289–298. 10.18653/v1/w17-5535
  • Nhlabano VV, Lutu PEN (2018). Impact of text pre-processing on the performance of sentiment analysis models for social media data. In: 2018 International Conference on Advances in Big Data, Computing and Data Communication Systems (IcABCD), IEEE, pp 1–6. 10.1109/ICABCD.2018.8465135
  • Nicholls C, Song F (2010) Comparison of feature selection methods for sentiment analysis. In: Canadian conference on artificial intelligence, Springer, Berlin, Heidelberg, pp 286–289. 10.1007/978-3-319-96292-4_21
  • Nielsen FA (2011) A New ANEW: Evaluation of a Word List for Sentiment Analysis in Microblogs. In: Proceedings of the ESWC2011 workshop on “Making Sense of Microposts”: big things come in small packages, Heraklion, Crete, Greece: CEUR-WS, Aachen, pp 93–98. 10.48550/arXiv.1103.2903
  • Obiedat R, Al-Darras D, Alzaghoul E, Harfoushi O. Arabic aspect-based sentiment analysis: a systematic literature review. IEEE Access. 2021; 9 :152628–152645. doi: 10.1109/ACCESS.2021.3127140. [ CrossRef ] [ Google Scholar ]
  • Ombabi AH, Ouarda W, Alimi AM. Deep learning CNN–LSTM framework for Arabic sentiment analysis using textual information shared in social networks. Soc Netw Anal Min. 2020; 10 (1):1–13. doi: 10.1007/s13278-020-00668-1. [ CrossRef ] [ Google Scholar ]
  • Oueslati O, Cambria E, Ben HM, Ounelli H. A review of sentiment analysis research in Arabic language. Futur Gener Comput Syst. 2020; 112 :408–430. doi: 10.1016/j.future.2020.05.034. [ CrossRef ] [ Google Scholar ]
  • Ouyang X, Zhou P, Li CH, Liu L (2015) Sentiment Analysis Using Convolutional Neural Network. In: 2015 IEEE International conference on computer and information technology; ubiquitous computing and communications; dependable, autonomic and secure computing; pervasive intelligence and computing, IEEE, p 2359–2364. 10.1109/CIT/IUCC/DASC/PICOM.2015.349
  • Pecore S, Villaneau J (2019) Complex and Precise Movie and Book Annotations in French Language for Aspect Based Sentiment Analysis. In: LREC 2018—11th International conference on language resources and evaluation, European Language Resources Association (ELRA), p 2647–2652. https://aclanthology.org/L18-1419
  • Peng H, Cambria E, Hussain A. A review of sentiment analysis research in Chinese language. Cogn Comput. 2017; 9 (4):423–435. doi: 10.1007/s12559-017-9470-8. [ CrossRef ] [ Google Scholar ]
  • Peng H, Ma Y, Li Y, Cambria E. Learning multi-grained aspect target sequence for Chinese sentiment analysis. Knowl Based Syst. 2018; 148 :167–176. doi: 10.1016/j.knosys.2018.02.034. [ CrossRef ] [ Google Scholar ]
  • Pereira DA. A survey of sentiment analysis in the Portuguese language. Artif Intell Rev. 2021; 54 (2):1087–1115. doi: 10.1007/s10462-020-09870-1. [ CrossRef ] [ Google Scholar ]
  • Perianes-Rodriguez A, Waltman L, van Eck NJ. Constructing bibliometric networks: a comparison between full and fractional counting. J Informetr. 2016; 10 (4):1178–1195. doi: 10.1016/j.joi.2016.10.006. [ CrossRef ] [ Google Scholar ]
  • Persson O (2017) BibExcel [Software]. Available from https://homepage.univie.ac.at/juan.gorraiz/bibexcel/
  • Persson O, Danell R, Schneider JW (2009) How to Use Bibexcel for Various Types of Bibliometric Analysis. In: Celebrating scholarly communication studies: a festschrift for Olle Persson at his 60th birthday, ed. J. Schneider F. Åström, R. Danell, B. Larsen. Leuven, Belgium: International Society for Scientometrics and Informetrics, pp 9–24
  • Picasso A, Merello S, Ma Y, Oneto L, Cambria E. Technical analysis and sentiment embeddings for market trend prediction. Expert Syst Appl. 2019; 135 :60–70. doi: 10.1016/j.eswa.2019.06.014. [ CrossRef ] [ Google Scholar ]
  • Piryani R, Madhavi D, Singh VK. Analytical mapping of opinion mining and sentiment analysis research during 2000–2015. Inf Process Manag. 2017; 53 (1):122–150. doi: 10.1016/j.ipm.2016.07.001. [ CrossRef ] [ Google Scholar ]
  • Piryani R, Piryani B, Singh VK, Pinto D. Sentiment analysis in Nepali: exploring machine learning and lexicon-based approaches. J Intell Fuzzy Syst. 2020; 39 (2):2201–2212. doi: 10.3233/JIFS-179884. [ CrossRef ] [ Google Scholar ]
  • Plaza-del-Arco FM, Martín-Valdivia MT, Ureña-López LA, Mitkov R. Improved emotion recognition in spanish social media through incorporation of lexical knowledge. Futur Gener Comput Syst. 2020; 110 :1000–1008. doi: 10.1016/j.future.2019.09.034. [ CrossRef ] [ Google Scholar ]
  • Poria S, Cambria E, Gelbukh A. Aspect extraction for opinion mining with a deep convolutional neural network. Knowl Based Syst. 2016; 108 :42–49. doi: 10.1016/j.knosys.2016.06.009. [ CrossRef ] [ Google Scholar ]
  • Prabha MI, Srikanth GU (2019). Survey of Sentiment Analysis Using Deep Learning Techniques. In: 2019 1st International Conference on Innovations in Information and Communication Technology (ICIICT), IEEE, p 1–9. 10.1109/ICIICT1.2019.8741438
  • Prabhat A, Khullar V (2017). Sentiment Classification on Big Data Using Naïve Bayes and Logistic Regression. In: 2017 International Conference on Computer Communication and Informatics (ICCCI), IEEE, p 1–5. 10.1109/ICCCI.2017.8117734
  • Preethi PG, Uma V, Kumar A. Temporal sentiment analysis and causal rules extraction from Tweets for event prediction. Procedia Comput Sci. 2015; 48 :84–89. doi: 10.1016/j.procs.2015.04.154. [ CrossRef ] [ Google Scholar ]
  • Qasem M, Thulasiram R, Thulasiram P (2015) Twitter Sentiment Classification Using Machine Learning Techniques for Stock Markets. In: 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI), IEEE, p 834–840. 10.1109/ICACCI.2015.7275714
  • Qazi A, Fayaz H, Wadi A, Raj RG, Rahim NA, Khan WA. The artificial neural network for solar radiation prediction and designing solar systems: a systematic literature review. J Clean Prod. 2015; 104 :1–12. doi: 10.1016/j.jclepro.2015.04.041. [ CrossRef ] [ Google Scholar ]
  • Qazi A, Raj RG, Hardaker G, Standing C. A systematic literature review on opinion types and sentiment analysis techniques: tasks and challenges. Internet Res. 2017; 27 (3):608–630. doi: 10.1108/IntR-04-2016-0086. [ CrossRef ] [ Google Scholar ]
  • Raghuvanshi N, Patil JM (2016) A Brief Review on Sentiment Analysis. In: 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), IEEE, p 2827–2831. 10.1109/ICEEOT.2016.7755213
  • Rambocas M, Pacheco BG. Online sentiment analysis in marketing research: a review. J Res Interact Mark. 2018; 12 (2):146–163. doi: 10.1108/JRIM-05-2017-0030. [ CrossRef ] [ Google Scholar ]
  • Rao G, Gu X, Feng Z, Cong Q, Zhang L (2021) A Novel Joint Model with Second-Order Features and Matching Attention for Aspect-Based Sentiment Analysis. In: 2021 International Joint Conference on Neural Networks (IJCNN), IEEE, p 1–8. 10.1109/IJCNN52387.2021.9534321
  • Ravi K, Ravi V. A survey on opinion mining and sentiment analysis: tasks, approaches and applications. Knowl Based Syst. 2015; 89 :14–46. doi: 10.1016/j.knosys.2015.06.015. [ CrossRef ] [ Google Scholar ]
  • Rotta R, Noack A. Multilevel local search algorithms for modularity clustering. ACM J Exp Algorithmics. 2011; 16 (2):1–27. doi: 10.1145/1963190.1970376. [ CrossRef ] [ Google Scholar ]
  • Sadamitsu K, Sekine S, Yamamoto M (2008) Sentiment Analysis Based on Probabilistic Models Using Inter-Sentence Information. In: Proceedings of the sixth international conference on language resources and evaluation (LREC’08), European Language Resources Association (ELRA), p 2892–2896. http://www.lrec-conf.org/proceedings/lrec2008/pdf/736_paper.pdf
  • Salur MU, Aydin I. A novel hybrid deep learning model for sentiment classification. IEEE Access. 2020; 8 :58080–58093. doi: 10.1109/ACCESS.2020.2982538. [ CrossRef ] [ Google Scholar ]
  • Sánchez-Rada JF, Iglesias CA. Social context in sentiment analysis: formal definition, overview of current trends and framework for comparison. Inf Fusion. 2019; 52 :344–356. doi: 10.1016/j.inffus.2019.05.003. [ CrossRef ] [ Google Scholar ]
  • Santos R, Costa AA, Silvestre JD, Pyl L. Informetric analysis and review of literature on the role of BIM in sustainable construction. Autom Constr. 2019; 103 :221–234. doi: 10.1016/j.autcon.2019.02.022. [ CrossRef ] [ Google Scholar ]
  • Sari IC, Ruldeviyani Y (2020) Sentiment Analysis of the Covid-19 Virus Infection in Indonesian Public Transportation on Twitter Data: A Case Study of Commuter Line Passengers. In: 2020 International Workshop on Big Data and Information Security (IWBIS), IEEE, pp 23–28. 10.1109/IWBIS50925.2020.9255531
  • Sarsam SM, Al-Samarraie H, Alzahrani AI, Wright B. Sarcasm detection using machine learning algorithms in Twitter: a systematic review. Int J Mark Res. 2020; 62 (5):578–598. doi: 10.1177/1470785320921779. [ CrossRef ] [ Google Scholar ]
  • Sayed AA, Elgeldawi E, Zaki AM, Galal AR (2020) Sentiment Analysis for Arabic Reviews Using Machine Learning Classification Algorithms. In: 2020 International Conference on Innovative Trends in Communication and Computer Engineering (ITCE), IEEE, p 56–63. 10.1109/ITCE48509.2020.9047822
  • Schouten K, Frasincar F. Survey on aspect-level sentiment analysis. IEEE Trans Knowl Data Eng. 2015; 28 (3):813–830. doi: 10.1109/TKDE.2015.2485209. [ CrossRef ] [ Google Scholar ]
  • Schuller B, Mousa AED, Vryniotis V. Sentiment analysis and opinion mining: on optimal parameters and performances. Wiley Interdiscip Rev. 2015; 5 (5):255–263. doi: 10.1002/widm.1159. [ CrossRef ] [ Google Scholar ]
  • Serrano-Guerrero J, Romero FP, Olivas JA. Fuzzy logic applied to opinion mining: a review. Knowl Based Syst. 2021; 222 :107018. doi: 10.1016/j.knosys.2021.107018. [ CrossRef ] [ Google Scholar ]
  • Sharma S, Jain A. Role of sentiment analysis in social media security and analytics. Wiley Interdiscip Rev. 2020; 10 (5):e1366. doi: 10.1002/widm.1366. [ CrossRef ] [ Google Scholar ]
  • Shirsat VS, Jagdale RS, Deshmukh SN (2018) Document Level Sentiment Analysis from News Articles. In: 2017 International Conference on Computing, Communication, Control and Automation (ICCUBEA), IEEE, pp 1–4. 10.1109/ICCUBEA.2017.8463638
  • Shofiya C, Abidi S. Sentiment analysis on Covid-19-related social distancing in Canada using Twitter data. Int J Environ Res Public Health. 2021; 18 (11):5993. doi: 10.3390/ijerph18115993. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Singh RK, Sachan MK, Patel RB. 360 Degree view of cross-domain opinion classification: a survey. Artif Intell Rev. 2021; 54 (2):1385–1506. doi: 10.1007/s10462-020-09884-9. [ CrossRef ] [ Google Scholar ]
  • Singh T, Kumari M. Role of text pre-processing in twitter sentiment analysis. Procedia Comput Sci. 2016; 89 :549–554. doi: 10.1016/j.procs.2016.06.095. [ CrossRef ] [ Google Scholar ]
  • Smailović J, Grčar M, Lavrač N, Žnidaršič M. Stream-based active learning for sentiment analysis in the financial domain. Inf Sci. 2014; 285 (1):181–203. doi: 10.1016/j.ins.2014.04.034. [ CrossRef ] [ Google Scholar ]
  • Smetanin S. The applications of sentiment analysis for Russian language texts: current challenges and future perspectives. IEEE Access. 2020; 8 :110693–110719. doi: 10.1109/ACCESS.2020.3002215. [ CrossRef ] [ Google Scholar ]
  • Stemler S. An overview of content analysis. Pract Assess Res Eval. 2000; 7 (1):1–16. doi: 10.1362/146934703771910080. [ CrossRef ] [ Google Scholar ]
  • Sutoyo E, Rifai AP, Risnumawan A, Saputra M. A comparison of text weighting schemes on sentiment analysis of government policies: a case study of replacement of national examinations. Multimed Tools Appl. 2022; 81 (5):6413–6431. doi: 10.1007/s11042-022-11900-9. [ CrossRef ] [ Google Scholar ]
  • Syed AZ, Aslam M, Martinez-Enriquez AM (2010) Lexicon Based Sentiment Analysis of Urdu Text Using SentiUnits. In: Mexican international conference on artificial intelligence, Springer, Berlin, Heidelberg, pp 32–43. 10.1007/978-3-642-16761-4_4
  • Taboada M. Sentiment analysis: an overview from linguistics. Annu Rev Linguist. 2016; 2 :325–347. doi: 10.1146/annurev-linguistics-011415-040518. [ CrossRef ] [ Google Scholar ]
  • Tai KS, Socher R, Manning CD (2015) Improved Semantic Representations from Tree-Structured Long Short-Term Memory Networks. In: Proceedings of the 53rd Annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing, association for computational linguistics, pp 1556–1566. 10.3115/v1/p15-1150
  • Tammina S (2020) A Hybrid Learning Approach for Sentiment Classification in Telugu Language. In: 2020 International conference on Artificial Intelligence and Signal Processing (AISP), IEEE, p 1–6. 10.1109/AISP48273.2020.9073109
  • Tan S, Cheng X, Wang Y, Xu H (2009) Adapting Naive Bayes to Domain Adaptation for Sentiment Analysis. In: European Conference on Information Retrieval, Springer, Berlin, Heidelberg, p 337–349. 10.1007/978-3-642-00958-7_31
  • Tan X, Cai Y, Xu J, Leung H-F, Chen W, Li Q. Improving aspect-based sentiment analysis via aligning aspect embedding. Neurocomputing. 2020; 383 :336–347. doi: 10.1016/j.neucom.2019.12.035. [ CrossRef ] [ Google Scholar ]
  • Tembhurne JV, Diwan T. Sentiment analysis in textual, visual and multimodal inputs using recurrent neural networks. Multimed Tools Appl. 2021; 80 (5):6871–6910. doi: 10.1007/s11042-020-10037-x. [ CrossRef ] [ Google Scholar ]
  • Thakur RK, Deshpande MV. Kernel optimized-support vector machine and mapreduce framework for sentiment classification of train reviews. Int J Uncertain Fuzziness Knowl Based Syst. 2019; 27 (6):1025–1050. doi: 10.1142/S0218488519500454. [ CrossRef ] [ Google Scholar ]
  • Thelwall M, Buckley K, Paltoglou G. Sentiment strength detection for the social web. J Am Soc Inform Sci Technol. 2012; 63 (1):163–173. doi: 10.1002/asi.21662. [ CrossRef ] [ Google Scholar ]
  • Thet TT, Na JC, Khoo CSG. Aspect-based sentiment analysis of movie reviews on discussion boards. J Inf Sci. 2010; 36 (6):823–848. doi: 10.1177/0165551510388123. [ CrossRef ] [ Google Scholar ]
  • Trilla A, Alías F (2009) Sentiment Classification in English from Sentence-Level Annotations of Emotions Regarding Models of Affect. In: 10th Annual Conference of the International Speech Communication Association, International Speech Communication Association (ISCA), p 516–519. 10.21437/interspeech.2009-189
  • Trisna KW, Jie HJ. Deep learning approach for aspect-based sentiment classification: a comparative review. Appl Artif Intell. 2022 doi: 10.1080/08839514.2021.2014186. [ CrossRef ] [ Google Scholar ]
  • Valverde-Albacete FJ, Carrillo-de-Albornoz J, Peláez-Moreno C (2013) A Proposal for New Evaluation Metrics and Result Visualization Technique for Sentiment Analysis Tasks. In: International conference of the cross-language evaluation forum for European languages, Springer, Berlin, Heidelberg, p 41–52. 10.1007/978-3-642-40802-1_5
  • Van Eck NJ, Waltman L. Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics. 2010; 84 (2):523–538. doi: 10.1007/s11192-009-0146-3. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Verma S. Sentiment analysis of public services for smart society: literature review and future research directions. Gov Inf Quart. 2022; 39 (3):101708. doi: 10.1016/j.giq.2022.101708. [ CrossRef ] [ Google Scholar ]
  • Waila P, Marisha S, Singh VK, Singh MK (2012) Evaluating Machine Learning and Unsupervised Semantic Orientation Approaches for Sentiment Analysis of Textual Reviews. In: 2012 IEEE International conference on computational intelligence and computing research, IEEE, pp 1–6. 10.1109/ICCIC.2012.6510235
  • Waltman L, Van Eck NJ. A smart local moving algorithm for large-scale modularity-based community detection. Eur Phys J B. 2013; 86 (11):1–33. doi: 10.1140/epjb/e2013-40829-0. [ CrossRef ] [ Google Scholar ]
  • Waltman L, Van Eck NJ, Noyons ECM. A unified approach to mapping and clustering of bibliometric networks. J Inform. 2010; 4 (4):629–635. doi: 10.1016/j.joi.2010.07.002. [ CrossRef ] [ Google Scholar ]
  • Wang C, Yang X, Ding L. Deep learning sentiment classification based on weak tagging information. IEEE Access. 2021; 9 :66509–66518. doi: 10.1109/ACCESS.2021.3077059. [ CrossRef ] [ Google Scholar ]
  • Wang L, Wan Y (2011) Sentiment Classification of Documents Based on Latent Semantic Analysis. In: International conference on computer education, simulation and modeling, Springer, Berlin, Heidelberg, p 356–361. 10.1007/978-3-642-21802-6_57
  • Wang T, Lu K, Chow KP, Zhu Q. COVID-19 sensing: negative sentiment analysis on social media in China via BERT model. IEEE Access. 2020; 8 :138162–138169. doi: 10.1109/ACCESS.2020.3012595. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Wang Z, Chong CS, Lan L, Yang Y, Ho S-B, Tong JC (2016) Fine-Grained Sentiment Analysis of Social Media with Emotion Sensing. In: 2016 Future Technologies Conference (FTC), IEEE, pp 1361–1364. 10.1109/FTC.2016.7821783
  • Wang Z, Ho S-B, Cambria E. A review of emotion sensing: categorization models and algorithms. Multimed Tools Appl. 2020; 79 (47):35553–35582. doi: 10.1007/s11042-019-08328-z. [ CrossRef ] [ Google Scholar ]
  • Wang Z, Ho S-B, Cambria E. Multi-level fine-scaled sentiment sensing with ambivalence handling. Int J Uncertain Fuzziness Knowl-Based Syst. 2020; 28 (4):683–697. doi: 10.1142/S0218488520500294. [ CrossRef ] [ Google Scholar ]
  • Wang Z, Lin Z. Optimal feature selection for learning-based algorithms for sentiment classification. Cogn Comput. 2020; 12 (1):238–248. doi: 10.1007/s12559-019-09669-5. [ CrossRef ] [ Google Scholar ]
  • Wang Z, Tong VJC, Chan D (2014) Issues of Social Data Analytics with a New Method for Sentiment Analysis of Social Media Data. In: 2014 IEEE 6th International conference on cloud computing technology and science, IEEE, pp 899–904. 10.1109/CloudCom.2014.40
  • Wang ZY, Li G, Li CY, Li A. Research on the semantic-based co-word analysis. Scientometrics. 2012; 90 (3):855–875. doi: 10.1007/s11192-011-0563-y. [ CrossRef ] [ Google Scholar ]
  • Wankhade M, Rao ACS, Kulkarni C. A survey on sentiment analysis methods, applications, and challenges. Artif Intell Rev. 2022; 55 :5731–5780. doi: 10.1007/s10462-022-10144-1. [ CrossRef ] [ Google Scholar ]
  • Xing FZ, Cambria E, Welsch RE. Natural language based financial forecasting: a survey. Artif Intell Rev. 2018; 50 (1):49–73. doi: 10.1007/s10462-017-9588-9. [ CrossRef ] [ Google Scholar ]
  • Xing FZ, Pallucchini F, Cambria E. Cognitive-inspired domain adaptation of sentiment lexicons. Inf Process Manage. 2019; 56 (3):554–564. doi: 10.1016/j.ipm.2018.11.002. [ CrossRef ] [ Google Scholar ]
  • Xiong Z, Qin K, Yang H, Luo G. Learning Chinese word representation better by cascade morphological N-Gram. Neural Comput Appl. 2021; 33 (8):3757–3768. doi: 10.1007/s00521-020-05198-7. [ CrossRef ] [ Google Scholar ]
  • Yang B, Shao B, Wu L, Lin X. Multimodal sentiment analysis with unidirectional modality translation. Neurocomputing. 2022; 467 :130–137. doi: 10.1016/j.neucom.2021.09.041. [ CrossRef ] [ Google Scholar ]
  • Yang L, Li Y, Wang J, Sherratt RS. Sentiment analysis for E-commerce product reviews in Chinese based on sentiment lexicon and deep learning. IEEE Access. 2020; 8 :23522–23530. doi: 10.1109/ACCESS.2020.2969854. [ CrossRef ] [ Google Scholar ]
  • Yang M, Qu Q, Shen Y, Lei K, Zhu J. Cross-domain aspect/sentiment-aware abstractive review summarization by combining topic modeling and deep reinforcement learning. Neural Comput Appl. 2020; 32 (11):6421–6433. doi: 10.1007/s00521-018-3825-2. [ CrossRef ] [ Google Scholar ]
  • Yi J, Niblack W (2005) Sentiment Mining in WebFountain. In: 21st International Conference on Data Engineering (ICDE’05), IEEE, p 1073–1083. 10.1109/ICDE.2005.132
  • Yin H, Yang S, Li J (2020) Detecting Topic and Sentiment Dynamics Due to COVID-19 Pandemic Using Social Media. In: International conference on advanced data mining and applications, Springer, Cham, p 610–623. 10.1007/978-3-030-65390-3_46
  • You L, Li Y, Wang Y, Zhang J, Yang Y (2016) A deep learning-based RNNs model for automatic security audit of short messages. In: 2016 16th International Symposium on Communications and Information Technologies (ISCIT), IEEE, p 225–229. 10.1109/ISCIT.2016.7751626
  • You T, Yoon J, Kwon O-H, Jung W-S. Tracing the evolution of physics with a keyword co-occurrence network. J Korean Phys Soc. 2021; 78 (3):236–243. doi: 10.1007/s40042-020-00051-5. [ CrossRef ] [ Google Scholar ]
  • Yu J, Jiang J, Xia R. Entity-sensitive attention and fusion network for entity-level multimodal sentiment classification. IEEE/ACM Trans Audio Speech Lang Process. 2019; 28 :429–439. doi: 10.1109/TASLP.2019.2957872. [ CrossRef ] [ Google Scholar ]
  • Yuan JH, Wu Y, Lu X, Zhao YY, Qin B, Liu T. Recent advances in deep learning based sentiment analysis. Sci China Technol Sci. 2020; 63 (10):1947–1970. doi: 10.1007/s11431-020-1634-3. [ CrossRef ] [ Google Scholar ]
  • Yue L, Chen W, Li X, Zuo W, Yin M. A survey of sentiment analysis in social media. Knowl Inf Syst. 2019; 60 (2):617–663. doi: 10.1007/s10115-018-1236-4. [ CrossRef ] [ Google Scholar ]
  • Yurtalan G, Koyuncu M, Turhan Ç. A polarity calculation approach for lexicon-based Turkish sentiment analysis. Turk J Electr Eng Comput Sci. 2019; 27 (2):1325–1339. doi: 10.3906/elk-1803-92. [ CrossRef ] [ Google Scholar ]
  • Zhang L, Wang S, Liu B. Deep learning for sentiment analysis: a survey. Wiley Interdiscip Rev. 2018; 8 (4):e1253. doi: 10.1002/widm.1253. [ CrossRef ] [ Google Scholar ]
  • Zhang Yin, Du J, Ma X, Wen H, Fortino G. Aspect-based sentiment analysis for user reviews. Cogn Comput. 2021; 13 (5):1114–1127. doi: 10.1007/s12559-021-09855-4. [ CrossRef ] [ Google Scholar ]
  • Zhang Y, Zhang Z, Miao D, Wang J. Three-way enhanced convolutional neural networks for sentence-level sentiment classification. Inf Sci. 2019; 477 :55–64. doi: 10.1016/j.ins.2018.10.030. [ CrossRef ] [ Google Scholar ]
  • Zhao N, Gao H, Wen X, Li H. Combination of convolutional neural network and gated recurrent unit for aspect-based sentiment analysis. IEEE Access. 2021; 9 :15561–15569. doi: 10.1109/ACCESS.2021.3052937. [ CrossRef ] [ Google Scholar ]
  • Zhou J, Ye J. Sentiment analysis in education research: a review of journal publications. Interact Learn Environ. 2020 doi: 10.1080/10494820.2020.1826985. [ CrossRef ] [ Google Scholar ]
  • Zucco C, Calabrese B, Agapito G, Guzzi PH, Cannataro M. Sentiment analysis for mining texts and social networks data: methods and tools. Wiley Interdiscip Rev. 2020; 10 (1):e1333. doi: 10.1002/widm.1333. [ CrossRef ] [ Google Scholar ]
  • Zunic A, Corcoran P, Spasic I. Sentiment analysis in health and well-being: systematic review. JMIR Med Inform. 2020; 8 (1):e16023. doi: 10.2196/16023. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Zuo E, Zhao H, Chen B, Chen Q. Context-specific heterogeneous graph convolutional network for implicit sentiment analysis. IEEE Access. 2020; 8 :37967–37975. doi: 10.1109/ACCESS.2020.2975244. [ CrossRef ] [ Google Scholar ]

Open Access

Peer-reviewed

Research Article

The public attitude towards ChatGPT on reddit: A study based on unsupervised learning from sentiment analysis and topic modeling

Roles Data curation, Formal analysis, Methodology, Project administration, Software, Visualization, Writing – original draft, Writing – review & editing

Affiliation Department of Data Science, School of Computer Science and Engineering, Guangzhou Institute of Science and Technology, Guangzhou, Guangdong, China

Roles Conceptualization, Formal analysis, Investigation, Project administration, Supervision, Validation, Writing – original draft, Writing – review & editing

* E-mail: [email protected]

Affiliation Department of Management, School of Business, Macau University of Science and Technology, Macao, China

Roles Data curation, Funding acquisition, Investigation, Validation

Affiliation Data Science Research Center, Faculty of Innovation Engineering, Macau University of Science and Technology, Macao, China

Roles Funding acquisition, Methodology, Project administration, Resources, Writing – review & editing

Affiliation Department of Decision Sciences, School of Business, Macau University of Science and Technology, Macao, China

  • Zhaoxiang Xu
  • Qingguo Fang
  • Yanbo Huang
  • Mingjian Xie

PLOS

  • Published: May 14, 2024
  • https://doi.org/10.1371/journal.pone.0302502

Table 1

ChatGPT has demonstrated impressive abilities and affected many aspects of human society since its release, attracting widespread attention across social spheres. This study aims to assess public perception of ChatGPT on Reddit comprehensively. The dataset was collected from Reddit, a social media platform, and includes 23,773 posts and comments related to ChatGPT. First, to examine public attitudes, this study conducts content analysis using topic modeling with the Latent Dirichlet Allocation (LDA) algorithm to extract pertinent topics. Second, sentiment analysis categorizes user posts and comments as positive, negative, or neutral using the natural language processing tools TextBlob and VADER. The topic modeling identifies seven topics regarding ChatGPT, which can be grouped into three themes: user perception, technical methods, and impacts on society. The sentiment analysis shows that 61.6% of the posts and comments hold favorable opinions of ChatGPT. These posts emphasize ChatGPT’s ability to prompt and engage in natural conversations with users, without relying on complex natural language processing. The study provides suggestions for ChatGPT developers to enhance its usability design and functionality. Meanwhile, stakeholders, including users, should understand the advantages and disadvantages of ChatGPT in human society to promote ethical and regulated implementation of the system.

Citation: Xu Z, Fang Q, Huang Y, Xie M (2024) The public attitude towards ChatGPT on reddit: A study based on unsupervised learning from sentiment analysis and topic modeling. PLoS ONE 19(5): e0302502. https://doi.org/10.1371/journal.pone.0302502

Editor: Jitendra Yadav, IBS Hyderabad: ICFAI Business School, INDIA

Received: October 31, 2023; Accepted: April 7, 2024; Published: May 14, 2024

Copyright: © 2024 Xu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: 23,773 entries were retained, forming the database Processed_GPT_total.json, displayed in supporting information. To improve the reproducibility of our research results, the data and research design have been stored on Protocols.io. Protocols.io has assigned a protocol identifier (DOI) to our protocols, which is DOI: dx.doi.org/10.17504/protocols.io.bp2l6xee1lqe/v1. The protocol is freely available to anyone under an Open Access license. Readers can find relevant details by visiting the following URL: https://www.protocols.io/private/899FA95EAEFD11EEAB870A58A9FEAC02.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

1 Introduction

In an era of rapid development in artificial intelligence (AI), ChatGPT has demonstrated remarkable capabilities and expanded into most domains of life. Since its introduction, it has shown great potential for applications in education, healthcare, industry, agriculture, travel, transportation, e-commerce, entertainment, marketing, and finance. The GPT (Generative Pre-trained Transformer) model represents a significant breakthrough in natural language processing, propelling the advancement of machines whose language abilities resemble human communication [ 1 ]. ChatGPT is a specific application developed by OpenAI, a private artificial intelligence research lab, based on the GPT-3.5 model and released on November 30, 2022. The GPT-3.5 model is a technological iteration of the GPT-3 model. ChatGPT is trained on an extensive dataset comprising both text and code. It can generate text, perform language translation, produce various forms of creative content, and provide informative responses to user inquiries [ 2 ]. On March 15, 2023, OpenAI unveiled the new large-scale multimodal model, GPT-4, which processes textual data, incorporates image content, and exhibits improved response accuracy. Users can access GPT-4 through ChatGPT Plus on a fee-paying basis [ 2 ].

ChatGPT had over 100 million users in January 2023 [ 3 ]. The website generated 1.6 billion visits in June 2023 [ 4 ]. Mainstream media outlets have expressed both admiration and concern in response to ChatGPT. For example, the Guardian published an article written by GPT-3, asserting that humans should trust and respect AI’s role in improving their lives [ 5 ]. The New York Times published an article praising GPT-3’s impressive capabilities while noting its flaws [ 6 ]. In contrast, the Washington Post discussed the potential risks of AI, including ChatGPT, and argued that nations worldwide urgently need an enhanced regulatory framework to address the issue [ 7 ]. The public holds diverse opinions regarding significant new technologies [ 8 ], including ChatGPT. Some people believe that ChatGPT has many positive implications for society, such as helping people learn new knowledge and languages, generating creative text formats, replacing repetitive and laborious tasks, and providing companionship and support. Conversely, dissenting opinions emphasize its ongoing developmental stage, limitations, ethical issues, and potential to cause job losses.

Public attitudes toward ChatGPT are an important research topic for three reasons. First, studying the public’s attitudes toward ChatGPT can help predict how widely ChatGPT will be used and accepted [ 9 ]. Public attitudes toward ChatGPT encompass affect, behavior, and cognition [ 10 ]. This study uses topic modeling and qualitative content analysis to examine how individuals perceive ChatGPT, and sentiment analysis of posts and comments to understand people’s emotions toward ChatGPT. Second, public attitudes play an important role in shaping applied ethics. If the public is concerned about ChatGPT being used for malicious purposes, this could prompt a shift in its developmental trajectory and encourage its ethical use. Third, understanding public attitudes can guide development that better serves users. It provides AI researchers and practitioners with valuable insights into the potential applications and major limitations of ChatGPT. Understanding users’ concerns, needs, and preferences can help practitioners personalize ChatGPT’s responses to individual users and make ChatGPT more engaging and interactive.

This study aims to explore public attitudes toward ChatGPT, using data collected from users’ posts and comments on the social media site Reddit. A large number of studies have used data from Reddit to examine public perceptions, attitudes, and opinions. As of 2023, Reddit had approximately 57 million daily active users, making it one of the largest social media outlets in terms of users [ 11 , 12 ]. Based on the posts and comments from Reddit, three research questions will be addressed:

  • RQ 1: What are the main topics on Reddit regarding ChatGPT?
  • RQ 2: What is the public sentiment toward ChatGPT?
  • RQ 3: How does public sentiment towards ChatGPT change over time?

The study employs the LDA model for topic modeling to identify emerging themes in public discussions about ChatGPT. In addition, sentiment classification is performed using a weighted combination of VADER and TextBlob, and the results are presented as bar charts and pie charts to analyze the variations in public sentiment towards ChatGPT. Furthermore, using the creation timestamp and sentiment score of each sample, a daily change curve is plotted to illustrate the fluctuations in the number of positive and negative samples since 2023. This allows us to examine whether sentiment has changed over time.
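The weighted combination and the daily tally described above can be sketched as follows. This is a minimal illustration only: the equal weight `w_vader=0.5` and the ±0.05 cutoff (borrowed from VADER's conventional compound-score threshold) are assumptions, since the weights and thresholds actually used are not stated here, and the three sample rows are made-up values. The inputs stand for scores that VADER (compound) and TextBlob (polarity) would produce, both in the range -1 to 1.

```python
from collections import Counter
from datetime import datetime, timezone


def combine_scores(vader_compound, textblob_polarity, w_vader=0.5):
    """Weighted average of two polarity scores, each in [-1, 1].

    w_vader is a hypothetical weight; 0.5 gives both tools equal say.
    """
    return w_vader * vader_compound + (1.0 - w_vader) * textblob_polarity


def label(score, threshold=0.05):
    """Map a combined score to a class (threshold is an assumption)."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def daily_counts(samples, w_vader=0.5, threshold=0.05):
    """Count labels per UTC day.

    samples: iterable of (unix_timestamp, vader_compound, textblob_polarity).
    Returns a Counter keyed by (ISO date, label).
    """
    counts = Counter()
    for ts, vader, blob in samples:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        counts[(day, label(combine_scores(vader, blob, w_vader), threshold))] += 1
    return counts


# Made-up scores for three posts; timestamps are seconds since the epoch.
samples = [
    (1672531200, 0.80, 0.50),    # 2023-01-01 UTC
    (1672534800, -0.60, -0.20),  # 2023-01-01 UTC
    (1672617600, 0.02, 0.00),    # 2023-01-02 UTC
]
for key, n in sorted(daily_counts(samples).items()):
    print(key, n)
```

Plotting the positive and negative counts per day from the resulting counter would give a daily change curve of the kind described.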

This study makes the following contributions. First, it extends research on earlier AI products to ChatGPT. As artificial intelligence advances, AI products such as chatbots, smart virtual assistants, and self-driving cars bring innovation, efficiency, and value creation to various fields. Therefore, extending this line of AI research, with a particular focus on ChatGPT, is crucial for understanding its impact, addressing challenges, fostering innovation, and ensuring the positive role of AI technology in society. Second, it seeks to address a research gap in public sentiment toward ChatGPT. Prior studies on ChatGPT have predominantly focused on its applications in education, academic research, tourism, and healthcare. Despite ChatGPT’s widespread impact on human society, there is a lack of research on the public’s perspectives on and sentiments about its influence. There are only a handful of exceptions, in which students were sampled to investigate attitudes toward the use of ChatGPT. However, comprehensive reviews that examine its implications for human society and public sentiment are largely absent. This research provides insights into the public’s expectations and concerns regarding the potential risks and benefits of ChatGPT. Third, compared to traditional sampling surveys with self-reported questionnaires, social media big data analytics is leading-edge and efficient. Traditional public attitude surveys typically rely on questionnaires, which are time-consuming, costly, and limited to specific populations such as students, consumers, or employees. In contrast, social media data is publicly available and can be collected globally, allowing researchers to gather extensive data more quickly [ 13 ]. It facilitates comprehension of diverse perspectives on ChatGPT among various social groups. In addition, traditional fixed-choice questionnaires often cover only a limited number of questions [ 14 ]. Social media data provides more in-depth information than structured questionnaires, as it offers insights into users’ interests, viewpoints, and behavioral intentions regarding ChatGPT. Social media data can be collected in real time and can quickly reveal trends and patterns in public opinion [ 15 ]. Social media big data analytics has been increasingly adopted in research on public sentiment towards AI products [ 16 ]. This study follows the current trend. Fourth, in practical terms, this research will provide valuable information for ChatGPT developers on enhancing the system to meet user needs better. By gathering users’ feedback on ChatGPT, the study will identify its strengths and weaknesses. This aids developers in improving ChatGPT’s functionality and performance, making it more user-friendly and ensuring an enhanced user experience.

2 Literature review

This section reviews the use of ChatGPT in human society for an initial assessment of the general public’s perception of ChatGPT. According to the bibliometric results, a total of 365 research papers (including articles and reviews) with "ChatGPT" in the title are found in the Web of Science database. Since GPT-3, the model family on which ChatGPT is based, was released in June 2020, the timeframe for the papers is set from June 1, 2020, to September 1, 2023. The existing research on the impact of ChatGPT on human society focuses on education, academic research, healthcare, tourism, and ethics. Education and healthcare are the fields with the highest number of published articles.

In education, research on the impact of ChatGPT covers medical education, nursing education, science education, language education, programming education, and other areas [ 17 ]. ChatGPT offers opportunities for students and educators, including personalized feedback, increased accessibility, interactive dialogues, lesson planning, assessment, and support for improving programming skills. When ChatGPT is used to teach different subjects, pedagogical outcomes vary [ 18 ]. Its use for teaching mathematics, sports science, and psychology has been unsatisfactory. Unexpectedly, ChatGPT has demonstrated reliability and utility even in the more rigorous field of medical education [ 19 ]. One of the most critical topics is the significant impact of ChatGPT on students’ programming learning [ 20 ]. A study of ChatGPT in computer programming learning demonstrates its effectiveness and usability in generating solution code, checking for bugs, debugging code, and handling programming assignments, exams, and homework [ 20 , 21 ].

When exploring the impact of ChatGPT on academic research, academics usually focus on two questions. First, should the use of ChatGPT in a written work be considered plagiarism? Second, can ChatGPT be considered a co-author? Views on these two issues are polarised. Some embrace the new technology and see ChatGPT as a viable collaborator. In January 2023, Nurse Education in Practice, a journal published by Elsevier, generated significant controversy by acknowledging ChatGPT as a co-author [ 22 ]. However, others argue that using ChatGPT constitutes academic cheating [ 23 ]. Many journals, including Science, have stated that AI tools such as ChatGPT are not eligible to be credited as authors [ 24 ]. Its competence as a writing collaborator is nonetheless widely acknowledged, because it can output coherent, fairly accurate, informative, and systematic text. At the same time, ChatGPT can support interdisciplinary research and provide research support [ 25 ]. The GPT model learns from a large amount of textual data from different domains during training, giving it knowledge and understanding across multiple subject areas.

In healthcare, research indicates that ChatGPT holds enormous potential in virtual consultations, improving public mental health and well-being [ 26 ]. Moreover, studies have confirmed the positive role of ChatGPT in clinical practice and patient education [ 27 ]. Scholars have proposed various applications of AI in mental health, such as assisting clinicians with time-consuming tasks like documenting and updating medical records, enhancing diagnostic accuracy and prognosis, fostering a better understanding of mental illness mechanisms, and refining treatment through biological feedback. Furthermore, ChatGPT has even outperformed humans in emotional awareness evaluations. It is expected to assist physicians in making decisions related to diagnosing, treating, and managing chronic obstructive pulmonary disease [ 28 ]. Analyses of social media data show that most healthcare researchers have expressed positive or balanced attitudes toward ChatGPT [ 26 ]. These research findings collectively demonstrate the positive impact of ChatGPT and AI technologies on healthcare standards and patient experiences within the medical and healthcare domains.

Scholars have also shown significant interest in the impact of ChatGPT on tourism [ 29 ]. It is expected to bring about significant changes in the tourism industry by enhancing decision-making support for managers in tourism companies and policy-makers in governing bodies. The use of ChatGPT in tourism decision-making differs greatly from traditional approaches, as it engages tourists in an interactive question-and-answer mode. It allows them to personalize travel plans and recommends suitable travel services, including hotels, restaurants, transportation, local attractions, and leisure activities [ 30 ].

Scholars have also conducted exploratory research on ChatGPT in other fields, including corporate governance [ 31 ], supply chains [ 32 ], finance [ 33 ], and intelligent vehicles [ 34 ]. These studies demonstrate the prospects for wide application of GPT technology and lay the foundation for future research.

However, there have been general complaints and ethical concerns regarding ChatGPT. First, criticism revolves around potential privacy leakage. ChatGPT processes input prompts that may contain personal information, raising privacy concerns. Although OpenAI promises not to collect personal information from users, there is still a risk of leakage during network transmission or through inadequate security measures within data storage systems. Second, the use of ChatGPT in academia raises concerns about academic integrity. If ChatGPT fails to cite reference sources appropriately, it may lead to plagiarism or deception in education and academic research [ 35 ]. ChatGPT could also be used in online exams, posing a significant threat to exam integrity [ 36 , 37 ]. To counter these problems, some anti-plagiarism techniques have been employed to detect AI-generated content [ 38 ]. To leverage ChatGPT’s advantages responsibly in education, scholars suggest that educators should focus on improving students’ creativity and critical thinking rather than just on skill acquisition. Meanwhile, AI-related tasks can engage students in solving real-world problems [ 39 ].

Third, the conversational capabilities of ChatGPT often draw criticism due to the limitations of its output. These limitations include inaccurate, fabricated, and biased information, along with a lack of in-depth understanding [ 39 ]. For example, ChatGPT does not have real-time information, and its training data comes from before September 2021, which could lead to biased responses. Similarly, using ChatGPT in medical research raises concerns about accuracy and reliability. The model has limitations in providing personalized advice and may sometimes generate inappropriate or outdated reference information. Due to a lack of human reasoning ability, ChatGPT may have difficulty generating responses to complex or abstract questions and understanding the context of text input [ 30 ]. In addition, ChatGPT struggles with identifying spelling errors and understanding colloquial and ambiguous language, and it lacks interactive experiences and human emotions. Consequently, ChatGPT’s current capacity enables it to substitute only partially for human decision-making [ 40 ].

Fourth, some studies focus on the political issues raised by ChatGPT. Although ChatGPT often claims to be apolitical, empirical evidence demonstrates that it exhibits certain political predispositions, notably favoring environmental protection and left-leaning liberal ideology [ 41 , 42 ]. This may be because ChatGPT is trained on large text corpora collected from the Internet. These corpora may be dominated by influential institutions in Western society, such as mainstream news media, prestigious universities, and social media platforms. Consequently, these collections of texts may appear to represent a majority view on certain topics. Furthermore, these algorithms may add to the accumulation of false, inaccurate, biased, or confrontational content on the web, exacerbating the vicious cycle of feeding misleading and polarising information into the political system.

Research on the significant impact of ChatGPT in various domains of human society initially reflects stakeholders’ attitudes toward ChatGPT. Given the complexity of ChatGPT’s impact on human society, public opinions and attitudes toward ChatGPT also vary greatly, showing varying degrees of preference or aversion. However, limited literature comprehensively explores the public’s attitudes toward ChatGPT, and this study seeks to fill the gap.

This study aims to investigate public discourse and sentiment on ChatGPT through topic modeling and sentiment analysis using natural language processing, based on data from Reddit users’ posts and comments. These data science methods provide an efficient way to classify latent topics and sentiments in public discourse. Firstly, word frequency reveals the public’s interests related to ChatGPT. Word cloud visualization intuitively presents these high-frequency terms, making critical information easily accessible [ 43 ]. Secondly, topic modeling facilitates the identification of latent topics within textual data, which makes the data easier to interpret [ 44 ]. Public discussions are often multifaceted, and different words and sentences may relate to the same or interconnected topics. Topic modeling helps cluster related content, leading to a better understanding of the associations among various subjects [ 45 ]. This is crucial for exploring the breadth and depth of discussions from diverse perspectives. Lastly, sentiment analysis enables the identification of emotional tendencies (positive, neutral, or negative) conveyed within the text [ 46 , 47 ]. For public discussions about ChatGPT on Reddit, sentiment analysis aids in gauging public sentiment, revealing positive attitudes, concerns, and potential issues or needs. Analyzing sentiment fluctuations over time tracks emotional shifts and helps to comprehend the public’s responses to specific events, such as ChatGPT product releases or promotions. Through this approach, emotional changes and the factors behind them can be uncovered.
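As a small illustration of the word-frequency step, the sketch below counts terms across already-cleaned documents with the Python standard library. The function name `top_terms` and the three toy documents are hypothetical and are not taken from the study's data or code.

```python
from collections import Counter


def top_terms(documents, k=3):
    """Return the k most frequent whitespace-separated terms."""
    counts = Counter()
    for doc in documents:
        counts.update(doc.split())
    return counts.most_common(k)


# Toy documents standing in for cleaned Reddit posts.
docs = [
    "chatgpt writes code",
    "chatgpt answers questions",
    "code review with chatgpt",
]
print(top_terms(docs, k=2))
```

The resulting (term, count) pairs are the kind of input a word cloud renderer would take.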

3.1 Data collection and cleaning

Social media platforms such as Twitter, Facebook, Instagram, and Reddit allow users to express their emotions, interests, hobbies, and opinions in real time within an online community. As of 2023, Reddit ( https://Reddit.com ) had approximately 57 million daily active users, making it one of the largest social media outlets in terms of users [ 11 ]. Reddit users can share text, links, images, or videos in various sub-communities, called subreddits, each dedicated to a specific topic [ 48 ]. Public subreddits are open to everyone, and users can comment and vote on posts and comments for free and anonymously. Reddit had more than 100,000 active subreddits as of 2023. A large number of existing studies use data from the platform to examine public perceptions, attitudes, and opinions. These studies cover a broad range of topics, including disasters [ 49 ], vaccines [ 50 ], advertisement [ 51 ], tobacco [ 52 ], vehicles [ 53 ], climate change [ 54 ], digital governance [ 55 ], political psychology [ 56 ] and collective identity [ 57 ]. An Application Programming Interface (API) is a set of tools that defines how a software application interacts with other components, services, or platforms [ 58 ]. APIs allow for communication and data exchange between different software systems, enabling them to connect and interoperate with each other.

In terms of ethical considerations in the present research, the official API of Reddit is freely and publicly available to third parties [ 12 , 59 ]. In accordance with Reddit’s privacy policy, developers are permitted to write programs or applications, such as Apify used in this study, that interact with the Reddit platform through specific requests and commands. These actions include retrieving specific information, posting content, or performing other tasks [ 60 ]. Reddit enables researchers to extract the subreddits, threads, comments, and associated metadata through various programming languages [ 61 ]. Therefore, data collection and analysis in this study comply with the terms and conditions of the data source. Following the principle of data minimization, it only collects data relevant to the purpose of the study. In addition, to adhere to the principles of privacy and untraceability, personal user information is anonymized because Reddit posts and comments are user-generated.

This study used Apify ( https://apify.com ) to collect Reddit posts and comments as the primary data source. Apify is a platform for web scraping, data extraction, and automation. It offers tools and services that assist developers in extracting data from web pages, executing automated tasks, and building web crawlers. Personal information has been anonymized. The data consists of posts and comments. The sample data is shown in Table 1.

https://doi.org/10.1371/journal.pone.0302502.t001

The search used GPT3, GPT3.5, and GPT4 as keywords. These keywords make it possible to examine whether people’s perceptions of ChatGPT have evolved, particularly since the ChatGPT update. Apify collected 11,730, 10,109, and 12,046 relevant entries (posts and comments) for the respective keywords. To keep the sample sizes uniform, this study performed random sampling, retaining 10,000 entries for each GPT version. The resulting dataset consisted of 30,000 samples from June 2020 to August 15, 2023.
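The per-keyword sampling step can be sketched as below. Only the collected counts and the target of 10,000 come from the text; the function name, the fixed seed, and the placeholder entry IDs are assumptions for illustration, and the sampling procedure actually used may differ.

```python
import random


def sample_per_keyword(entries_by_keyword, n, seed=0):
    """Draw n entries without replacement from each keyword's pool.

    A fixed seed (an assumption here) makes the draw repeatable.
    """
    rng = random.Random(seed)
    return {kw: rng.sample(entries, n) for kw, entries in entries_by_keyword.items()}


# Placeholder IDs standing in for the collected posts and comments.
collected = {
    "GPT3": [f"gpt3-{i}" for i in range(11730)],
    "GPT3.5": [f"gpt35-{i}" for i in range(10109)],
    "GPT4": [f"gpt4-{i}" for i in range(12046)],
}
sampled = sample_per_keyword(collected, n=10000)
print({kw: len(items) for kw, items in sampled.items()})
```

Sampling without replacement guarantees that no entry is drawn twice within a keyword, although the same post can still appear under two keywords, which is why duplicate filtering is still needed afterwards.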

In terms of data cleaning, emoticons, digits, punctuation, links, unnecessary words, non-ASCII characters, and stopwords were removed from the textual data, as suggested by Yadav et al. [ 61 , 62 ]. The English stopword list is provided by the NLTK library and consists of common English words that carry little semantic or informational value [ 63 , 64 ]. Stopwords are typically filtered out in natural language processing to enhance the efficiency and accuracy of text analysis. Additionally, all uppercase letters were converted to lowercase. Furthermore, considering a slight overlap in the data sources, this study also filtered out duplicate texts. In the end, 23,773 entries were retained, forming the database Processed_GPT_total.json, which is provided in the supporting information.
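
A cleaning pipeline of this kind might look like the sketch below. It is an illustration under stated assumptions, not the study's actual code: the small `STOPWORDS` set stands in for the NLTK English stopword list so the example has no external dependencies, and the order of the steps is one reasonable choice.

```python
import re
import string

# Tiny illustrative stopword set (assumption); the study uses NLTK's English list.
STOPWORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "it", "this"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(text: str) -> str:
    """Apply the cleaning steps listed above to a single post or comment."""
    text = URL_RE.sub(" ", text)                    # remove links first, while "://" is intact
    text = text.encode("ascii", "ignore").decode()  # remove non-ASCII characters, including emoji
    text = text.lower()                             # convert uppercase to lowercase
    text = re.sub(r"\d+", " ", text)                # remove digits
    text = text.translate(PUNCT_TABLE)              # remove punctuation (and ASCII emoticons)
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return " ".join(tokens)


def deduplicate(texts):
    """Keep the first occurrence of each non-empty cleaned text."""
    seen = set()
    unique = []
    for t in texts:
        if t and t not in seen:
            seen.add(t)
            unique.append(t)
    return unique
```

Links are stripped before punctuation so that URLs are still recognizable when the pattern runs.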

To improve the reproducibility of our research results, the data and research design have been stored on Protocols.io, which has assigned the protocol a DOI: dx.doi.org/10.17504/protocols.io.bp2l6xee1lqe/v1 . The protocol is freely available to anyone under an open-access license. Readers can find relevant details by visiting the following URL: https://www.protocols.io/private/899FA95EAEFD11EEAB870A58A9FEAC02 .

3.2 Topic modeling

All data analyses were processed using Python (version 3.10). For topic modeling, prior research proposes three techniques: Latent Dirichlet Allocation (LDA), which adeptly discerns latent topics through probabilistic approaches; Non-Negative Matrix Factorization (NMF), which underscores the intricate relationship between documents and their inherent topics; and Transformer-based models, renowned for their proficiency in grasping the intricate semantics embedded within textual data [ 65 ].

Perplexity is a common metric for evaluating topic models and is particularly useful for selecting the number of topics in an LDA model. It measures how well the model fits the documents in a given dataset, with a lower perplexity score indicating better model performance on that dataset.
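
For reference, the standard perplexity definition from the LDA literature can be written as below. This is the textbook form and not necessarily the exact formula or implementation used in this study; the symbols $D$, $M$, $\mathbf{w}_d$, and $N_d$ are notation introduced here for illustration.

```latex
\mathrm{perplexity}(D) = \exp\left( - \frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right)
```

Here $D$ is a collection of $M$ documents, $\mathbf{w}_d$ denotes the words of document $d$, and $N_d$ is the number of words in document $d$. Some toolkits report a per-word log-likelihood bound instead, which takes negative values, so scores on that scale should not be compared directly with this quantity.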

3.3 Sentiment analysis

Existing research typically uses three approaches for sentiment analysis: rule-based approaches, machine learning, and deep learning [ 71 ]. Rule-based methods utilize predefined sentiment lexicons and rules to evaluate sentiment [ 72 ]. Machine learning methods, on the other hand, learn sentiment classification models from data through supervised or unsupervised learning, offering good generalization capability [ 73 ]. Deep learning methods such as RNNs, CNNs, and Transformers automatically learn features from text and are adept at capturing long-term dependencies, albeit at the cost of substantial data and computational resources [ 74 ].

The present study employs a rule-based approach to classify sentiments in posts and comments. The sentiment categorization uses a weighted combination of VADER and TextBlob, two distinct sentiment analysis tools that employ different algorithms and semantic processing approaches. Because these tools perform differently in different contexts, weighting their results leverages their respective strengths and helps mitigate the individual biases of each tool. Furthermore, a single sentiment analysis method may be unstable when confronted with complex and diverse language expressions; integrating multiple methods enhances robustness, allowing the system to adapt to various types and styles of text. Such a combined approach is therefore expected to improve the accuracy and adaptability of sentiment analysis.
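
The weighted combination can be illustrated with the short sketch below. It assumes the two inputs are the VADER compound score and the TextBlob polarity score, both of which lie in [-1, 1], and it uses the weights of 0.6 and 0.4 reported in Section 4.4. The function names and the rule that a score of exactly zero counts as neutral are assumptions made for the example; the study does not state its neutral threshold.

```python
def weighted_sentiment(vader_compound: float, textblob_polarity: float,
                       w_vader: float = 0.6, w_textblob: float = 0.4) -> float:
    """Combine two polarity scores in [-1, 1] into one weighted score."""
    return w_vader * vader_compound + w_textblob * textblob_polarity


def sentiment_label(score: float, neutral_band: float = 0.0) -> str:
    """Map a weighted score to a label.

    neutral_band is a hypothetical parameter: scores within
    [-neutral_band, neutral_band] are treated as neutral.
    """
    if score > neutral_band:
        return "Positive"
    if score < -neutral_band:
        return "Negative"
    return "Neutral"
```

Because both inputs share the same range and the weights sum to 1, the combined score also stays within [-1, 1].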

Rule-based sentiment analysis in TextBlob relies on natural language processing techniques, utilizing predefined rules and syntactic structures to identify the emotional polarity within the text [ 77 ]. Initially, the text is decomposed into words and phrases, and part-of-speech tagging is conducted to comprehend the grammatical roles of each word in the sentence. Subsequently, based on a pre-defined sentiment lexicon, each word is assigned a sentiment polarity score, such as positive, negative, or neutral. Rules may also take into account relationships between words, where the presence of negation words, for instance, could alter the emotional polarity. By weighting or averaging sentiment scores for all words in the text, the overall emotional polarity of the text can be determined. The advantage of this method lies in its simplicity and ease of implementation, while its accuracy can be enhanced by continuously updating and expanding the sentiment lexicon [ 78 ].

3.4 Sentiment trend analysis

Sentiment trend analysis, a burgeoning field, has become vital for comprehending public perception of specific issues [ 80 ]. In this study, sentiment trend analysis integrates the strengths of the aforementioned two methods, utilizing the results with a weighted approach. Initially, a DataFrame (df) is created to organize the data, encompassing columns for date, weighted sentiment scores, and sentiment labels (’Positive’ or ’Negative’). The date column undergoes conversion to the datetime type for accurate time series analysis. Subsequently, the ’Sentiment’ column values are determined based on the weighted scores, with ’Positive’ assigned if the score exceeds 0, and ’Negative’ otherwise. Finally, the DataFrame is grouped by date and sentiment, daily sentiment counts are computed, and a line plot is generated using matplotlib to illustrate the daily counts of positive and negative sentiments over time.
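
The grouping step described above can be sketched as follows. The study describes a pandas DataFrame and a matplotlib line plot; this example mirrors only the date conversion, labeling, and daily counting with the standard library so that it runs without dependencies. The function name and the `YYYY-MM-DD` input format are assumptions.

```python
from collections import Counter
from datetime import date, datetime
from typing import Dict, Iterable, Tuple


def daily_sentiment_counts(rows: Iterable[Tuple[str, float]]) -> Dict[Tuple[date, str], int]:
    """Count 'Positive' and 'Negative' entries per day.

    rows: (date_string, weighted_score) pairs, with dates assumed to be 'YYYY-MM-DD'.
    """
    counts = Counter()
    for date_str, score in rows:
        day = datetime.strptime(date_str, "%Y-%m-%d").date()  # convert text to a date
        sentiment = "Positive" if score > 0 else "Negative"   # rule stated in the text
        counts[(day, sentiment)] += 1
    # Sort by day, then by label, so the result is ready for plotting.
    return dict(sorted(counts.items()))
```

Note that under this binary rule a score of exactly zero is counted as negative, which follows the description in the text.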

A lucid visualization is paramount to fully capturing the chronology of sentiment dynamics [ 81 ]. As such, daily sentiment metrics are illustrated, clearly depicting the populace’s emotional ebbs and flows. This graphical elucidation not only bestows a daily sentiment snapshot but also illuminates prevailing trends, proving indispensable for decision-makers, ranging from corporate strategists to policymakers, who anchor their choices on the pulse of public sentiment.

4.1 Word frequency

In this part, word clouds and frequency graphs provide initial insights into the diverse perspectives of the general public on the topics (Question 1) and attitudes (Question 2) toward GPT. A total of 23,773 entries were analyzed. Fig 1 displays the top 20 most common words from the entries. As Fig 1 shows, the public’s positive attitude toward ChatGPT is evident through the words "like" and "good", indicating their appreciation and approval of the model. However, the discussions also unveil contemplation about the practical applications of ChatGPT, encompassing terms such as "use," "using," "way," "make," and "need," highlighting the discourse on how to harness ChatGPT’s capabilities fully. In addition, words like "would," "think," "know," "could," and "even" express doubts and uncertainties, reflecting concerns about its potential limitations and abilities. Technical aspects of the discourse include terms like "model," "bot," "prompt," "data," "code," and "models," revealing the audience’s attention to ChatGPT’s internal working model, data processing, and technological implementation.
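
A top-20 word count of this kind can be computed as in the sketch below, assuming the entries have already been cleaned and are whitespace-tokenizable. The function name `top_words` is illustrative and not taken from the study.

```python
from collections import Counter


def top_words(cleaned_texts, n=20):
    """Return the n most frequent words across all cleaned entries."""
    counter = Counter()
    for text in cleaned_texts:
        counter.update(text.split())  # whitespace tokenization of cleaned text
    return counter.most_common(n)
```

For example, `top_words(["like good model", "like model", "like"], n=2)` returns `[("like", 3), ("model", 2)]`.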

https://doi.org/10.1371/journal.pone.0302502.g001

4.2 Topic modeling

This part presents the emergent topics and themes identified through topic modeling. It aims to address research question 1: What are the emerging topics related to ChatGPT? This study combines qualitative and quantitative content analysis to uncover and discover latent topics and themes of the public’s discussion, which are believed to hold significant potential for research in the field of social media [ 82 ]. As a commonly used quantitative method for topic classification, the LDA model aids in determining the most optimal number of topics for classification. In Fig 2 , the perplexity-topic number curve is plotted.

https://doi.org/10.1371/journal.pone.0302502.g002

Typically, the optimal number of topics is determined based on lower perplexity levels [ 83 ]. When the number of topics is set at 8, the perplexity is at its lowest. However, the lowest perplexity may not always signify the best model performance. With a high number of topics, models often overfit, resulting in excessive and non-convergent topic counts. An excessive number of topics may lead to high redundancy, resulting in low distinctiveness and uniqueness between topics [ 71 , 83 ]. Hence, many studies rely on human judges to determine the optimal number of topics. This method also adheres to certain principles: (1) high coherence between words and topics; (2) high topic quality, ensuring non-repetition, non-conflict, and coverage of primary content [ 84 ]. This study tested the topic categorization and high-frequency words of each topic with the number of topics set at 8 ( Table 2 ). However, the result demonstrates poor coherence and topic quality. Across the topics, there is a lack of coherent themes, with words appearing disjointed and unrelated within each category. The representative words fail to form distinct and meaningful topics, undermining the effectiveness of the model in capturing the underlying structure of the data.

https://doi.org/10.1371/journal.pone.0302502.t002

Therefore, this study also tested the number of topics corresponding to the point of significant decrease in perplexity, i.e., the number of topics (7) near the inflection point of the curve. When the number of topics is set at 7, the distribution of word frequencies in relevant topics is shown in Table 3 . The top words in each topic exhibit good coherence and topic quality. Table 3 shows that the top 10 words in each topic are categorized into seven topics, which are then assigned to three themes. The analysis of seven topics demonstrates the wide range of discussions regarding ChatGPT on the Reddit community. These discussions cover technical inquiries, philosophical pondering, impacts on society, creative applications, and entertainment. The topics reflect the multifaceted nature of ChatGPT and highlight the diverse perspectives and interests of the public when using it.

https://doi.org/10.1371/journal.pone.0302502.t003

The first topic concerns people’s general impressions of ChatGPT. Keywords such as "like," "think," and "good" indicate that individuals are generally favorable towards ChatGPT. This topic focuses on how people perceive ChatGPT’s potential benefits, usability, functionality, and positive impact on them. The second topic appears to focus on technical inquiries for assistance. Terms such as "bot," "prompt," and "link" indicate that users are seeking information on how to use ChatGPT for various tasks. The terms "questions," "message," and "action" suggest a desire to optimize ChatGPT’s functionality for specific purposes. The third topic delves into philosophical discussions, examining consciousness, AGI (Artificial General Intelligence), and human reasoning of ChatGPT. Keywords like "consciousness," "agi," "humans," and "belief" imply that users are exploring ChatGPT’s human consciousness, intelligence, and spiritual characteristics. The fourth topic explores the technical details of ChatGPT, focusing on coding and textual manipulation. Keywords such as "code," "prompt," "text," and "language" suggest discussions on how to utilize ChatGPT efficiently for code generation or text creation. This topic covers its capabilities in software development, content creation, and language-oriented tasks. The fifth topic explores ChatGPT’s impact on diverse fields of life, including the arts, healthcare, and quantum phenomena. The words "music," "art," "health," and "quantum" indicate discussions on how ChatGPT brings revolution and advances to the artistic and scientific domains.

The sixth topic focuses on the broad social and economic influences of ChatGPT. Keywords like "market," "jobs," and "impact" suggest discussions on ChatGPT’s impact on the job market and the global economy. The debates could probably revolve around potential job loss resulting from ChatGPT and the ethical concerns about AI. The seventh topic concerns the correlation between ChatGPT and politics and entertainment. The keywords such as "trump" and "president" relate to ChatGPT’s function in political discussions. The terms "spider," "gif," and "gypsy" indicate the potential utilization of ChatGPT within cultural and entertainment contexts.

Through the qualitative content analysis, seven topics were systematically coded and categorized into three themes. Theme 1 covers Topics 1 and 3, which focus on users’ positive views of ChatGPT and its potential advantages and positive influence on various aspects of life. Theme 2 encompasses Topics 2 and 4, focusing on the technical methods of ChatGPT, including queries, assistance, coding, and practical applications. Discussions cover topics such as the application of ChatGPT for specific tasks, seeking guidance, and sharing experiences regarding coding and language generation. Theme 3 comprises Topics 5, 6, and 7, focusing on the broader social impact of ChatGPT on art, music, health, politics, market, employment prospects, scientific progress, and the entertainment industry.

4.3 Robustness of topic modeling

To verify the robustness of the model, this study randomly selects 10,000 samples as a subset from the original dataset for testing [ 85 ]. To examine the robustness, the same parameters are used for the two models [ 86 ].

Fig 3 compares the perplexity scores of the two models when the number of topics ranges from 1 to 20 and shows that the results of the two runs are highly similar. The solid line with circles represents the perplexity changes of the original dataset when LDA topic modeling is performed, while the dotted line with triangles represents the perplexity changes of a subset of the original dataset under the same modeling process. It can be seen that the perplexity scores of the two curves range between -8.5 and -11.1, and the overall trend shows a slow increase at first and then a decrease. Within the topics range of 1 to 8, the difference between the two curves is minimal, showing a high degree of similarity. The largest gap occurs when the number of topics is 20 and is only 0.23, indicating that the model is robust.
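
The comparison of the two curves amounts to taking the absolute gap between perplexity scores at each topic number and locating the largest one. A minimal sketch, with hypothetical function names and with the scores assumed to be listed in order of increasing topic number:

```python
def perplexity_gaps(full_scores, subset_scores):
    """Absolute gap between two perplexity curves at each topic number."""
    if len(full_scores) != len(subset_scores):
        raise ValueError("both curves must cover the same topic numbers")
    return [abs(a - b) for a, b in zip(full_scores, subset_scores)]


def largest_gap(full_scores, subset_scores, first_k=1):
    """Return (topic_number, gap) where the two curves differ most.

    first_k is the topic number of the first score in each list (assumed 1 here).
    """
    gaps = perplexity_gaps(full_scores, subset_scores)
    idx = max(range(len(gaps)), key=gaps.__getitem__)
    return first_k + idx, gaps[idx]
```

Applied to the two curves in Fig 3, a function like this would be expected to return topic number 20 with a gap of about 0.23, matching the values reported in the text.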

https://doi.org/10.1371/journal.pone.0302502.g003

When the topic modeling of the subset is performed and the number of topics is 7, the high-frequency words corresponding to each topic are shown in Table 4 . Three themes were summarized from the 7 topics generated by LDA topic modeling based on a subset of the original dataset, which is consistent with our previous LDA modeling results using the original dataset. This also confirms that the posts and comments about GPT center on three widely discussed themes: User Perception, Technical Methods, and Impacts on Society.

https://doi.org/10.1371/journal.pone.0302502.t004

Fig 4 further illustrates the robustness: the themes generated by the two models share highly similar top words. Regardless of whether the model is applied to the original dataset or its subset, it produces highly similar theme results. Fig 4 summarizes the seven topics derived from the original dataset into three themes and depicts the high-frequency words associated with each theme in the two topic modeling runs. The intersection of each pair of themes represents the high-frequency words that appear in the results of both runs.
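
The overlap shown in Fig 4 corresponds to a set intersection of the top words per theme from the two runs. A small sketch, in which the function name and the dictionary layout (theme name mapped to a list of top words) are assumptions for illustration:

```python
def shared_top_words(themes_full, themes_subset):
    """For each theme, list the top words that appear in both modeling runs.

    themes_full / themes_subset: dicts mapping a theme name to its list of top words.
    Themes missing from the subset run yield an empty list.
    """
    return {
        theme: sorted(set(words) & set(themes_subset.get(theme, [])))
        for theme, words in themes_full.items()
    }
```

The larger the shared list is relative to each run's top-word list, the more similar the two runs are for that theme.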

https://doi.org/10.1371/journal.pone.0302502.g004

To further corroborate the reliability of this result, this study repeated the experiment, randomly sampling the original dataset five times [ 87 ]. Since comparing the results when the number of topics is 1 is not meaningful, the perplexity scores of the original dataset for topic numbers 2 to 8 are used as a baseline. These baseline scores are then compared against the perplexity scores generated by the five sampled subsets. A similarity test is subsequently conducted, with the results shown in Fig 5 . The horizontal axis of the heat map represents the number of topics used in LDA topic modeling, ranging from 2 to 8, and the vertical axis represents the subsets randomly extracted from the original dataset [ 88 ]. Each square represents the difference in perplexity between the topic modeling result of one of the 5 randomly selected subsets and that of the original dataset.

https://doi.org/10.1371/journal.pone.0302502.g005

In Fig 5 , the difference between the five tests and the baseline (perplexity of the original dataset) is small. When the number of topics is less than or equal to 6, the difference in perplexity does not exceed 0.10. The largest difference occurs in the second extracted subset when the number of topics is 8, where the difference in perplexity is only about 0.14. Across the five repeated experiments, the perplexity difference remains very small for topic numbers up to 8, which further confirms the high robustness of our modeling. Topic modeling re-performed on randomly selected subsets and topic modeling performed on the original dataset show extremely high similarity in results. Therefore, based on the comparatively low perplexity differences between the subsets and the original data, the robustness of the model can be supported.

4.4 Sentiment analysis

In this part, two sentiment analysis models, VADER and TextBlob, are assigned weights of 0.6 and 0.4, respectively, for sentiment classification to explore research question 2. The sentiment analysis categorizes the emotional tone of the entries into three distinct categories: positive, negative, and neutral. The proportions of positive, neutral, and negative entries are shown in Fig 6 (N = 23,773). The analysis reveals a predominantly positive sentiment among Reddit users, with approximately 61.6% of entries conveying affirmative emotional nuances. In contrast, about 20.8% of entries express negative sentiments, while neutral ones account for the smallest segment at 17.6%.

https://doi.org/10.1371/journal.pone.0302502.g006

Fig 7 demonstrates the comprehensive sentiment distribution (N = 23,773). There is a noticeable concentration of entries between 0 and 0.6, indicating the prevailing positive emotions. Moreover, it is pertinent to mention that the sentiment analysis identified a significant count exceeding 5100 entries conveying a neutral sentiment. Furthermore, most entries fall within the range of -0.25 to 0.60, suggesting a moderately nuanced sentiment orientation and a notable absence of distinct polarization in the overall sentiment attitudes towards ChatGPT.

https://doi.org/10.1371/journal.pone.0302502.g007

Table 5 selectively displays the high-frequency words in posts and comments expressing different sentiments.

https://doi.org/10.1371/journal.pone.0302502.t005

In neutral discussions, individuals mention the usage of ChatGPT, such as "api", and "to access it you have to use the API." The API is an interface that facilitates communication between distinct software systems or services, enabling programs or applications to access the functionalities or data of other systems. Users might engage in conversations regarding the functionalities and limitations of the API, deliberating on the prospect of integrating ChatGPT’s language generation capabilities into their applications or systems using the API [ 89 ].

The negative comments, with words such as "wrong," "bad," and "problem," reflect users’ perception of errors, issues, or flaws of ChatGPT. Examples include "It is just a dumb stunt for a dumb application," "The model has been quantized badly," and "I have doubts about their security claims." These comments express suspicion about certain aspects of ChatGPT, which at times generates problematic content or offers unsatisfactory functionality. They suggest skepticism about the accuracy and quality of the content generated by ChatGPT.

It is worth noting that the term "model" appears in positive, neutral, and negative posts and comments, showcasing varying perspectives on ChatGPT based on its performance, applications, and potential risks. Positive comments emphasize its impressive capabilities, including generating high-quality text, conducting fluent conversations, and efficiently retrieving information. This technology is acknowledged for its significant advances in natural language processing, benefiting various fields such as intelligent assistants and text creation. On the other hand, negative feedback may suggest that ChatGPT produces incorrect outputs, raises ethical concerns, and has the potential to spread misinformation. These diverse viewpoints reflect the complexity of ChatGPT and its social implications.

4.5 Sentiment trend analysis

This part examines daily sentiment trends by comparing the quantity of positive and negative sentiment posts from January to August 2023 (N = 23,773) to explore research question 3. Based on the GPT-3.5 model, ChatGPT was launched by OpenAI on November 30, 2022, gaining a growing user base. On March 14, 2023, OpenAI unveiled the new multimodal model, GPT-4, available for purchase [ 2 ]. This part aims to ascertain whether version updates have influenced sentiment towards ChatGPT. Fig 8 demonstrates how sentiments changed over time. The graph displays two sentiment classes, denoted by green and red, representing positive and negative sentiments, respectively. The sentiments fluctuate over time during the ChatGPT update.

https://doi.org/10.1371/journal.pone.0302502.g008

Fig 8 shows that fluctuations in the 23,773 entries occur during mid-February, mid-to-late March, May, and mid-July. In most instances, the number of positive entries surpasses that of negative ones. The graph reveals a moderately increasing trend in mid-February, where the daily entry count exceeded 350. This surge can be attributed to the launch of the Plus plan on February 9, offering Plus users the option to select from various versions of ChatGPT. Moreover, the worldwide release of ChatGPT Plus for purchase was announced on February 13th. In mid-March, there was a small peak in the count of entries, approaching 400 daily, perhaps attributable to the announcement of GPT-4 on March 14, 2023. Additionally, there were 100 more positive entries than negative ones, which may reflect the advanced reasoning, handling of complex instructions, and enhanced creativity offered by GPT-4. In May, discussions regarding ChatGPT reached a peak. This surge in activity can be attributed to several factors. Primarily, it could be due to OpenAI’s implementation of new privacy features on May 3rd, which introduced the option to "Turn off chat history and decline to use for model training." This feature addresses some privacy concerns and encourages user engagement. Furthermore, on May 12th, OpenAI released an update allowing ChatGPT Plus members to incorporate the Bing search engine for browsing web content. This update transcended the previous limitations of ChatGPT’s database, which had been confined to information available only until 2021. Lastly, the surge could also be largely attributed to the momentous launch of the ChatGPT iOS app on May 18th in selected countries, including the United States, the United Kingdom, France, and others. The app enriches users’ mobile experiences. However, because these successive updates were restricted to certain countries and mobile operating systems, most users have been unable to benefit from these conveniences. This has potentially led to public dissatisfaction, resulting in a surge of more than 300 negative comments and posts in a single day.

In mid-July, there was a slight increase in ChatGPT discussions, with more than 300 entries reflecting positivity. The surge of positive entries can be attributed to the widespread introduction of the Code Interpreter feature to all ChatGPT Plus users. This innovation allowed non-programmers to express intentions in everyday language, translating into executable Python code solutions, enabling the accomplishment of intricate tasks within a real-time working environment. The innovation not only streamlined the processes of code composition and data manipulation but also expedited the application of artificial intelligence across diverse domains. While the version updates of ChatGPT may spark heated discussions among Reddit users, users generally hold positive sentiments towards ChatGPT and there was no significant shift from positive to negative attitudes during the period between January 2023 and August 2023.

5 Discussion

ChatGPT is one of the most fascinating frontier AI technologies, revolutionizing the approach to human-machine interaction and gaining worldwide attention for providing detailed answers in various areas of human society. However, there is an absence of studies evaluating its significant social influence. This study investigates the public’s viewpoints regarding the usage and impact of ChatGPT through topic modeling and sentiment analysis. Differing from sampling survey methods, this study follows the emerging trend of big data mining and gathers data on posts and comments from the social media platform Reddit. It employs the LDA unsupervised learning model to generate seven topics. The study uses a weighted approach that combines VADER with TextBlob to categorize sentiment and analyze sentiment trends in posts and comments.

The result reveals seven topics of public discourse concerning ChatGPT, which can be classified into three themes: user perception, technical methods, and impacts on society. It suggests a comprehensive exploration by users into its potential ramifications, with opportunities for advancement across various facets of human society, such as markets, capital, employment, education, research, healthcare, art, entertainment, politics, gender, and ethical considerations. Meanwhile, the extensive discourse on its technical methods indicates that ChatGPT does not replace human intelligence or hinder creative expression. On the contrary, it provides a reservoir of diverse perspectives, facilitating unconventional thinking, and fostering an environment conducive to the expansion of human creative capacities [ 90 , 91 ].

In addition, sentiment analysis shows that people generally have a positive attitude towards ChatGPT. They believe that ChatGPT can engage in natural and easy conversations with users without requiring an in-depth understanding of complex natural language processing techniques. It is considered a symbol of huge technological progress. However, posts and comments still express concern and criticism about potential risks with ChatGPT. While there are acknowledged limitations within ChatGPT, this study does not explicitly pinpoint the specific areas where these problems exist. Finally, the sentiment analysis reveals that throughout the majority of the periods investigated in our study, most users express a positive attitude towards ChatGPT. Changes in sentiment tend to vary over time and may be affected by updates introduced to ChatGPT. These updates are often associated with a high level of user satisfaction on Reddit.

In terms of practical implications, this study offers valuable insights into potential enhancements and optimal utilization strategies for developers and users of ChatGPT. GPT-related companies and developers should prioritize the user experience. While the public’s attitude towards it is relatively positive due to its naturalistic interactive capabilities, a substantial portion of public discourse (as one of the themes) concentrates on the technical methods of using ChatGPT and its prompts. Therefore, it is recommended that ChatGPT developers enhance the user-friendliness of bot features and prompts in product design. Additionally, GPT-related companies and research institutions could consider promoting in-depth discussions on technological applications and impacts on society to attract more users. The application of ChatGPT in various fields, such as healthcare, art, and science, can encourage users to unlock the potential of ChatGPT. It promotes cross-domain integration and fosters innovation, even for those with limited knowledge of artificial intelligence techniques or programming [ 92 ]. Furthermore, actively seeking dialogue with diverse stakeholders is an inclusive approach that facilitates the ethical development and deployment of ChatGPT.

Users should understand the impact of ChatGPT on their own lives and learn how to use it effectively. The general public needs to learn how to use suitable prompts for accurate text generation and dialogue. Users should also consider the advantages and disadvantages of ChatGPT. Similar to the findings revealed by previous research [ 93 ], the public also expresses concerns about the ethical risks associated with ChatGPT, such as the potential for generating fabricated misinformation, violating copyrights, and promoting plagiarism. Therefore, all stakeholders are expected to cultivate social awareness and engage in public discourse regarding the ethical use and standards of technology. It is crucial to enhance the transparency, accountability, and fairness of ChatGPT [ 94 ].

Despite its contributions, this study has several limitations. First, it relies on data from a single social media platform, Reddit, where the users’ demographic skews towards being male, young, white, and highly educated (63% of Reddit users have a Bachelor’s degree or higher) [ 12 , 57 ]. Previous research indicates that individuals with higher educational attainment and younger age groups exhibit a greater understanding of ChatGPT. This may raise concerns about the generalizability of the findings to users of other social media platforms and the public [ 95 ]. Future research should examine the public’s attitude towards ChatGPT on various social media platforms to address the limitation. Comparative analyses across different platforms such as Twitter, Facebook, and online forums would provide a more comprehensive view of public perceptions of ChatGPT. Second, the study is descriptive, and future research should consider causal studies. The study shows a wide range of impacts of ChatGPT on different domains of human society (e.g., market, capital, employment, health, arts, entertainment, politics, and gender). However, it is uncertain whether users’ occupations and identities affect their attitudes toward ChatGPT; quantitative methods such as regression analysis could be used to examine this. In addition, a longitudinal research design could explore how ChatGPT affects different domains over time. Third, this study does not identify the specific areas in which people expressed negative perceptions. A more detailed qualitative content analysis could examine negative posts and comments to identify specific themes and underlying concerns. This can lead to a better understanding of the limitations of the technology and directions for improvement.

Supporting information

S1 File. Details on data collection and analysis.

https://doi.org/10.1371/journal.pone.0302502.s001

  • 6. Metz C, Collins K. All the ways GPT-4 is impressive but still flawed. The New York Times. 2023 Mar 15 [cited 2023 December 2]. Available from: https://www.nytimes.com/2023/03/14/technology/openai-new-gpt4.html
  • 11. Brigham K. Reddit throughout the years: Its rise to prominence, recent revolts and IPO plans. CNBC. 2023. [cited 2023 December 2]. Available from: https://www.cnbc.com/2023/07/30/reddits-rise-to-prominence-recent-revolts-and-future-prospects.html
  • 47. Rajput NK, Grover BA, Rathi VK. Word frequency and sentiment analysis of Twitter messages during the coronavirus pandemic. arXiv:2004.03925 [Preprint]. 2020 [cited 2024 January 2]. Available from: https://doi.org/10.48550/arXiv.2004.03925
  • 70. Anandarajan M, Hill C, Nolan T. Practical Text Analytics. Springer; 2018.

Research Article

Deciphering Lending Behaviors in Peer-to-Peer Platforms: An Integrated Analysis of Emotion, Topic Modeling, and User-defined Occupational Data

  • @INPROCEEDINGS{10.4108/eai.15-12-2023.2345291, author={Yunxuan Zhang}, title={Deciphering Lending Behaviors in Peer-to-Peer Platforms: An Integrated Analysis of Emotion, Topic Modeling, and User-defined Occupational Data}, proceedings={Proceedings of the 3rd International Conference on Public Management and Big Data Analysis, PMBDA 2023, December 15--17, 2023, Nanjing, China}, publisher={EAI}, proceedings_a={PMBDA}, year={2024}, month={5}, keywords={peer-to-peer (p2p) lending self-disclosure sentiment analysis}, doi={10.4108/eai.15-12-2023.2345291} }
  • Yunxuan Zhang Year: 2024 Deciphering Lending Behaviors in Peer-to-Peer Platforms: An Integrated Analysis of Emotion, Topic Modeling, and User-defined Occupational Data PMBDA EAI DOI: 10.4108/eai.15-12-2023.2345291
  • 1: The University of British Columbia

This research investigates the influence of borrower-generated content on Peer-to-Peer (P2P) lending outcomes, specifically focusing on the sentiment and thematic content of loan descriptions. The central hypothesis posits that these elements of self-disclosure profoundly influence lending determinations and patterns. Utilizing advanced machine learning techniques, variables such as loan descriptions, occupations, loan amounts, interest rates, and loan statuses are examined. Findings highlight strong correlations between sentiment, thematic quality of loan descriptions, and funding success, accentuating the pivotal role of self-reported occupation. These insights illuminate previously uncharted currents within P2P lending dynamics, emphasizing the importance of sentiment analysis, theme selection, and occupational transparency. The research constitutes a sturdy foundation for prospective investigations in this rapidly evolving field.

4 Attractive Monthly Dividend ETFs For May 2024


Monthly income is not just for retired folks anymore. That’s apparently the case, given how much is being written and discussed these days about receiving dividend income from one’s investments. A survey released around this time last year found that 77% of retired Americans believe they have enough income to “live comfortably.” As someone who has seen the ebbs and flows of markets and investor sentiment over my 38-year career, I can’t help but wonder if that is a bit of the “recency effect” at work.

Investors have benefitted from strong appreciation in stock portfolios more often than not over the past several years. The annualized 15-year return of the S&P 500 Index is more than 14%, so the bad periods for stocks have been overwhelmed by the good times. But that is a double-edged sword, since past performance often not only fails to repeat itself, but tends to reverse. If stock returns from price gains were to be lower or even negative over the next 5 to 10 years, investors may need to figure out an alternate route.

This article discusses one such route, exchange-traded funds that pay a dividend to shareholders out of their value each month. For decades, quarterly was a typical dividend payout period. But the ETF industry has responded to the aging Baby Boomer generation’s need for predictable, regular investment income by offering an increased number of funds that pay income monthly. And, while investors can still lose money on the price return component of any mutual fund, stock or bond, the expanded lineup of monthly payers simply opens up more opportunities to choose from.

And, since even younger investors have taken to the idea of supplementing their working income by generating investment income, this is a timely topic for a broader audience than ever. So, as an ETF connoisseur for decades, I present four of the many funds I have scouted over time. They represent different investment styles, and all pay monthly dividends.


Why Invest In Monthly Dividend ETFs

We tend to budget for our living expenses monthly, and get paid weekly, twice a month or perhaps monthly in our jobs. So quarterly income may be a bit out of sync for some investors. While getting paid 12 times a year means that those payments will be smaller than if they were received four times a year (via quarterly-paying ETFs), the higher frequency is more in line with our generally shorter attention spans, not to mention the budgeting aspects.
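To make that trade-off concrete, the sketch below splits the same annual dividend into monthly and quarterly payments. The $10,000 position and the 5% yield are illustrative assumptions, not figures for any fund discussed here, and real payouts usually vary from one period to the next rather than arriving in equal amounts.

```python
def payment_per_period(position_value, annual_yield, payments_per_year):
    """Split one year's dividend income evenly across the payment dates."""
    annual_income = position_value * annual_yield
    return annual_income / payments_per_year


# Hypothetical inputs: a $10,000 position yielding 5% a year.
monthly = payment_per_period(10_000, 0.05, 12)
quarterly = payment_per_period(10_000, 0.05, 4)

print(f"monthly payment:   ${monthly:.2f}")
print(f"quarterly payment: ${quarterly:.2f}")
```

Under these assumptions the yearly total is identical either way; only the size and timing of each payment change.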

And, since bonds tend to pay income only every six months if owned individually, the monthly payment method is quite appealing to investors used to having to mix a set of bonds so that they could “ladder” the payments to occur more frequently. The other key advantage of monthly pay dividend ETFs is that for funds whose payout rates fluctuate, getting paid more often allows the shareholder to receive those more current rates each time, and on a timely basis.

How These Top Dividend ETFs Were Chosen

There are hundreds of monthly pay ETFs to choose from. I selected four that I have personal experience with, though none of this should be considered advice or a recommendation. My familiarity with these funds drove the selection, along with my opinion that most ETFs targeting long-term bonds or so-called “credit” bonds (high yield, corporates, preferred stock, convertible bonds), whose credit quality falls below that of U.S. Treasuries, carry systemic risks that could threaten their payouts and value in the foreseeable future. No one can ever predict a “credit market event,” but given the high debt burdens of many companies, I decided to stick to stocks and Treasury securities this time out.

These ETFs all yield at least 4%, have been in existence for at least three years, and have at least $100 million in assets under management as of this writing.
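The three thresholds above amount to a simple screen. The sketch below shows one way such a filter could be written; the `Fund` class, the `passes_screen` function, and the four candidate funds are hypothetical and made up for illustration, so none of the tickers or numbers describe a real ETF.

```python
from dataclasses import dataclass


@dataclass
class Fund:
    ticker: str
    dividend_yield: float         # as a fraction, e.g. 0.042 for 4.2%
    years_since_inception: float
    assets_usd: float             # assets under management, in dollars


def passes_screen(fund, min_yield=0.04, min_years=3, min_assets=100_000_000):
    """Apply the three thresholds stated in the text."""
    return (
        fund.dividend_yield >= min_yield
        and fund.years_since_inception >= min_years
        and fund.assets_usd >= min_assets
    )


# Hypothetical candidates, invented for this example.
candidates = [
    Fund("AAA", 0.052, 7, 450_000_000),
    Fund("BBB", 0.031, 12, 2_000_000_000),  # yield below 4%
    Fund("CCC", 0.068, 2, 300_000_000),     # less than three years old
    Fund("DDD", 0.045, 5, 60_000_000),      # under $100 million in assets
]

selected = [fund.ticker for fund in candidates if passes_screen(fund)]
print(selected)
```

A screen like this only narrows the field. The final choice in this article also rested on the author's familiarity with the funds and his view of credit risk.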

4 Best Monthly Dividend ETFs for May 2024

Source: YCharts

1. Invesco S&P 500 High Dividend Low Volatility ETF (SPHD)

ETF Overview

  • Years since inception: 11
  • Dividend Yield: 4.2%
  • Last Ex-Dividend Date: April 22, 2024
  • 1-Year Total Returns: 14.6%
  • Net Expense Ratio: 0.30%

Why SPHD Is A Top Choice

SPHD starts with the S&P 500 index, and through a series of filters and fundamental tests, whittles that index down by 90%, to about 50 stock holdings. Those are the ones judged to be the least volatile high-yielding component stocks within the S&P 500. The combination of low volatility and high yield tends to crowd out technology and communications stocks, which combined represent 11% of SPHD versus 40% for the S&P 500. SPHD’s dividend yield is more than three times that of the S&P 500, and the nature of its stock screening process allows it to make extensive changes to the portfolio at scheduled intervals. SPHD’s historical turnover is 67%, implying that about ⅔ of those 50 stocks are replaced in a typical 12-month period. Current top holdings include Altria Group (MO), Kinder Morgan and AT&T.

2. Global X Nasdaq 100 Covered Call & Growth ETF (QYLG)

  • Years since inception: 3
  • Dividend Yield: 5.6%
  • 1-Year Total Returns: 25.4%
  • Net Expense Ratio: 0.35%

Why QYLG Is A Top Choice

While SPHD is a pure stock portfolio inside an ETF wrapper, QYLG owns the full Nasdaq 100 but uses half of that portfolio as a base to write (sell) covered call options. That brings in yield in the form of option premium, so QYLG is essentially two portfolios in one. Half the assets seek to replicate the performance of the Nasdaq 100 index, and the other half has very little price upside potential but capitalizes on the volatility of the Nasdaq 100 to accumulate income. That takes a near-zero yielding corner of the stock market (the Nasdaq 100) and produces a high yield while retaining some upside potential. Top holdings include Microsoft (MSFT), Apple (AAPL) and Nvidia.
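The "two portfolios in one" idea can be sketched with a one-period return model. This is a simplified illustration, not the fund's actual methodology: it assumes the calls are held to expiry, ignores fees, taxes and stock dividends, and uses a hypothetical 2% option premium with at-the-money strikes (a cap of zero on the covered half).

```python
def half_covered_return(index_return, premium, cap=0.0, covered_fraction=0.5):
    """One-period return of a portfolio that writes calls on part of its holdings.

    index_return: return of the underlying index over the period
    premium:      option premium collected, as a fraction of the covered assets
    cap:          upside the covered part keeps before the call caps it
                  (0.0 models an at-the-money strike)
    """
    uncovered = index_return                    # full upside and downside
    covered = min(index_return, cap) + premium  # capped upside, plus income
    return (1 - covered_fraction) * uncovered + covered_fraction * covered


# Hypothetical scenarios: the index rises 6%, is flat, or falls 6%.
for r in (0.06, 0.0, -0.06):
    print(f"index {r:+.1%} -> portfolio {half_covered_return(r, premium=0.02):+.1%}")
```

In this model the portfolio gives up part of a rally, earns the premium in a flat market, and has its losses cushioned only by the premium in a decline.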


3. FT Cboe Vest S&P 500 Dividend Aristocrats Target Income ETF (KNG)

  • Years since inception: 6
  • Dividend Yield: 7.7%
  • Last Ex-Dividend Date: April 23, 2024
  • 1-Year Total Returns: 10.2%
  • Net Expense Ratio: 0.75%

Why KNG Is A Top Choice

KNG, like QYLG, employs covered call option writing. However, its base is not the Nasdaq but an index of stocks known as “dividend aristocrats,” those which have raised their dividends for decades. That tends to indicate fundamental strength, and also provides some dividend yield that is then supplemented by writing call options on the portfolio’s assets. Top holdings include C.H. Robinson Worldwide, Albemarle and Amcor.

4. iShares Treasury Floating Rate Bond ETF (TFLO)

  • Years since inception: 10
  • Dividend Yield: 5.3%
  • Last Ex-Dividend Date: May 1, 2024
  • 1-Year Total Returns: 5.5%
  • Net Expense Ratio: 0.15%

Why TFLO Is A Top Choice

TFLO owns U.S. Treasury securities maturing in one to three years, but which have a unique feature. These bonds pay income at a rate that “floats,” or fluctuates with market interest rates. This makes TFLO a potential winner during periods of rising interest rates and inflation, though the opposite is also true.

Bottom Line

ETFs offer many ways to earn income, and can do so on a monthly basis. Popular ETF providers include Vanguard, Fidelity and BlackRock (BLK). Investors can look to hundreds of funds to research and find the ones that match what they aim to do with their money when it comes to spinning out cash flow from investments each month, without having to sell holdings.

Frequently Asked Questions (FAQs)

Are Monthly Dividend ETFs Risky?

Any ETF is only as risky as its underlying portfolio of assets. Any stock ETF carries some degree of stock market risk and the stock market has had two drops of 50% or more this century. But when it comes to monthly dividend ETFs from asset classes such as short-term U.S. Treasury securities, barring a collapse of the U.S. Treasury, the risk of such ETFs should be considered much less than that of the stock or long-term bond markets.

Can I Reinvest Dividends From Monthly Dividend ETFs?

Yes, though typically this is done after the dividend has been received. Unlike mutual fund distributions and dividends, ETFs trade on the stock exchange, so like stocks, the income from them goes right into the shareholder’s account on payment date. Investors can check with their custodial brokerage firm to have dividends reinvested, and under what terms.

Are Monthly Dividend ETFs Tax-Efficient?

Tax efficiency is different for each investor, and so that is a question for an individual to ask their tax advisor about.


Rob Isbitts


Trends and challenges in sentiment summarization: a systematic review of aspect extraction techniques

  • Published: 09 May 2024


  • Nur Hayatin 1 , 2 ,
  • Suraya Alias 1 &
  • Lai Po Hung 1  


Sentiment Summarization is an automated technology that extracts important features of sentences and then reorganizes selected words or sentences by their aspect class and sentiment polarity. This emerging research area wields considerable influence, since a sentiment-based summary can provide insight into users’ subjective opinions, creating social engagement that benefits industry players and entrepreneurs. However, systematic studies examining sentiment-based summarization, particularly at the aspect level, are still limited, even though aspects are crucial to a comprehensive assessment of a product or service and to improving sentiment summarization results. Hence, we conducted a comprehensive survey of aspect extraction techniques in sentiment summarization by classifying techniques based on sentiment analysis levels and features. This work analyzes the current research trends and challenges in the research domain from a different perspective. More than 150 publications from 2004 to 2023 were collected, mainly from credible academic databases. We summarized and performed a comparative analysis of the sentiment summarization approaches and tabulated their performance based on different domains, sentiment levels, and features. We also derived a thematic taxonomy of aspect extraction techniques in sentiment summarization from the analysis and illustrated its usage in various applications. Finally, this study presents recommendations for the challenges and opportunities for future research development.
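As a rough illustration of what "reorganizing sentences by aspect class and sentiment polarity" means, the toy sketch below groups review sentences using keyword matching. It is not one of the techniques covered by the survey: the lexicons, the `summarize` function, and the sample reviews are all hypothetical, and practical systems rely on far richer aspect extraction and sentiment models.

```python
from collections import defaultdict

# Hypothetical toy lexicons; real systems learn or curate much larger ones.
ASPECT_TERMS = {
    "battery": "battery", "charge": "battery",
    "screen": "display", "display": "display",
}
POSITIVE = {"great", "good", "bright", "excellent"}
NEGATIVE = {"poor", "bad", "dim", "slow"}


def tokenize(sentence):
    """Lowercase the words and strip surrounding punctuation."""
    return [word.strip(".,!?").lower() for word in sentence.split()]


def summarize(sentences):
    """Group sentences first by aspect class, then by sentiment polarity."""
    summary = defaultdict(lambda: defaultdict(list))
    for sentence in sentences:
        words = tokenize(sentence)
        aspects = {ASPECT_TERMS[w] for w in words if w in ASPECT_TERMS}
        score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
        polarity = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        for aspect in aspects:
            summary[aspect][polarity].append(sentence)
    return {aspect: dict(groups) for aspect, groups in summary.items()}


reviews = [
    "The battery life is great.",
    "The screen is dim and the display feels slow.",
    "Battery takes a long time to charge, which is bad.",
]
result = summarize(reviews)
for aspect, by_polarity in sorted(result.items()):
    print(aspect, by_polarity)
```

Keyword lookup only catches explicitly named aspects. A sentence that implies an aspect without naming it would be missed by this sketch.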


Statista (2021) Total number of user reviews and opinions on Tripadvisor worldwide from 2014 to 2020 (in millions). Statista Research Department, 2021. https://www.statista.com/statistics/684862/tripadvisor-number-of-reviews/ . Accessed 13 Nov 2021.

Dixon S (2022) How do online customer reviews affect your opinion of a local business?. https://www.statista.com/statistics/315751/online-review-customer-opinion/ . Accessed 28 Dec 2022

Lloret E, Palomar M (2011) Text summarisation in progress: a literature review. Artif Intell Rev 37(1):1–41

Article   Google Scholar  

Liu B (2012) Sentiment analysis and sentiment analysis and opinion mining. Morgan & Claypool, San Rafael

Book   Google Scholar  

Moussa ME, Mohamed EH, Haggag MH (2018) A survey on opinion summarization techniques for social media. Future Comput Inform J 3(1):82–109

Beineke P, Hastie T, Manning C, Vaithyanathan S (2004) Exploring sentiment summarization. In: AAAI spring symposium—technical report, 2004, vol SS-04-07, pp 12–15

Mukherjee A, Liu B (2012) Aspect extraction through semi-supervised modeling. In: 50th annual meeting of the association for computational linguistics, ACL 2012—proceedings of the conference, 2012, vol 1, no. July, pp 339–348

Das SJ, Murakami R, Chakraborty B (2021) Development of a two-step LDA based aspect extraction technique for review summarization. Int J Appl Sci Eng 18(1):1–18

Google Scholar  

Kim H, Ganesan K (2011) Comprehensive review of opinion summarization. Illinois Environ 1–30

Giachanou A, Crestani F (2016) Like it or not: a survey of twitter sentiment analysis methods. ACM Comput Surv 49(2):1–41

Tubishat M, Idris N, Abushariah MAM (2018) Implicit aspect extraction in sentiment analysis: review, taxonomy, oppportunities, and open challenges. Inf Process Manag 54(4):545–563

Yue L, Chen W, Li X, Zuo W, Yin M (2019) A survey of sentiment analysis in social media. Knowl Inf Syst 60(2):617–663

Nazir A, Rao Y, Wu L, Sun L (2020) Issues and challenges of aspect-based sentiment analysis: a comprehensive survey. IEEE Trans Affect Comput 13(2):845–863

Cai H, Xia R, Yu J (2021) Aspect-category-opinion-sentiment quadruple extraction with implicit aspects and opinions. In: Proceedings ofthe 59th annual meeting ofthe association for computational linguistics and the 11th international joint conference on natural language processing, pp 340–350

Komwad N, Tiwari P, Praveen B, Chowdary CR (2022) A survey on review summarization and sentiment classification. Knowl Inf Syst 64(9):2289–2327

Maitama JZ, Idris N, Abdi A, Shuib L, Fauzi R (2020) A systematic review on implicit and explicit aspect extraction in sentiment analysis. IEEE Access 8(November):194166–194191

Mahajani A, Pandya V, Maria I, Sharma D (2019) A comprehensive survey on extractive and abstractive techniques for text summarization. Adv Intell Syst Comput 904:339–351

Nenkova A, McKeown K (2011) Automatic summarization. Found Trends Inf Retr 5(2–3):103–233

Hayatin N, Alias S, Hung LP, Sainin MS (2022) Sentiment analysis based on probabilistic classifier techniques in various indonesian review data. Jordanian J Comput Inf Technol 08(3):271–282

Medhat W, Hassan A, Korashy H (2014) Sentiment analysis algorithms and applications: a survey. Ain Shams Eng J 5(4):1093–1113

Yi J, Nasukawa T, Bunescu R, Niblack W (2003) Sentiment analyzer: extracting sentiments about a given topic using natural language processing techniques. In: Proceedings—IEEE international conference on data mining, iCDM, 2003, pp 427–434

Hu YH, Chen YL, Chou HL (2017) Opinion mining from online hotel reviews—a text summarization approach. Inf Process Manag 53(2):436–449

Hussain SF, Babar HZUD, Khalil A, Jillani RM, Hanif M, Khurshid K (2020) A fast non-redundant feature selection technique for text data. IEEE Access 8:181763–181781

Thakkar HK, Sahoo PK, Mohanty P (2021) DOFM: domain feature miner for robust extractive summarization. Inf Process Manag 58(3):102474

Tan B, Qin L, Xing E, Hu Z (2020) Summarizing text on any aspects: a knowledge-informed weakly-supervised approach. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), 2020, pp 6301–6309

Huang S-L, Cheng W-C (2015) Discovering Chinese sentence patterns for feature-based opinion summarization. Electron Commer Res Appl 14(6):582–591

Jiang W, Chen J, Ding X (2021) Review summary generation in online systems: frameworks for supervised and unsupervised scenarios. ACM Trans Web 15(3):1–33

López M, Martínez-Cámara E, Luzón MV, Herrera F (2021) ADOPS: aspect discovery opinion summarisation methodology based on deep learning and subgroup discovery for generating explainable opinion summaries. Knowl-Based Syst 231:107455

Kumar A, Seth S, Gupta S, Maini S (2021) Sentic computing for aspect-based opinion summarization using multi-head attention with feature pooled pointer generator network. Cognit Comput 14(1):130–148

Kitchenham B, Pearl Brereton O, Budgen D, Turner M, Bailey J, Linkman S (2009) Systematic literature reviews in software engineering—a systematic literature review. Inf Softw Technol 51(1):7–15

Page MJ et al (2021) The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Int J Surg 88:105906

Ataei TS, Darvishi K, Javdan S, Minaei-Bidgoli B, Eetemadi S (2019) Pars-ABSA: an aspect-based sentiment analysis dataset for Persian, pp 1–6

Huang J, Xue Y, Hu X, Jin H, Lu X, Liu Z (2019) Sentiment analysis of Chinese online reviews using ensemble learning framework. Cluster Comput 22:3043–3058

Alturaief N, Aljamaan H, Baslyman M (2021) AWARE: aspect-based sentiment analysis dataset of apps reviews for requirements elicitation. In: Proc. - 2021 36th IEEE/ACM int. conf. autom. softw. eng. work. ASEW 2021, pp 211–218

Mudalige CR et al (2020) SigmaLaw-ABSA: dataset for aspect-based sentiment analysis in legal opinion texts. In: 2020 IEEE 15th int. conf. ind. inf. syst. ICIIS 2020—proc., pp 488–493

Angelidis S, Amplayo RK, Suhara Y, Wang X, Lapata M (2021) Extractive opinion summarization in quantized transformer spaces. Trans Assoc Comput Linguist 9:277–293

Nguyen Ngoc D, Phan Thi T, Do P (2019) A data preprocessing method to classify and summarize aspect-based opinions using deep learning. In: Lect. notes comput. sci. (including subser. lect. notes artif. intell. lect. notes bioinformatics), vol 11431 LNAI, no. December, pp 115–127

Kang Y, Zhou L (2017) RubE: rule-based methods for extracting product features from online consumer reviews. Inf Manag 54(2):166–176

Nurrahmi H, Maharani W, Saadah S (2016) Feature extraction and opinion classification using class sequential rule on customer product review. In: 2016 4th int. conf. inf. commun. technol. ICoICT 2016

Ku L, Lee L-Y, Wu T-H, Chen HH (2005) Major topic detection and its application to opinion summarization. In: SIGIR 2005—proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval, 2005, pp 627–628

Hu M, Liu B (2004) Mining and summarizing customer reviews. In: KDD-2004—proc. tenth ACM SIGKDD int. conf. knowl. discov. data min., pp 168–177

Rana TA, Cheah Y-N, Rana T (2020) Multi-level knowledge-based approach for implicit aspect identification. Appl Intell 50(12):4616–4630

Xu Q, Zhu L, Dai T, Guo L, Cao S (2020) Non-negative matrix factorization for implicit aspect identification. J Ambient Intell Humaniz Comput 11(7):2683–2699

Liu B (2012) Sentiment analysis and opinion mining

Peal M, Hossain MS, Chen J (2022) Summarizing consumer reviews. J Intell Inf Syst 59:193–212

Hu Y (2017) Opinion mining from online hotel reviews—a text summarization approach. Inf Process Manag 53(2):436–449

Ma Y, Li Q (2019) A weakly-supervised extractive framework for sentiment-preserving document summarization. World Wide Web 22(4):1401–1425

Marzijarani SB, Sajedi H (2020) Opinion mining with reviews summarization based on clustering. Int J Inf Technol 12(4):1299–1310

Abdi A, Shamsuddin SM, Hasan S, Piran J (2019) Automatic sentiment-oriented summarization of multi-documents using soft computing. Soft Comput 23(20):10551–10568

Abdi A, Hasan S, Shamsuddin SM, Idris N, Piran J (2021) A hybrid deep learning architecture for opinion-oriented multi-document summarization based on multi-feature fusion. Knowl-Based Syst 213:106658

Tsai CF, Chen K, Hu YH, Chen WK (2020) Improving text summarization of online hotel reviews with review helpfulness and sentiment. Tour Manag 80:104122

Abdi A, Shamsuddin SM, Hasan S, Piran J (2018) Machine learning-based multi-documents sentiment-oriented summarization using linguistic treatment. Expert Syst Appl 109:66–85

Zhou X, Wan X, Xiao J (2016) CMiner: opinion extraction and summarization for Chinese microblogs. IEEE Trans Knowl Data Eng 28(7):1650–1663

Nishikawa H, Hasegawa T, Matsuo Y, Kikui G (2010) Opinion summarization with integer linear programming formulation for sentence extraction and ordering. In: Coling 2010—23rd international conference on computational linguistics, proceedings of the conference, 2010, vol 2, pp 910–918

Zhang M, Zhou G, Huang N, He P, Yu W, Liu W (2023) AsU-OSum: aspect-augmented unsupervised opinion summarization. Inf Process Manag 60(1):103138

Guzman E, Ibrahim M, Glinz M (2017) A little bird told me: mining tweets for requirements and software evolution. In: Proc.—2017 IEEE 25th int. requir. eng. conf. RE 2017, no 3, pp 11–20

Uddin G, Khomh F (2017) Automatic summarization of API reviews. IN: ASE 2017—proceedings of the 32nd IEEE/ACM international conference on automated software engineering, pp 159–170

Chu E, Liu PJ (2019) MeanSum: a neural model for unsupervised multi-document abstractive summarization. In: 36th international conference on machine learning, ICML 2019, 2019, vol 2019, pp 2088–2110

Dang HT (2005) Overview of DUC 2005. In: Proc. doc. underst. conf., p. 1Ą12

Zhuang L, Jing F, Zhu X-Y (2006) Movie review mining and summarization. In: International conference on information and knowledge management, proceedings, 2006, pp 43–50

Hayashi H, Budania P, Wang P (2021) WikiAsp: a dataset for multi-domain aspect-based summarization. Trans Assoc Comput Linguist 9:211–225

Ge S, Huang J, Meng Y, Wang S, Han J (2021) Fine-grained opinion summarization with minimal supervision

Marrese-Taylor E, Velásquez JD, Bravo-Marquez F (2014) A novel deterministic approach for aspect-based opinion mining in tourism products reviews. Expert Syst Appl 41(17):7764–7775

Jmal J, Faiz R (2013) Customer review summarization approach using twitter and sentiwordnet. In: ACM international conference proceeding series

Piryani R, Gupta V, Singh VK (2018) Generating aspect-based extractive opinion summary: drawing inferences from social media texts. Comput y Sist 22(1):83–91

He R, Lee WS, Ng HT, Dahlmeier D (2017) An unsupervised neural attention model for aspect extraction. In: ACL 2017—55th annu. meet. assoc. comput. linguist. proc. conf. (Long Pap.), vol 1, pp 388–397

Angelidis S, Lapata M (2018) Summarizing opinions: aspect extraction meets sentiment prediction and they are both weakly supervised. In: Proc. 2018 conf. empir. methods nat. lang. process. EMNLP 2018, vol arXiv, pp 3675–3686

Lloret E, Boldrini E, Vodolazova T, Martínez-Barco P, Muñoz R, Palomar M (2015) A novel concept-level approach for ultra-concise opinion summarization. Expert Syst Appl 42(20):7148–7156

Hu HW, Chen YL, Hsu PT (2016) A novel approach to rate and summarize online reviews according to user-specified aspects. J Electron Commer Res 17(2):132–152

Amplayo RK, Song M (2017) An adaptable fine-grained sentiment analysis for summarization of multiple short online reviews. Data Knowl Eng 110:54–67

Xu X, Meng T, Cheng X (2011) Aspect-based extractive summarization of online reviews. In: Proceedings of the ACM symposium on applied computing, 2011, pp 968–975

Lu Y, Zhai CX, Sundaresan N (2009) Rated aspect summarization of short comments. In: WWW’09—proc. 18th int. world wide web conf., pp 131–140

Zhu JZM, Wang H, Tsou BK (2009) Aspect-based sentence segmentation for sentiment summarization. In: International conference on information and knowledge management, proceedings, 2009, pp 65–72

Blair-Goldensohn S, Hannan K, McDonald R, Neylon T, Reis GA, Reynar J (2008) Building a sentiment summarizer for local service reviews. In: Proceedings of the WWW2008 workshop: NLP in the information explosion era (NLPIX 2008)

Hong M, Wang H (2021) Research on customer opinion summarization using topic mining and deep neural network. Math Comput Simul 185:88–114

Article   MathSciNet   Google Scholar  

Balahur A, Lloret E, Boldrini E, Montoyo A, Palomar M, Martínez-barco P (2009) Summarizing threads in blogs using opinion polarity. In: Events in emerging text types (eETTs) - Borovets, Bulgaria, 2009, pp 23–31

Stoyanov V, Cardie C (2006) Partially supervised coreference resolution for opinion summarization through structured rule learning. In: COLING/ACL 2006—EMNLP 2006: 2006 conference on empirical methods in natural language processing, proceedings of the conference, 2006, pp 336–344

Amoudi G, Almansour A, Alghamdi HS (2022) Improved graph-based Arabic hotel review summarization using polarity classification. Appl Sci 12(21):10980

Wu P, Li X, Shen S, He D (2020) Social media opinion summarization using emotion cognition and convolutional neural networks. Int J Inf Manag 51(July 2019):101978

Abdi A, Shamsuddin SM, Aliguliyev RM (2018) QMOS: query-based multi-documents opinion-oriented summarization. Inf Process Manag 54(2):318–338

Balahur A, Kabadjov M, Steinberger J, Steinberger R, Montoyo A (2012) Challenges and solutions in the opinion summarization of user-generated content. J Intell Inf Syst 39(2):375–398

Kokkoras F, Ntonas ELK, Vlahavas I (2008) MOpiS: a multiple opinion summarizer. In: Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), vol 5138, pp 110–122

Darmawiguna IGM, Pradnyana GA, Jyotisananda IB (1810) Indonesian sentiment summarization for lecturer learning evaluation by using textrank algorithm. J Phys Conf Ser 1:2021

Sheikh AA, Arif T, Malik MB, Bhat SI (2021) Extraction and summarization of reviews using lexicon based approach. IOP Conf Ser Mater Sci Eng 1022(1):0–7

Ali SM, Noorian Z, Bagheri E, Ding C, Al-Obeidat F (2018) Topic and sentiment aware microblog summarization for twitter. J Intell Inf Syst 54(1):129–156

Dadhich A, Thankachan B (2021) Social and juristic challenges of AI for opinion mining approaches on Amazon and flipkart product reviews using machine learning algorithms. SN Comput Sci 2(3):180

Hou T, Yannou B, Leroy Y, Poirson E (2019) Mining customer product reviews for product development: a summarization process. Expert Syst Appl 132:141–150

Wang WM, Li Z, Tian ZG, Wang JW, Cheng MN (2018) Extracting and summarizing affective features and responses from online product descriptions and reviews: a Kansei text mining approach. Eng Appl Artif Intell 73(October 2017):149–162

Zhang R, Yu W, Sha C, He X, Zhou A (2015) Product-oriented review summarization and scoring. Front Comput Sci 9(2):210–223

Wang D, Zhu S, Li T (2013) SumView: a web-based engine for summarizing product reviews and customer opinions. Expert Syst Appl 40(1):27–33

Di Fabbrizio G, Aker A, Gaizauskas R (2013) Summarizing online reviews using aspect rating distributions and language modeling. IEEE Intell Syst 28(3):28–37

Xu K, Liao SS, Li J, Song Y (2011) Mining comparative opinions from customer reviews for Competitive Intelligence. Decis Support Syst 50(4):743–754

Li Q, Jin Z, Wang C, Zeng DD (2016) Mining opinion summarizations using convolutional neural networks in Chinese microblogging systems. Knowl-Based Syst 107:289–300

Valarmathi B, Palanisamy V (2011) Opinion mining classification using key word summarization based on singular value decomposition. Int J Comput Sci Eng 3(1):212–215

Zheng Y, Li X, Su G, Ma J, Ning C (2020) Position-aware hybrid attention network for aspect-level sentiment analysis. In: Lect. notes comput. sci. (including subser. lect. notes artif. intell. lect. notes bioinformatics), vol 12285 LNCS, no 2, pp 83–95

Maulidiah Elfajr N, Sarno R (2018) Sentiment analysis using weighted emoticons and SentiWordNet for Indonesian language. In: Proceedings - 2018 international seminar on application for technology of information and communication: creative technology for human life, iSemantic 2018, pp 234–238

Marstawi A, Sharef NM, Aris TNM, Mustapha A (2017) Ontology-based aspect extraction for an improved sentiment analysis in summarization of product reviews. In: Proceedings of the 8th international conference on computer modeling and simulation, {ICCMS} 2017. Canberra, Australia, January 20–23, 20172017, pp 100–104

Lloret E, Balahur A, Gómez JM, Montoyo A, Palomar M (2012) Towards a unified framework for opinion retrieval, mining and summarization. J Intell Inf Syst 39(3):711–747

Meng X, Wei F, Liu X, Zhou M, Li S, Wang H (2012) Entity-centric topic-oriented opinion summarization in twitter. In: Proc. ACM SIGKDD int. conf. knowl. discov. data min., pp 379–387

Sarvabhotla K, Pingali P, Varma V (2011) Sentiment classification: a lexical similarity based approach for extracting subjectivity in documents. Inf Retr 14(3):337–353

Weng J, Yang C-L, Chen B-N, Wang Y-K, Lin S-D (2011) IMASS: an intelligent Microblog analysis and summarization system. In: ACL HLT 2011—49th annual meeting of the association for computational linguistics: human language technologies, proceedings of student session, pp 133–138

Gerani S, Carenini G, Ng RT (2019) Modeling content and structure for abstractive review summarization. Comput Speech Lang 53:302–331

Rakesh V, Ding W, Ahuja A, Rao N, Sun Y, Reddy CK (2018) A sparse topic model for extracting aspect-specific summaries from online reviews. In: Web conf. 2018—proc. world wide web conf. WWW 2018, pp 1573–1582

Wu H, Gu Y, Sun S, Gu X (2016) Aspect-based opinion summarization with convolutional neural networks. In: Proc. int. jt. conf. neural networks, vol 2016-Octob, pp 3157–3163

Gu X, Gu Y, Wu H (2017) Cascaded convolutional neural networks for aspect-based opinion summary. Neural Process Lett 46(2):581–594

Nguyen HT, Le T, Le Nguyen M (2019) Opinions summarization: aspect similarity recognition relaxes the constraint of predefined aspects. In: Int. conf. recent adv. nat. lang. process. RANLP, vol 2019-Septe, pp 487–496

Hu M, Liu B (2004) Mining opinion features in customer reviews. In: Proceedings of the national conference on artificial intelligence, 2004, pp 755–760

Shimada K, Tadano R, Endo T (2011) Multi-aspects review summarization with objective information. Procedia Soc Behav Sci 27:140–149

Titov I, McDonald R (2008) A joint model of text and aspect ratings for sentiment summarization. In: ACL-08: HLT—46th annual meeting of the association for computational linguistics: human language technologies, proceedings of the conference, pp 308–316

Amarouche K, Benbrahim H, Kassou I (2018) Customer product review summarization over time for competitive intelligence. J Autom Mob Robot Intell Syst 12(4):70–82

Tadano R, Shimada K, Endo T (2010) Multi-aspects review summarization based on identification of important opinions and their similarity. In: PACLIC 24—proceedings of the 24th Pacific Asia conference on language, information and computation, 2010, no 2008, pp 685–692

Saeed RMK, Rady S, Gharib TF (2022) An ensemble approach for spam detection in Arabic opinion texts. J King Saud Univ Comput Inf Sci 34(1):1407–1416

Chamid AA (2023) Graph-based semi-supervised deep learning for Indonesian aspect-based sentiment analysis

Khan MR, Kannan R (2017) Extracting sentiments and summarizing health reviews from social media using machine learning techniques. Trans Mach Learn Artif Intell 6(1):24

Mane VL, Panicker SS, Patil VB (2015) Summarization and sentiment analysis from user health posts. In: 2015 international conference on pervasive computing: advance communication technology and application for society, ICPC 2015

Liang J, Bao J, Wang Y, Wu Y, He X, Zhou B (2021) CUSTOM: aspect-oriented product summarization for E-commerce. In: International conference on natural language processing and chinese computing, 2021, pp 124–136

Li Y (2013) Deriving market intelligence from microblogs. Decis Support Syst 55(1):206–217

Li H, Wang Y, Mou X, Peng Q (2020) Sentiment classification of financial microblogs through automatic text summarization. In: Proc.—2020 Chinese autom. congr. CAC 2020, pp 5579–5584

Huang Y, Yu Z, Guo J, Yu Z, Xian Y (2020) Legal public opinion news abstractive summarization by incorporating topic information. Int J Mach Learn Cybern 11(9):2039–2050

Dong R, O’Mahony MP, Schaal M, McCarthy K, Smyth B (2016) Combining similarity and sentiment in opinion mining for product recommendation. J Intell Inf Syst 46(2):285–312

Bai P, Xia Y, Xia Y (2020) Fusing knowledge and aspect sentiment for explainable recommendation. IEEE Access 8:137150–137160

Ouyang Y (2017) SentiStory: multi-grained sentiment analysis and event summarization with crowdsourced social media data. Pers Ubiquitous Comput 21(1):97–111

Bražinskas A (2022) Low- and high-resource opinion summarization

Mukherjee R, Peruri HC, Vishnu U, Goyal P, Bhattacharya S, Ganguly N (2020) Read what you need: controllable aspect-based opinion summarization of tourist reviews. In: SIGIR 2020—proc. 43rd int. ACM SIGIR conf. res. dev. inf. retr., pp 1825–1828

Siledar T, Makwana J, Bhattacharyya P (2023) Aspect-sentiment-based opinion summarization using multiple information sources. In: ACM int. conf. proceeding ser., pp 55–61

Ouerhani N, Maalel A, Ben Ghézala H (2022) SMAD: SMart assistant during and after a medical emergency case based on deep learning sentiment analysis: the pandemic COVID-19 case. Cluster Comput 25(5):3671–3681

Yadav A, Patel A, Shah M (2021) A comprehensive review on resolving ambiguities in natural language processing. AI Open 2(July):85–92

Eke CI, Norman AA, Shuib L, Nweke HF (2020) Sarcasm identification in textual data: systematic review, research challenges and open directions. Artif Intell Rev 53(6):4215–4258

Carenini G (2008) Summarizing emails with conversational cohesion and subjectivity. In: ACL-08: HLT—46th annual meeting of the association for computational linguistics: human language technologies, proceedings of the conference, pp 353–361

Chaturvedi I, Cambria E, Welsch RE, Herrera F (2018) Distinguishing between facts and opinions for sentiment analysis: survey and challenges. Inf Fusion 44:65–77

Kansal H, Toshniwal D (2014) Aspect based summarization of context dependent opinion words. Procedia Comput Sci 35:166–175

Cambria E, Hussain A, Durrani T, Havasi C, Eckl C, Munro J (2010) Sentic computing for patient centered applications. In: International conference on signal processing proceedings, ICSP, 2010, pp 1279–1282

Verma K, Davis B (2021) Implicit aspect-based opinion mining and analysis of airline industry based on user-generated reviews. SN Comput Sci 2(4):1–9

Bathla G, Singh P, Singh RK, Cambria E, Tiwari R (2022) Intelligent fake reviews detection based on aspect extraction and analysis using deep learning. Neural Comput Appl 34:20213–20229

Mabokela KR, Celik T, Raborife M (2023) Multilingual sentiment analysis for under-resourced languages: a systematic review of the landscape. IEEE Access 11(October 2022):15996–16020

Zou Y, Zhu B, Hu X, Gui T, Zhang Q (2021) Low-resource dialogue summarization with domain-agnostic multi-source pretraining. In: EMNLP 2021 - 2021 conf. empir. methods nat. lang. process. proc., pp 80–91

Liu C (2015) IncreSTS: towards real-time incremental short text summarization on comment streams from social network services. IEEE Trans Knowl Data Eng 27(11):2986–3000

Guerra PHC, Veloso A, Meira W, Almeida V (2011) From bias to opinion: A transfer-learning approach to real-time sentiment analysis. In: Proc. ACM SIGKDD int. conf. knowl. discov. data min., no. August, pp 150–158

Hu M, Liu B (2006) Opinion extraction and summarization on the web. Proc Natl Conf Artif Intell 2:1621–1624

Kangale A, Kumar S, Naeem MA, Williams M, Tiwari M (2015) Mining consumer reviews to generate ratings of different product attributes while producing feature- based review-summary. Int J Syst Sci 47(February 2016):3272–3286

Hariharan S, Srimathi R, Sivasubramanian M (2010) Opinion mining and summarization of reviews in web forums. In: COMPUTE 2010—the 3rd annual ACM Bangalore conference

Kim Amplayo R, Brazinskas A, Suhara Y, Wang X, Liu B (2022) Beyond opinion mining: summarizing opinions of customer reviews, vol 1, no 1. Association for Computing Machinery

Ardilla ZN, Sari TI, Hayatin N, Fatichah C Sarcasm detection on news headline bidirectional-LSTM with glove embeddings using multilayer, pp 2–7

Liu M, Shang Y, Yue Q, Zhou J (2021) Detecting fake reviews using multidimensional representations with fine-grained aspects plan. IEEE Access 9:3765–3773

Dragoni M, Federici M, Rexha A (2019) An unsupervised aspect extraction strategy for monitoring real-time reviews stream. Inf Process Manag 56(3):1103–1118

AbdulAziz A, Starkey A (2020) Predicting supervise machine learning performances for sentiment analysis using contextual-based approaches. IEEE Access 8:17722–17733

Banerjee A, Bhattacharjee M, Ghosh K, Chatterjee S (2020) Synthetic minority oversampling in addressing imbalanced sarcasm detection in social media. Multimed Tools Appl 79(47–48):35995–36031

Xu H, Liu H, Jiao P, Wang W (2021) Transformer reasoning network for personalized review summarization. SIGIR 2021:1452–1461

Zuo E, Zhao H, Chen B, Chen Q (2020) Context-specific heterogeneous graph convolutional network for implicit sentiment analysis. IEEE Access 8:37967–37975

Elsahar H, Coavoux M, Gallé M, Rozen J (2021) Self-supervised and controlled multi-document opinion summarization. In: EACL 2021—16th conf. eur. chapter assoc. comput. linguist. proc. conf. , pp 1646–1662

Akhtar MS, Sawant P, Sen S, Ekbal A, Bhattacharyya P (2018) Improving word embedding coverage in less-resourced languages through multi-linguality and cross-linguality: a case study with aspect-based sentiment analysis. In: ACM trans. Asian low-resour. lang. inf. process., vol 18, no 2

Chaturvedi I, Ragusa E, Gastaldo P, Zunino R, Cambria E (2018) Bayesian network based extreme learning machine for subjectivity detection. J Frankl Inst 355(4):1780–1797

Eke CI, Norman AA, Shuib L (2021) Context-based feature technique for sarcasm identification in benchmark datasets using deep learning and BERT model. IEEE Access 9:48501–48518

Akhtar N, Zubair N, Kumar A, Ahmad T (2017) Aspect based Sentiment oriented summarization of hotel reviews. Procedia Comput Sci 115:563–571

Tran TA, Duangsuwan J, Wettayaprasit W (2021) Automatic aspect-based sentiment summarization for visual, structured, and textual summaries. ECTI Trans Comput Inf Technol 15(1):50–72

Download references

Acknowledgements

This work is supported by the Kementerian Pengajian Tinggi Malaysia Fundamental Research Grant Scheme (FRGS), grant code FRGS/1/2020/ICT02/UMS/02/2; the Language Engineering and Application Development (LEAD) research group of the Faculty of Computing and Informatics, Universiti Malaysia Sabah; and Lembaga Pengembangan Publikasi Ilmiah (LPPI), University of Muhammadiyah Malang (UMM), Indonesia.

The authors have no financial or proprietary interests in any material discussed in this article.

Author information

Authors and affiliations.

Faculty of Computing and Informatics, Universiti Malaysia Sabah, Kota Kinabalu, Malaysia

Nur Hayatin, Suraya Alias & Lai Po Hung

Informatics Department of Engineering Faculty, University of Muhammadiyah Malang, Malang, Indonesia

Nur Hayatin

Contributions

Nur Hayatin conducted the experiment and composed the manuscript with the assistance of Suraya Alias, who provided project supervision and collaborated with Lai Po Hung in manuscript review.

Corresponding author

Correspondence to Suraya Alias.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Hayatin, N., Alias, S. & Hung, L.P. Trends and challenges in sentiment summarization: a systematic review of aspect extraction techniques. Knowl Inf Syst (2024). https://doi.org/10.1007/s10115-024-02075-w

Received: 10 March 2023

Revised: 25 January 2024

Accepted: 05 February 2024

Published: 09 May 2024

DOI: https://doi.org/10.1007/s10115-024-02075-w

  • CCS concepts
  • Semantic and reasoning
  • Artificial intelligence
  • Machine learning
  • Aspect extraction
  • Sentiment analysis
  • Sentiment summarization

COMMENTS

  1. Survey on sentiment analysis: evolution of research methods and topics

    Sentiment analysis, one of the research hotspots in the natural language processing field, has attracted the attention of researchers, and research papers on the field are increasingly published. Many literature reviews on sentiment analysis involving techniques, methods, and applications have been produced using different survey methodologies and tools, but there has not been a survey ...

  2. Sentiment Analysis

    **Sentiment Analysis** is the task of classifying the polarity of a given text. For instance, a text-based tweet can be categorized into either "positive", "negative", or "neutral". Given the text and accompanying labels, a model can be trained to predict the correct sentiment. **Sentiment Analysis** techniques can be categorized into machine learning approaches, lexicon-based approaches, and ...

  3. What Is Sentiment Analysis?

    Sentiment analysis, or opinion mining, is the process of analyzing large volumes of text to determine whether it expresses a positive sentiment, a negative sentiment or a neutral sentiment. Companies now have access to more data about their customers than ever before, presenting both an opportunity and a challenge: analyzing the vast amounts of ...

  4. The Evolution of Sentiment Analysis

    We present the top-20 cited papers from Google Scholar and Scopus and a taxonomy of research topics. In recent years, sentiment analysis has shifted from analyzing online product reviews to social ...

  5. sentiment analysis Latest Research Papers

    Sentiment Analysis (SA) is a Natural Language Processing (NLP) and Information Extraction (IE) task that primarily aims to determine whether the writer's feelings are positive or negative by analyzing a large number of documents. SA is also widely studied in the fields of data mining, web mining, text mining, and information retrieval.

  6. The evolution of sentiment analysis—A review of research topics, venues

    This technique allows a deeper understanding of different research topics of an area, sentiment analysis in our case, by providing a tree-like semantic structure. Sixth, we review the top-cited papers according to Scopus and Google Scholar to show the hallmarks of sentiment analysis research. This paper is structured as follows.

  7. A Survey of Sentiment Analysis: Approaches, Datasets, and Future Research

    Sentiment analysis is a critical subfield of natural language processing that focuses on categorizing text into three primary sentiments: positive, negative, and neutral. With the proliferation of online platforms where individuals can openly express their opinions and perspectives, it has become increasingly crucial for organizations to comprehend the underlying sentiments behind these ...

  8. The Evolution of Sentiment Analysis

    published after 2004. Sentiment analysis papers are scattered to multiple publication venues, and the combined number of papers in the top-15 venues only represent ca. 30% of the papers in total. We present the top-20 cited papers from Google Scholar and Scopus and a taxonomy of research topics. In recent years, sentiment analysis has shifted from

  9. The evolution of sentiment analysis—A review of research topics, venues

    This technique allows a deeper understanding of different research topics of an area, sentiment analysis in our case, by providing a tree-like semantic structure. Sixth, we review the top-cited papers according to Scopus and Google Scholar to show the hallmarks of sentiment analysis research. This paper is structured as follows.

  10. The Evolution of Sentiment Analysis

    Sentiment analysis is one of the fastest growing research areas in computer science, making it challenging to keep track of all the activities in the area. We present a computer-assisted literature review, where we utilize both text mining and qualitative coding, and analyze 6,996 papers from Scopus. We find that the roots of sentiment analysis are in the studies on public opinion analysis at ...

  11. What is Sentiment Analysis?

    Sentiment analysis is a subset of natural language processing (NLP) that focuses on extracting and understanding the emotional content from data. The primary objective is to classify the polarity of a text as positive, negative, or neutral. This classification is essential for understanding customer sentiment, gauging public opinion, and ...

  12. Sentiment Analysis Guide

    Sentiment analysis is a vast topic, and it can be intimidating to get started. Luckily, there are many useful resources, from helpful tutorials to all kinds of free online tools, to help you take your first steps. ... Sentiment Analysis Research & Courses. After learning the basics of sentiment analysis, and understanding how it can help you ...

  13. Sentiment Analysis Projects & Topics For Beginners [2024]

    Sentiment analysis is a kind of data mining where you measure the inclination of people's opinions by using NLP (natural language processing), text analysis, and computational linguistics. We perform sentiment analysis mostly on public reviews, social media platforms, and similar sites.

  14. Sentiment Analysis: A Complete Guide [Updated for 2023]

    Sentiment analysis, also known as opinion mining, is the process of determining the emotions behind a piece of text. Sentiment analysis aims to categorize the given text as positive, negative, or neutral. Furthermore, it then identifies and quantifies subjective information about those texts with the help of: ...

  15. Sentiment Analysis

    Sentiment analysis is the area that deals with judgments, responses, and feelings generated from texts. It is used extensively in fields like data mining, web mining, and social media analytics because sentiments are the most essential characteristics for judging human behavior. This field is creating ripples in both research and industry.

  16. Sentiment Analysis and How to Leverage It

    Sentiment analysis is a powerful tool that offers a number of advantages, but like any research method, it has some limitations. Advantages of sentiment analysis: Accurate, unbiased results; Enhanced insights; More time and energy available for staff to do higher-level tasks; Consistent measures you can use to track sentiment over time

  17. 10 Sentiment Analysis Project Ideas with Source Code [2024]

    Sentiment analysis of citation contexts in research/review papers is an unexplored field, primarily because of the existing myth that most research papers have a positive citation. ... Sentiment Analysis Based on News Topics during COVID-19. It's been over a year since the first lockdown in many countries worldwide because of the COVID-19 ...

  18. Survey on sentiment analysis: evolution of research methods and topics

    When research on sentiment analysis was still in its infancy, the contents and topics of surveys mainly focused on sentiment analysis tasks, analysis granularity, and application areas. Kumar et al. reviewed the basic terms, tasks, and levels of granularity related to sentiment analysis (Kumar and Sebastian 2012).

  19. In The Beginning, Let There Be The Word: Challenges and Insights in

    This research delves into the application of sentiment analysis to political communication. To address the limitations of the Bag of Words methodology, a comparative study of sentiment analysis tools and emotion detection from speech is conducted, using automated speech recognition as a benchmark.

  20. Sentiment Analysis

    Sentiment analysis is a specific subtask within the broad area of opinion mining; in short, the classification of texts according to the emotion that the text appears to convey. Sentiment analysis typically classifies texts according to positive, negative and neutral classifications; so that "This movie is great!" is classified as positive, while "This movie was too long and I got bored ...

  21. The public attitude towards ChatGPT on reddit: A study based on

    In this part, two sentiment analysis models, VADER and TextBlob, are assigned weights of 0.6 and 0.4 respectively for sentiment classification to explore research question 2. The sentiment analysis categorizes the emotional tone of the entries into three distinct classes: positive, negative, and neutral.

  22. Deciphering Lending Behaviors in Peer-to-Peer Platforms: An Integrated

    This research investigates the influence of borrower-generated content on Peer-to-Peer (P2P) lending outcomes, specifically focusing on the sentiment and thematic content of loan descriptions. ... An Integrated Analysis of Emotion, Topic Modeling, and User-defined Occupational Data ... emphasizing the importance of sentiment analysis, theme ...

  23. Reddit's Analysis on the Efficacy of FDA Approved Leading

    This comprehensive conversation analysis report delves into the most popular weight loss drugs on Reddit in the last six months (October 1st, 2023, to March 31st, 2024), offering a nuanced ...

  24. Sentiment analysis researches story narrated by topic modeling approach

    This paper brings forward a comprehensive study about "main research topics, research trends, and comparisons of research topics" in the field of "sentiment analysis" through "social media" using topic modeling, specifically LDA. The findings of this paper prove that "machine learning" methods are among the most important topics the ...

  25. 4 Best Monthly Dividend ETFs For May 2024

    The brain trust at Forbes has run the numbers, conducted the research, and done the analysis to come up with some of the best places for you to make money in 2024. Download Forbes' most popular ...

  26. Trends and challenges in sentiment summarization: a ...

    Sentiment Summarization is an automated technology that extracts important features of sentences and then reorganizes selected words or sentences by their aspect class and sentiment polarity. This emerging research area wields considerable influence, where a sentiment-based summary can provide insight into users' subjective opinions, creating social engagement that benefits industry players ...
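
The weighted scoring mentioned in item 21 can be illustrated with a minimal sketch. It assumes each tool has already produced a polarity score in [-1, 1], as VADER's compound score and TextBlob's polarity do. The 0.6 and 0.4 weights come from the snippet; the function names and the neutral band of 0.05 are assumptions made for illustration, not details taken from the cited study.

```python
# Hypothetical sketch: combine two polarity scores with fixed weights
# and map the result to a three-way sentiment label.

VADER_WEIGHT = 0.6      # weight stated in the snippet for VADER
TEXTBLOB_WEIGHT = 0.4   # weight stated in the snippet for TextBlob
NEUTRAL_BAND = 0.05     # assumption: scores within +/-0.05 count as neutral


def combine_scores(vader_score: float, textblob_score: float) -> float:
    """Return the weighted average of two polarity scores in [-1, 1]."""
    return VADER_WEIGHT * vader_score + TEXTBLOB_WEIGHT * textblob_score


def classify(vader_score: float, textblob_score: float,
             band: float = NEUTRAL_BAND) -> str:
    """Label the combined score as positive, negative, or neutral."""
    score = combine_scores(vader_score, textblob_score)
    if score >= band:
        return "positive"
    if score <= -band:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # Illustrative inputs only; real scores would come from the two tools.
    print(classify(0.8, 0.5))
    print(classify(-0.6, 0.2))
    print(classify(0.05, -0.05))
```

Because the weights sum to 1, the combined score stays in [-1, 1], so a single threshold can be applied regardless of which tool dominates.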