How does sentiment analysis work in general?

What is sentiment analysis?

Sentiment analysis, also known as “mood recognition”, is based on the automated evaluation of user comments, which are used to determine whether a text is meant to be positive or negative. For this purpose, methods of "text mining" are used (see also data mining), ie the automatic analysis of texts that are written in natural language.

And that is exactly what leads to the first problem: natural language does not consist of positive and negative lists. Simple Analytical methods Search in the text for words that have a positive or negative meaning according to a previously created dictionary that matches the topic. This method may provide a very rough overview, but is hardly suitable for capturing the actual mood. Not even the frequency of words that are considered positive or negative in connection with the subjective evaluation of a product is meaningful.

Two customer ratings as an example: "Am thrilled!" and "Quite well, fulfills its purpose.". The first sentence contains one positive word, "enthusiastic", the second sentence contains two positive words, "good" and "fulfilled". A simple evaluation using statistical methods would rate the second sentence better; a person would at most rate it as mediocre, if not as negative. For a successful sentiment analysis, one must therefore use artificial intelligence tools.

In addition, especially in social networks, people formulate their opinions and statements as if they were talking to a friend, and not always according to the rules of German grammar. Many sentences have a completely different meaning in the overall context than they do on their own. Recognizing these nuances is a major challenge for the analysis tools. Furthermore, youth language in particular is characterized by short-term trends.

A sentiment analysis tool must have the target group and therefore know exactly the environment of the product to be analyzed. Machine learning methods help here, with which the tools can be gradually trained, which improves the quality of the results in the long term.