Sentiment analysis is the automated classification of content like text or speech as positive, neutral, or negative. That classification can be thought of as measuring the polarity of the words, and the result of sentiment analysis is often a polarity score. This type of scoring allows organizations to better understand how constituents see them. This might mean analyzing Twitter or Facebook, or tagging blog posts and the comments on those posts with a sentiment score.
Using technology like sentiment analysis, we get insight into how clients and employees see products, the company, support cases, or services. This is usually done using natural language processing, part of a body of work we tend to generically call machine learning. Sentiment analysis has become key for many industries to figure out what people think, and then to improve the experience with products, address any branding concerns, or promote positive branding when possible.
Natural language techniques in computing began all the way back in 1950 with Alan Turing’s amazing article, Computing Machinery and Intelligence. The philosophies, models, articles, and software grew steadily into a body of work over the course of the next few decades.
The explosion of data in the last twenty years made manual analysis impossible. Suddenly there were much broader uses for the techniques already under development. We could isolate and classify content and even score feelings, which is where sentiment analysis comes into play.
How Sentiment Analysis functions
Sentiments are subjective and usually refer to opinions and emotions. This means we’re putting a quantifiable overlay on top of otherwise qualitative data. Still, these aren’t facts. For example, the word “bad” might be considered to have a negative polarity, but “that’s not bad” might be considered a bit more neutral.
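To make the “bad” versus “that’s not bad” example concrete, here is a minimal sketch of lexicon-based polarity scoring with very naive negation handling. The lexicon values, the list of negators, and the choice to flip and halve a negated score are all assumptions made up for illustration; real tools use much larger lexicons and more careful rules.

```python
import re

# Hypothetical mini-lexicon (word -> polarity between -1.0 and 1.0).
# These values are illustrative assumptions, not taken from a real lexicon.
LEXICON = {"good": 0.7, "great": 0.9, "bad": -0.7, "terrible": -0.9}
NEGATORS = {"not", "no", "never"}


def polarity(text):
    """Average the polarity of known words in the text.

    A word that directly follows a negator has its score flipped and
    halved, so "not bad" lands closer to neutral than "bad".
    """
    # Expand "n't" contractions (straight or curly apostrophe) into " not".
    normalized = text.lower().replace("n't", " not").replace("n’t", " not")
    tokens = re.findall(r"[a-z]+", normalized)

    scores = []
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        if token in LEXICON:
            score = LEXICON[token]
            if negate:
                score = -0.5 * score  # assumed dampening factor
            scores.append(score)
        negate = False  # negation only applies to the very next word

    return sum(scores) / len(scores) if scores else 0.0


print(polarity("bad"))             # clearly negative
print(polarity("that’s not bad"))  # much closer to neutral
```

The only point of the sketch is that context changes the score of a word, which is why a plain word list is a starting point rather than a finished analysis.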
Because sentiment analysis has grown as a discipline over the years, different branches have developed around different needs, each with its own strategies for identifying exactly what sentiments are contained in the text being analyzed. Subjectivity/Objectivity Identification involves classifying sentences into a subjective or objective grouping. Feature or Aspect-Based is another branch of sentiment analysis that isolates which opinions relate to which entity, and so picks up on nuance rather than just producing a single score. Neither is simple, but scoring Subjectivity/Objectivity is much more common.
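As a rough illustration of the Feature or Aspect-Based idea, the sketch below splits a review into clauses and attaches each clause’s opinion score to whatever aspect is mentioned in that clause. The aspect keywords, the opinion words, and their scores are hypothetical, and splitting on punctuation and the word “but” is a deliberate simplification of what real aspect-based systems do.

```python
import re

# Hypothetical aspect keywords and opinion scores, for illustration only.
ASPECTS = {"battery", "screen", "price"}
OPINIONS = {"great": 1.0, "sharp": 0.5, "poor": -1.0, "expensive": -0.5}


def aspect_sentiment(review):
    """Return a dict mapping each mentioned aspect to a summed opinion score."""
    results = {}
    # Treat commas, semicolons, periods, and the word "but" as clause breaks.
    for clause in re.split(r"[,.;]|\bbut\b", review.lower()):
        words = re.findall(r"[a-z]+", clause)
        clause_score = sum(OPINIONS.get(word, 0.0) for word in words)
        for word in words:
            if word in ASPECTS:
                results[word] = results.get(word, 0.0) + clause_score
    return results


print(aspect_sentiment("The screen is sharp but the battery is poor."))
```

A single overall score for that review would land somewhere in the middle and hide the fact that the customer likes one thing and dislikes another.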
Sentiment analysis is becoming ubiquitous. Once considered ground-breaking, it is now something organizations can get started with relatively quickly, thanks to frameworks available for popular programming languages and example code spread across a variety of social coding websites. While the implementation can be quick, removing unnecessary information or teaching the software to interpret information in a way specific to a need can be fairly time consuming. This is usually referred to as training a machine learning model.
The more we train our model, the more significant our findings, or insights, can be. This means we’re going from a generic lexicon of words and their scores to a more specialized analysis. Training isn’t strictly required, as you can easily quantify opinions and accept a bell-curve type of approach where the typical response becomes a baseline. This allows you to quickly start tracking the normal reactions and comparing others against that. But over time most organizations begin partitioning that information and looking for specialized data to train on.
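The bell-curve baseline can be sketched in a few lines: treat the average score as “normal” and flag anything that sits unusually far from it. The sample scores and the threshold of two standard deviations are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flag_outliers(scores, threshold=2.0):
    """Return (index, z_score) pairs for scores far from the baseline.

    The baseline is the mean of all scores. A score is flagged when it is
    at least `threshold` standard deviations away from that mean.
    """
    baseline = mean(scores)
    spread = stdev(scores)
    if spread == 0:
        return []  # every score is identical, so nothing stands out
    flagged = []
    for index, score in enumerate(scores):
        z_score = (score - baseline) / spread
        if abs(z_score) >= threshold:
            flagged.append((index, z_score))
    return flagged


# Hypothetical polarity scores for nine interview notes.
scores = [0.1, 0.2, 0.15, 0.1, 0.2, 0.15, 0.1, 0.2, -0.9]
print(flag_outliers(scores))  # only the last note stands out
```

This needs no training at all, which is why it makes a reasonable first pass before investing in a specialized model.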
Let’s look at one example of quickly getting started using sentiment analysis and how that use case might expand to include training a model. HandrailUX is a popular repository for tracking research data in user research. Researchers meet with customers at various companies and take notes or transcribe interviews. That data is qualitative in nature. We can export the data and then analyze the sentiment of each interaction generically using a simple process documented at https://krypted.com/ux-research/batch-process-sentiment-analysis-for-ux-research-studies/ that provides a script available to anyone to use for free.
Now, let’s say that the company doing the research caters to an industry that uses some pretty specialized words to describe things. The project includes a data.json file. In that file, we can define a blurb of text and then identify whether it’s positive or negative. This impacts the scoring by altering the way the algorithm interprets specific bits of text, and can be used to get a score with higher confidence.
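The general idea of labeled blurbs changing the score can be sketched as follows. The JSON layout shown here is an assumption made for illustration and is not the actual format of the data.json file in that project. The “training” is also simplified down to overriding entries in a hypothetical word lexicon, which is far cruder than what a real model does with labeled examples.

```python
import json

# ASSUMED layout, not the real data.json from the project: a list of
# text snippets, each labeled as positive or negative.
SAMPLE_DATA_JSON = """
[
  {"text": "sick", "label": "positive"},
  {"text": "legacy", "label": "negative"}
]
"""

# Hypothetical generic lexicon where "sick" reads as negative.
BASE_LEXICON = {"sick": -0.6, "great": 0.8}
LABEL_SCORES = {"positive": 1.0, "negative": -1.0}


def specialize(lexicon, data_json):
    """Return a copy of the lexicon with labeled snippets overriding it."""
    specialized = dict(lexicon)
    for entry in json.loads(data_json):
        specialized[entry["text"].lower()] = LABEL_SCORES[entry["label"]]
    return specialized


custom_lexicon = specialize(BASE_LEXICON, SAMPLE_DATA_JSON)
print(custom_lexicon["sick"])  # now scored as positive for this audience
```

In this made-up case, an audience that uses “sick” as praise would otherwise drag scores down every time they were being complimentary.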
Drawbacks of Sentiment Analysis
Machine learning isn’t perfect. In his 1945 Atlantic article “As We May Think,” Vannevar Bush described what would become computers as devices that supplement humans but don’t replace us, and that article inspired many of the original pioneers in computing. Sentiment analysis is similar; it helps guide us in research but does not replace human interpretation. Quantitative and qualitative review of data should always overlay one another. As with most research findings, interpretation is key.
There will always be subtleties that machines can’t pick up on. We can analyze more data with the help of machine learning, though. Sentiment analysis can also be dangerous if you don’t train the model. Using a baseline is fine but requires a lot of review to see how accurate the machine’s findings are. The better the training, the less human intervention is required, but the cost of each additional percentage point of accuracy rises so steeply that it often becomes less expensive to hire larger teams of humans to analyze data.
Another concern with sentiment analysis is that humans create the models used to do machine learning. There are a lot of tools available, and each is only as good as the person who wrote it and as useful as its original purpose is similar to the use case it is being applied to. Test multiple tools and models to find one that works well.
Clean analysis requires clean data. Data often needs to be consistent, and the humans who put data into databases or forums don’t always have that consistency in mind. Errors are always a concern with machine learning, but especially when there’s bad data. Most of the time I’ve spent on machine learning projects has gone into getting data into a consistent and analyzable form.
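As a small example of what getting data into a consistent form can involve, the sketch below strips leftover HTML tags, decodes HTML entities, collapses whitespace, and drops empty or duplicate comments. The sample comments are made up, and the regular expression used to remove tags is a rough shortcut that assumes simple markup rather than a full HTML parser.

```python
import html
import re


def clean_comment(raw):
    """Normalize one raw comment into plain, single-spaced text."""
    text = re.sub(r"<[^>]+>", " ", raw)       # drop simple HTML tags
    text = html.unescape(text)                # "&amp;" -> "&", "&nbsp;" -> space
    return re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace


def clean_all(raw_comments):
    """Clean every comment, skipping empties and case-insensitive duplicates."""
    seen = set()
    cleaned = []
    for raw in raw_comments:
        text = clean_comment(raw)
        key = text.lower()
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return cleaned


raw = ["Great app!", "<p>great&nbsp;app!</p>", "   ", "Too slow"]
print(clean_all(raw))
```

Steps like these are unglamorous, but skipping them means the same comment gets counted twice or scored differently because of stray markup.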
But despite these drawbacks, machine learning is likely to be an integral part, if not one of the most important parts, of any organization.
Advantages of using Sentiment Analysis
You can learn a lot when you have a machine help synthesize the data. Using sentiment analysis, you can see how customers feel about granular areas of an organization much, much faster than by reading through a whole lot of posts on the Internet or forums. The more reactions to your products there are out there, the harder it is to find that data, much less analyze it.
Building tiny robots to help will get you further, faster. You can then drill down into very specific questions you need answered by combining sentiment analysis with recommender and/or classifier machine learning projects, and do so without researcher bias. Findings and uses are far ranging, and in some cases the machine learning tools are now worth more than entire companies.
Machine learning has helped cut down on fraud, helped put the most relevant content in front of us on social networks, been used to mine crazy amounts of data (and to help us interpret that data), and helped propel scientific discoveries. Machine learning is definitely the next wave of how computers augment the human experience.