By Heather Hansman • July 30, 2014 • Pacific Standard
Would you say that you’re happy?
Researchers find a universal positivity bias in the way we talk, tweet, and write.
That’s actually a leading question; you already do. Or at least people in general do. In different languages, and across different modes of communication—from Twitter to A Tale of Two Cities—we use more positive words than negative ones. We talk, write, and tweet with glee, and we always have.
A group of applied mathematicians and computer scientists at the University of Vermont, fascinated with tracking happiness, just published a study that quantifies how much more frequently we use happy words than sad ones. As it turns out, it’s a lot. In the study, the group looked at 100,000 words in 24 corpora across 10 languages and found that we have a global, long-term tendency toward positivity.
They’re not the first people to look at a universal positivity bias. In the late 1970s two psychologists, Margaret Matlin and David Stang, put forward a theory they called the Pollyanna principle, which holds that we subconsciously lean toward positive thinking and language, even when we’re consciously focusing on negative things. Basically, our brains want to be happy. It’s a nice idea, but, since happiness is abstract, Matlin and Stang never settled on any widely accepted way to prove it.
That’s where the UVM math nerds come in. Since 2009, they’ve been looking at ways to quantify happiness, and have decided that language is the most tangible way to do so. Chris Danforth, a mathematician with a background in chaos theory, leads the team along with Peter Dodds, whose research focuses on sociotechnical problems. Danforth says the team is trying to measure population-scale happiness as a baseline to improve quality of life. They think happiness is just as important as GDP, or other frequently tracked measurements of well-being. They’re essentially trying to solve a social problem with a math problem. “Happiness is a hard thing to quantify,” Danforth says. “It’s quite hard to improve something you can’t measure, so we’re trying to create an instrument capable of quantifying happiness on a large scale.”
To track happiness they had to figure out what signaled the feeling and then decide how best to measure that. That ability to track emotion, which is part of a broader field called sentiment analysis, is a nut that everyone from Facebook to the National Security Agency (NSA) is trying to crack, and Dodds and Danforth believe they have found a granular way to do it.
First, they sorted out the 5,000 most commonly used words in English. They asked people to rate how positive those words were on a scale of one to nine. Subjects were given a list of words and images of 10 stick figures, which showed a range of emotion from sad to happy, and were asked to match each word to a figure. Words like “amazing” and “summer” rated highly positive, while “terrorist” rated negative. With that information in hand they could easily track how frequently positive words showed up in large bodies of language.
Deciphering word happiness isn’t exactly straightforward because language is a moving target—“sick,” for instance, can be highly negative or positive—which is why the team looked at a large pool of words, and why drawing on books made sense for their most recent study. They needed a big sample size to draw any significant conclusions. “The math behind it is actually quite basic,” says Kameron Decker Harris, one of the grad students who worked on the project.
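To see how basic that math can be, here is a minimal sketch in Python of one way to score a text: average the ratings of its words, weighted by how often each word appears. The ratings, the function names, and the tokenizing rule are all made up for illustration; they are assumptions, not the team's word list or code.

```python
import re
from collections import Counter

# Hypothetical ratings on the one-to-nine scale described in the article.
# These numbers are invented for illustration, not taken from the study.
RATINGS = {"amazing": 8.0, "summer": 7.5, "terrorist": 1.5}


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def average_happiness(text, ratings=RATINGS):
    """Frequency-weighted mean rating of the rated words in `text`.

    Words with no rating are skipped. Returns None if nothing was rated.
    """
    counts = Counter(word for word in tokenize(text) if word in ratings)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(ratings[word] * n for word, n in counts.items()) / total


if __name__ == "__main__":
    print(average_happiness("An amazing summer. Truly amazing!"))
```

Because every occurrence counts, a word used twice pulls the average twice as hard, which is why a large sample matters: one ambiguous "sick" gets drowned out by everything around it.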
Before their most recent study, the team looked at songs, blog posts, and State of the Union speeches, because those transcripts were easily available. Then, as Twitter became popular, they turned to that, to access a huge volume of real-time data. Using Twitter, the group can track levels of positivity, as well as where people are using those positive words, and what they’re talking about. They built what they call a hedonometer, which is a daily report of the geography and timing of happiness. Recently, negativity spiked around the conflict in Israel, and again when Germany beat Brazil in the World Cup. Vermont is currently the happiest state, they might argue, because the hedonometer shows Vermonters are using more positive words than anyone else in the country. For instance, statewide use of the word “shit” is way down this month.
The group’s most recent study, the one that looks specifically at books (from Moby Dick to The Adventures of Tom Sawyer), solves a few different problems they had discovered in their data. They wanted to track happiness over a longer period of time, and outside of outward-facing social media and blogs, where people might project happiness to maintain an image. Danforth says they also wanted to branch out beyond America. “We were looking to sharpen the resolution of our instrument by compiling a large list of happiness ratings for words in many languages,” he says. So they mined 24 corpora in 10 languages to see if we’ve always had a positivity bias, and if people in Germany or Japan are less effusive.
The happiness lean showed up consistently in all languages, and in books ranging from Alice in Wonderland to Ulysses. Using books, the group was also able to track the shape of stories, an old Vonnegut trope that proves to be true. Word choice reflects the emotional content of the books.
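One plausible way to trace the shape of a story from word choice is to slide a fixed-size window through the text and average the ratings inside it at each step. The sketch below, in Python, assumes that approach; the window size, the function name, and the ratings are hypothetical, and the article does not say this is how the group did it.

```python
def story_arc(words, ratings, window=2):
    """Average rating inside each sliding window of `window` words.

    Unrated words are ignored. A window with no rated words yields None.
    """
    arc = []
    for start in range(len(words) - window + 1):
        scores = [ratings[w] for w in words[start:start + window] if w in ratings]
        arc.append(sum(scores) / len(scores) if scores else None)
    return arc


if __name__ == "__main__":
    # Hypothetical ratings on a one-to-nine scale, invented for illustration.
    toy_ratings = {"happy": 8.0, "sad": 2.0}
    toy_story = ["happy", "happy", "sad", "sad"]
    print(story_arc(toy_story, toy_ratings))
```

On a real novel the window would span thousands of words rather than two, so that the curve follows the plot instead of jumping with every sentence.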
Danforth says that he was surprised at how consistent and widespread the positivity bias they found was. “I knew about the Pollyanna principle, our tendency to be subconsciously optimistic, but didn’t expect it to be so deeply ingrained in our modes of communication. I find it surprising that positive words are used more frequently not only in social media, where people may put on a positive face, but also in books, news articles, music lyrics, movie subtitles, etc. And the phenomenon appears to be independent of language. It really is baked in.”
So, if that positivity bias is ingrained, what’s the point of tracking happiness? And how does knowing how happy we say we are make us happier? Danforth says that his group has found that word usage frequencies correlate strongly with various health and demographic measures. When we say we’re happy we actually are happy, so by tracking when, where, and why people are happy, they can, theoretically, up those factors. Now that they have a baseline they’re looking at the long game. “I think the most practical application will be in measuring changes in societal health at a time scale relevant to policymakers,” he says.
~ Heather Hansman is a Seattle-based freelance writer and a former editor at Powder and Skiing. Follow her on Twitter @hhansman.
* * * * *
Here is the abstract of the full paper, via arXiv:1406.3855v1 [physics.soc-ph]. You can download the PDF of the paper at that link.
Peter Sheridan Dodds, Eric M. Clark, Suma Desu, Morgan R. Frank, Andrew J. Reagan, Jake Ryland Williams, Lewis Mitchell, Kameron Decker Harris, Isabel M. Kloumann, James P. Bagrow, Karine Megerdoomian, Matthew T. McMahon, Brian F. Tivnan, Christopher M. Danforth
(Submitted on 15 Jun 2014)
Using human evaluation of 100,000 words spread across 24 corpora in 10 languages diverse in origin and culture, we present evidence of a deep imprint of human sociality in language, observing that (1) the words of natural human language possess a universal positivity bias; (2) the estimated emotional content of words is consistent between languages under translation; and (3) this positivity bias is strongly independent of frequency of word usage. Alongside these general regularities, we describe inter-language variations in the emotional spectrum of languages which allow us to rank corpora. We also show how our word evaluations can be used to construct physical-like instruments for both real-time and offline measurement of the emotional content of large-scale texts.