Detecting Spin in Political Speeches

New Scientist ran a piece last week about some text analysis technology that’s being developed at Queen’s University to detect “spin” in political speeches. The story first hit a few years ago when they were using these algorithms to analyze text in the Canadian elections, but they’re at the fore again with the American political season heating up. The text analysis algorithms were developed by David Skillicorn and Ayron Little, and they look at the frequencies of different kinds of words to glean information about how truthful the speaker is being. The model they used was developed by James Pennebaker at UT Austin and includes the following predictors of deception: (1) a decreased frequency of first-person singular pronouns, (2) a decreased frequency of exception words, such as ‘however’ and ‘unless’, (3) an increased frequency of negative emotion words, and (4) an increased frequency of action words.
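To make those four predictors concrete, here is a rough Python sketch of what counting them might look like. The word lists, the function names `feature_rates` and `spin_score`, and the equal weighting are all my own assumptions for illustration. The article doesn't give the actual lexicons or weights, so this is not the researchers' implementation.

```python
import re
from collections import Counter

# Hypothetical, tiny word lists for illustration only. The lexicons the
# real model uses are not given in the article.
CATEGORIES = {
    "first_person": {"i", "me", "my", "mine", "myself"},
    "exception": {"however", "unless", "but", "except", "without"},
    "negative_emotion": {"hate", "afraid", "angry", "sad", "worthless"},
    "action": {"go", "going", "lead", "take", "move", "run"},
}


def feature_rates(text):
    """Return each category's share of all word tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    counts = Counter(tokens)
    return {
        name: (sum(counts[w] for w in words) / total if total else 0.0)
        for name, words in CATEGORIES.items()
    }


def spin_score(rates):
    """Combine the four cues with assumed equal weights.

    Cues said to increase with deception add to the score; cues said to
    decrease with deception subtract from it.
    """
    return (
        rates["negative_emotion"]
        + rates["action"]
        - rates["first_person"]
        - rates["exception"]
    )


if __name__ == "__main__":
    sample = "I hate to go, however I will go."
    rates = feature_rates(sample)
    print(rates)
    print(spin_score(rates))
```

Even in a toy like this, the result depends entirely on which words land in which list and how the cues are weighted, and a score only means something relative to other speeches scored the same way.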

This is a fun story and interesting technology, but we really can’t take it too seriously. First of all, I would be leery of trusting how accurately the computer-extracted linguistic features predict “spin.” The model was not developed to detect spin, but deception more generally. Can we assume the circumstances under which they were studying deception are the same as those that give rise to political speeches? Not least among the differences is that politicians routinely have other people write their speeches for them. It’s a stretch to say that the original research generalizes here. The study itself admits that the features may vary depending on the content domain. Even the features they did extract could only predict deception 61% of the time. This is better than chance, but not by much, and only for the content they looked at in their study. Generalizable to politics? Unclear.

This kind of story brings up another issue that I’m interested in: the transparency of algorithms for doing content analysis. Whenever computers are making categorizations for us, they’re making their decisions based on some imperfect rules. Well, what exactly ARE those rules? We need to know those rules and assumptions before we can really evaluate the results. Caveats, people, come on!