
Fact-Checking at Scale

Note: this is cross-posted on the CUNY Tow-Knight Center for Entrepreneurial Journalism site. 

Over the last decade there's been substantial growth in the use of fact-checking to correct misinformation in the public sphere. Outlets like FactCheck.org and Politifact tirelessly research and assess the accuracy of all kinds of information and statements from politicians and think tanks. But a casual perusal of these sites shows that there are usually only one or two fact-checks per day from any given outlet. Fact-checking is an intensive research process that demands considerable skilled labor and careful consideration of potentially conflicting evidence. For a task that's so labor-intensive, how can we scale it so that the truth is spread far and wide?

Of late, Politifact has expanded by franchising its operations to states – essentially increasing the pool of trained professionals participating in fact-checking. It’s a good strategy, but I can think of at least a few others that would also grow the fact-checking pie: (1) sharpen the scope of what’s fact-checked so that attention is where it’s most impactful, (2) make use of volunteer, non-professional labor via crowdsourcing, and (3) automate certain aspects of the task so that professionals can work more quickly. In the rest of this post, I’ll flesh out each of these approaches in a bit more detail.

Reduce Fact-Checking Scope
“I don’t get to decide which facts are stupid … although it would certainly save me a lot of time with this essay if I were allowed to make that distinction.” argues Jim Fingal in his epic fact-check struggle with artist-writer John D’Agata in The Lifespan of a Fact. Indeed, some of the things Jim checks are really absurd: did the subject take the stairs or the elevator, did he eat “potatoes” or “french fries”; these things don’t matter to the point of that essay, nor, frankly, to me as the reader.

Fact-checkers, particularly the über-thorough kind employed by magazines, are tasked with assessing the accuracy of every claim or factoid written in an article (see the Fact Checker's Bible for more). This includes hard facts like names, stats, geography, and physical properties, as well as what sources claim via a quotation or what the author writes from notes. Depending on the nature of the claim, some of it may be subjective, opinion-based, or anecdotal. All of this checking is meant to protect the reputation of the publication and its writers, and to maintain trust with the public. But it's a lot to check, and the imbalance between content volume and critical attention will only grow.

To economize their attention, fact-checkers might better focus on overall quality; who cares whether they're "potatoes" or "french fries"? In information science studies, the notion of quality can be defined as the "value or 'fitness' of the information to a specific purpose or use." If quality is really what we're after, then fact-checking would be better served and more efficacious if the precious attention of fact-checkers were focused on claims that have some utility: the claims that, if false, could affect the outcome of some event or an important decision. I'm not saying accuracy doesn't matter, it does, but fact-checkers might focus more energy on information that impacts decisions. For health information this might involve spending more time researching claims that affect health-care options and choices; for finance, it would involve checking information that informs decisions about portfolios and investments. And for politics, it involves checking information that is important for people's voting decisions, something the likes of Politifact already focus on.

Increased Use of Volunteer Labor
Another approach to scaling fact-checking is to incorporate more non-professionals, the crowd, in the truth-seeking endeavor. This is something often championed by social media journalists like Andy Carvin, who see truth-seeking as an open process that can involve asking for (and then vetting) information from social media participants. Mathew Ingram has written about how platforms like Twitter and Reddit can act as crowdsourced fact-checking platforms. And there have been several efforts toward systematizing this, notably the TruthSquad, which invited readers to post links to factual evidence that supports or opposes a single statement. A professional journalist would then write an in-depth report based on their own research plus whatever research the crowd contributed. I will say I’m impressed with the kind of engagement they got, though sadly it’s not being actively run anymore.

But it's important to step back and think about what the limitations of the crowd in this (or any) context really are. Graves and Glaisyer remind us that we still don't really know how much an audience can contribute via crowdsourced fact-checking. Recent information quality research by Arazy and Kopak gives us some clues about which dimensions of quality may be more amenable to crowd contributions. In their study, they looked at how consistent ratings of various Wikipedia articles were along the dimensions of accuracy, completeness, clarity, and objectivity. They found that, while none of these dimensions had particularly consistent ratings, completeness and clarity were more reliable than objectivity or accuracy. This is probably because it's easier to use a heuristic or shortcut to assess completeness, whereas rating accuracy requires specialized knowledge or research skill. So, if we're thinking about scaling fact-checking with a pro-am model, we might have the crowd focus on aspects of completeness and clarity, but leave the difficult accuracy work to the professionals.
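As a rough sketch of how that kind of rating consistency might be checked, here's a minimal Python example. The dimension names mirror those in the study, but the ratings, the 1–5 scale, and the simple pairwise-agreement measure are invented for illustration; they are not Arazy and Kopak's data or method.

```python
import itertools
import numpy as np

# Hypothetical crowd ratings of one article on a 1-5 scale.
# Each key is a quality dimension; each list holds one rating per rater.
ratings = {
    "accuracy":     [2, 5, 3, 4, 1],
    "completeness": [4, 4, 5, 4, 4],
    "clarity":      [5, 4, 4, 5, 4],
    "objectivity":  [1, 3, 5, 2, 4],
}

def mean_pairwise_agreement(scores, scale_range=4):
    """Average pairwise similarity of ratings, normalized to [0, 1].

    1.0 means all raters gave identical scores; lower values mean
    the crowd disagrees more on that dimension.
    """
    pairs = list(itertools.combinations(scores, 2))
    diffs = [abs(a - b) / scale_range for a, b in pairs]
    return 1.0 - float(np.mean(diffs))

for dimension, scores in ratings.items():
    print(f"{dimension:12s} agreement = {mean_pairwise_agreement(scores):.2f}")
```

In a pro-am workflow, dimensions whose agreement holds up under a check like this (here, completeness and clarity) are the natural candidates to hand to the crowd, while low-agreement dimensions like accuracy stay with the professionals.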

#Winning with Automation
I’m not going to fool anyone by claiming that automation or aggregation will fully solve the fact-checking scalability problem. But there may be bits of it that can be automated, at least to a degree where it would make the life of a professional fact-checker easier or make their work go faster. An automated system could allow any page online to be quickly checked for misinformation. Violations could be flagged and highlighted, either for lack of corroboration or for controversy, or the algorithm could be run before publication so that a professional fact-checker could take a further crack at it.

Hypothetical statements, opinions and matters of taste, or statements resting on complex assumptions may be too hairy for computers to deal with. But we should be able to automatically both identify and check hard-facts and other things that are easily found in reference materials. The basic mechanic would be one of corroboration, a method often used by journalists and social scientists in truth-seeking. If we can find two (or more) independent sources that reinforce each other, and that are credible, we gain confidence in the truth-value of a claim. Independence is key, since political, monetary, legal, or other connections can taint or at least place contingencies on the value of corroborated information.

There have already been a handful of efforts in the computing research literature that have looked at how to do algorithmic corroboration. But there is still work to do to define adequate operationalizations so that computers can do this effectively. First of all, we need to define, identify, and extract the units that are to be corroborated. Computers need to be able to differentiate a factually stated claim from a speculative or hypothetical one, since only factual claims can really be meaningfully corroborated. In order to aggregate statements we then need to be able to match two claims together while taking into account different ways of saying similar things. This includes the challenge of context, the tiniest change in which can alter the meaning of a statement and make it difficult for a computer to assess the equivalence of statements. Then, the simplest aggregation strategy might consider the frequency of a statement as a proxy for its truth-value (the more sources that agree with statement X, the more we should believe it), but this doesn't take into account the credibility of the source or their other relationships, which also need to be enumerated and factored in. We might want algorithms to consider other dimensions such as the relevance and expertise of the source to the claim, the source's originality (or lack thereof), the prominence of the claim in the source, and the source's spatial or temporal proximity to the information. There are many challenges here!
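To make the aggregation step a bit more concrete, here is a minimal sketch of what a credibility- and independence-weighted corroboration score might look like, as opposed to a raw frequency count. The field names, the weights, and the 0.25 discount for syndicated copies are all invented for illustration, not a tested model.

```python
from dataclasses import dataclass

@dataclass
class SourceStatement:
    source: str         # who published the statement
    stance: int         # +1 supports the claim, -1 contradicts it
    credibility: float  # 0..1, prior trust in the source
    relevance: float    # 0..1, how relevant/expert the source is on this topic
    original: bool      # False if the statement is just syndicated/copied

def corroboration_score(statements):
    """Weighted corroboration instead of a raw agree/disagree count.

    Copies of the same reporting get a reduced weight so that one wire
    story republished ten times doesn't look like ten independent sources.
    """
    score = 0.0
    for s in statements:
        weight = s.credibility * s.relevance
        if not s.original:
            weight *= 0.25  # discount non-independent, syndicated copies
        score += s.stance * weight
    return score

statements = [
    SourceStatement("outlet-a", +1, credibility=0.9, relevance=0.8, original=True),
    SourceStatement("outlet-b", +1, credibility=0.6, relevance=0.7, original=False),
    SourceStatement("blog-c",   -1, credibility=0.4, relevance=0.5, original=True),
]
print(f"corroboration score: {corroboration_score(statements):+.2f}")
```

A strongly positive score suggests corroboration, a strongly negative one suggests contradiction, and a score near zero with sources on both sides flags the controversy case.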

Any automated corroboration method would rely on a corpus of information that acts as the basis for corroboration. Previous work like DisputeFinder has looked at scraping or accessing known repositories such as Politifact or Snopes to jump-start a claims database, and other work like Videolyzer has tried to leverage engaged people to provide structured annotations of claims. Others have proceeded by using the internet as a massive corpus. But there could also be an opportunity here for news organizations, who already produce and have archives of lots of credible and trustworthy text (e.g. rigorously fact-checked magazines), to provide a corroboration service based on all of the claims embedded in those texts. Could news organizations even make money by syndicating their archives like this?
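As a sketch of what the lookup against such a corpus might involve, here is a toy example that uses TF-IDF cosine similarity as a crude stand-in for real claim matching. The archive sentences and the similarity threshold are invented; a production system would need paraphrase detection and the context handling discussed above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Stand-in "archive" of previously published claims; in practice this would
# come from a news organization's corpus or a repository like Politifact.
archive = [
    "The unemployment rate fell to 8.3 percent in January.",
    "The city council approved the transit budget on Tuesday.",
    "The drug was approved by the FDA in 2006.",
]

def find_corroborating_claims(claim, corpus, threshold=0.3):
    """Return archive claims whose TF-IDF cosine similarity to the input
    claim exceeds a threshold. A crude proxy for real claim matching."""
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform([claim] + corpus)
    sims = cosine_similarity(matrix[0:1], matrix[1:]).flatten()
    return [(corpus[i], float(s)) for i, s in enumerate(sims) if s >= threshold]

print(find_corroborating_claims(
    "Unemployment dropped to 8.3 percent in January.", archive))
```

The hits such a lookup returns would then feed a weighted aggregation like the one sketched above, rather than being treated as verdicts on their own.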

There are of course other challenges to fact-checking that also need to be surmounted, such as the user-interface for presentation or how to effectively syndicate fact-checks across different media. In this essay I’ve argued that scale is one of the key challenges to fact-checking. How can we balance scope with professional, non-professional, and computerized labor to get closer to the truth that really matters?

 

Moving Towards Algorithmic Corroboration

Note: this is cross-posted on the Berkman/MIT “Truthiness in Digital Media” blog

One of the methods that truth seekers like journalists or social scientists often employ is corroboration. If we find two (or more) independent sources that reinforce each other, and that are credible, we gain confidence in the truth-value of a claim. Independence is key, since political, monetary, legal, or other connections can taint or at least place contingencies on the value of corroborated information.

How can we scale this idea to the web by teaching computers to effectively corroborate information claims online? An automated system could allow any page online to be quickly checked for misinformation. Violations could be flagged and highlighted, either for lack of corroboration or for a multi-faceted corroboration (i.e. a controversy).

There have already been a handful of efforts in the computing research literature that have looked at how to do algorithmic corroboration. But there is still work to do to define adequate operationalizations so that computers can be effective corroborators. First of all, we need to define and extract the units that are to be corroborated. Computers need to be able to differentiate a factually stated claim from a speculative or hypothetical one, since only factual claims can really be meaningfully corroborated. In order to aggregate statements we then need to be able to match two claims together while taking into account different ways of saying similar things. This includes the challenge of context, the tiniest change in which can alter the meaning of a statement and make it difficult for a computer to assess the equivalence of statements. Then, the simplest aggregation strategy might consider the frequency of a statement as a proxy for its truth-value (the more sources that agree with statement X, the more we should believe it), but this doesn't take into account the credibility of the source or their other relationships, which also need to be enumerated and factored in. We might want algorithms to consider other dimensions such as the relevance and expertise of the source to the claim, the source's originality (or lack thereof), the prominence of the claim in the source, and the source's spatial or temporal proximity to the information. There are many research challenges here!
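Here is a toy illustration of that first filtering step: separating checkable factual statements from hedged or hypothetical ones using simple lexical cues. A real system would need a trained classifier; the cue list below is an invented heuristic, not a vetted resource.

```python
import re

# Lexical cues that often mark speculation, opinion, or hypotheticals.
HEDGE_CUES = [
    r"\bmight\b", r"\bcould\b", r"\bmay\b", r"\bwould\b", r"\bif\b",
    r"\bperhaps\b", r"\bI think\b", r"\bin my opinion\b", r"\ballegedly\b",
]

def looks_factual(sentence):
    """Crude filter: treat a sentence as a checkable factual claim only if
    it contains none of the hedge/speculation cues above."""
    return not any(re.search(cue, sentence, flags=re.IGNORECASE)
                   for cue in HEDGE_CUES)

sentences = [
    "The senator voted against the bill in March.",
    "The senator might reconsider if the polls shift.",
]
for s in sentences:
    label = "checkable claim" if looks_factual(s) else "speculative/hypothetical"
    print(f"{label}: {s}")
```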

Any automated corroboration method would rely on a corpus of information that acts as the basis for corroboration. Previous work like DisputeFinder has looked at scraping known repositories such as Politifact or Snopes to jump-start a claims database, and other work like Videolyzer has tried to leverage engaged people to provide structured annotations of claims, though it’s difficult to get enough coverage and scale through manual efforts. Others have proceeded by using the internet as a massive corpus. But there could also be an opportunity here for news organizations, who already produce and have archives of lots of credible and trustworthy text, to provide a corroboration service based on all of the claims embedded in those texts. A browser plugin could detect and highlight claims that are not corroborated by e.g. the NYT or Washington Post corpora. Could news organizations even make money off their archives like this?

It’s important not to forget that there are limits to corroboration too, both practical and philosophical. Hypothetical statements, opinions and matters of taste, or statements resting on complex assumptions may not benefit at all from a corroborative search for truth. Moreover, systemic bias can still go unnoticed, and a collective social mirage can guide us toward fonts of hollow belief when we drop our critical gaze. We’ll still need smart people around, but, I would argue, finding effective ways to automate corroboration would be a huge advance and a boon in the fight against a misinformed public.

Videolyzing Pharmaceutical Ads

There are just two countries in the world where Direct-To-Consumer (DTC) advertising is allowed for pharmaceuticals: the US and New Zealand. The ostensible motivation? To educate consumers, raise awareness of medical conditions, get people talking to their doctors, or reduce the stigma associated with certain conditions (e.g. Viagra).

Since the laws changed back in 1997 in the US, opening the floodgates for big pharma to peddle its wares directly to patients, there has been a debate about the efficacy and value of DTC advertising. Even today the FDA lists several ongoing studies evaluating the understandability and effects of DTC advertising. But the debate is political too. Congress has recently started floating proposals to limit the marketing powers of pharmaceutical companies for the first two years after a drug has been approved by the FDA. This would give regulators additional time to evaluate a new drug's broader risks once it is available on the market.

Drugs aren't the only DTC advertising issue generating controversy, either. DTC medical device advertising is already stirring debate about the ethics of advertising products to people who can't possibly understand the medical risks and decisions involved in a medical device implant.

This is not to mention that DTC could be pushing up the overall costs of health care by directing people toward brand name “designer” drugs that may not be any more effective than alternative treatments. Obama’s $1 billion stimulus funding for Comparative Effectiveness Research (CER) should help with this somewhat by doing real comparisons of which treatments are “worth it” both in $$$ and patient value.

But big pharma is big business. Huge sums of money are invested in pharmaceutical advertising ($5.2 billion in 2007), with spending growing at an annual rate of about 20% from 1997 to 2005. And with huge returns on investment, who can blame big pharma for wanting to drive traffic for new drugs by going straight to the people who would need treatment? The birth-control pill Yaz increased its sales from $262 million in 2007 to $616 million in 2008 on the strength of a few high-profile (and misleading) broadcast ads.

Misleading or inaccurate information could lead consumers to make poor health decisions, or to take risks that they may not fully understand. So how does the government keep consumers safe and pharmaceutical advertisers honest? Right now the process is managed by the FDA Division of Drug Marketing, Advertising, and Communications (DDMAC). Advertisers are required to submit promotional materials to the DDMAC when they are first used or published, but not before. This means the FDA's role is purely to "check up on" what advertisers publish, ex post facto. Ads can circulate for months before they are critiqued and evaluated. And if an ad is found to be misleading, the FDA sends a warning letter to the offender asking them to retract it. That's it, most of the time.

What does the FDA check? According to its website, "advertisements cannot be false or misleading or omit material facts. They also must present a fair balance between effectiveness and risk information. FDA has consistently required that appropriate communication of effectiveness information includes any significant limitations to product use." The FDA requires that all drug advertisements contain a "brief summary" of information relating to side effects, contraindications, and effectiveness. For instance, the law states that an advertisement may be false, lacking in fair balance, or otherwise misleading if it "fails to present information relating to side effects and contraindications with a prominence and readability reasonably comparable with the presentation of information relating to effectiveness of the drug, taking into account all implementing factors such as typography, layout, contrast, headlines, paragraphing, white space, and any other techniques apt to achieve emphasis." The FDA also has a very specific set of guidelines for how ads can be used in the video domain, including different categories of ads such as "product-claim ads," "reminder ads," and "information seeking" ads.

The current FDA procedures for evaluating DTC video (broadcast) ads are wholly unwieldy. They include the submission of TEN (!!!!) copies of an annotated storyboard, with each frame sequentially numbered and accompanied by annotated references and prescribing information (PI) supporting the claims. Isn't there a better way to do this?

This got me thinking about how an application like Videolyzer, which I originally built as a tool for bloggers and journalists to critique and debate online video, could be used by someone like the FDA (or the pharma companies) to streamline and digitize the evaluation and sourcing of video advertisements. This is in addition to existing journalism outfits, like Consumer Reports Ad Watch, which could use the tool to add context back to an overly curt video advertisement. Yaz, a birth control pill marketed by Bayer, gained notoriety in late 2008 for two ads that were deemed misleading by the FDA and for which Bayer had to run corrective ads in 2009. I've added the original version of one of the Yaz ads to Videolyzer for anyone interested in seeing how the tool can be used to critique a pharmaceutical ad.

Fact Checking Source Contextualization

I ran across this round-up of some of the most prominent political fact-checking sites online, including the non-partisan FactCheck, Politifact, and the Washington Post Fact Checker Blog, as well as their partisan counterparts, Newsbusters and MediaMatters. One of my criticisms of such sites is that the fact-checking is often decontextualized from the original document, especially for multimedia such as video. The presentation is usually a block of text explaining the "fact" in question. But what's missing is the context of the claim or statement within the original source document. A far more compelling information interface would present an annotated document, so that segments of the document (or video) are precisely delineated and critiqued. This is something I worked into the Videolyzer for video and text, but more generally this type of thing needs to happen for all fact-checked texts online.
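As a sketch of the kind of data structure this implies, here is what an anchored fact-check might look like, with character offsets for text and timecodes for video. The field names and example values are hypothetical and are not Videolyzer's actual schema.

```python
from dataclasses import dataclass, asdict
from typing import Optional
import json

@dataclass
class AnchoredFactCheck:
    """A fact-check tied to a precise segment of the source document,
    rather than floating alongside it as a detached block of text."""
    source_url: str
    quote: str                          # the exact claim text being checked
    char_start: Optional[int] = None    # offsets for text documents
    char_end: Optional[int] = None
    time_start: Optional[float] = None  # seconds, for video/audio sources
    time_end: Optional[float] = None
    verdict: str = "unrated"            # e.g. "accurate", "misleading", "false"
    evidence_urls: tuple = ()

annotation = AnchoredFactCheck(
    source_url="http://example.com/speech-video",
    quote="The deficit has been cut in half.",
    time_start=73.5,
    time_end=79.0,
    verdict="misleading",
    evidence_urls=("http://example.com/budget-report",),
)
print(json.dumps(asdict(annotation), indent=2))
```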