In my Slate article “Sex, Violence, and Autocomplete Algorithms,” I use a reverse-engineering methodology to better understand what kinds of queries get blocked by Google and Bing’s autocomplete algorithms. In this post I want to pull back the curtains a bit to talk about my process as well as add some context to the data that I gathered for the project.
To measure what kinds of sex terms get blocked I first found a set of sex-related words that are part of a larger dictionary called LIWC (Linguistic Inquiry and Word Count) which includes painstakingly created lists of words for many different concepts like perception, causality, and sex among others. It doesn’t include a lot of slang though, so for that I augmented my sex-word list with some more gems pulled from the Urban Dictionary, resulting in a list of 110 words. The queries I tested included the word by itself, as well as in the phrase “child X” in an attempt to identify suggestions related to child pornography.
For the violence-related words that I tested, I used a set of 348 words from the Random House “violent actions” list, which includes everything from the relatively innocuous “bop” to the more ruthless “strangle.” To construct queries I put the violent words into two phrases: “How to X” and “How can I X.”
Obviously there are many other words and permutations of query templates that I might have used. One of the challenges with this type of project is how to sample data and where to draw the line on what to collect.
With lists of words in hand the next step was to prod the APIs of Google and Bing to see what kind of autocompletions were returned (or not) when queried. The Google API for autocomplete is undocumented, though I found and used some open-source code that had already reverse engineered it. The Bing API is similarly undocumented, but a developer thread on the Bing blog mentions how to access it. I constructed each of my query words and templates and, using these APIs, recorded what suggestions were returned.
An interesting nuance to the data I collected is that both APIs return more responses than actually show up in either user interface. The Google API returns 20 results, but only shows 4 or 10 in the UI depending on how preferences are set. The Bing API returns 12 results but only shows 8 in the UI. Data returned from the API that never appears in the UI is less interesting since users will never encounter it in their daily usage. But, I should mention that it’s not entirely clear what happens with the API results that aren’t shown. It’s possible some of them could be shown during the personalization step of the algorithm (which I didn’t test).
The queries were run and data collected on July 2nd, 2013, which is important to mention since these services can change without notice. Indeed, Google claims to change its search algorithm hundreds of times per year. Autocomplete suggestions can also vary by geography or according to who’s logged in. Since the APIs were accessed programmatically, and no one was logged in, none of the results collected reflect any personalization that the algorithm performs. However, the results may still reflect geography since figuring out where your computer is doesn’t require a log in. The server I used to collect data is located in Delaware. It’s unclear how Google’s “safe search” settings might have affected the data I collected via their API. The Bing spokesperson I was in touch with wrote, “Autosuggest adheres to a ‘strict’ filter policy for all suggestions and therefore applies filtering to all search suggestions, regardless of the SafeSearch settings for the search results page.”
In the spirit of full transparency, here is a .csv to all of the queries and responses that I collected.