I just came back from News Foo, an un-conference for technologists, academics and journalists in Phoenix on the future of news. The following post details my thoughts, heavily inspired by the conversations and sessions I had the privilege to be a part of.
There are a growing number of algorithms deciding which topics receive people’s attention. Algorithms are taking over the historical raison d’être of news editors, generating top news lists, hot trends and personalized recommendations. Algorithms carry a perception of neutrality, yet they encode political choices and have cultural values baked in. At a time when audience attention has become a scarce commodity, an algorithm’s ability to command user attention is real power within our media ecosystem. As curatorial power is handed over to automated systems, we must make sure that the public understands the biases at play and that product engineers are optimizing for the desired outcome – an informed public – not just whatever generates traffic.
Human vs. Algorithm
An algorithm is a finite list of instructions that a machine follows in order to compute a function. From simple counting operations to complex information sorting, a good algorithm is well thought through and well defined, producing the desired output in the least computationally complex manner. Algorithms are extremely good at scale. They can efficiently classify text across millions of documents in a fraction of a second, extract images of a certain type, and identify complex correlations between multiple data points. Recommendation systems such as the ones used by Netflix and Amazon employ algorithms that learn about users’ preferences from their actions and personalize the information presented to every user – a task impossible to complete manually.
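As a toy illustration of the underlying idea – far simpler than anything Netflix or Amazon actually runs – an item-to-item recommender can be built from nothing more than co-occurrence counts over users’ histories:

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_recommend(histories, user_items, n=3):
    """Toy item-to-item recommender: items frequently consumed
    together are suggested to users who have one but not the other.
    (A sketch of the general principle, not any company's system.)"""
    # Count how often each pair of items appears in the same history.
    counts = defaultdict(int)
    for history in histories:
        for a, b in combinations(sorted(set(history)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    # Score unseen items by their co-occurrence with the user's items.
    scores = defaultdict(int)
    for item in user_items:
        for (a, b), c in counts.items():
            if a == item and b not in user_items:
                scores[b] += c
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

Even in this minimal form, design choices (how to count, how to break ties, how many items to show) shape what every user ends up seeing.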
Algorithmically curated, personalized recommendations have become popular across digital media. “Most read articles” modules are based on simple math: the top 10 articles by page views. “Hottest articles” lists, on the other hand, are more ambiguous and vary based on what the organization defines as “hot”. Is it new content? Is it popular? Spiking? How far back does the comparison window reach? Are there whitelisted or blacklisted topics? What’s hot is an intuitive and very human assessment of an ecosystem, yet a mathematically complex formula – if it can be reproduced at all.
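The contrast can be sketched in a few lines of Python. The “hot” formula below is purely hypothetical – one of many possible definitions, not any organization’s actual one – and every constant in it is an editorial choice in disguise:

```python
def most_read(page_views, n=10):
    """'Most read' is simple math: rank articles by raw page views."""
    return sorted(page_views, key=page_views.get, reverse=True)[:n]

def hottest(recent_views, baseline_views, n=10, min_baseline=1):
    """One possible 'hot' formula (a hypothetical sketch): rank by how
    sharply an article's recent traffic spikes relative to its own
    historical baseline. The window sizes, the ratio itself, and the
    minimum baseline are all value judgements encoded as parameters."""
    scores = {}
    for article, recent in recent_views.items():
        baseline = max(baseline_views.get(article, 0), min_baseline)
        scores[article] = recent / baseline  # spike ratio
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

Note that the two lists can disagree completely: a steadily popular article dominates “most read” while a small article with a sudden spike tops “hottest”.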
Yet humans are still unbeatable at many types of tasks. Journalists and editors drive agendas, drawing on qualities that are difficult to capture in a formula: trust, excitement, impression and intuition. Humans aren’t always rational, and may trust a source despite a bad reputation. The intuition that an experienced editor or journalist brings to the table could never be replaced by automated formulas.
Algorithmic Bias vs. Perception of Neutrality
As soon as digital information providers add any form of curation and recommendation mechanisms (a common practice within social network spaces), the technology loses its neutrality. In some ways, “Twitter’s trending topics algorithm acts like a lot of human news editors, who are more interested in the latest news rather than ongoing stories”, says Tarleton Gillespie of Cornell University. Values are coded into the way these systems make recommendations:
- Twitter’s trending topics highlight novel events over events that slowly grow and simmer, making it very hard for a story like Occupy Wall Street to trend compared to events like Kim Kardashian’s wedding or Steve Jobs’s death, which trend easily.
- Google’s search algorithm was recently adjusted (the “freshness” update) to highlight fresh content, affecting some 35% of all search queries.
- Facebook is known to promote content that references any brand that is also one of their ad partners on people’s personal “walls”.
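The novelty bias in the first example can be made concrete with a short sketch – an assumption about the general approach such systems take, not Twitter’s actual algorithm:

```python
def trends(counts_by_hour, window=24, threshold=3.0):
    """Hypothetical novelty-biased trend detector: flag a topic when
    its latest hourly count spikes well above its own recent average.
    A topic that grows slowly raises its own baseline as it grows,
    so it never 'spikes' and never trends."""
    trending = []
    for topic, counts in counts_by_hour.items():
        history, latest = counts[-window - 1:-1], counts[-1]
        mean = sum(history) / len(history)
        if latest > threshold * max(mean, 1.0):
            trending.append(topic)
    return trending
```

A topic that jumps from a handful of mentions to hundreds in an hour trends immediately; one that doubles every day never clears the threshold against its own rising baseline – exactly the Occupy Wall Street dynamic described above.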
As these systems grow, a single engineer or product designer may not fully understand the logic behind all of the pieces that make up the whole. We’ve seen a number of examples where unintended consequences of algorithmically designed results led to awkward outcomes, such as Amazon’s $23,698,655.93 priced book about flies or Google’s past ‘Florida’ update, which had a catastrophic effect on a large number of websites, driving some small businesses into bankruptcy. Mike Ananny describes how the Android marketplace recommended the “Sex Offender Search” application to anyone interested in Grindr, a gay dating app. And most recently, Siri was unable to find abortion clinics in New York City.
These are not Google, Apple, Amazon or Twitter conspiracies, but rather the unexpected consequences of algorithmic recommendations being misaligned with people’s value systems and expectations of how the technology should work. The larger the gap between people’s expectations and the algorithmic output, the more user trust will be violated. Liz Strauss eloquently describes why she quit Klout, feeling cheated by an algorithm that constantly changes under her feet. She wanted to trust the algorithm, even through initial doubts, but broke down and quit after multiple algorithm changes.
As designers and builders of these technologies, we need to strike a fine balance: making sure our users understand enough about the choices we encode into our algorithms, but not so much that they can game the system. People’s perceptions affect trust. And once trust is violated, it is incredibly difficult to win back. There’s a misplaced faith in the algorithm – an assumption that the algorithm should accurately represent what we believe is true.
While it is clear to technologists that algorithms are biased, the general public perceives them as neutral. Someone at News FOO brought up the famous Rumsfeld quote, adding that it is the unknown unknowns that we should be most worried about. When people don’t know that they don’t know how the algorithms that govern their interfaces work, they may get burned, angry and blame the technology.
The Augmented Journalist
We need to be thinking about hybrid approaches. On the news production side, how do we utilize algorithms for scale while relying on journalists and editors for compelling narratives and thoughtful judgement? Algorithmic investigative journalism may hold a treasure trove of possibilities for new types of stories, where journalists use the output of a complex data query to feed their intuitions and draw conclusions from correlations in the data. Tom Lee at Sunlight Labs is doing an amazing job pushing projects that derive insight from big data, while Kris Hammond uses machines to write stories where automation is possible.
On the flip side, we need to make sure the general public has a better understanding of the algorithms at play, the algorithms that feed their attention, without giving away too much of the special sauce. We must come up with the right vocabulary to define editorial workflows, and work with engineers to code them into the algorithms. As danah boyd stressed during the session, it is important to be constantly thinking through what we’re optimizing for. The editor and journalist’s job is to inform the public. Is it possible to design and implement algorithms that optimize for an informed public? How do we even start to quantify a person’s level of “informed-ness”?
Pete Skomoroch raises a similar question: how do we strike the right balance between automated news personalization and curated, editorialized feeds? Advanced chess (or computer-assisted chess) is a relatively new form of chess, wherein each human player uses a computer chess program to help explore the possible results of candidate moves. The human players, despite this computer assistance, remain fully in control of what moves their “team” (of one human and one computer) makes. What would the augmented journalist or editor look like? How can technology and algorithms be used effectively in the newsroom to inform both journalists and the general public?
The conversation should not be focused on humans vs. algorithms, but rather how we utilize algorithms to take our media ecosystem to the next level.