In testimony before the US Senate in 2018, Facebook chief executive Mark Zuckerberg said he was optimistic that within five to ten years, artificial intelligence would play a leading role in automatically detecting and moderating hate speech.
We got a taste of what AI-driven moderation looks like when Covid-19 forced Facebook to send its human content moderators home in March 2020. The results were not encouraging: greater quantities of child sexual abuse material, news articles erroneously marked as spam, and users temporarily unable to appeal moderation decisions.
But the conversation about automated moderation still focuses mostly on text, even though images, videos and podcasts are powerful drivers of disinformation. According to research First Draft recently published on vaccine-related narratives, photos and other visuals accounted for at least half of the misinformation studied.
One consequence of this text-centric focus is that platforms geared toward audiovisual content, such as YouTube, TikTok and Spotify, have faced comparatively little scrutiny. Part of this is a structural issue, says Evelyn Douek, a lecturer at Harvard Law School. Watching visual media is more time-intensive, and the journalists and researchers who write about platform moderation spend more of their time on text-based platforms like Facebook and Twitter. “Text is easier to analyze and search than audiovisual content, but it’s not at all clear that that disproportionate focus is actually justified,” Douek says.
Unlike Zuckerberg or Twitter’s Jack Dorsey, YouTube’s CEO, Susan Wojcicki, has not been called in for questioning by the Senate Judiciary Committee, even though many of the company’s moderation policies have been opaque and the subject of criticism. For example, YouTube recently announced that it would remove videos claiming that voter fraud altered the outcome of the US presidential election, only to later tell some users how to post similar content without having it removed. Others were quick to point out that enforcement of the policy seems to hinge on inconsistent semantic distinctions.
We know even less about moderation on TikTok, a platform expected to top one billion monthly active users in 2021, and one that once instructed its moderators to remove content from people deemed to have an “abnormal body shape” or poor living conditions.
When audiovisual disinformation is both widespread and difficult to define, how prepared are the platforms to moderate this type of content, and how equipped are researchers to understand the consequences of that moderation? And are we as confident as Zuckerberg that AI will help?
Automated platform moderation comes in two main forms: “matching” technology, which uses a database of already-flagged content to catch copies, and “predictive” technology, which assesses new content for possible rule violations. We know relatively little about the predictive tech at the major platforms. Matching technology, on the other hand, has had some notable successes, but with important caveats.
One of the first systems for moderating audiovisual content at scale was a matching technology called PhotoDNA, developed to detect child sexual abuse imagery. Adopted by most of the major platforms, PhotoDNA is largely viewed as a success, in part because the boundaries around what constitutes child sexual abuse material are relatively clear, but also because of the clear payoff in taking down this content.
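PhotoDNA’s exact algorithm is proprietary, but the matching pattern it embodies is straightforward to sketch: compute a robust fingerprint, or hash, of each image and compare it against a database of hashes of content that has already been flagged. The example below is a minimal illustration of that general idea only, not PhotoDNA itself; it uses the open-source imagehash library, and the file paths and distance threshold are hypothetical.

```python
# Minimal sketch of hash-based "matching" moderation. This is not PhotoDNA;
# it uses the open-source imagehash library (pip install pillow imagehash)
# to show the general compare-against-known-content pattern.
from PIL import Image
import imagehash

# Hypothetical database: perceptual hashes of images already flagged by moderators.
FLAGGED_HASHES = [
    imagehash.phash(Image.open(path))
    for path in ("flagged/known_bad_1.jpg", "flagged/known_bad_2.jpg")  # placeholder paths
]

def matches_flagged_content(upload_path: str, max_distance: int = 6) -> bool:
    """Return True if an upload is a near-duplicate of already-flagged content.

    max_distance is an assumed Hamming-distance threshold: 0 means the hashes
    are identical; small values tolerate re-compression, resizing and minor edits.
    """
    upload_hash = imagehash.phash(Image.open(upload_path))
    return any(upload_hash - known <= max_distance for known in FLAGGED_HASHES)

if matches_flagged_content("uploads/new_image.jpg"):  # placeholder path
    print("Near-duplicate of flagged content: queue for removal or human review.")
```

Production systems such as PhotoDNA and the GIFCT hash-sharing database work on the same principle, with purpose-built hashes that are far more robust to cropping and re-encoding, and with far larger databases. Predictive technology has the much harder job of classifying content the system has never seen before.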
Similar technology was later developed to identify content that might be connected to suspected terrorist activity and to collect it in a database run by the Global Internet Forum to Counter Terrorism (GIFCT), a collaboration of technology companies, governments, civil society and academia. Following the 2019 Christchurch shooting, which was livestreamed on Facebook, there was renewed government support for GIFCT, as well as growing pressure for more social platforms to join.
Yet even GIFCT’s story highlights huge governance, accountability and free speech challenges for platforms. For one, the technology struggles to interpret context and intent, and with looser definitions of what constitutes terrorist content, there is more room for false positives and oppressive tactics. Governments can use the “terrorist” label to silence political opposition, and GIFCT’s particular focus on the Islamic State and Al-Qaeda puts content from Muslims and Arabs at greater risk of over-removal.
Data from GIFCT’s recent transparency report also shows that 72 per cent of the content in its database falls under “Glorification of Terrorist Acts,” a broad category that could include some forms of legal and legitimate speech. This underscores one of the biggest challenges for platforms looking to moderate audiovisual content at scale: Legal speech is a lot harder to moderate than illegal speech. Why? Because even though the boundaries of illegal speech are fuzzy, they are at least defined in law.
However, the fact that speech is legal does not make it acceptable, and the boundaries of acceptable legal speech are bitterly contested. This poses an extra challenge for moderating audiovisual misinformation. Unlike child sexual abuse material, misinformation is not a crime (some countries have criminalized it, but in most cases its definition remains vague). It is also extremely difficult to define, even if you try to set parameters around categories defined by “authoritative” sources, in the way that PhotoDNA and GIFCT match content against an existing database.
For example, in October 2020, YouTube said it would remove any content about Covid-19 vaccines that contradicted the consensus of local health authorities or the World Health Organization. But as we’ve learned, identifying consensus in a rapidly developing public health crisis isn’t easy. The WHO at first said face masks did not protect the wearer, amid a global shortage of PPE, before reversing course months later.
Which is to say: even though automated moderation has had mixed results with certain types of content, moderating at scale, particularly when it comes to visual disinformation, poses enormous challenges. Some of these challenges are technological, such as how to accurately interpret context and intent. Others are questions of governance, such as whether platforms should be moderating legal speech at all, and in what circumstances.
To be sure, this is not an argument for harsher content moderation or longer working hours for human content moderators (who are mostly contractors, and who already have to watch some of the most traumatizing content on the internet for less pay and fewer protections than full-time employees). What we do need is more research about visual disinformation and the platforms where it spreads. But first we’ll need more answers from the platforms about how they moderate such content.