Online research tools highlight ethical debate in the age of disinformation

July 17, 2019

Image: Pixabay

Journalists need a guiding set of principles for how they mine social media for public information.

Ever since the 2016 U.S. election brought us menacing refrains like “fake news” and “Russian meddling”, journalists, governments and technology companies have been searching for ways to fight disinformation.

But what if the solutions are leading us down a dubious ethical path? Some recent controversies over how investigative journalists and researchers use social media data highlight the tension between access to information and personal privacy, raising a number of ethical issues about how they monitor online accounts, what tools they build and how they’re funded. It is increasingly clear we need a guiding set of principles for how to tackle disinformation, or else we might compromise important privacy protections along the way.

This June, Facebook set off a big debate when it quietly limited access to an advanced search function called Graph Search. Put simply, Graph Search allowed you to identify accounts that fit incredibly specific parameters, for example: “people who live in Helsinki and like Kanye West”, or “people who work at U.S. Customs and Border Protection and like anti-immigrant Facebook pages.”

Actually performing these queries was fairly technical without the help of an external tool to automate the process, which is why the average Facebook user may have never heard of Graph Search. But a range of people used it: private investigators, fact checkers, human rights activists looking into war crimes – essentially, anyone who has a stake in verifying information on social media.

So when Facebook took away the function without warning, some complained that the platform was blocking access to an essential tool.

“It’s difficult for human rights researchers to find information that’s relevant to their work,” said Alexa Koenig, Executive Director of the University of California, Berkeley’s Human Rights Center. “Graph Search was one of the primary modes for beginning to find that needle in the haystack on social media relevant to a particular crisis.”

On the other hand, some people thought that shutting down the tool was a win for personal privacy. In many ways, Facebook obscures what information about you is public and what is private, giving people who understand Graph Search a unique ability to find data that you may not have even realised was available.

To make this problem as explicit as possible, one Belgian developer, Inti De Ceukelaire, created a tool that makes it easy to perform Graph Searches and called it “Stalkscan.” The creepy name was intentional. “I wanted my tool to be shocking,” said De Ceukelaire.

According to De Ceukelaire, Stalkscan has had about eight million visitors in the last two years. While he says he supports the subset of people who used his tool for good, such as fact checkers and activists, he was also concerned about abuse. “You should have seen my inbox,” he said. “I got emails from people daily trying to exploit it on their children, their partners.”

Now that Graph Search is down, Facebook redirects any links from Stalkscan to a help page that tells you how to use the platform’s native search tool. Other tools that made Graph Search easy are similarly broken. But of course, Graph Search isn’t the only way to mine social media data for public information. There are still hundreds of tools out there.

Facebook did not respond to a request for comment about the changes to Graph Search.

Monitoring or Surveillance?

Let’s entertain for a moment what seems like a simple question: What is the difference between a human rights researcher who uses an open-source tool like Graph Search to scan social media profiles for evidence of government abuse, and a government that uses similar tools to monitor its citizens?

“Intent” seems like the obvious answer, but many human rights lawyers agree that any monitoring operation can have negative consequences, even in the hands of people who mean well. One such lawyer is Jennifer Easterday, director of JustPeace Labs, a nonprofit that helps organisations use technology responsibly in high-risk settings. When researchers and investigators collect social media data, Easterday said, that data can bring unwanted attention and other risks to anyone who is connected to it. There is always a risk that the data ends up in the wrong hands. “Remember that risks can be varied: they can encompass not just physical harm, but also psychological and group/community harms,” she said in an email.

There are other considerations as well. Who does the social media data belong to? What institutions will have access to the data? Did the researchers have to go undercover to obtain it? And if the data is anonymised, could it still become vulnerable to re-identification when it is analysed alongside another dataset (the “mosaic effect”)?
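The mosaic effect is easier to see with a small example. The Python sketch below is purely illustrative: the function name `reidentify`, the field names and the choice of city, birth year and employer as linking attributes are all assumptions, and any records passed to it here are invented. It shows how an “anonymised” record can be tied back to a name when only one person in a second, public dataset shares the same combination of attributes.

```python
def reidentify(anonymised, public, keys=("city", "birth_year", "employer")):
    """Link nameless records to names via shared attributes (hypothetical sketch)."""
    # Index the public dataset by the combination of linking attributes.
    index = {}
    for person in public:
        combo = tuple(person[k] for k in keys)
        index.setdefault(combo, []).append(person["name"])

    # A record is re-identified only when exactly one public profile matches.
    matches = {}
    for row in anonymised:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:
            matches[row["record_id"]] = names[0]
    return matches
```

Neither dataset needs to be sensitive on its own for this to work. The risk comes from the combination, which is why removing names from a dataset does not guarantee it is anonymous.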

So when exactly does “research,” or “fact checking,” or an “open-source investigation” cross the line into surveillance?

All of these ambiguities make the ethics of social media monitoring tricky territory. In order to navigate them, lawyers at Berkeley’s Human Rights Center are drafting what they’re calling the International Protocol on Open Source Investigations. “More and more individuals and organisations were popping up who were beginning to use these methods, but there wasn’t a clear consensus about the level of quality to which they should be done, or what quality even meant in this space,” said Koenig. “So we’ve been trying to think through that ethical framework.”

A Check on Power

In the lead-up to the highly contested Brazilian election last year, researchers at the Federal University of Minas Gerais (UFMG), about 450 kilometres north of Rio de Janeiro, developed a tool to help find viral political misinformation by monitoring groups on WhatsApp that have been listed publicly. WhatsApp may be an encrypted messaging service, but the platform allows administrators of group messages to create invite links – which can be posted on the internet, and then used by anyone.

The tool worked like this: content gets pulled from 350 of these publicly listed groups. The top posts are visualised on a dashboard, separated into text, photo, video, and audio. The tool doesn’t allow you to see who posted which message, but it does quantify how many times that content was shared, and in which groups.
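To make that description concrete, here is a minimal sketch in Python of what the counting step might look like. It is not the UFMG team’s code. The `Message` fields, the `summarise` function, the `top_n` cut-off and the idea of identifying content by a key such as a hash are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    group_id: str
    sender: str       # present in the raw data, never copied to the output
    media_type: str   # "text", "photo", "video" or "audio"
    content_key: str  # assumed identifier, e.g. a hash of the text or file


def summarise(messages, top_n=3):
    """Count shares per piece of content and list the groups it appeared in."""
    shares = defaultdict(int)
    groups = defaultdict(set)
    for m in messages:
        key = (m.media_type, m.content_key)
        shares[key] += 1
        groups[key].add(m.group_id)

    # Build one ranked list per media type, without any sender information.
    dashboard = defaultdict(list)
    for (media_type, content_key), count in shares.items():
        dashboard[media_type].append({
            "content": content_key,
            "shares": count,
            "groups": sorted(groups[(media_type, content_key)]),
        })
    for rows in dashboard.values():
        rows.sort(key=lambda r: (-r["shares"], r["content"]))
        del rows[top_n:]
    return dict(dashboard)
```

In this sketch the sender field is read but never written to the result, which mirrors the design choice described above: the dashboard shows how far a piece of content travelled, not who sent it.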

Given that many people who use WhatsApp might have reasonable expectations of privacy, some journalists, like Poynter’s Daniel Funke, questioned whether such a monitoring system was unnecessarily invasive.

This raises such an important question for debunking efforts on WhatsApp: Where is the line between fact-checking and violating privacy? https://t.co/s4QdEmxUA7

— Daniel Funke (@dpfunke) June 20, 2019

“I had these concerns at the time [we] were creating the project — about privacy, ethics, and data collection,” said Fabricio Benevenuto, an associate professor at UFMG, “but I think we took the necessary steps to not cross any lines.”

Benevenuto says that when his team was building the tool, they took care to only pull data from groups that appeared to be openly public. The system also does not store any personally identifying information.

Obviously, tools like the WhatsApp dashboard and Facebook Graph Search are minor league when it comes to monitoring technology. We know that China, for example, is using facial recognition to track the Uighurs, one of its predominantly Muslim minority populations. In Mexico, journalists reporting on corruption and the drug trade have received text messages that infected their phones with Pegasus spyware, a surveillance tool that allows targets to be monitored remotely.

And it is still extremely difficult to know how governments are using surveillance technology and what their relationship to private surveillance looks like. The industry is so shrouded in mystery that at the end of last month United Nations Special Rapporteur David Kaye presented a report calling for an “immediate moratorium on the global sale and transfer of the tools of the private surveillance industry until rigorous human rights safeguards are put in place.”

Clearly, governments and private security companies have more money and resources at their disposal. Journalism and human rights work are supposed to be a check on that power.

But having good intentions doesn’t mean that journalists and civil society can sidestep tough questions about using open source tools. Without a thoughtful and transparent ethical framework, we may risk supporting methodologies and crossing lines that could subsequently be turned against us.

For example, the “Iranian Disinformation Project”, an initiative funded by the US Department of State, claimed that it brought to light “disinformation emanating from the Islamic Republic of Iran via official rhetoric, state propaganda outlets, social media manipulation, and more.” What it actually did was target journalists, like former Washington Post correspondent Jason Rezaian, and researchers, like Tara Sepehri Far, who were critical of the Trump administration’s “maximum pressure” economic sanctions on Iran.

At the very least, journalists and civil society organisations need to develop transparent guidelines about when to accept funding from governments to develop open source tools. The same goes for delivering training or participating in workshops with government representatives where these methods are discussed.

The fact that journalists and researchers couldn’t agree about whether even a scrappy technology like Facebook Graph Search going down was good or bad highlights the murkiness of the ethical waters here. And with technology advancing so quickly, it’s important that we have a clear vision for what the right to personal privacy looks like.

Correction: We have updated this article to acknowledge Tom Trewinnard’s 11 June 2019 Medium post, an earlier writing that addressed the same subject. We have also moved this acknowledgement to the end of the article, as per editorial policy.

To stay informed, become a First Draft subscriber and follow us on Facebook and Twitter.
