When a disinformation campaign on social media is uncovered, the first question on everyone’s mind is, “Who did it?”
Attributing responsibility to a particular actor is a complex process that Alex Stamos has plenty of experience with, as the former Chief Security Officer at Facebook and current Director of the Stanford Internet Observatory.
Stamos recently spoke to First Draft about the challenges that attribution poses, both to technology companies trying to determine the culprits and to newsrooms looking to responsibly report on this issue.
First Draft: By way of background, could you explain the key differences between the way that the platforms and the media approach the issue of attribution of threat actors?
Alex Stamos: This is a very messy issue, because we already have a long history of difficulty in coming up with standards for cyberthreat attribution for technical attacks. There’s about a decade of scholarship on different ways you can think about attribution and different models you can use.
In the information operations space, it gets more complicated. One, because a lot of the technical indicators that are left behind in a traditional intrusion don’t exist — in the end, a disinformation attack is the technically correct use of these platforms. And so there’s no malware to be analysed. There’s not really an equivalent of a command-and-control channel. There is technical data available, but that technical data is almost uniquely available to the companies, and under certain circumstances, the government.
So you end up with this situation where you have a huge asymmetry in the amount of information available to the tech platforms versus the media. And the interplay of this with privacy laws is where it gets really complicated, because privacy laws apply to bad guys too.
A lot of the people who work in the intelligence teams in companies like Google and Facebook come from Western intelligence agencies: the NSA, CIA, GCHQ and the like. Those organisations have done a lot of internal work on acceptable attribution models, and the kind of data that lines up for different levels of confidence, and definitions of what they mean by confidence levels. Facebook, within our internal discussions, would have standards for our attribution of confidence, and then, effectively, another set of standards for what we would publicly say.
“One of the big differences between companies and the media is that companies generally will not feel comfortable using strategic purpose as a source of attribution.”
I wrote a blog post about different attribution models right before I left Facebook last summer. One of the big differences between the companies and the media is that companies generally will not feel comfortable using strategic purpose as a source of attribution, because they don’t feel comfortable judging: is this in China’s best interest? Is this in Iran’s best interest? Therefore, that cannot be used as attribution. Whereas that seems to be the source of a lot of media attribution — a guess about what is in the geopolitical interests of certain parties.
So, platforms are very uncomfortable with evaluating the political goals and motivations of particular nation-states, but the media is more comfortable mixing motivations and goals into the reporting.
Exactly. When these foreign policy pieces are written, it just seems to be a standard part of the piece. The template used for a story about “The Iranians did X” will talk about potential motivations and such. But just because [a reporter writes about a political motivation that might exist] doesn’t mean that should actually be used as part of the attribution model. So, in the Facebook post I wrote, we explicitly say [that] we will not, at Facebook, use political motivations as a mechanism.
Another model is looking at coordination. In the information operations space, this is actually an important set of data. Signs of coordination — of being able to cluster accounts, where you are saying: here are a bunch of actors, perhaps on one platform, or more likely, across multiple platforms, and we have evidence that ties them together — can be a strong source of attribution. But you have to be very careful about which technical mechanisms you’re using for the clustering, and how much weight you give those.
For example, a shared Google Analytics tag is considered a weak indicator of clustering. Partly because it’s easily faked — if someone wants to do a false flag or throw you off the scent, it’s an easy thing to fake. And partly because, if you work in the space, you’ll find a bunch of examples of these models falling down.
“A big thing I always tell all journalists is: look, it’s probably not Russia… The vast majority of the time, it is not a foreign influence campaign.”
Investigators who work on the dark market will find that a lot of these fake news sites use the same templates. They’re copying templates from each other, and they’ll often copy things like the Google Analytics code. They’re not really collaborating, but [the same Google Analytics tag] shows up and gives you a false scent. Or people share a graphic designer, or they share a hosting platform. In that kind of clustering, all those pieces of data are useful, but you have to be careful about how you weigh them.
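The weighted clustering Stamos describes can be sketched as a toy model: score every pair of sites by the indicators they share, give easily faked indicators (like a Google Analytics tag copied along with a site template) low weight, and only link pairs whose combined evidence clears a threshold. The indicator names, weights and threshold below are entirely hypothetical, not any platform’s actual model.

```python
# Toy sketch of indicator-weighted clustering. All indicator names,
# weights and the threshold are hypothetical illustrations.
from collections import defaultdict
from itertools import combinations

# Easily faked or widely copied indicators get low weight.
INDICATOR_WEIGHTS = {
    "analytics_tag": 0.2,     # weak: often copied along with templates
    "hosting_provider": 0.3,  # weak: shared infrastructure is common
    "registrant_email": 0.8,  # stronger: tied to a specific actor
}

def pairwise_scores(sites):
    """Sum indicator weights for each pair of sites sharing a value."""
    scores = defaultdict(float)
    for a, b in combinations(sites, 2):
        for kind, weight in INDICATOR_WEIGHTS.items():
            if sites[a].get(kind) and sites[a].get(kind) == sites[b].get(kind):
                scores[(a, b)] += weight
    return scores

def cluster(sites, threshold=0.5):
    """Union-find over pairs whose combined evidence clears the threshold."""
    parent = {s: s for s in sites}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for (a, b), score in pairwise_scores(sites).items():
        if score >= threshold:
            parent[find(a)] = find(b)
    groups = defaultdict(set)
    for s in sites:
        groups[find(s)].add(s)
    return list(groups.values())

sites = {
    "site-a": {"analytics_tag": "UA-111", "registrant_email": "x@example.com"},
    "site-b": {"analytics_tag": "UA-111", "registrant_email": "x@example.com"},
    "site-c": {"analytics_tag": "UA-111"},  # copied template only
}
```

With these weights, sites a and b cluster together (shared tag plus shared registrant email), while site c — which only shares the copied analytics tag — stays separate, mirroring the “false scent” problem above.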
What more can journalists do to make sure they’re reporting responsibly on these stories, even if they don’t have access to platform information, or a technical forensics background?
It is incumbent on the platforms to find ways to share more data — that is their responsibility to figure out. There are interesting problems with privacy laws, but if a company doesn’t want to talk about the details, then [journalists should] push back.
A big thing I always tell all journalists is: look, it’s probably not Russia. The truth is, the vast majority of political disinformation is coming from semi-professionals who are making money pushing disinformation, who are also politically motivated and have some kind of relationship to the political actors themselves. The vast majority of the time, it is not a foreign influence campaign. And that should be the automatic assumption: it is not James Bond.
Journalists get this right on all kinds of other stories. If you read a local newspaper story about a woman disappearing in the middle of the night, it’s probably not a human trafficking ring. It’s probably the husband. Local crime reporters understand this, so they don’t write, ‘This is probably a Ukrainian human trafficking ring’, as the first assumption in the story. They write: Scott Peterson seemed very sketchy.
“[Journalists] have to fundamentally accept that the sexiest explanation is usually not true.”
At some point, you start to realise it’s mostly scammers. This is the truth of the internet: there are tens of thousands of people whose entire job is to push spam on Facebook. It’s their career. There are hundreds of times more people doing that than there are working in professional disinformation campaigns for governments. So journalists have to fundamentally accept that the sexiest explanation is usually not true.
This is something that companies go through, too. They’ll hire new analysts who jump to wild conclusions: ‘I found a Chinese IP, maybe it’s MSS [Ministry of State Security].’ It’s probably not MSS; it’s probably unpatched Windows bugs in China. This is also why you do red-teaming, and why you have disinterested parties whose job it is to question the conclusions.
When a journalist actually gets on a call with a technology company, like Facebook or YouTube, to discuss disinformation campaigns on that platform, what kinds of questions should she be asking?
There are two different scenarios: when the company provides attribution, and when it doesn’t. The platforms are going to be the most reluctant to provide attribution, because if they get it wrong, it’s a huge deal. There are a lot of downsides, and not a lot of upsides, to doing attribution.
The media also needs to understand that attribution [to] actors like Russia is relatively easy for these companies. None of them have offices in Russia anymore. Russia is economically irrelevant to all the major tech companies, whereas that is not true for China or India. So the first thing the media needs to consider is whether the platforms are motivated by their financial ties, or by the safety of their employees, in not making an attribution.
“The media also needs to understand that the attribution [to] actors like Russia is relatively easy for these companies.”
If the companies don’t provide attribution, then the other thing the media should ask is: are you providing the raw data to anybody who can? Are you providing it to any NGOs or academics, or are you providing it to a trustworthy law enforcement source? This is the key thing.
But this is only going to get harder because the truth is, if you look globally at disinformation campaigns, the median victim of a professional disinformation campaign is the victim of a campaign being run by their own government against its domestic audience. If you look at India, the disinformation is not being driven by foreign adversaries; it’s being driven by the Indian political parties. That makes the attribution question very complicated for the tech companies. So that’s something the media needs to keep in mind.
This is also why the media needs to be consistent about calling for regulation of the [technology] companies. When you call for global regulation, you’re asking to hand power to governments that are perhaps not totally democratically accountable; or that are democracies, but democracies that think disinformation is totally fine.
That’s the kind of stuff that drives me nuts on the big-picture issues, when large media organisations call for these companies to be subservient to governments, and then also believe that the companies should protect people from their own governments.
This interview has been edited and condensed for clarity.