Is Wikipedia a Real-Time News Source?
After a mass shooting or natural disaster, Wikipedia’s volunteers are on the story within hours and make thousands of edits in the first days
Steven Cherry: Hi, this is Steven Cherry for IEEE Spectrum’s “Techwise Conversations.”
You know how some news events capture everyone’s attention? The tragic school shooting in Newtown, Connecticut, last December was a case in point. Do you remember how you heard? For many of us now, news comes from a website.
And once you’re onto the story, how do you follow it? If you’re a desk worker, you probably look to the Internet, because that’s the screen right in front of you. Maybe you go to CNN.com or the New York Times site, but increasingly, people are going to Wikipedia. Yes, Wikipedia.
But Wikipedia is an encyclopedia, I hear you say. Yes and no. A couple of weeks after the Newtown shooting, a writer on the website GigaOM wrote about how in moments like that, he looks to Wikipedia, and how, as it turns out, a social-sciences researcher recently wrote his Ph.D. thesis on Wikipedia’s increasing prowess at real-time news.
The researcher, Brian Keegan, looked at seven such events, including the movie shooting in Aurora, Colorado, and the Wisconsin Sikh temple shooting, both earlier in 2012, the 2011 mass shooting in Norway, and the 2007 shooting at Virginia Tech. His findings may surprise you. Wikipedia’s devoted volunteers rapidly created, and continually updated, pages about the events. If the purpose of Wikipedia is to aggregate the world’s information, it seems it can do it as readily for events a few hours old as for those a few hundred years past.
Brian Keegan received his Ph.D. from Northwestern University, in Evanston, Ill., in the Media, Technology, and Society program, and is currently a postdoctoral fellow at Northeastern University in Boston, studying networks and big data in politics. He’s also a consultant at Ninja Metrics, doing statistical modeling of influence and behavior in online games. He joins us by phone.
Brian, welcome to the podcast.
Brian, as I understand it, this was your Ph.D. topic, so on December 14 you looked at the Connecticut school shootings not just with the horror we all did, but also with an eye to how the inevitable Wikipedia article would be born and develop. What did you see?
Brian Keegan: Right, and so I have a sort of morbid fascination, that after something tragic like this happens, a lot of people go into a period of mourning, but I’m sort of a data scientist, a scientist at heart, so I want to go out and get the data on this. And for sites like Twitter, that’s something that a lot of us, as social media researchers, you kind of have to have your boots on the ground, or at least boots in the stream, so to speak, and getting it while it’s unfolding. But for something like Wikipedia, that response takes a few days to sort of build up. And so while the article is created a few hours after the event, the sort of momentum of people writing about this takes about two days.
And so the shooting was on the Friday, and so by that Sunday, I sort of started pulling the pieces together for that to kind of put together, sort of my way, to sort of make sense of everything that I was kind of going through. And see how was Wikipedia responding to this event, and how did this sort of event, and Wikipedia’s coverage of it, compare to the way that Wikipedia has written about similar events?
Steven Cherry: So you found, I guess, that in some ways the Newtown event was typical, but you also found some differences among some of these events.
Brian Keegan: Right, and so the Newtown event certainly captured a lot more media attention, but the way in which Wikipedia responded to it, it was relatively similar to some other major events, like the ones that you had mentioned. By far and away, the Virginia Tech shooting kind of stands out as sort of exceptional by Wikipedia’s response, both in terms of the number of changes that Wikipedia editors made to the article, as well as the sort of intensity of the response, in terms of how quickly people were editing this article.
But what’s interesting, a sort of interesting similarity about these articles, is that you actually have a group of editors who are sort of specialized at editing these news articles. They’re making contributions across all of these news articles.
Steven Cherry: You studied the Gini coefficients of these various editors. What is a Gini coefficient?
Brian Keegan: So a Gini coefficient is something that economists have come up with, and they usually use it to measure something like income inequality. A Gini coefficient runs from 0 to 1. A 0 indicates that, in a distribution of something like income, everyone has the same income, and a 1 indicates that one person has all the wealth.
But if you use the same statistic for this, and you want to look at who is making revisions to these articles, you would say that a Gini coefficient of 0 would say that every editor is doing the same amount of work, you know, the same number of contributions, and a Gini coefficient of 1 would say that one editor is doing all of the work on the article.
And so what we see in the Newtown shooting and across many other shootings is that in the first 4 to 6 hours of the article, work is relatively distributed—that everyone is doing about the same amount of work—but after about 4 to 6 hours, some of these specialists that I mentioned show up, and really sort of take over the article, and invest a lot of time and effort, and so it becomes increasingly centralized. And so there’s a few editors doing a lot of work on these articles, from about 6 hours after they’re created, onward.
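Editor's sketch: the Gini coefficient described above can be computed from a list of who made each revision. This is a minimal illustration, not Keegan's actual analysis code. The function names `gini` and `revisions_gini` and the editor names in the usage note are hypothetical. It uses the standard sorted-rank formula, under which a single editor doing all the work among n editors scores (n - 1)/n, which approaches 1 only as n grows.

```python
from collections import Counter


def gini(counts):
    """Gini coefficient of non-negative counts (0 = perfectly equal)."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        # No editors or no edits: treat as perfectly equal.
        return 0.0
    # Sorted-rank formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * v for rank, v in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def revisions_gini(revision_authors):
    """Gini coefficient of edit counts, given one author name per revision.

    Only editors with at least one revision are counted (an assumption of
    this sketch), so a lone editor yields 0 rather than 1.
    """
    return gini(Counter(revision_authors).values())
```

For example, `revisions_gini(["alice", "bob", "alice", "alice"])` measures how unevenly four revisions are split between two hypothetical editors. Computing this over successive time windows of an article's history is one way to see the shift from distributed to centralized work that Keegan describes.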
Steven Cherry: And I guess this is another distinction. It seemed like there are some editors who are only interested in, like, this event, and not the others, and they generally do the sort of heavy lifting of aggregating information from news sources. And then there are other editors who are more, sort of, Wikipedia-ish people who flip from article to article and do the sort of cleanup work?
Brian Keegan: Right, and so these articles are interesting because they sort of bring together a sort of motley crew of people, and so you’ve got these sort of news specialists, like invention, who sort of kind of show up when you look at who’s edited each of these articles, and they’ve kind of got their fingerprints on all the articles that I looked at in my dissertation, and they sort of show up in many other sorts of articles as well. These are people who, whenever there’s news breaking, they’re really kind of making contributions of some type.
But it actually turns out, when you dive in a little bit deeper, that the people who are doing the heaviest lifting on these articles aren’t people who have actually done work on breaking-news articles before, or other events in the news. Oftentimes you have people who are kind of just coming in from the cold, it seems like, who have no qualifications to be working these news articles, when you think about it. But of course if something’s on Wikipedia, anyone can edit this. And it turns out, actually, that people who don’t appear to have the qualifications on the surface, actually the work that they’ve done in their past actually qualifies them really well to do some of this sort of work.
And so an example like that would be something like, someone who edits articles about Harry Potter. So you’d say that a person who edits articles about Harry Potter probably shouldn’t be editing stuff about school shootings. There wouldn’t be a whole lot of overlap there. But Harry Potter is one of these articles that’s always in the news as well. It’s not a topic that we think of as a news story, but that’s something that’s always in the news. People are always trying to edit it, and so that editor has developed a capacity to sort of engage in dispute resolution, or sort of handle a topic that’s in the news, or controversial, or people are trying to make lots of changes to it at the same time. So those people sort of migrate, or are able to move around.
So what’s sort of interesting about Wikipedia, so because it’s so open, it lets people sort of move, and bring those specialties and expertise from other areas that don’t sort of seem to, on the surface, match, but actually the substantive work that they do is really important for when you’ve got thousands of people, or hundreds of people, trying to make changes at the same time.
Steven Cherry: So it’s interesting that you say “controversy.” I guess one of the controversies in the case of the Newtown shooting is that a lot of the early reporting was wrong, right? It conflated the identity of the shooter with his brother’s. It mistook the number of guns that the shooter had, and whether his mother worked at the school. Those are just a few of the misreported facts. How well did Wikipedia recover from those errors?
Brian Keegan: Right, and so that’s something that’s obviously really fraught, and you see that Wikipedia, just like any other news source, just operates in the larger information ecosystem. And so whatever CNN or ABC or any of those major news outlets is reporting, Wikipedia doesn’t have dedicated editors on the ground who are reporting this. The rules of Wikipedia explicitly state that whatever goes into the Wikipedia article has to first be reported by someone else—by reliable journalistic outlets, or newspapers.
And so when CNN and ABC and all these other news outlets were getting this information wrong about all of those things that you just mentioned, Wikipedia reflected that same sort of misinformation as well. But because it becomes this clearinghouse, that it’s not just CNN or it’s not just ABC, but it’s all these sorts of news sources that the editors are drawing on, it becomes possible to clear up those errors much more quickly.
Steven Cherry: I gather that you think that Wikipedia can’t replace newspapers and other news reporting organizations, but that it will increasingly be a necessary complement to them. Is this kind of fact-checking and fact-clearing process part of that?
Brian Keegan: Sure, yeah. I definitely agree that Wikipedia will never be able to replace the resources that journalists bring to bear on these sorts of events. So it very much operates in the domain of established journalists and drawing on that information, incorporating and synthesizing it. But that last half, that’s really important, that it’s able to feel that the editors are going out and looking at all these sources and information, that they’re trying to synthesize it and combine it and come up with more authoritative information about it. So you almost have this wisdom of the network, that people are going out and looking at all these different sorts of information by all these different people, looking at all these different sorts of information to be able to draw on that and bring it together in a way that I think creates a more reliable account.
So I think Wikipedia’s role in these sorts of news events and the media ecosystem is actually really important, about how information goes out, how people verify that, and then how people are able to make sense of what’s going on and what’s actually reliable, even if it’s not itself reporting on the event itself.
Steven Cherry: It’s pretty interesting how an institution as young as Wikipedia can both assume a prominent place in these key national dialogues, but I guess also how quickly it can organically develop these customary practices. Do you think that’s always going to be the case? And I guess for a successful institution, it sort of has to, doesn’t it? Or else it wouldn’t be successful at all.
Brian Keegan: Right, and so there is very much a template, the way that these articles unfold. So if you look at something like hurricanes and tropical storms, there’s a whole community of editors who specialize in editing these sorts of articles, and in fact, and so when you look at these articles, this community of editors, they have this template that, you know, Storm X began to form in Area Y at Time Z, and has these properties: A, B, and C.
And so they have that format, and they’re able to replicate that both for historical articles, but also it provides the structure that they can then bring in for breaking-news events, like Hurricane Katrina or Hurricane Irene or something like that. And so that’s just the case for hurricanes, but there are other analogous examples also, like school shootings or tornadoes or other major national disasters, and things like that.
But that template, that way that we write about things, is encoded in the way that they display information about…the info box that’s on the right there. But also in the practices, I mean, you’ve got these people who have their fingerprints in all these different sorts of articles, and these people and the stuff that they do on these articles, the way that they’re cleaning things up or helping with discussions also provide the structure for how these collaborations are unfolding and create these institutions, like you mentioned.
Steven Cherry: Well, very good. Thanks for such interesting research that you’re doing, and thanks for being with us today.
Brian Keegan: Thanks very much.
Steven Cherry: We’ve been speaking with Brian Keegan, a researcher at Northeastern University, about how surprisingly well Wikipedia documents current events, nearly in real time.
For IEEE Spectrum’s “Techwise Conversations,” I’m Steven Cherry.
NOTE: Transcripts are created for the convenience of our readers and listeners and may not perfectly match their associated interviews and narratives. The authoritative record of IEEE Spectrum’s audio programming is the audio version.