Football Rankings Versus Google's PageRank

When mathematicians rank the rankings, PageRank doesn't always come out on top

Steven Cherry: How we rank things is important, and sometimes not easy. College basketball teams are relatively easy to rank because the top 68 teams play a knockout tournament. The best team is the last one standing, so to speak—the only one that hasn’t lost a game. We think we know the second-best team, but is it always the one that lost in the final round? Maybe it’s the one that lost to the same team but in the semifinal round? So as it turns out, it’s not so easy to rank basketball teams.

College football teams are even harder to rank, notoriously so. There’s an effort to get the very best teams to play against one another, but there’s no comprehensive tournament like basketball has. Last year, there were two undefeated teams. Which is No. 1 and which is No. 2? There were five teams with only one loss. Which is No. 3 and which is No. 7?

A lot rides on college ranking systems: The best teams get millions of dollars in television revenue and, in a rich-get-richer way, attract the best high school athletes, who may lead them to future championships. A lot rides on some other ranking systems—first and foremost, Google’s. Last year, JC Penney, the department store chain, spent a lot of money to pump itself up in Google search results, earning itself a banner season in online holiday revenue—until Google tweaked its PageRank algorithms and Penney’s position for terms like “dresses” and “living room furniture” plummeted from No. 1 to below 50.

As it turns out, ranking football teams and Google’s PageRank have more in common than you might think. A group of mathematicians recently compared them and found that in some key respects, the sports systems were superior—not just for ranking sports teams but for ranking Web pages as well. Their research is in the current issue of the Society for Industrial and Applied Mathematics’ [SIAM] Journal on Scientific Computing.

We have two guests today. Without ranking them in any way, Tim Chartier is an associate professor of mathematics at Davidson College and was the lead author of the SIAM study. He joins us by phone from Davidson, North Carolina. Tim, welcome to the podcast.

Tim Chartier: Yeah. Thank you very much.

Steven Cherry: Erich Kreutzer is listed as the second author of the study. At the time he was an undergraduate researcher at Davidson and developed many of the paper’s real-world experiments. He’s a software engineer at Marin Software, which creates search-related systems for large advertisers. He also has another connection to Google: In 2007, he was a student developer for Google’s annual Summer of Code. He joins us by cellphone from San Francisco. Erich, welcome to the podcast.

Erich Kreutzer: Hi. Thank you very much.

Steven Cherry: Tim, let’s start with the college ranking systems. Tell us about them and why your group wanted to compare them to PageRank.

Tim Chartier: One is called the Colley method, named after Wesley Colley, and that particular method looks only at wins and losses, so there’s no score information put into the ranking method. What you do is create a linear system using linear algebra, Ax = b, and you solve that linear system to get what we call a rating for each team. So if you and I were playing tennis, for instance, and we decided to use the Colley method to rank us and other people who might also be playing, then if your rating was 0.6 and mine was 0.4, you would be ranked higher than me.
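For readers who want to experiment, here is a minimal sketch of the Colley method as Chartier describes it, in Python with NumPy. The two-player tennis matchup is the hypothetical from the conversation; the matrix and right-hand side follow Colley’s standard construction.

```python
# A minimal sketch of the Colley method: wins and losses only, no scores.
# Solves the linear system C r = b for the rating vector r.
import numpy as np

def colley_ratings(teams, games):
    """teams: list of names; games: list of (winner, loser) pairs."""
    n = len(teams)
    idx = {t: i for i, t in enumerate(teams)}
    C = 2.0 * np.eye(n)      # Colley matrix starts as 2 * identity
    b = np.ones(n)           # right-hand side starts at 1 for every team
    for winner, loser in games:
        w, l = idx[winner], idx[loser]
        C[w, w] += 1         # each game adds 1 to both teams' diagonals...
        C[l, l] += 1
        C[w, l] -= 1         # ...and subtracts 1 from the off-diagonals
        C[l, w] -= 1
        b[w] += 0.5          # b_i = 1 + (wins_i - losses_i) / 2
        b[l] -= 0.5
    r = np.linalg.solve(C, b)
    return dict(zip(teams, r))

# One tennis match between two players: the winner rates 0.625, the loser 0.375.
print(colley_ratings(["you", "me"], [("you", "me")]))
```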

The Massey method is the second sports ranking method in that paper. It’s named after Ken Massey, who created the method as an honors thesis during his undergraduate degree, and he now does consulting for the Bowl Championship Series and also for high school sports. In the Bowl Championship Series, I believe, that method still does not look at the score, because the Bowl Championship Series prefers not to have that as part of the ranking, but the method can have score information folded into it. Those are the two methods we looked at, and mainly what we were interested in is what’s called the “stability” and the “sensitivity” of the methods. Sensitivity asks: if you just have a small difference—like one team beats another team—how much does the rating change? That’s the rating vector I mentioned earlier, when I used the numbers 0.4 and 0.6 for the two of us. Stability asks: if you have a small change in the outcome of the games, how much does the ranking change? So we looked at the Colley method, the Massey method, and then, as you mentioned earlier with JC Penney, PageRank, which was developed by [Sergey] Brin and [Larry] Page for Google.
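And a companion sketch of the Massey method, with point margins folded in (the score-aware variant Chartier mentions). The games here are made up; overwriting the last row is the standard trick for making Massey’s otherwise singular system solvable.

```python
# A minimal sketch of the Massey method: solve M r = p, where p holds
# cumulative point differentials.
import numpy as np

def massey_ratings(teams, games):
    """teams: list of names; games: list of (winner, loser, margin) triples."""
    n = len(teams)
    idx = {t: i for i, t in enumerate(teams)}
    M = np.zeros((n, n))
    p = np.zeros(n)
    for winner, loser, margin in games:
        w, l = idx[winner], idx[loser]
        M[w, w] += 1          # diagonal: games played by each team
        M[l, l] += 1
        M[w, l] -= 1          # off-diagonal: -(games between the pair)
        M[l, w] -= 1
        p[w] += margin        # winner gains the point margin, loser loses it
        p[l] -= margin
    M[-1, :] = 1.0            # M is singular; pin the ratings to sum to zero
    p[-1] = 0.0
    r = np.linalg.solve(M, p)
    return dict(zip(teams, r))

print(massey_ratings(["A", "B", "C"],
                     [("A", "B", 7), ("B", "C", 3), ("A", "C", 10)]))
```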

Steven Cherry: Erich, tell us about the real-world experiments. I gather that they were specifically designed to test these ideas of stability and sensitivity.

Erich Kreutzer: Well, yes. In the initial part of the paper we talk about the idea of a perfect season—which is not the real world at all—where you have a set of teams and the No. 1–ranked team has beaten all the teams below it, the No. 2 team has beaten all the teams below it but lost to the first team, and so on, until you get to the last team, which has lost to every team above it. Once we have this perfect season, we introduce one upset: say, the 10th team beats the second team in their outing, and we see how that changes the rankings and ratings of the various algorithms. But as everyone understands, sports aren’t that predictable—you’re not going to have a perfect season. So we needed to find a real-world scenario where we could isolate one game, and that game ended up being an upset. We talked about this for a while in our research group, and finally we came up with the idea of examining Monday Night Football games, since by the time a Monday night game is played, all the games before it have already happened. So you can treat the Monday night game as a rank-one update to all of the systems. We ended up writing a little piece of code that would go back over—I think it was the last 10 to 15 years of NFL seasons—look at the Monday night games, and see if there were any upsets. Once we’d found those, we’d do the same analysis as before: see how the rankings and ratings of the teams changed after that single game.
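Here is a self-contained sketch of that perfect-season experiment using the Colley method. The league size and the specific upset are illustrative choices, not the paper’s exact setup.

```python
# Perfect season plus one upset: flip a single game and compare rankings.
import numpy as np

def colley(games, n):
    """Compact Colley solve: teams are integers 0..n-1, games are (winner, loser)."""
    C, b = 2.0 * np.eye(n), np.ones(n)
    for w, l in games:
        C[w, w] += 1; C[l, l] += 1
        C[w, l] -= 1; C[l, w] -= 1
        b[w] += 0.5; b[l] -= 0.5
    return np.linalg.solve(C, b)

def ranking(ratings):
    """Team ids sorted from highest rating to lowest."""
    return sorted(range(len(ratings)), key=lambda t: -ratings[t])

n = 20
# Perfect season: team i beats every team j > i.
games = [(i, j) for i in range(n) for j in range(i + 1, n)]
before = ranking(colley(games, n))

# One upset: flip one result so the 10th team (id 9) beats the 2nd (id 1).
games.remove((1, 9))
games.append((9, 1))
after = ranking(colley(games, n))

# Report how far each team moved after the one-game (rank-one) update.
for t in range(n):
    shift = before.index(t) - after.index(t)
    if shift:
        print(f"team {t}: moved {shift:+d} places")
```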

Steven Cherry: And it turns out that the PageRank system was highly unstable. Tell us about the six-team jump that you saw.

Erich Kreutzer: Yeah. What we expected was that the team that was upset would generally lose rank. You can just think about it: when a human hears that a team that’s going in very high loses to a team that’s been performing pretty poorly, you’d expect its ranking to go down. But it just so happens that in one of our tests with the PageRank algorithm, we actually had a team that was upset increase its ranking. It was actually the New York Giants, I believe, that jumped six spots, as you were saying, when they lost to the Cleveland Indians, who were actually in last place according to the PageRank algorithm.

Steven Cherry: Yeah, I guess that would be the Cleveland Browns. What happened in the other two systems with this same New York Giants game?

Erich Kreutzer: Yes. In that same game, under the Colley method the Giants dropped nine places in rank, and under the Massey method they dropped 10 places, instead of gaining rank. And in both systems Cleveland, as the lower-ranked team, increased in rank: Cleveland jumped up one position in the Colley ranking and three positions in the Massey ranking.

Steven Cherry: So Tim, what do we learn from your paper? I guess one lesson is that PageRank wouldn’t be the best way to rank college sports teams, but there’s a lesson here about ranking Web pages as well, right?

Tim Chartier: Yes, and part of the reason you’re seeing that behavior from PageRank is that if you take the ratings and graph them, PageRank actually follows a power law: the ratings drop off very sharply as you sort them and move down the list. The highest-ranked Web page, if we’re talking Web pages, has a very high rating, but very quickly the lower-rated pages all have very small PageRank values, which means their ratings are very close to one another. The issue is that this creates stability problems: a small change in the system can really change a lot of things, particularly in what we call the tail—all of those low-ranked Web pages or sports teams. In mathematics—and the same would be true in engineering and science—we sometimes do fairly esoteric Web searches relative to the totality of the Web, and you may notice that the top pages a search pulls up are not necessarily the highest quality. We hypothesize that part of that may be related to that instability in the tail. We’ve actually used our ranking methods for March Madness to create brackets, and we’ve consistently found that Massey and Colley do much, much better than PageRank at ranking basketball teams to predict the outcome of March Madness. Again, we can’t predict perfectly; that would be nice, and I guess it’s what we’re shooting for, but only in the most ideal way.
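One quick way to see the crowded tail Chartier describes: compute PageRank on a synthetic scale-free graph (a crude stand-in for the Web’s link structure) and look at the sorted scores. This sketch uses networkx’s standard graph generator and pagerank() implementation.

```python
# PageRank scores on a scale-free directed graph, sorted high to low.
# The top scores stand well apart; deep in the tail, neighboring pages
# are nearly tied, so a small change can reorder many of them at once.
import networkx as nx

g = nx.DiGraph(nx.scale_free_graph(500, seed=1))
scores = sorted(nx.pagerank(g).values(), reverse=True)

for i in (0, 1, 2, 250, 251, 252):
    print(f"rank {i + 1}: {scores[i]:.6f}")
```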

Steven Cherry: So what would be the equivalent of an upset in the case of Web page ranking?

Tim Chartier: Well, in Web page ranking, basically, when one page points at another page, it shares some of its PageRank with that page. It’s more complicated than that, and even in the linear systems it’s more complicated than that, because everything is interdependent. That’s what creates the linear system, and it means my rating depends on the ratings of all the teams I’ve played. So if two teams like the Giants and the Browns play each other, the moment that game is played their ratings change, but then the ratings of everyone else change too. In PageRank, when you put a link on your Web page pointing to another page, you share your PageRank with it, so if you want to raise your PageRank, part of the way that’s done is getting very high-quality Web pages to point to you. There are things called link farms, where people create high-quality Web pages, often educational sites that educators point to, particularly for K-12 education, and those pages then link out to businesses and so forth to raise their PageRank. I’m not sure I’m quite answering your question, but that’s the analogy between the two: in a sports setting we treat a win, in a certain sense, as a link from one page to another.
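To make the win-as-link analogy concrete, here is a small sketch under the usual convention that the losing team “links to” the team that beat it, so each loss passes endorsement to the winner. It uses networkx’s pagerank(); the games are hypothetical.

```python
# PageRank applied to game results: each loss becomes a directed edge
# from the loser to the winner, so PageRank flows toward teams that
# beat well-endorsed opponents.
import networkx as nx

def pagerank_ratings(games, damping=0.85):
    """games: list of (winner, loser) pairs."""
    g = nx.DiGraph()
    for winner, loser in games:
        g.add_edge(loser, winner)   # loser endorses winner
    return nx.pagerank(g, alpha=damping)

# A four-team round-robin with one upset: D loses to B and C but beats A.
games = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "A")]
for team, score in sorted(pagerank_ratings(games).items(), key=lambda kv: -kv[1]):
    print(f"{team}: {score:.3f}")
```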

Steven Cherry: So is it fair to say that the sports systems, the Colley and the Massey systems, would be less sensitive to link farms than, say, Google is?

Tim Chartier: Yeah, in a certain sense, we think that they would be. The bigger issue with Colley and Massey—we spent a lot of our time on Colley with Web pages—is the underlying model they’re built on. One of the reasons PageRank works so well, even with its susceptibility to spam, is that it’s built on a model of how people surf the Web. It’s not working on a win/loss type of model, which Colley is; it’s built on what’s called the random surfer model, which assumes that if you’re at a particular Web page, you have an 85 percent chance of following a link on that page, and the other 15 percent of the time you randomly jump, or teleport, to anywhere on the Web. Even though none of us probably behaves exactly that way, it’s a very nice model of how people move along the Web, and it’s one of the reasons we get good results overall from it. For Web ranking, we’ve found that Colley is not as robust a model. But for social networks it can work quite well, because it’s debatable whether PageRank is a good model there. So we’ve been very active in applying Colley; we’ve used it on Twitter, for instance, though we haven’t moved into using Massey there. And we do think that in a setting like that, where the model aligns well, it would also be less susceptible to spam.
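The 85/15 behavior Chartier describes maps directly onto the standard power-iteration computation of PageRank. Here is a from-scratch sketch with a toy graph and a fixed iteration count:

```python
# Random-surfer PageRank by power iteration: with probability `damping`
# (0.85) the surfer follows a random outbound link; otherwise she
# teleports to a page chosen uniformly at random.
import numpy as np

def pagerank_power(adj, damping=0.85, iters=100):
    """adj[i] = list of page indices that page i links to."""
    n = len(adj)
    rank = np.full(n, 1.0 / n)                  # start anywhere, uniformly
    for _ in range(iters):
        new = np.full(n, (1.0 - damping) / n)   # the 15% teleport share
        for i, links in enumerate(adj):
            if links:
                share = damping * rank[i] / len(links)
                for j in links:
                    new[j] += share             # split rank across outlinks
            else:
                new += damping * rank[i] / n    # dangling page: jump anywhere
        rank = new
    return rank

# Page 0 links to pages 1 and 2; both link back to page 0.
print(pagerank_power([[1, 2], [0], [0]]))
```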

Steven Cherry: So when it comes to static Web pages, PageRank is still probably the way to go. But you have some questions about whether that sort of system would be the best for social networks.

Tim Chartier: Yes. As I mentioned, that jump is called teleporting, and teleporting doesn’t really make sense in social networks. In a certain sense you can kind of teleport to somebody else’s Twitter account when one of the people you follow retweets something, so you sort of visit their account. But overall, you’re reading the pages of the people you follow; you’re not just surfing through Twitter at large. So it’s a very different behavior moving you through the network than the one PageRank assumes.

Steven Cherry: Very good. Well, thanks a lot to both of you.

Tim Chartier: Thank you.

Erich Kreutzer: Thank you.

Steven Cherry: We’ve been speaking with Professor Tim Chartier of Davidson College and Erich Kreutzer of Marin Software, two of the researchers who compared Google’s PageRank to college sports ranking systems, with somewhat surprising results. For IEEE Spectrum’s “Techwise Conversations,” I’m Steven Cherry.

This interview was recorded 21 June 2011.
Audio engineer: Francesco Ferorelli
Follow us on Twitter @spectrumpodcast

NOTE: Transcripts are created for the convenience of our readers and listeners and may not perfectly match their associated interviews and narratives. The authoritative record of IEEE Spectrum's audio programming is the audio version.
