Proof and Consequences
A new book explores the deceptive power of numbers
Hi, this is Steven Cherry for IEEE Spectrum's "This Week in Technology." Did you know that there's a formula to calculate happiness? Happiness is P + 5E + 3H where P is personal characteristics, E is existence, meaning your health, and H is higher order needs. There's a way to attach a number to almost everything, it seems. Here's how to calculate the most miserable day of the year, which turns out to be January 24th, by the way: 1/8 W + 3/8 D – d x TQM x NA, where W is weather, D is debt, M is motivation, and NA is the Need to take Action.
As mathematics filters out from engineering and science to become more and more important to everyday life, unfortunately, all too often, it's merely the appearance of mathematical soundness that people covet. The result? Numbers without substance. And they're everywhere, and they've become the particular bugaboo of my guest today. Charles Seife is a professor of journalism at New York University and a rather prolific science writer of books and articles. His writings have appeared in The Economist, Science, New Scientist, and The New York Times, and as of September 23, he's now the author of five books on those subjects. His newest is called Proofiness: The Dark Arts of Mathematical Deception. Charles, welcome to the podcast.
Charles Seife: Thanks for having me.
Steven Cherry: Charles, by my own calculations, you cannot swing a dead cat without a 94.7 percent chance of hitting a meaningless number, and that's up 6.5 percent from 10 years ago. Tell us about "proofiness," and those ever darker arts of mathematical deception.
Charles Seife: And about 74 percent of people believe the statistics you just gave. The problem with numbers is that they seem to have a higher truth associated with them. In pure math, numbers are as close to absolute truth as we can get. A mathematical proof brooks no argument. It is true, given certain assumptions. Real-world numbers don't have that purity. Whenever we humans generate a number, whenever we make a measurement, whenever we count, whenever we estimate something, we create a number that has flaws, because it reflects the flaws of our measurement. Even though people think of numbers as pure, flawed numbers don't have that absolute truth. So we tend to think of numbers as being better than they actually are.
Steven Cherry: According to your book, the problem is a little beyond flaws. Your book starts with a remarkable anecdote about Senator Joseph McCarthy back in the 1950s. More than anyone else, McCarthy was responsible for the so-called Red Scare and persecution of Communists. He was an obscure junior senator from Wisconsin, and in your book, you describe how McCarthy got Americans to take him seriously.
Charles Seife: Yeah, he held up a sheaf of papers, and he said, "I have here in my hand 205 Communists working for the State Department." And the number on the list changed over time. It went down to 87, back up to 207. The actual number didn't matter. But the very fact that it had a number attached to it meant that people took it very seriously. The specificity of the claim made even the White House wonder: "Maybe McCarthy does know about the Communist infiltration of the State Department." So it started a little bit of a panic in Washington, and a number of hearings followed. It turns out McCarthy was lying. He didn't actually have those people in hand. But it really didn't matter. By then he had sparked this fear, and combined with the fall of China and Russian nuclear tests and all of that, there was a gigantic fear of Communism that he was able to exploit. And because he had the wherewithal to peg his lie with a number, it was extremely powerful, and it launched him into becoming the most divisive figure of his time.
Steven Cherry: McCarthy just made his number up, but there's lying with numbers a little more circumspectly. Your book describes a lot of different ways people do that. Potemkin numbers, as you call them, disestimation, fruit-packing, cherry-picking. Tell us about some of them, and having studied so many ways to lie with numbers, which are the most dangerous?
Charles Seife: Well, they're each dangerous in their own way. Potemkin numbers are numbers made up out of whole cloth. They're the numerical equivalent of Potemkin villages, the facades that were used to convince the empress that there was actually thriving life in the Crimea, when there was in fact just a barren wasteland. What McCarthy said when he held up his sheaf of papers was a Potemkin number. There are an enormous number of figures that are just kind of made up out of nowhere. If you look at any advertisement, you're likely to find one. When Vaseline Intensive Care says it delivers 70 percent more moisture in every drop, this is a meaningless number because, first of all, how do you measure the delivery of moisture? It's not easy to see. And how could a drop of Vaseline deliver more moisture than water, which is 100 percent moisture? So those are made-up numbers. They're not very dangerous because if you inspect them a little bit, they evaporate; you see that they're bogus. More dangerous are numbers that are shaped so that they look more convincing than they are, or that are represented to mean more than they actually do. A concrete example: A few years ago, Quaker Oats ran an advertising campaign that seemed to imply that eating Quaker Oats could suck up all the cholesterol in your bloodstream, and they had a nice lovely bar graph that showed the cholesterol dropping dramatically. The thing is, if you look closely at the bar graph, you see that the scale on the y-axis is tinkered with. It doesn't actually start at zero, as we expect it to, but very high up, and that exaggerates the seeming effect of Quaker Oats. So that's one form of fruit-packing. That's apple-polishing: just as greengrocers present their fruit in the best possible light, even if it's rotten on one side, people will take numbers and turn them just so, so they look really appealing.
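The y-axis trick described above can be put in numbers. The following is a minimal sketch, assuming two hypothetical cholesterol readings (210 and 200) that are not taken from the advertisement; the function name `apparent_ratio` is likewise made up for illustration. It compares how much taller the "before" bar looks than the "after" bar when the axis starts at zero versus when it starts just below the data.

```python
def apparent_ratio(before, after, axis_start=0.0):
    """Ratio of the drawn bar heights when the y-axis begins at axis_start."""
    return (before - axis_start) / (after - axis_start)

# Hypothetical readings, chosen only to illustrate the effect.
before, after = 210.0, 200.0

# Honest axis starting at zero: the bars differ by about 5 percent.
print(apparent_ratio(before, after))

# Truncated axis starting at 195: the first bar is drawn three times as tall.
print(apparent_ratio(before, after, axis_start=195.0))
```

The underlying data are identical in both calls; only the baseline changes, which is why a reader should check where the axis starts before trusting the picture.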
Another example is cherry-picking, where people will take certain data and ignore data that contradict them. One example I like to use: Even though global warming is real and man-made, and is a significant problem, Al Gore was actually guilty of some cherry-picking in his movie An Inconvenient Truth. If you look at his graphics, where he shows these scenarios where the landscape is dipping under the water, those images he chose were for a case where the sea level rose by 20 feet, which is an extreme estimate. Most scientists think that the rise over the next century is going to be much, much less: 4 or 5 or 6 feet. So by showing graphics of a 20-foot rise and not showing graphics of a lesser rise, he cherry-picked data, picking the worst scenario and ignoring the more likely ones.
Steven Cherry: You have a lot of great examples in the book of people playing fast and loose with numbers. A lot of them involve politicians. I should point out that you do so in a very bipartisan way: in the same chapter where you point out Al Gore's cherry-picking, you also take George W. Bush to task for proofiness. Are there a lot of good examples of political proofiness in your book because they're so easy to find, delivered right to your door in your morning newspaper, or is there something going on with politics?
Charles Seife: If you look at papers nowadays, you can't avoid proofiness, and every time a politician opens his mouth, there's a pretty good chance that some dubious statistic will drop out. But I do think there has been an increase in mathematical malfeasance over the past few years. There's a recognition that numbers can be fiddled with and that they're a very effective form of propaganda. For example, the media are producing polls at a tremendous rate. The media have always loved polls, but if you compare the number of polls nowadays with what it was 15 or 20 years ago, you'll see that it has increased dramatically. You can't go to CNN.com without having an Internet poll on the front page. And everyone knows that Internet polls don't mean anything. They only reflect a certain portion of people who are on the Internet and enjoy filling out those polls, which is not even close to representative of the general public. I think there's a large group of people who are producing proofiness on an industrial scale for public consumption.
Steven Cherry: Polls are an obvious problem, but some pollsters do seem to really know what they're doing. In the last election there was one Web site, 538.com, which seemed to get a lot of things right and to be able to explain its predictions. That guy is actually now working full-time for The New York Times. Obviously there are respectable ways to make numerical predictions. How does the ordinary person tell the good from the bad?
Charles Seife: It's really difficult, and that's part of the problem. Nate Silver at 538.com is really good at what he does. He takes a whole bunch of polls and he compares them and he does analyses, regression analyses, and he tries to extract information from this collection of polls. I think that's atypical of what people do with polls. People tend to live from poll to poll. Each one gets handed down, and if they like it, they talk about how wonderful it is and how it shows that their favorite candidate is winning. If it shows what they don't want, that their candidate is losing, they'll tend to dismiss it or explain it away. And it turns out that if you look at individual polls, the amount of information contained in them is fairly small. Quite often, biases creep in: the choice of who gets to answer the poll is very significant, the choice of how the question is worded is very significant, and you'll find that polls can be very, very far off from what they're supposed to be. The margin of error supposedly guarantees that the real answer is within 3 points, plus or minus, 95 percent of the time. Yet you'll find that the actual results of elections are way further off than these polls would allow. A good recent example is the Alaska race, where people thought that Senator Murkowski would win, when in fact, she didn't. And it wasn't even close in the polls; it was a 10 to 20 point lead. So individual polls can be very deceptive. If there's any value to polls at all, it's in taking a whole bunch of them, comparing them, and treating them with a very great deal of circumspection, as Nate Silver does.
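The "plus or minus 3 points, 95 percent of the time" figure comes from the standard formula for the sampling error of a proportion. The following is a minimal sketch, assuming a simple random sample and the usual 95 percent z-value of 1.96; the sample sizes are hypothetical and the function name `margin_of_error` is chosen for illustration. It covers only random sampling error, not the wording and selection biases described above, which is one reason real results can fall outside the stated margin.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Half-width of the ~95% confidence interval, in percentage points.

    Assumes a simple random sample; proportion=0.5 is the worst case.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    return 100 * z * standard_error

# Hypothetical sample sizes: quadrupling the sample only halves the margin.
for n in (250, 1000, 4000):
    print(f"n={n}: +/- {margin_of_error(n):.1f} points")
```

A poll of roughly 1,000 respondents lands near the 3-point margin quoted in most news stories.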
Steven Cherry: Your book not only discusses polling, which is an obvious hotbed of numerical shenanigans, but also vote counting, which, until a few years ago, most of us thought was a pretty straightforward matter.
Charles Seife: Any engineer, any scientist knows that no measurement can be done perfectly, and when you do a measurement, you have to understand that there are errors associated with that measurement. Counting is no different from any other measurement. Therefore there are errors associated with it. When those errors are large compared to the margin between the two candidates, you're in trouble. And this happens. This happens over and over again. A recent example is Al Gore versus George W. Bush. That was in 2000. The votes were within a few thousandths of a percent of one another, yet the error rate was on the order of a couple percent. Similarly, Al Franken versus Norm Coleman. Again, a few thousandths of a percent separated the two candidates, but if you count something and count it again, you'll find you get an error rate on the order of a couple hundredths of a percent. So it turns out that even in the simple act of counting votes, we can't do it perfectly. And sometimes that imperfection swamps the signal that we're trying to read. It swamps the fact that one candidate may be very slightly preferred over another candidate. Given the measurement ability that we have, we might not be able to tell who actually won the election.
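The comparison between counting error and winning margin can be sketched with simple arithmetic. The following is a rough illustration, not an election-auditing method: the ballot total and both percentages are hypothetical, chosen only to match the orders of magnitude mentioned above, and the function name `counting_noise` is made up for the example.

```python
def counting_noise(total_ballots, margin_pct, error_rate_pct):
    """Compare the winner's lead with the ballots likely to be miscounted.

    Returns (margin in votes, miscounted ballots, True if the noise
    is at least as large as the margin).
    """
    margin_votes = total_ballots * margin_pct / 100
    miscounted = total_ballots * error_rate_pct / 100
    return margin_votes, miscounted, miscounted >= margin_votes

# Hypothetical race: 3 million ballots, a lead of five thousandths of a
# percent, and a counting error of two hundredths of a percent.
margin, noise, too_close = counting_noise(3_000_000, 0.005, 0.02)
print(f"lead: {margin:.0f} votes, likely miscounted: {noise:.0f} ballots")
print(f"noise swamps the margin: {too_close}")
```

When the number of ballots likely to be miscounted exceeds the lead, a recount is as likely to reshuffle the noise as to reveal the true winner.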
Steven Cherry: Charles, do you know—more than 100 years ago, maybe it was Mark Twain, maybe it was Benjamin Disraeli, maybe it was Charles Wentworth Dilke, said: "There are three kinds of lies: lies, damn lies, and statistics." Over the years, a number of books have been written to express that idea in long form—why did you decide to take it on, and why now?
Charles Seife: Well, before I was a journalist, I was a mathematician. I was going the Ph.D. route before I decided to join the circus. And one of the things that helped me decide to go into journalism was my annoyance at the fact that journalists didn't have the mathematical ability to see that certain things were obviously wrong, even when presented in numerical form. They seemed ill-equipped to challenge a bit of wisdom if it was in numerical format, so their sense of skepticism would shut down. I went into journalism trying to bring a little skepticism, a little bit of numerical understanding that was lacking. Over the years, I've been gathering stories, and just watching bad numbers filter down through the media to the public, and as I watched this problem, I realized it was a larger problem than I initially thought, that in fact, humans are ill-equipped, generally speaking, to deal with bad numbers, and we tend to turn off our BS detectors when someone presents us with a number. After gathering 21 years' worth of files, I decided it was time.
Steven Cherry: Well, I'm glad you did, it's a terrific book, and I hope a lot of our readers read it. Thank you!
Charles Seife: I really appreciate it.
Steven Cherry: We've been speaking with New York University journalism professor and science writer Charles Seife about his new book, Proofiness: The Dark Arts of Mathematical Deception. For IEEE Spectrum's "This Week in Technology," I'm Steven Cherry.