May 8, 2008, 9:17 pm
My print column this week asks the question, do you know when you’ve been too drunk to drive, and if you do, would you tell a stranger about it over the phone? This question hangs over three recent surveys on the topic, by the government, a restaurant trade group and an AAA-funded foundation.

My guess is that few drivers carry around this chart from the Wisconsin Department of Transportation that allows drivers to estimate their blood-alcohol level. Even if they did, and knew for sure how many drinks of what size they’d consumed, they might have trouble calculating their sobriety.
But suppose you have no doubt you’ve driven drunk. Would you tell a pollster? Survey researchers have found that respondents’ desire to appear in line with social norms can influence results. This can mean exaggerating desirable behavior, like giving to charity, and downplaying stigmatized behavior such as driving drunk. Given that confounding effect, it’s troubling that 9% of drivers admitted in the AAA Foundation survey to having driven at least once in the last 30 days when they believed their blood-alcohol level was over the limit. The government survey found a rate of 15% over the previous year, though it may have included drivers who had been drinking but weren’t drunk. Such findings could lend fuel to legislative efforts to broaden the use of so-called ignition interlock devices, which test drivers’ breath for alcohol and prevent driving if the level detected is too high.
What do you think? Would you know for sure whether you’ve driven while legally drunk? How honest would you be with a pollster about it? Should ignition interlock devices be mandated for first-time offenders? Please let me know in the comments.
May 8, 2008, 3:22 pm
Here’s a reading list of articles about sports numbers, from the true distance of legendary home runs to a formula for measuring the chance of a basketball comeback:
Mickey Mantle hits a home run eight years after his supposed 565-foot blast. (Associated Press Photo)
Mickey Mantle supposedly hit a home run 565 feet 55 years ago. “The more evidence surfaced to debunk it, the stronger the legend of 565 grew,” Jeff Passan wrote on Yahoo last month. In place of the homer that is supposedly history’s longest, a new breed of old moonshots is bidding for recognition. Among the Mantle skeptics is an author of an upcoming book on tape-measure shots who favors a Babe Ruth blast in 1926. Meanwhile, just days after the Mantle anniversary, two Pittsburgh baseball fans presented the theory that, three years before Mantle’s home run, former Pirates slugger Ralph Kiner hit one farther.
Baseball is the sport most awash in numbers — the average box score contains more data than a soccer season — and its stats-heads are getting a whole new set of information to parse, thanks to pitch-tracking technology. Next up, analysts hope, is information about the movement of the ball after it leaves the bat, and fielders in reaction to it. On Hardball Times, Mike Fast proposes 22 potential areas of research, including: “Can we quantify the performance and tendencies of umpires?” and “Can we detect or quantify pitcher fatigue or injury?”
The authors of “Freakonomics” would like to see baseball-style stats crunching for basketball, and note that the Boston Celtics already are employing a numbers maven. Those who read the article and want their own numerical insight into the NBA playoffs can start at 82games.com, which quantifies players’ value to their teams. Among the noteworthy findings: The Celtics-Cavaliers second-round series features the league’s two most valuable players, both in the regular season and the playoffs so far: Kevin Garnett and LeBron James. This 2004 profile of the site shows the uphill battle founder Roland Beech has faced in introducing new stats to the league.
Bill James, the legendary baseball numbers guru who advises the Red Sox, could help basketball make that transition. As it turns out, he’s a Kansas basketball fan. Before the NCAA tournament, he made public his personal calculator for determining when a lead is safe — when the series of events required for the losing team to overcome it is essentially impossible. Had he consulted it when his Jayhawks were trailing by nine points with less than two minutes left in the national championship game, Mr. James would have been happy to see the output: “No way” was Memphis’s lead safe. Kansas came back to force overtime and win.
Tennis umps got the call right in 60% of player challenges studied by a University of Sussex psychologist. George Mather analyzed 1,473 challenges lodged in 2006 and 2007, and judged by the Hawk-Eye replay system. Of these, 94% really were close calls. “That, Dr Mather claims, suggests strongly that players challenge when they genuinely believe that a mistake has been made, rather than using it as a ploy to unsettle an opponent or to buy vital seconds to rest before a critical point,” according to the Times of London.
May 7, 2008, 2:21 pm
Following the Indiana and North Carolina primaries, the campaigns of the two Democratic presidential hopefuls used numbers — or excluded them — to tell very different stories about the state of the race. Sen. Barack Obama’s Web site prominently featured his lead in the delegate count, while Sen. Hillary Clinton’s campaign described her win in Indiana as a come-from-behind victory. These numerical narratives — intended to influence both the media and the superdelegates — have limitations.

Sen. Obama’s home page showed that he held 1,853 delegates, to Sen. Clinton’s 1,698.5. The results center, just a click away, shows the delegate totals on a horizontal bar, with 2,025 delegates — the minimum needed to clinch the party’s nomination — marked as the midpoint. Earlier this week, the graphic underplayed Sen. Clinton’s delegate count, making her look further behind than the numbers suggested. But as of Wednesday morning, my check suggests the numbers and visuals jibe.
Sen. Clinton has argued since February, on her campaign-funded site delegatehub.com, that the target delegate number is 2,209 — which includes delegations from Michigan and Florida. Both states were stripped of delegates because they violated party rules by holding primaries in January. The candidates have clashed over how to divvy up the delegations if they are seated.
Despite its name, delegatehub.com doesn’t contain much in the way of delegate numbers besides the 2,209 figure. There is no delegate count, nor profiles of the delegates or anything else one might expect on a hub for delegates. Among its posts is a link to a March item from Marc Ambinder’s Atlantic blog about the popular vote, in which he discusses a scenario in which the cumulative popular vote from primaries and caucuses could split for Sen. Clinton. However, North Carolina’s massive turnout and 14-point win for Sen. Obama has upended that scenario. The Pennsylvania, Indiana and North Carolina primaries were essentially a wash in popular votes, according to vote totals on CNN.com. (By my count, Sen. Obama is ahead in those three most recent primaries by a combined 107 votes.)
In a memo sent to superdelegates Wednesday and posted on its homepage, the Obama campaign both played up this popular-vote gain, and downplayed it. “We believe it is exceedingly unlikely Senator Clinton will overtake our lead in the popular vote and in fact lost ground on that measure last night,” campaign manager David Plouffe wrote. “However, the popular vote is a deeply flawed and illegitimate metric for deciding the nominee — since each campaign based their strategy on the acquisition of delegates.”
Notably, the Clinton campaign also didn’t dwell on the popular vote after Tuesday’s results. Instead, a media release Tuesday night focused on the narrow victory in Indiana. By the Clinton numbers, this is a tale of a late, great comeback: “We started out behind in both the public and internal polls. For example, our March 13 poll showed Hillary trailing by 8 points, while our latest poll gave Hillary a 5 point lead. … Similarly, in mid-February, the Howey-Gauge poll had Barack Obama 15 points ahead of Hillary Clinton (Feb 16-17: Obama 40 / HRC 25). By April 23-24, Hillary had narrowed the gap to only 2 points in the same poll (Obama 47 / HRC 45).”
A trend depends entirely on the endpoints you choose. Four polls conducted in late March and early April all showed Sen. Clinton with a bigger lead than her eventual margin, according to numbers compiled by Pollster.com. In the two weeks before the primary, none of the 18 polls showed Sen. Obama with a lead bigger than two points, and most showed a healthy Clinton lead. She won the state by two points.
Spinning the numbers is an exercise to influence the remaining 260 to 270 uncommitted superdelegates. Sen. Clinton needs to win over most of them, because just 217 pledged delegates in six primary contests remain. Yet before Tuesday’s results, a mathematical model projected that Sen. Obama was likely to win a majority of the undecided superdelegates. I wrote about this model last week.
Since then, the model’s creator, Brian F. Schaffner, research director of American University’s Center for Congressional and Presidential Studies, adjusted it to incorporate the preferences of superdelegates in the same state as the undecideds. Doing so increases the model’s ability to predict prior endorsements to 77% from 70%. It also puts the probability of supporting Sen. Obama at 60% or higher for 139 superdelegates, compared with just 41 for Sen. Clinton. The 10 undecided superdelegates in Indiana and North Carolina each have at least a two-thirds chance of supporting Sen. Obama.
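For readers who want to check the delegate arithmetic behind the claim that Sen. Clinton needs most of the uncommitted superdelegates, here is a rough sketch in Python. It uses the counts quoted above from Sen. Obama's site and the 2,025 target. The even split of the remaining pledged delegates and the 265 midpoint for uncommitted superdelegates are assumptions made purely for illustration, not projections.

```python
# Figures quoted in the post. The 50/50 pledged split and the 265 midpoint
# are illustrative assumptions, not forecasts.
CLINCH = 2025.0        # delegates needed, excluding Michigan and Florida
CLINTON = 1698.5
OBAMA = 1853.0
PLEDGED_LEFT = 217     # pledged delegates still to be awarded
SUPERS_LEFT = 265      # assumed midpoint of the 260-to-270 range


def supers_needed(current, pledged_share):
    """Superdelegates a candidate still needs to reach CLINCH,
    given the share of remaining pledged delegates he or she wins."""
    shortfall = CLINCH - current
    return max(0.0, shortfall - pledged_share * PLEDGED_LEFT)


clinton_needs = supers_needed(CLINTON, 0.5)
obama_needs = supers_needed(OBAMA, 0.5)

print(f"Clinton: {clinton_needs} ({clinton_needs / SUPERS_LEFT:.0%} of uncommitted)")
print(f"Obama:   {obama_needs} ({obama_needs / SUPERS_LEFT:.0%} of uncommitted)")
```

Under those assumptions, Sen. Clinton would need roughly four of every five uncommitted superdelegates, while Sen. Obama would need about one in four. Changing `pledged_share` shows how sensitive the result is to the remaining primaries.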
May 5, 2008, 11:18 am
The New York Police Department and a sociologist squared off in a dispute over marijuana-arrest numbers last week. As is typical in a numbers fight, both sides massaged data to their benefit.

The sociologist, Harry G. Levine of Queens College, co-wrote a report for the New York Civil Liberties Union claiming that New York arrests and jails more people for possessing marijuana than any other city in the world. The report counted about 393,000 marijuana-possession misdemeanor arrests in the decade through December 2007, based on police stats.
Several articles headlined a quote from the civil-liberties group’s executive director that New York is “the marijuana arrest capital of the world.” That’s really a best guess — “as best as we can determine, this statement is accurate,” the report says, lacking hard numbers on many large cities’ arrests. The logic goes like this: Just a couple dozen cities world-wide are in the same size range as New York. Of these, most are poorer and “simply cannot afford to use police in the way that New York does,” according to the report. The rest are either in Europe, which has a more lax approach to marijuana, or in Asia, where marijuana use appears to be lower. (Prof. Levine emailed me several news articles about these countries, including one which placed the number of annual arrests in all of Japan a decade ago at between 1,500 and 2,000.)
Notably, these are arrest totals, not rates. New York, which had more than double the population of U.S. runner-up Los Angeles in 2006, would be the U.S. capital of many categories by that measure. For instance, in the first half of last year it was the site of 235 murders, more than any other city, though Chicago’s per-capita rate was more than twice as high. New York is also among the leaders in per-capita marijuana-arrest rates, but the report acknowledges that Atlanta has a higher per-capita rate. “I will, in the next six months to a year, really investigate what’s going on in Atlanta,” Prof. Levine told me. He added, “I have always sought to present numerical data as responsibly and conservatively as I could.”
NYPD Deputy Commissioner Paul J. Browne criticized the report’s “absurdly inflated numbers” to the New York Sun and the local CBS affiliate. I asked Mr. Browne what he meant. He didn’t dispute the overall arrest numbers, though he pointed out that overall marijuana arrests decreased between 2003 and 2006 by about 25%, compared with the previous four-year period. However, arrests rose sharply last year — up 23% from 2006.
Mr. Browne’s bigger dispute was with the characterization of the arrests — that most people arrested possessed “only a small amount,” and “were not smoking in public.” Mr. Browne responded that such offenses would result only in a violation. Mr. Browne said police issued 8,770 marijuana-related violations between 1997 and 2006.
The report also accused the NYPD of bias, because more than half of arrestees from 1997 to 2006 were black, more than three times the number of arrested non-Hispanic whites. Yet there are about 40% more non-Hispanic whites than blacks in the city. But there’s no reliable survey of marijuana use by race in New York City. The report based its criticism in part on two government-funded national surveys, each of which finds that marijuana use tends to be higher among whites than among blacks. Those reports relied on people to report their own drug use — something Mr. Browne took issue with. They “compared a national survey of individuals volunteering information about marijuana use — with no way of confirming truthfulness — to actual street arrests in New York,” he said. He cited a RAND Corp. study last year that found little racial bias in the NYPD’s street interactions with the public.
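To see what the report's two ratios imply on a per-capita basis, the arithmetic can be sketched as follows. The inputs are the rounded figures quoted above ("more than three times" and "about 40% more"), so the result is only a rough lower bound, and it says nothing about underlying rates of marijuana use.

```python
def per_capita_rate_ratio(arrest_ratio, population_ratio):
    """Ratio of per-capita arrest rates between two groups.

    arrest_ratio:      arrests in group A divided by arrests in group B
    population_ratio:  population of group B divided by population of group A
    """
    # (arrests_A / pop_A) / (arrests_B / pop_B)
    #   = (arrests_A / arrests_B) * (pop_B / pop_A)
    return arrest_ratio * population_ratio


# Rounded figures from the report as quoted in this post.
ratio = per_capita_rate_ratio(arrest_ratio=3.0, population_ratio=1.4)
print(f"Per-capita arrest rate ratio: about {ratio:.1f} to 1")
```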
May 1, 2008, 1:28 pm
The Centers for Disease Control and Prevention released Wednesday a new report on breastfeeding, which it posted on its Web site under a link entitled “Breastfeeding in the U.S. at an All-Time High.” The press mostly repeated that storyline, leading reports with the news that 77% of 434 infants studied in 2005 and 2006 were breastfed at least once, up from 70% over the two prior years and from 60% in 1993-1994. (The same goes for an Associated Press article picked up by The Wall Street Journal and its Health Blog.)

But looked at another way, the CDC numbers show that breastfeeding is flat — and the rate of long-term acceptance of the practice is declining among those who try it. The latest available rate of breastfeeding for six-month-old infants barely cleared 30%, well short of a federal-government goal of 50% by 2010, and barely budged from a decade earlier.
Taken collectively, the numbers mean that more new mothers are trying breastfeeding, but a smaller percentage of those who do try breastfeeding stick with it — and that can have serious health consequences. “It is exclusive breastfeeding for about six months that is most related to optimal health outcomes,” said Lori Feldman-Winter, a pediatrician at the University of Medicine and Dentistry of New Jersey who has helped steer American Academy of Pediatrics efforts to increase breastfeeding rates. Jane Morton, who has also contributed to these efforts and is a clinical professor of pediatrics at Stanford University, told me, “A lot of the benefits really do depend on the exclusivity and duration of breast-feeding.”
Margaret McDowell, a CDC health statistician and co-author of the latest report, told me that both indicators are important. Early breast milk, also called colostrum, contains antibodies and protein that help protect newborns, and that formula doesn’t provide. “Any amount [of breast-feeding] is really good for the infant,” said Ms. McDowell, a registered dietitian. As for the flat six-month rate, “We’d like to do better.”
Hospitals and the workplace can impede progress. Women who get off to a poor start are likely to stop breastfeeding, and their attempt can be hampered from the moment of birth, particularly in the case of C-sections, when the child often is taken to a nursery, Dr. Morton said. “The majority of hospitals give free samples of formula and formula company marketing materials,” Dr. Feldman-Winter said. On the job, keeping the milk supply up can be challenging. “Poor women have jobs with less support for continued breastfeeding and they are more likely to return to work sooner after delivery,” Dr. Feldman-Winter said.
The numbers themselves are part of the challenge of increasing breastfeeding rates: The data are old, and include a lot of uncertainty. They come from the National Health and Nutrition Examination Survey, in which thousands of Americans each year who agree to participate are interviewed in their homes and then undergo physical examinations in mobile centers. This is expensive work, hence the mere 434 infants included in the latest survey.
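To get a feel for how much uncertainty a sample of 434 carries, here is a back-of-the-envelope sketch. It treats the infants as a simple random sample, which is an assumption: the actual survey uses a more complex design, so the true margin is likely somewhat wider than this textbook formula suggests.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p observed in a
    simple random sample of size n (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)


# 77% of 434 infants were breastfed at least once, per the CDC report.
moe = margin_of_error(0.77, 434)
print(f"Margin of error: about {moe * 100:.1f} percentage points")
```

By this rough measure the 77% figure carries a margin of about four percentage points either way, which is more than half the size of the reported seven-point rise from 70%.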
Because not all of the infants born in 2005-2006 had reached six months by the time the latest survey was conducted — Ms. McDowell couldn’t say how many had — there wasn’t enough data about breastfeeding at six months for the group. So the CDC’s latest data for the six-month indicator came from infants born in 2003-2004. The data are grouped in two-year periods to build a large enough sample, delaying findings.
Also, the breastfeeding rates are self-reported — meaning the numbers could reflect the increased desire of mothers to breastfeed, rather than increased practice. (The latest numbers agree with another CDC survey, also based on self-reporting.)
April 30, 2008, 11:21 am
The title of Leonard Mlodinow’s book, “The Drunkard’s Walk,” evokes the randomness of events, as if governed by drunken ambling. Seeing the world through this lens is itself disorienting: success is the product of luck; identifying real patterns is nigh impossible; and our natural faculties mislead us at every turn. In recent weeks we’ve explored this world through a probability quiz (and debated the answers). Today I’m interviewing Mr. Mlodinow, a lecturer at Caltech who has written for “Star Trek” and collaborated on a book with Stephen Hawking, about his latest book, and about the role of randomness in our lives. (Thanks to readers Kevin Friedman, Devon McCormick and Craig L. Sparks for submitting questions.)
Leonard Mlodinow (Photo by Marcio Fernandes)
WSJ: You argue persuasively that much of what we consider a track record of expertise is really an accident of luck. Is there any true expertise, in your opinion? Are there any experts you trust?
Mr. Mlodinow: I believe there is true expertise in some endeavors, and not in others. There is obviously no such thing as expertise in predicting the results of coin tosses, but there is expertise in predicting the behavior of lasers. I feel that picking stocks or predicting Hollywood hits is more like the former. The process of building a company or making a film is more like the latter.
But there is a related question: Given that we are discussing an endeavor in which it is possible, how can you tell if someone has expertise? That is hard, because expertise plus bad luck can equal a failure, and lack of expertise plus good luck can equal success. The only way to tell the two apart is to observe the individual over a long time, which in statistics often means 100 or even 1,000 trials. This is obviously often not possible, so I recommend instead that we judge people by a thoughtful analysis of their intelligence, philosophy, work ethic, etc., rather than simply by their results.
WSJ: If you can pick an index-outperforming stock 51% of the time, how many picks do you need to make to have better than a 99% chance of outperforming the index? (We’ll assume your picks are uncorrelated and that the magnitude of any outperformance or underperformance is the same.)
Mr. Mlodinow: Consider a stock analyst versus an index fund in a kind of stock-picking World Series. The law of large numbers says if you play a best-of-X series you can be confident that the best team will win — if X is large enough. But for small X, say, a best-of-seven series, there is a surprisingly large chance that the lesser team will win. So in sports just because one team is superior doesn’t mean it will win the series.
The same uncertainty applies to the market. For example, suppose the stock picker has a 51/49 edge over the index fund, meaning he or she will outperform it, in the long run, in 51% of the years in which they compete. How long is the long run in this case? The mathematics shows that in order to justify 99% confidence that the stock picker will outperform the index fund more often than it underperforms it, the contest would have to go on for about 13,700 years.
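Both points in this answer can be checked with a short calculation. The sketch below is not Mr. Mlodinow's own computation; it uses the binomial distribution for a short series and a standard normal approximation for the long-run contest. The 55% per-game edge in the series example is an assumption chosen only for illustration.

```python
from math import ceil, comb, sqrt
from statistics import NormalDist


def series_win_probability(p, games):
    """Chance that a team winning each game with probability p takes a
    best-of-`games` series (games must be odd)."""
    need = games // 2 + 1
    # Playing out every game does not change who reaches `need` wins first.
    return sum(comb(games, k) * p**k * (1 - p) ** (games - k)
               for k in range(need, games + 1))


def trials_for_confidence(p, confidence):
    """Normal-approximation estimate of how many independent trials are
    needed before a per-trial edge p wins more than half of them with
    the given confidence."""
    z = NormalDist().inv_cdf(confidence)
    return ceil((z * sqrt(p * (1 - p)) / (p - 0.5)) ** 2)


# Assumed 55/45 edge: how often does the better team win a best-of-seven?
better_team = series_win_probability(0.55, 7)
print(f"Better team wins best-of-seven: {better_team:.1%}")

# The 51/49 stock picker at 99% confidence.
years = trials_for_confidence(0.51, 0.99)
print(f"Years needed: roughly {years:,}")
```

With a 55/45 edge the lesser team still wins a best-of-seven series about four times in ten. For the 51/49 stock picker, this approximation lands at roughly 13,500 years, in the neighborhood of the figure quoted above; the exact number depends on the approximation and rounding used.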
WSJ: Just because a certain human achievement — say, clutch hitting, or successful stock picking — exhibits the normal statistical variation, does that necessarily mean the best performers were just lucky? Or is there something about human intentionality that makes it possible that the best performers really did exhibit extraordinary skill and were deserving of the result?
Mr. Mlodinow: Intentionality and talent always matter. An extraordinary feat is certainly made more likely by someone’s focus, hard work, etc. But chance also matters. And since there are few situations outside the science laboratory in which the random influences can be eliminated, luck is almost always a part of the statistical variation we observe in people’s feats.
In order to judge which is dominant, we have to consider the specific endeavor. In sports this has been studied extensively. For instance, though basketball players often make many baskets in a row, you can compile a player’s probability of making a basket after making the previous shot and compare it to that player’s probability of making a basket after missing the previous shot. This has been done for many players, even players known to be “streaky,” and the probabilities always come out to be equal, and so the streaks seem to be due to random variation rather than a “hot hand.” Moreover, the patterns of streaks that occur in most major sports have been studied, and look exactly like what one would expect from a purely random process, such as a coin-tossing model. This leads me to believe that that is probably what is going on.
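The comparison described here, a player's make rate after a make versus after a miss, is easy to sketch. The code below is an illustration, not the method of any particular study: it runs the check on a simulated shooter whose shots are independent by construction, so any streaks it produces are pure chance. The 50% shooting percentage and the random seed are arbitrary assumptions.

```python
import random


def conditional_make_rates(shots):
    """Given shots in order (1 = make, 0 = miss), return the make rate
    after a make and the make rate after a miss."""
    after_make = [cur for prev, cur in zip(shots, shots[1:]) if prev == 1]
    after_miss = [cur for prev, cur in zip(shots, shots[1:]) if prev == 0]

    def rate(outcomes):
        return sum(outcomes) / len(outcomes) if outcomes else float("nan")

    return rate(after_make), rate(after_miss)


# Simulated shooter with no hot hand: every shot is an independent coin flip.
rng = random.Random(0)
simulated = [1 if rng.random() < 0.5 else 0 for _ in range(10_000)]
sim_after_make, sim_after_miss = conditional_make_rates(simulated)

print(f"Make rate after a make: {sim_after_make:.3f}")
print(f"Make rate after a miss: {sim_after_miss:.3f}")
```

For an independent shooter the two rates come out nearly equal even though the sequence contains long runs of makes. A real hot hand would show up as a noticeably higher rate after a make.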
WSJ: Do you think that part of the appeal of sports is their simplicity — that there is, generally, a level playing field and success seems deserved — as opposed to, say, picking stocks or picking movies for a Hollywood studio?
Mr. Mlodinow: Success in sports is deserved, but even for the best players, the headlines usually come from the fluctuations rather than the norm, and chance is usually a large part of it. A ballplayer may average a hit per game, or a basketball player 20 points per game, and that will make them stars in the long term. But it is when a player has four hits or makes 40 points in a game that people really start talking. I think the fun of following the movie box office and stocks is very similar to the fun of sports: all three combine passion and unpredictability.
WSJ: After a particular drug is on the market, it will cause a particularly serious adverse effect to happen to one of every 3,000 patients in an epidemiological, i.e., post-hoc analysis. In retrospect, how many patients must be tested in the randomized, double-blinded, placebo-controlled, clinical trial to achieve 95% confidence that the side effect will show up?
Mr. Mlodinow: You need roughly 14,000 patients. Here is how you get that: The process is governed by the binomial distribution, which can be approximated by the normal distribution. The chance of an adverse reaction in any one patient is one in 3,000. Since you want a 95% confidence interval for one (or more) reactions, you want enough patients so that 1.00 is 1.64 standard deviations (or more) below the mean. With 14,000 patients the mean number of adverse reactions will be about 4.6 and the standard deviation is about 2.2, which gives you what you require. (I have rounded my answer to the nearest 1,000).
Correction: To achieve 95% confidence that the side effect will show up, you need 8,985 patients receiving the drug. This blog post misstated the number as 14,000. See the comments of this post for more details.
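The corrected figure can be reproduced directly, without the normal approximation. A minimal sketch, assuming each patient independently has a one-in-3,000 chance of the reaction: the chance that nobody among n patients shows it is (1 - 1/3000) to the power n, and we want that to fall to 5% or less.

```python
import math


def patients_needed(rate, confidence):
    """Smallest number of patients n such that the chance of seeing at
    least one adverse event, 1 - (1 - rate)**n, reaches `confidence`."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - rate))


n = patients_needed(1 / 3000, 0.95)
chance_at_least_one = 1 - (1 - 1 / 3000) ** n
print(f"Patients needed: {n:,} (chance of at least one event: {chance_at_least_one:.4f})")
```

This agrees with the correction to within rounding, and it shows why the normal approximation overshoots: with only a handful of expected events, the binomial distribution is too lopsided for the bell curve to fit well.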
WSJ: Might we need to proceed irrationally in our lives to succeed? In other words, if we really believed that so much of success was the result of luck, wouldn’t a lot of us just give up trying?
Mr. Mlodinow: Some theorize that this is the evolutionary reason that we like to assume we are in control, even when we clearly aren’t. That may be so, but I don’t mourn the role of luck, I celebrate it. All else equal, it is a lot more fun not knowing how your book will do, or how your life will turn out, than it would be if everything could be determined by a logical calculation. Moreover, the fact that luck matters means you can help yourself by being persistent. A failure doesn’t mean you are unworthy, nor does it preclude success on the next try. As Thomas J. Watson, the highly successful IBM pioneer, said, “If you want to succeed, double your failure rate.”
WSJ: How would you respond to Mark Twain’s quip that “People commonly use statistics like a drunk uses a lamp post: for support rather than illumination”?
Mr. Mlodinow: I think Mark Twain was 110% correct.
Next up in the Numbers Guy interview series: New York University political scientist Steven Brams, author of “Mathematics and Democracy: Designing Better Voting and Fair-Division Procedures.”