Want proof? Go to seminary
It's the end of October, so a couple of reminders:
1) No "Don't believe in ghosts?" heds on the front page. Or anywhere.
2) No polls about whether people do or don't believe in ghosts. And, while we're at it ...
3) No baking the poll data! As the Herald-Leader demonstrates in great detail at the top of today's 1A:
U.S. Sen. Mitch McConnell's popularity has continued to slip, suggesting vulnerability in his 2008 re-election bid. But he would still defeat any of four potential Democratic challengers if the race ended today, a new poll shows.
The mixture of conjecture and out-and-out misstatement isn't unusual. It's depressingly common, in pretty good newspapers and pretty bad ones alike, which is why it's worth exploring in some detail here. (Warning: Mildly lengthy screed about survey research, newsroom culture and epistemology follows.) Journalism about social science -- and don't kid yourselves; that's what polling is -- can go wrong in any of several domains, sequentially or all at once. Let's look at a couple before we get back to why Lexington's article above is so thoroughly wrong. (And, yes, there's a lesson for editors: If something's wrong, fix it or spike it.)
Newsroom discussions about ledes of this sort tend to be resolved in nonproductive ways: The person with more status and/or power says "Let's go with the writer on this one" or "well, I guess we just have a difference of opinion on that." Those are adequate answers in some cases, much less so in others. They come about because journalism generally does a bad job at levels-of-analysis issues. Let's take a made-up, but only slightly made-up, example.
Days have been getting longer and we've reached the point in the year at which "day" and "night" are of equal length (assuming we have an agreed definition of "day" and "night"). You say "It's 'spring'! Let's throw some virgins in the river to assure a good planting season!" I say "No, let's go to this drafty stone building, sing some cool songs and listen to a dude in a dress for a while." We have different beliefs about what makes for a good planting season, and that dispute can't be settled by the tools available to journalism; for our purposes, it isn't really a debate.
Or you could say "It's too early to plant" and I could say "No, we can probably get away with it, and we'd get more tomatoes." This is an argument about risk assessment -- more or less the equivalent of real political debate. One side thinks a particular course involves too much risk, and the other thinks the risk-benefit level is appropriate. That's the sort of discussion "objective" journalism, at least in the ideal world, is especially suited for: Evenhanded, impartial presentation of evidence. It's what editors mean when they say stuff like "We just give both sides and let people make up their own minds."
That's a comforting bit of belief and professional ritual, but it's usually a crock, because journalism applies it to too many types of question. Take a third alternative: You say it's the equinox, and I say "no it isn't." The "give both sides" conceit falls apart there. It doesn't work for issues of fact, because issues of fact aren't debatable (that's the "green cheese fallacy"; you can't "give both sides" in a dispute about whether the moon is made of green cheese because the pro-cheese side is fiction). It doesn't work for belief questions. And it doesn't work at all when you start at one level and hop to another.
What does that mean for this poll? Well, some of the story's assertions boil down to risk management issues. You can say a 2-point difference is a "lead" if you want to, but you need to know that you have a much higher risk of error. The reason we talk about "standards customarily used by statisticians" is that "one chance in three we're wrong" isn't one of them.
That's one level of analysis. The bigger one is found in the lede, which declares that the poll says McConnell would beat any of four Democrats if the race ended today. And as a blunt issue of fact (a green cheese issue), that ain't so. The poll says no such thing, and there are a lot of reasons why. But first, this commercial for newsroom sociology.
Journalism doesn't do probability very well. Part of that's down to lack of training (and yes, every journalism curriculum should require at least one course in statistical reasoning, quantitative research methods or something like that). But more to the point, journalism tries to be about certainty, and social science isn't in that realm. It's in the probability business. Hence the saying: If you want proof, go to seminary. In the mundane realm, we're stuck at 95%.
But even if everybody on the desk could crunch numbers like a maniac, we'd still have issues. Hedges and qualifications are inherently uninteresting, which helps drive Mark Liberman's astute observation last week that heds are subject to a Gresham's law of their own. Put another way, "CARTOONS MAKE KIDS VIOLENT!" is a more hed-like hed than "Some cartoon violence, in some situations, seems to make some kids act in ways that we think might translate into violent behavior outside the artificial setting in which we measured this effect."
That's not simply a matter of selling more papers (to Mark's credit, he didn't suggest that it was). It's tied in deeply with a cultural value that's even more thoroughly ingrained than competition: The "watchdog" function of the press. If Saturday morning cartoons do threaten to turn your poppet into Mad Max on a tricycle, we're going to err on the side of Saving Our Kids and slap the story on the front page. Questions of effect size or construct validity aren't at the table for that discussion because they aren't part of how journalism judges its role in societal well-being.* Selling more papers is always nice, but it's not necessary to the watchdog function.
One more extrinsic concern to throw in, which we'll call the Fallacy of the Remote Truck. Why is EyeWitlessActionNews5 broadcasting live from the courthouse at 11 p.m.? Because the truck cost a pile of money and it's not going to sit in the garage if we can use it, even if the only thing moving in the courthouse is the cockroaches. A good poll costs a lot, and by all appearances the Herald-Leader poll was done right. The Remote Truck fallacy doesn't guarantee it a spot on the front page, but it does suggest that a poll doesn't have to have a lot of "news" to be Big News at the organization that paid for it.
Which of those factors might be at play in the Herald story? Any or all of them. Power issues, lack of statistical training, misunderstanding of what different kinds of "error" look and act like, desire to satisfy professional norms, and the urge to say what stuff "means," rather than listing all the limits on the conditions under which it might mean something. So let's have a look:
McConnell still leads the pack
But poll suggests vulnerability
U.S. Sen. Mitch McConnell's popularity has continued to slip, suggesting vulnerability in his 2008 re-election bid. But he would still defeat any of four potential Democratic challengers if the race ended today, a new poll shows.
Hold on to "continued to slip" and "suggesting vulnerability" for a second. The issue here is the fact claim in the second sentence -- that the poll shows that McConnell would defeat any of the four Democrats. It isn't true. There are two broad reasons why.
One is practical. "Who would you vote for today?" doesn't answer questions about a future election very well. McConnell won't face four Democrats; he'll face one. Voters don't get to say "undecided" in the voting booth, as 13% to 21% of them did on this question (depending on the Democrat). We don't know how many "undecided" voters are telling the truth and how many just don't want to say. And, of course, preelection polls measure what people say they'll do -- not what they will do.
Could we just disagree about that -- is it a question about journalistic risk? Maybe. But the second issue is the stats themselves, and that's a green cheese issue. The numbers the study presents do not show that McConnell "would still defeat any of four potential Democratic challengers." Here's the boring way to say what they do show:
In two cases (Stumbo and Horne), significantly more voters say they would choose McConnell than the Democrat. Thus, we're 95% confident that if all "likely voters who vote regularly in statewide elections" were surveyed, McConnell would be ahead on this question. So he'd "win," as long as you don't count those pesky undecideds or worry about that 5% of cases in which our sample isn't an accurate reflection of the population. (Want "proof" rather than probability? Get those seminary applications ready.)
For Luallen and Chandler, it's a different story. In either case, the poll could be a perfectly accurate reflection of a lead for the Democrat in the whole population (again, ignoring the undecideds). The results don't represent a "statistical tie"; McConnell is likely to have a lead in the entire non-undecided population, but it's likely in the sense that it has about one chance in four of not being true.
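For anyone who wants to see the arithmetic behind "significantly more" versus "likely, but not at 95%," here is a rough sketch in Python. Everything numeric in it is an assumption for illustration: the story excerpt gives neither the sample size nor the candidate-by-candidate shares, so N = 600 and the two pairs of percentages are hypothetical stand-ins, not the poll's results. The function treats both shares as coming from the same sample and uses a normal approximation; the one-sided p-value it returns can be read, loosely, as the chance of seeing a gap this big if the "leader" weren't really ahead.

```python
import math

def lead_stats(p_a, p_b, n):
    """Gap between two shares from the same sample, its standard error,
    the z-score, and a one-sided p-value (normal approximation)."""
    gap = p_a - p_b
    # Variance of the difference of two proportions drawn from ONE sample
    # (they are negatively correlated, so this is not just two SEs added).
    var = (p_a + p_b - gap ** 2) / n
    se = math.sqrt(var)
    z = gap / se
    # Upper-tail probability of the standard normal at z.
    p_one_sided = 0.5 * (1 - math.erf(z / math.sqrt(2)))
    return gap, se, z, p_one_sided

N = 600  # ASSUMPTION: hypothetical sample size, not taken from the story

# ASSUMPTION: hypothetical shares, chosen only to show a wide and a narrow gap
matchups = [("wide gap", 0.50, 0.37), ("narrow gap", 0.45, 0.40)]

for label, a, b in matchups:
    gap, se, z, p = lead_stats(a, b, N)
    verdict = "significant at 95%" if abs(z) > 1.96 else "NOT significant at 95%"
    print(f"{label}: gap {gap:.1%}, z = {z:.2f}, one-sided p = {p:.3f} ({verdict})")
```

The point of the sketch is only that a gap can be "probably real" and still fall short of the customary standard; the actual odds depend on the real sample size and shares.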
... Fewer than half said they had a favorable impression of McConnell, the Senate Republican leader.
Still, the 47 percent who said they had a positive view of McConnell is higher than the 46 who don't.
"It's a mixed bag," said Del Ali, president of the Olney, Md.-based firm Research 2000 that conducted the poll. "He can rightly say 'I could be one of the higher-rated Republicans in the country, and I'm more of a target. In spite of that and the kitchen sink being thrown at me, there's a plurality who still think favorably of me.'"
Among the worst news for McConnell was that 46 percent of respondents disapproved of his performance compared to 45 percent who said they like what he's done -- the first time an independent media poll has shown the senator with a higher percentage of disapproval.
And any incumbent should be worried when his or her popularity and job approval ratings are under the 50 percent threshold, Ali said.
The reporter's interpretations are interspersed with the pollster's, and the reporter comes out second best. The reporter's making a lot out of two 1-point gaps, which are highly unlikely to represent meaningful differences (and thus unlikely to represent the sort of change that could be "worst news"). The pollster actually makes some relevant comments, partly because he sticks to generalities: The results are a mixed bag, and popularity/approval ratings under 50 percent are never a good sign for incumbents.
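To put a number on "highly unlikely to represent meaningful differences," here is a small Python sketch that computes the 95% margin of error on the gap itself for the two pairs quoted above (47 vs. 46 favorable/unfavorable, 45 vs. 46 approve/disapprove). The percentages are from the story; the sample size is not given there, so N = 600 is an assumption made purely for illustration, and the function name is made up for this example.

```python
import math

def gap_margin(p_a, p_b, n, z_crit=1.96):
    """Return the gap between two shares from one sample and the
    margin of error on that gap at the given critical value."""
    gap = p_a - p_b
    # Standard error of a difference of two proportions from the same sample.
    se = math.sqrt((p_a + p_b - gap ** 2) / n)
    return gap, z_crit * se

N = 600  # ASSUMPTION: the story excerpt does not state the sample size

pairs = [
    ("favorable vs. unfavorable", 0.47, 0.46),
    ("approve vs. disapprove", 0.45, 0.46),
]

for label, a, b in pairs:
    gap, moe = gap_margin(a, b, N)
    verdict = "distinguishable" if abs(gap) > moe else "not distinguishable"
    print(f"{label}: gap {gap:+.1%}, margin on gap +/-{moe:.1%} -> {verdict} from zero")
```

Under that assumed sample size, the margin on the gap is several times the gap. By the same formula, a 1-point gap between shares in the mid-40s would need a sample in the tens of thousands before it cleared the 95% bar.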
So how worrisome are the numbers in the lede? We don't know; there's nothing in the story to support the idea that there's been any movement at all.
On the positive side for McConnell, the poll showed he'd have at least a 5-percentage-point lead over each of four potential Democratic challengers. ... "The headline should be that McConnell beats all comers, which is remarkable given the fact that your paper as well as several D.C.-based liberal groups have waged an unrelenting attack on the (GOP Senate) Leader for the past several months," said Billy Piper, McConnell's chief of staff, in a statement.
Here we have a difference of opinion. That's more or less what the headline is, whereas it ought to say something like "Billy Piper is a moron." Unfortunately, we can't switch standards in midseason. We can't say a one-point gap in approval ratings is a big deal, then turn around and tell the chief of staff for the Senate Republican leader that a five-point gap doesn't represent a significant difference at traditionally accepted confidence levels.
We seem to have painted ourselves into a lot of corners here, none of them necessary. Some of the needed changes can be introduced at the training level (as in, reporters who can't explain confidence levels and confidence intervals aren't allowed to write about survey research). Some of them need fixing at the level of how newsrooms work. Those might be a bit, um, tricky.
* That's broadly true of journalistic interpretations of media effects in general. Think of the one justification that's offered every time there's an adverse public reaction to, say, a photo of a teenager's body after a car crash: "If it keeps even one kid from doing something stupid, it's worth it." Which is certainly a reassuring thing to think; it just doesn't have any known relationship to reality.
2 Comments:
"2. No polls about whether people do or don't believe in ghosts."
Too late. The AP's Fram and Tompson waved the bait and my local took it. Front page, last Friday.
Yeah, alas. I was hoping maybe the folks who had stashed that one in hope of localizing their own might have a twinge of conscience and spike it.