### Break out the clue bat

Is it polling season again already? All right, let's let Motown's Oldest Daily show us how to get almost everything wrong, on the off chance some of us want to get it right before the next round.

First question: Is the hed true? (This is the 2A hed; a truncated version of the story is also the 1A lede.) For several reasons, probably not. First, the survey measures intent, not cause. We have no way of knowing whether "conservative endorsements" made a difference because we didn't ask about it. That may make intuitive sense, since the poll followed the endorsements, and it may be what the pollster claims, but it's still a post hoc error, and asserting it as a fact is sloppy practice.

Question 1A, though: Is Cox (Mike Cox, the state AG) actually "on top"? That's less of a yes-or-no question than a "how likely" question, and the way you answer it sets a precedent that can affect your credibility. Cox is ahead by 2 points* in a sample of 400 Republican likely voters. Let's see what the paper thinks about that:

Cox's 2-percentage-point advantage over U.S. Rep. Pete Hoekstra of Holland was within the poll's margin of error but still statistically significant, especially since he overcame a 12-percentage-point lead Hoekstra had in late May.

Um, no. You can't be "within the margin of error" and "statistically significant" at the same time, any more than a basketball player can be 5-foot-6 and 6-foot-5 at the same time. (And a change in Candidate A's support from Time 1 to Time 2 has nothing to do with the significance of the A-B gap at Time 2, but that's dogpiling.) So let's recap those terms briefly.

A randomly drawn sample gives you a picture of the population you're trying to generalize about. How good a picture depends on a lot of things, but for today's purposes, it's primarily the size of the sample. (That's why you should always use "margin of sampling error" on first reference; that's a statistical function, and it has nothing to do with error introduced by, say, faulty question design.**) If you draw a lot of these samples, they'll eventually form a normal distribution around the "population value," or the result you'd get if you asked all Republican likely voters in Michigan. The margin of error describes the width of that curve at its base. Nineteen times out of 20, a sample of 400 will fall within 4.9 points on either side of the population value. There's a 5 percent chance (the 20th case) that it won't.***
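
For the skeptical, the 4.9 figure can be checked in a few lines of Python. This is a minimal sketch, assuming a simple random sample and the worst-case 50-50 split that is commonly used when a poll reports a single margin; the function name `margin_of_sampling_error` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_sampling_error(n, confidence=0.95, p=0.5):
    """Half-width of the confidence interval for one share.

    Assumes a simple random sample of size n. p=0.5 is the worst case,
    which gives the widest margin.
    """
    # Two-sided critical value: about 1.96 at 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)


# The 400 Republican likely voters in this month's poll.
print(f"{100 * margin_of_sampling_error(400):.1f} points")  # 4.9 points
```

The margin shrinks with the square root of the sample size, so you need four times the sample to cut it in half.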

Where does statistical significance come in? A "significant" difference means that at a predetermined level of confidence (traditionally**** 95%), your sample difference represents a real difference in the population. In a case like ours, a 50-44 difference between A and B wouldn't be significant: draw the curves yourself and you'll see the overlap.
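
Here's one way to do the curve-drawing with arithmetic instead of a pencil. It's a sketch only: it assumes a simple random sample of 400 and the same worst-case margin for both candidates, and the helper names are invented for this example. It also computes a margin on the gap itself, using the standard formula for the difference between two shares drawn from the same sample, which is a more direct way to ask the question than eyeballing overlap.

```python
from math import sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def interval(share, n, z=Z95):
    """Share plus or minus the worst-case (50-50) margin of sampling error."""
    margin = z * sqrt(0.25 / n)
    return share - margin, share + margin


def gap_margin(share_a, share_b, n, z=Z95):
    """Margin of sampling error on (A - B) when both come from one sample."""
    gap = share_a - share_b
    return z * sqrt((share_a + share_b - gap ** 2) / n)


a_low, a_high = interval(0.50, 400)
b_low, b_high = interval(0.44, 400)
print(f"A: {a_low:.3f} to {a_high:.3f}")   # A: 0.451 to 0.549
print(f"B: {b_low:.3f} to {b_high:.3f}")   # B: 0.391 to 0.489
print("intervals overlap:", b_high > a_low)  # intervals overlap: True
print(f"margin on the gap: {100 * gap_margin(0.50, 0.44, 400):.1f} points")
```

The 6-point gap sits well inside the margin on the difference, so both views agree in this case. The overlap check is the more conservative of the two, so they won't always agree.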

That doesn't mean there's no difference. If we turn down the confidence level to about two-thirds, the overlap goes away (draw the same curves with a base of 2.5 points on either side of the sample, rather than 4.9 points). A is probably leading B, but we're a lot less confident than we used to be. And when the gap gets down to 2 points, the best thing you can say is that there's an uncomfortably high likelihood that the difference came about by accident.
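
The confidence dial can be turned in code too. A rough sketch, assuming a simple random sample of 400, the worst-case 50-50 margin for each candidate, and "about two-thirds" taken as 0.68 (roughly one standard error either side); `margin` and `intervals_overlap` are hypothetical names.

```python
from math import sqrt
from statistics import NormalDist


def margin(n, confidence):
    """Worst-case margin of sampling error for one share."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(0.25 / n)


def intervals_overlap(gap, n, confidence):
    """Two equal-width intervals overlap when the gap is under twice the margin."""
    return gap < 2 * margin(n, confidence)


for gap in (0.06, 0.02):
    for confidence in (0.95, 0.68):
        m = 100 * margin(400, confidence)
        verdict = "overlap" if intervals_overlap(gap, 400, confidence) else "no overlap"
        print(f"gap {gap:.0%} at {confidence:.0%} confidence: +/-{m:.1f} points, {verdict}")
```

Under those assumptions the 6-point gap clears the bar once the margin drops to about 2.5 points, and the 2-point gap doesn't clear it at either setting.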

Not only is the difference in today's story nonsignificant, but we need to be careful about those changes from last month too. Given same-size samples, neither change (Cox's 8-point increase and Hoekstra's 6-point decline) would be significant. Both are more likely to be real than not, but you can't proclaim one set of rules and play by another.

Is it appropriate to compare the May and June samples? Well, sorta. This month's poll has 800 likely voters, 400 from each party, yielding a margin of sampling error of 4.9 points at 95% confidence for the two party subsamples. May's poll reports an N of 600, with a margin of 4 points, but it doesn't give subsample sizes. Not necessarily apples and oranges, but at the least two different varieties of apple.
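
You can also run the margin formula backward to see what sample size a reported margin implies. A minimal sketch, assuming simple random samples, 95% confidence and the worst-case 50-50 split; `implied_sample_size` is an invented name, and because reported margins are rounded, the result is only approximate.

```python
from statistics import NormalDist


def implied_sample_size(margin, confidence=0.95, p=0.5):
    """Sample size that would produce the given margin of sampling error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z ** 2 * p * (1 - p) / margin ** 2


print(round(implied_sample_size(0.049)))  # 400
print(round(implied_sample_size(0.04)))   # 600
```

If those assumptions hold, May's 4-point margin matches the full N of 600, which would mean any party subsample from that poll carries a wider margin than the one reported.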

Couple of lessons to take away. One, there's not a lot of news value in this poll. Phrase it as "holder of statewide office looks more like being about even with congressman in non-incumbent race for governor than a month ago" and you get a better picture. Two, the standards you set now are the ones you're likely to be held to later. Readers won't be able to tell statistical ineptitude from partisan bias, and if you're this careless about what gets a poll story to the top of the front, you can expect them to infer the worst whenever they disagree with a result.

* Hi, Fish!

** "If the election were held today, would you vote for A or do you hate America?"

*** If you want proof, go to seminary.

**** There's still no excuse for not reporting it.

Labels: clues, polls, statistics

## 1 Comment:

Hey fev. I agree that "statistically significant" is meaningless in this context and that the story is muddled. But Cox is indeed probably in the lead. As Kevin Drum has to point out every few years (with a handy little table of confidence levels), even a slim lead in a poll means you're probably ahead -- in this case, he's roughly a 65% favorite. That's less than the 95% that pollsters like, but it's not nothing.

Of course I don't expect journalists to understand this stuff any more than you. But it's worth explicating.

Also, no comments about the "... lifts Cox" in the hed?
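
A figure in the neighborhood of the commenter's 65% can be reproduced with a back-of-envelope calculation. This is a sketch under stated assumptions, not necessarily the method behind the table the comment mentions: it uses a normal approximation, a simple random sample of 400, and by default treats the two candidates' shares as adding up to nearly 100%. The function name and the `combined_share` parameter are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def chance_leader_is_ahead(lead, n, combined_share=1.0):
    """Rough probability that the sample leader really leads in the population.

    lead: A minus B in the sample, as a fraction (0.02 for 2 points).
    combined_share: A plus B as a fraction; 1.0 is the worst case.
    """
    # Standard error of the gap between two shares from the same sample.
    gap_se = sqrt((combined_share - lead ** 2) / n)
    return NormalDist().cdf(lead / gap_se)


print(f"{chance_leader_is_ahead(0.02, 400):.2f}")
# Hypothetical case: the top two candidates split only half the vote.
print(f"{chance_leader_is_ahead(0.02, 400, combined_share=0.5):.2f}")
```

With the default assumption the first number lands at roughly 65 percent. In a multicandidate primary where the top two hold a smaller combined share, the same 2-point lead comes out somewhat more likely to be real, though still far short of 95%.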
