Tell your statistics to shut up
Hurrah the mass media. Sometimes they get enough of the Big Picture to raise your hopes a bit -- notwithstanding the vast number of details they manage to slaughter in the bargain. Such is the latest offering in the NYT's public editor's slot, "Precisely false vs. approximately right: A reader's guide to polls" (p. 10 of today's Week in Review section).
This week's author is Jack Rosenthal, whom we last encountered in this spot when he claimed to be pinch-hitting (he wasn't, but that's another story) for Bill Safire a few weeks back. And our roving utility dude makes some overall sharp points about survey data and what to do -- or not do -- with it: Methods matter. Details matter. Sources matter. And even the mighty Times isn't above running unsupported nonsense based on otherwise innocent polling data.
However good his intentions, though, he gets off on the wrong foot early:
The Times published a correction explaining the misrepresentation, and the news media that used the story would probably agree with what Cliff Zukin, a Rutgers authority on polls, told Mystery Pollster, a polling blog: how unfair it is to publish a story “suggesting that college students on spring break are largely drunken sluts.”
One doubts they'd agree with any such thing. Usually, when you point out that a "survey" story -- like survey.com's fake tale of how much time workers waste on the job -- is bogus from the word go, the response from the "news media" is something like: Tell your statistics to shut up.*
Overoptimism, at least, isn't a sin. Oversimplification is, and that's where Rosenthal starts to commit errors almost as awful as the ones he decries. Let's look at a few:
Beware of decimal places. When a polling story presents data down to tenths of a percentage point, what the pollster almost always demonstrates is not precision but pretension. A recent Zogby Interactive poll, for instance, showed that the candidates for the Senate in Missouri were separated by 3.8 percentage points. Yet the stated margin of sampling error meant the difference between the candidates could be seven points. The survey would have to interview unimaginably many thousands for that zero point eight to be useful.
Here a good idea -- beware of spurious precision -- becomes a nonsensical, if not outright dangerous, commandment: Down with decimal points! Off with their little heads! His example shows why:
... the candidates for the Senate in Missouri were separated by 3.8 percentage points. Yet the stated margin of sampling error meant the difference between the candidates could be seven points.
Whoa. If he's referring to the July Zogby poll, the distance between candidates is correct, but the margin of sampling error would have to be 1.6 points for that sentence to make any sense. If we take the margin of sampling error from the June version of that poll (+/- 3.4 points), the "difference between the candidates" could be anything from a 10.8-point Talent lead to a 3-point McCaskill lead and still be considered a non-chance representation of the population. Why we're implicating the poor decimal point in 3.8 is a bit baffling.
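(For anybody who wants to see those bounds fall out of the arithmetic, here's a quick sketch in Python. It treats the margin on the difference between candidates as twice the per-candidate margin -- the simple, conservative convention, and the one the numbers above assume.)

```python
# Quick check of the June numbers: a 3.8-point gap with a
# +/- 3.4-point margin of sampling error on each candidate's share.
gap = 3.8   # Talent minus McCaskill, in percentage points
moe = 3.4   # margin of sampling error per candidate

# Worst cases: the two shares err in opposite directions, so the
# margin on the difference is twice the per-candidate margin.
print(gap + 2 * moe)  # 10.8 -> a 10.8-point Talent lead
print(gap - 2 * moe)  # -3.0 -> a 3-point McCaskill lead
```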
...The survey would have to interview unimaginably many thousands for that zero point eight to be useful.
Here, he probably means "for the 0.2" to be useful rather than "the 0.8" (anybody who would round 0.8 downward would suck eggs). Either way ... huh? For one thing, it doesn't take "unimaginably many thousands" to make the margin of a Senate campaign relevant (Cantwell pipped Gorton by less than 4,000 in the 2000 Washington race). For another, how hard is it to come up with a case where rounding would give a false result?
Let's try some imaginary poll numbers: Talent 49.4%, McCaskill 42.5%, with a margin of sampling error of +/- 3.4 percentage points. The long and short of what that means is that for a sample of this size (about 850 people), we're 95% sure that Talent's support in the whole population is somewhere between 46 and 52.8; McCaskill's is between 39.1 and 45.9. There are no non-chance cases in which Talent is trailing.
Round the basic numbers (Talent 49.4=49, McCaskill 42.5=43) and the picture is different: Our sample could be an accurate reflection of a McCaskill lead (46.4 to Talent's 45.6). And that doesn't include rounding the margin of sampling error, where the relevance of detail is much more pronounced. The difference between 4 and 3.8 percentage points** is a difference of 50 people (n=600 vs. n=650); to get to 3 points, you need about 1,100 in your survey.
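(If you want to watch the flip happen, here's the same example as a few lines of Python. The rounded values are hard-coded, rounding half up as above.)

```python
moe = 3.4
talent, mccaskill = 49.4, 42.5   # unrounded shares
t_round, m_round = 49, 43        # rounded half up, as above

# Do the confidence intervals overlap? Overlap means a McCaskill
# lead could still be a non-chance reading of the population.
print((talent - moe) <= (mccaskill + moe))  # False: 46.0 vs. 45.9
print((t_round - moe) <= (m_round + moe))   # True: 45.6 vs. 46.4
```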
Experienced researchers offer a rule of thumb: rather than trust improbably precise numbers, round them off. Even better, look for whole fractions.
Fine. Round off the "improbably precise" 45.382 to 45.4. But don't tell me it's equal to 44.5. And whatever a "whole fraction" is (1/6? 1/8?), don't look for one.
Rosenthal nails useful points throughout: Sampling error for subgroups is bigger than for the whole sample.*** Stupid phrasing leads to stupid results. Poll respondents, being human, like to make themselves look good (that's why smart newspapers never confuse self-reported church attendance with church attendance). But he undercuts himself by getting his concepts confused and his terms bollixed:
The Times and other media accompany poll reports with a box explaining how the random sample was selected and stating the sampling error. Error is actually a misnomer. What this figure actually describes is a range of approximation.
Yeah, I wish. Some are open about their methods; some aren't. But "error" isn't a misnomer; it just means variance that isn't accounted for by the stuff you're trying to account for. Call this margin a "confidence interval" if that makes you feel better.
For a typical election sample of 1,000, the error rate is plus or minus three percentage points for each candidate, meaning that a 50-50 race could actually differ by 53 to 47.
"Error rate," on the other hand, is a misnomer. "Rate" has nothing to do with it.
But the three-point figure applies only to the entire sample. How many of those are likely voters?
Dunno. Did you ask?
In the recent Connecticut primary, 40 percent of eligible Democrats voted. Even if a poll identified the likely voters perfectly, there still would be just 400 of them, and the error rate for that number would be plus or minus five points. So to win confidence, a finding would have to exceed 55 to 45.
Good start, bad finish. Remember, he hasn't said what a "typical election sample" would consist of, and that's one of the things that needs to appear in the infobox. Is it anyone "eligible" to vote? Registered voters? Likely voters? Likely Democratic voters? It could be any of those (though where he gets the mystic 400, I don't know; he seems to be multiplying stuff at random). And the margin of sampling error (at the 95% confidence level, which he ignores throughout, though you can't calculate sampling error without confidence level) for a sample of 400 is 4.9, not 5, points. I know, it's that damn decimal again, but look at it this way: for a subsample of 475, it's 4.5 points, meaning a 9-point split rather than 10.
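(Don't trust the decimal? The formula in the *** footnote below, scripted, gives the 4.9 and 4.5 directly:)

```python
import math

def moe(n, z=1.96, p=0.5):
    """Margin of sampling error, in percentage points, at the 95%
    confidence level, worst case p = 0.5 (see the *** footnote)."""
    return z * math.sqrt(p * (1 - p) / n) * 100

print(round(moe(400), 1))  # 4.9 -- not 5
print(round(moe(475), 1))  # 4.5 -> a 9-point split, not 10
```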
Go forth and bear his concerns in mind when monitoring copy during the onrushing campaign season. But it's worth asking, at the same time, why journalists are so perpetually convinced that stuff needs to be dumbed down to the point where it's unrecognizable before the alleged public can understand it. Half an hour with a hard-eyed editor who survived a basic stats course would have done this column a world of good.
If his copy didn't get that half-hour because the desk was overloaded, or because it didn't have a pinch-hitter who hits bogus stats into the cheap seats, that's understandable. If he didn't get a hard edit because of his title (he was a senior editor at the NYT for 25 years, after all), that's a different matter. If you edited the writer rather than the writing, you did the readers -- and the writer -- a disservice.
* This week's trivia quiz: In what favorite book among ex-Missourian hands is this the title of the appendix?
** Doug's going to say this should be "percent," and I'm going to disagree.
*** For busy copyeds: Multiply 1.96 by the square root of 0.25/n to get the margin of sampling error for n (at the 95% confidence level).
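(Run backward, the same formula produces the sample sizes cited above -- the 600-vs.-650 pair and the "about 1,100." A sketch, again in Python:)

```python
import math

def n_for_moe(m, z=1.96, p=0.5):
    """Sample size needed for a margin of sampling error of m
    percentage points (95% confidence, worst case p = 0.5)."""
    return math.ceil((z / (m / 100)) ** 2 * p * (1 - p))

print(n_for_moe(4.0))  # 601 -- about 600
print(n_for_moe(3.8))  # 666 -- about 650, give or take rounding
print(n_for_moe(3.0))  # 1068 -- the "about 1,100"
```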
3 Comments:
And always keep in mind Strayhorn's Big Law of The Desk: Polls aren't news and shouldn't be reported as such.
Ideally, polls should be printed on the same page as the horoscopes, the solunar tables, and phrenology results.
I'll still respect you in the morning. :)
I merely defer to the excellent explanation in Wickham's "Math for Journalists" and a rather emphatic polling-specialist professor in my undergrad years as a poli sci (and other things) major at IU.
The numbers work either way (whether you relate them back to the base or to a difference in percentages), and one can find as many adherents one way as the other.
Doug