More fibbing with "polls"
Today's quiz is in the form of a word problem:
You're editing an article for online publication:
Net sentiment favors Smith
"Say what you will about Bruton Smith. You won't be the first.
"Charlotte's billionaire speedway owner has both shared and inspired strong opinions and harsh words over the years.
"In his latest dust-up with government officials -- a dispute over a proposed drag-racing strip at Lowe's Motor Speedway -- Smith appears to have plenty of public opinion on his side.
"In an unscientific poll being conducted Thursday on Charlotte.com, about two out of every three voters said Concord officials should bow to pressure from Smith and let him build the $60 million strip at Lowe's Motor Speedway. ... More than 1,200 people had voted in the poll as of 9 p.m."
The AP Stylebook says poll stories should always report margins of sampling error. What is this poll's margin of sampling error?
a) Plus or minus 2.8 percentage points (Hi, Doug!)
b) Dunno. Since you control the confidence level, how big would you like it to be?
c) Huh?
d) Online polls of this sort do not have margins of sampling error because they are not probability samples. It's like asking for the margin of sampling error in the horoscope or the R-squared change attributable to the Billy Graham column. Statistical techniques apply to statistical things only, not to stuff on the funny pages.
If you answered (d), take the rest of the day off. This is a fake poll. It's a different kind of deceit from the Fox poll discussed below, but it's still fundamentally dishonest. Fox is using dishonest techniques to stack the deck ideologically; this story amounts to fibbing for economic advantage. It presents as exclusive information something that isn't really information at all.
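For the record, option (a)'s suspiciously precise figure is what you get by mechanically plugging n = 1,200 into the textbook formula for sampling error on a proportion (worst case p = 0.5, 95% confidence). A quick sketch, purely illustrative -- the arithmetic works, but the number it produces is meaningless for a self-selected click poll:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a 95% confidence interval for a proportion --
    valid ONLY for a probability sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# Plugging in the click-poll's n = 1,200 yields about 0.028, i.e.
# "plus or minus 2.8 percentage points" -- but the 1,200 voters
# selected themselves, so the formula's assumptions don't hold.
print(round(margin_of_error(1200), 3))
```

The formula only describes the error introduced by random selection; when there is no random selection, there is nothing for it to describe.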
It's also the sort of offense against truth that's unlikely to be corrected, because of the peculiar way newspapers approach corrections. Usually, a correction requires a provable, binary, yes-or-no error: The story says X, for which some not-X is true (we said "Smith convicted," where either "Jones, not Smith, convicted" or "Smith acquitted" would make up the not-X). And in this case, we can't demonstrate a not-X. It'd be like asking the paper to correct the horoscope or the Billy column.
In other words, it may indeed be true that, as the hed asserts, "Net sentiment favors Smith." But that sentiment cannot be determined by asking readers to click and vote on the issue. You can't make assertions about "Net sentiment" because you have no way of knowing whether the "votes" represent a legitimate picture of it. Doesn't matter that it's 1,200 people (a third again as many as in the Fox poll). Wouldn't matter if it had been 12,000. It is not possible to draw a reliable or valid inference from a nonprobability sample. Period.
Again, the Fox poll is in a different category of deceptiveness. The hed question, yielding the claim that more Democrats think the world would be better off if the U.S. "loses" the Iraq war (the tease hed talks about whether the U.S. would be better off, but that sort of error is more likely due to Fox's traditional incompetence at editing than to political bias), is badly worded because it's impossible to tell what a "yes" or "no" answer means. And the methodology is careless: if you want to sample registered voters but you sample only among those who are at home on a Tuesday or Wednesday evening in late September, you're introducing variance you can't explain -- meaning it goes into the general category of "error variance."
That said, though, the poll does produce some valid, replicable data. If you want to know how many Americans (OK, how many registered voters who were home on Tuesday, Sept. 25) claim to have prayed for the president within the last two weeks, we have a real number for which we can calculate things like "margin of sampling error at 95% confidence." Why you'd want to know that is up to you, but it's a real number.
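That calculation also answers option (b)'s joke: the confidence level is a choice you make, and changing it changes the "margin of error." A sketch, assuming a Fox sample of roughly 900 (the story says the click poll's 1,200 is a third again as many; the exact figure is an assumption here):

```python
from statistics import NormalDist

def margin_of_error(n, p=0.5, confidence=0.95):
    """Sampling error for a proportion from a probability sample of
    size n. The confidence level is a parameter, not a fact of nature:
    pick a different level and you get a different margin."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return z * (p * (1 - p) / n) ** 0.5

# Same poll, three different "margins of error," depending on the
# confidence level you choose to report.
for conf in (0.90, 0.95, 0.99):
    print(conf, round(margin_of_error(900, confidence=conf), 3))
```

That's why takeaway point (2) below insists on reporting the confidence level alongside the interval: "plus or minus 3 points" means nothing until you know at what confidence it was calculated.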
Takeaway points for polling season:
1) Properly conducted, surveys can produce valid and reliable snapshots of public opinion.
2) Every report on a survey must include the confidence interval ("margin of error") and the confidence level at which it is calculated, the sample size, basic characteristics of the population being sampled, and the dates the poll was conducted.
3) Nonprobability samples must never be reported as indicators of public opinion. (If that sounds like "never write a story about a fake survey," that's what it means.)
4) Polls answer only the questions they ask and say only what they say.
5) Do not compare polls unless comparing identical questions asked of identical populations.
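Points (2) and (5) have a mechanical side worth seeing. If -- and only if -- two polls are probability samples asking the identical question of the identical population, a two-proportion z-test tells you whether an apparent shift exceeds what sampling error alone would produce. A sketch with invented numbers (nothing here comes from the polls in the story):

```python
from statistics import NormalDist

def polls_differ(p1, n1, p2, n2, confidence=0.95):
    """Two-proportion z-test: did two comparable probability-sample
    polls differ by more than sampling error alone would explain?"""
    p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)   # pooled proportion
    se = (p_pool * (1 - p_pool) * (1 / n1 + 1 / n2)) ** 0.5
    z = abs(p1 - p2) / se
    z_crit = NormalDist().inv_cdf((1 + confidence) / 2)
    return z > z_crit

# A 4-point "shift" between two polls of 900 each -- the kind of
# movement sampling error alone can easily produce.
print(polls_differ(0.52, 900, 0.48, 900))
```

And none of this applies to a click poll, of course; you can't test what was never a sample.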
Is it worse to lie by card-stacking or by snake-oil-peddling? That isn't the question. The question is: Why lie at all?
Labels: polls, self-selection
3 Comments:
"The question is: Why lie at all?"
Well, silly, reality has a well-known liberal bias, so they have to lie if they're going to get their message out.
Or at least, that's how they make it look.
Damn you, empirically based world!
Oh yeah, this one is definitely 2.8 percent {grin}