### Shortest off-season yet

It seems as though the Silly World Series was just last week, and here we are in the peak of silly season again.

In other words, even though it's way too early in the season to do so, it's already time to complain about inept reporting of public opinion polls. Step forward, The State of Columbia!

Problematic conclusions for a bunch of reasons, only one of which the story bothers to mention, and it gets that one wrong. So let's reiterate the basic rules of reporting on polls:

1) The mainbar must include the minimum details needed for a marginally smart reader to judge the poll's reliability: Population sampled, sample size, dates poll was conducted, confidence interval (aka "margin of sampling error") and confidence level. If you kick the story back to cityside for these details and the assigning editor says "What's a confidence level?" -- well, what would you do with a sports editor who couldn't tell you what an earned-run average was?

2) Never draw conclusions that go beyond your data. A survey of registered voters eight months before primary season starts tells you what registered voters say eight months before primary season starts. It doesn't, and can't, address questions of who has cause to worry or who's in a tougher fight than whom.

3) Don't use language that goes beyond your data.

4) Double-check all the arithmetic. Then double-check it again.

OK, how about some details?

While you're at it, particularly when the story holds to the front and reefers to details inside: Put

I'm hunting for the damn chart again, and it says the poll ended May 27 -- eight months before the S.C. Democratic primary (assuming the paper got it right last Sunday). And the proportion of undecided voters is a hint of how risky it is to talk about who's leading anything at this point. (Yes, we're also still waiting to be told it's a poll of registered voters.)

One, the poll doesn't show anything about who has reason for concern. That isn't what it measures. Two, let's just go ahead and retire the phrase "within the poll's margin of error" (which, ahem, sends me hunting for the chart again). It's irrelevant here. Three -- gotta love that cq, huh?

"Dominates" is an opinion, not a fact, and a poorly based one at that. Clinton is at about the same level as "undecided" among registered Democrats eight months before the primary.

Everybody talks about this "margin of error," but nobody bothers to tell us what it is or why it might matter. Let's review for a moment, then. The margin of sampling error* describes the band in which nonchance cases can be expected to fall at a given confidence level. Using the general standard of 95% confidence, 19 samples out of 20 will be within plus or minus the margin (3.8 percentage points for the whole sample here). So there's one chance in 20 that your sample is outside this range and the real population figure -- which we're trying to estimate -- is something way different from what our poll predicts.
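For the desk calculator crowd, here is a minimal sketch of the arithmetic behind that review. It assumes a simple random sample and the textbook normal approximation, which is not necessarily how the pollster computed anything; the function names are made up for illustration. The story never gives us the sample size, so the second function works backward from the published 3.8 points to a rough guess at it.

```python
import math
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided critical value for a confidence level (about 1.96 at 0.95)."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def margin_of_sampling_error(n: float, confidence: float = 0.95, p: float = 0.5) -> float:
    """Margin in percentage points for a sample of size n.

    p = 0.5 is the worst case, which is what a 'maximum' margin reports.
    """
    z = z_for_confidence(confidence)
    return 100 * z * math.sqrt(p * (1 - p) / n)


def implied_sample_size(margin_points: float, confidence: float = 0.95, p: float = 0.5) -> float:
    """Sample size implied by a published margin (assumes simple random sampling)."""
    z = z_for_confidence(confidence)
    return (z * math.sqrt(p * (1 - p)) / (margin_points / 100)) ** 2


if __name__ == "__main__":
    n = implied_sample_size(3.8)
    print(f"implied sample size: about {n:.0f}")
    print(f"margin at that size: {margin_of_sampling_error(n):.1f} points")
```

Under those assumptions a 3.8-point margin implies a sample in the mid-600s. Treat that as a back-of-the-envelope estimate, not a figure the paper reported.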

Why insist on the confidence level? Because the margin is meaningless without it. You can lower the margin of sampling error to 1.9 points with a flick of your wrist. All you need to do is accept a one-in-three chance that you're wrong.
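The flick of the wrist can be sketched in a few lines. This assumes the usual normal approximation, under which the margin is just a critical value times a fixed standard error; `confidence_for_margin` is an illustrative name, not anything from the poll's methodology.

```python
from statistics import NormalDist


def confidence_for_margin(known_margin: float, known_confidence: float,
                          new_margin: float) -> float:
    """Confidence level you are left with if you shrink (or widen) the margin.

    The standard error stays the same, so the critical value scales
    in proportion to the margin.
    """
    std = NormalDist()
    z_known = std.inv_cdf(0.5 + known_confidence / 2)
    z_new = z_known * new_margin / known_margin
    return 2 * std.cdf(z_new) - 1


if __name__ == "__main__":
    conf = confidence_for_margin(3.8, 0.95, 1.9)
    print(f"confidence at 1.9 points: {conf:.0%}")
    print(f"chance of being outside the band: {1 - conf:.0%}")
```

Halving the margin from 3.8 to 1.9 points drops the confidence level to roughly two-thirds, which is where the one-in-three chance comes from.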

Here's how it's explained (err...) in the methodology blurb:

Uh, sort of. Those are differences of about 7%, even though it's 3.8 percentage points either way. And we know exactly how much of the sample (53.4%) believes the war is the most important problem; it's the figure for the

That's why "within the margin of error" is meaningless for describing a candidate's lead. Sampling error applies to each candidate's result, meaning you have to double the margin -- so candidate A's lower bound is above candidate B's upper bound -- to be sure

Now we're screwing up the arithmetic. These samples are different sizes: The poll had about 25% Democrats and about 40% Republicans. The maximum margin of sampling error at 95% confidence for the Democratic subsample is about 7.6 percentage points. It's slightly smaller for this question and for the presidential preference question (about 6.9 points, given the gap between Hillary and not-Hillary votes), but it gives you an idea of how quickly a shrinking sample size can change the band in which your nonchance results operate.
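The subsample arithmetic is easy to check. A rough sketch, assuming simple random sampling and that the 3.8-point figure is the maximum margin for the full sample: the margin grows with the inverse square root of the subgroup's share, and shrinks a little when the split is lopsided rather than 50-50. The function name is hypothetical, and the 0.29 in the last line is an illustrative proportion picked because it lands near 6.9 points, not a number taken from the poll.

```python
import math


def subgroup_margin(overall_margin: float, share: float, p: float = 0.5) -> float:
    """Margin (percentage points) for a subgroup that is `share` of the sample.

    overall_margin: maximum margin for the full sample, same confidence level.
    p: proportion giving a particular answer; 0.5 yields the maximum margin.
    """
    scale_for_size = 1 / math.sqrt(share)          # smaller group, wider band
    scale_for_split = 2 * math.sqrt(p * (1 - p))   # equals 1.0 when p = 0.5
    return overall_margin * scale_for_size * scale_for_split


if __name__ == "__main__":
    print(f"Democrats (25% of sample):   {subgroup_margin(3.8, 0.25):.1f}")
    print(f"Republicans (40% of sample): {subgroup_margin(3.8, 0.40):.1f}")
    # Hypothetical lopsided split, for illustration only.
    print(f"Democrats, p = 0.29:         {subgroup_margin(3.8, 0.25, p=0.29):.1f}")
```

The point of the exercise: one margin quoted for the whole sample cannot be reused for a party subgroup, and the two parties do not even share the same subgroup margin.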

This is an interesting poll, but that doesn't mean it's exciting, which is a large part of the problem. The writer's going for drama where there isn't any. One effect of that (well, that and saying 8.4 points is "almost 10") is an impression that the paper is taking sides -- that it's plumping for Clinton on the news pages. That's not a good impression to give.

Let's pretend it's spring training and practice doing things right. Hit your cutoffs now and you're more likely to hit them in the pennant race.

* Please use its full name on first reference. Surveys are subject to a number of different types of error, and sampling is only one of them.

Why the range? The size of the group involved in answering each question.

Then, of course, there's a basic question of whether any poll result can be reported accurately to two decimal places.

(I'll forgo mentioning that the paper reported the margins as "percent," not "percentage points." {grin})

