Saturday, October 14, 2006

Science Saturday: Quiz folo

Tnx to all who participated in and helped publicize last week's science journalism quiz. (Are you listening out there, search committees?) A few comments on the questions, then on to the real point, which is along the lines of What Is To Be Done?

First, yeah, there are a few trick questions and a few design issues. Quizzes always get better the second time you give 'em (except J4400 current-events quizzes, which are infallible). Some things can't be answered without looking at the study itself. Which is kind of the point. That ought to be the writer's job, but if writers aren't going to do it, step forward the copy editors.

What do you do when the article costs $12.95 a pop? Read the abstract and vow to make friends with somebody at a nearby research library who can be persuaded to drop you a PDF of the full text next time you need one. Hey, Marlowe got information out of librarians (OK, booksellers, but the effect is much the same).

First question. Does the study measure "self-esteem in men"? We were looking for "no," on grounds that "body esteem" is a newer and narrower concept, but may give partial credit on this one. The point is that the headline is wrong about everything it touches, and the AP text at least hints at most of the points that should have been gotten right.

The main sin is overgeneralizing. The study looked at a very small portion of "media," with different effects from the TV and magazine constructs, and a hugely unrepresentative convenience sample of "men" (guys from psyc survey classes who participated for extra credit, with self-identified gays dropped from the results). The authors specifically note the narrow slice of media -- what, no Internet? -- as a shortcoming, but you didn't really need the "discussion" section to figure that out.

A cross-sectional survey can't establish cause-and-effect relationships, so even if "self-esteem" were the target, this study couldn't show a monotonic "erosion" (even if it had found one). And if you didn't crunch the numbers yourself, at least put some attribution in the hed.

In all, as the Estimable GK used to put it, we seem to have actually subtracted from the sum of human knowledge. Urk.

The second question goes to the sin of featurizing the data. As it turns out, the Abercrombie ad is the sort of image that isn't associated with bad feelings (Mark appears to have found his copy of the AP tale elsewhere; this paper chopped out the references to Michelangelo's David, who at least -- being a statue and hairless -- hints at the "real body" measure). Even if this study had surveyed the "average guy" mentioned in graf 2, the personalized touch in the first few grafs takes us way outside the limits of the data.

Back to the big type. Everybody who said "none" for Question 3, take a bow. Here, the deck hed simply repeats the writer's error in deciding that "sexual assertiveness," discussed in the study, must mean "sexual aggressiveness." That's an inexcusable blunder, but you can't blame the desk for assuming that the reporter got things right. More on this below, but meanwhile: (a) assertiveness seems rather a good thing in this study, and (b) "own body comfort" predicts more assertiveness but no increase in risk-taking. Kind of a sloppily worded question, but there's still nothing like having the regression results themselves at hand.

This seems like a good time to point out that "significance" and "substance" aren't the same thing. So Mark gets extra credit (sorry about the deadline; it's a desk thing) for noting that the significant predictors don't explain much of the variance at all.
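For the numerically inclined, here's a minimal sketch of how a result can clear the "significant" bar while explaining almost nothing. The numbers (a correlation of 0.10 across 1,000 respondents) are invented for illustration and are not taken from the study; the function name is made up too, and the p-value uses a normal approximation to the t distribution, which is only reasonable for large samples.

```python
import math
from statistics import NormalDist


def significance_vs_substance(r, n):
    """Return (t, p, r_squared) for a Pearson correlation r based on n cases.

    Assumption: n is large enough that a normal approximation to the
    t distribution gives a usable two-sided p-value.
    """
    # Standard t statistic for testing whether a correlation differs from zero
    t = r * math.sqrt((n - 2) / (1 - r * r))
    # Two-sided p-value under the normal approximation
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    # Share of variance in one variable accounted for by the other
    r_squared = r * r
    return t, p, r_squared


# Hypothetical inputs, not figures from the study
t, p, r2 = significance_vs_substance(r=0.10, n=1000)
print(f"t = {t:.2f}, p = {p:.4f}, variance explained = {r2:.1%}")
```

With these made-up inputs the p-value lands well under the usual .05 cutoff, yet the predictor accounts for about 1 percent of the variance. That's the gap between "significant" and "matters much."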

The true-false stuff is mostly intended to underscore a couple of common flaws. One, correlation isn't cause. Surveys don't establish that sort of relationship. Two, there's a huge difference between "study says" and "researcher says." Schooler may have a point about what really concerns men. She's certainly raised a provocative question and found a way to start investigating it. But the True Concerns of Men (even of undergraduates in survey classes) are not where this study goes. (And, as she points out in the discussion, the whole construct is pretty new and not yet widely validated.)

OK, errand time. Back soon with actual prescriptions for the future.


