(tl;dr: It's letter of recommendation season, and so I decided to write one to a paper that's really been influential in my recent thinking. Psychometrics, y'all.)
To whom it may concern:
I am writing to provide my strongest recommendation for the paper, "Attack of the Psychometricians" by Denny Borsboom (2006). Reading this paper oriented me to a rich tradition of psychometric modeling – but more than that, it changed my perspective on the relationship between psychological measurement and theory. (It also taught me to use the term "sumscore"* as an insult). I urge you to consider it for a position in your reading list, syllabus, or lab meeting.
I first met AotP (or Attack!, as I like to call it) via a link on twitter. Not the most auspicious beginning, but from a quick skim on my phone, I could tell that this was a paper that needed further study.
The paper presents and discusses what it calls the central insight of psychometrics: that "measurement does not consist of finding the right observed score to substitute for a theoretical attribute, but of devising a model structure to relate an observable to a theoretical attribute." In other words, the goal is to make models that link data to theoretical quantities of interest. What this means is that measurement is essentially continuous with theory construction. By creating and testing a good measurement model, you're creating and testing a key component of a good theory.
Attack! has made me think about the origins of this situation. Here's my attempt at an origin story. In the olden times, all the psychologists went to the same conferences and worried about the same things. But then a split formed between different groups. Educational psychologists and psychometricians knew that different problems on tests had different measurement properties, and began exploring how to separate good items from bad, and how to figure out people's ability abstracted away from specific items. Cognitive psychologists, on the other hand, spurned this item-level variation and embraced the dogma of exchangeable experimental items. People did Lots Of Trials, all generated from the same basic template. The sumscore reigned supreme, and yielded important insight into Memory, Attention, and Reasoning (irrespective of what was being remembered, attended to, or reasoned about).
Psychophysicists diverged from the cognitivist hierarchy. They always knew that they needed to infer a latent relationship (the psychometric curve). As they got better at doing this, they fit models that included parameters of the decision process – for example, a "lapse" parameter to capture inattention – as well as the quantities of interest. And because they typically fit these curves within individual subjects, these parameters were participant-level estimates. But the models that fit these curves were often specific to particular metric relationships and not appropriate for increasingly complicated domains.
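To make the lapse idea concrete, here's a toy sketch of one common parameterization of a psychometric curve (my own illustration, not anything from the paper): a cumulative Gaussian for the sensory process, with a symmetric lapse rate that keeps performance away from 0% and 100% even at extreme stimulus levels.

```python
import numpy as np
from scipy.stats import norm

def psychometric(x, mu, sigma, lapse):
    """Probability of a 'yes' response at stimulus level x.

    A cumulative Gaussian (threshold mu, slope sigma) compressed by a
    lapse rate capturing stimulus-independent inattention: responses
    asymptote at lapse/2 and 1 - lapse/2 rather than 0 and 1.
    """
    return lapse / 2 + (1 - lapse) * norm.cdf(x, loc=mu, scale=sigma)

# With a 4% lapse rate, even an easy stimulus tops out near 98%:
levels = np.array([-3.0, 0.0, 3.0])
p = psychometric(levels, mu=0.0, sigma=1.0, lapse=0.04)
```

Because the lapse parameter soaks up occasional off-task errors, the threshold and slope estimates (the quantities of interest) aren't distorted by a few inattentive trials.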
Now in modern cognitive science, we get work on sophisticated constructs – for example, in moral psychology or psycholinguistics – where experimenters break with the cognitivist dogma and use non-exchangeable items. Sometimes items are sentences or even whole vignettes. Yet for the most part these researchers have forgotten to model item variation (except occasionally using a random intercept for items in their linear mixed effects models). Clark (1973) scolded them about the problematic statistical inferences that could result from forgetting to model items, and this guidance has reappeared in recent exhortations to Keep It Maximal! But as far as I can tell, no one really talks about modeling items in more detail *in order to learn more about what is in people's heads*.
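To see what's at stake, here is a minimal simulation (a hypothetical illustration of the general point, not an analysis from the paper): when items genuinely differ in difficulty, item-level accuracies spread out far more than binomial sampling noise alone would predict – that extra spread is exactly the structure that goes unmodeled when items are treated as exchangeable.

```python
import numpy as np

rng = np.random.default_rng(0)

n_subj, n_items = 50, 20
ability = rng.normal(0, 1, size=(n_subj, 1))      # person effects
difficulty = rng.normal(0, 1, size=(1, n_items))  # items are NOT exchangeable
prob = 1 / (1 + np.exp(-(ability - difficulty)))  # logistic response model
responses = rng.random((n_subj, n_items)) < prob

# Per-item accuracy across subjects:
item_acc = responses.mean(axis=0)

# If items really were exchangeable, item accuracies would differ only
# by binomial noise, sd ~= sqrt(0.25 / n_subj) ~= 0.07 here. Simulated
# item effects typically produce much more spread than that.
observed_sd = item_acc.std()
```

The gap between `observed_sd` and the binomial floor is recoverable signal about the items themselves – which is the paper's point about measurement models: that variation is something to model and learn from, not just a nuisance to average over.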
Attack! has infested my brain. Now when I see someone use differentiated items in their task yet report the sumscore as their measure of the latent trait of interest, I think, "you're just leaving information on the table." I suddenly want to fit psychometric models to everything. Because, in the end, what do you want as a psychologist? A better understanding of the latent space that we're trying to theorize about. I used to think that this was called Theory and it was distinct from Data Analysis. Thanks to Attack! I now know that measurement and theory are (or at least should be) contiguous with one another.**
On a personal note, Attack! is a great read and will play well with your interest in sociological biases that shape the structure of scientific inquiry. You shouldn't pass this paper up. Do not hesitate to contact me with questions or concerns.
Michael C. Frank
* For those of you not in the know, the sumscore is just what we normal psychologists call "percent correct" – treating the sum of your correct answers on the test as your score, as opposed to inferring the latent trait (ability) from the performance on the observed variables.
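To make the contrast concrete, here's a hypothetical sketch (mine, not the paper's) of latent-trait inference under a two-parameter logistic (2PL) item response model, with item parameters treated as known. Two test-takers with the *same* sumscore get *different* ability estimates, because which items you got right matters once items differ in difficulty and discrimination.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ability_2pl(responses, difficulties, discriminations):
    """Maximum-likelihood ability estimate under a 2PL item response
    model, given known item parameters."""
    r = np.asarray(responses, dtype=float)
    b = np.asarray(difficulties, dtype=float)
    a = np.asarray(discriminations, dtype=float)

    def neg_log_lik(theta):
        p = 1 / (1 + np.exp(-a * (theta - b)))
        return -np.sum(r * np.log(p) + (1 - r) * np.log(1 - p))

    return minimize_scalar(neg_log_lik, bounds=(-6, 6), method="bounded").x

# Four items: two easy and highly discriminating, two hard and noisy.
b = np.array([-1.0, -1.0, 1.0, 1.0])   # difficulties
a = np.array([2.0, 2.0, 0.5, 0.5])     # discriminations

# Both test-takers score 2/4 – identical sumscores, different patterns:
theta_easy = ability_2pl([1, 1, 0, 0], b, a)  # passed the easy items
theta_hard = ability_2pl([0, 0, 1, 1], b, a)  # passed only the hard ones
# Failing the highly diagnostic easy items is strong evidence of low
# ability, so theta_hard lands well below theta_easy.
```

(In a pure Rasch model, where all discriminations are equal, the sumscore actually *is* a sufficient statistic for ability – part of why it survived so long. The moment items differ in discrimination, the sumscore throws information away.)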
** This contiguity idea is interestingly related to the Bayesian Data Analysis turn in the Bayesian cognitive modeling world, where we now think about linking functions that relate models to data directly. In fact, I think these are really the same idea when you get down to it. Here's a great paper that describes this viewpoint: Tauber, Navarro, Perfors, & Steyvers (2017).