At the editors' request, this paper has now been retitled “Puzzlingly high correlations in fMRI studies of emotion, personality, and social cognition”.
Briefly: we explore mysteriously high correlations between individual measures of personality or emotionality and evoked BOLD activity, showing that these correlations are a priori much too high given the reliability of the measures involved. We go on to show that these correlations are systematically produced by a flawed method: the same criterion is used to select the data as is then used to compute the reported correlation.
You can download the final paper here.
You can also download my chapter with Nancy Kanwisher on the problem more broadly here.
Jabbi et al. rebuttal: We were pleased to see a response from the authors of the papers we criticize, and we dedicate a page specifically to this rebuttal. Please go here.
Commentaries: Perspectives on Psychological Science has now made available the final set of commentaries that will be printed alongside our paper. You may find those here.
You can find our formatted reply to all the commentaries here: https://www.edvul.com/pdf/VulHarrisWinkielmanPashler-PPS-2009-reply.pdf
You can find an excellent related paper from Tal Yarkoni and Todd Braver here.
Here is a charming 1950 article by Edward Cureton which completely anticipates the present debate — but using the term “baloney” rather than “voodoo” (thanks to Dirk Vorberg for pointing us to this).
Supplementary Q and A
Since some interesting questions have been raised about our paper (in these blogs and otherwise), we’ll do our best to address them here.
Q: Interpretations of your paper are varied (some suggest that this critique is damning to all social science, social neuroscience, or these articles in particular). If I believe your critique, what conclusion should I walk away with about these fields and studies?
A: We focus on correlations between fMRI measures of the brain and individual differences in personality and emotion. The field of social neuroscience extends far beyond these studies. Of the studies we sampled, just under half of the people reported using what we consider to be appropriate analyses. So we are certainly not suggesting that all (or even close to all) of the papers we surveyed are wrong. Moreover, some of the studies that used non-independent analyses to obtain correlation measures also reported findings that did not involve the localization of individual difference measures in the brain, and we are saying nothing about those other findings.
Finally, with respect to the set of studies that used the non-independent correlation analyses we criticize, we argue that the actual reported correlation values are biased, inflated, and thus, it might be reasonable to say, pretty meaningless. However, this does not mean that the true correlation is therefore zero. Some of the studies do provide evidence suggesting that there is probably some nonzero correlation there. We don’t think that a correlation of 0.1 is nearly as important as a correlation of 0.8, but it could still have scientific value.
Our main point, however, is more positive: there are several transparent ways in which accurate estimates of the correlation may be obtained in these studies, and in future studies approaching the same problems. We argue that this is what should be done, even on the data that have already been published.
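To illustrate how non-independent selection inflates a correlation, and how an independent estimate avoids this, here is a minimal simulation sketch. It is not the analysis from the paper or from any of the criticized studies. The function names, the sample size of 16, the 2000 simulated voxels, the threshold of 0.6, and the two-run design are all assumptions chosen for illustration. Every simulated voxel is pure noise, so the true correlation with the behavioral score is zero.

```python
import random
from math import sqrt
from statistics import fmean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate(n_subjects=16, n_voxels=2000, threshold=0.6, seed=0):
    """Compare a non-independent and an independent correlation estimate.

    All values are hypothetical. Behavior and voxel signals are independent
    Gaussian noise, so the true brain-behavior correlation is exactly zero.
    """
    rng = random.Random(seed)
    behavior = [rng.gauss(0, 1) for _ in range(n_subjects)]

    selected_r = []  # correlation on the same data used for selection
    held_out_r = []  # correlation on a second run, not used for selection
    for _ in range(n_voxels):
        run1 = [rng.gauss(0, 1) for _ in range(n_subjects)]
        run2 = [rng.gauss(0, 1) for _ in range(n_subjects)]
        r1 = pearson(behavior, run1)
        if r1 > threshold:  # select voxels by their run-1 correlation
            selected_r.append(r1)
            held_out_r.append(pearson(behavior, run2))

    if not selected_r:
        raise ValueError("no voxel passed the threshold")
    return len(selected_r), fmean(selected_r), fmean(held_out_r)


if __name__ == "__main__":
    n, biased, independent = simulate()
    print(f"voxels selected:          {n}")
    print(f"non-independent mean r:   {biased:.2f}")
    print(f"independent (run 2) mean: {independent:.2f}")
```

By construction, the non-independent mean exceeds the selection threshold even though nothing real is being measured, while the held-out estimate scatters around zero. If a true nonzero correlation were present, the held-out estimate would recover it without the selection bias.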
Q: Since reliabilities apply to scores on measures rather than the measures themselves, how can you use reliabilities from other samples to make inferences about the scores used in the particular studies you describe?
A: Like nearly all social scientists, we assume that the reliability of a measure estimated from scores obtained on one sample will generalize to other samples. It is true that these measures of reliability will vary from sample to sample, but this is true of any measure ever obtained from a sample population. We hope (and we assume the authors of the articles do too) that the participants sampled into the reported studies were representative (with respect to the inferences in question). If they are, then we have no reason to suspect that the reliability of scores on any of the measures in these samples will differ substantially from those of other samples that have been previously used to evaluate these measures.
Q: Does reliability put an absolute constraint on the correlation that may be obtained?
A: No, reliability puts a constraint on the expected value of the correlation. Noise may make the correlation higher sometimes, and lower at other times. We argue that these articles have selected favorable noise that increases the apparent correlation, thus causing these estimates to systematically exceed the maximum possible expected value of a correlation between these measures.
We should reiterate that we think the theoretical upper bound on the expected correlation is much higher than what should reasonably be expected: the upper bound assumes a perfect underlying correlation.
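The constraint described above can be made concrete with the standard attenuation formula from classical test theory: the expected observed correlation equals the true underlying correlation multiplied by the square root of the product of the two reliabilities. The sketch below is only an illustration; the function names are ours, and the reliabilities of 0.8 and 0.7 are example values that do not come from this page.

```python
from math import sqrt


def expected_observed_r(true_r, rel_x, rel_y):
    """Expected observed correlation under classical test theory.

    true_r: correlation between the underlying (error-free) quantities.
    rel_x, rel_y: reliabilities of the two measures, each in [0, 1].
    """
    if not (-1.0 <= true_r <= 1.0):
        raise ValueError("true_r must lie in [-1, 1]")
    if not (0.0 <= rel_x <= 1.0 and 0.0 <= rel_y <= 1.0):
        raise ValueError("reliabilities must lie in [0, 1]")
    return true_r * sqrt(rel_x * rel_y)


def upper_bound(rel_x, rel_y):
    """Largest expected correlation: assumes a perfect underlying correlation."""
    return expected_observed_r(1.0, rel_x, rel_y)


if __name__ == "__main__":
    # Hypothetical reliabilities, chosen only for illustration.
    rel_behavior, rel_fmri = 0.8, 0.7
    print(f"upper bound on expected r: {upper_bound(rel_behavior, rel_fmri):.2f}")
    # A more modest true correlation gives a correspondingly lower expectation.
    print(f"expected r if true r=0.5:  "
          f"{expected_observed_r(0.5, rel_behavior, rel_fmri):.2f}")
```

This is a bound on the expected value only. Any single sample can land above or below it because of noise, which is why systematically exceeding it points to selection of favorable noise.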
We’ll be happy to answer questions or comments about our paper here. If you have such questions or comments, you can email them to me.
We are excited that our critique has drawn some attention. We have now lost track of all the opinions out there, so if you have one (positive or negative alike) that you would like me to post a link to, please contact me. Below is a collection of the opinions we know about. You can find a much more thorough collection of links to discussions of our paper on the Amazing World of Psychiatry blog.