“Fatal Flaws” in the Colorado Solitary Confinement Study

Guest Post by Stuart Grassian, M.D.

Editors’ Note: The Colorado Department of Corrections recently released the controversial results of a year-long, federally funded study conducted at the Colorado State Penitentiary, a supermax prison in Cañon City where more than 700 men are held in solitary confinement. Entitled “One Year Longitudinal Study of the Psychological Effects of Administrative Segregation,” the study found that long-term solitary confinement had no detrimental effect on the mental health of inmates–including inmates with pre-existing mental illness. In fact, some prisoners were found to “improve” in 23-hour-a-day lockdown under conditions of extreme isolation.

Solitary Watch asked Dr. Stuart Grassian, one of the world’s leading experts on the psychiatric effects of solitary and other extreme forms of confinement, for his reactions to the study. Grassian, a Board-certified psychiatrist and former faculty member of the Harvard Medical School, has lectured extensively on this subject. He served as an expert in individual and class-action lawsuits addressing solitary confinement, and his conclusions have been cited in a number of federal court decisions. He has provided invited testimony before legislative hearings in New York State, Maine and Massachusetts and the Commission on Safety and Abuse in America’s Prisons. Grassian has also been retained and consulted by public advocacy groups, including the Innocence Project, the National Prison Project of the ACLU, Massachusetts and Maine Civil Liberties Unions, the Capital Defense Fund of the NAACP, and the Center for Constitutional Rights, among others. Much of his work on the subject is described in “Psychiatric Effects of Solitary Confinement” (Washington University Journal of Law and Policy, 22: 2006).

Dr. Grassian reports that he was invited by the authors of the Colorado study to participate in the presentation of their research at the 2010 annual meeting of the American Psychological Association (APA). “In reviewing their research,” he writes, “I found there were several fatal flaws in their methodology, and so stated during the presentation, including their choice not to incorporate into their analysis data that squarely contradicted their conclusions. This research has now, without any further analysis or correction, been submitted for publication to the National Institute of Justice.”

We are publishing in full a version of the critique that Grassian provided to the authors of the study, which he has adapted to be more accessible to general readers. He writes that the critique “is based upon the report itself, discussions held publicly at the presentation at the APA Meeting, as well as the written transcript of the deposition of the lead author, Maureen O’Keefe, in Dunlap v. Zavaras”–a federal suit by a death row inmate at Colorado State Penitentiary, alleging that his conditions of confinement constitute cruel and unusual punishment.


1.  Research Subjects,  Control Group.

Basically, the research subjects are Colorado inmates who were subject to disciplinary hearings that might result in their referral to Solitary Confinement (Ad Seg) in Colorado State Prison. They are categorized as either having a mental illness diagnosis (MI) or no mental illness diagnosis (NMI). Those referred to Ad Seg thus have a close comparison group (similar to what is termed a “control group” in research); that is, those who were returned to General Population (GP), with some sanction short of Ad Seg. Thus, the MI Ad Seg have a “control group” – the MI GP – and similarly, the NMI Ad Seg’s control group is NMI GP. The authors pride themselves on having obtained in this manner a controlled study. (Controlled studies are able to isolate one variable – in this case, housing in Ad Seg – while leaving other variables constant in the groups studied.)

Naturally, the greatest focus will be on those having a diagnosis of some mental illness: the most vulnerable individuals, presumably those most likely to decompensate as a result of Ad Seg confinement.

2.  Data Collected – the Problem of Validation.

The researchers must establish some means of determining the mental health status of the inmates being studied. They choose to use various self-report rating scales, in which the inmates check off symptoms and generally describe their severity, usually on a five-point scale.

The question, of course, is whether these self-report scales have any meaningful relationship to the inmates’ actual psychiatric difficulties, that is, whether they are validated as a means of inquiring into psychiatric status. Well, they are validated, but not for people in the position of inmates. They have been validated for college student volunteers and for outpatients in psychotherapy (that is, for these groups, their self-reports actually do correlate with other, objective measures of psychiatric symptomatology). Especially in regard to outpatients, this is not surprising; it is intuitively reasonable that people seeking help are likely to try to be accurate in their self-report.

But inmates are in no way similarly placed.  Revealing weakness is dangerous, potentially subjecting the inmate to harassment, possibly even to physical danger.  Moreover, in the present study, the first author revealed at a deposition that the subjects were told that the research was intended to study how inmates were adjusting to prison life.   Well, quite clearly, how unwise it would be for an inmate to declare he was adjusting poorly;  that is not the kind of information he would like to present, for example, at a parole hearing.

There are other problems as well.  For example, the graduate student, Alyusha,  who actually met with the inmates was apparently an attractive young woman, talking with inmates who had virtually no contact with such young attractive women.  Even the research group itself noted the likely distorting effect of this fact, referring to it as the “Alyusha Effect.”  The inmates were likely to be reluctant to reveal weakness to this attractive young woman.

Thus, it cannot be assumed that inmate self-reports are a valid means of assessing psychiatric status.  It would not at all be surprising if these self-reports in fact bore little or no relationship at all to psychiatric status.

3.  The Attempts Made to Validate the Self-Reports.

The authors made token attempts to validate the inmate self-reports against reports (brief check-the-box forms) filled out by corrections officers and by clinicians. However, by their own admission at a public forum and at deposition, the authors acknowledge these reports are not of value. They have no idea which corrections officers filled out the forms, or how; no specific instructions were provided, and over half the forms were never filled out at all. Similarly, for the forms filled out by the clinicians, the authors gave no guidelines or requirements as to how the forms should be filled out, and had no information whatsoever to suggest that the clinicians did more than they would normally do in a screening interview – that is, attempting to speak to the inmate through the cell door, either by talking through the crack at the edge of the door or else by opening the food slot and bending down in an uncomfortable position to speak through the slot. In any event, as the authors acknowledge, both officers and clinicians are already burdened by their routine paperwork, and it would not be surprising to find that they put minimal or no effort into checking off these forms. And indeed, while the inmate self-reports revealed no psychiatric symptomatology associated with Ad Seg housing, the clinician forms found even less symptomatology than the inmates’ self-reports did.

The authors acknowledge that little use can be made of the officer and clinician reports. The problem, simply, is that for these individuals, their mission (be it security or clinical treatment) is elsewhere;  it is not in filling out these forms. 

4.  The Authors Chose to Ignore Data That Squarely Contradicts Their Conclusions and Moreover Would Assess Validity of the Self-Report Data.

The most important comparison groups are the two groups of inmates with a mental illness diagnosis referred for disciplinary hearing – those then housed in Ad Seg versus those then housed in GP. Now since they all have psychiatric diagnoses, there will be records of mental health contact – symptoms noted in clinicians’ notes, medications prescribed, and so forth. None of this data was reviewed at all. For example, did those in Ad Seg end up requiring more medication than those in GP? There is absolutely no information on this; no attempt was made to discover this data.

But, there was one piece of data recorded in the DOC files.  DOC files record incidents of emergency psychiatric contact (e.g. suicidal or self-destructive behavior) and emergence of psychotic symptoms.  Among the MI in Ad Seg (N=59) there were 37 such episodes (an average of .62 episodes per inmate – almost 2 for every 3 inmates).  Among the MI in GP (N=33), on the other hand,  there were only 3 (.09 per inmate – less than 1 for every 10 inmates).  Could this have been random – i.e. not a reflection of some significant difference in the result?   Statistically, the chance of that is entirely minute, approximately p=.0002;  i.e. a chance of 1 in 5,000, a mighty small number.  (In research, statistical significance requires only a probability of randomness of .05,  i.e. as much as 1 in 20!)     Thus, this objective data squarely contradicts the authors’ conclusion that Ad Seg does not produce significantly more psychiatric difficulties than does GP housing.  The authors simply declined to perform this straightforward statistical analysis, even after the oversight was explicitly pointed out.
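The per-inmate arithmetic quoted above can be checked with a short Python sketch. It uses only the counts and the p-value stated in the text; the variable and function names are illustrative assumptions, and the sketch does not attempt to reproduce the significance test itself, since the text does not say which test produced p=.0002. Note that 37/59 is roughly .627, so the ".62" above is that figure truncated rather than rounded.

```python
# Counts quoted in the paragraph above (MI = inmates with a mental illness diagnosis).
AD_SEG_INMATES = 59   # MI inmates housed in Ad Seg
AD_SEG_EPISODES = 37  # recorded crisis episodes among them
GP_INMATES = 33       # MI inmates housed in General Population
GP_EPISODES = 3       # recorded crisis episodes among them

# p-value as quoted in the text, and the conventional significance cutoff it mentions.
QUOTED_P = 0.0002
CONVENTIONAL_THRESHOLD = 0.05


def episodes_per_inmate(episodes: int, inmates: int) -> float:
    """Average number of recorded episodes per inmate in a group."""
    return episodes / inmates


ad_seg_rate = episodes_per_inmate(AD_SEG_EPISODES, AD_SEG_INMATES)
gp_rate = episodes_per_inmate(GP_EPISODES, GP_INMATES)

# How many times higher the Ad Seg rate is than the GP rate.
rate_ratio = ad_seg_rate / gp_rate

# Express the quoted p-value as "1 chance in N".
one_in = round(1 / QUOTED_P)

print(f"Ad Seg: {ad_seg_rate:.3f} episodes per inmate")
print(f"GP:     {gp_rate:.3f} episodes per inmate")
print(f"Ad Seg rate is about {rate_ratio:.1f} times the GP rate")
print(f"Quoted p = {QUOTED_P} is 1 chance in {one_in}")
print(f"Below the {CONVENTIONAL_THRESHOLD} threshold: {QUOTED_P < CONVENTIONAL_THRESHOLD}")
```

On these counts the Ad Seg episode rate comes out close to seven times the GP rate, which is the gap the argument above turns on.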

This data is critical in another way as well,  as a proper means of assessing validity of the self-reports:   If the self-reports were a valid measure of psychiatric distress, we should see each crisis episode reflected in the inmate’s corresponding self-report. But if in filling out his self-report, the inmate responds so as to indicate he is doing just fine, then the self-reports are worthless.  They are garbage; they are in no way a measure of psychiatric distress. Now, it would have been quite easy for the authors to review these cases, a total of 37 recorded instances that would require simply a review of the corresponding self-report rating by the inmate during the time period at issue. I explicitly pointed this out to the authors prior to their public presentation of the data and prior to their final submission for publication.  Yet the authors declined to perform this crucial check on their data.  And, indeed, even looking cursorily at the data,  it is fairly obvious that such a review would have revealed that the self-report data was worthless.

5.  Conclusion.

There are a number of other methodological difficulties with their report, but in the end, much of the 163-page final report consists of endless statistical dissections of the self-report data. Yet these minute dissections reveal nothing, because the data they dissect does not in any meaningful manner reflect the psychiatric pathology they are supposed to be studying. They endlessly dissect garbage. And statistics are not alchemy; they cannot transform garbage into anything but different arrangements of garbage. Thus the saying among computer people and statisticians: “G.I. – G.O.: garbage in, garbage out.”




