Friday, January 20, 2012

Allen Frances on the DSM-5 Field Trials

Two approaches are possible when the DSM 5 field trials reveal low reliability for a given suggestion: 1) admit that the suggestion was a bad idea or that it is written so ambiguously as to be unusable in clinical practice, research, and forensics; Or, 2) declare by arbitrary fiat that the low reliability is indeed now to be relabeled ‘acceptable’.
Allen Frances has a couple of new pieces on what he sees as fatal problems with the DSM-5 field trials. First, the DSM-5 developers appear to be retroactively and illegitimately lowering their reliability standards in response to disappointing field-trial results:
The commentary states: “A realistic goal is a kappa between 0.4 and 0.6, while a kappa between 0.2 and 0.4 would be acceptable.” This is simply incorrect and flies in the face of all traditional standards of what is considered ‘acceptable’ diagnostic agreement among clinicians. Clearly, the commentary is attempting to greatly lower our expectations about the levels of reliability that were achieved in the field trials – to soften us up to the likely bad news that the DSM-5 proposals are unreliable. Unable to clear the historic bar of reasonable reliability, it appears that DSM-5 is choosing to drastically lower that bar – what was previously seen as clearly unacceptable is now being accepted.
It’s rather strange that they didn’t have to establish and publicly defend acceptable reliability levels beforehand. This is especially problematic given that, Frances reports, the Oversight Committee was insisting in 2010 that they would not move forward if the field trials returned “crappy results.” But the second phase that was to test revised criteria based on their performance in the first phase was dropped. It appears they are moving forward by redefining “crappy.”

As he points out, reliability is only one criterion, but it’s an essential one: “Why does this matter? Good reliability does not guarantee validity or utility – human beings often agree very well on things that are dead wrong. But poor reliability is a certain sign of very deep trouble. If mental health clinicians cannot agree on a diagnosis, it is essentially worthless.”
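
For readers unfamiliar with the statistic being argued over, kappa measures how much two clinicians agree beyond what chance alone would produce. Below is a minimal sketch of Cohen's kappa, one common variant; the field trials may well have computed a different form, and the function name and the ten-patient ratings are made up purely for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same cases.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")

    n = len(rater_a)

    # Fraction of cases on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's own label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")

    return (observed - expected) / (1 - expected)


# Hypothetical ratings: 1 = diagnosis given, 0 = diagnosis not given.
clinician_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
clinician_b = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]

kappa = cohen_kappa(clinician_a, clinician_b)  # about 0.4
```

In this made-up example the two clinicians agree on 7 of 10 patients, yet half of that agreement is what chance would predict, so kappa comes out at about 0.4. That is the top of the 0.2 to 0.4 band the commentary wants to call "acceptable."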

Furthermore, they did not set up the field trials to assess how the new diagnostic criteria would affect prevalences. As the post hoc rationalization goes:
“[O]ne contentious issue is whether it is important that the prevalence for diagnoses based on proposed criteria for DSM-5 match the prevalence for the corresponding DSM-IV diagnoses” …. “to require that the prevalence remain unchanged is to require that any existing difference between true and DSM-IV prevalence be reproduced in DSM-5. Any effort to improve the sensitivity of DSM-IV criteria will result in higher prevalence rates, and any effort to improve the specificity of DSM-IV criteria will result in lower prevalence rates. Thus, there are no specific expectations about the prevalence of disorders in DSM-5.”
But, as Frances makes clear,
This is irresponsible for two reasons. First off, we are already suffering from serious diagnostic inflation. Rates of psychiatric disorder are already sky high (25% in the general population in any year; 50% lifetime) and we recently have experienced three runaway false epidemics of childhood disorders in the past 15 years. Second, drug company marketing has been so abusive as to warrant enormous fines and so successful as to result in widespread misuse of medication for very questionable indications….

The DSM-5 proposals will uniformly increase rates, sometimes dramatically. Not to have measured by how much is unfathomable and irresponsible. The new diagnoses suggested for DSM-5 will (mis)label people at the very populous boundary with normality…The field trial developers seem either unaware or insensitive to the unacceptable risks involved in creating large numbers of false positive, pseudo-patients.

Indeed, quite contrary to the blithe assertions put forward in the commentary, we should have rigorous expectations about prevalence changes triggered by any DSM revision. Rates should not be wildly different for the same disorder UNLESS there is clear evidence of a serious false negative problem and firm protections against creating a massive false positive problem. And new disorders with high prevalences should not be included without substantial scientific evidence and convincing proof of accuracy, reliability, and safety. We have known since they were first posted that none of the DSM-5 proposals comes remotely close to meeting a minimal standard for accuracy and safety. And now, the AJP commentary seems to be softening us up for the bad news that their reliability is also lousy.
These field trials sound like a mess. They would be insufficient even if they were well designed and executed and met acceptable standards of reliability, which does not appear to be the case.
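
The arithmetic behind the false-positive worry is easy to sketch. The snippet below uses the standard relationship between true prevalence, sensitivity, and specificity; the function names and every number in it are hypothetical, chosen only to illustrate the mechanism, and none come from the field trials or from Frances.

```python
def apparent_prevalence(true_prevalence, sensitivity, specificity):
    """Share of the population that the criteria would label as having the disorder."""
    for value in (true_prevalence, sensitivity, specificity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be proportions between 0 and 1")
    true_positives = sensitivity * true_prevalence
    false_positives = (1 - specificity) * (1 - true_prevalence)
    return true_positives + false_positives


def false_positive_share(true_prevalence, sensitivity, specificity):
    """Among people given the label, the share who do not have the disorder."""
    labeled = apparent_prevalence(true_prevalence, sensitivity, specificity)
    if labeled == 0:
        raise ValueError("nobody is labeled, so the share is undefined")
    false_positives = (1 - specificity) * (1 - true_prevalence)
    return false_positives / labeled


# Hypothetical disorder: 5% true prevalence, criteria with
# 80% sensitivity and 90% specificity (illustrative values only).
labeled = apparent_prevalence(0.05, 0.80, 0.90)      # about 0.135
mislabeled = false_positive_share(0.05, 0.80, 0.90)  # about 0.70
```

With these made-up inputs, the criteria would label about 13.5% of the population even though only 5% have the disorder, and roughly 70% of those labeled would be false positives. When most people are healthy, even modest imprecision at the boundary with normality swamps the true cases, which is why not measuring the prevalence shift matters.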

Imagine a medical licensing program in which performance standards are not specified in advance, and students who receive Ds (frequently misdiagnosing healthy people with serious illnesses and prescribing expensive and dangerous courses of treatment) are granted licenses anyway. Scary.

