The Drawbacks of PHQ-9 Scoring in Depression Assessment

A study by Brooke Levis and Brett D. Thombs at McGill University in Montréal, reported by Mad in America, found that the PHQ-9 classified more than twice as many people as depressed as a structured clinical interview did. Specifically, 24.6% of participants scored 10 or higher on the PHQ-9, whereas only 12.1% met the criteria for depression on the SCID, a structured clinical interview conducted by a physician.
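To put the two percentages side by side, here is a minimal arithmetic sketch in Python. It uses only the figures quoted above and assumes that both percentages describe the same group of participants. The variable names are illustrative and do not come from the study.

```python
# Figures quoted above, as percentages of participants.
phq9_positive_pct = 24.6  # scored 10 or higher on the PHQ-9
scid_positive_pct = 12.1  # met the criteria for depression on the SCID

# How many PHQ-9 positives there are for each SCID positive.
ratio = phq9_positive_pct / scid_positive_pct

# Assumption: both figures refer to the same participants. In that case,
# even if every SCID-positive person also screened positive, this is the
# largest share of PHQ-9 positives the interview could confirm.
max_confirmed_share = scid_positive_pct / phq9_positive_pct

print(f"PHQ-9 positives per SCID positive: {ratio:.2f}")
print(f"Upper bound on confirmed PHQ-9 positives: {max_confirmed_share:.1%}")
```

Under that assumption, at least about half of the people who reach the PHQ-9 cutoff would not meet the interview criteria, which is why the score reads better as a prompt for follow-up than as a diagnosis.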

As we increasingly recognize the significant role mental disorders play in our communities and the importance of their early detection, it becomes evident that more accurate mental health screening methods are needed. In this article, we will explore how the PHQ-9 scoring system works, how it became the standard for depression screening, and how it fails to account for comorbidities such as Mild Cognitive Impairment (MCI).

What is PHQ-9 Scoring and how does it work?

The PHQ-9 is a questionnaire made up of nine questions, as the name suggests. It is designed to provide a quantitative approach to screening for depression and measuring patients’ response to treatment, and it is based on the nine diagnostic criteria for major depressive disorder in the DSM-IV. Patients rate how often each of nine statements has applied to them over the last 2 weeks, on a scale from 0 (“not at all”) to 3 (“nearly every day”). The nine statements are:

Little interest or pleasure in doing things
Feeling down, depressed, or hopeless
Trouble falling or staying asleep, or sleeping too much
Feeling tired or having little energy
Poor appetite or overeating
Feeling bad about yourself — or that you are a failure or have let yourself or your family down
Trouble concentrating on things, such as reading the newspaper or watching television
Moving or speaking so slowly that other people could have noticed, or the opposite — being so fidgety or restless that you have been moving around a lot more than usual
Thoughts that you would be better off dead or hurting yourself in some way

For each of these statements, patients give a score from 0 to 3, resulting in a total score from 0 to 27. Scores of 0-4 are termed “minimal depression”, scores of 5-9 “mild depression”, and scores of 10-14 “moderate depression”. No specific treatment is advised for mild and moderate depression; the decision is left to the physician’s judgment, based on the duration of symptoms and the degree of functional impairment. Scores of 15-19 are classified as “moderately severe depression”, and scores of 20-27 as “severe depression”. For moderately severe and severe depression, several possible treatments are advised, including antidepressants, psychotherapy, or both.
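The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not an official implementation: the function names `phq9_total` and `phq9_severity` are made up for this example, and the bands are taken directly from the ranges listed in the previous paragraph.

```python
from typing import Sequence

# Severity bands as listed in the text: (lowest total, highest total, label).
SEVERITY_BANDS = [
    (0, 4, "minimal depression"),
    (5, 9, "mild depression"),
    (10, 14, "moderate depression"),
    (15, 19, "moderately severe depression"),
    (20, 27, "severe depression"),
]


def phq9_total(item_scores: Sequence[int]) -> int:
    """Add up the nine item ratings, each an integer from 0 to 3."""
    if len(item_scores) != 9:
        raise ValueError("the PHQ-9 has exactly nine items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be rated 0, 1, 2, or 3")
    return sum(item_scores)


def phq9_severity(total: int) -> str:
    """Map a total score (0 to 27) to the severity label used in the text."""
    for low, high, label in SEVERITY_BANDS:
        if low <= total <= high:
            return label
    raise ValueError("the total score must be between 0 and 27")


# Hypothetical answers from one patient, in questionnaire order.
answers = [2, 1, 2, 2, 1, 1, 1, 0, 0]
total = phq9_total(answers)
print(total, phq9_severity(total))
```

The label is only a summary of self-reported symptom frequency. The sketch deliberately returns nothing about treatment, since the text leaves that decision to the clinician.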

How PHQ-9 Became the Standard Patient Health Questionnaire

The acronym PHQ stands for Patient Health Questionnaire. It is the single-page, self-administered component of a larger assessment called PRIME-MD, which also includes a 12-page Clinician Evaluation Guide (CEG). The CEG takes the form of a structured interview that clinicians use to follow up on the responses to the PHQ.

Although initially part of the broader PRIME-MD assessment, the PHQ was separated and used on its own as the PHQ-9 in the late 1990s. Since then, clinicians and researchers have commonly employed it as the primary tool for measuring depression and its risks.

The PHQ-9’s status as the standard measure for depression was further cemented in 2010 when the pharmaceutical company that developed PRIME-MD made the PHQ, PHQ-9, and other associated assessment scales copyright-free. This move allowed the questionnaire to be more widely adopted by organizations and healthcare providers, leading to the PHQ-9’s rapid proliferation and widespread use.

Important Limitations of PHQ-9 Scoring

While the PHQ-9 addresses the need for a standard, copyright-free, and cost-free method for assessing depression, it also comes with serious caveats that physicians often overlook. Most notably, although the PHQ-9 score is intended to guide providers as a screening result, providers frequently rely on it alone to make a diagnosis and prescribe medication. This approach fails to consider potential comorbidities and other disorders that are crucial for a comprehensive clinical assessment. For example, the link between depression and Mild Cognitive Impairment (MCI) is well established: depression can lead to cognitive impairment, and vice versa, creating a vicious circle that requires careful evaluation.

These limitations are underscored by the finding that providers misdiagnose depression 66% of the time and generalized anxiety disorder 71% of the time, according to a study indexed by the National Library of Medicine. Such high rates of misdiagnosis highlight the need for a more effective system for managing mental health. It may be time to consider alternative or supplementary screening tools to improve diagnostic accuracy and patient care.

Alternative Diagnostic Systems

As we advance further into the AI revolution, new solutions to old problems are emerging constantly, breaking age-old patterns and setting new standards. In the realm of depression, the LANGaware mental health test offers an innovative AI-based approach that relies on subtle speech and voice patterns. Since these patterns are involuntary, they provide a more objective measure compared to self-assessment methods. The test only requires the patient to provide a small voice sample while describing an image or activity, making the process simple and non-intrusive.

The LANGaware mental health test takes just 5 minutes to determine whether a patient has mild or severe depression, or if they are healthy. Most importantly, it is complemented by a cognitive test that helps assess potential comorbidities, such as Mild Cognitive Impairment (MCI) or dementia.

Final Thoughts on PHQ-9 Scoring

As part of the PRIME-MD, the PHQ serves as an initial self-assessment tool that helps physicians determine the general direction for further evaluation and develop a well-thought-out therapeutic plan. However, when used as a standalone test, PHQ-9 scoring can be too subjective, and relying on it as a definitive diagnostic tool is misguided. Instead, it would be more effective as an initial screening tool, with a definitive diagnosis being left to more comprehensive examinations.

In the realm of screening tools, newer methods are emerging that utilize advanced techniques to enhance the accuracy of this initial stage. By moving from what is merely convenient to what is more precise, physicians can improve diagnostic efficiency and ensure that our communities receive the mental health treatment they deserve.