Factor Structure of Persian Translation of the Patient Health Questionnaire in Iranian Earthquake Survivors


Hassan Rafiey 1, Fardin Alipour 2,*, Richard LeBeau 3, Yahya Salimi 4, Shokoufeh Ahmadi 5

1 Social Welfare Management Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran

2 Department of Social Work, Social Welfare Management Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran

3 Department of Psychology, University of California, Los Angeles, USA

4 Department of Epidemiology, School of Public Health, Kermanshah University of Medical Science, Kermanshah, Iran

5 Department of Health in Emergency and Disaster, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran

How to Cite: Rafiey H, Alipour F, LeBeau R, Salimi Y, Ahmadi S. Factor Structure of Persian Translation of the Patient Health Questionnaire in Iranian Earthquake Survivors. Iran J Psychiatry Behav Sci. 2018;12(4):e59416. doi: 10.5812/ijpbs.59416.


Iranian Journal of Psychiatry and Behavioral Sciences: 12 (4); e59416
Published Online: October 2, 2018
Article Type: Original Article
Received: August 4, 2017
Revised: January 16, 2018
Accepted: March 13, 2018


Background: There is great need across the globe for self-report scales of depression that are brief to administer, comprehensive in content, and psychometrically valid. As most of the widely used and well-validated scales originate in English, it is essential to translate them carefully and then psychometrically validate the adapted scales.

Objectives: The current study aimed at investigating the translation and validation of the Persian version of one of the most widely utilized self-report depression scales in the world, the patient health questionnaire (PHQ).

Methods: The current study evaluated the validity and reliability of the PHQ in a population-based sample of 600 adult survivors of the 2012 earthquakes in Iran, surveyed in 2015. Researchers used the forward-backward method to translate the PHQ-9 into the Persian language. Data were analyzed using both exploratory factor analysis (EFA) and confirmatory factor analysis (CFA).

Results: Consistent with the results of validation studies on the original English-language versions of the scales, the current study found strong evidence of internal consistency (Cronbach’s alpha = 0.86). CFA suggested that the 1-factor structure fit reasonably well. Convergent validity was confirmed by the significant and positive correlation between the scores of national stressful events survey for posttraumatic stress disorder-short scale (NSESSS-PTSD) and the PHQ-9 (r = 0.29, P < 0.001). Approximately one-third of the sample reported some depression symptoms, with less than 10% indicating moderately severe to severe symptoms.

Conclusions: Although replication studies are needed, the current study results suggested that the Persian version of the PHQ-9 was a valid and reliable tool to efficiently, comprehensively, and accurately assess depression symptoms in the Iranian subjects.

1. Background

Depressive disorders are the greatest contributors to the burden of disease among all psychological disorders (1). Due to the increasing recognition of the deleterious impact of depression on psychosocial functioning and quality of life, screening for depression is increasingly common outside mental health treatment centers, in settings such as primary care clinics and schools (2). Concurrently, there is a growing emphasis on psychological and psychiatric interventions that are informed by outcome monitoring (3-5). Both of these trends underscore the need for depression self-report scales that are brief to administer, comprehensive in content, and psychometrically valid. Numerous such scales have been developed in recent decades.

Among the most widely used and well-validated self-report measures of depression is the patient health questionnaire (PHQ) (6). The scale contains nine items assessing DSM-IV (diagnostic and statistical manual of mental disorders, fourth edition) major depressive disorder (MDD) symptoms over the last two weeks, each scored on a four-option Likert scale from 0 (not at all) to 3 (nearly every day). The scale has been subjected to extensive psychometric validation (7) and has been cited nearly 9000 times. Since its publication, the PHQ-9 has been successfully translated into dozens of languages, and these translated versions have shown good validity and reliability compared with the original English-language scale (8).

Iran is a country with a great need for well-validated assessments of depression. With a population of over 82 million people, Iran is the 18th most populous country in the world (9). A recent review of the extant literature regarding the prevalence of MDD in Iran by Sadeghirad et al. estimated that the prevalence of MDD in the past year was 4.1%, which corresponds to an estimated 3.3 million Iranians. These findings were based on 24 studies that used a combination of well-validated self-report scales and clinician-administered diagnostic interviews to assess MDD symptoms in 49273 Iranians. Also, results of the Iranian mental health survey showed that the prevalence of depression based on a score of 2 and higher in the general health questionnaire (GHQ)-28 was 10.39% (10). The impact of earthquakes on mental health is a concern among disaster survivors. Depression is one of the major mental health outcomes after natural disasters. The prevalence of depression after earthquakes has been reported to range from 9% to 79% (11-15). In a study on survivors of an earthquake in Pakistan in 2005, the prevalence of depression was 70.9% six months after the disaster (16). Notably, female gender was a major risk factor for MDD across studies, with females being nearly twice as likely to have MDD as males (17).

The authors are aware of only four studies examining a Persian (Farsi) translation of the PHQ-9 to assess depression symptoms in Iranian subjects. In the first study, Khamseh et al. translated the PHQ-9 into Persian and administered it to 185 patients receiving treatment for diabetes. This version of the scale demonstrated strong reliability (Cronbach’s alpha = 0.86). A clinical cut off score of 13 was established for the measure and 46% of patients in this sample exceeded the cut off. The authors noted that the rate of depression observed in the sample was unusually high compared with the general population of Iran, but noted that it was consistent with high rates of depression symptoms in individuals with diabetes (18). In the second study, 1006 patients receiving treatment in healthcare centers were screened for depression (19). The results suggested strong validity and reliability of the Persian measure of the PHQ-9 and indicated that although 77% of the screened subjects had at least some depression symptoms, moderately severe and severe cases were quite rare (7%). Third, a pilot study evaluated the Persian version of the PHQ-9 in 20 patients seeking medical treatment for asthma and adequate psychometric properties were found for the scale (20). The fourth study found adequate psychometric properties of a brief version of the PHQ (PHQ-4) in patients with chronic obstructive pulmonary disease (21).

Although these studies suggested that the Persian version of the PHQ-9 may be a valid tool to assess depression symptoms among Iranian subjects, they are limited by their reliance on a mostly urban population of individuals receiving medical treatment. In addition, the studies primarily aimed at determining the prevalence of MDD rather than validating the scale. Therefore, the factor structure of the scale has not yet been investigated.

2. Objectives

The current study aimed at addressing these limitations by thoroughly exploring and confirming the structural validity of the Persian version of the PHQ-9 in a large rural sample of Iranians exposed to a natural disaster.

3. Materials and Methods

3.1. Participants

A population-based cross-sectional study was conducted in August 2015 in rural areas affected by the Ahar, Heris, and Varzaqan earthquakes (2012) in Iran’s Eastern Azerbaijan province. Two earthquakes with a magnitude of 6 or higher on the moment magnitude scale (MMS) occurred 11 minutes apart. At least 306 people were killed and more than 3000 others were injured. The data were collected from 600 respondents in the earthquake-stricken areas using a multi-stage cluster random sampling technique. Six psychologists and social workers collected the data from respondents aged 18 - 87 years. Before the assessments, all data collectors met to review the questionnaire and agreed on the explanation of each item. Because face-to-face interviews were used, there were no missing data in the current study.

After selecting samples of rural areas (primary units) and households (secondary units), one family member (aged 18 years or above) in each household was randomly selected using the Kish method. The Kish selection method is one of the best methods to select household members in survey sampling; it uses a pre-assigned table of random numbers to identify the person to be interviewed (22). The current study was approved by the Ethical Committee of University of Social Welfare and Rehabilitation Sciences (USWR), Tehran, Iran. Also, participants signed a written informed consent form and had the right to withdraw from the study at any time.

3.2. Translation Procedures

Researchers used the forward-backward method to translate the PHQ-9 into the Persian language. The questionnaire was translated from English into Persian by two members of the research group independently, and the primary Persian version of the questionnaire was developed by comparing the two translations. Then, the Persian version was back-translated into English by two expert translators who were blind to the original version of the PHQ-9. Both PHQ-9 versions were discussed in an expert panel, and agreement on a final Persian version of the scale was reached after matching the items.

3.3. Measures

The PHQ-9 comprises nine items scored on a four-option Likert scale from 0 (not at all) to 3 (nearly every day). Total scores range 0 - 27, with higher scores indicating more severe depression. To interpret total scores on the PHQ-9, scores of 0 - 4 are considered as no depression, 5 - 9 as mild depression, and 10 - 14, 15 - 19, and 20 - 27 as moderate, moderately severe, and severe depression, respectively. A checklist assessing various demographic factors was also completed by each participant.
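The scoring rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `phq9_total` and `phq9_severity` are hypothetical and not part of the study, and the bands are taken directly from the cutoffs stated in this section.

```python
from typing import Sequence

# Severity bands as stated in the Measures section (total score 0 - 27).
SEVERITY_BANDS = [
    (0, 4, "none"),
    (5, 9, "mild"),
    (10, 14, "moderate"),
    (15, 19, "moderately severe"),
    (20, 27, "severe"),
]


def phq9_total(responses: Sequence[int]) -> int:
    """Sum nine item responses, each coded 0 (not at all) to 3 (nearly every day)."""
    if len(responses) != 9:
        raise ValueError("PHQ-9 requires exactly nine item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be an integer from 0 to 3")
    return sum(responses)


def phq9_severity(total: int) -> str:
    """Map a total score to the severity label used in this section."""
    for low, high, label in SEVERITY_BANDS:
        if low <= total <= high:
            return label
    raise ValueError("total score must lie between 0 and 27")


if __name__ == "__main__":
    answers = [1, 2, 1, 2, 0, 1, 1, 0, 0]  # hypothetical respondent
    total = phq9_total(answers)
    print(total, phq9_severity(total))
```

For the hypothetical respondent above, the total is 8, which falls in the mild band.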

A posttraumatic stress disorder (PTSD) scale was employed to assess the convergent validity of the PHQ-9 in the study sample. The national stressful events survey for PTSD (NSESSS-PTSD) has nine items asking individuals to rate the severity of their symptoms following a traumatic event. Each item is scored on a five-option Likert scale. The total score ranges 0 to 36, with higher scores indicating greater severity of PTSD (14).

3.4. Statistical Analysis

To assess content validity, 10 mental health and questionnaire-designing experts reviewed the 9-item questionnaire to ensure relevance and clarity of the items. Each reviewer independently rated the content validity based on four criteria including relevancy, clarity, simplicity, and necessity.

To assess the psychometric parameters of the PHQ, data were analyzed using several techniques. An exploratory factor analysis (EFA) (principal components analysis) was conducted on a randomly selected 50% of the study sample in order to extract the underlying factors. To confirm the hypothesized factor structure, confirmatory factor analysis (CFA) was conducted on the remaining 50% of the sample. Goodness-of-fit criteria for the CFA were Chi-squared/df < 5, root mean square error of approximation (RMSEA) < 0.08, comparative fit index (CFI), goodness-of-fit index (GFI), and Tucker-Lewis index (TLI) > 0.9, and standardized root mean squared residual (SRMR) < 0.08 (23, 24).
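As a rough illustration of how these cutoffs can be applied, the sketch below checks a set of fit statistics against the thresholds listed above. The function name `meets_fit_criteria` is hypothetical, and the authors report using STATA rather than Python; GFI is treated as optional here only because it is not always reported.

```python
from typing import Dict, Optional


def meets_fit_criteria(
    chi2: float,
    df: int,
    rmsea: float,
    cfi: float,
    tli: float,
    srmr: float,
    gfi: Optional[float] = None,
) -> Dict[str, bool]:
    """Compare CFA fit statistics with the cutoffs named in this section."""
    checks = {
        "chi2/df < 5": chi2 / df < 5,
        "RMSEA < 0.08": rmsea < 0.08,
        "CFI > 0.9": cfi > 0.9,
        "TLI > 0.9": tli > 0.9,
        "SRMR < 0.08": srmr < 0.08,
    }
    if gfi is not None:  # GFI is only checked when a value is supplied
        checks["GFI > 0.9"] = gfi > 0.9
    return checks


if __name__ == "__main__":
    # Values reported in the Results section of this article.
    result = meets_fit_criteria(
        chi2=35.88, df=21, rmsea=0.05, cfi=0.998, tli=0.996, srmr=0.022
    )
    for criterion, passed in result.items():
        print(f"{criterion}: {'met' if passed else 'not met'}")
```

Returning one flag per criterion, rather than a single pass/fail value, keeps it visible which index is responsible when a model falls short.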

Internal consistency was measured using Cronbach’s alpha. Based on the interpretation guidelines for the coefficient, values of 0.70 - 0.79 are considered fair, 0.80 - 0.89 good, and 0.90 and higher excellent. P values less than 0.05 were considered statistically significant. Statistical analyses were conducted using STATA 13.
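For readers unfamiliar with the coefficient, a minimal sketch of Cronbach’s alpha follows, using the standard formula alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function names and the toy data are hypothetical; the study itself computed alpha in STATA.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for rows of respondents, each a sequence of k item scores."""
    if len(rows) < 2:
        raise ValueError("at least two respondents are required")
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("every respondent needs the same k >= 2 item scores")
    item_columns: List[tuple] = list(zip(*rows))
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def interpret_alpha(alpha: float) -> str:
    """Label alpha using the bands given in this section."""
    if alpha >= 0.90:
        return "excellent"
    if alpha >= 0.80:
        return "good"
    if alpha >= 0.70:
        return "fair"
    return "below the fair range"


if __name__ == "__main__":
    toy = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 2, 3]]  # hypothetical scores
    a = cronbach_alpha(toy)
    print(round(a, 3), interpret_alpha(a))
```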

4. Results

Six hundred people completed the PHQ. As shown in Table 1, the mean age of the participants was 35.2 (SD = 12.8) years; the majority of participants were male (66.2%), married (65.8%), and had an educational level between primary school and diploma (58.3%). The mean score for the Persian version of the PHQ in all 600 participants was 3.88 (SD = 6.39). At least minimal depression symptoms were reported in 34.5% of the sample (n = 207), and the breakdown in terms of severity was as follows: 6.7% minimal, 7.8% mild, 10.7% moderate, 6.3% moderately severe, and 3% severe.

Table 1. Characteristics of the Study Participants (n = 600)
Variables                   No. (%)
Gender
  Male                      397 (66.2)
  Female                    203 (33.8)
Marital status
  Single                    195 (32.5)
  Married                   395 (65.8)
  Other                     10 (1.7)
Educational level
  Primary school            162 (27.2)
  Primary to diploma        350 (58.3)
  University degree         88 (14.7)
Depression prevalence
  None                      393 (65.5)
  Minimal                   40 (6.7)
  Mild                      47 (7.8)
  Moderate                  64 (10.7)
  Moderately severe         38 (6.3)
  Severe                    18 (3.0)

Ten mental health and questionnaire-designing experts confirmed the content validity of the PHQ-9 after some items were translated and modified to fit the Farsi language and Iranian culture. Good internal consistency for the total PHQ score was demonstrated (Cronbach’s alpha = 0.86). The highest and lowest corrected item-scale correlations were 0.67 and 0.46, respectively. EFA extracted one factor that explained 46% of the total variance. Since only one factor was extracted and the unrotated factor loadings were easily interpreted, rotation was not conducted. The Kaiser-Meyer-Olkin (KMO) measure with the value of 0.87 and the Bartlett test of sphericity (Chi-square = 896.006, df = 36, P < 0.001) indicated the suitability of the data for factor analysis. The one-factor structure of the questionnaire was confirmed by the scree plot. Factor loadings are shown in Table 2.

Table 2. Item Component Loadings for the PHQ Scale
Item (Summarized Content)                                                        Component Loading
Little interest (little interest or pleasure in doing things)                    0.549
Feeling down (feeling down, depressed, or hopeless)                              0.739
Trouble sleeping (trouble in falling or staying asleep, or sleeping too much)    0.749
Tired (feeling tired or having little energy)                                    0.662
Poor appetite (poor appetite or overeating)                                      0.671
Feeling bad about self                                                           0.705
Trouble concentrating                                                            0.696
Slowing (moving or speaking so slowly that other people could have noticed)      0.660
Thoughts of self-harm                                                            0.630

Maximum likelihood estimation was used for all CFA models. The CFA showed a very good fit to the data (χ2 = 35.88; df = 21; normed χ2 = 1.71 < 5; comparative fit index (CFI) = 0.998; TLI = 0.996; SRMR = 0.022; RMSEA = 0.05 (90% CI: 0.022 - 0.078); P close = 0.431).
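The arithmetic linking these statistics can be sketched as follows. Two assumptions are made here that the article does not state explicitly: the CFA subsample is taken to be n = 300 (half of the 600 participants), and RMSEA is computed with the common point-estimate formula sqrt(max(χ2 - df, 0) / (df × (n - 1))); some software uses n instead of n - 1. The function names are hypothetical.

```python
import math


def normed_chi2(chi2: float, df: int) -> float:
    """Chi-square divided by its degrees of freedom."""
    return chi2 / df


def rmsea_point_estimate(chi2: float, df: int, n: int) -> float:
    """RMSEA point estimate; assumes the (n - 1) convention in the denominator."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


if __name__ == "__main__":
    chi2, df = 35.88, 21   # values reported in the text
    n = 300                # assumption: half of the 600 participants
    print(round(normed_chi2(chi2, df), 2))
    print(round(rmsea_point_estimate(chi2, df, n), 2))
```

Under these assumptions the ratio χ2/df is about 1.71 and the RMSEA point estimate is about 0.05, both within the cutoffs used in this study.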

Box 1. Original Version of the Questionnaire
Over the Last 2 Weeks, How Often Have You Been Bothered by Any of the Following Problems?
1. Little interest or pleasure in doing things
2. Feeling down, depressed, or hopeless
3. Trouble in falling or staying asleep, or sleeping too much
4. Feeling tired or having little energy
5. Poor appetite or overeating
6. Feeling bad about yourself, or that you are a failure, or have let yourself or your family down.
7. Trouble concentrating on things, such as reading the newspaper or watching television
8. Moving or speaking so slowly that other people could have noticed, or the opposite; being so fidgety or restless that you were moving around a lot more than usual.
9. Thoughts that you would be better off dead or of hurting yourself in some way.

Figure 1 shows the confirmatory factor analysis results along with standardized factor loadings for the PHQ. Also, convergent validity was confirmed by the significant and positive correlation between the scores of NSESSS-PTSD and the PHQ-9 (r = 0.29, P < 0.001).

Figure 1. Confirmatory factor analysis standardized factor loadings for the PHQ

5. Discussion

Approximately one-third of the sample reported at least some depressive symptoms, with less than 10% scoring in the moderately severe or severe range. Consistent with the findings regarding the psychometric properties of the original English-language version of the scale, the Persian version demonstrated strong internal consistency, the presence of a single factor, and convergent validity (7). These findings suggest that the Persian version of the PHQ-9 may be a valid and reliable measure to assess depression symptoms among Iranians. In the current study, CFA on a validation sample showed adequate support for the one-factor structure of the Persian version of the PHQ.

These findings should be considered in the context of several limitations. First, despite the relatively large sample size, the generalizability of the findings may be limited by certain aspects of homogeneity among the study participants (i.e., the fact that they all came from the same geographic region and were exposed to the same traumatic event). Thus, these results need to be replicated in more diverse samples. Second, the study design precluded tests of certain psychometric properties of the scale such as discriminant validity and test-retest reliability. These should be examined in future studies. Third, due to the nature of the study design, the questionnaire was administered orally by an interviewer as opposed to its more typical format of being completed by the individuals themselves using pencil and paper or a computer. It is important for future studies to ensure that these results are replicated when the scale is administered in a different format.

The need for valid, reliable, and easily accessible symptom report scales is great in Iran. Scales assessing mental disorders may be especially important given the frequency of natural disasters in the country. Despite these limitations, the study had several strengths, including the use of well-standardized psychometric validation procedures and a large and carefully selected sample that differed significantly from the samples in which the Persian version of the PHQ-9 was previously validated. Although further replication is needed, the current study results suggest that the Persian version of the PHQ-9 is a valid and reliable tool to efficiently, comprehensively, and accurately assess depression symptoms in Iranian subjects. The widespread application of the Persian version of the PHQ-9 in a variety of settings has the potential to improve the identification of depression symptoms, which is essential to connect individuals to available resources, as well as to increase awareness of what further resources are needed.




References

1. Whiteford HA, Degenhardt L, Rehm J, Baxter AJ, Ferrari AJ, Erskine HE, et al. Global burden of disease attributable to mental and substance use disorders: Findings from the global burden of disease study 2010. Lancet. 2013;382(9904):1575-86. doi: 10.1016/S0140-6736(13)61611-6. [PubMed: 23993280].

2. Siu AL, Bibbins-Domingo K, Grossman DC, Baumann LC, Davidson KW; U. S. Preventive Services Task Force, et al. Screening for depression in adults: US preventive services task force recommendation statement. JAMA. 2016;315(4):380-7. doi: 10.1001/jama.2015.18392. [PubMed: 26813211].

3. Boswell JF, Kraus DR, Miller SD, Lambert MJ. Implementing routine outcome monitoring in clinical practice: Benefits, challenges, and solutions. Psychother Res. 2015;25(1):6-19. doi: 10.1080/10503307.2013.817696. [PubMed: 23885809].

4. Alipour F, Khankeh HR, Fekrazad H, Kamali M, Rafiey H, Ahmadi S. Social issues and post-disaster recovery: A qualitative study in an Iranian context. Int Soc Work. 2015;58(5):689-703. doi: 10.1177/0020872815584426.

5. Rafiey H, Alipour F, LeBeau R, Amini Rarani M, Salimi Y, Ahmadi S. Evaluating the psychometric properties of the mental health continuum-short form (MHC-SF) in Iranian earthquake survivors. Int J Ment Health. 2017;46(3):243-51. doi: 10.1080/00207411.2017.1308295.

6. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: Validity of a brief depression severity measure. J Gen Intern Med. 2001;16(9):606-13. [PubMed: 11556941]. [PubMed Central: PMC1495268].

7. Forkmann T, Gauggel S, Spangenberg L, Brahler E, Glaesmer H. Dimensional assessment of depressive severity in the elderly general population: Psychometric evaluation of the PHQ-9 using Rasch analysis. J Affect Disord. 2013;148(2-3):323-30. doi: 10.1016/j.jad.2012.12.019. [PubMed: 23411025].

8. Wang W, Bian Q, Zhao Y, Li X, Wang W, Du J, et al. Reliability and validity of the Chinese version of the patient health questionnaire (PHQ-9) in the general population. Gen Hosp Psychiatry. 2014;36(5):539-44. doi: 10.1016/j.genhosppsych.2014.05.021. [PubMed: 25023953].

9. Mirzaie M, Darabi S. [Population aging in Iran and rising health care costs]. Iran J Age. 2017;12(2):156-69. Persian.

10. Noorbala AA, Faghihzadeh S, Kamali K, Bagheri Yazdi SA, Hajebi A, Mousavi MT, et al. Mental health survey of the Iranian adult population in 2015. Arch Iran Med. 2017;20(3):128-34. [PubMed: 28287805].

11. Onder E, Tural U, Aker T, Kilic C, Erdogan S. Prevalence of psychiatric disorders three years after the 1999 earthquake in Turkey: Marmara earthquake survey (MES). Soc Psychiatry Psychiatr Epidemiol. 2006;41(11):868-74. doi: 10.1007/s00127-006-0107-6. [PubMed: 16906439].

12. Fan F, Zhang Y, Yang Y, Mo L, Liu X. Symptoms of posttraumatic stress disorder, depression, and anxiety among adolescents following the 2008 Wenchuan earthquake in China. J Trauma Stress. 2011;24(1):44-53. doi: 10.1002/jts.20599. [PubMed: 21351164].

13. Ghaffari-Nejad A, Ahmadi-Mousavi M, Gandomkar M, Reihani-Kermani H. The prevalence of complicated grief among Bam earthquake survivors in Iran. Arch Iran Med. 2007;10(4):525-8. [PubMed: 17903061].

14. Rafiey H, Alipour F, LeBeau R, Salimi Y, Sayad M. Evaluating the Persian translation of the national stressful events survey PTSD short scale in a sample of Iranian earthquake survivors. J Loss Trauma. 2017;22(8):660-8. doi: 10.1080/15325024.2017.1373888.

15. Rafiey H, Momtaz YA, Alipour F, Khankeh H, Ahmadi S, Sabzi Khoshnami M, et al. Are older people more vulnerable to long-term impacts of disasters? Clin Interv Aging. 2016;11:1791-5. doi: 10.2147/CIA.S122122. [PubMed: 27994445]. [PubMed Central: PMC5153288].

16. Hashmi S, Petraro P, Rizzo T, Nawaz H, Choudhary R, Tessier-Sherman B, et al. Symptoms of anxiety, depression, and posttraumatic stress among survivors of the 2005 Pakistani earthquake. Disaster Med Public Health Prep. 2011;5(4):293-9. doi: 10.1001/dmp.2011.81. [PubMed: 22146668].

17. Sadeghirad B, Haghdoost AA, Amin-Esmaeili M, Ananloo ES, Ghaeli P, Rahimi-Movaghar A, et al. Epidemiology of major depressive disorder in Iran: A systematic review and meta-analysis. Int J Prev Med. 2010;1(2):81-91. [PubMed: 21566767]. [PubMed Central: PMC3075476].

18. Khamseh ME, Baradaran HR, Javanbakht A, Mirghorbani M, Yadollahi Z, Malek M. Comparison of the CES-D and PHQ-9 depression scales in people with type 2 diabetes in Tehran, Iran. BMC Psychiatry. 2011;11:61. doi: 10.1186/1471-244X-11-61. [PubMed: 21496289]. [PubMed Central: PMC3102614].

19. Mohit A, Jalili A, Nohesara S, Bolhari J. [A study of depression screening in primary care setting of Iran]. Int Med J. 2016;23(2). Persian.

20. FallahTafti S, Cheraghvandi A, Safa M, Talischi F. Pilot study of reliability of PHQ-9 questionnaire for evaluation of depression in hospitalized asthmatic patients. Eur Respiratory Soc. 2012.

21. Eslaminejad A, Safa M, Ghassem Boroujerdi F, Hajizadeh F, Pashm Foroush M. Relationship between sleep quality and mental health according to demographics of 850 patients with chronic obstructive pulmonary disease. J Health Psychol. 2017;22(12):1603-13. doi: 10.1177/1359105316684937. [PubMed: 28770626].

22. Nemeth R, editor. Respondent selection within the household-A modification of the Kish grid. Meeting of Young Statisticians. Citeseer; 2002.

23. Schumacker RE, Lomax RG. A beginner's guide to structural equation modeling. Psychology Press; 2004. doi: 10.4324/9781410610904.

24. Kline P. Handbook of psychological testing. Routledge; 2013. doi: 10.4324/9781315812274.

Copyright © 2018, Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits copying and redistributing the material only for noncommercial use, provided the original work is properly cited.