Measures of self-perceived oral health are useful tools for assessing oral health, identifying health disparities and evaluating the impact of health interventions at a population level [1, 2]. Indeed, oral health-related quality of life (OHrQOL) measures, such as the General Oral Health Assessment Index (GOHAI) questionnaire, can provide valuable information about oral health and its components, such as pain and discomfort, dysfunctions or the psychosocial impacts of oral diseases.
The GOHAI has already been widely used in clinical and epidemiological studies worldwide, as it is available in different languages. It includes three sets of response categories (three, five, and six categories) that generate a high GOHAI score for people with satisfactory oral health. The questionnaire is suited to the study of various types of adult populations. The French version of the GOHAI has been validated for use in a population aged 18-45 years, while the Italian version has been adapted for an older population (mean age 75 years, range 59-95). A recent study showed that the French GOHAI had good psychometric characteristics for adults with schizophrenia, and construct validity was supported by three factors. Despite confirmation of a three-factor structure in persons with schizophrenia, reporting the total score remains common practice, which implicitly assumes that the scale is one-dimensional.
Schizophrenia is a severe, disabling psychiatric disorder with either episodic or continuous evolution that can result in physical, psychological and social problems related to both the disease and the potential side effects of its treatment [7, 8]. Persons with schizophrenia (PWS) have excess mortality (their life expectancy is reduced by 20%) and excess morbidity. Among somatic comorbidities in PWS, poor oral health has been shown to contribute to the overall poor health of these patients. Generally, schizophrenia leads to disturbances in the progression of thought, errors in contextual analysis and errors of logic. PWS often do not recognise their health needs and delay seeking advice or treatment. In recent decades, a number of studies have reported poor oral health in PWS [12, 13], but few studies have explored self-perceived oral health in PWS. Oral health is a complex concept that each person defines differently according to their understanding of what a healthy mouth is, the type of symptoms already experienced, cultural values, past experience with the health care system, general health, psychosocial wellbeing, the impact of severe mental illness, and age or gender.
Searching for differential item functioning (DIF) in an OHrQOL questionnaire is a way to explore the equivalence of items across population sub-groups. DIF occurs when individuals with the same underlying (i.e., latent) level of health do not interpret a measure’s items in the same way. Awareness of this bias is of particular importance in clinical research, where scale scores are used to investigate differences related to gender, age or mental status and where derived scores must be comparable across groups. A lack of measurement equivalence at the item level may lead to spurious mean differences in the observed scores between these groups: one cannot be certain that a difference is meaningful, which makes mean score differences uninterpretable. It is thus fundamental to check whether items function similarly across different population groups in order to understand more clearly the psychological characteristics of OHrQOL. Consequently, in the presence of DIF, a latent variable model is well suited for data analysis because it can estimate the different values of the parameters related to each item for the different groups of individuals. Among the latent variable models, Rasch analysis is frequently used because of its simplicity: only one parameter for each answer category is required. Moreover, with the Rasch model, it is possible to estimate parameters without bias even in the presence of missing data.
In the field of OHrQOL, one study showed that the impact of DIF across gender on the overall score was minimal for the Child Perceptions Questionnaire (CPQ11-14). However, DIF was found in the CPQ11-14 for cultural factors and ethnicity [19, 20, 21]. For the Pediatric Quality of Life Inventory (PedsQL), the parent-proxy report was found to be inferior to the student self-report. As far as we know, DIF has never been investigated for the GOHAI questionnaire. Furthermore, the impact of mental illness on DIF for OHrQOL questionnaires has never been explored.
The aim of this study was to test the GOHAI items for DIF according to demographic characteristics (gender, age) and mental health status (schizophrenic disorders versus general population) using Rasch analysis.
Data were extracted from two previous cross-sectional studies.
The first was the French validation study of the GOHAI, conducted in 2000-2001 among a sample of 255 adults in the “Puy-de-Dôme” department (France). Disadvantaged adults were over-represented in this sample, and age ranged from 18 to 45 years. The sample included low-income earners benefiting from extended public health coverage.
The second study was the validation study of the GOHAI conducted in 2015 in a PWS sample. Data were collected within a multicentre cross-sectional descriptive study (BUCCODOR). A cluster sampling method was used to recruit 108 PWS aged 21-75 years in the “Côte d’Or” department (France). Schizophrenia was defined according to the standard classification of mental disorders (DSM-V).
For both studies, data were collected using personal interviews and dental examinations. Details of the methodology used for sampling and data collection have been described in previous publications [4, 23].
This study was approved by the ethics Committee for the Protection of Persons (CPP) number I of Eastern France (registration number: 2014-A00358-39) and by the Comité National Informatique et Liberté in Paris, which gave permission and set conditions for the collection of personal information for the study.
The GOHAI is an OHrQOL questionnaire that was initially validated for use in an elderly population in North America. The questionnaire is composed of 12 questions, nine negatively worded and three positively worded, in order to discourage respondent acquiescence. The 12 questions assess physical functions (eating, talking and swallowing) for items 1, 2, 3 and 4 and psychosocial impacts (self-esteem, social withdrawal and worries about oral health) for items 6, 7, 9, 10 and 11. Items 5, 8 and 12 explore pain and symptoms (use of drugs to relieve pain, discomfort) related to the presence of oral diseases. There are five response categories with an associated score (1=always, 2=often, 3=sometimes, 4=seldom, and 5=never). Scores from the positively worded questions are reversed to calculate the global score so that the direction of all responses is the same. The GOHAI score is computed by adding up the scores of the 12 responses so that the highest score (60) indicates excellent oral health.
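The scoring rule described above can be illustrated with a minimal Python sketch. It is not the authors' scoring program. It assumes that the three positively worded items are items 3, 5 and 7 (the text above does not list them at this point), and the function name and input format are hypothetical.

```python
# Minimal sketch of GOHAI total scoring, under stated assumptions.
# Raw responses use the questionnaire scale: 1=always ... 5=never.
# ASSUMPTION: items 3, 5 and 7 are the positively worded items.
POSITIVE_ITEMS = {3, 5, 7}


def gohai_score(responses):
    """Return the GOHAI total (12 to 60) from a dict {item number: raw score}."""
    if set(responses) != set(range(1, 13)):
        raise ValueError("responses must cover items 1 to 12 exactly")
    total = 0
    for item, raw in responses.items():
        if raw not in (1, 2, 3, 4, 5):
            raise ValueError(f"item {item}: raw score must be between 1 and 5")
        # Reverse positively worded items so that 5 is always the best answer.
        total += (6 - raw) if item in POSITIVE_ITEMS else raw
    return total


# Best possible profile: "never" (5) on negative items, "always" (1) on positive ones.
best = {i: (1 if i in POSITIVE_ITEMS else 5) for i in range(1, 13)}
print(gohai_score(best))
```

With this convention, a respondent who never reports a problem and always reports the positive states obtains the maximum score of 60.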
The sample size for Rasch analysis should be at least equal to the number of questions multiplied by the number of answer categories. Given that a 5-point Likert scale was used, the minimal number of participants needed was 60. Nevertheless, a sample of at least 200 individuals is generally used. The sample size (n = 363) obtained with the data from the two studies was thus considered sufficient.
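As a quick check of the arithmetic above, the following Python sketch applies the stated rule (number of questions multiplied by number of answer categories) to the GOHAI and compares the pooled sample with both thresholds. The function and variable names are illustrative only.

```python
def min_sample_size(n_items, n_categories):
    """Rule of thumb quoted in the text: items multiplied by answer categories."""
    return n_items * n_categories


required = min_sample_size(12, 5)   # 12 GOHAI items, 5-point scale
pooled_n = 255 + 108                # general population sample + PWS sample
usual_minimum = 200                 # sample size generally used, per the text

print(required, pooled_n, pooled_n >= max(required, usual_minimum))
```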
For the Rasch analysis, item scores were ordered so that low values represented the worst level of OHrQOL and high values represented satisfactory OHrQOL. For example, the answers to question 3 (“How often were you able to swallow comfortably?”) were coded from 0 (never) to 4 (always), while those to question 4 (“How often have your teeth or dentures prevented you from speaking the way you wanted?”) were coded from 0 (always) to 4 (never).
The Partial Credit Model (PCM) was applied to examine the GOHAI. Tests of fit of the PCM were performed using RUMM 2030 software. We used the chi-squared statistics provided by RUMM to test the fit of the PCM to the dataset.
The acceptable range for the infit and outfit mean-square (MnSq) statistics was set at 0.6 to 1.4. The fit of items outside this range was considered poor. Items were characterized by difficulty parameters δ, expressed in logits (the latent variable being standardized with a mean of 0 and an SD of 1), which were calculated for each item.
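To make the role of the difficulty parameters δ concrete, the sketch below computes the response-category probabilities of a single item under the standard formulation of the PCM. It is an illustration only and does not reproduce the RUMM 2030 estimation; the symbol theta (the person's latent level) and the function name are assumptions introduced for the example.

```python
import math


def pcm_probabilities(theta, deltas):
    """Category probabilities for one item under the Partial Credit Model.

    theta  : latent level of the person, in logits (hypothetical name)
    deltas : difficulty (threshold) parameters of the item, in logits;
             an item with k thresholds has k + 1 ordered answer categories
    """
    # Cumulative sums of (theta - delta_j); category 0 has an empty sum (0).
    logits = [0.0]
    for delta in deltas:
        logits.append(logits[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(logits)
    weights = [math.exp(value - top) for value in logits]
    norm = sum(weights)
    return [w / norm for w in weights]


# Example: an item with four thresholds (five answer categories coded 0 to 4).
probs = pcm_probabilities(theta=0.5, deltas=[-1.5, -0.5, 0.5, 1.5])
print([round(p, 3) for p in probs])
```

Plotting these probabilities against theta yields category curves of the kind used to inspect whether each answer category is the most likely one over some range of the latent trait.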
Dysfunctioning items (items whose difficulty parameters appeared in an unexpected order compared with the codes used for each answer category) were recoded by collapsing adjacent answer categories until each recoded item became a functioning item and supported the unidimensionality of the model.
In order to investigate DIF, these parameters were estimated in different groups of participants: general population (GP) sample versus PWS, males versus females, and younger (18-24) versus medium (25-45) or older subjects (45-74). The estimates of these parameters were then compared between groups, and only differences significant at the 5% level were retained.
Then, the latent variable was explained by the same variables as those used for the DIF analysis: gender, age, and health status of the individuals (GP or PWS). Interactions between age, gender and health status were tested. Only variables significant at the 5% level were retained.
The overall sample included 369 subjects; 65% were female, 30% were PWS and 75% were aged 25-45 years. The gender and age distributions of the population per group are presented in Table 1 and Fig. 1. The mean GOHAI scores for PWS and GP were 45.5 (SD=8.41) and 46.43 (SD=9.46), respectively.
The test of fit of the PCM was significant (p<0.001), indicating misfit, even when the statistic was adjusted to a smaller number of individuals in order to evaluate the impact of the excess power of this test due to the relatively large sample size (p=0.0293 with n=200). This could be explained by the presence of DIF and of dysfunctioning items. In particular, items 3, 4, 5, 8, 10 and 11 presented a poor fit of the PCM to the data (Table 2 without DIF).
In Fig. 2, the curves show the probabilities of response for item 1 as a function of the latent trait. The curves for all the items had the same overall appearance, shifted to the right or to the left depending on the item. These curves show that response category 3 is rarely used.
All items were considered dysfunctioning items, and answer categories coded 1 and 3 were collapsed with an adjacent answer category. Consequently, the answer categories “Never” and “Seldom” on the one hand and the answer categories “Sometimes” and “Often” on the other hand were collapsed. Note that for the three reversed items, the collapsed answer categories were “Always” and “Often” on the one hand, and “Sometimes” and “Seldom” on the other hand (Fig. 3). There were thus two difficulty parameters δ per item instead of four.
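The collapsing step can be written as a simple recoding of the 0 to 4 item codes used for the Rasch analysis. The sketch below is one reading of the description above, not the authors' code: it assumes that the 0 to 4 codes run from the worst to the best level for every item, so that the same mapping applies to negatively worded and reversed items. The names are hypothetical.

```python
# ASSUMPTION: codes 0..4 run from worst to best OHrQOL for every item.
# Negatively worded items: 0=always, 1=often, 2=sometimes, 3=seldom, 4=never.
# Reversed (positive) items: 0=never, 1=seldom, 2=sometimes, 3=often, 4=always.
# Codes 1 and 2 are merged, and codes 3 and 4 are merged.
COLLAPSE_MAP = {0: 0, 1: 1, 2: 1, 3: 2, 4: 2}


def collapse(code):
    """Recode a five-category item code (0 to 4) into three categories (0 to 2)."""
    return COLLAPSE_MAP[code]


print([collapse(code) for code in range(5)])
```

A three-category item has two thresholds, which is why each recoded item carries two difficulty parameters rather than four.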
After collapsing the answer categories, no item was dysfunctioning and the fit to the model was significantly improved even though it was still unsatisfactory (the p-value was still below 0.001).
DIF was searched for by exploring gender, age and health categories. Table 3 presents the items affected by DIF.
Among the 6 items (3, 4, 5, 8, 10 and 11) with an initial poor fit of the PCM, 4 presented DIF (3, 4, 5 and 10), while only one other item (7) presented DIF.
Accounting for DIF resulted in a significant improvement of the log-likelihood of the model (p=0.002), although the test of fit of the PCM remained significant (p<0.001). When the sample size was adjusted to evaluate the impact of the excess power of the test of fit due to the relatively large sample size, the test of fit was no longer significant (p=0.0697 with n=200) (Table 2 with DIF).
The covariates (health status, age and gender) were introduced into the model in order to explain the values of the latent variable. The variable “age” significantly explained the latent variable: the latent variable decreased with age (-0.40±0.08, p<0.001, for each 10-year increase in age). This decrease represented an effect size of 0.27, which can be described as a small to medium effect. Nevertheless, this effect explained only 2% of the variance of the latent trait, due to the heterogeneity of the latent variable between the individuals of the sample. The health status of individuals (GP versus PWS) and gender did not significantly explain differences in the values of the latent variable (p=0.41 and p=0.49, respectively).
The aim of the study was to explore DIF of the GOHAI items according to demographic characteristics (gender, age) and mental health status (schizophrenic disorders versus general population). This research has potential theoretical importance in increasing researchers’ understanding of the interplay between these parameters.
First we focused on one important prerequisite for such comparisons, measurement invariance.
The GOHAI initially failed to meet the basic requirements of the PCM. When the collapsing procedure was applied (producing a new three-level rating scale: 0 = never; 1 = sometimes; 2 = often/always), these problems were solved and the measurement quality improved. After this adjustment, unidimensionality was supported and the number of difficulty parameters decreased to two per item instead of four.
In accordance with our results, a previous analysis of the GOHAI rating scale by Franchignoni et al. showed that two (1 = seldom; 3 = often) of the five rating categories did not comply with the criteria for category functioning [5]. When these categories were collapsed, using the same procedure as ours, the probability of selecting one of the three revised rating categories became a clear function of the level of ability shown by the subject. The Rasch analysis demonstrated the substantial unidimensionality of the GOHAI and pointed out the opportunity to decrease the number of response categories to three rating levels instead of the original five.
Participants were possibly unable to clearly distinguish between the categories “Always” and “Often” or between the categories “Seldom” and “Never”. Another explanation could be that the evaluated symptom or dysfunction was either always present or did not exist (never). The response categories “Often” and “Seldom” would thus not be relevant if the symptom were constantly present or absent. For the non-constant situations, the central category could thus be sufficient to describe the frequency of the event.
Some of the GOHAI items performed differently depending on the subgroup. Poor oral health may have a serious impact on quality of life, everyday functioning, self-esteem and social inclusion, but the perception of OHrQOL seems to differ between PWS and GP. OHrQOL is a psychological concept, whereas symptoms are objective physical aspects. It is the impact of oral symptoms, rather than the symptom itself, that is important.
The first explanation could be a misunderstanding of the positively worded questions of the GOHAI (items 3 and 5) and a particular difficulty for people to grasp the underlying concept of question 7. This explanation is in accordance with the results of previous studies on the cultural adaptation of the English GOHAI. In the Chinese version of the GOHAI, question 3 was reworded negatively to improve understanding. For the French validation of the GOHAI with PWS, the lowest item-scale correlation coefficients were obtained for items 3, 5 and 7.
The second hypothesis is that participants, and particularly PWS, may have found it difficult to grasp the concept of OHrQOL, especially for those questions. There are psychological and cultural aspects in the evaluation of OHrQOL that determine the way objective symptoms are interpreted as giving rise to impacts on quality of life. For example, in PWS, the authors observed some paradoxical situations where patients with severe dental diseases reported good oral health. In PWS, the perception of oral health strongly depends on the extent of the dental disease [13, 14] and on the intensity of the symptoms. The extent of the dental disease among inpatients is also directly related to the intensity of the schizophrenia, the magnitude of negative symptoms associated with schizophrenia, and the length of hospitalization [30, 31]. Thus, PWS frequently do not identify or express their health needs, the side effects of treatments or their pain [10, 11]. In persons with disabilities, it has also been shown that differences in HRQOL reflect not only health status differences but also functional differences related to disability. Furthermore, the influence of functional differences on HRQOL scores may vary across different disability subgroups [30, 32]. Thus, the evaluation of OHrQOL may be complex in people with disability or in PWS. This is particularly true for questions relating to satisfaction with appearance, because psychological status determines vulnerability to low self-esteem and thus self-evaluation of oral health.
Although items with DIF across mental health status were identified, the impact of this DIF on the overall score could not be estimated. Further investigation should be conducted for a better understanding of the oral problems linked with the illness.
Scores for items 3 and 4 (ability to swallow and speak) and item 10 (feeling nervous or self-conscious) were lower in older participants, which is in accordance with the fact that the latent variable decreased significantly with age (-0.40±0.08, p<0.001) between the youngest individuals (18-25 years) and the oldest individuals. As in the study by Franchignoni et al., the items evaluating important oral functions were those showing high difficulties. In fact, the elderly more commonly experience difficulty in swallowing or speaking due to oral impairment. Communication difficulties strongly affect general health, social life and wellbeing, and swallowing difficulties may threaten the lives of the most vulnerable [1, 14]. The sequelae of oral diseases, such as tooth loss, are not reversible and thus accumulate over time even if prosthetic treatments are provided. The impact of oral disorders on quality of life thus increases with age for everyone, including institutionalized psychiatric patients, who are generally older patients [1, 30, 31].
In a study conducted in older adults, GOHAI scores were high and age was found to have no significant influence on the frequency and severity of the impact of oral diseases. One explanation is that older people adapt to poor health and consider their impaired dental status as normal. In this situation, the objective dental status is disconnected from the subjective perception of oral health and quality of life.
The gender-related DIF identified for item 5 showed that women were more likely than men to report discomfort when eating. Yau et al. showed a minimal impact of DIF across gender in children with the CPQ11-14 questionnaire. Lin et al. found no DIF items across gender in their evaluation of the Persian version of the Pediatric Quality of Life Inventory Oral Health Scale. Although DIF across gender was detected for one item, it was not possible to determine its impact in practice. Our results are consistent with previous studies and with the low impact of gender on DIF in HRQOL scales.
First, our study was limited to a sample of French participants. To extend the generalizability of our results, we encourage scholars in this area to examine our proposed model with different samples across different countries.
Second, our samples were heterogeneous. The GP sample was younger than the PWS sample. These differences may explain why men’s GOHAI scores were higher in the PWS population. It is also possible that some of the differences were due to DIF related to socioeconomic status.
This study revealed the presence of DIF in five of the 12 GOHAI items based on the analysis of data from two French populations: PWS and a general adult population. We showed that GOHAI scores were not comparable across sub-groups defined by health status, age and gender without accounting for DIF. In our study, the effect of age explained only 2% of the variance of the latent trait, but the decrease of the latent variable with age represented a non-negligible effect size and can influence mean GOHAI score comparisons. DIF analysis by gender and health status indicated that differences exist at the item and scale level, but we could not show differences in the values of the latent variable. Future studies should explore this approach with other OHrQOL assessment tools and in populations with mental illness.
The authors thank Philip Bastable for his help with English language correction.
Tubert-Jeannin S, Riordan PJ, Morel-Papernot A, Porcheray S, Saby-Collet S. Validation of an oral health quality of life index (GOHAI) in France. Community Dent Oral Epidemiol 2003;31:275-284.
Franchignoni M, Giordano A, Levrini L, Ferriero G, Franchignoni F. Rasch analysis of the geriatric oral health assessment index. Eur J Oral Sci 2010;118:278-283.
Denis F, Hamad M, Trojak B, et al. Psychometric characteristics of the “General Oral Health Assessment Index (GOHAI)” in a French representative sample of patients with schizophrenia. BMC Oral Health 2017;17:75.
Kasckow JW, Twamley E, Mulchahey JJ, et al. Health-related quality of well-being in chronically hospitalized patients with schizophrenia: comparison with matched outpatients. Psychiatry Res 2001;103:69-78.
Kisely S, Baghaie H, Lalloo R, Siskind D, Johnson NW. A systematic review and meta-analysis of the association between poor oral health and severe mental illness. Psychosom Med 2015;77:83-92.
Denis F. The oral health of patients in psychiatric institutions and related comorbidities. Soins Psychiatr 2014;290:40-44.
Tang LR, Zheng W, Zhu H, et al. Self-reported and interviewer-rated oral health in patients with schizophrenia, bipolar disorder, and major depressive disorder. Perspect Psychiatr Care 2016;52:4-11.
Millsap RE. Statistical approaches to measurement invariance. New York, NY: Routledge-Taylor & Francis Group; 2012.
Hamel JF, Sebille V, Le Neel T, Kubis G, Boyer FC, Hardouin JB. What are the appropriate methods for analyzing patient-reported outcomes in randomized trials when data are missing? Stat Methods Med Res 2015.
Yau DT, Wong MC, Lam KF, McGrath C. Evaluation of psychometric properties and differential item functioning of 8-item child perceptions questionnaires using item response theory. BMC Public Health 2015;15:792.
Traebert J, Page LAF, Thomson WM, Locker D. Differential item functioning related to ethnicity in an oral health-related quality of life measure. Int J Paediatr Dent 2010;20:435-441.
Traebert J, de Lacerda JT, Thomson WM, Page LF, Locker D. Differential item functioning in a Brazilian-Portuguese version of the child perceptions questionnaire (CPQ). Community Dent Oral Epidemiol 2010;38:129-135.
Aguilar-Diaz FDC, Page LAF, Thomson NM, Borges-Yanez SA. Differential item functioning of the Spanish version of the child perceptions questionnaire. J Investig Clin Dent 2013;4:34-38.
Denis F, Millot I, Abello N, et al. The protocol study, multicentre evaluation of oral health in persons with schizophrenia: a cross-sectional study. Mathews J Dent 2016;1:006.
Tabachnick BG, Fidell LS. Using multivariate statistics. New York: Pearson Education; 2001.
Bond TG, Fox CM. Rasch modeling applied: rating scale design. In: Applying the Rasch model: fundamental measurement in the human sciences. 2nd ed. Mahwah, NJ: Lawrence Erlbaum Associates; 2012:219-233.
Persson K, Axtelius B, Söderfeldt B, Östman M. Oral health-related quality of life and dental status in an outpatient psychiatric population: a multivariate approach. Int J Ment Health Nurs 2010;19:62-70.
Lin CY, Kumar S, Pakpour AH. Rasch analysis of the Persian version of PedsQL™ Oral Health Scale: further psychometric evaluation on item validity including differential item functioning. Health Promot Perspect 2016;6:145-151.
About the article
Published Online: 2017-10-28
Funding: This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
Conflicts of interest: The authors report no conflicts of interest.
Authors’ contributions: FD was the signatory investigator on the study. FD and JBH contributed to the concept and design of the study. All of the authors contributed to the interpretation of the data, revised the manuscript, and approved the final content of the manuscript.
Citation Information: Translational Neuroscience, Volume 8, Issue 1, Pages 139–146, ISSN (Online) 2081-6936, DOI: https://doi.org/10.1515/tnsci-2017-0020.
© 2017 Frederic Denis et al. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License (BY-NC-ND 3.0).