Concordance between clinician- and 2016 criteria-based diagnoses of fibromyalgia

Objectives: The Fibromyalgia Survey Diagnostic Criteria 2016 (FSD-2016 criteria) were recently recommended for both clinical and research purposes. The present study aims to examine whether there is concordance between clinician-based and FSD-2016 criteria-based diagnoses of FM, and secondly, to examine how illness severity and physical function relate to the criteria-based diagnosis among patients referred to a rheumatism hospital.
Methods: Participants with a clinician-based diagnosis of FM were included consecutively when referred to a patient education programme for patients with FM. Illness severity was assessed with the Fibromyalgia Survey Questionnaire (FSQ). Based on the FSQ, the fulfilment of the FSD-2016 criteria was evaluated. Physical function was assessed using the Fibromyalgia Impact Questionnaire (FIQ) function scale and self-reported employment status.
Results: The sample included 130 patients (84% women) from 20 to 66 years of age. Eighty-nine per cent met the FSD-2016 criteria, and 44% of the patients were fully or partially employed. Great variability in illness severity was seen irrespective of employment status. There was an association between illness severity and physical function (r=0.4, p<0.001). For 95% of the patients, the FSQ illness severity scores were classified as severe or very severe, and even for those not fulfilling the diagnostic criteria the scores were moderate or severe.
Conclusions: There was relatively high agreement between clinician- and criteria-based diagnoses. The illness severity overlapped irrespective of employment status and fulfilment of the FSD-2016 criteria.


Introduction
Fibromyalgia (FM) is a contested widespread musculoskeletal pain condition that cannot be confirmed by laboratory or radiological assessments. Patients experience persistent pain with an unpredictable fluctuating intensity [1], and they report multiple other problems, e.g. pronounced fatigue, sleep and concentration problems, depression, headache, irritable bowel, and impaired functioning [2]. FM is explained by multiple interacting mechanisms such as hypersensitivity of the central nervous system, deficits in endogenous pain inhibition, alterations in the neuroendocrine system, autonomic nervous system, immune system and stress regulation mechanisms, and additionally, by genetic vulnerability and psychological mechanisms [3][4][5]. There is no known curative treatment for FM, but the European League Against Rheumatism highlights the importance of an early diagnosis in order to start treatment tailored to the individual's needs and illness severity [6].
A challenge in the diagnostic process is that the symptoms of FM mimic those of a number of other diseases, for example within rheumatology, neurology and psychiatry [4]. Accordingly, patients are often sent to various medical specialists for diagnostics [7], and it may take years to arrive at an FM diagnosis [7]. Meanwhile, patients may worry and fear the worst, and additionally, health professionals and others may accuse patients of exaggerating bodily sensations [8]. For patients, it can be a relief to get a diagnosis confirming that they do not have a progressive or fatal disease [8], and patients may find it better to have an FM diagnosis than to have no diagnosis [9].
Since the 1970s, several diagnostic criteria sets have been published [10]. Until recently, the most influential has been the American College of Rheumatology criteria from 1990 (ACR-1990 criteria) [11]. According to these criteria, pain should have persisted for at least three months and be generalized, i.e. present axially and above and below the waistline on both the right and left sides. Additionally, at least 11 of 18 defined points should generate pronounced tenderness from thumb palpation pressure of 4 kg/cm². The ACR-1990 criteria have been applied for research purposes worldwide. However, physicians in clinical practice, particularly non-rheumatologists, have often either ignored the criteria or performed the tender point examination incorrectly [12]. Thus, in the 2000s, new diagnostic criteria have been developed that eliminate the tender point assessment in order to make the criteria reliable and applicable for both clinical and research purposes, and thereby ensure that research generates clinically relevant knowledge.
As in the ACR-1990 criteria, the preliminary ACR-2010 criteria required that pain should have persisted for at least three months. The ACR-2010 criteria replaced the requirements of generalized pain and tender point count with an illness severity scale, the Fibromyalgia Survey Questionnaire (FSQ), based on the number of pain sites and the severity of other symptoms related to FM [13]. In 2011, the ACR-2010 criteria were revised to the modified 2010/2011 criteria so that they could be applicable as self-reports in epidemiological studies [14]. However, the authors later demonstrated that in a rheumatology clinic, clinician-based diagnoses failed to identify about 50% of those fulfilling the modified 2010/2011 criteria-based diagnosis [15]. In 2016, a new revision was launched, the Fibromyalgia Survey Diagnostic-2016 criteria (FSD-2016 criteria). Here, the criterion of generalized pain was added to the modified 2010/2011 criteria, defined as the presence of pain in at least four of the five body regions defined earlier in the ACR-1990 criteria, and additionally, the FSD-2016 criteria do not rule out concomitant diagnoses [16]. Whereas the ACR-1990 criteria emphasised allodynia, the FSD-2016 criteria seem to focus more on central pain perceptions and distress [17].
In 2019, another criteria set was launched by the American Pain Association [4], the AAPT-2019 criteria. In addition to pain, the occurrence of a moderate to severe degree of fatigue or sleep problems is required, and generalized pain is defined as pain in at least six out of nine defined pain sites. The AAPT-2019 criteria lack an illness severity scale, are dichotomous (an FM diagnosis is either fulfilled or not), and have not been translated or validated in a Norwegian setting. However, the FSD-2016 criteria and the FSQ were recently translated and found valid in a Norwegian context [18]. Still, it remains to be examined whether these criteria match clinicians' judgement of diagnosis. Our purpose was to examine whether there is concordance between the FM diagnosis determined by the FSD-2016 criteria and clinicians' judgements of FM diagnosis, and secondly, to examine how illness severity and physical functioning relate to the criteria-based diagnosis.

Design and ethics
A cross-sectional study design was applied. Participants were included in the study during a six-month period in 2018. They were recruited consecutively as they were referred by physicians to a patient education programme for patients with FM at the Rehabilitation Department at the Hospital for Rheumatic Diseases in Lillehammer, Norway. Most of the patients were referred by physicians in primary health care while about one third were referred by medical specialists at the outpatient hospital clinic. The Norwegian Social Science Data Service approved the study (no. 2018/57956/3/EPA). The study complies with the Helsinki Declaration. The patients received written and oral information about the purpose and content of the study and the voluntary nature of participation. All patients accepted the invitation and signed a consent form.

Assessments
Patient characteristics: The patients filled in a structured questionnaire about personal characteristics in terms of age, gender, education, and occupational status, as well as information about illness duration, whether and when an FM diagnosis was set, and their use of health services in relation to FM.
Evaluation of criteria-based diagnosis: The Fibromyalgia Survey Diagnostic Criteria 2016 (FSD-2016 criteria) are evaluated based on the FSQ, which yields a Widespread Pain Index (WPI) and a Symptom Severity Scale (SSS) score. To fulfil the fibromyalgia criteria, the pain must have lasted for at least three months, with either WPI≥7 and SSS≥5 or, alternatively, WPI 4-6 and SSS≥9, and additionally, pain must be localised in at least four of five body regions: axial, right/left upper and lower limbs [16].
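The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not the scoring procedure used in the study: the names FsqResponse and meets_fsd_2016 are hypothetical, and the valid score ranges (WPI 0-19, SSS 0-12) are taken from the published FSQ rather than from the text above.

```python
from dataclasses import dataclass


@dataclass
class FsqResponse:
    """Hypothetical container for the FSQ-derived values needed by the rule."""
    wpi: int                     # Widespread Pain Index (assumed range 0-19)
    sss: int                     # Symptom Severity Scale (assumed range 0-12)
    painful_regions: int         # number of the five body regions with pain
    pain_duration_months: float  # duration of pain in months


def meets_fsd_2016(r: FsqResponse) -> bool:
    """Return True if the response satisfies the rule as described in the text."""
    if not (0 <= r.wpi <= 19 and 0 <= r.sss <= 12 and 0 <= r.painful_regions <= 5):
        raise ValueError("score outside the assumed FSQ ranges")
    # Either threshold combination is sufficient.
    severity_ok = (r.wpi >= 7 and r.sss >= 5) or (4 <= r.wpi <= 6 and r.sss >= 9)
    # Generalized pain: at least four of five body regions.
    generalized = r.painful_regions >= 4
    # Pain must have lasted at least three months.
    chronic = r.pain_duration_months >= 3
    return severity_ok and generalized and chronic
```

For example, a hypothetical respondent with WPI 8, SSS 6, pain in four regions and twelve months of pain would satisfy the rule, whereas the same respondent with pain in only three regions would not.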
Problems in performing daily activities: The Fibromyalgia Impact Questionnaire (FIQ) function scale assesses problems in performing everyday life activities [21]. In this study, we applied the FIQ function scale including questions about the ability to perform 10 different daily life activities (0=never, 3=always). The scores were summed, divided by the number of answered items, and multiplied by 3.33, which gives a score from 0 (best) to 10 (worst). The FIQ function scale has demonstrated moderate to good validity and reliability [22].
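The scoring of the FIQ function scale described above can be illustrated as follows. This is a sketch under stated assumptions: the function name fiq_function_score is hypothetical, and unanswered items are represented as None, which the text does not specify.

```python
from typing import Optional, Sequence


def fiq_function_score(items: Sequence[Optional[int]]) -> float:
    """Score the FIQ function scale as described in the text.

    Each answered item is scored 0-3; None marks an unanswered item
    (an assumption made for this sketch).
    """
    answered = [x for x in items if x is not None]
    if not answered:
        raise ValueError("no answered items")
    if any(x not in (0, 1, 2, 3) for x in answered):
        raise ValueError("item scores must be 0, 1, 2 or 3")
    # Sum of item scores, divided by the number of answered items, times 3.33.
    return sum(answered) / len(answered) * 3.33
```

Dividing by the number of answered items, rather than by ten, keeps the score on the 0 to 10 scale when a respondent skips an activity that does not apply.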

Statistical analysis
SPSS version 26 was used for data analyses. Ordinal data are provided as numbers and percentages, and continuous data as means and standard deviations (SD). Both parametric and non-parametric tests were applied to analyse the correlations between variables and group differences. Two-sided p-values below 0.05 were considered statistically significant.

Patient characteristics and use of health services
All invited patients agreed to participate (n=130). Table 1 displays patients' characteristics and use of health services. Of the patients, 84% were women, and 44% were fully or partially employed. Fifty-nine per cent of the participants were diagnosed within the last year. In the total sample, the time since diagnosis ranged from 1 month to 33 years. The patients had been examined by several medical specialists, had undergone various specialised radiological assessments, and had been treated by multiple health professionals.

Clinically set diagnosis and concordance to criteria-based diagnosis
One hundred and twenty-four patients (95%) out of 130 reported having a clinician-set FM diagnosis, and six patients reported being under diagnostic investigation. All patients in the sample reported duration of pain for more than three months, and all but one had axial pain. One hundred and sixteen patients (89%), including the six patients under diagnostic investigation, met the FSD-2016 criteria. Of the 14 patients who did not fulfil the criteria, 10 reported pain in only three body regions, while another four did not reach

Fibromyalgia severity and functional ability in relation to criteria-based diagnosis
Table 2 shows the FSQ scores and the FIQ physical function for the total sample. The FSQ total (FS-score) varied from mild to very severe: according to the classification of illness severity, 1% of the patients had mild, 4% moderate, 23% severe, and 72% very severe illness. There was no difference in illness severity (FS-score) or FIQ physical function between those working full-time/part-time and those not working (full sickness/disability benefit, unemployed), p>0.05. A detailed description of the illness severity measured by the FS-score showed a large overlap between patients with different employment status (Figure 1). Figure 2 shows the distribution of the FS-score and the FIQ function score in relation to fulfilment of the diagnostic criteria. For the whole sample, a statistically significant correlation was found between the FS-score and the FIQ function score (r=0.4, p<0.001). Those who did not meet the FSD-2016 criteria had lower illness severity than those meeting the criteria, 13.6 (1.5) vs. 23.0 (3.8), p<0.001, and better physical functioning, 3.6 (1.5) vs. 4.9 (2.3), p=0.04, as illustrated in Figure 2, but they still had moderate or severe FM.
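The severity classes reported for the FS-score can be illustrated with a short sketch. The cut-offs used here are an assumption: they follow commonly cited severity categories for the FS-score (the sum of WPI and SSS, range 0-31) and are not stated in this text. The function name fs_severity_category is hypothetical.

```python
def fs_severity_category(fs_score: int) -> str:
    """Map an FS-score (WPI + SSS, assumed range 0-31) to a severity class.

    The cut-offs are assumed from commonly cited categories and are not
    given in the text: 0-3 none, 4-7 mild, 8-11 moderate, 12-19 severe,
    20-31 very severe.
    """
    if not 0 <= fs_score <= 31:
        raise ValueError("FS-score must be between 0 and 31")
    if fs_score <= 3:
        return "none"
    if fs_score <= 7:
        return "mild"
    if fs_score <= 11:
        return "moderate"
    if fs_score <= 19:
        return "severe"
    return "very severe"
```

Under these assumed cut-offs, the mean FS-score of 23.0 among those meeting the criteria falls in the very severe class, and the mean of 13.6 among those not meeting them falls in the severe class.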

Discussion
Eighty-nine per cent of the patients who were diagnosed clinically or were under clinical diagnostic evaluation for FM fulfilled the FSD-2016 criteria. In the whole sample, patients had consulted several medical specialists and tried out therapies provided by various health professionals. There was a large overlap in illness severity between groups of different employment status. In the present study, the concordance between the criteria- and clinician-based diagnoses was rather good. Multiple physicians had set the clinical diagnosis of FM, and we do not know how they had arrived at their decisions. It is likely that the diagnostic process and judgements among the numerous primary health physicians and among the medical specialists at the hospital varied substantially. Nevertheless, their clinical diagnosis accorded to a great degree with the criteria-based diagnosis. One explanation can be that the FSD-2016 criteria capture a shared opinion among Norwegian physicians that multi-sited pain together with multiple other symptoms are hallmarks of FM. It is unlikely, however, that many of the clinicians applied the Norwegian version of the FSD-2016 criteria as they were not yet implemented. Furthermore, the recruitment of participants to the Norwegian criteria validation study [18] and to our study was performed at the same time in two different geographical regions in Norway. Thus, we think that the clinician-based diagnosis largely reflects clinical experiential judgement with or without support from the ACR-1990 or the modified 2010/2011 criteria.
Another Norwegian study examined the fulfilment of the FSD-2016 criteria among 33 Norwegian patients referred to a specialised pain clinic. They were included if they had experienced pain for at least three months and met the requirement of generalized pain in at least four out of five body regions [23]. Among those selected patients, 82% also fulfilled the illness severity required by the FSD-2016 criteria. In contrast to that study, we included all those previously diagnosed by clinicians and referred to an FM treatment programme. Even though both studies found high fulfilment of the FSD-2016 criteria, the inclusion criteria differed: Tschudi-Madsen et al. [23] included patients who fulfilled the generalized pain criterion of the FSD-2016 criteria, whereas we included patients who had previously been diagnosed clinically. Salaffi and coworkers [24] examined 732 Italian patients referred by primary health physicians to diagnostic evaluation in a rheumatology clinic based on a history of chronic widespread pain. The rheumatologists found that 405 patients had FM. The rheumatologists' clinical diagnosis was set as the criterion for analysis, and sensitivity and specificity were found to be high for the modified 2010/2011 criteria, somewhat less so for the FSD-2016 criteria, and poorest for the AAPT-2019 criteria. In addition, our results showed an even better agreement between clinician- and criteria-based diagnoses if we excluded the criterion of generalized pain. Ten patients fulfilled the criteria of multisite pain (WPI score), but did not fulfil the FSD-2016 criteria as they had pain in only three body regions rather than the required minimum of four. For their part, Wolfe et al. [15] identified 121 patients meeting the modified 2010/2011 criteria among a sample of 497 patients in a university rheumatology clinic, and there was 79% agreement between clinician- and criteria-based diagnoses.
However, the rheumatologists also failed to identify 49.6% of criteria-positive patients and incorrectly included 11.4% of criteria-negative patients. Looking across these studies, the concordance between clinician- and criteria-based diagnoses was rather high in all studies, but somewhat higher in our study than in the other studies. It is plausible, however, that the high agreement between clinician- and criteria-based diagnoses in our study arose because our patients were not sent to a diagnostic evaluation, but to participate in a multidisciplinary patient education programme. The relatively high degree of measured illness severity suggests that the diagnosis may have been unmistakable for the clinicians and that the agreement might have been different if patients were less severely afflicted. It is promising, though, that the diagnosis had mostly been set by general practitioners rather than specialists in rheumatology, which indicates that the criteria are applicable for non-rheumatologists. Taken together, the studies suggest that the use of the FSD-2016 criteria for research purposes includes patients evaluated by physicians to have FM. However, our study can neither confirm nor refute the suggestion of Arnold and Clauw [25] that many patients may be underdiagnosed and undertreated. The high number of earlier treatments for FM with little success in our study, however, suggests that patients are not undertreated, but rather inadequately treated. Accordingly, we agree with Clauw's [26] statement that we have to stop the 'FM diagnostic criteria war' and find out how to better treat the patients. Previously, an almost linear association between the number of pain sites and the occurrence of a number of non-musculoskeletal symptoms has been demonstrated [27]. In our study, the illness severity assessed by the FSQ was classified as severe or very severe for most patients, and none of the patients were in the range of no severity.
Even those not fulfilling the FSD-2016 criteria had FS-scores classified as either moderate or severe. However, most of the symptoms evaluated by the FSQ are not exclusive to FM; they are rather common in many chronic non-malignant pain conditions. In our sample, most of those not meeting the FSD-2016 criteria had multiple painful sites, but in only three of the required minimum of four body regions. If there is multisite pain in three instead of four body regions together with moderate or severe FS-scores, we suggest that clinicians should consider the possibility that their patient might benefit from treatments given to patients with FM [28,29].
In line with the findings of Choy et al. [7], the patients in the present study had also consulted several medical specialists and many had undergone several advanced radiological assessments. It may be necessary to perform such assessments for the purpose of differential diagnosis or for identifying comorbidities, but it often takes time. In their survey, Choy et al. [7] found that a diagnostic journey took years. Unfortunately, we have no data about the time it took before a decision about diagnosis was taken. However, the reported use of health services in our study demonstrates another challenge, namely that both before and after diagnosis, patients are sent to several health professionals in order to try out different treatments. This is in line with the findings of a meta-synthesis that first described a time-consuming diagnostic journey which, after diagnosis, was replaced by another long journey to find appropriate help [8]. Thus, it seems important to identify patients earlier and develop better ways to help individual patients. Patients do not necessarily experience less uncertainty when their symptoms are classified and named [30], and without appropriate explanations and treatments, the doctor-patient interactions can be challenged [31,32].
It is a strength of the study that we succeeded in recruiting all the patients referred to a patient education programme for patients with FM. However, the sample size is not very large, the study was conducted at only one hospital, and most of the patients had very severe illness. The patients were referred by multiple physicians from a wide geographic region, though. The clinicians' judgements and referral practices are essential in the present study. It is thus a shortcoming that we do not know anything about the physicians who set the diagnosis and how they arrived at it. On the other hand, it is a strength that the physicians were not aware that we would check their clinical diagnoses against diagnostic criteria. Accordingly, the clinician-based diagnoses reflect a 'real-life' situation, and the inclusion of many different physicians is a strength of this study. Nevertheless, the high agreement between clinician- and criteria-based diagnoses cannot be generalised to all patients diagnosed by physicians, as 72% in our sample had very severe illness. Thus, future studies should also examine the agreement among patients with less severe illness. Even so, it is likely that using the FSD-2016 criteria for research purposes will provide important knowledge for clinical use concerning those most afflicted in clinical practice.

Conclusions
The findings of our study showed relatively good concordance between clinician- and criteria-based diagnoses of FM among patients mostly classified with very severe illness. Those not fulfilling the FM criteria also had moderate or severe illness scores. The illness severity varied and overlapped in groups with different employment status. The patients had often consulted several medical specialists and been treated by therapists with various professional backgrounds.