Stillbirth diagnosis and classification: comparison of ReCoDe and ICD-PM systems

Objectives: Identifying the cause of a stillbirth (SB) can be challenging because several different systems for classifying SB causes are in use. New SB classification systems continue to emerge in the scientific literature, which prevents uniform data collection and comparisons between populations from different geographical areas. For these reasons, this study compared two of the most widely used SB classifications, aiming to identify which of them is preferable. Methods: A total of 191 SBs were retrospectively classified by a panel of three experienced physicians using the ReCoDe and ICD-PM systems to evaluate which classification minimizes unclassified/unspecified cases. In addition, intra- and inter-rater agreements were calculated. Results: ReCoDe classified 23.6% of cases as unexplained, 14.1% as placental insufficiency, 12% as lethal congenital anomalies, 9.4% as infection, 7.3% as abruptio, and 7.3% as chorioamnionitis. ICD-PM classified 20.9% of cases as unspecified, 44% as antepartum hypoxia, 11.5% as congenital malformations, deformations, and chromosomal abnormalities, and 11.5% as infection. For ReCoDe, inter-rater agreement was 0.58; intra-rater agreements were 0.78 and 0.79. For ICD-PM, inter-rater agreement was 0.54; intra-rater agreements were 0.76 and 0.71. Conclusions: There is no significant difference between ReCoDe and ICD-PM in minimizing unexplained/unspecified cases. Inter- and intra-rater agreements were low for both systems, owing to their lack of specific guidelines to facilitate interpretation. Thus, the authors suggest corrective strategies: the implementation of specific guidelines and illustrative case reports to resolve interpretation issues.


Introduction
According to a recent report, A Neglected Tragedy: The Global Burden of Stillbirths, the first-ever stillbirth report by the UN Inter-Agency Group for Child Mortality Estimation (UN-IGME) [1], one stillbirth (SB) occurs every 16 s, which amounts to about two million SBs per year. These losses have negative social, psychological, economic, and medical consequences [1][2][3]. SBs can also lead to misunderstandings between parents and healthcare operators, resulting in medical malpractice claims [4]. For these reasons, SBs represent a significant burden for all societies because they can have an unpredictable negative impact on families [1][2][3][5][6][7][8][9].
The high number of SBs is a growing public health issue: "in 2000, the ratio of the number of stillbirths to the number of under-five deaths was 0.30; by 2019, it had increased to 0.38" [1]. According to the United Nations Inter-Agency Group for Child Mortality Estimation (2020), this can be due to "absence of or poor quality of care during pregnancy and birth; lack of investment in preventative interventions and the health workforce; inadequate social recognition of stillbirths as a burden on families; measurement challenges and major data gaps; absence of global and national leadership; and no established global targets" [1].
Knowledge of the cause of stillbirth is important for two reasons. First, it allows the most accurate information to be provided for the continued care of families; second, accurate causes are needed to inform healthcare prevention strategies. Thus, healthcare professionals (especially obstetricians, clinicians, and pathologists) should be able to correctly classify the causes of SBs. In clinical and pathology routines, identifying the cause of an SB can be a challenge because of the intrinsic complexity of fetal/maternal pathophysiology, and it can be difficult to reach a definite cause of death. This challenge is compounded by the several different classification systems of SB causes used by the world's medical professionals. New SB classification systems continue to emerge in the scientific literature, which prevents uniform data collection and comparisons between populations from different geographical areas [10][11][12][13][14][15][16][17][18].
In the light of the above, this study compared two of the most used SB classifications (ReCoDe and ICD-PM), aiming to identify which of them should be preferable. In order to reach this goal, unexplained/unspecified cases of each classification were compared. Intra and inter-rater agreements were compared and discussed.

Materials and methods
In this study, the authors adhered to the World Medical Association Declaration of Helsinki regarding ethical conduct of research.

Population and data analysis
The authors conducted a retrospective analysis of all cases of SB, defined by a gestational age of 22 weeks or more, which occurred in the gynecologic and obstetric hospital of Turin (Italy) from January 2015 to December 2019. The hospital (Sant'Anna) is a referral hospital for high-risk pregnancies, and all cases came from pregnancies managed there. During this period there were 34,417 deliveries (SB rate: 6.1 per 1,000). The study included twin pregnancies. Each case was recorded in an Excel sheet containing the following data: parents' medical history, date of delivery, maternal age, gestational age, fetal sex, birth weight, maternal laboratory tests, pregnancy/delivery details, and autopsy and histological (fetus, placenta, and umbilical cord) data. In all cases, fetal autopsy/histology and placental/umbilical cord gross and microscopic examinations were conducted. These data were also recorded in the Excel sheet. To allow comparison between the present study's results and data from the literature, a panel of three physicians experienced in SB diagnosis reviewed this information and classified all SBs with the ReCoDe and ICD-PM systems. The cases were then divided into two groups for both ReCoDe and ICD-PM data: group 1, unclassified/unspecified cases (ReCoDe Group I and ICD-PM A6/I7 categories) [10] (Figures 1 and 2); group 2, classified/specified cases [19].

Intra/inter-rater agreements
The second part of this study consisted of calculating intra- and inter-rater agreements. To this end, two experienced physicians (operator #1, OP#1, and operator #2, OP#2), different from those who had been part of the aforementioned expert panel, separately classified the SBs according to the ReCoDe and ICD-PM systems (Figures 1 and 2). Inter-rater agreement between the two OPs was calculated for both classifications using the weighted kappa coefficient (K) [20][21][22]. One month after the first scoring round, each operator was asked to rescore the SBs with the ReCoDe and ICD-PM classifications, considering the same available data. Intra-rater agreement (weighted kappa coefficient for >2 ordered categories, K) was calculated for OP#1 and OP#2 for both classification systems [20][21][22].
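As an illustration of the agreement statistic described above, the following is a minimal sketch of Cohen's weighted kappa for two sets of ratings of the same cases. It is not the authors' analysis code. The function name `weighted_kappa`, the choice between linear and quadratic weights, the ordering of the category labels, and the toy ratings are all assumptions made for the example; the study does not state which weighting scheme or software was used.

```python
def weighted_kappa(ratings_a, ratings_b, categories, weights="linear"):
    """Cohen's weighted kappa for two ratings of the same cases.

    categories: ordered list of all possible category labels (needs >= 2).
    weights: "linear" or "quadratic" disagreement weights. This choice is an
    assumption of the sketch; the source does not specify the scheme.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)

    # Observed joint proportions of (rating A, rating B) pairs.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1 / n

    # Marginal proportions for each rating round.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d ** 2

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed_dis = sum(disagreement(i, j) * observed[i][j] for i, j in cells)
    expected_dis = sum(disagreement(i, j) * row[i] * col[j] for i, j in cells)
    if expected_dis == 0:
        raise ValueError("kappa is undefined when expected disagreement is zero")
    return 1 - observed_dis / expected_dis


# Hypothetical toy ratings using ReCoDe group letters; the A..I order is
# only the list order assumed here for weighting, not a clinical ranking.
recode_groups = list("ABCDEFGHI")
op1 = ["C", "I", "A", "B", "I", "C", "F", "A"]
op2 = ["C", "I", "A", "C", "A", "C", "F", "A"]
print(round(weighted_kappa(op1, op2, recode_groups), 2))
```

The same function covers both uses in this section: passing the first-round ratings of OP#1 and OP#2 gives an inter-rater value, while passing one operator's first and second rounds gives an intra-rater value.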

Results
A summary of the most significant data is available in Tables 1-3. Data on 191 SBs were collected. Mean maternal age was 34.1 years (SD ± 5.5 years; median 34 years; mode 32 years). Mean gestational age was 204.2 days (SD ± 43.4 days; median 194 days; mode 154 days).

General considerations
The abovementioned results highlighted significant differences between the ReCoDe and ICD-PM classifications. These differences are evident when comparing similar categories of the two methods: (1) infections, 18 cases for ReCoDe and 22 for ICD-PM; (2) fetal growth restriction, 7 for ReCoDe and 8 for ICD-PM. As depicted in Figures 3 and 5, this happened because the ReCoDe classification includes Group B (umbilical cord), Group C (placenta), and Group F (mother). These categories are not provided by ICD-PM, in which umbilical cord, placental, and maternal pathologies are listed among the so-called maternal conditions. Thus, with the ReCoDe system some cases can be classified directly as caused by the above-mentioned conditions. By contrast, in ICD-PM a maternal condition is defined as "a condition that would reasonably be considered to be part of the pathway leading to perinatal death" but not the leading one [19]. This explains the aforementioned discrepancies between the two classifications for the infection and fetal growth restriction categories. Similar considerations apply to the only case that was assigned by ICD-PM to A3 (antepartum hypoxia) and by ReCoDe to A1.1 (lethal congenital anomaly) (see Figure 4). Antepartum hypoxia is not provided as a leading cause by the ReCoDe system, which led observers to interpret and classify this case differently.
As suggested by the scientific literature and by the abovementioned results, direct comparison between two or more different SB classifications is particularly difficult because of their intrinsic differences [10][11][12][13][14][15][16][17][18]. For this reason, the present manuscript focused on the evaluation of unexplained/unspecified cases and on inter- and intra-rater agreements, as follows.

Considerations on unexplained and unspecified cases
Recently, Wojcieszek et al. pointed out that the strength of SB classification systems should be based on consensus on the fundamental characteristics of such systems [16], highlighting the limits and strengths of several existing methods. One of the methods used to evaluate two or more SB classifications relies on comparing their limits and strengths, establishing which of them minimizes the number of unexplained/unspecified cases [12,13]. In the scientific literature, there are few indications about this subject, especially for the ICD-PM classification. There is a larger number of articles in which methods other than ICD-PM are compared. In 2009, Flenady et al. identified the following percentages of unexplained SBs by classification system: Amended-Aberdeen 44.3%; Extended Wigglesworth 50.2%; PSANZ-PDC 15.4%; ReCoDe 13.8%; Tulip 10.2%; CODAC 9.5% [23]. In 2016, Nappi et al. registered unexplained SBs in 14% of cases for ReCoDe, 16% for Galan-Roosen, and 18% for Tulip [24]. In 2008, Vergani et al. described unexplained SB in 14.3% of cases [18]. However, these three articles did not report intra- and/or inter-rater agreement.
In 2019, the study by Dapoto et al. yielded unexplained/unspecified causes in 16.7% of cases for ReCoDe and in 9.3% for ICD-PM [13]. They identified a clear difference between the two classifications, which they did not ascribe "to a better performance of the classification [ICD-PM], but simply of the lack of recognition of a primary vs. associated condition" [13]. In the present study, the panel of experienced physicians obtained different percentages, especially for ICD-PM.
The percentage of ReCoDe's unexplained cases (23.6%) in the present manuscript is slightly higher than those reported in other manuscripts [10,18,23]. This can be explained by differences in the populations from which the studies' samples were drawn and in the quality/quantity of information available for each SB [10,18,23].
The ICD-PM classification yielded 40/191 cases (20.9%) as unspecified. Compared with the manuscript by Dapoto et al. [13], this could at first sight appear to be a meaningful difference in percentages. However, their results appear to be affected by a different interpretation of what unspecified means in the ICD-PM classification. In the manuscript by Dapoto et al., the percentage of 9.3% referred to the cases classified as both A6/I7 and M5. For these authors, the cause of an SB can be defined as unspecified only when both fetal and maternal data are unremarkable [13]. This interpretation seems to contrast with the World Health Organization (WHO) application of ICD-10 to deaths during the perinatal period [19]. In 2016 the WHO published its indications on how the ICD-PM classification should be used, specifying that the A1-A6 (antepartum) and I1-I7 (intrapartum) categories express the fetal "disease or condition that initiated the morbid chain of events leading to death" [19]. According to the WHO, the cause expressed by A and I categories "is the single identified cause of death and it should as specific as possible" [19]. A maternal condition, expressed with the M1-M5 categories, is defined as "a condition that would reasonably be considered to be part of the pathway leading to perinatal death" [19]. In the light of the WHO indications, unspecified SBs in the ICD-PM system should be identified only by the A6 or I7 categories. Maternal conditions (M1-M5) should not influence this definition [19]. In contrast with the WHO statements, Dapoto et al. defined as unspecified only the cases in which fetal and maternal conditions were both unremarkable [13]. Their unspecified cases (9.3%) therefore come from a different application of ICD-PM. If we consider only the cases in which they identified A6 or I7 categories (as prescribed by the WHO), the percentage of their unspecified cases (22%) is similar to the one yielded by the present study (20.9%).
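The effect of the two definitions of "unspecified" discussed above can be made concrete with a small sketch. The records below are hypothetical toy data, not cases from this study or from Dapoto et al., and the function name `share_unspecified` and its `require_m5` flag are illustrative assumptions. The sketch only shows that requiring M5 in addition to A6/I7 can never increase, and will usually decrease, the unspecified share.

```python
# Hypothetical toy records: (fetal ICD-PM category, maternal category).
cases = [
    ("A6", "M5"),  # fetal cause unspecified, no maternal condition
    ("A6", "M1"),  # fetal cause unspecified, a maternal condition recorded
    ("A3", "M1"),
    ("I7", "M4"),  # intrapartum, fetal cause unspecified
    ("A1", "M5"),
]

UNSPECIFIED_FETAL = {"A6", "I7"}


def share_unspecified(cases, require_m5=False):
    """Fraction of cases counted as unspecified.

    require_m5=False: the fetal category alone decides (A6 or I7).
    require_m5=True: the maternal category must also be M5.
    """
    hits = [
        fetal in UNSPECIFIED_FETAL and (maternal == "M5" or not require_m5)
        for fetal, maternal in cases
    ]
    return sum(hits) / len(cases)


print(share_unspecified(cases))                   # fetal category only
print(share_unspecified(cases, require_m5=True))  # A6/I7 and M5 together

# The percentage reported in the text for this study: 40 of 191 cases.
print(round(100 * 40 / 191, 1))  # 20.9
```

In the toy data, three of five cases are unspecified when only the fetal category is considered, but only one of five when M5 is also required, which mirrors the direction of the gap between 22% and 9.3% described in the text.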
In 2018, Maducolil et al. reported 11.6% of unexplained cases for ReCoDe and 7.5% of unspecified cases for ICD-PM [25]. Their manuscript does not specify whether SBs were classified as unspecified when A6 or I7 categories were assigned (as suggested by the WHO) or only when fetal and maternal conditions were both unremarkable (A6/I7 and M5) [25]. For this reason, it is difficult to compare the present results with those of Maducolil et al. Moreover, Maducolil et al. studied a different SB sample from that of the present study: only 9.1% of their cases had autopsy information, and they considered SBs with a gestational age ≥24 weeks [25]. In the present manuscript, the cases had gestational ages ≥22 weeks, and all cases underwent full autopsy and microscopic examination of fetus, placenta, and umbilical cord.
For the ICD-PM classification, different percentages of unspecified SBs are available in the scientific literature: they vary from 4.14% to 37.89% [11,13,26]. As already stated, this variability can be attributed to differences in the populations from which the studies' samples were drawn and in the quality/quantity of information available for each SB. Even though several authors have made meaningful observations about unspecified/unexplained SB cases, it is intrinsically difficult to compare the results and considerations of different studies. These differences constitute a significant limitation in this field because they limit the applicability of each study's considerations to other populations and the comparison of results. In addition, it is important to note that current SB classification systems do not provide twin-specific categories. This can cause a loss of information regarding the cause of death, affecting the number of unexplained/unspecified SBs in studies that, like the present one, include twin pregnancies [27].

Considerations on intra/inter-rater agreements
In the absence of common methodologies to study differences in unexplained/unspecified cases between classifications, the determination of intra- and inter-rater agreements can be a fundamental requirement for understanding which method can be used more consistently. This was highlighted by the Delphi paper by Wojcieszek et al., who reported high inter- and intra-rater reliability among the functional characteristics that a global classification system should have [16].
In the scientific literature, the only available data about the two classifications used in the present manuscript come from the article by Flenady et al., in which ReCoDe inter-rater agreement is moderate (K 0.51) [20][21][22][23]. ReCoDe intra-rater agreement and ICD-PM intra- and inter-rater agreements are not reported by other authors. For the ReCoDe system, the present study yielded an inter-rater agreement of 0.58, which supports the same considerations as the abovementioned article.
For the ICD-PM classification, the results of the present study are the only ones available in the scientific literature. The use of ICD-PM yielded K 0.54, K 0.76, and K 0.71 respectively for inter-rater agreement, OP#1 intra-rater agreement, and OP#2 intra-rater agreement. These data are similar to those obtained using the ReCoDe system. Since it appears that ReCoDe does not perform better than ICD-PM (and vice versa), intra- and inter-rater agreement results do not seem to be meaningful tools for justifying the use of one of these methods over the other.
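To read the K values reported in this study side by side, the sketch below maps each one to a descriptive label using the Landis and Koch bands, one commonly used convention. This is an assumption made for illustration: the scale adopted in the study's references [20][21][22] may differ, and other conventions use different cut-offs and labels. The function name `landis_koch_label` is hypothetical.

```python
def landis_koch_label(kappa):
    """Descriptive label for a kappa value using the Landis and Koch bands.

    The choice of this scale is an assumption of the sketch; it is not
    stated in the source which interpretation scale was applied.
    """
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label


# K values as reported in this study.
reported = {
    "ReCoDe inter-rater": 0.58,
    "ReCoDe intra-rater (first operator)": 0.78,
    "ReCoDe intra-rater (second operator)": 0.79,
    "ICD-PM inter-rater": 0.54,
    "ICD-PM intra-rater (first operator)": 0.76,
    "ICD-PM intra-rater (second operator)": 0.71,
}
for name, k in reported.items():
    print(f"{name}: K = {k:.2f} -> {landis_koch_label(k)}")
```

Under this particular scale, both systems fall in the same band for inter-rater agreement and in the same band for intra-rater agreement, which is consistent with the observation that neither system clearly outperforms the other.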

Causes of poor agreement and corrective proposals
In the scientific literature, some authors reported inter-rater agreements for major categories evaluated by different SB classifications [23]. Their agreements ranged from excellent to good (high variability) [28][29][30][31][32]. The study of Flenady et al. identified suboptimal K values and suggested the following reason: "the majority of classification systems failed to provide sufficient instructions on use" [23]. According to the authors of the present manuscript, the absence of specific instructions is the main cause of low statistical agreement. This especially affects inter-rater agreement, which is calculated between two different observers who, because of this absence, have no specific indications to guide their activity. Intra-rater agreement yielded higher values of K because the two observations were performed by one observer, who had the same knowledge of how to apply the two classifications on both occasions. For these reasons, the authors suggest that all classifications should have specific guidelines to reduce errors caused by personal interpretations. The guidelines should also provide a legend for all terms and definitions used.
Poor agreement can also be caused by the lack of practical indications. Indeed, SB classification methods do not systematically provide example cases that can facilitate understanding. According to the authors of this manuscript, all classifications should include an illustrative case report for each main group/category. This could help resolve interpretation issues.
Moreover, in each center that deals with SB diagnosis, all healthcare operators should be continuously trained to increase their knowledge of SB and to align diagnoses. It could also be useful to schedule weekly meetings in which all operators review some cases blindly extracted from the internal database. This approach could improve inter- and intra-rater reliability.
The scientific literature describes that "a recent systematic review of perinatal death classification systems reported that 81 new or modified classification systems were used between 2009 and 2014 across 40 countries" [16]. This trend has not changed in recent years. For this reason, consensus studies, such as the one proposed by Wojcieszek et al., should be undertaken in order to make the approach to SB diagnosis stricter, reducing the diffusion of classifications that have not undergone extensive scientific review.

Limitations and future perspectives
Limitations of the present study are related to its retrospective nature. In addition, the results reflect the specific characteristics of the population from which the sample was drawn. This can limit the applicability of the study's considerations to other populations and the comparison of results. These limitations suggest the need for a uniform approach to this field. International agencies and/or organizations should propose and organize supranational studies in order to clarify which specific SB classification systems should be used and when. This approach could resolve the numerous contradictions among the different articles available in the scientific literature.

Conclusions
The present study indicates that there is no significant difference between the ReCoDe and ICD-PM classifications in minimizing unexplained/unspecified cases. Comparison of these data with other studies was difficult because there are few studies in which ReCoDe and ICD-PM were compared, the populations from which the studies' samples were drawn differ, and the quality/quantity of information available for each SB differs from study to study.
Inter- and intra-rater agreements were low for both ReCoDe and ICD-PM. For this reason, the authors proposed some suggestions, such as the implementation of specific guidelines and illustrative case reports, to resolve interpretation issues.