The influence of sampling time on indirect reference limits, decision limits, and the estimation of biological variation of random plasma glucose concentrations

Abstract
Objectives: Plasma glucose concentrations exhibit a pronounced daytime-dependent variation. The oscillations responsible for this are currently not considered in the determination of reference limits (RL) and decision limits.
Methods: We characterized the daily variation inherent in large-scale laboratory data from two different university hospitals (site 1 n=513,682, site 2 n=204,001). Continuous and distinct RL for daytime and night were estimated. Diurnal characteristics of glucose concentrations were further investigated by quantile regression analyses introducing age and cosinor-functions as predictors in the model.
Results: Diurnal variations expressed as amplitude/Midline Estimating Statistic of Rhythm (MESOR) ratio, averaged 7.7% (range 5.9–9.3%). The amplitude of glucose levels decreased with increasing concentrations. Between 06:00 and 10:00 h an average decrease of 4% has to be considered. Nocturnal glucose samples accounted for only 5% of the total amount but contributed to 19.5% of all findings over 11.1 mmol/L. Partitioning of RL between day and night is merely justified for the upper reference limit. The nocturnal upper RLs for both genders differed from those obtained during the day by 11.0 and 10.6% at site 1 and by 7.6 and 7.5% at site 2.
Conclusions: We conclude that indirect approaches to estimate upper RL of random plasma glucose concentrations require stratification concerning the time of sample collection.


Introduction
The biological variation in plasma glucose levels results from the complex interaction of genetically anchored metabolic processes that are subject to strict hormone-controlled regulation. From the evolutionary point of view, humans were enabled to sleep through the night as a prerequisite for brain consolidation [1]. Physiological stimuli set by the environment function as synchronizers, so-called Zeitgebers (literally "time givers"), but can be modified by pathophysiological factors [2]. As a result, plasma glucose levels exhibit pronounced daytime-dependent fluctuations that are not solely due to nutritional intake [3,4].
The time-dependence of glucose levels affects laboratory diagnostics in several respects. The most obvious is the relation between blood sampling time and time of food intake, i.e., fasting and postprandial status as well as randomly collected samples [5]. Clinical decision limits exist for all the respective prandial states. In addition to this extrinsic factor, there are intrinsic regulatory mechanisms that have developed evolutionarily and determine a centrally anchored diurnal rhythm of glucose metabolism independent of food intake [6]. These intrinsic mechanisms contribute, to a certain extent, systematically and predictably to the biological variability of glucose levels in a population. Therefore, we have processed large-scale laboratory data to characterize the predictable, i.e., non-random, variability of plasma glucose levels and to estimate its influence on reference intervals, decision limits and biological variability.

Materials and methods
Data collection and processing
A retrospective study was conducted at two central laboratories located on the campus of the University Hospital Schleswig-Holstein (UKSH, site 1) and the University Hospital Cologne (UKK, site 2). Anonymized exports from the laboratory information systems served as data sources. Data sets were restricted to those with complete and intact information and to patients aged ≥18 years. Subsequently, the data were reduced to the first value measured for each patient to prevent biases due to autocorrelation between repeated measurements.
Site 1 comprised first values from 513,682 plasma glucose measurements ordered between 2015 and 2019 by ambulatory care and hospital facilities without any exclusions. Age ranged from 18 to 107 years.
Site 2 provided another 204,001 first values from plasma glucose measurements from 95 distinct units of the entire hospital ordered between 2015 and 2020. Age ranged from 18 to 108 years. All patients were categorized by age into three classes, i.e., 18-39, 40-59 and ≥60 years of age.
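The data reduction described above (adults only, complete records, first value per patient, three age classes) can be sketched as follows. This is a minimal illustration, not the authors' actual processing pipeline: the record layout and the field names `patient_id`, `age`, `timestamp` and `glucose` are assumptions made for the example.

```python
from datetime import datetime

REQUIRED = ("patient_id", "age", "timestamp", "glucose")  # hypothetical field names


def first_value_per_patient(records):
    """Keep the earliest complete record of each adult patient."""
    first = {}
    for rec in records:
        # Drop incomplete records.
        if any(rec.get(key) is None for key in REQUIRED):
            continue
        # Restrict to patients aged >= 18 years.
        if rec["age"] < 18:
            continue
        pid = rec["patient_id"]
        # Retain only the chronologically first measurement per patient.
        if pid not in first or rec["timestamp"] < first[pid]["timestamp"]:
            first[pid] = rec
    return list(first.values())


def age_class(age):
    """Map age in years to the three classes 18-39, 40-59 and >=60."""
    if age < 40:
        return "18-39"
    if age < 60:
        return "40-59"
    return ">=60"
```

Keeping only one value per patient trades sample size for independence of the observations, which is the stated reason for this step.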

Measurement of glucose concentrations
Heparin plasma samples were analyzed on a Cobas clinical chemistry analyzer using the hexokinase method and corresponding reagents from Roche Diagnostics (Basel, Switzerland). We did not include any results from point-of-care testing. Measurements were performed in compliance with the Guidelines of the German Medical Association on Quality Assurance in Medical Laboratory Examinations (RiliBAEK). Analytical within-run variation did not exceed 2%. In the present study, the time of sample arrival at the laboratory was recorded. The time between blood collection and arrival at the laboratory usually ranges from 1 h to a maximum of 2 h. Therefore, all time specifications carry a similar offset.

Statistical methods
Estimation of time-dependent reference limits (RL): Glucose values were separated by gender and further subdivided into hourly classes. Data of all subgroups were checked for trends and drifts prior to further analyses. Estimation of RL was based on the truncated maximum likelihood (TML) approach [7,8], employing the Reference Limit Estimator (RLE) software developed by the Working Group on Guide Limits of the German Society of Clinical Chemistry and Laboratory Medicine (DGKL). Time-specific RLs were estimated either for predefined time slots, chosen after visual inspection of aggregated data in box-and-whisker plots, or as continuous RLs over a period of 24 h. The relevance of partitioning RLs at predefined time slots was evaluated using the permissible uncertainty approach as suggested by Haeckel et al. [9].
Quantile regression analysis: As the number of nocturnal data points was not sufficient to explain the dynamics of the glucose variations, and indirect methods based on maximum likelihood estimation require larger data sizes per hour, we introduced an additional approach by estimating the trend of various quantiles of glucose values of a subpopulation. For this purpose, data were restricted to glucose values between 2.8 and 10 mmol/L. Quantile regression analyses (QRA) on the 2.5th, 25th, 50th, 75th and 97.5th percentiles were applied to these data. The time of day and age were used as predictors in the model. The circadian rhythm of glucose values has been intensively studied in the past [3]. This rhythm is properly modeled by the cosinor model with a 24 h period. Therefore, we applied a cosinor-function to the time axis [10]. The resulting wave-shaped rhythm is characterized by three variables, i.e., (i) the MESOR (literally "Midline Estimating Statistic of Rhythm"), which indicates the rhythm-adjusted average about which oscillation occurs, (ii) the amplitude, defined as half the difference between the highest and lowest value of the fitted cosinor curve, and (iii) the acrophase, which corresponds to the time of peak values. Age classes served as additive predictors to estimate age-specific MESORs. These classes were also included as interaction terms to estimate age-specific acrophases and amplitudes as part of the cosinor applied, if significant. Non-significant terms (α-error=0.05) of this maximum model were eliminated stepwise backward to establish a minimum adequate model. This procedure was implemented for each quantile mentioned above and separately for male and female subjects.
Briefly, the most important steps are as follows. First, age is categorized by class. Second, let p be the applied quantile; the maximum model is then defined by

$$Q_p(Y_j \mid t) = M_j^p + A_j^p \cos\left(\frac{2\pi t}{24} + \phi_j^p\right)$$

for j ∈ {0, 1, 2} and t ∈ {0, 1, 2, …, 23} h, where Y_j is the glucose concentration at time = t and age = j, and M_j^p, A_j^p, and ϕ_j^p denote MESOR, amplitude and acrophase at the quantile p and age = j, accordingly. This procedure was performed for different quantiles (2.5, 25, 50, 75, and 97.5%, respectively) and separately for each gender. The model can be expressed as a linear function with an interaction term between age and the cosine function. For more details see Bingham et al. [11]. The 'quantreg' package of the statistical software R (release 3.6.1) was used to perform QR analysis.
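To illustrate how a cosinor becomes a linear model, note that A·cos(ω(t − t_peak)) equals β·cos(ωt) + γ·sin(ωt) with β = A·cos(ω·t_peak) and γ = A·sin(ω·t_peak), where ω = 2π/24 h. The sketch below is not the quantile regression used in the study: as a simplifying assumption it fits an ordinary least-squares cosinor to 24 hourly summary values (for example hourly medians) for a single subgroup, and it reports the peak time in hours instead of an acrophase angle. All function and variable names are illustrative.

```python
import math

PERIOD_H = 24.0
OMEGA = 2.0 * math.pi / PERIOD_H


def fit_cosinor_hourly(values):
    """Least-squares single-component cosinor for 24 equally spaced hourly values.

    values[t] is a summary statistic at hour t = 0..23.
    Returns (mesor, amplitude, peak_hour).
    """
    if len(values) != 24:
        raise ValueError("expected exactly one value per hour (24 values)")
    n = len(values)
    # With one value per hour over a full period, the cosine and sine
    # regressors are orthogonal, so the least-squares solution is closed-form.
    mesor = sum(values) / n
    beta = 2.0 / n * sum(y * math.cos(OMEGA * t) for t, y in enumerate(values))
    gamma = 2.0 / n * sum(y * math.sin(OMEGA * t) for t, y in enumerate(values))
    amplitude = math.hypot(beta, gamma)
    # y(t) = M + beta*cos(wt) + gamma*sin(wt) = M + A*cos(w*(t - peak_hour))
    peak_hour = (math.atan2(gamma, beta) / OMEGA) % PERIOD_H
    return mesor, amplitude, peak_hour


def amplitude_mesor_ratio(mesor, amplitude):
    """Relative size of the rhythm (A/M), e.g. 0.077 for 7.7%."""
    return amplitude / mesor
```

The closed-form shortcut only holds for a balanced hourly design; with unevenly distributed samples, or when quantiles rather than means are modeled, the coefficients β and γ have to be estimated by a regression routine instead.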

Results
Hourly distribution of plasma glucose levels
Table 1 summarizes the underlying sample sizes and their distribution in the respective time phases. More than two thirds of all samples were collected between 10:00 h in the morning and 17:59 h in the evening and thus outside the time frame recommended for the collection of fasting plasma glucose, which is 07:00-09:00 h [5].
Box plots in Figure 1 illustrate the hourly distribution of glucose levels in the site 1 data set stratified by gender and age groups. Comparable patterns were obtained from the data of site 2 (data not shown). All six subgroups displayed in Figure 1 demonstrated similar diurnal fluctuations in glucose levels. The highest glucose levels were recorded consistently across all subgroups during the early morning hours between 02:00 and 06:00 h. As expected, glucose levels differed between age classes and genders, with increasing plasma concentrations in the elderly and higher values in males (Table 2).

Quantile-based cosinor analyses of diurnal glucose variations
The hourly quantiles (2.5th, 25th, 50th, 75th and 97.5th percentiles) were modeled by QR analysis regressed on two predictor variables, t (hour) as cosinor-function and age classes. Figure 2 depicts the resulting wave-shaped regression lines for site 1 over a 24 h period for each respective quantile. In contrast to data from site 1 no statistically significant cosine function [...] At both sites and in all age groups, acrophases of females tended to occur about 30-60 min earlier than those of males within the interquartile range. According to this model, the predictable biological variation inherent in the data sets corresponds to the amplitude to MESOR ratio (A/M). Table 2 summarizes the time-adjusted quantiles grouped by study site, gender and age class for the median and the upper and lower borders of the interquartile range. The A/M ratio steadily increased from the 25th to the 75th percentile, independent of site, gender and age class. Even though a positive correlation between glucose levels and age also exists within each quantile, no systematic effect on the A/M ratio was detectable. The overall A/M ratio averaged 7.7% (range 5.9-9.3%). The highest portion of variability is apparent between 06:00 and 10:00 h. According to the fitted QR model, glucose levels decrease on average by about 4% in this period, i.e., on average 1% per hour.

Daytime-specific reference limits
According to the dynamics described above, daytime-specific continuous reference intervals were determined. Figure 3 illustrates the resulting upper and lower RL estimated for both genders between 18 and 59 years at both sites, separately. Due to small sample sizes, RLs do not cover the whole night phase. Assuming that published RLs for glucose refer to fasting states, the smallest distance between the lower and upper reference limit appears in the morning. The distance between the two limits gradually increases between the afternoon and evening hours. Table 3 provides an overview of the estimated reference limits and medians that would result from the TML approach for day (09:00-17:59 h) and nighttime (18:00-04:59 h). Comparing the distribution of glucose values within each hour of Figure 1 with those in Table 3 and Figure 3, the effect of the RLE procedure is evident. This algorithm is designed to extract from a mixed population the "central" part of the distribution that distinguishes non-diseased from diseased individuals. Thus, significantly tighter intervals result as compared to Figure 1. The differences of the medians and upper RL in Table 3 exceeded the respective permissible analytical uncertainty and would therefore have to be considered relevant. In contrast, the lower RL did not differ between day and nighttime, or did so only to a lesser extent.
[Figure caption] … respectively. Similar distributions resulted at site 2 (data not shown). Horizontal colored lines represent clinical decision limits for fasting and random glucose levels, respectively. Glucose values between 5.6 (blue) and 6.9 mmol/L (green) are suspect for impaired fasting glucose [23,24]. Mild fasting hyperglycemia (glucose <8 mmol/L) (orange) in young patients may indicate the presence of maturity onset diabetes of the young (MODY) if there is additional evidence from other criteria [25]. Glucose values above 11.1 mmol/L (red) are suspect for diabetes mellitus.

Frequency of exceeding clinical decision limits
The RL determined in the morning (06:00-09:59 h) are in good agreement with consensus clinical decision limits determined by independent studies in fasting subjects [12]. Therefore, it was assumed that the glucose values obtained in this time window predominantly are obtained from fasting patients. The second time interval between 10:00 and 17:59 h includes a portion of postprandial glucose values that cannot be precisely quantified, while all other data were declared as nocturnal glucose values. As summarized in Table 1, there was a slight correlation between day and night phases and the relative proportion of glucose levels above 11.1 mmol/L with the highest rate at night and the lowest in the afternoon. As expected, the rate of glucose levels presumptive of diabetes increased steadily in the older age groups. Glucose values measured during the night only represented 4.7% of total measurements of site 1 and 2 but they accounted for 19.5% of all glucose values above 11.1 mmol/L. In absolute numbers, however, such pathological glucose values occurred most frequently in the afternoon. Laboratory findings suggesting impaired fasting glucose (5.6-6.9 mmol/L) demonstrated both age dependence and gender-specific differences at both sites. Although our data were not pre-selected except for first values, the results are in good agreement with systematically collected epidemiological data on the prevalence of disorders of glucose metabolism in Germany [13].
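The bookkeeping behind these proportions can be sketched as follows. The example assigns each sample to one of the three time windows described above and reports, per window, its share of all samples and its share of all results above the decision limit of 11.1 mmol/L. The input format (pairs of arrival hour and glucose value) and all function names are assumptions for illustration only.

```python
def time_phase(hour):
    """Assign an arrival hour (0-23) to one of the three time windows."""
    if 6 <= hour <= 9:
        return "morning"    # 06:00-09:59 h, assumed predominantly fasting
    if 10 <= hour <= 17:
        return "afternoon"  # 10:00-17:59 h, partly postprandial
    return "night"          # all remaining hours


def shares_by_phase(samples, limit=11.1):
    """Per phase: (share of all samples, share of all results above `limit`).

    `samples` is a list of (hour, glucose_mmol_per_l) pairs.
    """
    total = len(samples)
    if total == 0:
        raise ValueError("no samples")
    high_total = sum(1 for _, glucose in samples if glucose > limit)
    shares = {}
    for phase in ("morning", "afternoon", "night"):
        in_phase = [g for h, g in samples if time_phase(h) == phase]
        high = sum(1 for g in in_phase if g > limit)
        share_of_high = high / high_total if high_total else 0.0
        shares[phase] = (len(in_phase) / total, share_of_high)
    return shares


def enrichment(share_of_high, share_of_samples):
    """How strongly results above the limit are over-represented in a phase."""
    return share_of_high / share_of_samples
```

With the figures reported above, `enrichment(0.195, 0.047)` gives a factor of about 4.1, i.e., nocturnal samples contribute roughly four times as many results above 11.1 mmol/L as their share of all measurements would suggest.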

Discussion
Clinical assessment of plasma glucose concentration is based on clinical decision limits. The availability of population-specific reference limits nevertheless has its justification. In particular, for clinical decision limits, it should be known how they relate to the distribution of an analyte in a population. Age, sex, and, as described here, time-dependent variations are, in our opinion, basic information that merits consideration in the design of studies to determine or adjust clinical decision limits. Patient-based real-time quality control is another potential application where time-based population-specific reference limits may play a role. In these contexts, our study contributes previously unavailable information on the dynamics and magnitude of time-dependent variations of plasma glucose. The term chronodesm describes time-bound limits within which the measured values of an analyte are physiologically located over a defined period [14]. In clinical practice, this approach is rarely applied to laboratory parameters with known circadian rhythm. So-called monodesms are much more common, although this term is not commonly accepted in the context of RL and clinical decision limits. Being a derivative of the term chronodesm, monodesms describe the physiological range or reference interval at a defined point in time or, if applicable, over a circumscribed phase that is located within a period of the principal rhythm. In clinical practice, therefore, monodesms actually reflect a pre-analytical requirement for the time of blood collection.
In the analyses presented here, the diurnal dynamics of random glucose levels were addressed from different perspectives. In contrast to time series obtained from single persons, large-scale "real-world" laboratory data were investigated.
The term real-world big data study was coined to distinguish such approaches from conventional controlled trials in well-characterized cohorts [15]. Its aim is to counter the criticism of artificial settings in controlled trials by generating findings that better reflect the situation in everyday clinical practice, in the sense of "real world evidence" [16].
Studies on the time-dependent variation of plasma glucose are mostly restricted to either healthy young volunteers or patients suffering from disorders in glucose metabolism. Studies that apply time series approaches are, however, usually limited to smaller sample sizes. The latter focus mainly on the estimation of day-to-day variability and analyze fasting glucose levels obtained under strict pre-analytical conditions [17]. We therefore assume that metrics derived from real-world laboratory data will not match any type of variability generated by those studies.
Data included in our analyses represented only first values. In addition to the exclusion of any bias due to autocorrelation or interventions such as medication or hemodialysis, this type of data refinement is decisive for the conclusions drawn. The extent of individual influences on the components of biological variability inherent in the data sets shall be reduced to those that can be considered as population specific. We further assumed that given a sufficient number of data sets the biologically anchored oscillation pattern would predominate.
Our results were comparable between two different university laboratories. One important finding, however, is that sample size was still a limiting factor even though more than 500,000 patients were included in one data set. The initial exclusion of repeated measurements reduced the size of our data sets by more than 75%. Further, glucose values at night comprised only 10% of all values. Thus, in general, more than two million raw results from one site are needed to perform such data mining approaches on hourly glucose values over a 24 h period. QR analyses were performed, since biological variation may be modulated or differ between different health states or subgroups, such as age classes or gender. The resulting mathematical model is basically compatible with the time-dependent, biologically anchored proportion of the variability of glucose measurement within a population. This type of variation can otherwise only be determined under experimental conditions, i.e., as part of a "constant routine protocol" with serial measurements on healthy volunteers. Acrophases for glucose experimentally determined under such conditions are localized between 23:30 and 00:30 h [18]. According to the median glucose values of our own data (Table 2), the acrophases (t0) range between 00:16 and 02:14 h. This phase shift would correspond to the time of sample transport.
Notably, the location of the acrophases seemed to be gender-specific in our populations. The circadian rhythm of the glucose level in turn is primarily coupled to the wake-sleep rhythm and does not necessarily follow the course of day and nighttime [3]. Therefore, the phase delay, i.e., later peak, in men might be explained by differences between the sleeping times of men and women, as recently reported in an evaluation of more than 10,000,000 data sets [19]. Strictly interpreted, this literally would imply that before taking a blood sample the time of falling asleep and waking up needs to be recorded in order to have an orientation about the individual cosinor-function.
Further, QRA provided gender- and age-specific trajectories at different concentration levels. The shape, i.e., mainly the amplitude, of the respective anticipated time course differed between subgroups. Thus, oscillations of glucose values may not be sufficiently described by a unique so-called "cosinordesm" but comprise several distinct "rhythmodesms". While no age dependency was observed within each quantile, a correlation exists between plasma glucose and the respective amplitudes in a concentration-dependent manner, in both relative and absolute terms. This finding is relevant insofar as it indicates that the biological variability within the reference interval may not be strictly linear.
The extent of daily variation can be discussed with regard to the overall variation within 24 h or restricted to specific monodesms such as the morning time when blood samples are usually collected. Notably, within the time frame from 06:00 to 10:00 h the intrinsic oscillation of glucose levels reaches its inflection point; thus, the highest change of concentration per unit time occurs. Considering the increasing demands on analytical accuracy, the observed average change of 4% during this time span could be of relevance. Reported measures of biovariability such as within-subject and between-subject variation tended to be higher [20,21], which is consistent with the assumption that additional sources of variation exist.
Within-24 h-variation that may be assigned to regular, i.e., physiological, oscillations ranged between ±5.9 and ±9.3% within the interquartile range of the observed glucose values. This amount may therefore be an estimate for the portion of unalterable, i.e., baseline variability to be considered when random glucose values are evaluated.
Referring to the impact of daytime-dependent variations on clinical decision limits provided for random glucose samples we observed a four-fold higher occurrence of cases above 11.1 mmol/L at night. However, we cannot draw any conclusions from this observation because there was no link between our findings and patient records.

Conclusions
The differences between upper RL estimated at day and nighttime exceeded the permissible analytical uncertainty. Because of the observed diurnal variations, indirectly estimated upper RLs of random glucose concentrations should be stratified for sample collection during early morning (e.g., 02:00-06:00 h) and late morning (e.g., 06:00-12:00 h). This is particularly important if directly determined upper RLs are compared with indirectly estimated upper RLs. Directly determined upper RLs are usually derived from samples collected during late morning. We further propose to extend our approach to other measurands with known or suspected daytime-dependent variations [22].
Research funding: None declared. Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.