Derivation of Passing–Bablok regression from Kendall’s tau

It is shown how Passing’s and Bablok’s robust regression method may be derived from the condition that Kendall’s correlation coefficient tau shall vanish upon a scaling and rotation of the data. If the ratio of the standard deviations of the regressands is known, a similar procedure leads to a robust alternative to Deming regression, which is known as the circular median of the doubled slope angle in the field of directional statistics. The derivation of the regression estimates from Kendall’s correlation coefficient makes it possible to give analytical estimates of the variances of the slope, intercept, and of the bias at the medical decision point, which have not been available to date. Furthermore, it is shown that using Knight’s algorithm for the calculation of Kendall’s tau makes it possible to calculate the Passing–Bablok estimator in quasi-linear time, and hence rapidly even for very large data sets. Examples with data from clinical medicine are also provided.


Introduction
The comparison of measurements obtained with different instruments or test formats is a biometrical problem familiar to every scientist. While disciplines differ considerably with respect to the preferred methods, regression is of general importance [1]. Since both measurements to be compared have some level of imprecision, least squares regression (LSR) is not the best choice, as it assumes the abscissa values to be error free. This drawback is avoided by techniques like Deming regression (also known as major axis regression) or least product regression (LPR) [1][2][3][4][5]. However, the results of these methods may be seriously biased by a small number of outlying measurements. To overcome this problem, in the field of clinical chemistry, Passing and Bablok [6] proposed a regression method, called classical Passing-Bablok regression (cPBR) in the following, which may be considered a variant of Theil-Sen regression (TSR) [7,8], adapted to take into account the variation not only of the ordinate but also of the abscissa.
While these authors gave only a heuristic justification for their method, they showed empirically the unbiasedness of the results obtained [9] and the robustness of the method. Although it has become one of the most often used statistical tools in clinical chemistry [5], lamentably, the method remains widely unknown outside this field.
Due to the limited analytical understanding of cPBR, formulas for the variance and confidence intervals of the slope and intercept are also not well developed. Furthermore, cPBR is limited to slopes near 1, although, in a follow-up paper [10], a method applicable for any value of the slope was proposed, which in the following shall be called equivariant PBR (ePBR). cPBR may be shown to be an approximation to ePBR. In the following, if both cPBR and ePBR are in scope, the term Passing-Bablok regression (PBR) will be used.
It is the aim of this paper to show that all the aforementioned regression techniques and specifically PBR result from the condition that either Pearson's or Kendall's correlation shall diminish to 0 upon a suitable slope-dependent scaling and rotation of the original variables. This extends the ideas of Sen [8] on the derivation of TSR to PBR. Along the same lines, a robust variant of the Deming regression procedure is developed, which is found to be equivalent to the determination of the circular median of the doubled slope angle, a well-known estimator in the theory of directional statistics.
The paper is structured as follows: in Section 2.1, the methods within the scope of the paper are briefly recapitulated. The estimators obtained with the different methods are compared using an example from clinical chemistry (Figure 1). Section 2.2 contains the theoretical derivation of the Passing-Bablok estimator and is relevant mainly for the reader more interested in the theory. In Section 2.3, the main equations for the calculation of the PB estimators are laid out, while in the following Section 2.4, expressions for the circular median estimator are given. It also contains a numerical comparison of the different methods. In Section 2.5, it is shown how the PB estimators can be calculated efficiently. Section 3 contains a derivation of the confidence interval for the slope and an efficient algorithm for its numerical calculation. Finally, in Section 4, new equations for the confidence intervals of the intercept and bias are derived. Their coverage is then studied by simulations. Generally, examples or simulations more relevant for the practitioner are included at the end of the sections or in the figures and their captions. Recommendations on the use of the methods are put together in the discussion (Section 5).

Figure 1: To exemplify the performance of the different regression schemes for measurements with very different ranges, results from a comparison of two clinical chemistry tests, a PCR assay (concentration [IU/ml]) and an immunoassay (cutoff index, COI), are shown. Red lines are for TSR of concentration on cutoff index (dashed), TSR of cutoff index on concentration (dotted), and ePBR (solid). Blue lines are for LSR of concentration on cutoff index (dashed), LSR of cutoff index on concentration (dotted), and LPR (solid). The green curve is cPBR. On the left, all curves are forced through the center of mass of concentration and cutoff index. On the right, the intercept for the TSR and PBR curves is obtained as proposed by Passing and Bablok.

Derivation of regression from correlation

2.1 Synopsis of the relevant regression methods
In the following, a caret will designate the estimated value, e.g., $\hat m$ for the slope $m$, and a bar the mean, e.g., $\bar x = \frac{1}{n}\sum_i x_i$. It will be shown how the regression schemes mentioned in the introduction, namely, LSR, LPR, Deming regression, and notably PBR, may be derived from the condition that either Pearson's or Kendall's correlation diminishes to 0 for transformed input data.
The following assumptions are made, although for some methods further generalizations are possible. For LSR and Deming regression, both weighted and unweighted variants exist. For the ease of presentation, only the unweighted variants will be discussed.
1. There exists a structural relationship between the expected values of the random variables $x_i$ and $y_i$, denoted as $x_i^*$ and $y_i^*$, respectively, of the form $y_i^* = b + m x_i^*$. Here, $x_i = x_i^* + \xi_i$ and $y_i = y_i^* + \eta_i$. For LSR and TSR, $\xi_i = 0$.
2. Error terms for points $i \neq j$ are independent. $\xi_i$ and $\eta_i$ are independent, too.
3. The distribution functions of the error terms may depend on $i$, but for fixed $i$, the distributions of $\xi_i$ and $\eta_i$ are equal up to scaling with the standard deviation, i.e., $\forall i, x$: $P(\xi_i/\sigma_{\xi_i} \le x) = P(\eta_i/\sigma_{\eta_i} \le x)$. In the following, $\gamma = \sigma_\eta/\sigma_\xi$ denotes the ratio of the standard deviations.

For a detailed description of the estimation procedures to be considered, the reader is referred to the literature [1][2][3][4][6]. Here, only a very condensed synopsis of the main assumptions and formulas for the estimated slope is given for the methods of interest, as follows:

- LSR: In least squares regression of Y on X, the estimate of the slope $m$ is obtained by minimizing the sum of the squared deviations of the $y_i$ from the estimated values,

$$\hat m = \frac{s_{xy}}{s_{xx}},$$

where $s_{xx}$, $s_{yy}$, and $s_{xy}$ denote the empirical variances and the covariance of the $x_i$ and $y_i$.
- LPR: In least product regression, the estimate $\hat m$ is obtained by minimizing the sum of the products of the deviations of the $y_i$ and $x_i$ from their estimated values,

$$\hat m = \pm\sqrt{\frac{s_{yy}}{s_{xx}}},$$

where the sign of the root is chosen equal to the sign of Pearson's correlation coefficient of x and y. Sometimes, LPR is called geometric mean regression, as the estimate $\hat m$ may also be expressed as the geometric mean of the slope estimate from an LSR of y on x and the reciprocal slope estimate of the regression of x on y. The term "reduced major axis regression" is also common.
- Deming Reg.: In major axis, or Deming regression, the slope is determined by minimization of the sum of the squared deviations of both x and y from the estimated true values $x_i^*$ and $y_i^* = b + m x_i^*$,

$$\sum_i \left[ (y_i - b - m x_i^*)^2 + \gamma^2 (x_i - x_i^*)^2 \right],$$

where the sum is minimized with respect to both $m$ and the $x_i^*$. The slope estimate is

$$\hat m = \frac{1}{2 s_{xy}} \left[ s_{yy} - \gamma^2 s_{xx} + \sqrt{\left(s_{yy} - \gamma^2 s_{xx}\right)^2 + 4 \gamma^2 s_{xy}^2} \right].$$

It can be seen that the slope estimators of LSR and LPR arise as special cases of the Deming regression estimate. Namely, $\gamma \to \infty$ corresponds to regression of y on x and $\gamma \to 0$ to regression of x on y. Assuming that $\gamma = m$, replacement of $\gamma$ by its estimator $\hat m$ in the last expression yields the LPR expression for $\hat m$.
- Roos Reg.: As first discussed by Roos [11] and Xu [12], LSR may be considered as a special case from a family of regressions where the squared distance of the points from the regression line is minimized along lines with slope $\kappa$; LSR is the special case $\kappa \to \infty$. The slope estimate is

$$\hat m_\kappa = \frac{s_{yy} - \kappa s_{xy}}{s_{xy} - \kappa s_{xx}}.$$
- TSR: According to Sen [8], in Theil-Sen regression the slope can be estimated from the condition that Kendall's correlation coefficient $\tau$ between the $X$ and the intercepts $Y - mX$ shall be 0,

$$\tau(X, Y - \hat m X) = N \sum_{i<j} \operatorname{sgn}(x_j - x_i)\, \operatorname{sgn}\big(y_j - y_i - \hat m (x_j - x_i)\big) = 0.$$

$N$ is a normalization factor. When there are no ties, $N = 2/(n(n-1))$. The function $\operatorname{sgn}(x)$ is the sign of $x$ if $x \neq 0$, and 0 if $x = 0$. If the $x_i$ are ordered so that $x_i < x_j$ for $i < j$, this is equivalent to

$$\hat m = \operatorname{med}_{i<j} S_{ij}, \qquad S_{ij} = \frac{y_j - y_i}{x_j - x_i},$$

where $S_{ij}$ is the slope of the line connecting points $i$ and $j$.
- cPBR: In classical Passing-Bablok regression, Passing and Bablok [6] proposed to modify the TSR estimator to take account of the X not being free of error. If $K$ is the number of slopes $S_{ij} < -1$ and $N = n(n-1)/2$ the total number of pairs, then

$$F_S(\hat m) = \frac{1}{2} + \frac{K}{N},$$

where $F_S$ is the empirical cumulative distribution function of the $\{S_{ij}\}$. Furthermore, Passing and Bablok made the assumption that $m \approx 1$.
- ePBR: In a follow-up article, Bablok et al. [10] proposed an alternative method to cPBR which is also applicable for slopes $\neq 1$. They gave a recursive definition which can be shown [13] to be equivalent to $\hat m = \operatorname{med}_{i<j} |S_{ij}|$.
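The three slope estimators built on the pairwise slopes $S_{ij}$ can be compared in a few lines. The following is a minimal brute-force sketch in Python with $O(n^2)$ cost; the function names are illustrative and not taken from any package, and the cPBR variant assumes positively correlated data so that the shifted index stays in range.

```python
from itertools import combinations
from statistics import median


def pairwise_slopes(x, y):
    """Slopes S_ij of the lines through all pairs of points with distinct x."""
    return [(yj - yi) / (xj - xi)
            for (xi, yi), (xj, yj) in combinations(zip(x, y), 2)
            if xj != xi]


def theil_sen_slope(x, y):
    """TSR: median of the pairwise slopes."""
    return median(pairwise_slopes(x, y))


def epbr_slope(x, y):
    """ePBR: median of the absolute values of the pairwise slopes."""
    return median(abs(s) for s in pairwise_slopes(x, y))


def cpbr_slope(x, y):
    """cPBR: median shifted by K, the number of slopes below -1.

    Slopes exactly equal to -1 are discarded, as in the original method.
    """
    s = sorted(v for v in pairwise_slopes(x, y) if v != -1.0)
    n_s = len(s)
    k = sum(v < -1.0 for v in s)
    if n_s % 2:                      # odd number of slopes
        return s[(n_s + 1) // 2 + k - 1]
    return 0.5 * (s[n_s // 2 + k - 1] + s[n_s // 2 + k])
```

When all pairwise slopes are positive, K = 0 and the three estimators coincide; rescaling y by a constant factor rescales the ePBR estimate by the same factor.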
In Figure 1, some of these methods are applied to a data set from clinical chemistry containing paired measurements obtained with both a PCR assay and an immunoassay. Due to different units, the numerical ranges of the results differ by several orders of magnitude. It can be seen that LSR, TSR, and also cPBR are sensitive to the choice of the dependent and independent variables, while LPR and ePBR yield results which are more centered than those of the other methods.

Coordinate transformations
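In this section it will be shown that all these estimators may be derived from a common expression. Since LSR and LPR can be derived from Deming regression, the latter will be the focus. The idea is to rotate the coordinate system so that either Pearson's correlation coefficient $r$ or Kendall's correlation coefficient $\tau$ vanishes [13]. However, this cannot be done directly because, upon rotation by an angle $\phi$, the error terms will in general no longer be independent:

$$\operatorname{cov}(\xi', \eta') = \sin\phi \cos\phi \, (1 - \gamma^2)\, \sigma_\xi^2 = 0,$$

which is only consistent if $\gamma^2 = 1$. However, if new x-values are introduced by scaling with $\gamma$ so that $x_i' = \gamma x_i$, the covariance matrix of the new error terms $\xi' = \gamma\xi$ and $\eta$ becomes a multiple of the unit matrix. Upon scaling, the original slope $m$ is transformed into $m' = m/\gamma$. The coordinate system can now be rotated by the angle $\phi = -\arctan(m')$, whence

$$x_i'' = x_i' \cos\phi - y_i \sin\phi = \cos\phi \left( \gamma x_i + \frac{m}{\gamma} y_i \right), \qquad y_i'' = x_i' \sin\phi + y_i \cos\phi = \cos\phi \, (y_i - m x_i).$$

Dropping irrelevant constant factors, $X'' := \{y_i + x_i \gamma^2/m\}$ and $Y'' := \{y_i - m x_i\}$. The condition that the Pearson correlation $r(X''(m), Y''(m))$ of the transformed points diminishes to 0 for $m = \hat m$ becomes

$$\sum_i \left[ (y_i - \bar y) + \frac{\gamma^2}{\hat m} (x_i - \bar x) \right] \left[ (y_i - \bar y) - \hat m (x_i - \bar x) \right] = 0, \tag{1}$$

which can be simplified to

$$s_{xy}\, \hat m^2 - \left( s_{yy} - \gamma^2 s_{xx} \right) \hat m - \gamma^2 s_{xy} = 0. \tag{2}$$

It can be shown that the solution of this equation yields exactly the Deming regression estimate $\hat m$.

When Kendall's $\tau$ is substituted for Pearson's $r$ in the preceding derivation, the differences $\Delta x_{ij} = x_j - x_i$ and $\Delta y_{ij} = y_j - y_i$ have to be considered. The errors of these differences, $\Delta\eta_{ij}$ and $\Delta\xi_{ij}$, will also be independent with the same ratio $\gamma^2$ of the variances. Hence, the equivalent of Eq. (1) becomes

$$\sum_{i<j} \operatorname{sgn}\!\left( \Delta y_{ij} + \frac{\gamma^2}{\hat m} \Delta x_{ij} \right) \operatorname{sgn}\!\left( \Delta y_{ij} - \hat m\, \Delta x_{ij} \right) = 0. \tag{3}$$

With the pairwise slopes $S_{ij} = \Delta y_{ij}/\Delta x_{ij}$, this can also be simplified to

$$\sum_{i<j} \operatorname{sgn}\!\left[ \left( S_{ij} + \frac{\gamma^2}{\hat m} \right) \left( S_{ij} - \hat m \right) \right] = 0. \tag{4}$$

The solution of this equation yields an estimator which is the equivalent of Deming regression. For this slope estimate to be consistent, it is necessary that the expectation value of Eq. (4) with $\hat m$ replaced by $m$ be 0. Using $\Delta y_{ij} - m \Delta x_{ij} = \Delta\eta_{ij} - m \Delta\xi_{ij}$ and dropping an overall sign,

$$E\!\left[ \sum_{i<j} \operatorname{sgn}\!\left( \Delta y^*_{ij} + \frac{\gamma^2}{m} \Delta x^*_{ij} + \Delta\eta_{ij} + \frac{\gamma^2}{m} \Delta\xi_{ij} \right) \operatorname{sgn}\!\left( m \Delta\xi_{ij} - \Delta\eta_{ij} \right) \right] = 0. \tag{5}$$

If the $x_i^*$ are not random variables, generically, the expectation value of each term in the sum has to be 0 on its own. This condition is fulfilled if either:

- $\gamma = |m|$ and the distributions of the $m \Delta\xi_{ij}$ and $\Delta\eta_{ij}$ are identical. The term whose expectation is taken in Eq. (5) is anti-symmetric under an interchange of the $m \Delta\xi_{ij}$ and $\Delta\eta_{ij}$, whence the expectation must be 0. These are exactly the assumptions made by Passing and Bablok.
- Or the $m \Delta\xi_{ij} - \Delta\eta_{ij}$ are statistically independent of the corresponding $\Delta y^*_{ij} + \frac{\gamma^2}{m} \Delta x^*_{ij} + \Delta\eta_{ij} + \frac{\gamma^2}{m} \Delta\xi_{ij}$ and their expectations are 0. For general $m$ and $\gamma$, this implies that the distribution of the $\gamma\xi_i$ and $\eta_i$ has to be rotationally invariant, which is only true if they are independent samples from a common Gaussian distribution with mean $E(\eta_i) = E(\xi_i) = 0$. Although requiring a Gaussian distribution may seem quite restrictive, one should note that the two terms also become approximately statistically independent when the errors are small compared to the mean distance between the points, since it follows that, in most of the terms,

$$\operatorname{sgn}\!\left( \Delta y^*_{ij} + \frac{\gamma^2}{m} \Delta x^*_{ij} + \Delta\eta_{ij} + \frac{\gamma^2}{m} \Delta\xi_{ij} \right) \approx \operatorname{sgn}\!\left( \Delta y^*_{ij} + \frac{\gamma^2}{m} \Delta x^*_{ij} \right),$$

which may therefore be pulled out of the expectation. The expectation $E(\operatorname{sgn}(m \Delta\xi_{ij} - \Delta\eta_{ij})) = 0$ if the distribution of the $\xi_i$ and $\eta_i$ is symmetric.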
The two cases shall be discussed separately in the following subsections. It is remarked that in both cases there are two possible solutions which differ only in the sign of the resulting slope. As in LPR, the sign may be fixed as that of Kendall's correlation between the original X and Y. In the following descriptions, it will be assumed that this sign is positive, which can always be achieved by changing the sign of X if necessary.
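The Pearson branch of this construction can be checked numerically. The sketch below, in plain Python with illustrative function names, evaluates the closed-form Deming slope for a given ratio γ and then verifies that the Pearson correlation of the transformed variables X″ = y + xγ²/m and Y″ = y − mx vanishes at that slope. It assumes unweighted data with a non-zero covariance.

```python
from math import sqrt


def _moments(x, y):
    """Centered sums of squares and cross products (s_xx, s_yy, s_xy)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, syy, sxy


def deming_slope(x, y, gamma):
    """Closed-form Deming slope for a known ratio gamma of the error SDs."""
    sxx, syy, sxy = _moments(x, y)
    d = syy - gamma ** 2 * sxx
    return (d + sqrt(d * d + 4 * gamma ** 2 * sxy ** 2)) / (2 * sxy)


def transformed(x, y, m, gamma):
    """Scaled and rotated variables X'' and Y'' (constant factors dropped)."""
    xpp = [b + a * gamma ** 2 / m for a, b in zip(x, y)]
    ypp = [b - m * a for a, b in zip(x, y)]
    return xpp, ypp


def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    suu, svv, suv = _moments(u, v)
    return suv / sqrt(suu * svv)
```

For any positive γ, `pearson(*transformed(x, y, deming_slope(x, y, gamma), gamma))` should be zero up to rounding, and for large γ the slope approaches the LSR value s_xy/s_xx.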

Passing-Bablok regression
The first case leads to ePBR: Setting $\gamma^2 = m^2$ and finding the $\hat m$ for which the Pearson correlation becomes 0 yields the LPR estimate. Analogously, the condition that Kendall's correlation shall diminish to 0 leads to an equation for $\hat m$,

$$\sum_{i<j} \operatorname{sgn}\!\left[ \left( S_{ij} + \hat m \right) \left( S_{ij} - \hat m \right) \right] = 0; \tag{6}$$

this can also be written as

$$\hat m = \operatorname{med}_{i<j} |S_{ij}|.$$

Unlike the cPBR estimator, this estimator is equivariant: a scaling of the $S_{ij}$ by a factor scales the estimator $\hat m$ by the same factor. Therefore, it will be called the ePBR estimator in the following.
Alternatively, this estimator arises on introduction of the angles $\hat\phi = \arctan(\hat m)$ and $\phi_{ij} = \arctan(S_{ij})$ as $\hat\phi = \operatorname{median} |\phi_{ij}|$.
Therneau [13] could show that this expression corresponds to an estimator proposed by Bablok et al. [10], who gave a recursive approximation for this estimator. This recursion uses the cPBR as a starting point. It can be seen that if in the first bracket of Eq. (6), m is replaced by 1, this will yield exactly the cPBR estimator. If Pearson correlation is used instead of Kendall's, the equivalent of the cPBR corresponds to the Roos regression estimate with parameter κ −1.
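While Eq. (6) yields a consistent estimator, the bias of the ePBR estimator for small sample sizes remains to be determined. It is assumed that the true $m \neq 0$. Writing Eq. (6) in terms of the differences and multiplying the argument of the sign function by $(m/\hat m)^2$ yields

$$\sum_{i<j} \operatorname{sgn}\!\left[ \left( \frac{m}{\hat m} \right)^2 \Delta y_{ij}^2 - \left( m \Delta x_{ij} \right)^2 \right] = 0.$$

As $m \Delta x_{ij}$ and $\Delta y_{ij}$ are distributed alike, it can be concluded that $\hat m/m$ and $m/\hat m$ also follow the same distribution. Therefore, $\ln(\hat m)$ is symmetrically distributed around $\ln(m)$. As the variance of $\hat m$ is $O(1/n)$ (see below), $\ln(\hat m)$ is an unbiased estimator of $\ln(m)$, and $\hat m$ is an asymptotically unbiased estimate of $m$. An equivalent result holds for the LPR estimator [14].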
Considering the robustness properties of the ePBR estimator, it is easy to show that it has the same minimal breakdown point as the TSR estimator. For the TSR estimator the breakdown point is reached earliest when the slopes between any two points of which at least one is an outlier make up 50% of all slopes and are either all larger or all smaller than all the slopes between any two points which are not outliers ([15], p. 67). Asymptotically, this is the case when the fraction of the outliers reaches $1 - 1/\sqrt{2} \approx 29\%$. The calculation remains true if, in the case of the ePBR, the slopes are replaced by their absolute values, whence the breakdown point of the ePBR is also 29%.
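The 29% figure can be retraced with a short calculation. This is a sketch for the large-sample limit, in which a fraction ε of the points are outliers; the symbol ε is introduced here for illustration only.

```latex
% Fraction of point pairs in which both points are regular (not outliers):
%   (1 - \varepsilon)^2
% Breakdown is reached once the pairs containing at least one outlier
% make up half of all pairs:
\[
  1 - (1-\varepsilon)^2 = \tfrac{1}{2}
  \quad\Longrightarrow\quad
  \varepsilon = 1 - \frac{1}{\sqrt{2}} \approx 0.293 .
\]
```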

Circular median of the doubled angle
Considering the general solution of Eq. (4) for fixed and known $\gamma$, setting $\hat\phi := \arctan(\hat m/\gamma)$ and $\phi_{ij} := \arctan(\Delta y_{ij}/(\gamma \Delta x_{ij}))$, and making use of the addition theorems for the trigonometric functions, one obtains

$$\sum_{i<j} \operatorname{sgn}\!\left( \sin 2(\phi_{ij} - \hat\phi) \right) = \sum_{i<j} (-1)^{\lfloor 2(\phi_{ij} - \hat\phi)/\pi \rfloor} = 0.$$

Here, $\lfloor x \rfloor$ is the floor function, i.e., the largest integer less than or equal to $x$. $\hat\phi$ is a standard estimator of circular or, more generally, of directional statistics [16,17], known as the "circular median" of the doubled angle. Doubling of the angle is standard in circular statistics for data showing diametrically bimodal distributions. The circular median has been shown to be a robust and unbiased location estimator of the slope angle [18]. To visualize the circular median, the (doubled) angles are represented as points on a unit circle and a diameter PQ is drawn which separates the circle into two halves containing the same number of points, cf. Figure 2. The point P, to which the majority of the points are closer than to Q, is then the circular median. For an odd number of points it coincides with a measured angle, while for an even number of points the mean of the nearest two points is used.

Estimation of the slope via the circular median seems to be an attractive and robust alternative to the usual Deming regression, especially if the ratio of the standard errors is known and different from the expected slope. Similar to the way the Deming slope estimator interpolates between the LSR estimator for the slope of Y versus X and the inverse of that of X versus Y depending on the ratio $\gamma$, the circular median can be seen to interpolate between the corresponding TSR estimates. The calculation of the PBR estimate is also possible using the circular median algorithm.
While finding the circular median is as complex as finding the usual median as required in the practical computation of the PBR, the concept offers the advantage, that it can be extended to find a regression line when comparing more than two sets of data [19].
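A direct, if slow, way to obtain the circular median slope for known γ is to search for the angle at which the sign sum over the doubled angle differences changes from positive to non-positive. The following Python sketch does this by bisection on (0, π/2) with an O(n²) evaluation per step. The function name is illustrative, positive correlation is assumed, and for an even number of pairs the search returns the lower end of the interval on which the sum is zero rather than the mean of the two nearest angles.

```python
from itertools import combinations
from math import atan, pi, sin, tan


def circular_median_slope(x, y, gamma, tol=1e-10):
    """Slope gamma * tan(phi) with phi the circular median of the doubled angles.

    Assumes a positive association between x and y, so that the sign sum is
    positive near phi = 0 and negative near phi = pi/2.
    """
    phis = [atan((yj - yi) / (gamma * (xj - xi)))
            for (xi, yi), (xj, yj) in combinations(zip(x, y), 2)
            if xj != xi]

    def sign_sum(phi):
        # sum of sgn(sin(2 * (phi_ij - phi))) over all pairs
        total = 0
        for p in phis:
            s = sin(2.0 * (p - phi))
            total += (s > 0) - (s < 0)
        return total

    lo, hi = 0.0, pi / 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sign_sum(mid) > 0:
            lo = mid
        else:
            hi = mid
    return gamma * tan(0.5 * (lo + hi))
```

For points lying exactly on a line with positive slope, the result equals that slope for any choice of γ, since all pairwise angles coincide.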
It is of interest to compare the ePBR and the circular median method numerically. To that end, 100 points were sampled 1000 times with $x^* = y^*$ from a uniform distribution, i.e., with $m = 1$, and a) equal coefficients of variation $cv_y = cv_x = 0.3$ and b) unequal coefficients of variation $cv_x = 0.3$ and $cv_y = 0.9$, with Gaussian error. In case a), both methods yield practically the same results with mean $\hat m = 1.005$ (sd 0.067) for the ePBR and mean $\hat m = 1.007$ (sd 0.084) for the median circular slope. However, in case b), the ePBR is biased with mean $\hat m = 1.606$ (sd 0.160), while the mean median circular slope estimate is $\hat m = 1.007$ (sd 0.084).

Numerical complexity of the calculation of the PBR and circular median estimators
The question remains how efficiently the TSR, PBR, and circular median estimates can be calculated. In the case of the TSR, this problem has long been solved and represents a classical problem of computational geometry [20]. While the brute force approach to find the median of all the pairwise slopes between $n$ points has a numerical complexity of $O(n^2)$, better algorithms have a complexity of only $O(n \ln n)$. The problem can be shown to be equivalent to that of computing Kendall's tau, for which an efficient solution was given by Knight in 1966 [21]. It is based on the observation that the evaluation of Kendall's $\tau(X'', Y'')$ is equivalent to counting the inversions needed to bring the set $Y''$ into the order of $X''$, an operation which can be accomplished in $O(n \ln n)$ steps using a sorting algorithm like mergesort [22]. The $x_i''$ and $y_i''$ can be interpreted as intercepts $b_i(m)$ of lines passing through the points $(x_i, y_i)$ whose slope $m$ is taken as an independent variable. This lends itself to an interpretation in terms of a duality transformation which maps points $(x_i, y_i)$ to lines (parametrized by $(m, b_i(m))$) and lines to points. In the field of image analysis, this transformation is known as the Hough transformation [23].
Since Knight's algorithm has been implemented in many statistical packages and these also take care of the handling of ties, it is an easy task to use these algorithms to calculate the PBR and circular median estimators. Even in combination with a simple bisection algorithm for root finding, it thus becomes possible to find the estimators for data sets with $n = 10^7$ on an ordinary laptop in a few minutes.
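The combination of inversion counting and bisection can be sketched as follows. This is a minimal pure-Python illustration in the spirit of Knight's algorithm, not a reproduction of any package implementation: the function names are hypothetical, ties are not corrected for, and the bracket [lo, hi] for the slope is assumed to be supplied by the user with a positive Kendall sum at lo and a negative one at hi.

```python
def count_inversions(a):
    """Return (sorted copy of a, number of pairs i < j with a[i] > a[j])."""
    n = len(a)
    if n < 2:
        return list(a), 0
    left, inv_l = count_inversions(a[:n // 2])
    right, inv_r = count_inversions(a[n // 2:])
    merged, inv, i, j = [], inv_l + inv_r, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            inv += len(left) - i      # all remaining left items exceed right[j]
    merged += left[i:] + right[j:]
    return merged, inv


def kendall_s(u, v):
    """Concordant minus discordant pairs of (u, v), assuming no ties."""
    order = sorted(range(len(u)), key=lambda k: u[k])
    _, inv = count_inversions([v[k] for k in order])
    n = len(u)
    return n * (n - 1) // 2 - 2 * inv


def epbr_slope_fast(x, y, lo, hi, iters=60):
    """Root of Kendall's sum for X'' = y + m*x and Y'' = y - m*x by bisection."""
    def s(m):
        return kendall_s([yi + m * xi for xi, yi in zip(x, y)],
                         [yi - m * xi for xi, yi in zip(x, y)])
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if s(mid) > 0:                # the sum decreases as m grows
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Each evaluation of the Kendall sum costs O(n ln n), so the overall cost is this times the fixed number of bisection steps. For an odd number of pairs the result agrees with the median of the absolute pairwise slopes.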

Confidence interval of the slope estimate
Passing and Bablok [6] gave arguments to justify a formula for the confidence intervals of the estimated slope. They noted that, in general, the variance of their slope estimator should depend on the distribution of the measured values. Numerical simulations showed that, in practice, their formula led to acceptable coverages. Nevertheless, they concluded that "the empirical derivation of this statement might seem unsatisfactory". Their estimator can be justified by deriving the confidence intervals for the slope from that of Kendall's τ.
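Writing $\tau(m) := \tau(X''(m), Y''(m))$, where $X''$ and $Y''$ may be any of the expressions listed in Table 1, the estimator for the slope $\hat m$ is the solution of the equation $\tau(\hat m) = 0$. Using the delta method,

$$\sigma_{\hat m} \approx \sigma_\tau \left| \frac{d\hat m}{d\tau} \right| \approx \sigma_\tau \, \frac{\hat m_{up} - \hat m_{low}}{2 z_{1-\alpha/2}\, \sigma_\tau} = \frac{\hat m_{up} - \hat m_{low}}{2 z_{1-\alpha/2}},$$

where in the last step the differential quotient was regularized with a McKean-Schrader type finite difference approximation [24], with $\hat m_{low}$ and $\hat m_{up}$ denoting the solutions of $\tau(\hat m_{low/up}) = \pm z_{1-\alpha/2}\, \sigma_\tau$. $z_{1-\alpha/2}$ is the quantile of level $1 - \alpha/2$ of the normal distribution. Hence, an asymptotic confidence bound for $m$ is

$$\hat m_{low} \le m \le \hat m_{up}.$$

If $m$ is the true value of the slope, the variance of $\tau(m)$ is

$$\sigma_\tau^2 = \frac{2(2n+5)}{9n(n-1)}, \tag{10}$$

if $x_i''$ and $y_i''$ are statistically independent, and all the $y_i''$ are from the same distribution. With these assumptions, the estimates for the slope's confidence intervals coincide with the result of Passing and Bablok.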
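Under the independence assumption, the confidence bounds reduce to order statistics of the pairwise slopes. The sketch below adapts the rank-based interval of Passing and Bablok [6] to the absolute slopes of the ePBR; this adaptation, the function name, and the clipping of the ranks to the valid range are assumptions made for illustration. It uses the well-known null variance n(n − 1)(2n + 5)/18 of Kendall's concordance sum without ties.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def epbr_slope_ci(x, y, alpha=0.05):
    """Rank-based confidence bounds for the ePBR slope (no tie correction)."""
    s = sorted(abs((yj - yi) / (xj - xi))
               for (xi, yi), (xj, yj) in combinations(zip(x, y), 2)
               if xj != xi)
    n, n_s = len(x), len(s)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # z times the standard deviation of Kendall's sum under independence
    c = z * sqrt(n * (n - 1) * (2 * n + 5) / 18)
    m1 = max(1, round((n_s - c) / 2))     # 1-based rank of the lower bound
    m2 = min(n_s, n_s - m1 + 1)           # 1-based rank of the upper bound
    return s[m1 - 1], s[m2 - 1]
```

For very small samples the ranks are clipped to the smallest and largest slope, in which case the nominal coverage is not attained.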
However, these assumptions are quite restrictive, since they are only fulfilled if the original $\eta_i$ and $\xi_i$ are homoscedastic and from a Gaussian distribution. If the errors are small compared to the $\Delta x^*_{ij}$ or when $\gamma = m$, a symmetric distribution of the errors will be sufficient, but heteroscedasticity is still an issue. In the Theil-Sen setting, bootstrap and jackknife techniques have been recommended [25] as an alternative to the analytical confidence interval estimators given by Theil [7].
Alternatively, it is possible to derive analytical expressions for the confidence intervals of the PBR estimators without making restrictive assumptions about the distribution of the X and Y, using an expression given by Daniels and Kendall [26] and Cliff and Charlin [27] for the variance $\operatorname{Var}(\tau(m))$, referred to as Eq. (11) below. In this expression, $\tau_{ij}$ denotes, up to a normalization by the number of pairs $n(n-1)/2$, the sign product $\operatorname{sgn}(x''_j - x''_i)\, \operatorname{sgn}(y''_j - y''_i)$. The estimator depends explicitly on the measured values. To evaluate this estimator efficiently, the order of the $i, j$ is chosen so that, if $x''_i < x''_j$, then $i < j$. Then, $\sum_j \tau_{ij}$ can be seen to be proportional to the difference of the numbers of concordant and discordant pairs $(y''_i, y''_j)$ for all $j$. This number may be evaluated for all values of $i$ in one sorting step of numerical complexity $O(n \ln n)$:

1. Initialize a vector $q$ of length $n$ to 0.
2. The vector $Y''$, with elements in order of the $i$, is then ordered by decreasing size of the $y''_i$ using mergesort [22] where, during the merging of any two sub-lists, a counter is kept of the number of elements already taken from the left list.
3. When during merging an element $y''_i$ from the right list is selected, $q_i$ is increased by the current value of the counter.
4. When the complete list is sorted, then $\sum_j \tau_{ij} = 2(n - 2q_i)/n(n + 1)$.
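The per-point counting step can be illustrated with a small mergesort variant. The Python sketch below is not a literal transcription of the steps above: it sorts in increasing order and updates the counts for elements taken from both sub-lists, so that each point receives the number of discordant pairs it takes part in. The function names are illustrative, ties are assumed to be absent, and the normalization of the sums as well as Eq. (11) itself are deliberately left out.

```python
def discordance_counts(y):
    """d[i] = number of j for which the pair (i, j) is discordant.

    y must be listed in increasing order of the corresponding x'' values.
    """
    n = len(y)
    d = [0] * n

    def sort(idx):
        if len(idx) < 2:
            return idx
        half = len(idx) // 2
        left, right = sort(idx[:half]), sort(idx[half:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if y[left[i]] < y[right[j]]:
                d[left[i]] += j               # right items already taken are smaller
                merged.append(left[i])
                i += 1
            else:
                d[right[j]] += len(left) - i  # remaining left items are larger
                merged.append(right[j])
                j += 1
        for k in left[i:]:
            d[k] += j                         # every right item is smaller
        merged += left[i:] + right[j:]
        return merged

    sort(list(range(n)))
    return d


def concordance_sums(y):
    """Concordant minus discordant pair counts for every point i."""
    n = len(y)
    return [n - 1 - 2 * di for di in discordance_counts(y)]
```

The whole vector of row sums is obtained in a single O(n ln n) pass, which is what makes the evaluation of a variance estimate of this type feasible for large n.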
A numerical comparison (cf. Section 4) shows that the variance calculated with Eq. (11) for heteroscedastic, Gaussian errors with constant coefficient of variation is only 9.4% greater than the variance for homoscedastic error, and the corresponding difference in the standard deviation is 4.6%. For moderate sample sizes, this difference will be below the statistical uncertainty of Eq. (11) and probably justifies the use of the simpler expression used by Passing and Bablok in most practical situations. Nevertheless, Daniels and Kendall [26] derived a tight upper bound $2/n$ for the variance from Eq. (11) which, asymptotically for large $n$, is 4.5 times larger than the variance from Eq. (10).
Finally, in the context of estimating the variance of $\hat m$, the efficiency of the PBR estimation shall be addressed. The additional condition to the general assumptions made in Section 2.1 for the estimate Eq. (10) to hold is that the $x_i^*$, $y_i^*$, $\eta_i$, and $\xi_i$ follow a normal distribution. Under this condition, the "twin" method LPR of the ePBR, which is derived here from the condition of Pearson correlation diminishing to 0, can also be derived from maximization of likelihood [28] and therefore has maximal asymptotic efficiency. The relative asymptotic efficiency of ePBR is then expressible as the ratio $\operatorname{Var}(\hat m_{LPR})/\operatorname{Var}(\hat m_{ePBR})$ of the variances of the two estimators [8]. The standard deviation of the LPR estimate can be calculated in the same way as the corresponding standard deviation of the ePBR,

$$\sigma_{\hat m_{LPR}} \approx \sigma_r \left| \frac{d\hat m}{dr} \right|,$$

where $r(X''(m), Y''(m))$ is Pearson's coefficient of correlation and $\sigma_r$ its standard deviation. Asymptotically [26], $\sigma_r = 1/\sqrt{n}$. Also, for normally distributed X and Y, $r = \sin(\pi\tau/2)$, which allows relating the derivatives of $\hat m$ with respect to $r$ and $\tau$. With $\sigma_\tau^2 \approx 4/(9n)$ from Eq. (10) and $dr/d\tau = \pi/2$ at $\tau = 0$, this yields the relative asymptotic efficiency

$$\frac{\operatorname{Var}(\hat m_{LPR})}{\operatorname{Var}(\hat m_{ePBR})} = \frac{\sigma_r^2}{\sigma_\tau^2 \, (dr/d\tau)^2} = \frac{9}{\pi^2} \approx 0.91.$$

So, at least for normally distributed measurements and errors, the asymptotic relative efficiency will be nearly optimal. The same relative asymptotic efficiency is also found for Theil-Sen regression [8].

Confidence intervals for the intercept and bias
The estimators for the CI of the intercept proposed by Passing and Bablok,

$$\hat b_{low/up} = \operatorname{median}\{\, y_i - \hat m_{up/low}\, x_i \mid i = 1, \dots, n \,\},$$

do not seem acceptable, since with a shift of the origin of the variable x, the two limits can be made to coincide, so that the CI has length 0. One would rather expect the width of the CI to run through a minimum near the center of gravity of the $x_i$, as in simple linear regression. Passing and Bablok propose the following estimator for the intercept:

$$\hat b = \operatorname{median}\{\, y_i - \hat m x_i \mid i = 1, \dots, n \,\},$$

or, with a statistic $\zeta(b, m)$ [...], as a solution of [...]. The intercept $\hat b$ is the bias of y at $x = 0$. Here, $F_{Y''}$ is the empirical cumulative distribution function of the $Y'' = Y - \hat m X$. Neglecting a small possible dependence of the $y''_i$ for different $i$ due to the estimation of $\hat m$, the variance of [...]

In clinical chemistry, the bias at the medical decision point $x_c$ is of special interest. For the calculation of its standard error, the delta method is proposed [...], with $V$ being the covariance matrix of $\zeta(b, m)$ and $\tau(m)$. It has entries $V_{11} = \sigma_\zeta^2$, $V_{22} = \sigma_\tau^2$ (cf. Eq. (10)) and [...]. This expression for the covariance can be derived using the fact that both $\zeta$ and $\tau$ are U-statistics [29]. It can be seen to quantify heteroscedasticity of the data and can also be expressed in terms of Kendall correlation coefficients, $\operatorname{cov}(\zeta, \tau)$ [...], where $n_\pm$ are the numbers of positive and negative $y''_i$, respectively, and $\tau_\pm = \sum_{i<j} \tau_{ij}$, where the sum is extended over those $i$ for which $y''_i > 0$ or $y''_i < 0$, respectively. The matrix $Q$ is the Jacobian [...]. For all appearing derivatives, McKean-Schrader type regularized approximations are used:

- For $\sigma_\tau \left| d\hat m/d\tau \right| = \sigma_{\hat m}$, the expression derived in the previous section.
Thus the final expression for the variance of the bias at the medical decision point becomes [...]. Here, the variance is minimal for $x_c^{min}$ [...]. Hence, confidence intervals for the bias at the medical decision point can be obtained as [...].

The method has been tested in numerical simulations. Results are shown (Figure 3) for a uniform distribution of 200 "true" $x^* = y^*$ values (i.e., $m = 1$ and $b = 0$) in the range (0, 1000) with a Gaussian error in x and y with constant coefficient of variation $cv_x = cv_y = 0.1$. The simulation was repeated 1000 times to estimate the standard deviations of the estimated slope, intercept, and bias at the medical decision point, as well as the coverage of the true parameter values, cf. Table 2. Furthermore, in Figure 3, the simulated estimates of the bias are shown together with their standard deviation and the theoretical curve for the standard deviation derived from the mean values of $\operatorname{Var}(\hat m)$, $\operatorname{Var}(\hat B_c^{min})$, and $x_c^{min}$.

Discussion
Most classical regression methods, including the ones which allow for imprecision in both variables such as Deming or geometric mean regression, can be derived from the condition that Pearson's correlation diminishes to 0 for a specific scaling and rotation of the data. The slope of interest is thereby a parameter of the scaling or rotation. It has been shown that a replacement of Pearson's correlation by Kendall's tau yields interesting robust regression estimates. While this is well known for TSR, the argument can be extended to PBR; the ePBR [10] can be derived this way, and it has been shown to be the analog of geometric mean regression. It is easier to calculate than the cPBR estimate, since it is merely the median of the absolute values of the slopes, $\hat m = \operatorname{median}(|S_{ij}|)$, and it has the advantage that the estimator is equivariant under a scaling of the axes, while the cPBR estimator could only be determined for slopes approximately equal to 1.
The cPBR estimator appears as an approximation to the ePBR and has been shown to be the equivalent of Roos regression. Therefore, use of the ePBR seems to be preferable over that of the cPBR. Furthermore, as the PBR methods are derivable from Kendall correlation, it would be natural to report also Kendall's τ as a measure of correlation rather than Pearson's r.
The corresponding regression method to Deming regression is shown to be equivalent to a standard problem in circular statistics, namely finding the circular median of the doubled slope angles. While this is a standard method in fields like geology or ecology, with proven robustness and unbiasedness, it has apparently not been applied in clinical chemistry. Consistency of this estimator has here only been proven for errors following a normal distribution. The effect of deviations from normality on the bias of the estimator needs to be evaluated in simulation studies; nevertheless it is expected to be lower than in Pearson regression. The method may be a viable alternative to PBR especially when the standard deviations are known and their ratio is very different from the expected slope.
Since the PBR estimator is derived from Kendall's tau statistic, which is a member of the class of U-statistics, it has furthermore been possible to derive an analytical expression for the variance of the intercept. This is of special importance in clinical chemistry, where the intercept obtained after shifting the origin of the coordinate axes to the medical decision point is known as the bias at the medical decision point.
Finally, the insight that the PBR estimator can be obtained as a root of Kendall's $\tau$ yields a fast algorithm of numerical complexity $O(n \ln n)$, as compared to $O(n^2)$ for the original shifted median algorithm, so that PBR estimation also becomes possible for larger data sets. It is planned to include the algorithms described in a future version of the R [30] package "MCR" [31] to be deposited in the CRAN repository.