On the accuracy of estimating fatigue notch factors

Materials Testing 60 (2018) 10 © Carl Hanser Verlag, München

Christian Müller, Ingolstadt, Michael Wächter, Rainer Masendorf and Alfons Esderts, Clausthal-Zellerfeld, Germany

Dedicated to Professor Dr.-Ing. Harald Zenner on the occasion of his eightieth birthday

The fatigue notch factor is the quotient of the nominal stress endurance limits of unnotched and notched specimens. It represents the notch sensitivity of a material or a component under cyclic loading. For example, the fatigue notch factor is used in analytical strength assessments with nominal stress and is also suitable as a quality criterion for materials and entire components. On the one hand, an analytical approach to rate the reliability of the fatigue notch factor is derived assuming a log-normal distribution for the populations of the endurance limits of the unnotched and notched specimens. The analytical approach provides the best achievable result and can be used to rate real experiments. On the other hand, the staircase method, which is often used to determine endurance limits, is simulated using Monte Carlo simulations. From the simulations, information about the reliability of the real test can be gathered and compared to the analytical approach. In summary, this paper examines the reliability of the quotient of two originally log-normally distributed random variables, each estimated in its own staircase test. The result is compared to an analytical approach. Former articles considered only single test series; the examination of error propagation, e.g., when the staircase method is used twice and the quotient of the two results is formed afterwards, has been missing. The staircase method is a procedure to estimate fatigue notch factors with reliabilities comparable to those of the analytical approach. To increase the number of evaluable tests, runouts should be reused.

Components are subject to various kinds of loads, e.g., forces, moments or pressures. These loads can be quasi-static, e.g., special or misuse loads, or cyclic, e.g., the rotating bending of a wheelset axle. Metallic components exhibit higher strength values under quasi-static loads than under cyclic loads. Components that experience a large number of cycles are usually designed against the endurance limit [1]. A component can theoretically withstand stress amplitudes at the endurance limit for an infinite number of cycles without failure. Typical components designed against the endurance limit are, for example, wheelset axles, con-rods or crankshafts. For several alloys, a distinct endurance limit cannot be observed; instead, the strength decreases continuously or gradually with an increasing number of cycles [2]. In these cases, the endurance limit is only valid for a chosen runout number of cycles N_G, e.g., N_G = 10^7.

Real components usually contain notches, e.g., holes, shoulders and other changes in the cross section. In terms of nominal stress, notched components have lower endurance limits than unnotched ones. The quotient of the nominal stress endurance limits of unnotched (S_aL) and notched (S_aLn) specimens is defined as the fatigue notch factor K_f, as in Equation (1).

$$ K_f = \frac{S_{aL}}{S_{aLn}} \tag{1} $$

The fatigue notch factor K_f is influenced by geometry and material [3, 4]. The S-N curve is represented in the S-N diagram, in which the stress amplitude S_a is plotted against the endurable number of cycles N on double-logarithmic axes. In the S-N diagram, the fatigue notch factor K_f can be illustrated as the distance between the endurance limits of unnotched and notched specimens, as shown in Figure 1.
Analytical strength assessments with nominal stress [5-7] rely on the fatigue notch factor. In these cases, the fatigue notch factor is used to calculate the component's endurance limit based on an unnotched material specimen or to adapt the endurance limit of a component if the alloy is changed.
In addition, the fatigue notch factor can serve as a criterion in quality assurance. Considering specimens with the same stress concentration factor K_t made of different materials, the fatigue notch factor K_f provides information about the notch sensitivity of the alloy [8].
The fatigue notch factor K_f can be estimated depending on material and geometry.
In this paper, the experimental estimation of the fatigue notch factor is examined.
The endurance limits of notched and unnotched specimens are random variables with usually unknown distribution functions. In the field of fatigue, log-normal distributions for stress and strength variables are often assumed [9, 11, 15, 16, 19-21].
Müller performed a statistical investigation on a large database to validate the assumption of a log-normal distribution in the linear region of the S-N curve (10^4 ≤ N ≤ 10^6) [14, 22]. Based on the papers mentioned above, log-normal distributions are assumed for the endurance limits in this study.
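The log-normal distribution works similarly to a normal distribution with logarithmized attributes [23]. If the endurance limits S_aL and S_aLn are log-normally distributed, their logarithms lg(S_aL) and lg(S_aLn) are normally distributed. Taking the logarithm of Equation (1) gives Equation (7).

$$ \lg(K_f) = \lg(S_{aL}) - \lg(S_{aLn}) \tag{7} $$

The logarithm of the fatigue notch factor K_f is therefore the difference of the two normally distributed random variables (inserting Equation (5) and Equation (6) into Equation (7)).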
Assuming statistically independent random variables, the difference of the two normally distributed random variables is again normally distributed, with a mean equal to the difference of the two means and a variance equal to the sum of the two variances, as shown in Equation (8) [24].

If the estimation is performed, some estimates will overestimate the population value and others will underestimate it. For the example of n = 4, the 90 %, 50 % and 10 % quantiles of the probability density function are illustrated in Figure 2 (gray lines). With a sample size of n = 4 at the start of the experiment, 90 % of the estimates overestimate the population value by a factor of less than 1.13. At the 50 % quantile, the population is neither over- nor underestimated, meaning the estimation is unbiased. At the 10 % quantile, the factor is 0.88, i.e., 10 % of the estimates fall below 0.88 times the population value. This evaluation can be repeated for other sample sizes, and the associated quantiles (here 90 %, 50 % and 10 %) can be calculated. For these, only the quantiles of 90 %, 50 % and 10 % are shown (gray lines) in Figure 2.

[...] notch factor estimation. Considering a sample size of n = 3, the population value is overestimated by a factor of up to 1.16 at the 90 % quantile, as seen in Figure 3a. Using an upper bound of 99 %, Figure 3b shows that the population value is overestimated by a factor of up to 1.30. To achieve confident values for a sample size of n = 3, safety factors of 1.16 (confidence level 90 %) and 1.30 (confidence level 99 %) are therefore necessary. In a real experiment conducted according to the staircase method, the required safety factors will be larger.
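The quantile factors quoted above can be checked with a short calculation. The following is a minimal sketch, not the authors' implementation. It assumes that each mean is estimated from n log-values with variance s²/n, that the two variances add for the difference, and that both populations have a log-standard deviation of 0.06. That value is borrowed from the worked example later in the paper and is an assumption here, as are the function and parameter names.

```python
from math import sqrt
from statistics import NormalDist


def quantile_factor(n, p, s_log_unnotched=0.06, s_log_notched=0.06):
    """Factor between the p-quantile of the estimated K_f and the true K_f.

    Assumption: the default log-standard deviations of 0.06 are illustrative
    values, not parameters confirmed for Figures 2 and 3.
    """
    # Standard deviation of lg(K_f,sample): each mean of n log-values has
    # variance s^2 / n, and the variances of two independent means add.
    s_lg_kf = sqrt((s_log_unnotched**2 + s_log_notched**2) / n)
    # Standard normal quantile for the probability p.
    z = NormalDist().inv_cdf(p)
    # Back-transform from the lg scale to a multiplicative factor.
    return 10 ** (z * s_lg_kf)


for n, p in [(4, 0.90), (4, 0.50), (4, 0.10), (3, 0.90), (3, 0.99)]:
    print(f"n = {n}, quantile {p:.0%}: factor {quantile_factor(n, p):.2f}")
```

Under these assumptions the factors round to 1.13 and 0.88 for n = 4, and to 1.16 and 1.30 for n = 3, which matches the values read from Figures 2 and 3.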

The staircase method to estimate the endurance limit
The means (P_f = 50 %) of the endurance limits of the unnotched and the notched specimens must be known to determine the fatigue notch factor experimentally. Different experiments can be designed to estimate the endurance limit, such as the staircase method [9], the boundary method [25] or the probit method [26, 27]. The mean is estimated most reliably using the staircase method [14]. Consequently, the other methods are not considered here. The following results are only valid for a log-normally distributed population of the endurance limit.
Principle of the staircase method. To plan and execute a staircase test, the expected region of the endurance limit has to be separated into discrete stress levels or staircases, as in [1]. If a log-normal distribution is assumed, a constant staircase factor d between the staircases is necessary (see Figure 4). If a logarithmic scale is used, the staircases will appear equally spaced.
The experiment can be started at an arbitrary stress level, and each specimen is tested until it fails or reaches the runout number of cycles N_G. If a specimen reaches the runout number of cycles without failure, it is counted as a runout; otherwise, it is a failure. After a runout, the next specimen is tested at the next-highest stress level. After a failure, the next specimen is tested at the next-lowest stress level (see Figure 4). The procedure is repeated until all specimens are tested. The value determined for the endurance limit is only valid for the chosen runout number of cycles.
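The staircase rule can be written in a few lines. This is a minimal sketch assuming a constant staircase factor d > 1 between neighboring levels; the function name, the starting stress of 200 MPa and the sequence of outcomes are hypothetical and serve only as an illustration.

```python
def next_stress_level(stress, failed, d):
    """Staircase rule: step down by the factor d after a failure, up after a runout."""
    return stress / d if failed else stress * d


# Hypothetical test sequence: the outcomes below are made up for illustration.
stress = 200.0  # assumed starting stress amplitude in MPa
for failed in [False, False, True, False, True]:
    outcome = "failure" if failed else "runout"
    print(f"{stress:7.2f} MPa -> {outcome}")
    stress = next_stress_level(stress, failed, d=1.05)
```

Multiplying and dividing by d, rather than adding a fixed step, is what makes the levels equally spaced on a logarithmic scale.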
Evaluation of the staircase tests using the IABG method. The IABG method does not distinguish between failures and runouts. Consequently, a fictitious specimen can be added to the end of the staircase (see Figure 5). The stress level of the fictitious specimen is known from the staircase rule.
The first increasing or decreasing branch of the staircase has to be examined. Specimens that are not confirmed by a second test result within the staircase are omitted from the evaluation (see Figure 5). The lowest evaluable stress level is called S_a0 with the index i = 0. The stress levels are numbered in increasing order, as seen in Figure 5. The number of evaluable specimens f_i, including the fictitious one, is summed on each stress level. Using the auxiliary variables F_H from Equation (9) and A_H from Equation (10), the mean S_aL50%,sample can be derived, as in Equation (11) and Equation (12).

Figure 4: Principle of the staircase method [14]

$$ F_H = \sum_i f_i \tag{9} $$

$$ A_H = \sum_i i \cdot f_i \tag{10} $$

$$ \lg(S_{aL50\%,sample}) = \lg(S_{a0}) + \lg(d) \cdot \frac{A_H}{F_H} \tag{11} $$

$$ S_{aL50\%,sample} = 10^{\lg(S_{aL50\%,sample})} \tag{12} $$

In addition, the IABG method offers the possibility of estimating the logarithmic standard deviation. Because the logarithmic standard deviation is not needed to estimate the fatigue notch factor K_f, it is not discussed here.
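The evaluation of the mean can be sketched as follows. The sketch assumes the commonly used form of the IABG evaluation, F_H = Σ f_i, A_H = Σ i·f_i and lg(S_aL50%,sample) = lg(S_a0) + lg(d)·A_H/F_H. It also assumes one particular reading of the omission rule: leading specimens whose stress level is never tested again are dropped. The function name, the integer level indexing and the reference stress `s_ref` are hypothetical.

```python
from collections import Counter


def iabg_mean(level_idx, failed, s_ref, d):
    """Estimate the mean endurance limit from one staircase sequence.

    level_idx: integer stress-level index of each specimen in test order;
               level k has the stress amplitude s_ref * d**k
    failed:    True for a failure, False for a runout
    Returns None if no stress level is confirmed (unevaluable test).
    """
    # Fictitious specimen: its level follows from the staircase rule.
    step = -1 if failed[-1] else 1
    levels = list(level_idx) + [level_idx[-1] + step]
    # Assumed omission rule: drop leading specimens whose stress level
    # does not occur again later in the sequence.
    start = 0
    while start < len(levels) and levels[start] not in levels[start + 1:]:
        start += 1
    evaluable = levels[start:]
    if not evaluable:
        return None
    lowest = min(evaluable)                          # level of S_a0, i = 0
    counts = Counter(k - lowest for k in evaluable)  # f_i on each level i
    f_h = sum(counts.values())                       # Equation (9)
    a_h = sum(i * f for i, f in counts.items())      # Equation (10)
    s_a0 = s_ref * d ** lowest
    return s_a0 * d ** (a_h / f_h)                   # Equations (11), (12)


# Hypothetical staircase: six specimens, level 0 corresponds to 200 MPa.
mean = iabg_mean([2, 1, 0, 1, 0, 1],
                 [True, True, False, True, False, False],
                 s_ref=200.0, d=1.05)
print(f"estimated mean: {mean:.1f} MPa")
```

In this made-up sequence the fictitious specimen lands on level 2, so the counts per level are 2, 3 and 2, which gives A_H/F_H = 1 and an estimated mean one level above S_a0.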
Simulation of staircase tests. The staircase method and its evaluation methods can be investigated using Monte Carlo simulations [10, 11, 13-17, 21, 29]. In this paper, the simulation model of Müller [13, 14, 29] is used. It is only briefly summarized here; further details are given in [13, 14, 29]. The focus of the investigation is on small sample sizes 3 ≤ n ≤ 50, and it is assumed that runouts are reused once. The reuse of runouts is modeled as shown by Müller [30].
In the simulation model, the region of the endurance limit is divided into stress levels with a constant staircase factor d, as in the real experiment. The population, with its mean S_aL50%,pop and its logarithmic standard deviation s_log,aL,pop,50%, is defined before the simulation. The exact probability of failure P_f can therefore be calculated at every stress level. The first specimen can be tested (simulated) at an arbitrary stress level. The strength of the specimen is represented by a uniformly distributed random value on the interval between zero and one, which is compared to the probability of failure P_f at the corresponding stress level. If the random value is larger than P_f, the specimen is a runout; otherwise, it is a failure. The stress level of the next specimen is chosen according to the staircase rule. If the next specimen is new, a new uniformly distributed random value is assigned to it. If a runout is reused, the random value of the original runout is retained; see [13, 14, 29, 30]. Hence, it is assumed that a runout does not experience any damage or training effects; see [30]. Once all virtual specimens are used, the test can be evaluated using the IABG method described above.
In the real experiment, it is difficult to choose the initial stress level S_aL,init and the staircase factor d correctly, because the latter depends on the logarithmic standard deviation s_log,aL,pop,50% of the population. Therefore, separate log-normal distribution functions for the initial level and the logarithmic standard deviation are introduced; see [13, 14, 29, 30]. Figure 6 summarizes all simulation parameters.
The simulation of the staircase tests is a random process that has to be repeated sufficiently often to achieve reliable results. The simulation is repeated for a fixed set of parameters, as seen in Figure 6. For each repetition, the mean values S_aL50%,sample and S_aLn50%,sample are evaluated according to the IABG method, and the fatigue notch factor K_f,sample of the sample is calculated. At the end of the simulation, the number of fatigue notch factors K_f,sample is equal to the number of repetitions performed. The fatigue notch factors K_f,sample are sorted in ascending order. Using order statistics, each fatigue notch factor K_f,sample can be assigned a probability of occurrence. The pairs of values can be plotted in a probability plot, which helps to determine the desired quantiles, such as the 10 %, 50 % or 90 % quantiles.
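The simulation loop described above can be sketched in simplified form. This is not the model of Müller [13, 14, 29]: runouts are not reused, every test starts at the population mean, and the staircase factor is matched to the log-standard deviation. The population means, the plotting position (i - 0.5)/m and all names are assumptions made for illustration.

```python
import random
from math import log10
from statistics import NormalDist


def staircase_mean(n, mean, s_log, d, rng):
    """Simulate one staircase test without runout reuse; return the estimated mean."""
    pop = NormalDist(mu=log10(mean), sigma=s_log)
    level, levels = 0, []                  # level k has the stress mean * d**k
    for _ in range(n):
        p_fail = pop.cdf(log10(mean * d ** level))
        failed = rng.random() <= p_fail    # runout if the random value > P_f
        levels.append(level)
        level += -1 if failed else 1       # staircase rule
    levels.append(level)                   # fictitious specimen
    # Assumed omission rule: drop leading levels that are never tested again.
    start = 0
    while start < len(levels) and levels[start] not in levels[start + 1:]:
        start += 1
    evaluable = levels[start:]
    if not evaluable:
        return None                        # unevaluable staircase test
    # S_a0 * d**(A_H / F_H) equals mean * d**(average evaluable level index).
    return mean * d ** (sum(evaluable) / len(evaluable))


def simulate_kf(n, kf_true=1.7, s_log=0.06, repetitions=5000, seed=1):
    """Return the sorted fatigue notch factors of all evaluable repetitions."""
    rng = random.Random(seed)
    d = 10 ** s_log                        # assumed staircase factor
    s_al, s_aln = 300.0, 300.0 / kf_true   # hypothetical population means, MPa
    results = []
    for _ in range(repetitions):
        unnotched = staircase_mean(n, s_al, s_log, d, rng)
        notched = staircase_mean(n, s_aln, s_log, d, rng)
        if unnotched is not None and notched is not None:
            results.append(unnotched / notched)
    return sorted(results)


def empirical_quantile(sorted_values, p):
    """Order statistics with the assumed plotting position (i - 0.5) / m."""
    m = len(sorted_values)
    index = min(m - 1, max(0, round(p * m - 0.5)))
    return sorted_values[index]


kf = simulate_kf(n=10)
for p in (0.10, 0.50, 0.90):
    print(f"{p:.0%} quantile: K_f,sample / K_f = {empirical_quantile(kf, p) / 1.7:.3f}")
```

Dividing each quantile by the true fatigue notch factor gives the over- or underestimation factor that the figures in the following section report for the full model.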

Rating of the experimental approach for the fatigue notch factor
Amount of unevaluable fatigue notch factors. Staircase tests may become unevaluable if the initial level of the staircase is chosen poorly and, as a result, the number of available specimens does not suffice to reach a reversal point of the staircase. The number of unevaluable staircase tests can increase rapidly for sample sizes of n ≤ 10; see [30]. The effect is stronger if the log-standard deviation is small, because there is then a higher risk of choosing unsuitable initial levels, e.g., initial levels in the transition zone to the high cycle fatigue regime (ca. 10^4 ≤ N ≤ 10^6). Estimating fatigue notch factors is therefore difficult, especially for small sample sizes of n ≤ 10. To estimate a fatigue notch factor successfully, both the unnotched and the notched staircase test must be evaluable. Figure 7 illustrates this with the following example: sample size n = 3 and log-standard deviations s_log,aL,pop,50% = s_log,aLn,pop,50% = 0.02. In this example, more than 77 % of all fatigue notch factors cannot be estimated because at least one staircase test (unnotched or notched) was unevaluable. If runouts are reused, the performance improves (see Figure 7). With the reuse of runouts limited to one (unique reuse of each runout, see [30]), the share of unevaluable fatigue notch factors decreases to approximately 56 %.

Figure 5: Evaluation of the staircase test using the IABG method [14]

Figure 6: Simulation parameters for the staircase tests [14]
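Because both staircase tests must be evaluable, the share of unevaluable fatigue notch factors is larger than the share of unevaluable single tests. As a rough illustration, assume that the unnotched and the notched test become unevaluable independently and with the same probability p. The paper does not state this assumption, and the function name is hypothetical. The share of unevaluable fatigue notch factors is then 1 - (1 - p)², which can be inverted for p.

```python
from math import sqrt


def single_test_unevaluable(kf_unevaluable):
    """Per-test unevaluable probability p, solving 1 - (1 - p)**2 = kf_unevaluable.

    Assumption: both staircase tests fail to be evaluable independently
    and with the same probability.
    """
    return 1.0 - sqrt(1.0 - kf_unevaluable)


print(f"without reuse: p = {single_test_unevaluable(0.77):.2f}")
print(f"unique reuse:  p = {single_test_unevaluable(0.56):.2f}")
```

Under this assumption, the quoted shares of 77 % and 56 % correspond to roughly every second and every third single staircase test being unevaluable, respectively.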
The following results are valid for the unique reuse of runouts unless stated otherwise. Limiting the reuse of runouts to one protects the test engineer from testing an increasing number of high-strength specimens, which would lead to an overestimation of the mean, as shown in [30].
Reliability of the fatigue notch factor estimation with common log-standard deviations. Assuming that a reused runout does not experience any damage or training effects in the staircase test, the mean of the population will be overestimated, as shown in [30]. This effect is stronger with higher log-standard deviations. If the unnotched and notched specimens have the same log-standard deviation, both means are overestimated to the same extent on average, and forming the quotient of the two means cancels the overestimation, as seen in Figure 8 and Table 1. In this case, the estimation of the fatigue notch factor using the staircase method is unbiased, although runouts are reused. Therefore, the unique reuse of runouts is recommended.
In Figure 8, the abscissa is labeled the "sample size at the start of the experiment". This is because not all tested specimens may be included in the evaluation, as described above. Since the number of specimens actually evaluated is impossible to predict before testing, the number of specimens actually tested is used instead. This is also the information that is of more interest to the test engineer. It is assumed that all available specimens have actually been tested and that the test engineer has not been able to predict the outcome of the staircase. Figure 8 again confirms the good behavior of the staircase method when the mean is estimated. The results of the staircase test differ only slightly from the analytical solution. Neither the staircase factor d nor the fatigue notch factor has any significant impact on the estimation of the mean, as described in [14]. Therefore, the staircase factor is not discussed further.
To better understand the results in Figure 8, the following example is considered (values chosen arbitrarily). A test engineer wants to examine the quality of the manufacturing process. The target is a fatigue notch factor of K_f,target ≤ 1.70 for a wheelset-axle steel. The staircase tests are performed with a sample size of n = 5 for both the unnotched and the notched specimens. Any runout is reused once to minimize the number of unevaluable tests. These tests yield the result K_f,sample = 1.90. Is there a quality problem with the wheelset-axle steel?
From experience, the test engineer knows that the log-standard deviation of the population can usually be approximated as s_log,aL,pop,50% = s_log,aLn,pop,50% = 0.06. The null hypothesis is formulated as K_f,actual ≤ 1.70, and a significance level of α = 10 % is chosen. Using Figure 9a or Table 2, it can be seen that the 90 % = 1 - α quantile of all the fatigue notch factors does not overestimate the population by more than a factor of 1.16. Hence, the quality of the production process is acceptable as long as the staircase tests yield fatigue notch factors K_f,sample ≤ 1.16 × 1.70 = 1.97. The measured value K_f,sample = 1.90 is below this limit, so the null hypothesis is not rejected and the test gives no evidence of a quality problem.
Reliability of the fatigue notch factor estimation with different log-standard deviations. If runouts are reused, the overestimation of the mean in the staircase tests increases with increasing log-standard deviation, as shown in [30]. In the previous subsection, it was mentioned that this effect does not influence the fatigue notch factor if the log-standard deviations of the unnotched and notched specimens are equal.
An example has been designed to examine the influence of different log-standard deviations on the fatigue notch factor, which can serve as a limiting case in fatigue strength. The following boundary log-standard deviations are chosen for the limiting case, referring to [6, 31]: [...]

The expected overestimation of the fatigue notch factor with the unique reuse of runouts is almost negligible concerning the mean (50 % quantile); see Figure 10b and Table 4. The observed bias is independent of the sample size and has a value of approximately 1.01. Comparing Figure 10a or Table 3 (without reuse) to Figure 10b or Table 4 (with unique reuse), the overestimation caused by the reuse of runouts becomes more obvious for quantiles other than 50 %. For very small or very large quantiles, e.g., 1 % or 99 %, see Figure 10 or Tables 3 and 4, the unique reuse adds an overestimation factor of 1.05 to the 1 % and the 99 % quantiles, respectively (ratio of "unique reuse" to "without reuse"). Again, this overestimation is almost independent of the sample size.

Conclusions
Assuming log-normally distributed populations for the unnotched and notched specimens, the reliability of the fatigue notch factor estimation has been examined. In the first step, an analytical solution has been developed to represent the ideal case.
From the analytical solution, it follows that the fatigue notch factor is itself log-normally distributed. Monte Carlo simulations were performed to simulate the staircase tests. The simulations were used to rate the reliability of the fatigue notch factor estimation, and the results were compared to the analytical solution. The focus is on small sample sizes, e.g., n ≤ 10, and the reuse of runouts within the staircase test has been considered. The reuse of runouts significantly decreased the number of unevaluable staircase tests with sample sizes of n ≤ 10. If the log-standard deviations of unnotched and notched specimens are equal, the reuse of runouts has no negative influence on the fatigue notch factor estimation. The fatigue notch factors estimated from the staircase tests are highly reliable, almost comparable to the analytical solution. An example has been given to demonstrate the benefits of the achieved results.
The influence of significantly different log-standard deviations has been discussed using a limiting case. This limiting case shows that the unique reuse of runouts causes the mean of the fatigue notch factor (50 % quantile) to be overestimated by a factor of approximately 1.01. Comparing the results of the staircase tests without runout reuse to the results with reuse for very small or very large quantiles, a slight overestimation caused by the reuse of runouts can be observed. For quantiles of 1 % and 99 %, the overestimation factor is approximately 1.05 with respect to the results without reuse. For all quantiles, the overestimation is almost independent of the sample size.