Let h(r) be the second-order h-curve in the continuous case. Its h-core (the h-top) is

$$\begin{array}{}{R}_{T}^{2}={\int}_{1}^{{h}_{T}}h(r)dr,\end{array}$$(11)

which equals the area under the h-curve within the h-top.

If the total number of units in the h-series is x, then the second-order h-tail is

$$\begin{array}{}{\mathrm{t}}_{\mathrm{T}}^{2}={\int}_{{h}_{T}}^{x}h(r)dr.\end{array}$$(12)

Historically, we have three theoretical models for estimating the h-index.

In Hirsch’s original paper (Hirsch, 2005), the mathematical model for the h-index is given as follows:

$$\begin{array}{}{\displaystyle h=\sqrt{\frac{C}{a}},}\end{array}$$(13)

where C is the total number of citations and a is a constant ranging between 3 and 5.

Egghe and Rousseau derived the Egghe–Rousseau model (Egghe and Rousseau, 2006) in the framework of Lotkaian informetrics, which can be rewritten as

$$\begin{array}{}{\displaystyle h={P}^{1/\alpha},}\end{array}$$(14)

where P is the total number of publications and α>1 is Lotka’s exponent.

Glänzel and Schubert proposed the Glänzel–Schubert model (Glänzel, 2006; Schubert and Glänzel, 2007) with the formula:

$$\begin{array}{}{\displaystyle h=c{P}^{1/3}(C/P{)}^{2/3}}\end{array}$$(15)

in which C/P is associated with the Journal Impact Factor (JIF) and c is a constant near 1.

Under Heaps’ law or Herdan’s law (Egghe, 2007), the three models can be unified (Ye, 2011), whereby the h-index is linked to total items (such as citations, C) and total sources (such as publications, P) by the formula:

$$\begin{array}{}{\displaystyle h=c{P}^{1/(\alpha +1)}(C/P{)}^{\alpha /(\alpha +1)},}\end{array}$$(16)

where α>1 is Lotka’s exponent and c>0 is a constant.
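As a minimal sketch of how Eq. (16) relates to the earlier models, the following Python snippet evaluates the unified formula and compares it with the Glänzel–Schubert form of Eq. (15). The function names and the sample values of P and C are illustrative assumptions, not data from this study. With α=2 the exponents 1/(α+1) and α/(α+1) become 1/3 and 2/3, so the two expressions should agree.

```python
def unified_h(P, C, alpha=2.0, c=1.0):
    """Evaluate Eq. (16): h = c * P^(1/(alpha+1)) * (C/P)^(alpha/(alpha+1))."""
    if P <= 0 or C < 0:
        raise ValueError("P must be positive and C non-negative")
    if alpha <= 1 or c <= 0:
        raise ValueError("Eq. (16) assumes alpha > 1 and c > 0")
    return c * P ** (1.0 / (alpha + 1.0)) * (C / P) ** (alpha / (alpha + 1.0))


def glanzel_schubert_h(P, C, c=1.0):
    """Evaluate Eq. (15): h = c * P^(1/3) * (C/P)^(2/3)."""
    return c * P ** (1.0 / 3.0) * (C / P) ** (2.0 / 3.0)


# Hypothetical inputs, chosen only for illustration.
P, C = 100, 800
print(unified_h(P, C, alpha=2.0), glanzel_schubert_h(P, C))
```

Other values of α shift weight between the size term P and the impact term C/P; the sketch makes no claim about which α fits real data.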

With H^{2} items and X sources in a second-order h-curve, this gives

$$\begin{array}{}{\displaystyle {h}_{T}=c{X}^{1/(\alpha +1)}({H}^{2}/X{)}^{\alpha /(\alpha +1)},}\end{array}$$(17)

where α>1 is the Lotkaian exponent and c>0 is a constant.

In the framework of Lotkaian informetrics, Eq. (17) can be simplified by using the Egghe–Rousseau formula, i.e.,

$$\begin{array}{}{\displaystyle {h}_{T}={X}^{1/\alpha}}\end{array}$$(18)

Using the Egghe–Rousseau formula with α=2 and P=100, we estimate h=10 according to Eq. (14). When α=2 and X=10, we estimate h_{T} ≈3.2 according to Eq. (18). This means that the first-order h-core covers 10% and the second-order h-top about 3% of the sources. The ratio of the second-order h-top to the first-order h-core is thus about 3/10=30%. Since the Egghe–Rousseau formula is highly simplified and is used only as a reference in this study, the estimated values should be treated as reference values only.
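The arithmetic of this example can be reproduced with a few lines of Python. This is only a sketch: the function name `egghe_rousseau` is an assumption, and the inputs P=100, X=10, and α=2 are the ones quoted in the text.

```python
def egghe_rousseau(n_sources, alpha=2.0):
    """Egghe-Rousseau estimate n_sources^(1/alpha), as in Eqs. (14) and (18)."""
    if n_sources < 1 or alpha <= 1:
        raise ValueError("expects n_sources >= 1 and alpha > 1")
    return n_sources ** (1.0 / alpha)


P = 100  # first-order sources (publications)
X = 10   # second-order sources, as quoted in the text

h = egghe_rousseau(P)      # first-order h-index estimate, Eq. (14)
h_top = egghe_rousseau(X)  # second-order h-top estimate, Eq. (18)

print(f"h = {h:.2f}, share of sources = {h / P:.1%}")
print(f"h_T = {h_top:.2f}, share of sources = {h_top / P:.1%}")
print(f"h_T / h = {h_top / h:.1%}")
```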

Let us denote by h_H, h_E-R, and h_G-S the Hirsch estimate, the Egghe–Rousseau estimate, and the Glänzel–Schubert estimate of the h-index, respectively. Supposing α=2, a=5, and c=1, we obtain the following estimates as theoretical reference values of the h-index (Ye, 2011):

$$\begin{array}{}{\displaystyle {h}_{H}\sim (C/5{)}^{1/2}}\end{array}$$(19)

$$\begin{array}{}{\displaystyle {h}_{E-R}\sim {P}^{1/2}}\end{array}$$(20)

$$\begin{array}{}{\displaystyle {h}_{G-S}\sim {P}^{1/3}(C/P{)}^{2/3}}\end{array}$$(21)
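For readers who want to apply Eqs. (19)–(21) to their own data, a minimal Python sketch follows. The function names and the example journal (P=100, C=800) are hypothetical; the defaults a=5, α=2, and c=1 are the values assumed in the text.

```python
def hirsch_estimate(C, a=5.0):
    """Eq. (19): h_H ~ (C / a)^(1/2)."""
    return (C / a) ** 0.5


def egghe_rousseau_estimate(P, alpha=2.0):
    """Eq. (20): h_E-R ~ P^(1/alpha)."""
    return P ** (1.0 / alpha)


def glanzel_schubert_estimate(P, C, c=1.0):
    """Eq. (21): h_G-S ~ c * P^(1/3) * (C/P)^(2/3)."""
    return c * P ** (1.0 / 3.0) * (C / P) ** (2.0 / 3.0)


# Hypothetical journal, for illustration only.
P, C = 100, 800
print("Hirsch:           ", round(hirsch_estimate(C), 2))
print("Egghe-Rousseau:   ", round(egghe_rousseau_estimate(P), 2))
print("Glanzel-Schubert: ", round(glanzel_schubert_estimate(P, C), 2))
```

Note that the Egghe–Rousseau estimate uses P alone, whereas the other two also depend on C.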

Using our empirical cases, we computed the theoretical estimates from the original data (*P* and *C*, cf. Appendix). The results are shown in Figures 4 and 5.

Figure 4 Three estimations of the h-index of Math journals.

Figure 5 Three estimations of the h-index of LIS journals.

Visually, the Glänzel–Schubert estimation and the Hirsch estimation look better than the Egghe–Rousseau estimation. The Egghe–Rousseau formula is strictly limited by α=2 in the fitting. This situation has been discussed by Ye (2011) and can be quantitatively measured by Pearson correlation coefficients. Table 1 shows that the Glänzel–Schubert estimation and the Hirsch estimation correlate more strongly with the real h than the Egghe–Rousseau estimation does.

Table 1 Pearson correlation coefficients with p-values.

The analytical results in the table reveal that both the Glänzel–Schubert estimation and the Hirsch estimation can be applied as a theoretical reference for computing the h-index.
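The kind of comparison summarized in Table 1 can be sketched as follows. This is not the computation used in this study: the `journals` list contains made-up placeholder values, the function names are assumptions, and the p-values are omitted because they would require a statistics library.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Placeholder (P, C, real h) triples; NOT the journals analysed in the paper.
journals = [(120, 900, 14), (300, 4200, 28), (80, 350, 9), (500, 6000, 33)]

real_h = [h for _, _, h in journals]
h_gs = [P ** (1 / 3) * (C / P) ** (2 / 3) for P, C, _ in journals]  # Eq. (21)
h_er = [P ** 0.5 for P, _, _ in journals]                           # Eq. (20)

print("r(real h, Glanzel-Schubert):", pearson_r(real_h, h_gs))
print("r(real h, Egghe-Rousseau):  ", pearson_r(real_h, h_er))
```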

However, in the second-order case, only the number of sources X is clearly known, so it is convenient to apply the Egghe–Rousseau estimation. The comparable results are shown in Table 2, where α=2.

Table 2 Egghe–Rousseau estimation of h-top in two cases.

We see that the Egghe–Rousseau estimates are inaccurate in both cases; both are smaller than the empirical values. An important reason for this result is the choice of α=2. Generally, 1<α<3, X ≥ 1, and h_{T} ≥ 1 are linked with α as

$$\begin{array}{}{\displaystyle \frac{\alpha}{1-\alpha}<\mathrm{log}X}\end{array}$$(22)

or

$$\begin{array}{}{\displaystyle \frac{1}{1-\alpha}<\mathrm{log}{h}_{T}.}\end{array}$$(23)

These are static results.
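As a short worked step, assuming Eq. (18) holds exactly, Eqs. (22) and (23) can be seen to express the same condition. Taking logarithms of Eq. (18) and dividing Eq. (22) by α>0 gives

$$\log h_T=\frac{1}{\alpha}\log X \quad\Longrightarrow\quad \frac{\alpha}{1-\alpha}<\log X \;\Longleftrightarrow\; \frac{1}{1-\alpha}<\frac{1}{\alpha}\log X=\log h_T .$$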
