A multi-power and multi-splitting inner-outer iteration for PageRank computation

Abstract
The inner-outer (IO) iteration has attracted wide interest as an effective method for computing PageRank since it was first proposed by Gleich et al. (2010). In this paper, we present a variant of the IO iteration, denoted by MPMIO, which is based on multi-step power and multi-step splitting iterations. Its description and convergence are discussed in detail. Numerical examples are given to illustrate the effectiveness of the proposed method.


Introduction
With the fast development of the Internet, web search engines have become one of the most important Internet tools for information retrieval. How to rank the relevant web pages plays a significant role in this field. Among thousands of search engines, Google is one of the most popular and successful, which is mainly attributed to its effective algorithm, PageRank.
PageRank, a link-based algorithm formulated by Page et al. [1], gives a ranked list of the importance of pages related to the user's query terms. The hyperlink structure of the web can be viewed as a directed graph and modeled by a Markov chain:
$$Ax = x, \qquad A = \alpha P + (1-\alpha) v e^T, \tag{1}$$
where $\alpha \in (0, 1)$ is the damping factor, $P \in \mathbb{R}^{n \times n}$ is a column stochastic matrix, $e$ is a column vector of all ones, and $v$ is called the personalization or teleportation vector and is set as $v = e/n$. In particular, $A$ is defined as the Google matrix and is both irreducible and aperiodic, which implies that there exists a unique right nonnegative eigenvector $x$ for (1). For more details about the PageRank algorithm, we refer the reader to [1][2][3][4].
Given the huge size and density of the Google matrices, computing PageRank is demanding in computational resources, and only a small set of computational tools is practical. The power method was first considered for computing PageRank because of its stable and reliable performance. However, when the largest eigenvalue of the matrix $A$ is not well separated from the second one, the power method converges slowly. Several acceleration techniques have been proposed to speed up its convergence, such as vector extrapolation [5][6][7][8], Arnoldi-type [9,10], aggregation/disaggregation [11], lumping [12,13], adaptive methods [4,14] and inner-outer (IO) iteration methods [15][16][17][18][19][20].
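To make the baseline concrete, the following is a minimal Python sketch of the power method applied to the fixed-point form $x = \alpha P x + (1-\alpha)v$, which coincides with $Ax = x$ whenever $e^T x = 1$. The function name, the default tolerance and the 3-page test matrix are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def pagerank_power(P, alpha=0.85, tol=1e-8, max_iter=1000):
    """Power method for x = alpha*P*x + (1-alpha)*v with v = e/n.

    P is assumed to be a column-stochastic NumPy array.
    Returns the approximate PageRank vector and the iteration count.
    """
    n = P.shape[0]
    v = np.full(n, 1.0 / n)      # uniform teleportation vector v = e/n
    x = v.copy()                 # initial guess x_0 = v
    for k in range(1, max_iter + 1):
        x_new = alpha * (P @ x) + (1.0 - alpha) * v
        # Stop when two successive iterates agree in the 1-norm.
        if np.linalg.norm(x_new - x, 1) < tol:
            return x_new, k
        x = x_new
    return x, max_iter

# Hypothetical 3-page web: column j holds the out-link probabilities of page j.
P = np.array([[0.0, 0.5, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
x, iters = pagerank_power(P)
```

Each step costs one matrix-vector product, and the error contracts by a factor of at most $\alpha$ per step, which is why the method slows down as $\alpha$ approaches 1.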
Recently, researchers have focused on the linear system formulation of PageRank, and the corresponding iterative methods have attracted attention [3,15-17,20]. Gleich et al. [15] proposed an IO iteration method, which is implemented by solving a linear system with a lower damping factor and an algebraic structure similar to the original one, and Gu et al. [17] put forward an improved algorithm, i.e., the power-inner-outer (PIO) iteration. In this paper, we propose a variant of the IO (PIO) iteration that combines multi-step power and multi-step splitting with the IO iteration to accelerate the computation of PageRank.
The remainder of this paper is structured as follows. In Section 2, we briefly review the mechanism of the IO iteration for the PageRank problem. In Section 3, we introduce the proposed algorithm and investigate its convergence properties in detail. Numerical examples are given in Section 4 to illustrate the effectiveness of the method. Finally, some conclusions are drawn in Section 5.

The IO(PIO) iteration
In this section, we begin by briefly introducing the derivation of the IO iteration by Gleich et al. [15].
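With the normalization $e^T x = 1$, the eigenvalue problem $Ax = x$ can be rewritten as the linear system
$$(I - \alpha P)x = (1-\alpha)v. \tag{2}$$
It is generally agreed that the smaller the damping factor is, the easier it is to solve the PageRank problem. In light of this, Gleich et al. [15] reformulated (2) as
$$(I - \beta P)x = (\alpha - \beta)Px + (1-\alpha)v, \tag{3}$$
where $\beta \in (0, \alpha)$. This leads to the following outer stationary iteration:
$$(I - \beta P)x_{k+1} = (\alpha - \beta)Px_k + (1-\alpha)v, \quad k = 0, 1, 2, \ldots, \tag{4}$$
where $x_0 = v$ is the initial guess, though other choices are possible too. However, solving this linear system with coefficient matrix $I - \beta P$ is still computationally difficult, even though $\beta$ is small. Therefore, the inner Richardson iteration is used to compute an approximation of $x_{k+1}$. Setting the right-hand side of (4) as
$$f = (\alpha - \beta)Px_k + (1-\alpha)v, \tag{5}$$
the inner linear system is defined as
$$(I - \beta P)y = f, \tag{6}$$
which is computed by the inner iteration
$$y_{j+1} = \beta P y_j + f, \quad j = 0, 1, 2, \ldots, l-1, \tag{7}$$
where $y_0 = x_k$ is the initial guess and the computed approximate solution $y_l$ is assigned to the new $x_{k+1}$. The stopping criteria of the outer iteration and the inner iteration are, respectively, given as
$$\|(1-\alpha)v - (I - \alpha P)x_{k+1}\|_1 < \tau \quad \text{and} \quad \|f - (I - \beta P)y_{j+1}\|_1 < \eta,$$
where the parameters $\tau$ and $\eta$ are the outer and inner tolerances, respectively. Based on the above discussion, a basic IO iterative algorithm for the PageRank problem has been proposed in [15], and its convergence has also been analyzed therein.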
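The inner-outer scheme described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the outer system $(I-\beta P)x_{k+1} = (\alpha-\beta)Px_k + (1-\alpha)v$ is solved inexactly by Richardson sweeps $y \leftarrow \beta P y + f$ and that both residuals are measured in the 1-norm; the function name, the defaults ($\beta = 0.5$, $\eta = 10^{-2}$), the `max_mv` safeguard and the 3-page matrix are assumptions made for the example.

```python
import numpy as np

def inner_outer(P, alpha=0.85, beta=0.5, tau=1e-8, eta=1e-2, max_mv=100000):
    """Inner-outer iteration for (I - alpha*P) x = (1 - alpha) v, v = e/n.

    P is assumed column stochastic. Returns the iterate and the number of
    matrix-vector products used.
    """
    n = P.shape[0]
    v = np.full(n, 1.0 / n)
    x = v.copy()                 # x_0 = v
    y = P @ x                    # y always stores P @ x
    mv = 1
    # Outer test: 1-norm residual of (I - alpha*P) x = (1 - alpha) v.
    while np.linalg.norm(alpha * y + (1 - alpha) * v - x, 1) >= tau and mv < max_mv:
        f = (alpha - beta) * y + (1 - alpha) * v   # right-hand side of the outer step
        while True:                                # at least one inner sweep
            x = f + beta * y                       # Richardson sweep for (I - beta*P) y = f
            y = P @ x
            mv += 1
            # Inner test: 1-norm residual of (I - beta*P) x = f.
            if np.linalg.norm(f + beta * y - x, 1) < eta:
                break
    return x, mv

# Hypothetical 3-page column-stochastic matrix used only for illustration.
P = np.array([[0.0, 0.5, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
x_io, mv_io = inner_outer(P)
```

Keeping `y = P @ x` up to date means that each inner sweep and each residual check together cost a single matrix-vector product.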
From (2) and by the power method, Gu et al. [17] obtained the Richardson iteration
$$x_{k+1} = \alpha P x_k + (1-\alpha)v, \quad k = 0, 1, 2, \ldots, \tag{8}$$
and then proposed the PIO iteration as follows:
$$\begin{cases} x_{k+\frac{1}{2}} = \alpha P x_k + (1-\alpha)v, \\ (I - \beta P)x_{k+1} = (\alpha - \beta)P x_{k+\frac{1}{2}} + (1-\alpha)v, \end{cases} \quad k = 0, 1, 2, \ldots, \tag{9}$$
where $x_0$ is the initial guess and $\beta \in (0, \alpha)$. The first iteration of (9) is easy to implement. For the second iteration of (9), Gu et al. use the IO iteration to get the next approximation $x_{k+1}$. They then proved the superiority of the PIO iteration over the power method and the IO iteration, and gave the following convergence properties of the PIO method.
Theorem 2.1. [17] The iteration matrix $T(\alpha, \beta)$ of the PIO iteration (9) is given by
$$T(\alpha, \beta) = \alpha(\alpha - \beta)(I - \beta P)^{-1}P^2,$$
and the modulus of its eigenvalues is bounded by
$$\sigma = \frac{\alpha(\alpha - \beta)}{1 - \beta}.$$
Therefore, the spectral radius satisfies $\rho(T(\alpha, \beta)) \le \sigma < 1$, i.e., the PIO iteration converges to the unique solution $x^*$ of (2) for any initial vector $x_0$.
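The form of the iteration matrix and the eigenvalue bound can be checked directly. The derivation below is a sketch, assuming the PIO iteration consists of the power step $x_{k+1/2} = \alpha P x_k + (1-\alpha)v$ followed by an exact solve of $(I-\beta P)x_{k+1} = (\alpha-\beta)Px_{k+1/2} + (1-\alpha)v$; the symbols $u$, $\lambda$ and $\mu$ are introduced here for illustration only.

```latex
% Substitute the power step into the splitting step and solve for x_{k+1}:
\begin{align*}
(I-\beta P)\,x_{k+1} &= (\alpha-\beta)P\bigl(\alpha P x_k + (1-\alpha)v\bigr) + (1-\alpha)v, \\
x_{k+1} &= \underbrace{\alpha(\alpha-\beta)(I-\beta P)^{-1}P^{2}}_{T(\alpha,\beta)}\,x_k
          + (1-\alpha)(I-\beta P)^{-1}\bigl((\alpha-\beta)P + I\bigr)v .
\end{align*}
% If P u = \lambda u, then |\lambda| \le 1 because P is column stochastic,
% and T(\alpha,\beta) u = \mu u with
\begin{align*}
\mu = \frac{\alpha(\alpha-\beta)\lambda^{2}}{1-\beta\lambda}, \qquad
|\mu| \le \frac{\alpha(\alpha-\beta)|\lambda|^{2}}{1-\beta|\lambda|}
      \le \frac{\alpha(\alpha-\beta)}{1-\beta} < 1 .
\end{align*}
```

The first inequality uses $|1-\beta\lambda| \ge 1-\beta|\lambda|$, and the last one holds because $\alpha < 1$ implies $\alpha - \beta < 1 - \beta$.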
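Theorem 2.2. [17] Suppose the second iteration of the PIO iteration is exact and $0 < \alpha < 1$. Then the PIO iteration converges for any $\beta \in (0, \alpha)$. Moreover,
$$\|x_{k+1} - x^*\|_1 \le \frac{\alpha(\alpha - \beta)}{1 - \beta}\|x_k - x^*\|_1$$
and
$$\|x_{k+1} - x_k\|_1 \le \frac{\alpha(\alpha - \beta)}{1 - \beta}\|x_k - x_{k-1}\|_1.$$

The multi-power and multi-splitting IO iteration for PageRank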

The MPMIO iteration
It is not hard to see that the first step of the PIO iteration (9) is the classical power method. In this way, the PIO iteration can be understood as the combination of a one-step power method and the IO iteration. This inspires us to extend the IO (PIO) iteration to more efficient methods. First, we combine the multi-step power method with the IO iteration. Then, for the second iteration of (9), we consider a multi-step splitting iteration. The resulting variant of the IO (PIO) algorithm is called MPMIO. The details are given in the remainder of this subsection, and the convergence properties are described in Section 3.2.
Given an initial guess $x_0$, we first expand the first step of (9) to an $m$-step power iteration, where $\alpha \in (0, 1)$, $\beta \in (0, \alpha)$, $v = e/n$, and $m$ ($\ge 2$) is the number of steps of the power method. Now, by introducing two parameters $\beta_1$ and $\beta_2$, with $\beta_i \in (0, \alpha)$, $i = 1, 2$, we expand the second step of (9) to a multi-splitting iteration. Taken together, these give the proposed multi-power and multi-splitting IO iteration (16).
Multi-power and multi-splitting IO iteration. Given an initial guess $x_0$, compute the iterates of (16) for $k = 0, 1, 2, \ldots$ until $x_k$ converges.
The first $m$ power iterations of (16) can be implemented easily, and the IO iteration can be used for the second stage. Setting the right-hand side of the first splitting iteration of (16) as $f_1$, we get the first inner iteration; similarly, setting the right-hand side of the second splitting iteration to $f_2$, we get the second inner iteration, where we take $y_0 = x_{k+\frac{m}{m+1}}$ as the initial guess and assign the computed approximation $y_l$ to the new $x_{k+1}$. We then switch back to the $m$-step power method and repeat the procedure until the desired PageRank vector is obtained. For the whole iteration, we stop the outer iteration once the residual of the linear system (2) falls below the outer tolerance $\tau$, and we use the residual of the inner linear system with the inner tolerance $\eta$ as the inner stopping criterion. The main algorithm of this paper is shown in Algorithm 1.

Algorithm 1 (MPMIO iteration)

Lines 1 and 2 of Algorithm 1 initialize $x = v$ and $y = Px$. The $m$-step iterations of (16) are done in lines 4-7. Lines 8-10 are used for the computation of $f_2$. The inner iteration is implemented in lines 11-14. We use the repeat-until clause to ensure that at least one inner iteration is performed. To terminate the algorithm, line 3 checks the residual of the outer linear system (2), and line 14 checks the stopping criterion of the inner iteration. At the end of Algorithm 1, a single power method step is used for the possible benefits described in [21].
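As an illustration of the scheme described above, the following Python sketch implements one plausible reading of it. It is not Algorithm 1 itself; it rests on these assumptions: (i) each outer iteration performs $m$ power steps $x \leftarrow \alpha P x + (1-\alpha)v$; (ii) two splitting systems $(I-\beta_i P)x = (\alpha-\beta_i)Px + (1-\alpha)v$, $i = 1, 2$, are then solved inexactly by Richardson sweeps, the second warm-started from the result of the first; (iii) the same inner tolerance $\eta$ is used for both; (iv) the final power step is omitted. The function names, default parameter values and the 3-page matrix are hypothetical.

```python
import numpy as np

def richardson(P, f, beta, y, eta):
    """Inexactly solve (I - beta*P) x = f by sweeps x <- f + beta*P*x.

    y must hold P @ x for the current x. Returns the new x, P @ x and the
    number of matrix-vector products used.
    """
    mv = 0
    while True:                                   # at least one sweep
        x = f + beta * y
        y = P @ x
        mv += 1
        if np.linalg.norm(f + beta * y - x, 1) < eta:
            return x, y, mv

def mpmio_sketch(P, alpha=0.85, beta1=0.5, beta2=0.6, m=5,
                 tau=1e-8, eta=1e-2, max_outer=1000):
    """Multi-power / multi-splitting inner-outer sketch (assumed form)."""
    n = P.shape[0]
    v = np.full(n, 1.0 / n)
    x = v.copy()
    y = P @ x
    mv = 1
    for _ in range(max_outer):
        # Outer test: 1-norm residual of (I - alpha*P) x = (1 - alpha) v.
        if np.linalg.norm(alpha * y + (1 - alpha) * v - x, 1) < tau:
            break
        for _ in range(m):                        # m power steps
            x = alpha * y + (1 - alpha) * v
            y = P @ x
            mv += 1
        for beta in (beta1, beta2):               # two splitting stages
            f = (alpha - beta) * y + (1 - alpha) * v
            x, y, used = richardson(P, f, beta, y, eta)
            mv += used
    return x, mv

# Hypothetical 3-page column-stochastic matrix used only for illustration.
P = np.array([[0.0, 0.5, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
x_mp, mv_mp = mpmio_sketch(P)
```

Every update in the sketch preserves $e^T x = 1$, so no renormalization is needed, and storing `y = P @ x` keeps the cost at one matrix-vector product per power step or inner sweep.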

Convergence analysis of the MPMIO iteration
In this subsection, we study the convergence properties of our new algorithm and pay particular attention to its superiority over the power method, the IO iteration and the PIO iteration.
Returning to (16), we substitute the result of the first $m$ power iterations into the last two steps of (16), which involve the two parameters $\beta_1$ and $\beta_2$. This expresses $x_{k+1}$ in terms of $x_k$ and yields the iteration matrix (24). Thus, we obtain the following theoretical results on the convergence of the MPMIO iteration (16).
Theorem 3.1. The iteration matrix $T_m(\alpha, \beta_1, \beta_2)$ of the MPMIO iteration (16) is given as (24), and the modulus of its eigenvalues is bounded by (25). Meanwhile, the spectral radius satisfies $\rho(T_m(\alpha, \beta_1, \beta_2)) < 1$. In other words, the MPMIO iteration converges to the unique solution $x^*$ of the linear system (2) for any initial vector $x_0$.
Proof. The first part of Theorem 3.1 has been proved. Suppose $\lambda_i$ is an eigenvalue of $P$, then Back to the right-hand side of (25), and given that ∈ ( where $T_m(\alpha, \beta_1, \beta_2)$ is the iteration matrix as (24), and $x_k$, $x_{k+1}$ refer to the $k$-th and $(k+1)$-th iterates, respectively.
Taking 1-norms and using the triangle inequality, we obtain (32), (33) and (34), from which the inequality (28) follows. Writing (23) for step $k$, subtracting it from (23) for step $k+1$ and taking norms in the same way as in (32), we can easily derive (29). □
It is easy to see that $\frac{\alpha - \beta_i}{1 - \beta_i}$ ($i = 1, 2$) is the contraction factor of the single-parameter IO iteration. Thus, we can deduce that the MPMIO iteration may converge faster than the IO iteration with the single parameter $\beta_1$ or $\beta_2$.
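To give a feel for the sizes involved, the short Python sketch below compares per-outer-iteration contraction bounds. The IO bound $(\alpha-\beta)/(1-\beta)$ and the PIO bound $\alpha(\alpha-\beta)/(1-\beta)$ are the ones from Section 2. The MPMIO expression is an assumption made for illustration: it is the product of $m$ power-step factors $\alpha$ and two exact splitting factors, not necessarily the bound stated in the theorems above. The function names and the sample values of $\alpha$, $\beta$, $\beta_1$, $\beta_2$ and $m$ are hypothetical.

```python
def io_factor(alpha, beta):
    """1-norm contraction bound of one exact IO outer step."""
    return (alpha - beta) / (1.0 - beta)

def pio_factor(alpha, beta):
    """1-norm contraction bound of one exact PIO step (power step + IO step)."""
    return alpha * io_factor(alpha, beta)

def mpmio_factor_assumed(alpha, beta1, beta2, m):
    """Assumed bound: m power steps followed by two exact splitting steps."""
    return alpha ** m * io_factor(alpha, beta1) * io_factor(alpha, beta2)

alpha = 0.99                                   # a damping factor close to 1
io = io_factor(alpha, 0.5)
pio = pio_factor(alpha, 0.5)
mp = mpmio_factor_assumed(alpha, 0.5, 0.5, m=5)
print(f"IO: {io:.4f}  PIO: {pio:.4f}  MPMIO (assumed): {mp:.4f}")
```

These factors are per outer iteration. An MPMIO outer iteration also uses more matrix-vector products than an IO or PIO one, so a fair comparison has to account for the cost per iteration, as the matrix-vector counts in the numerical experiments do.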

Numerical experiments
In this section, we carry out some numerical experiments to test the effectiveness of the MPMIO iteration and compare it with the power method, the IO iteration and the PIO iteration. All the numerical results are obtained by using MATLAB 9.7.0 (R2019b) on a PC with a 3.97 GHz Intel(R) Core(TM) i7 processor and 8 GB RAM.
For fairness, we take the same initial guess $x_0 = v = e/n$ for all methods. The Web matrices are listed in Table 1, where "nnz" denotes the number of nonzero elements and "avg nnz per row" denotes the average number of nonzero elements per row. For convenience, in all the tables to follow we abbreviate the power method, the inner-outer iteration method, the PIO iteration method and our MPMIO iteration method as Power, IO, PIO and MPMIO, respectively. We denote by "ite" the iteration count, by "mv" the number of matrix-vector products and by "CPU" the CPU time in seconds.
Example 1. This example discusses some of the options for the parameters $\beta_1$ and $\beta_2$. The test matrix is the amazon0505 Web matrix. With $m = 5$ and $\tau = 10^{-8}$, we run the different methods on the PageRank problem with different pairs of values for $(\beta_1, \beta_2)$. Numerical results are presented in Table 2. MPMIO performs best in terms of both iteration counts and CPU time. The numbers of matrix-vector products are approximately equal, and as $\alpha$ increases, the advantage of MPMIO becomes more apparent. At the same time, the performance still varies with the values of $\beta_1$ and $\beta_2$, and it is currently very hard to determine the best choices of $\beta_1$ and $\beta_2$. Thus, we choose the parameters empirically.
From Table 3, we find that MPMIO outperforms the other three methods, i.e., Power, IO and PIO, especially in iteration counts and CPU time. Meanwhile, the performance of MPMIO is sensitive to the choice of $m$. For example, when $m$ takes a small value MPMIO performs very well, but when $m$ takes a large value, such as 20 or 50, its performance is barely satisfactory. Based on similar observations for the other test matrices, we tend to choose a moderate value of $m$, i.e., $m = 5$.
We test all the matrices in Table 1 to compare the number of matrix-vector products and the CPU time required for convergence to three different outer tolerances $\tau$. To state the speedup of an algorithm a over another algorithm b, we use a speedup formula; numerical results are given in Table 4.
This example shows that our proposed algorithm MPMIO noticeably reduces the number of matrix-vector products and is efficient. In terms of CPU time, the MPMIO iteration performs best for each test PageRank problem. When $\tau = 10^{-8}$, MPMIO achieves a speedup from 1.13× to 1.48× over IO and from 1.04× to 1.20× over PIO. Taken together, our proposed MPMIO iteration converges faster than the other three methods, as is also shown by the convergence curves in Figure 1.

Conclusions
In this paper, we have improved the IO iteration for accelerating PageRank computation by introducing multi-step power and multi-step splitting. Our algorithm can be implemented easily, and the theoretical results support its efficiency. Numerical experiments on several PageRank problems indicate that the new algorithm is superior to the power method and to the IO-type methods IO and PIO. At the same time, the new algorithm is parameter-dependent, and the parameters in our experiments were chosen empirically. How to determine the optimal parameters for our algorithm is a meaningful question that may be addressed in future work.