Rayleigh-Ritz Majorization Error Bounds for the Linear Response Eigenvalue Problem

Abstract In the linear response eigenvalue problem arising from computational quantum chemistry and physics, one needs to compute a few of smallest positive eigenvalues together with the corresponding eigenvectors. For such a task, most of efficient algorithms are based on an important notion that is the so-called pair of deflating subspaces. If a pair of deflating subspaces is at hand, the computed approximated eigenvalues are partial eigenvalues of the linear response eigenvalue problem. In the case the pair of deflating subspaces is not available, only approximate one, in a recent paper [SIAM J. Matrix Anal. Appl., 35(2), pp.765-782, 2014], Zhang, Xue and Li obtained the relationships between the accuracy in eigenvalue approximations and the distances from the exact deflating subspaces to their approximate ones. In this paper, we establish majorization type results for these relationships. From our majorization results, various bounds are readily available to estimate how accurate the approximate eigenvalues based on information on the approximate accuracy of a pair of approximate deflating subspaces. These results will provide theoretical foundations for assessing the relative performance of certain iterative methods in the linear response eigenvalue problem.


Introduction
In computational quantum chemistry and physics, the excitation states and absorption spectra for molecules or surface of solids are described by the random phase approximation (RPA) or the Bethe-Salpeter (BS) equation [1,2]. One important question in RPA or BS equation is to compute a few eigenpairs associated with the smallest positive eigenvalues of the following eigenvalue problem: where A, B ∈ R n×n are both symmetric matrices and A B B A is positive de nite [3][4][5]. The matrix H in (1.1) is a special Hamiltonian matrix whose eigenvalues are real and in pairs {λ, −λ}. Therefore, we can order the 2n eigenvalues of H as −λn ≤ · · · ≤ −λ < λ ≤ · · · ≤ λn . (1.2) Through a similarity transformation, the eigenvalue problem (1.1) can be equivalently transformed into (1.3) where K = A − B and M = A + B are both n × n real symmetric positive de nite matrices. But, in consistent with [3], throughout the rest of paper, we relax the condition on K and M to that they are symmetric positive semi-de nite and one of them is de nite, unless explicitly stated di erently. This means that possibly λ = .
This eigenvalue problem was still referred to as the linear response eigenvalue problem (LREP) [3,6] and will be so in this paper, too. There are immense recent interest in developing new theories, e cient numerical algorithms of LREP and the associated excitation response calculations of molecules for materials design in energy science [7][8][9][10].
As the dimension n is usually very large, LREP (1.3) is generally solved by iterative methods, such as the Locally Optimal Block Preconditioned 4D Conjugate Gradient Method (LOBP4DCG) [6] and its space-extended variation [11], the block Chebyshev-Davidson method [12], and the generalized Lanczos type methods [13,14]. These e cient numerical algorithms are all based on the concept of the so-called pair of de ating subspaces which is a generation of the invariant subspace in the standard eigenvalue problem. For given k-dimensional subspaces X ⊆ R n and Y ⊆ R n , {X, Y} is called a pair of de ating subspaces if they satisfy KX ⊆ Y and MY ⊆ X.
Whenever such a pair of de ating subspaces is available, LREP (1.3) can be projected into a much smaller problem in the form of (1.3) whose spectrum is part of that of H. That means a pair of de ating subspaces can be used to extract the eigenvalues corresponding to the de ating subspace pair. In practice, such a pair of exact de ating subspaces is usually not available, only approximate one, then the projection method computes good the approximate eigenvalues of LREP (1.3). Quantifying how good the approximate eigenvalues is the objective of this paper.
For the standard symmetric eigenvalue problem, there are existing results to estimate how well of such approximate eigenvalues may be; see [15][16][17][18][19]. For LREP (1.3), in [20], residual-based error bounds for approximate eigenvalues computed through the pair of approximated de ating subspaces are obtained. These results bound eigenvalue approximations characterized by the certain residuals. Another set of results also bound the eigenvalue errors in [21] but in terms of the canonical angles between the approximate de ating subspace pair and the exact pair. In this paper, we put forward two improvements to the Rayleigh-Ritz approximation theories in [21] by using weak majorization. The Rayleigh-Ritz majorization type eigenvalue error bounds have been well established in the symmetric eigenvalue problem; see [15,17]. Compared with typical inequality results, majorization type bounds provide a succinct way to express numerous useful inequalities involving two vectors. The major goals of this paper are two-fold: to establish Rayleigh-Ritz majorization error bounds for LREP, and to extend our results by considering the unequal dimension between the exact and approximate pair of de ating subspaces. These improvements are helpful to understand how approximate eigenvalues move towards the associated exact eigenvalues in iterative methods of LREP.
The rest of the paper is organized as follows. In Section 2, some notations and preliminaries including the concept of majorization and the canonical angles of two subspaces are collected for use later. Section 3 contains our main results on how the vector of di erences between the exact eigenvalues and their approximations is weakly majorized by the canonical angles from the exact to approximate pair of de ating subspaces. In Section 4, some numerical examples are presented to support our analysis. Finally, Conclusions are given in Section 5.

Preliminaries . Basic de nitions
R m×n is the set of all m × n real matrices, R n = R n× , and R = R . For X ⊆ R n , dim(X) is the dimension of X. For X ∈ R m×n , X T is its transpose, R(X) is the column space of X, and the submatrix X(:,i:j) of X consists of column i to column j. In is the n × n identity matrix or simply I if its dimension is clear from the context. For x ∈ R n and y ∈ R n , for k ≤ ≤ n, the notation x ≤ y is used to compare x with y componentwise, and x • y = [x y , . . . , xn yn] T denotes the Hadamard product of x and y, in particular, T represents x with its elements rearranged in ascending order. We use λ(A) = λ ↓ (A) to denote the vector of eigenvalues of a symmetric matrix A arranged in descending order. For vectors x, y ∈ R n , the usually inner product and its induced norm are de ned by Consider two subspaces X and Y in R n , and suppose Let X ∈ R n×k and Y ∈ R n× be orthonormal basis matrices of X and Y, respectively, i.e., The vector of cosines of canonical angels from the subspace X to the subspace Y is de ned by cos Θ(X, i.e., θ (X, Y) ≥ · · · ≥ θ k (X, Y). It is clear that Θ(X, Y) so de ned is invariant with respect to the di erent choice of the orthonormal basis matrices X and Y. Therefore, in what follows we sometimes place a matrix in one of or both arguments of θ i (·, ·) and Θ(·, ·), with the understanding that it refers to the subspace spanned by the columns of the matrix argument. . Let X and Y be two subspaces in R n . . Let X and Y be two subspaces in R n satisfying (2.1). Then If k = , we say that the angles between subspaces X and Y [22]. . Let X and Y be two subspaces in R n with dim(X) = dim(Y) = k. Then, λ (ΠX − ΠY) (1:k) = sin Θ(X, Y), where ΠX and ΠY are orthogonal projectors onto subspaces X and Y, respectively.
For any given symmetric and positive de nite matrix W ∈ R n×n , the W-inner product and its induced W-norm are de ned by x, y W = y T Wx, For two subspaces X, Y ⊆ R n satisfying (2.1), let X and Y be the W-orthonormal basis matrices of subspaces X and Y, respectively, i.e., Similarly to the standard canonical angles (2.2), we de ne denotes the vector of the W-canonical angles from X to Y in descending order. Let W = CC T with nonsingular C ∈ R n×n . Then, by [24,Theorem 4.2], Therefore, Lemmas 2.1 and 2.2 in which the standard canonical angles are simply replaced by W-canonical angles still hold.

. Majorization and weak majorization
For x, y ∈ R n , we say that x is weakly majorized by y, in symbols x ≺w y, if If in addition, we say that the vector x is (strongly) majorized by y, written x ≺ y.
To facilitate our discussion, we collect several simple and general properties of the majorization and weak majorization in Lemma 2.4, and the reader is referred to [25] for proofs and more.
In particular, if α is a positive real number, then x ≺w y implies αx ≺w αy.
The following lemmas on the majorization or weak majorization relations of eigenvalues and singular values are critical to our main results.
The weakly majorization relationship (2.4) is obtained by extending B, D and E with zero blocks to n × n matrices. The extension with zero blocks only appends zero singular values and does not change the ranks.

Main results
Many theoretical properties of LREP (1.3) have been established in [3]. In particular, the following theorem presents decompositions on K and M, which is necessary for our later developments. where Λ = diag(λ , λ , . . . , λn), λ ≤ λ ≤ · · · ≤ λ n and Φ = Ψ − T . (b) If K is also de nite, then all λ i > and H is diagonalizable: (c) The eigen-decomposition of KM and MK are respectively.
As we have introduced in Section 1, for two k-dimensional subspaces X and Y in R n , we call {X, Y} a pair of de ating subspaces if KX ⊆ Y and MY ⊆ X. Let X, Y ∈ R n×k be the basis matrices for X and Y, respectively. Then (3.2) implies that there exist KR ∈ R k×k and MR ∈ R k×k such that KX = YKR and MY = XMR, or equivalently, Roughly speaking, most of e cient algorithms for LREP usually generate a sequence of approximate de ating subspace pairs that hopefully converge to or contain subspaces near the pair of de ating subspaces associated with the rst few smallest λ i . Therefore, in the rest of the paper, we focus on the pair of de ating subspaces {R(Φ ), R(Ψ )} where Φ = Φ(:,1:k) and Ψ = Ψ(:,1:k), and let {U, V} satisfying be a pair of approximate de ating subspaces of Let U, V ∈ R n× be any basis matrices for U and V, respectively. Then, the condition θ (U, V) < π/ implies U T V being nonsingular. By [6], the best approximation eigenpairs in sense of the trace minimization principle are obtained via computing the eigenpairs of Notice that HSR so de ned is of LREP type since both KSR and MSR are symmetric and have the same de niteness property as their corresponding K and M. Therefore, we can denote its eigenvalues by −µ ≤ · · · ≤ −µ ≤ µ ≤ · · · ≤ µ , in which some of µ i should be good approximate eigenvalues of H. Moreover, by [3, Theorem 2.9], these eigenvalues of HSR are independence of the basis matrices U and V, and factorization U T V = W T W which is not unique. Now, we would like to bound the errors in µ i as approximations to λ i for ≤ i ≤ k in terms of the distances from {R(Φ ), R(Ψ )} to {U, V}. For the purpose, Zhang, Xue and Li in [21] established an inequality in the case = k, i.e., We rst continue the e ort to extend the result by using the weak majorization replacing the traditional inequality in (3.4). To prove our main results, we also need the following lemma to provide special basis matrices for the pair {U, V} which are used in the proof of [21, Theorem 3.1].

Lemma 3.2.
Let U, V ⊆ R n satisfy (3.3) and = k. These exist basis matrices U, V ∈ R n×k of U and V, respectively, and an orthonormal matrix P ∈ R n×n such that , . . . , Proof. Since the eigenvalues of HSR are unchanged with di erent choices of the basis matrices for U and V, we choose the basis matrices U and V as mentioned in Lemma 3.2. Let U, V, P, and Γ be de ned in Lemma 3.2. Partition P, U and Λ as Then, it follows by (3.5) that, If k > n, we caution the reader that here The last line holds because of Lemmas 2.4(b) and 2.7. Now, we bound separately the two terms in the sum on the right-hand side of the last line. We start with the rst term. By (3.10), and Lemmas 2.4(c), 2.5 and 2.8, we get Considering the second term, by (3.8), and Lemmas 2.4 and 2.7, we have At last, (3.11), (3.12) and (3.13) together give which leads to (3.6).  (c) In some numerical methods for LREP, such as the rst Lanczos method [13,27] and Chebyshev-Davidson method [12], the subspace U is chosen such that U = MV, which leads to cos θ (i) M − (U, MV) = for ≤ i ≤ k and tan θ (1) M − (U, MV) = . Then, the majorization relationship (3.6) reduces to (d) Suppose that K is de nite. Then, a simple modi cation by exchanging the roles of K and M in the above proof leads to the following majorization relationship , . . . ,   I(:,1:k)). (3.22) Then, (3.19), (3.20), (3.21) and (3.22) together yield (3.17). Similarly, to prove (3.18), we consider
Proof. By Lemma 2.1(a), we choose k-dimensional subspaces U ⊂ U satisfying Proof. Similarly to the proof of Theorem 3.3, let k-dimensional subspaces U ⊂ U and V ⊂ V be chosen such that Based on the results in Lemma 2.1 and Theorem 3.2, we have

Numerical examples
In this section, we present some numerical examples to illustrate our results in Theorems 3.1 and 3.3. In particular, we will demonstrate the terms associated with Θ M − (U, MV) in majorization upper bounds of (3.6) and (3.25) are not able to be neglected. . We are interested in bounding the di erences between the eigenvalues λ i for ≤ i ≤ k and their approximations µ i by (3.6). We measure the following errors, for ≤ j ≤ k, Since U = MV , by Remark 3.1(c), then ε ,j = ε ,j for ≤ j ≤ k are the upper bounds on the approximate eigenvalue errors ε ,j associated with {U , V }. Table 4.1 reports the errors ε ,j and ε ,j as de ned in (4.2) and (4.3) on the eigenvalue approximations ε ,j as de ned in (4.1) corresponding to {U , V } and {U , V }, respectively. It follows from Table 4.1 that our upper bounds provided by Theorem 3.1 are rather sharp in this example, and they are comparable to the observed errors ε ,j . In particular, for the pair of approximate de ating subspaces {U , V }, though ε , > ε , and ε , > ε , , ε , in absence of the terms cos θ (1) M − (U, MV) and tan θ (1) M − (U, MV) is a little smaller than ε , in this example. However, for the pair {U , V }, ε ,j for ≤ j ≤ k are always available to bound ε ,j .    Since > k here, by Theorem 3.3, we compute ε ,j ,ε ,j and ε ,j for ≤ j ≤ k in Table 4 Table 4.2 suggests that our boundsε ,j for ≤ j ≤ k are also very sharp in the case > k, while ε ,j for ≤ j ≤ k in the numerical results of {U , V } underestimate ε ,j too much in this example. Results of {U , V } Results of {U , V } j ε ,jε ,j (bound for ε ,j ) ε ,j ε ,jε ,j = ε ,j .

Results of {U
.

Conclusion
The pair of k-dimensional de ating subspaces {R(Φ ), R(Ψ )} due to its ability in recovering the eigenvalues of interest plays a vital role in some e cient numerical methods of LREP. Given a pair of -dimensional subspaces {U, V} with ≥ k which approximates or contains a pair of k-dimensional subspaces approximating {R(Φ ), R(Ψ )}, Zhang, Xue and Li [21] established the Rayleigh-Ritz error bounds of LREP on the di erences between the approximate eigenvalues and the eigenvalues of interest by the canonical angles between the exact and approximate pair of de ating subspaces in the case = k. There are two major contributions in this paper to improve the exist results in [21]. One is to obtain the Rayleigh-Ritz majorization type upper bounds for the approximate eigenvalue errors of LREP, i.e., Theorem 3.1. The other one is to extend Theorem 3.1 to Theorem 3.3 by considering > k. Numerical examples are presented to con rm the sharpness of these upper bounds, and to demonstrate the necessity of the terms on Θ M − (U, MV) in Theorems 3.1 and 3.3. From the point of view that the generalized linear response eigenvalue problem where K and M are n × n symmetric positive semi-de nite and one of them is de nite, E± are n × n nonsingular matrices and E T + = E−, can be equivalently transformed to the standard LREP by decomposing E+ = CD T [11,28]. The development of Theorems 3.1, 3.