Rank Structured Approximation Method for Quasi-Periodic Elliptic Problems

Boris Khoromskij and Sergey Repin

Abstract

We consider an iteration method for solving an elliptic type boundary value problem 𝒜u=f, where a positive definite operator 𝒜 is generated by a quasi-periodic structure with rapidly changing coefficients (a typical period is characterized by a small parameter ϵ). The method is based on using a simpler operator 𝒜0 (inversion of 𝒜0 is much simpler than inversion of 𝒜), which can be viewed as a preconditioner for 𝒜. We prove contraction of the iteration method and establish explicit estimates of the contraction factor q. Certainly the value of q depends on the difference between 𝒜 and 𝒜0. For typical quasi-periodic structures, we establish simple relations that suggest an optimal 𝒜0 (in a selected set of “simple” structures) and compute the corresponding contraction factor. Further, this allows us to deduce fully computable two-sided a posteriori estimates able to control numerical solutions on any iteration. The method is especially efficient if the coefficients of 𝒜 admit low-rank representations and if algebraic operations are performed in tensor structured formats. Under moderate assumptions the storage and solution complexity of our approach depends only weakly (merely linear-logarithmically) on the frequency parameter 1ϵ.

1 Introduction

Problems with periodic and quasi-periodic structures arise in various natural sciences models and technical applications. Quantitative analysis of such problems requires special methods oriented towards their specific features. For perfectly periodic structures, efficient methods are developed within the framework of the homogenization theory (see, e.g., [1, 3, 8] and other literature cited therein). However, classical homogenization methods cover only one class of problems (all cells are self-similar and the amount of cells is very large). In this paper, we use a different idea and suggest another modus operandi for quantitative analysis of boundary value problems with periodic and quasi-periodic coefficients. It generates approximations converging (in the energy space) to the exact solution and provides guaranteed and computable error estimates. The approach is applicable to (see, e.g., Figures 1, 2)

1. periodic structures, in which the amount of cell is considerable (e.g., 103104) but not large enough to neglect the error generated by the respective homogenized model;

2. quasi-periodic structures that contain cells with defects and deformations;

3. multi-periodic structures where the coefficients reflect the combined effect of several functions with different periodicity.

In general terms, the idea of the method is as follows. We consider the problem 𝒫:

(1.1)𝒜u=f,fV*,

where V is a reflexive Banach space with the norm V, V* is the space conjugate to V (the respective duality pairing is denoted by v*,v), and 𝒜:VV* is a bounded linear operator. It is assumed that the operator 𝒜 is positive definite and invertible, so that problem (1.1) is well posed. However, 𝒫 is viewed as a very difficult problem because 𝒜 is generated by a complicated physical structure, which may contain a huge amount of details. Therefore, attempts to solve (1.1) numerically by standard methods may lead to enormous expenditures. Similar difficulties arise if we wish to verify the quality of a numerical solution.

Assume that the operator 𝒜 is approximated by a simplified positive definite operator 𝒜 and the inversion of 𝒜 is much simpler than the inversion of 𝒜. By means of 𝒜, we construct an iteration method based on solving a “simple” problem 𝒫0: 𝒜u=g. In other words, the method is based on the operation g𝒜-1g. It also includes the operation v𝒜v, which can be performed very efficiently by tensor-type decomposition methods provided that physical structures generated 𝒜 have low-rank representations. We prove that iterations generate a sequence of functions converging to the exact solution of (1.1) with a geometrical rate. Furthermore, we deduce explicitly computable and guaranteed a posteriori error estimates adapted to this class of problems. They evaluate the accuracy of approximations computed on each step of the iteration algorithm. These estimates also use only inversion of 𝒜 and operations of the type v𝒜v. In the iteration methods and error estimates inversion of the operator A is avoided.

In this paper, we consider one class of problems associated with divergent type elliptic equations where 𝒜=Q*ΛQ and 𝒜=Q*ΛQ. Here Λ:YY is a bounded operator induced by a complicated quasi-periodic structure while Q:VY and Q*:YV* are conjugate operators, i.e.,

(y,Qw)=Q*y,wfor all yY and wV,

where Y is a Hilbert space with the scalar product (,) and the norm . The operators Q and Q* are induced by differential operators or certain finite-dimensional approximations of them. Henceforth, it is assumed that f𝒱, where 𝒱 is a Hilbert space with the scalar product (,)𝒱. This space is intermediate between V and V*, i.e., V𝒱V*.

The operator 𝒜=Q*ΛQ contains the operator Λ generated by a simplified structure. We assume that the operators Λ and Λ are Hermitian (i.e., (Λy,z)=(y,Λz) and (Λy,z)=(y,Λz)) and satisfy the conditions

λy2(Λy,y)λy2for all yY,
λy2(Λy,y)λy2,λ<λ.

Then, the structural operators Λ and Λ are spectrally equivalent:

(1.2)c1(Λy,y)(Λy,y)c2(Λy,y),

where the constants are the minimal and maximal eigenvalues of the generalized spectral problem Λy-μΛy=0. Obviously, they satisfy the estimates c1λ/λ and c2λ/λ (which may be rather coarse).

Concerning the operator Q, we assume that there exists a positive constant c such that

QwcwVfor all wV.

Generalized solutions of the problems 𝒫 and 𝒫0 are defined by the variational identities

(1.3)(ΛQu,Qw)=f,wfor all wV,

and

(1.4)(ΛQu,Qw)=f~,wfor all wV.

In Section 2, we show that a sequence {uk} converging to u in V can be constructed by solving problems (1.4) with specially constructed right-hand sides f~k generated by the residual of (1.3). In proving convergence, the key issue is analysis of the spectral radius of the operator

(1.5)𝔹ρ:=𝕀-ρΛ-1Λ,

and selection of such relaxation parameter ρ that provides the best convergence rate. Moreover, iteration procedures of such a type become contracting if the iteration parameter is properly selected. This fact is often used in proving analytical results (see, e.g., [20], where classical results on existence and uniqueness of a variational inequality are established by contraction arguments). Also, these ideas were used in the construction of various numerical methods (see, e.g., [7]). However, achieving our goals requires more than the fact of contraction. We need explicit and realistic estimates of the contraction factor (which are used in error analysis) and a practical method of finding Λ with minimal q. The latter task leads to a special optimization problem that defines the most efficient “simplified” operator Λ among a certain class of “admissible” operators. This question is studied in Section 3. In general, Λ and Λ can be induced by scalar, vector, and tensors functions. We show that selection of the optimal structural operator Λ is reduced to a special interpolation type problem, which is purely algebraical and does not require solving a differential problem (therefore a suitable Λ can be found a priori). We discuss several examples and suggest the corresponding optimal (or quasi-optimal) Λ, which guarantees convergence of the iteration sequence with explicitly known contraction factor.

Now, it is worth discussing the main differences between our approach and the classical homogenization method developed for regular periodic structures. This method operates with a homogenized boundary value problem Q*ΛHQuH=f, where ΛH is defined by means of an auxiliary problem with periodical boundary conditions in the cell of periodicity. The respective solution uH contains an irremovable (modeling) error depending on the cell diameter ϵ. Moreover, if ϵ tends to zero, then typically uH converges to u only weakly (e.g., in L2). Getting a better convergence (e.g., in H1) requires certain corrections, which lead to other (more complicated) boundary value problems in the cell of periodicity. The respective “corrected” solution uHc also contains an error. Typically, the error is proportional to ϵ and can be neglected only if the amount of cells is very large. If our method is applied to perfectly periodical structures then setting Λ:=ΛH is one possible option. In this case, the homogenized operator (defined without correction procedures) is used for a different purpose: construction of a suitable preconditioning operator. The latter operator generates numerical solutions converging to the exact solution in the energy norm (i.e., the method is free from irremovable errors) and can be applied for a rather wide range of ϵ. In addition, the theory suggests other simpler ways of selecting suitable Λ. In this context, it is interesting to know whether or not the choice Λ:=ΛH always yields the minimal value of the contraction factor. In Section 3, we briefly discuss this question and present an example of that the best Λ may differ from ΛH.

In Section 4, we deduce a posteriori estimates that provide fully computable and guaranteed estimates of the distance to the exact solution u for any numerical approximation uk,h computed for an approximation subspace Vh. These estimates are established by combining functional type a posteriori estimates (see [25, 22, 26] and references cited therein) and estimates generated by the contraction property of the iteration method (see [24, 29]).

The second part of the paper is devoted to a fast solution method for the basic iteration problem (2.1). The key idea consists of using tensor-type representations for approximations, what is quite natural if both coefficients of the respective quasi-periodic structure and the right-hand side admit low-rank tensor-type representations. We notice that the amount of structures representable in terms of low-rank formats is much larger than the amount of periodic structures covered by the homogenization method. The idea of tensor-type approximations of partial differential equations traces back to [9]. In computational mechanics this method is known as the Kantorovich–Krylov (or extended Kantorovich) method. However, it is rarely used in modern numerical technologies. In part, this is due to restrictions on the shape of the domain imposed by the Kantorovich method. Henceforth, we assume that the domain Ω satisfies these restrictions, i.e., it is a tensor-type domain (e.g., rectangular) or a union of tensor-type domains. This assumption induces certain geometrical limitations. However, they can be bypassed by such methods as coordinate transformation and domain decomposition, which are widely used in modern computational mathematics (e.g., in iso-geometric analysis).

The recent tensor numerical methods (for steady state and dynamical problems) based on the advanced nonlinear tensor approximation algorithms have been developed in the last ten years. Literature survey on the modern tensor numerical methods for multi-dimensional PDEs can be found in [14, 16, 13]. In the context of problems considered in the paper, we are mainly concerned with another specific feature: very complicated material structure. In this case, direct application of standard finite element methods suffers from the necessity to account huge information encompassed in coefficients (especially in multi-dimensional problems). We show that tensor-type methods allow us to reduce computations to a collection of one-dimensional problems, which can be solved very efficiently using low-rank representations with the small storage requests. Similar ideas are applied for computing a posteriori error estimates.

Section 5 discusses numerical aspects of the method and exposes several examples. Typical behavior of quasi-periodic coefficients is described by oscillation around a constant, modulated oscillation around a given smooth function, or oscillation around a piecewise constant function.

Figure 1

Examples of periodic and modulated periodic coefficients in 1D.

Figure 2

An example of modulated piecewise periodic coefficients in 2D.

Figure 1 (1D case) represents examples of highly oscillating (left) and modulated periodic coefficients (right) functions. Figure 2 (2D case) illustrates the well-separable equation coefficient obtained by a sum of step-type and uniformly oscillating functions, namely,

a(x1,x2)=C0sign(x1)+C1+sin(πωa(x1+x2)),

where C0=5, C1=6, a=10, and ω=12.

We show that specially constructed FEM type approximations of PDEs with slightly perturbed or regularly modulated periodic coefficients on d-fold n××n tensor grids in d may lead to the discretized algebraic equations with the low Kronecker rank stiffness matrix of size nd×nd, where n=O(1ϵ) is proportional to the large frequency parameter 1ϵ. In this case, the rank decomposition with respect to the d spacial variables is applied, such that the discrete solution can be calculated in the low-rank separable form, which requires storage size of only O(dn) instead of O(nd)=O(1ϵd) complexity representations which are mandatory for the traditional FEM techniques (the latter quickly leads to the bottleneck in case of small parameter ϵ>0).

The arising linear system of equations can be solved by preconditioned iteration with the simple preconditioner Λ, such that the storage and numerical costs scale almost linearly in the univariate discrete problem size n. Numerical examples in Section 5 demonstrate the stable geometric convergence of the preconditioned CG (PCG) iteration with the preconditioner Λ and confirm the low-rank approximate separable representation to the solution with respect to d spacial variables even in the case of complicated quasi-periodic coefficients.

This approach is well suited for applying the quantized-TT (QTT) tensor approximation [15] to functions discretized on large tensor grids of size proportional to the frequency parameter, i.e., n=O(1ϵ), as it was demonstrated in the previous paper [17] for the case d=1. The use of tensor-structured preconditioned iteration with the adaptive QTT rank truncation may lead to the logarithmic complexity in the grid size, O(logpn), see [14, 16, 23] for the rank-truncated iterative methods, [11, 12, 4, 10] for various examples of the QTT tensor approximation to lattice structured systems, and [2] for tensor approximation of complicated functions with multiple cusps in d.

In Section 6, we conclude with the discussion on further perspectives of the presented approach for 2D and 3D elliptic PDEs with quasi-periodic coefficients.

2 The Iteration Method

Let vV and ρ+. Consider the problem: find uv such that

(2.1)(ΛQuv,Qw)=v(w)-ρv(w)for all wV,

where

v(w):=(ΛQv,Qw)-f,wandv(w):=(ΛQv,Qw).

Obviously, the right-hand side of (2.1) is a bounded linear functional on V, so that this problem has a unique solution uv. Thus, we have a mapping Tρ:VV, which becomes a contraction if the parameter ρ is properly selected. Indeed, for any v1 and v2 in V, we obtain

(ΛQη,Qw)=(ΛQζ-ρΛQζ,Qw)for all wV,

where u1=Tρv1, u2=Tρv2, ζ:=v1-v2, and η:=u1-u2. Hence

η2:=(ΛQη,Qη)=(ΛQζ,Qη)-ρ(ΛQζ,Qη)
=(Qζ,ΛQη)-ρ(Λ-1ΛQζ,ΛQη)=(Qζ-ρΛ-1ΛQζ,ΛQη)
(2.2)η(ΛQζ-ρΛQζ,Qζ-ρΛ-1ΛQζ)1/2.

From (2.2) we find that

η2(ΛQζ,Qζ)-2ρ(ΛQζ,Qζ)+ρ2(Λ-1ΛQζ,ΛQζ)
=(Qζ,ΛQζ)-2ρ(Λ-1ΛQζ,ΛQζ)+ρ2(Λ-1ΛΛ-1ΛQζ,ΛQζ)
=((𝕀-2ρΛ-1Λ+ρ2Λ-1ΛΛ-1Λ)Qζ,ΛQζ)=(Λ𝔹ρ2Qζ,Qζ)
(2.3)(Λ𝔹ρ2Qζ,𝔹ρ2Qζ)1/2(ΛQζ,Qζ)1/2,

where 𝔹ρ is defined by (1.5). If ρ is selected such that

(2.4)(Λ𝔹ρ2Qζ,Qζ)q2ζ2for some q<1,

then (2.3) shows that Tρ is a contractive mapping.

It is not difficult to show that ρ satisfying (2.4) can be always found. Indeed, in view of (1.2),

(Λ𝔹ρ2Qζ,Qζ)=(ΛQζ,Qζ)-2ρ(ΛQζ,Qζ)+ρ2(ΛΛ-1ΛQζ,Qζ)
(2.5)(1-2ρc1)(ΛQζ,Qζ)+ρ2(Λ-1ΛQζ,ΛQζ).

Since Λ and Λ are invertible with trivial kernels, μ and yμ are an eigenvalue and the respective eigenfunction of Λyμ=μΛyμ if and only if they are an eigenvalue and the eigenfunction of the problem ΛΛ-1Λyμ=μΛyμ. This means that

c1(Λy,y)(ΛΛ-1Λy,y)c2(Λy,y)c22(Λy,y).

Hence

(Λ-1ΛQζ,ΛQζ)c22ζ2

and (2.5) implies

(Λ𝔹ρ2Qζ,Qζ)(1-2ρc1+ρ2c22)ζ2.

The minimum of the expression in round brackets is attained if ρ=ρ*:=c1/c22. For ρ=ρ*, we find that

q*2:=1-c12c22q^2:=1-λ2λ2λ2λ2[0,1).

Hence Tρ is a contractive mapping with explicitly known contraction factor q*. Well-known results in the theory of fixed points (see, e.g., [29]) yield the following theorem.

Theorem 2.1.

For any u0V and ρ=ρ*, the sequence {uk}V of functions satisfying the relation

(2.6)(ΛQuk,Qw)=(ΛQuk-1,Qw)-ρ((ΛQuk-1,Qw)-f,w)for all wV

converges to u in V and uk-uq*ku0-u as k+.

Remark 2.2.

From (2.3) we obtain

η21λ0,min𝔹ρ2Qζζ𝔹ρ2λ0,min2ζ2.

This relation yields a simple (but not very sharp) estimate of the contraction factor.

For further analysis, it is convenient to estimate the right-hand side of (2.3) by a different method. Let 𝔹ρ denote the operator norm

(2.7)𝔹ρ:=supyY𝔹ρyy.

Then 𝔹ρy𝔹ρy and

(Λ𝔹ρ2y,y)𝔹ρ2y2.

Hence (2.3) yields the estimate

η𝔹ρζ,

which shows that Tρ is a contraction provided that

(2.8)𝔹ρ<1.

In applications 𝔹ρ is a self-adjoint bounded operator acting in a finite-dimensional space, so that verification of this condition amounts to finding ρ which yields the respective spectral radius of 𝔹ρ (see Section 4).

3 Selection of Λ∘

In this section, we discuss how to select Λ in order to minimize q, which is crucial for two major aspects of quantitative analysis: convergence of the iteration method and guaranteed a posteriori estimates. We assume that V, 𝒱, and Y are spaces of functions defined in a Lipschitz bounded domain Ω (namely y(x)𝕋 for a.e. xΩ where 𝕋 may coincide with , d, or 𝕄d×d) and the operators Λ and Λ are generated by bounded scalar functions, matrices or tensors. In this case,

(Λy,y):=ΩΛ(x)yy𝑑xand(Λy,y):=ΩΛ(x)yy𝑑x,

where denotes the respective product of scalar, vector, or tensor functions. In view of (2.7) and (2.8), the value of ρ should minimize the quantity supyY(Λ𝔹ρy,𝔹ρy)/(Λy,y). This procedure yields the contraction factor

q2=𝒬(Λ,Λ):=infρsupyYΩΛ(x)𝔹ρ(x)y𝔹ρ(x)y𝑑xΩΛ(x)yy𝑑x,

whose computation is reduced to solving algebraic problems at a.e. xΩ, i.e.,

𝒬(Λ,Λ):=infρsupxΩsupτ𝕋Λ(x)𝔹ρ(x)τ𝔹ρ(x)τΛ(x)ττ.

Let 𝕊 be a certain set of “simple” operators defined a priori (e.g., it can be a finite-dimensional set formed by piecewise constant or polynomial functions). Then, finding the best “simplified” operator amounts to solving the following problem: find Λ^𝕊 such that 𝒬(Λ,Λ^) is minimal. In other words, the optimal Λ^ is defined by the problem

(3.1)infΛ0𝕊ρsupxΩτ𝕋Λ(x)𝔹ρ(x)τ𝔹ρ(x)τΛ(x)ττ=q2.

Notice that (3.1) is an algebraic problem, which should be solved (analytically or numerically) before computations. The respective solution Λ^ defines the best operator to be used in the iteration method (2.6) and yields the respective contraction factor. Below we discuss some particular cases, where analysis of this problem generates an optimal (or almost optimal) Λ.

Problem (3.1) is explicitly solvable if Λ and Λ have a special structure, namely,

Λ=a(x)𝕀,Λ=a(x)𝕀,

where 𝕀 is the unit operator and a(x) and a(x) are positive bounded functions defined in Ω. Then,

𝔹ρ(x)=(1-𝐡(x))𝕀,𝐡(x):=a(x)a(x)

and

supτ𝕋(1-ρ𝐡(x))2ττ|τ|2=|1-ρ𝐡(x)|2for all xΩ.

Define 𝐡:=minxΩ𝐡(x) and 𝐡:=maxxΩ𝐡(x). It is not difficult to show that

supxΩ|1-ρ𝐡(x)|=max{|1-ρ𝐡|,|1-ρ𝐡|}.

Minimization with respect to ρ yields the best value ρ*=2𝐡+𝐡 and the respective value

(3.2)𝒬(Λ,Λ)=(𝐡-𝐡𝐡+𝐡)2=(1-𝒥(a,a)1+𝒥(a,a))2<1,𝒥(a,a)=𝐡𝐡.

In accordance with (3.1), the identification of the optimal simplified problem is reduced to the problem

(3.3)supa0𝕊𝒥(a,a),

where 𝕊 is a given set of functions.

We illustrate the above relations by means of several examples.

Example 3.1 (Constant Coefficients).

In the simplest case, we set 𝕊=P0, i.e., a0 is a constant. From (3.3) it follows that q=(a¯-a¯)/(a¯+a¯), where a¯:=minxΩa(x) and a¯:=maxxΩa(x). Then ρ*=2a0/(a¯+a¯) and the iteration procedure (2.6) with ρ=ρ* has the form

ΩQukQw𝑑x=Ω(1-2aa¯+a¯)Quk-1Qw𝑑x+2a¯+a¯Ωfw𝑑x.

From Theorem 2.1, it follows that

Ω|Q(uk-u)|2𝑑xC(a¯-a¯a¯+a¯)2k.

Example 3.2 (Oscillation Around a Given Function).

Consider a somewhat different example. Let a(x) be a function oscillating around a certain mean function g(x) so that

a(x)g(x)[1-ϵ,1+ϵ],ϵ(0,1).

If g is a relatively simple function, then it is natural to set a(x)=g(x). By (3.2), we find that 𝐡=1+ϵ, 𝐡=1-ϵ, and q=ϵ. Hence the method is very efficient for small ϵ (i.e., if a oscillates around g with a relatively small amplitude). Figures 1 and 2 illustrate three examples of quasi-periodic coefficients a and respective a corresponding to the case of oscillation around a constant with smooth modulation, oscillation around a given smooth function, or oscillation around a piecewise constant function.

Example 3.3 (Piecewise Constant Coefficients).

Consider a more complicated case, where Ω is divided into N nonoverlapping subdomains Ωi and Λ(x)=ci𝕀 if xΩi. Define the numbers

a(i)=minxΩia(x),a(i)=maxxΩia(x),
𝐡=min{a(1)c1,a(2)c2,,a(N)cN},𝐡=max{a(1)c1,a(2)c2,,a(N)cN}.

Since the constants ci are defined up to a common multiplier, we can without a loss of generality assume that

(3.4)i=1(N)λi=1,where λi=1ci.

In accordance with (3.3), the maximum of 𝒬(Λ,Λ) is attained if

(3.5)min{λ1a(1),λ2a(2),,λNa(N)}max{λ1a(1),λ2a(2),,λNa(N)}max,

where λi>0 and satisfy (3.4). If N=2, then problem (3.5) has a simple solution, which shows that the ratio λ1/λ2 (i.e., c2/c1) can be any in the interval [ξ1,ξ2], where

ξ1=min{a(2)a(1),a(2)a(1)},ξ2=max{a(2)a(1),a(2)a(1)}.

It is interesting to compare these results with those generated by homogenized models in the case of perfectly periodic structures. For this purpose, we consider a simple one-dimensional problem

(au)-f=0in (0,1)

with

a(x)=a(1)(x)in Ω1=(0,β),β(0,1),
a(x)=a(2)(x)in Ω2=(β,1),

where a(1)(x) is a perfectly periodical function attaining only two values a(1) and a(1). The Lebesgue measure of the set where a(x)=a(1) is κ1|Ω1|, κ1(0,1). Analogously, a(2)(x) is a periodical function attaining only two values a(2) and a(2). The Lebesgue measure of the set where a(x)=a(2) is κ2|Ω2|, κ2(0,1). We assume that a(i)<a(2), i=1,2 and the amount of periods is very large. Then, the homogenization method can be successfully applied. The corresponding homogenized problem has the coefficients (see, e.g., [8])

aH(1):=(1β0β1a(1)(x)𝑑x)-1in Ω1,aH(2):=(11-ββ11a(2)(x)𝑑x)-1in Ω2.

It is easy to see that

aH(1)=a(1)a(1)κ1a(1)+(1-κ1)a(1)(a(1),a(1)),aH(2)=a(2)a(2)κ2a(2)+(1-κ2)a(2)(a(2),a(2)).

Hence

aH(2)aH(1)(ξ1H,ξ2H),where ξ1H:=a(2)a(1)<min{a(2)a(1),a(2)a(1)}=ξ1,ξ2H:=a(2)a(1)>ξ2

and (ξ1,ξ2)(ξ1Hξ2H). Therefore, the coefficients aH(1) and aH(2) may not generate the best piecewise constant a, which produces the smallest contraction factor in the iteration procedure (2.6).

4 Error Estimates

4.1 General Estimate

Since Tρ is a contractive mapping, we can use the Ostrowski estimates (see [24, 29, 26]) of the distance between vV and u (the fixed point). The estimates state that

(4.1)v-u{ϵ1+q(ρ),ϵ1-q(ρ)},where ϵ:=Tρv-v.

This estimate cannot be directly applied because vρ:=Tρv is generally unknown (it is the exact solution of a boundary value problem). Instead, we must use a numerical approximation v~ρ (in our analysis, we impose no restrictions on the method by which the function v~ρV was constructed). Thus, the difference ηρ:=v-v~ρ is a known function and the quantity δρ=ηρ is directly computable. It is easy to see that

(4.2)δρ-v~ρ-vρvρ-vδρ+v~ρ-vρ.

To deduce a fully computable majorant of the norm v~ρ-vρ we use the method suggested in [25, 26]. First, we rewrite (2.1) in the form

(4.3)(ΛQvρ,Qw)=(ΛQv,Qw)-ρ((ΛQv,Qw)-f,w).

For any yY and wV0, we have

(4.4)(ΛQ(vρ-v~ρ),Qw)=(ΛQ(v-v~ρ),Qw)-ρ((ΛQv,Qw)-f,w)
=(ΛQ(v-v~ρ)-ρΛQv+y,Qw)-Q*y-ρf,w.

We estimate the first term on the right-hand side of (4.4) as follows:

(ΛQ(v-v~ρ)-ρΛQv+y,Q(vρ-v~ρ))=(Q(v-v~ρ)-ρΛ-1ΛQv+Λ-1y,ΛQ(vρ-v~ρ))
(ΛQ(v-v~ρ)+τy,Q(v-v~ρ)+Λ-1τy)1/2vρ-v~ρ,

where τy:=y-ρΛQv. The second term meets the estimate

Q*y-ρf,vρ-v~ρQ*y-ρfvρ-v~ρ1(λ)1/2Q*y-ρfvρ-v~ρ,

where w*=supwVw*,w/w is the dual norm. Hence

(4.5)vρ-v~ρ(ΛQηρ+τy,Qηρ+Λ-1τy)1/2+1λQ*y-ρf=:M(ηρ,τy).

Notice that

infyYM(ηρ,τy)=vρ-v~ρ.

Indeed, set y=ΛQ(vρ-v)+ρΛQv (in this case, τy=ΛQ(vρ-v)). In view of (4.3), Q*y-ρf,w=0, and the majorant is equal to vρ-v~ρ2. Thus, estimate (4.5) has no irremovable gap and a properly selected y yields a sharp upper bound of the error.

Remark 4.1.

It is not difficult to show that the last term of M(ηρ,τy) can be estimated via an explicitly computable quantity provided that y has the same regularity as the true flux (see [26]). However, in our subsequent analysis Q*y-ρf=0 and these advanced forms of the majorant are not required. In this case, the majorant has a simpler form:

M2(ηρ,τy)=(ΛQηρ,Qηρ)+(Λ-1τy,τy)+2(Qηρ,τy).

It is important that the computation of the majorant M does not require inversion of the operator Λ associated with a complicated quasi-periodic problem.

Now, (4.1), (4.2), and (4.5) yield the following result.

Theorem 4.2.

The error e=v-u is subject to the estimate

(4.6)e[max{0,δρ-M(ηρ,τy)1+q(ρ)},δρ+M(ηρ,τy)1-q(ρ)],

where M is defined by (4.5), τy:=y-ρΛQv, and y is a function in Y.

Remark 4.3.

Here ηρ and δρ are directly computable and q(ρ) is defined in accordance with relations presented in the previous section. Hence the cost of (4.6) is mainly related to the quantity M(ηρ,τy), which is an a posteriori error majorant of the functional type. The derivation of such estimates is performed by purely functional methods and does not exploit special features of approximations (e.g., Galerkin orthogonality), numerical method, and exact solution (e.g., extra regularity). Properties of the majorants are well studied (see [25, 26] and the literature cited therein). Numerous tests performed for different boundary value problems have confirmed high practical efficiency of error majorants of the functional type. It was shown that M is a guaranteed and efficient majorant of the global error and generates good indicators of local errors if y is replaced by a certain numerical reconstruction of the exact dual solution. There are many different ways to obtain suitable reconstructions with minimal expenditures (concerning this point we refer to [21] where the reader will find a systematic discussion of computational aspects in the context of various boundary value problems). Error majorants of this type can be also used for the evaluation of modeling errors (see [28, 27]).

Usually, the cost of a good estimate (with the efficiency index between 1 and 2) is comparable with the cost of a numerical solution. However, the proportion essentially depends on the numerical method used. For the classical FEM schemes the expenditures are maximal (because this method generates rather coarse approximations of fluxes). For the dual mixed method, finite volume method, isogeometric approximations, and other methods producing locally equilibrated fluxes, the expenditures may be two to three times smaller than for the numerical solution.

4.2 Examples

Now we shortly discuss applications of Theorem 4.2 to problems where Q and Q* are defined by the operators and div, respectively, Λ=a(x)𝕀, Λ=a(x)𝕀, xΩ, and V=H̊1(Ω).

d=1.

Let Ω=(0,1). Equation (1.1) has the form (a(x)u)-f=0. In this case, Qw=w, Q*y=-y, and (4.3) is reduced to

(4.7)01a(vρ-v)w𝑑x+ρ01(avw+fw)𝑑x=0.

In order to apply Theorem 4.2, we set y=ρ(g(x)+μ), where g(x)=-0xf𝑑x and μ is a constant. Then -y-ρf=0 and τ=ρ(g(x)+μ)-ρav=ρ(μ+g-av). The best constant μ is defined by minimization of M2(ηρ,τ), which has the form

01(a(ηρ)2+a-1ρ2(μ+g-av)2+2ηρρ(μ+g-av))𝑑x

Since 01ηρ𝑑x=0, the problem is reduced to minimization of the second term and the best μ satisfies the equation

01a-1(μ+g(x)-av)𝑑x=0.

Hence

μ=μ¯:=01a-1(av-g)𝑑x01a-1𝑑x,

and (4.6) yields the estimate

(4.8)e[max{0,δρ-I(v,v~ρ)1+q(ρ)},δρ+I(v,v~ρ)1-q(ρ)],

where

I2(v,v~ρ)=01a-1(a(v-v~ρ)+ρ(μ¯+g-av))2𝑑x.

Here v and v~ρ are two consequent numerical approximations (e.g., finite element approximations vk,h and vk+1,h computed on a mesh h). Then

ηρ=ηk,h:=vk,h-vk+1,handδρ=δk,h:=vk,h-vk+1,h

are directly computable. Since a is a “simple” function, the integrals

F1=01a-1𝑑x,F2=01a-1g𝑑x,F3=01a(ηk,h)2𝑑x,
F4=01a-1(μ¯+g)2𝑑x,F5=01(μ¯+g)ηk,h𝑑x

are easy to compute. Other integrals

G1=01a-1avk,h𝑑x,G2=01avk,hηk,h𝑑x,
G3=01(μ¯+g)a-1avk,h𝑑x,G4=01a-1a2(vk,h)2𝑑x

contain a highly oscillating coefficient a multiplied by piecewise polynomial mesh functions. If a and f have low QTT rank tensor representations [15], then the integrals can be efficiently computed by tensor-type methods discussed in [17] (see also Section 5 below). We have

I2(v,v~ρ)=F3+2ρG2+2ρF5+ρ2(F4-2G3+G4)=:εk,h2,μ¯=G1-F2F1.

Notice that the quantity εk,h is the value of the majorant, where the flux has been selected in the best way. It is not difficult to show that

a(vρ-v)=ρ(g+μ¯)-av

and, therefore,

I2(v,v~ρ)=01a(vρ-v~ρ)2𝑑x.

In other words, this term coincides with the error of the Galerkin solution related to the simplified boundary value problem (4.7), where v=vk,h. In accordance with Section 3, we set ρ=2/(𝐡+𝐡) and find that q=(𝐡-𝐡)/(𝐡+𝐡). Now (4.8) yields easily computable lower and upper bounds of the error encompassed in vk,h:

(4.9)δk,h-εk,h1+qvk,h-uδk,h+εk,h1-q.

Remark 4.4.

It is worth adding comments on convergence properties of the quantities δk,h and εk,h entering (4.9). If the mesh 𝒯h is fixed, then vk,h tends to the Galerkin approximation uh of problem (1.1) on this mesh. This fact follows from Theorem 2.1 applied to the case where the iterations are performed on the respective finite-dimensional space Vh (considered as the space V). Then, vk,h-uhqkv0,h-uh and for any h the term δk,h tends to zero with the geometric rate. The quantity εk,h is equal to the error of the Galerkin solution to the simplified problem. It has different asymptotic properties. It mainly depends on 𝒯h, and for a given mesh it does not tend to zero when k+. However, for any given v (which in our example is defined by vk,h) this term goes to zero if h0 provided that the mesh satisfies the standard regularity conditions. The problem with a is assumed to be much more regular than the problem with a. Therefore, in terms of h the approximation error εk,h (associated with a) will decrease faster than the analogous error in the original problem (e.g., for a=const, the term εk,h is proportional to h).

Since both quantities δk,h and εk,h are explicitly known, estimate (4.9) (and other analogous estimates) contains a very useful information unavailable in the context of purely asymptotic error analysis. Using this information, we can organize the computational process in the best possible way by comparing iteration and discretization errors. In this process, rapidly converging iterations with respect to k should be continued until δk,h>εk,h. If δk,hεk,h, then further iterations on the mesh 𝒯h are unable to essentially improve the numerical solution. Instead, we should refine 𝒯h, project uk,h on the refined mesh, and use it as a starting point for a new series of iterations generated by problem (4.7). For each step, we compute the right-hand side of (4.9) and stop the process when it becomes smaller than the desired tolerance.

d=2.

The computation of M for 2D problems can be also reduced to the computation of one-dimensional integrals. Certainly, on the multidimensional case the amount of integrals is much larger. However, the basic tensor decomposition methods remain the same. Below we briefly discuss them with the paradigm of a simple case where

f=f(1)(x1)f(2)(x2)anda=a(1)(x1)a(2)(x2).

Assume that approximations are represented in the form of series formed by one-dimensional functions ϕi(1) and ϕj(2) (which may be supported locally or globally), so that

v=i=1n1j=1n2γijϕi(1)(x1)ϕj(2)(x2),v~ρ=i=1n1j=1n2γ~ijϕi(1)(x1)ϕj(2)(x2).

In this case,

ηρ=(i=1n1j=1n2ςijϕi(1)x1ϕj(2),i=1n1j=1n2ςijϕi(1)ϕj(2)x2),where ςij=γij-γ~ij.

We define another set of one-dimensional functions Wk(1)(x1) and Wl(2)(x2), which form the vector function

(4.10)y=Υ0+k=1m1l=1m2σklΥkl,Υkl={Wk(1)Wl(2)x2;-Wk(1)x1Wl(2)}.

Here Υ0 is a given function, which can be defined in different ways. In particular, we set

Υ0={W0(1)(x1)W0(2)(x2);0},W0(1)(x1)=0x1f(1)𝑑x1,W0(2)=-ρf(2).

The functions Υkl must satisfy the usual linear independence conditions in order to guarantee unique solvability of the respective approximation problem. For any smooth function w vanishing on Ω, we have

Ω(Υ0w-ρfw)𝑑x1𝑑x2=0andΩΥklwdx1dx2=0.

Thus, Q*y-ρf=0 and we can use the simplified form of M.

In the simplest case Λ=a𝕀, where a is a constant. The best y minimizes the quantity

M2(ηρ,τ)=Ωaηρηρdx+Ωa-1yy𝑑x+ρ2Ωa-1a2vvdx
(4.11)-2Ω(ρa-1av+ηρ)y𝑑x+2ρΩaηρηρdx,

which shows that y must satisfy the relation y=ρav+aηρ. We select σkl that defines the Galerkin approximation of this function, and we arrive at the system

k=1m1l=1m2σklΩΥklΥst𝑑x1𝑑x2+ΩΥ0Υst𝑑x1𝑑x2
(4.12)=i=1n1j=1n2Ω(ρaγij+aςij)(ϕi(1)x1ϕj(2),ϕi(1)ϕj(2)x2)Υst𝑑x1𝑑x2.

Introduce the following matrices:

D(1)={Dkl(1)},Dkl(1)=0aWk(1)x1Wl(1)x1𝑑x1,W(1)={Wkl(1)},Wkl(1)=0aWk(1)Wl(1)𝑑x1,
D(2)={Dkl(2)},Dkl(2)=0bWk(2)x2Wl(2)x2𝑑x2,W(2)={Wkl(2)},Wkl(2)=0bWk(2)Wl(2)𝑑x2,
F(1)={Fik(1)},Fik(1)=0aϕi(1)x1Wk(1)𝑑x1,G(1)={Gik(1)},Gik(1)=0aϕi(1)Wk(1)x1𝑑x1,
F(2)={Fjl(2)},Fjl(2)=0bϕj(2)Wl(2)x2𝑑x2,G(2)={Gjl(2)},Gjl(2)=0bϕj(2)x2Wl(1)𝑑x2,
F^(1)={F^ik(1)},F^ik(1)=0aa1(x1)ϕi(1)x1Wk(1)𝑑x1,G^(1)={G^ik(1)},G^ik(1)=0aa1(x1)ϕi(1)Wk(1)x1𝑑x1,
F^(2)={F^jl(2)},F^jl(2)=0ba2(x2)ϕj(2)Wl(2)x2𝑑x2,G^(2)={G^jl(2)},G^jl(2)=0ba2(x2)ϕj(2)x2Wl(1)𝑑x2,

and vectors

𝐠(1)={gk(1)},gk(1)=0aW0(1)Wk(1)𝑑x1,𝐠(2)={gl(2)},gl(2)=0bW0(2)Wl(2)x2𝑑x2.

Notice that all coefficients are presented by one-dimensional integrals, which can be efficiently computed with the help of special (tensor-type) methods (see, e.g., [15, 16, 17, 18, 19]).

It is not difficult to see that

Yklst:=ΩΥklΥst𝑑x=Wks(1)Dlt(2)+Dks(1)Wlt(2)

and

ΩΥ0Υst𝑑x1𝑑x2=ΩW0(1)Ws(1)W0(2)Wt(2)x2𝑑x1𝑑x2=gs(1)gt(2),

where Y={Yklst} is the fourth-order tensor. Hence the left-hand side of the system (4.12) has the form Y𝝈+𝐠(1)𝐠(2). In the right-hand side we have the term

Ωaςij(ϕi(1)x1ϕj(2),ϕi(1)ϕj(2)x2)Υst𝑑x1𝑑x2=aH𝝇,

where H={Hijst}, Hstij=Fis(1)Fjt(2)-Gis(1)Gjt(2). Another term is

Ωρaγij(ϕi(1)x1ϕj(2),ϕi(1)ϕj(2)x2)Υst𝑑x1𝑑x2=H^𝜸,

where H^={H^ijst}, H^stij=F^is(1)F^jt(2)-G^is(1)G^jt(2).

Now (4.12) implies

𝝈=Y-1(H^𝜸+aH𝝇-𝐠(1)𝐠(2))

and the value of M is obtained by (4.6), (4.10), and (4.11).

5 Low-Rank Solution of the Discrete Equation

We consider the following elliptic diffusion equation with quasi-periodic coefficient a(x)>0 (whose oscillations are characterized by the parameter ϵ):

(5.1)𝒜u=-div(a(x)u)=f(x),x=(x1,,xd)Ω=(0,1)2,u|Γ=0,

where the function f corresponds to the modified right-hand side in problem (4.3). In this case Γ=Ω, Λ=a𝕀, Qw=w, and Q*y=-divy.

In what follows we assume that f and a admit low-rank representation i.e.,

f=i=1Rff1i(x1)f2i(x2),a=j=1Raa1j(x1)a2j(x2),

where the parameters Rf and Ra are called the separation rank. Then one may assume that the exact FEM solution can be well approximated by uK(x)=j=1Ku1j(x1)u2j(x2), where K depends on the separation rank of f and a. In some cases this important property can be rigorously proven (e.g., for the Laplacian and other closely related operators). Similar low-rank approximations can be observed for the QTT tensor approximations (see [17]). Existence of a low-rank solution means that for some moderate K we have uKu up to the rank truncation threshold.

First, we sketch the rank-structured computational scheme. In our set of examples the original problem is to find u such that

Ωa(x)uwdx=Ωfw𝑑xfor all wV0:=H01(Ω).

It is approximated by the following Galerkin problem for the low-rank representation uK:

(5.2)Ωa(x)uKwKdx=ΩfwK𝑑xfor all wKV0K,

where V0K is a subset of V0 formed by functions of the type

wK(x)=j=1Kϕ1j(x1)ϕ2j(x2).

Therefore, in terms of the general scheme exposed in the introduction, the problem 𝒫 is now problem (5.2) and we solve it by iterations with the help of the simplified (preconditioned) problem

(5.3)Ωa(x)ukKwKdx=Ωfk-1wK𝑑xfor all wKV0K,

where fk-1 depends on uk-1K and a is a simple function (i.e., it is representable by a sum of terms a1(x1)a2(x2) with very simple multipliers).

Given the right-hand side, problem (5.3) is much simpler than the initial problem and the stiffness matrix associated with (5.3) is computed much easier and has a simpler (low Kronecker rank) form that allows a rank-structured representation of its inverse.

5.1 Kronecker Product Representation of the Stiffness Matrix

Figure 3 illustrates a 2D example of the L×L periodic coefficient with L=6 corresponding to the choice ϵ=1L. In this example, the scalar coefficient is represented by the separable function

a(x)=C+a1(x1)a1(x2),C>0,

where the generating univariate function a1(x1) has the shape of six uniformly distributed bumps of height 1 as shown in Figure 3, right. Figure 3, left, presents the oscillating part of a 2D coefficients function, which is a1(x1)a1(x2). Here, the coefficient bumps are displaced on the coarse grid of size 8L×8L in such a way that bumps occupy the 4×4 central box in each of the 8×8 cells, which compose the whole L×L lattice-type decomposition of Ω (we have L=6 in Figure 3, i.e., the size of the coarse grid is 48×48, while L=12 in Figures 4 and 5, i.e., the size of the coarse grid is 96×96). Hence the axis scale 20,40,60,80 denotes the coarse grid in both x1 and x2 that describes the construction of coefficient in detail.

Figure 3

Example of the 2D periodic oscillating coefficients (left) and the respective 1D factor a1(x1).

The examples of other possible shapes of the equation coefficient corresponding to the cases (i), (ii) and (iii) specified in Section 1 are presented in Figures 1 and 2.

We apply the FEM Galerkin discretization of equation (5.1) by means of tensor-product piecewise affine basis functions (instead of “linear finite elements”)

{φ𝐢(x):=φi1(x1)φid(xd)},𝐢=(i1,,id),i={1,,n},=1,,d,

where φik are 1D finite element basis functions (say, piecewise linear hat functions).

We associate the univariate basis functions with the uniform grid {νj}, j=1,,n, on [0,1] with the mesh size h=1/(n+1). In this construction we have N=n1n2nd basis functions φ𝐢. Notice that the univariate grid size n is of the order of n=O(1ϵ) designating the total problem size N=O(1ϵd).

For ease of exposition we first consider the case d=2, and further assume that the scalar diffusion coefficient a(x1,x2) can be represented in the form

a(x1,x2)=k=1Rak(1)(x1)ak(2)(x2)>0

with a small rank parameter R.

The N×N stiffness matrix is constructed by the standard mapping of the multi-index 𝐢 into the N-long univariate index i representing all degrees of freedom. For instance, we use the so-called big-endian convention for d=3 and d=2:

𝐢i:=i3+(i2-1)n3+(i1-1)n2n3,𝐢i:=i2+(i1-1)n2,

respectively. Hence all matrices and vectors are defined on the long index i as usual, however, the special Kronecker structure allows the low-storage and low-complexity matrix vector multiplications when appropriate, i.e., when a vector also admits the low-rank Kronecker form representation. In particular, the basis function φ𝐢 is designated via the long index, i.e., φi=φ𝐢.

First, we consider the simplest case R=1 and let d=2. We construct the Galerkin stiffness matrix A=[aij]N×N in the form of a sum of Kronecker products of small “univariate” matrices. Recall that given p1×q1 matrix A and p2×q2 matrix B, their Kronecker product is defined as a p1p2×q1q2 matrix C via the block representation

C=AB=[aijB],i=1,,p1,j=1,,q1.

We say that the Kronecker rank of the matrix A in the representation above equals 1. Now the elements of the Galerkin stiffness matrix take the form

aij=𝒜φi,φj=Ωa(1)(x1)a(2)(x2)φi(x)˙φj(x)𝑑x
=01a(1)(x1)φi1(x1)x1φj1(x1)x1𝑑x101a(2)(x2)φi2(x2)φj2(x2)𝑑x2
+01a(1)(x1)φi1(x1)φj1(x1)𝑑x101a(2)(x2)φi2(x2)x2φj2(x2)x2𝑑x2,

which leads to the rank-2 Kronecker product representation

A=[aij]=A1M2+M1A2,

where denotes the conventional Kronecker product of matrices. Here A1=[ai1j1]n1×n1 and A2=[ai2j2]n2×n2 denote the univariate stiffness matrices and M1=[mi1j1]n1×n1 and M2=[mi2j2]n2×n2 define the corresponding weighted mass matrices, e.g.,

ai1j1=01a(1)(x1)φi1(x1)x1φj1(x1)x1𝑑x1,mi1j1=01a(1)(x1)φi1(x1)φj1(x1)𝑑x1.

By simple algebraic transformations (e.g., by lamping of the tri-diagonal mass matrices, which does not effect the approximation order of the FEM discretization) the matrix A can be simplified to the form

(5.4)AA=A1D2+D1A2,

where D1,D2 are the diagonal matrices. The matrix A corresponds to the FEM discretization of the initial elliptic PDE with complicated highly oscillating coefficients.

The simple choice of the spectrally equivalent preconditioner A corresponds to the operator Laplacian. In this case the representation in (5.4) is simplified to the discrete Laplacian matrix in the form of rank-2 Kronecker sum

(5.5)AA=A1I2+I1A2,

where I1 and I2 denote the identity matrices of the corresponding size. Here the simple tree-diagonal matrices A1 and A2 represent the FEM/FDM Laplacian in 1D. This matrix will be used in what follows as a prototype preconditioner for solving the linear system of equations

(5.6)A𝐮=𝐟.

The matrix A is constructed in general for the R-term separable coefficient a(x1,x2) with R1 which leads to the rank-2R Kronecker sum representation

A=k=1R[A1,kD2,k+D1,kA2,k],

with matrices of the respective size.

5.2 On Existence of the Low-Rank Solution

In this paper we discuss the approach based on the low-rank separable ϵ-approximation of the solution to equation (5.6) that is considered as the d-dimensional real-valued array 𝐮n1××nd. In general, for the case R>1 this favorable property is not guaranteed by the low Kronecker rank representation to the Galerkin system matrix A, discussed in Section 5.1.

Let R=1 and d=2. The existence of the low-rank approximation to the solution of equation (5.6) with the low-rank right-hand side

𝐟=k=1Rf𝐟k(1)𝐟k(2),𝐟k()n,

and with the system matrix in the form (5.5) can be justified by plugging the representation (5.5) in the sinc-quadrature approximation to the Laplace integral transform [5]

(5.7)Λ-1=+e-tΛ𝑑tBM:=k=-MMcke-tkΛ=k=-MMcke-tkA1e-tkA2,

taking into account that the matrices A1 and A2 commute with I1 and I2, respectively. Hence equation (5.7) represents the accurate rank-(2M+1) Kronecker product approximation to the preconditioner Λ-1 which can be applied directly to the right-hand side to obtain

𝐮=Λ-1𝐟BM𝐟=k=-MMckm=1Rfe-tkA1𝐟m(1)e-tkA2𝐟m(2).

The numerical efficiency of the representation (5.7) can be explained by the fact that the quadrature parameters tk,ck can be chosen in such a way that the low Kronecker rank approximation BM converges to Λ-1 exponentially fast in M. For example, under the choice tk=ekh, ck=htk with h=π/M there holds [5]

Λ-1-BMCe-βMΛ-1,

in the Frobenius norm, which means that the approximation error ϵ>0 can be achieved with the number of terms RB=2M+1 of the order of RB=O(|logϵ|2).

Figures 4 and 5 demonstrate the singular values of the discrete solution on the n×n grid for n=95,143 and 191, indicating very moderate dependence of the ϵ-rank on the grid size n. As in the case of Figure 3, in Figures 4 and 5 we only represent the oscillating part of the coefficients and omit the small constant C>0.

Figure 4

Rank decomposition of the solution for the 12×12 periodic coefficient.

Figure 5

Rank decomposition of the solution for the 12×12 modulated periodic coefficient.

Further enhancement of the tensor approximation can be based on the application of the quantized-TT (QTT) tensor approximation which has been already applied in [17] to the 1D equations with quasi-periodic coefficients. The power of the QTT approximation method is due to the perfect low-rank decompositions applied to the wide class of function-related tensors [15]. See [17] for a more detailed discussion and a number of numerical examples.

One can apply QTT approximations to problems with quasi-periodic coefficients, which can be described by oscillation with smooth modulation around a constant value, oscillation around a given smooth function, or oscillation around a piecewise constant function, see Figure 1 and examples in [17].

Let the vector 𝐱N, N=2L, be obtained by sampling a continuous function fC[0,1] (or even piecewise smooth functions), on the uniform grid of size N. For the following examples of univariate functions the explicit QTT-rank estimates of the corresponding QTT tensor representations are valid uniformly in the vector size N, see [15]:

1. r=1 for complex exponentials, f(x)=eiωx, ω.

2. r=2 for trigonometric functions, f(x)=sinωx, f(x)=cosωx, ω.

3. rm+1 for polynomials of degree m.

4. For a function f with the QTT-rank r0 modulated by another function g with the QTT-rank r (say, step-type function, plain wave, polynomial) the QTT rank of a product fg is bounded by a multiple of r and r0,

rankQTT(fg)rankQTT(f)rankQTT(g).
5. Furthermore, the following result holds [11]: the QTT rank for the periodic amplification of a reference function on a unit cell to a rectangular lattice is of the same order as that for the reference function.

The rank of the QTT tensor representation to the 1D Galerkin FEM matrix in the case of oscillating coefficients was discussed in [4, 17].

5.3 Numerical Test on the Rank Decomposition of u

Figure 6 represents the right-hand side f1(x1,x2) and the respective solution for the discretization to equation (5.1) (with the coefficient depicted in Figure 3) on a 400×400-grid, where

f1(x1,x2)=sin(2x1)sin(2x2).

The PCG solver for the system of equations (5.6) with the discrete Laplacian inverse as the preconditioner demonstrates robust convergence with the rate q1. The next example demonstrates the rank behavior in the singular value decomposition (SVD) of a matrix representing the solution vector 𝐮n1×n2 to equation (5.6) with the 12×12 periodic coefficient shown in Figure 4, left. Figure 7 represents the rank behavior in the SVD decomposition of the solution in the case of the 8×8 periodic coefficient.

Figure 6

The right-hand side and solution for periodic oscillating coefficients shown in Figure 3.

Comparing Figures 4 and 7 indicates that the exponential decay of the approximation error in the rank parameter is stable with respect to the size of the L×L lattice structure of the coefficient, i.e., the behavior of the singular values remains almost the same for different parameters ϵ=1L.

Figure 7

Accuracy of the rank decomposition of the solution vs. rank parameter for the 8×8 periodic coefficient and grid size n×n.

Our iterative scheme includes only the matrix-vector multiplication with the stiffness matrix A that has the small Kronecker rank 2R, and the action of the preconditioner defined by the approximate inverse to the Laplacian type matrix. The latter has low Kronecker rank of order RB=O(|logε|2) as shown above.

Given rank-1 vector 𝐮=𝐮1𝐮2, the standard property of the Kronecker product matrices

A𝐮=A1𝐮1M2𝐮2+M1𝐮1A2𝐮2

indicates that the matrix-vector multiplication enlarges the initial rank by the factor of 2, and similar with the action of the preconditioner. Hence each iterative step should be supplemented with certain rank truncation procedure which can be implemented adaptively to the chosen approximation threshold or fixed bound on the rank parameter.

Remark 5.1.

Notice that for d=3 the transformed matrix A takes the form

A=A1I2I3+I1A2I3+I1I2A3,

and it obeys the d-term Kronecker sum representation. Hence in the general case of d2 and R1 the Kronecker rank of the matrix A is given by

rankKron(A)=dR.

6 Conclusions

We present a preconditioned iteration method for solving an elliptic type boundary value problem in d with the operator generated by a quasi-periodic structure with rapidly changing coefficients characterized by a small length parameter ϵ. We use tensor product FEM discretization that allows to approximate the stiffness matrix A in the form of a low-rank Kronecker sum. The preconditioner A is constructed based on certain averaging (homogenization) procedure of the initial equation coefficients such that the inversion of A is much simpler than the inversion of A. We prove contraction of the iteration method and establish explicit estimates of the contraction factor q<1. For typical quasi-periodic structures we deduce fully computable two-sided a posteriori estimates which are able to control numerical solutions on any iteration.

We apply the tensor-structured approximation which is especially efficient if the equation coefficients admit low-rank representations and algebraic operations are performed in tensor structured formats. Under moderate assumptions the storage and solution complexity of our approach depends only weakly (merely linear-logarithmically) on the frequency parameter 1ϵ. Numerical tests demonstrate that the FEM solution allows the accurate low-rank separable approximation which is the basic prerequisite for application of the tensor numerical methods to the problems of geometric homogenization.

The approach allows further enhancement based on the quantized-TT (QTT) tensor approximation which is the topic for future research work. Another direction is related to fully tensor structured implementation of the computable two-sided a posteriori error estimates. The interesting question arises how far the presented approach can be extended to the numerical analysis of elliptic equations with rather unstructured jumping coefficients arising in stochastic homogenization, see, e.g., [6].

Acknowledgements

SR appreciates the support provided by the Max-Planck Institute for Mathematics in the Sciences (Leipzig, Germany) during his scientific visit in 2016. The authors are thankful to Dr. V. Khoromskaia (MPI MIS, Leipzig) for the help with numerical experiments.

References

[1] N. S. Bakhvalov and G. Panasenko, Homogenisation: Averaging Processes in Periodic Media. Mathematical Problems in the Mechanics of Composite Materials, Springer, Berlin, 1989. 10.1007/978-94-009-2247-1Search in Google Scholar

[2] P. Benner, V. Khoromskaia and B. N. Khoromskij, Range-separated tensor formats for numerical modeling of many-particle interaction potentials, preprint (2016), http://arxiv.org/abs/1606.09218. Search in Google Scholar

[3] A. Bensoussan, J.-L. Lions and G. Papanicolaou, Asymptotic Analysis for Periodic Structures, North-Holland, Amsterdam, 1978. Search in Google Scholar

[4] S. Dolgov, V. Kazeev and B. N. Khoromskij, The tensor-structured solution of one-dimensional elliptic differential equations with high-dimensional parameters, Preprint 51/2012, MPI MiS, Leipzig, 2012. Search in Google Scholar

[5] I. P. Gavrilyuk, W. Hackbusch and B. N. Khoromskij, Hierarchical tensor-product approximation to the inverse and related operators in high-dimensional elliptic problems, Computing 74 (2005), 131–157. 10.1007/s00607-004-0086-ySearch in Google Scholar

[6] A. Gloria and F. Otto, Quantitative estimates on the periodic approximation of the corrector in stochastic homogenization, ESAIM Proc. 48 (2015), 80–97. 10.1051/proc/201448003Search in Google Scholar

[7] R. Glowinski, J.-L. Lions and R. Trémolierés, Analyse Numérique des Inéquations Variationnelles, Dunod, Paris, 1976. Search in Google Scholar

[8] V. V. Jikov, S. M. Kozlov and O. A. Oleinik, Homogenization of Differential Operators and Integral Functionals, Springer, Berlin, 1994. 10.1007/978-3-642-84659-5Search in Google Scholar

[9] L. V. Kantorovich and V. L. Krylov, Approximate Methods of Higher Analysis, Interscience, New York, 1958. Search in Google Scholar

[10] V. Kazeev, O. Reichmann and C. Schwab, Low-rank tensor structure of linear diffusion operators in the TT and QTT formats, Linear Algebra Appl. 438 (2013), no. 11, 4204–4221. 10.1016/j.laa.2013.01.009Search in Google Scholar

[11] V. Khoromskaia and B. N. Khoromskij, Grid-based lattice summation of electrostatic potentials by assembled rank-structured tensor approximation, Comp. Phys. Commun. 185 (2014), no. 12, 3162–3174. 10.1016/j.cpc.2014.08.015Search in Google Scholar

[12] V. Khoromskaia and B. N. Khoromskij, Tensor approach to linearized Hartree–Fock equation for lattice-type and periodic systems, preprint (2014), https://arxiv.org/abs/1408.3839. Search in Google Scholar

[13] V. Khoromskaia and B. N. Khoromskij, Tensor numerical methods in quantum chemistry: From Hartree–Fock to excitation energies, Phys. Chem. Chem. Phys. 17 (2015), 31491–31509. 10.1039/C5CP01215ESearch in Google Scholar

[14] B. N. Khoromskij, Tensor-structured preconditioners and approximate inverse of elliptic operators in d, J. Constr. Approx. 30 (2009), 599–620. 10.1007/s00365-009-9068-9Search in Google Scholar

[15] B. N. Khoromskij, O(dlogN)-quantics approximation of N-d tensors in high-dimensional numerical modeling, Constr. Approx. 34 (2011), 257–280. 10.1007/s00365-011-9131-1Search in Google Scholar

[16] B. N. Khoromskij, Tensors-structured numerical methods in scientific computing: Survey on recent advances, Chemometr. Intell. Lab. Syst. 110 (2012), 1–19. 10.1016/j.chemolab.2011.09.001Search in Google Scholar

[17] B. N. Khoromskij and S. Repin, A fast iteration method for solving elliptic problems with quasiperiodic coefficients, Russian J. Numer. Anal. Math. Modelling 30 (2015), no. 6, 329–344. 10.1515/rnam-2015-0030Search in Google Scholar

[18] B. N. Khoromskij, S. Sauter and A. Veit, Fast quadrature techniques for retarded potentials based on TT/QTT tensor approximation, Comput. Methods Appl. Math. 11 (2011), no. 3, 342–362. 10.2478/cmam-2011-0019Search in Google Scholar

[19] B. N. Khoromskij and G. Wittum, Numerical Solution of Elliptic Differential Equations by Reduction to the Interface, Lect. Notes Comput. Sci. Eng. 36, Springer, Berlin, 2004. 10.1007/978-3-642-18777-3Search in Google Scholar

[20] J.-L. Lions and G. Stampacchia, Variational inequalities, Comm. Pure Appl. Math. 20 (1967), 493–519. 10.1002/cpa.3160200302Search in Google Scholar

[21] O. Mali, P. Neittaanmaki and S. Repin, Accuracy Verification Methods. Theory and Algorithms, Springer, New York, 2014. 10.1007/978-94-007-7581-7Search in Google Scholar

[22] P. Neittaanmaki and S. Repin, Reliable Methods for Computer Simulation. Error Control and a Posteriori Estimates, Elsevier, Amsterdam, 2004. Search in Google Scholar

[23] I. V. Oseledets and S. V. Dolgov, Solution of linear systems and matrix inversion in the TT-format, SIAM J. Sci. Comput. 34 (2012), no. 5, A2718–A2739. 10.1137/110833142Search in Google Scholar

[24] A. Ostrowski, Les estimations des erreurs a posteriori dans les procédés itératifs, C. R. Acad. Sci Paris Sér. A–B 275 (1972), A275–A278. Search in Google Scholar

[25] S. Repin, A posteriori error estimation for variational problems with uniformly convex functionals, Math. Comp. 69 (2000), no. 230, 481–500. 10.1090/S0025-5718-99-01190-4Search in Google Scholar

[26] S. Repin, A Posteriori Estimates for Partial Differential Equations, Walter de Gruyter, Berlin, 2008. 10.1515/9783110203042Search in Google Scholar

[27] S. Repin, T. Samrowski and S. Sauter, Combined a posteriori modeling-discretization error estimate for elliptic problems with complicated interfaces, ESAIM Math. Model. Numer. Anal. 46 (2012), no. 6, 1389–1405. 10.1051/m2an/2012007Search in Google Scholar

[28] S. Repin, S. Sauter and A. Smolianski, A posteriori estimation of dimension reduction errors for elliptic problems on thin domains, SIAM J. Numer. Anal. 42 (2004), no. 4, 1435–1451. 10.1137/030602381Search in Google Scholar

[29] E. Zeidler, Nonlinear Functional Analysis and Its Applications. I: Fixed-Point Theorems, Springer, New York, 1986. 10.1007/978-1-4612-4838-5Search in Google Scholar