Optimal convergence rates for goal-oriented FEM with quadratic goal functional

We consider a linear elliptic PDE and a quadratic goal functional. The goal-oriented adaptive FEM algorithm (GOAFEM) solves the primal problem as well as a dual problem, where the goal functional is always linearized around the discrete primal solution at hand. We show that the marking strategy proposed in [Feischl et al., SIAM J. Numer. Anal., 54 (2016)] for a linear goal functional is also optimal for quadratic goal functionals, i.e., GOAFEM leads to linear convergence with optimal convergence rates.


INTRODUCTION
Let Ω ⊂ R^d, d ≥ 2, be a bounded Lipschitz domain. For given f ∈ L^2(Ω) and 𝐟 ∈ [L^2(Ω)]^d, we consider a general linear elliptic partial differential equation

(1)  −div(A∇u) + b · ∇u + c u = f − div 𝐟  in Ω,  u = 0 on ∂Ω,

where A(x) ∈ R^{d×d}, b(x) ∈ R^d, and c(x) ∈ R. As usual, we assume that A, b, c ∈ L^∞(Ω), that A is uniformly positive definite, and that the weak form (see (5) below) fits into the setting of the Lax–Milgram lemma. While standard adaptivity aims to approximate the exact solution u ∈ H^1_0(Ω) at optimal rate in the energy norm (see, e.g., [Dör96, MNS00, BDD04, Ste07, CKNS08] for some seminal contributions and [FFP14] for the present model problem), goal-oriented adaptivity aims to approximate, at optimal rate, only the functional value G(u) ∈ R (also called the quantity of interest in the literature). Usually, goal-oriented adaptivity is more important in practice than standard adaptivity and has therefore attracted much interest also in the mathematical literature; see, e.g., [BR03, BR01, EEHJ95, GS02] for some prominent contributions. Unlike for standard adaptivity, there are only few works which aim for a mathematical understanding of optimal rates for goal-oriented adaptivity; see [MS09, BET11, FGH+16, FPZ16]. While the latter works consider only linear goal functionals, the present work aims to address, for the first time, optimal convergence rates for goal-oriented adaptivity with a nonlinear goal functional. More precisely, we assume that our primary interest is not in the unknown solution u ∈ H^1_0(Ω), but only in the functional value

(2)  G(u) := ⟨Ku, u⟩_{H^{−1}×H^1_0}

with a quadratic goal functional stemming from a bounded linear operator K : H^1_0(Ω) → H^{−1}(Ω). Here, ⟨·, ·⟩_{H^{−1}×H^1_0} denotes the duality between H^1_0(Ω) and its dual space H^{−1}(Ω) := H^1_0(Ω)' with respect to the extended L^2-scalar product. Possible examples for such goal functionals include, e.g., G(u) = ∫_Ω g(x) u(x)² dx for a given weight g ∈ L^∞(Ω) or G(u) = ∫_Ω u(x) 𝐠(x) · ∇u(x) dx for a given weight 𝐠 ∈ [L^∞(Ω)]^d.
In this work, we formulate a goal-oriented adaptive finite element method (GOAFEM), where the quantity of interest G(u) is approximated by G(u_ℓ) for some FEM solution u_ℓ ≈ u such that

(3)  |G(u) − G(u_ℓ)| → 0 as ℓ → ∞.

Moreover, if K is compact, then the convergence (3) holds even with optimal rates.

Outline. The paper is organized as follows: Section 2 formulates the finite element discretization together with two goal-oriented adaptive algorithms, A and B. Moreover, we state our main results: Proposition 1 and Proposition 4 guarantee convergence of Algorithm A and Algorithm B, respectively. If the operator K is even compact, then Theorem 2 yields linear convergence with optimal rates for Algorithm A, while Theorem 5 yields convergence with almost optimal rates for the simpler Algorithm B. Section 3 provides some numerical experiments which underline the theoretical predictions. Sections 4-7 are concerned with the proofs of our main results.

2.1. Variational formulation. We define the bilinear form

(4)  a(v, w) := ∫_Ω A∇v · ∇w + b · ∇v w + c v w dx.

According to the assumptions made, a(·, ·) fits into the setting of the Lax–Milgram lemma, i.e., a(·, ·) is continuous and elliptic on H^1_0(Ω). The weak formulation of (1) reads

(5)  a(u, v) = ∫_Ω f v + 𝐟 · ∇v dx  for all v ∈ H^1_0(Ω).

According to the Lax–Milgram lemma, (5) admits a unique solution u ∈ H^1_0(Ω). Given w ∈ H^1_0(Ω), the same argument applies and proves that the (linearized) dual problem

(6)  a(v, z[w]) = b(v, w) + b(w, v)  for all v ∈ H^1_0(Ω)

admits a unique solution z[w] ∈ H^1_0(Ω), where we abbreviate the notation by use of b(v, w) := ⟨Kv, w⟩_{H^{−1}×H^1_0}. We note that b(·, ·) is, in particular, a continuous bilinear form on H^1_0(Ω). Throughout, we denote by |||v|||² := ∫_Ω A∇v · ∇v dx the energy norm induced by the principal part of a(·, ·), which is an equivalent norm on H^1_0(Ω).

2.2. Finite element method.
For a conforming triangulation T_H of Ω into compact simplices and a polynomial degree p ≥ 1, we consider the conforming finite element space

X_H := {v ∈ H^1_0(Ω) : v|_T is a polynomial of degree ≤ p for all T ∈ T_H}.

We approximate u ≈ u_H ∈ X_H and z[w] ≈ z_H[w] ∈ X_H. More precisely, the Lax–Milgram lemma yields the existence and uniqueness of discrete solutions u_H, z_H[w] ∈ X_H of

(8)  a(u_H, v_H) = ∫_Ω f v_H + 𝐟 · ∇v_H dx  and  a(v_H, z_H[w]) = b(v_H, w) + b(w, v_H)  for all v_H ∈ X_H.

2.3. Linearization of the goal functional. To control the goal error |G(u) − G(u_H)|, we employ the dual problem. Note that

G(u) − G(u_H) = b(u − u_H, u − u_H) + [b(u − u_H, u_H) + b(u_H, u − u_H)].

With the dual problem and the Galerkin orthogonality, we rewrite the second bracket as

b(u − u_H, u_H) + b(u_H, u − u_H) = a(u − u_H, z[u_H]) = a(u − u_H, z[u_H] − z_H[u_H]).

With continuity of the bilinear forms a(·, ·) and b(·, ·), we thus obtain that

|G(u) − G(u_H)| ≲ |||u − u_H||| (|||u − u_H||| + |||z[u_H] − z_H[u_H]|||).

2.4. Mesh-refinement. Let T_0 be a given conforming triangulation of Ω. We suppose that the mesh-refinement is a deterministic and fixed strategy, e.g., newest vertex bisection [Ste08]. For each triangulation T_H and marked elements M_H ⊆ T_H, let T_h := refine(T_H, M_H) be the coarsest triangulation such that all T ∈ M_H have been refined, i.e., M_H ⊆ T_H\T_h. We write T_h ∈ T(T_H) if T_h results from T_H by finitely many steps of refinement. To abbreviate notation, let T := T(T_0).
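In finite dimensions, the algebraic expansion behind the linearization is elementary to verify. The following sketch checks the identity G(u) − G(u_H) = b(e, e) + b(e, u_H) + b(u_H, e) with e = u − u_H, where a hypothetical (non-symmetric) matrix B stands in for the bilinear form b(·, ·):

```python
# Finite-dimensional sketch of the expansion used in the linearization:
# for the quadratic goal G(v) = b(v, v), the goal error G(u) - G(uH)
# splits into a quadratic term in e = u - uH plus the "second bracket"
# b(e, uH) + b(uH, e). B is a hypothetical matrix standing in for b(., .).
B = [[2.0, 1.0], [0.0, 3.0]]

def b(v, w):
    """b(v, w) = v^T B w."""
    return sum(v[i] * B[i][j] * w[j] for i in range(2) for j in range(2))

u = [1.0, 2.0]    # "exact" solution (toy values)
uH = [0.8, 2.1]   # "discrete" solution (toy values)
e = [u[i] - uH[i] for i in range(2)]

lhs = b(u, u) - b(uH, uH)
rhs = b(e, e) + (b(e, uH) + b(uH, e))
assert abs(lhs - rhs) < 1e-12  # exact algebraic identity
```

The identity holds for any matrix B, which is why no symmetry of K is needed for the linearization step; only the dual problem has to account for both b(·, w) and b(w, ·).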
We further suppose that each refined element has at least two sons, i.e.,

(10)  #(T_H\T_h) + #T_H ≤ #T_h  for all T_H ∈ T and all T_h ∈ T(T_H),

and that the refinement rule satisfies the mesh-closure estimate

(11)  #T_ℓ − #T_0 ≤ C_mesh Σ_{j=0}^{ℓ−1} #M_j  for all ℓ ∈ N_0,

where C_mesh > 0 depends only on T_0. This has first been proved for 2D newest vertex bisection in [BDD04] and has later been generalized to arbitrary dimension d ≥ 2 in [Ste08]. While both works require an additional admissibility assumption on T_0, this has been proved unnecessary, at least for 2D, in [KPP13]. Finally, it has been proved in [CKNS08, Ste07] that newest vertex bisection ensures the overlay estimate, i.e., for all triangulations T_H, T_h ∈ T, there exists a common refinement T_H ⊕ T_h ∈ T(T_H) ∩ T(T_h) which satisfies

(12)  #(T_H ⊕ T_h) ≤ #T_H + #T_h − #T_0.

For meshes with first-order hanging nodes, (10)-(12) are analyzed in [BN10], while T-splines and hierarchical splines for isogeometric analysis are considered in [MP15, Mor16] and [BGMP16, GHP17], respectively.
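A 1D toy analogue illustrates the sons property and the overlay estimate. The mesh representation and the `refine`/`overlay` routines below are hypothetical stand-ins for the actual newest-vertex-bisection machinery, chosen so that the combinatorics are easy to check:

```python
# Toy 1D analogue: a "mesh" is a sorted list of breakpoints in [0, 1];
# each element is the interval between two neighbouring breakpoints.

def refine(points, marked):
    """Bisect each marked element (given by its index); every refined
    element is replaced by exactly two sons."""
    new_pts = set(points)
    for i in marked:
        new_pts.add(0.5 * (points[i] + points[i + 1]))
    return sorted(new_pts)

def overlay(pts_a, pts_b):
    """Coarsest common refinement: the union of the breakpoints."""
    return sorted(set(pts_a) | set(pts_b))

def num_elements(points):
    return len(points) - 1

T0 = [0.0, 0.5, 1.0]
TH = refine(T0, [0])                # refine the first element of T0
Th = refine(refine(T0, [1]), [1])   # refine twice towards the right
To = overlay(TH, Th)

# sons property: each refinement strictly increases the element count
assert num_elements(TH) == num_elements(T0) + 1
# overlay estimate #(T_H + T_h) <= #T_H + #T_h - #T_0 (sharp here):
assert num_elements(To) <= num_elements(TH) + num_elements(Th) - num_elements(T0)
```

In this run the overlay bound is attained with equality, which shows that the overlay estimate cannot be improved in general.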

2.5. Error estimators.
For T_H ∈ T and v_H ∈ X_H, let η_H(T, v_H) and ζ_H(T, v_H) denote the local refinement indicators of the primal and dual error estimator, respectively, where, for U_H ⊆ T_H,

η_H(U_H, v_H)² := Σ_{T∈U_H} η_H(T, v_H)²  and  η_H(v_H) := η_H(T_H, v_H),

and analogously for ζ_H. We suppose that the estimators η_H and ζ_H satisfy the following axioms of adaptivity from [CFPP14]: There exist constants C_stab, C_rel, C_drel > 0 and 0 < q_red < 1 such that for all T_H ∈ T and all T_h ∈ T(T_H), stability (A1), reduction (A2), and reliability (A3) are satisfied, as well as

(A4) discrete reliability: For all w ∈ H^1_0(Ω), the Galerkin solutions u_H, z_H[w] ∈ X_H and u_h, z_h[w] ∈ X_h to (8) satisfy that

|||u_h − u_H||| + |||z_h[w] − z_H[w]||| ≤ C_drel (η_H(T_H\T_h, u_H) + ζ_H(T_H\T_h, z_H[w])).

We note that the axioms (A1)-(A4) are satisfied, e.g., for standard residual error estimators. Due to the homogeneous Dirichlet data in (1), for all w ∈ H^1_0(Ω) there exist functions g[w] ∈ L^2(Ω) and 𝐠[w] ∈ [L^2(Ω)]^d such that

b(v, w) + b(w, v) = ∫_Ω g[w] v + 𝐠[w] · ∇v dx  for all v ∈ H^1_0(Ω).

With this, the residual error estimators in our case read, for v_H ∈ X_H, as the usual element residuals weighted by the local mesh-size plus the jumps of the discrete fluxes, where [·] denotes the jump across edges and n is the outward-facing unit normal vector.
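The precise statements of (A1)-(A3) are standard and follow the form of [CFPP14]; a sketch for the primal estimator alone (the combined primal/dual version used in this paper is analogous):

```latex
% Standard axioms of adaptivity [CFPP14], stated here for the primal
% estimator \eta_H only; the dual estimator \zeta_H satisfies the
% analogous properties.
\begin{align*}
 &\text{(A1) stability: }
   \big|\eta_h(\mathcal{T}_h\cap\mathcal{T}_H, v_h)
        - \eta_H(\mathcal{T}_h\cap\mathcal{T}_H, v_H)\big|
   \le C_{\mathrm{stab}}\,|||v_h - v_H|||,\\
 &\text{(A2) reduction: }
   \eta_h(\mathcal{T}_h\setminus\mathcal{T}_H, v_H)^2
   \le q_{\mathrm{red}}\,
   \eta_H(\mathcal{T}_H\setminus\mathcal{T}_h, v_H)^2,\\
 &\text{(A3) reliability: }
   |||u - u_H||| \le C_{\mathrm{rel}}\,\eta_H(u_H),\\
 &\text{(A4) discrete reliability: }
   |||u_h - u_H|||
   \le C_{\mathrm{drel}}\,\eta_H(\mathcal{T}_H\setminus\mathcal{T}_h, u_H).
\end{align*}
```

Here (A1) holds for all v_h ∈ X_h, v_H ∈ X_H, while (A2)-(A4) concern the Galerkin solutions on nested triangulations T_h ∈ T(T_H).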
2.6. Adaptive algorithm. We consider the following adaptive algorithm, which adapts the marking strategy proposed in [FPZ16].
(iii) Determine sets M_ℓ^u, M_ℓ^{uz} ⊆ T_ℓ of, up to the multiplicative constant C_mark, minimal cardinality such that

(13a)  θ η_ℓ(u_ℓ)² ≤ η_ℓ(M_ℓ^u, u_ℓ)²,
(13b)  θ (η_ℓ(u_ℓ)² + ζ_ℓ(z_ℓ[u_ℓ])²) ≤ η_ℓ(M_ℓ^{uz}, u_ℓ)² + ζ_ℓ(M_ℓ^{uz}, z_ℓ[u_ℓ])².

With Algorithm B below, we give and examine an alternative adaptive algorithm that appears cheaper in terms of computational cost.
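In a discrete toy setting, the marking step can be sketched as follows. `doerfler_mark` uses the standard greedy sort-based selection; the tie-breaking rule in `goafem_mark` (prefer the primal set) is a hypothetical choice, since the algorithm only requires quasi-minimal cardinality:

```python
def doerfler_mark(indicators, theta):
    """Greedy Doerfler marking: return element indices M of minimal
    cardinality with theta * sum(indicators) <= sum over M."""
    order = sorted(range(len(indicators)), key=lambda i: -indicators[i])
    total, acc, marked = sum(indicators), 0.0, []
    for i in order:
        if acc >= theta * total:
            break
        marked.append(i)
        acc += indicators[i]
    return marked

def goafem_mark(eta2, zeta2, theta):
    """One marking step in the spirit of Algorithm A: build both
    candidate sets (13a) and (13b) and keep the smaller one."""
    M_u = doerfler_mark(eta2, theta)                       # (13a)
    combined = [e + z for e, z in zip(eta2, zeta2)]
    M_uz = doerfler_mark(combined, theta)                  # (13b)
    return M_u if len(M_u) <= len(M_uz) else M_uz

eta2 = [0.5, 0.1, 0.05, 0.05]    # squared primal indicators (toy values)
zeta2 = [0.01, 0.6, 0.01, 0.01]  # squared dual indicators (toy values)
print(goafem_mark(eta2, zeta2, theta=0.5))  # → [0]
```

Sorting by indicator size makes the greedy set minimal for the Dörfler criterion, so C_mark = 1 is achievable at O(N log N) cost.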
Our first result states that Algorithm A indeed leads to convergence.
To formulate our main result on optimal convergence rates, we need some additional notation. For N ∈ N_0, let T_N := {T ∈ T : #T − #T_0 ≤ N} denote the (finite) set of all refinements of T_0 which have at most N elements more than T_0. For s, t > 0, we define the approximation norms ‖u‖_{A_s} and ‖z[u]‖_{A_t}. In explicit terms, e.g., ‖u‖_{A_s} < ∞ means that an algebraic convergence rate O(N^{−s}) for the error estimator η is achievable if the optimal triangulations are chosen.
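A sketch of the standard definition of these approximation norms, consistent with the explicit interpretation given above:

```latex
% Standard approximation classes (sketch): optimality is measured by the
% best possible estimator value among all refinements in T_N, i.e., with
% at most N additional elements compared to the initial mesh.
\begin{align*}
 \|u\|_{\mathbb{A}_s}
   &:= \sup_{N\in\mathbb{N}_0}\,(N+1)^{s}
       \min_{\mathcal{T}_{\mathrm{opt}}\in\mathbb{T}_N}
       \eta_{\mathrm{opt}}(u_{\mathrm{opt}}),\\
 \|z[u]\|_{\mathbb{A}_t}
   &:= \sup_{N\in\mathbb{N}_0}\,(N+1)^{t}
       \min_{\mathcal{T}_{\mathrm{opt}}\in\mathbb{T}_N}
       \zeta_{\mathrm{opt}}(z_{\mathrm{opt}}[u]).
\end{align*}
```

With this scaling, ‖u‖_{A_s} < ∞ is equivalent to the existence of a sequence of (not necessarily nested) triangulations along which the primal estimator decays like O(N^{−s}).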
The constants C_lin, q_lin, and ℓ_0 depend only on θ, q_red, C_stab, C_rel, the bilinear form a(·, ·), and the compact operator K. The constant C_opt depends only on θ, C_mesh, C_mark, C_lin, q_lin, ℓ_0, and (A1)-(A4).
Remark 3. (i) We note that, according to the considered dual problem (6), the goal functional (2) is linearized around u_ℓ in each step of the adaptive algorithm. Hence, we must enforce that the linearization error satisfies |||z_ℓ[u] − z_ℓ[u_ℓ]||| → 0 as ℓ → ∞. This is guaranteed by Proposition 1(ii) and Theorem 2(i), since both factors of the product involve the primal error estimator η_ℓ(u_ℓ).
(ii) For a linear goal functional and hence z_ℓ[u] = z_ℓ[u_ℓ], the work [FPZ16] considers plain ζ_ℓ² (instead of η_ℓ² + ζ_ℓ²) for the Dörfler marking (13b) and then proves a convergence behavior |G(u) − G(u_ℓ)| ≲ η_ℓ ζ_ℓ = O((#T_ℓ)^{−α}) for the estimator product, where α = s + t with s > 0 being the optimal rate for the primal problem and t > 0 being the optimal rate for the dual problem. Instead, Algorithm A will only lead to O((#T_ℓ)^{−α}) with α = min{2s, s + t}; see Theorem 2(ii).
(iii) For the marking strategy proposed in [BET11], where Dörfler marking is carried out for the weighted estimator (18), the present results and the analysis in [FPZ16] make it clear that this strategy also implies convergence with rate min{2s, s + t}. Details are omitted.
2.7. Alternative adaptive algorithm. From the upper bound (14) in Proposition 1(i), we can further estimate the goal error by

|G(u) − G(u_ℓ)| ≲ η_ℓ(u_ℓ)² + ζ_ℓ(z_ℓ[u_ℓ])².

This suggests the following algorithm, which marks elements solely based on the combined estimator η_ℓ(u_ℓ)² + ζ_ℓ(z_ℓ[u_ℓ])².
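The step from the product-type bound of Proposition 1(i) to the combined estimator is a direct application of the Young inequality ab ≤ a²/2 + b²/2; a sketch:

```latex
% Young's inequality applied to the product bound of Proposition 1(i):
\begin{align*}
 |G(u) - G(u_\ell)|
   &\lesssim \eta_\ell(u_\ell)\,
      \big(\eta_\ell(u_\ell) + \zeta_\ell(z_\ell[u_\ell])\big)
    = \eta_\ell(u_\ell)^2
      + \eta_\ell(u_\ell)\,\zeta_\ell(z_\ell[u_\ell])\\
   &\le \tfrac{3}{2}\,\eta_\ell(u_\ell)^2
      + \tfrac{1}{2}\,\zeta_\ell(z_\ell[u_\ell])^2
   \;\lesssim\; \eta_\ell(u_\ell)^2 + \zeta_\ell(z_\ell[u_\ell])^2.
\end{align*}
```

The squared combined estimator is thus reliable for the goal error, but it loses the product structure that yields the improved rate s + t.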
First, we note that Algorithm B also leads to convergence.
Proposition 4. For any bounded linear operator K, the following statements (i)-(ii) hold:
(i) There exists a constant C_rel > 0 such that |G(u) − G(u_ℓ)| ≤ C_rel (η_ℓ(u_ℓ)² + ζ_ℓ(z_ℓ[u_ℓ])²) for all ℓ ∈ N_0.
(ii) For all 0 < θ ≤ 1 and 1 < C_mark ≤ ∞, Algorithm B leads to convergence η_ℓ(u_ℓ)² + ζ_ℓ(z_ℓ[u_ℓ])² → 0 as ℓ → ∞.
The constant C_rel depends only on the constants from (A1)-(A3), the bilinear form a(·, ·), and the boundedness of K.
The following theorem proves linear convergence of Algorithm B with almost optimal convergence rate, where we note that β ≤ α for the rates in (17) and (24). By abuse of notation we use the same constants as in Theorem 2.
The constants C_lin, q_lin, and ℓ_0 depend only on θ, q_red, C_stab, C_rel, the bilinear form a(·, ·), and the compact operator K. The constant C_opt depends only on θ, C_mesh, C_mark, and (A1)-(A4).
Note that Algorithm B has slightly lower computational costs than Algorithm A, but achieves only a lower rate in general. However, if s ≤ t, then both algorithms achieve the rate 2s.

NUMERICAL EXPERIMENTS
In this section, we underline our theoretical findings by some numerical examples. As the starting point of all examples, we use equation (1) with A = I, b = 0, and c = 0 on the unit square Ω = (0, 1)². The initial mesh T_0 on Ω is obtained by certain uniform refinements of the mesh shown in Figure 1. All examples are computed with conforming finite elements of order p = 1 and p = 2, as outlined in Section 2.2.
In the following, we consider the marking strategies of Algorithm A and Algorithm B (denoted by A and B, respectively), as well as the marking strategies outlined in Remark 3(iii), i.e., Dörfler marking for (18) and (19), which will be denoted by BET1 and BET2, respectively. If not stated otherwise, the marking parameter is θ = 0.5 for all experiments.
3.1. Weighted L^2 norm. Suppose that g ∈ L^∞(Ω) is some weight function with g ≥ 0 a.e., whose regions of discontinuity are resolved by the initial mesh T_0 (i.e., g is continuous in the interior of every element of T_0). Then, we consider the weighted L^2-norm G(u) = ∫_Ω g(x) u(x)² dx as goal functional. Note that gu ∈ L^2(Ω) → H^{−1}(Ω), where the embedding is compact, so that this functional fits in the setting of Theorem 2 and Theorem 5. We choose

[Figure 2: Convergence rates of estimator product and goal error for the problem setting in Section 3.1 with p = 1 (left) and p = 2 (right). Note that the lines marked with A, B, and BET2 are almost identical.]

3.2. Nonlinear convection.
Suppose that w ∈ [L^∞(Ω)]² is some vector field, whose regions of discontinuity are resolved by the initial mesh T_0. As goal functional, we consider the nonlinear convection term G(u) = ∫_Ω u(x) w(x) · ∇u(x) dx. Note that w · ∇u ∈ L^2(Ω) → H^{−1}(Ω), where the embedding is compact, so that this functional fits in the setting of Theorem 2 and Theorem 5.
We compute the solutions to the primal and the dual problem for f = 0. The sets U_3 := {x ∈ Ω : x_1 − x_2 ≥ 0.25} and U_2 := (0.5, 1) × (0, 0.5) are shown in Figure 1. The numerical results are visualized in Figure 3. Note that the primal problem in this case exhibits a singularity which is not induced by the geometry and thus is not present in the dual problem.
3.3. Force evaluation. Let ε > 0 and let ψ be a suitable cut-off function. For a given direction w ∈ R², consider a goal functional which approximates the electrostatic force exerted by an electric potential u on a charged body occupying the domain U_1 in direction w (the part of the integrand in brackets is the so-called Maxwell stress tensor). Note that this functional does not fit in the setting of Theorem 2 and Theorem 5, since K is not compact, i.e., we cannot guarantee optimal rates for our Algorithms A and B. However, Proposition 1 and Proposition 4 still guarantee convergence of our algorithms. For our experiments, we choose w = (1, 1)/√2, f = 1, and 𝐟 = 0. Furthermore, we choose ψ ∈ X_0 for p = 1, i.e., ψ is piecewise linear, and ε is chosen such that ψ falls off to 0 exactly within one layer of elements around U_1 in T_0.
The results can be seen in Figure 3. In Figure 4, where we plot the estimator product (and, if available, the goal error) for different parameters θ = 0.1, 0.2, ..., 1.0, we see that this behavior does not depend on the marking parameter θ, generally speaking. It is striking that the strategy BET1 with θ < 1 fails to drive down the estimator product at the same speed as uniform refinement. This is likely due to the fact that the linearization error |||z_ℓ[u] − z_ℓ[u_ℓ]||| is disregarded; see Remark 3.

Discussion of numerical experiments. We clearly see from Figures 2-3 that our Algorithm A and BET2 outperform BET1 and sometimes even Algorithm B.
In Figure 5, we plot the error against the cumulative costs Σ_{k=0}^{ℓ} #T_k, where error is either the estimator product η_ℓ(u_ℓ) ζ_ℓ(z_ℓ[u_ℓ]) or the goal error |G(u) − G(u_ℓ)| in the ℓ-th step of the adaptive algorithm. We see that for the setting from Section 3.1, where no singularity occurs, optimal costs are achieved by uniform refinement, as is expected. For the goal error, which is not known in general, the strategy BET1 performs better than our Algorithms A and B. However, for the estimator product, which is the relevant quantity in most applications (since the error is unknown), it is inferior. In the other settings, where there is a singularity, our Algorithms A and B achieve their minimal cost around the value θ ≈ 0.7 of the marking parameter.
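The cumulative cost measure itself is straightforward to compute; a sketch with hypothetical element counts and error values for one adaptive run:

```python
from itertools import accumulate

# Hypothetical per-step data of an adaptive run: number of elements #T_k
# and error quantity err_k (e.g. the estimator product) in step k.
num_elements = [100, 180, 300, 520, 900]
errors = [1.0, 0.55, 0.30, 0.17, 0.09]

# Cumulative cost after step l: sum_{k <= l} #T_k, accounting for the
# fact that all solves up to step l contribute to the total work.
cumulative_cost = list(accumulate(num_elements))
print(list(zip(cumulative_cost, errors)))
```

Plotting `errors` over `cumulative_cost` (instead of over `num_elements` alone) penalizes strategies that need many intermediate meshes to reach a given accuracy.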

Axioms of adaptivity. Clearly, z[w] and z_H[w] depend linearly on w (since K is linear and hence b(·, ·) is bilinear). Moreover, we have the following stability estimates.

Lemma 6. For all w ∈ H^1_0(Ω) and all T_H ∈ T(T_0), it holds that

|||z[w] − z_H[w]||| ≤ C_1 min_{v_H ∈ X_H} |||z[w] − v_H|||  as well as  |||z[w]||| + |||z_H[w]||| ≤ C_2 |||w|||,

where C_1 > 0 depends only on a(·, ·), while C_2 > 0 depends additionally on the boundedness of K.
Proof. The definition of the dual problem shows the claim.

Next, we show that the combined estimator for the primal and dual problem satisfies the assumptions (A1)-(A4), where particular emphasis is put on (A3)-(A4). For ease of presentation (and by abuse of notation), we use the same constants as for the original properties (A1)-(A4), even though they now depend additionally on the bilinear form a(·, ·) and the boundedness of K.
In the following, we recall some basic results of [CFPP14].
Lemma 8 (quasi-monotonicity of estimators [CFPP14, Lemma 3.6]). Let w ∈ H^1_0(Ω), T_H ∈ T, and T_h ∈ T(T_H). The properties (A1)-(A3) together with the Céa lemma guarantee that

η_h(u_h) ≤ C_mon η_H(u_H)  and  ζ_h(z_h[w]) ≤ C_mon ζ_H(z_H[w]).

Moreover, the properties (A1)-(A3) for the combined error estimator together with the Céa lemma show that

η_h(u_h)² + ζ_h(z_h[u_h])² ≤ C_mon² (η_H(u_H)² + ζ_H(z_H[u_H])²).

The constant C_mon > 0 depends only on the properties (A1)-(A3), the bilinear form a(·, ·), and the boundedness of K.

PROOF OF PLAIN CONVERGENCE OF ALGORITHM A AND B
5.1. Algorithm A. First, we prove the upper bound for the goal error.
Proof of Proposition 1(i). It holds that

|G(u) − G(u_ℓ)| ≲ |||u − u_ℓ||| (|||u − u_ℓ||| + |||z[u_ℓ] − z_ℓ[u_ℓ]|||) ≲ η_ℓ(u_ℓ) (η_ℓ(u_ℓ) + ζ_ℓ(z_ℓ[u_ℓ])).

The hidden constants depend only on the boundedness of a(·, ·) and K and on the constant C_rel from (A3). According to the Young inequality, this concludes the proof.
Since Algorithm A linearizes the dual problem around the known discrete solution (i.e., it employs z_ℓ[u_ℓ] instead of the non-computable z_ℓ[u]), a first important observation is that Algorithm A ensures convergence for the primal solution. In particular, the following proposition allows us to apply the quasi-orthogonalities from Section 4.2.

Proposition 14 (plain convergence of errors and error estimators).
Suppose (A1)-(A3). Then, for any choice of the marking parameters 0 < θ ≤ 1 and C_mark ≥ 1, Algorithm A and Algorithm B guarantee that the errors and error estimators converge. Moreover, at least one of these two cases is met.
Proof. Since the discrete spaces are nested, it follows from the Céa lemma that there exists u_∞ ∈ H^1_0(Ω) such that |||u_∞ − u_ℓ||| → 0 as ℓ → ∞; see, e.g., [MSV08, AFLP12] or even the early work [BV84]. More precisely, u_∞ is the Galerkin approximation of u with respect to the "discrete limit space" X_∞ := closure(⋃_{ℓ=0}^∞ X_ℓ), where the closure is taken in H^1_0(Ω). Analogously, there exists z_∞ ∈ H^1_0(Ω) such that |||z_∞ − z_ℓ[u_ℓ]||| → 0 as ℓ → ∞. Together with (38) and Lemma 6, this also proves that |||z_∞ − z_ℓ[u_∞]||| → 0 as ℓ → ∞. In the following, we aim to show that, in particular, u = u_∞. To this end, the proof considers two cases:

(1) There exists a subsequence (T_{ℓ_k})_{k∈N_0} such that M_{ℓ_k} satisfies (13a) for all k ∈ N_0.
(2) There exists a subsequence (T_{ℓ_k})_{k∈N_0} such that M_{ℓ_k} satisfies (13b) for all k ∈ N_0.

Clearly, (at least) one of these subsequences is well-defined (i.e., there are infinitely many steps of the respective marking).
Case (2). We repeat the arguments from Case (1). Instead of η_H(u_H)², we consider the combined estimator η_H(u_H)² + ζ_H(z_H[u_H])². For all k ∈ N_0, this leads to an estimator reduction along the subsequence. As before, basic calculus reveals that the combined estimator vanishes in the limit. In this case, we thus see that u = u_∞ and z[u] = z_∞ as well as estimator convergence η_ℓ(u_ℓ) + ζ_ℓ(z_ℓ[u_ℓ]) → 0 as ℓ → ∞ (which now follows from Lemma 8). In any case, this concludes the proof.

5.2. Algorithm B.
First, we prove the upper bound.
Proof of Proposition 4(i). From Proposition 1(i) and the Young inequality, it follows that |G(u) − G(u_ℓ)| ≲ η_ℓ(u_ℓ)² + ζ_ℓ(z_ℓ[u_ℓ])². This proves the claim.
with C := max{1, C_stab C_rel C_1 C_2}, where we use the quasi-monotonicity of the estimators (Lemma 8). Choose the minimal N ∈ N_0 such that

‖u‖_{A_s} (‖u‖_{A_s} + ‖z[u]‖_{A_t}) ≤ ε (N + 1)^α.
According to the choice of T_h and R_H, the overlay estimate (12) yields the claim. Overall, we conclude (40) with C_aux = C_mon/κ_opt.

PROOF OF THEOREM 5
In contrast to the corresponding results for Algorithm A, the proof of Theorem 5 (for Algorithm B) follows essentially from the abstract setting of [CFPP14].