
Error minimizing relaxation strategies in Landweber and Kaczmarz type iterations

  • Touraj Nikazad, Mokhtar Abbasi and Tommy Elfving

Abstract

We study error minimizing relaxation (EMR) strategies for use in Landweber and Kaczmarz type iterations applied to linear systems with or without convex constraints. Convergence results based on operator theory are given, assuming exact data. The advantages and disadvantages of these relaxation strategies on a noisy and ill-posed problem are illustrated using examples taken from the field of image reconstruction from projections. We also consider combining EMR with penalization.

MSC 2010: 65F10; 65N21; 65N20

1 Introduction

Discrete ill-posed problems occur in many applications, such as deblurring, microscopy, medical imaging, and other areas [21, 27]. We are here in particular interested in studying the discrete version of the Radon transform used in modeling several reconstruction problems [22, 29, 32], which generally leads to a rank-deficient and ill-conditioned linear system of equations. Hence some kind of regularization technique must be employed to stabilize the solution process. For large scale problems a useful option is iterative methods. These methods carry an advantage over (the usually faster) filtered back projection, e.g., when the problems are underdetermined. This happens in limited angle applications (e.g., breast X-ray tomography) and few-projection measurements (where the X-ray dose should be limited). Other advantages include the possibility to introduce constraints in the inversion process, and to adapt more easily to new scanner geometries.

When applying an iterative method one may view the iteration index as a regularizing parameter. Initially, the iteration vectors approach a regularized solution, while continuing the iteration often leads to iteration vectors corrupted by noise, so-called semi-convergence, Natterer [32, p. 89]. To explain this further, let $x^*$ be the sought solution using exact data. Moreover, let $x^k$ and $\bar{x}^k$ denote the iterates using exact data and perturbed data, respectively. Then it holds

$\|\bar{x}^k - x^*\| \le \|\bar{x}^k - x^k\| + \|x^k - x^*\|.$

Hence the error is decomposed into two components, the data error and the approximation error. It is the interplay between these two error terms that explains the semi-convergence of the iteration. For more on semi-convergence, see [16, 19, 21, 26, 27, 29] and references therein.

Iterative methods are often of simultaneous or sequential type. Classically, this is associated with Jacobi and Gauss–Seidel iterations. In the inverse problem area these are often referred to as Landweber-type and Kaczmarz-type iterations. In classical use (e.g., when solving discretized partial differential equations [43]) most often only the approximation error is of interest, whereas when solving inverse problems both error terms are of interest.

1.1 Our work and related works

The choice of relaxation parameters can have a big impact on the speed of convergence of an iterative method, see, e.g., [10, 34], and it may also control the semi-convergence phenomenon [17]. Possible strategies include the use of training data to find a good value of the parameter [8, 36, 42], as well as the use of extrapolation [2, 37]. The purpose here is to define and study a class of relaxation rules that gives the largest reduction of the error in each iterative step.

We consider three algorithmic structures: fully simultaneous (Algorithm 1), row-block simultaneous (Algorithm 2), and row-block sequential (Algorithm 3). All three algorithms contain weight matrices. As demonstrated in Section 2, several well-known methods emerge by proper choice of weights. We define for each algorithm an error functional called EMR (error minimizing relaxation) and derive its relaxation parameters. The EMR-functional depends on a nonnegative integer $s$. When $s = 0, 1, 2$, respectively, the norm of the error, the residual and the residual corresponding to the normal equations are minimized in each step. We further extend the algorithms to handle convex constraints, where we assume, for practical reasons, that the projection onto the constraint set can be easily implemented (e.g., box constraints). We provide convergence results for the approximation error, under different consistency assumptions, for all cases. In particular, we show that the iterates of Algorithm 1 for $s \ge 1$ converge to a weighted least squares solution, whereas the iterates of the block algorithms for $s \ge 1$ converge to a member of (3.10) (which is a nonempty set if, e.g., the underlying linear system is consistent). The iterates of the constrained versions are shown to converge to members of convex feasibility problems as explained in Section 6.

Our analysis is based on operator theory [1, 5, 20]. We will show that the EMR-based algorithmic operators belong to the class of paracontracting operators. Thereby we can utilize known convergence properties of sequences generated by such operators to handle convergence of our algorithms. The orthogonal projection onto a closed convex set is also a paracontraction. We are using this and the fact that a composition of two paracontractions remains paracontracting to handle convergence of the iterates of the constrained versions.

Since we do not deal with the data error, we have instead added a study of the numerical behavior of the algorithms applied to (noisy) problems arising from image reconstruction applications. In the last part we demonstrate the performance of EMR on a total variation-penalized problem using the superiorization technique [4].

We next relate our results to previous research, first considering the unconstrained case. When $s = 0$ in Algorithm 1 the method is due to De Pierro [13]. For $s = 1$ it becomes the classical steepest descent method applied to the weighted least squares problem. Convergence results relying on the matrix having zero nullity can be found in [40, 41]. Dax [12], using $s = 1$, considers accelerating basic iterative methods (including Kaczmarz's method) by line search. For $s \ge 1$, and with no weights, Algorithm 1 is a special case of an algorithm given by Friedlander, Martínez, Molina and Raydan [24, (2.6)]. However, there are some differences. In [24] no connection to EMR is made, and their convergence proof also relies on the assumption that the nullity of the matrix is zero. This condition cannot be guaranteed in our applications.

Haltmeier [25] introduced a block-iterative Landweber–Kaczmarz method, assuming that system (2.1) is consistent. Instead of using weight matrices (as done here), scalar weights $\omega_t \ge 0$ are used. Since some of the weights may be zero, the scheme includes both simultaneous and sequential block-iteration. The relaxation parameters are of the form $\phi(\alpha)$, where $\alpha$ is defined in (3.13) (for simultaneous iteration) and in (3.17) (for sequential iteration), in both cases using the value $s = 1$. However, since it is assumed that $\phi(x)$ is bounded for all $x$ (so $\phi(x) = x$ is not allowed), this relaxation parameter rule does not include our rules, even for the case $s = 1$. An interesting feature is the use of a loping strategy which omits the iterative step provided the residual is below a certain threshold. De Cezaro, Haltmeier, Leitão and Scherzer presented in [14] the (nonlinear) Kaczmarz method using relaxation parameters equal to $\phi(\lambda_{\mathrm{bseq}}(s))$, $s = 1$, where $\lambda_{\mathrm{bseq}}$ is defined in (3.17).

We now turn to the constrained case. Bauschke, Combettes and Kruk [2] treat the consistent convex feasibility problem (in a general Hilbert space setting). Their algorithmic structure is of sequential block-iteration type (which includes the fully simultaneous case as a special case). The algorithmic operators belong to the so-called T-class, e.g., orthogonal projections and subgradient projections. To compare their general scheme with ours we consider the special case [2, (3.34)], with their operators now being orthogonal projectors onto hyperplanes in $\mathbb{R}^n$. This leads to Cimmino weight matrices. The upper bound [2, (3.35)] is the same as in our paper using the value $s = 0$. There are no results using $s \ge 1$.

1.2 Organization

In Section 2 we present the fully simultaneous weighted Landweber iteration, and the two companion block-iterations. In Section 3 the error minimizing relaxation is described, and the corresponding relaxation parameters derived. In Section 4 the algorithms are expressed in a unified way, and the relevant operators are identified. Further convergence results for iterated paracontracting operators taken from [1, 5, 20] are presented. In Section 5 we show that our algorithmic operators are paracontractions. Thereby we can invoke the convergence theory presented in Section 4 to conclude convergence of all three algorithms. These results are summarized in Theorems 4, 7 and Remarks 5, 8. In Section 6 the constrained versions of the algorithms are presented and analyzed. In Section 7 we present the outcome of numerical experiments using examples taken from image reconstruction from projections (tomography). In Section 8 we consider two examples using EMR with TV-penalization, and finally in Section 9 the main results of the paper are summarized.

2 Algorithms

We consider a linear system of equations arising from the discretization of an ill-posed problem,

(2.1) $Ax = b, \quad A \in \mathbb{R}^{m\times n}, \; b \in \mathbb{R}^m.$

We next introduce some notation which will be used throughout the paper. Let $R(Q)$ and $N(Q)$ denote the range space and null space of a matrix $Q$, respectively. The Euclidean inner product is denoted by $\langle x, y\rangle$, and $\|x\|$ is the corresponding norm. Further, for a symmetric positive definite (SPD) matrix $M$, $\|x\|_M = \sqrt{\langle Mx, x\rangle}$ denotes a weighted Euclidean norm. Also, $M^{1/2}$ and $\rho(Q)$ denote the square root of $M$ and the spectral radius of a matrix $Q$, respectively.

Let the linear system (2.1) be partitioned into $p$ blocks of equations, which may contain common equations, but each equation should appear in at least one of the blocks. Denote by $A_t$ and $b_t$ the $t$th block of $A$ and $b$, respectively. Let $\{\lambda_k\}_{k\ge 0}$ and $\{\lambda_t\}_{t=1}^{p}$ denote two sets of positive relaxation parameters, and let $\{M_t\}_{t=1}^{p}$, $M$ be given SPD weight matrices. Let $x^0, z^0, y^0$ be given arbitrary vectors in $\mathbb{R}^n$.

We will study the following three types of iterations.

Algorithm 1

Fully simultaneous:

(2.2) $x^{k+1} = x^k + \lambda_k A^T M (b - A x^k), \quad k = 0, 1, \ldots.$
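For illustration, a minimal NumPy sketch of iteration (2.2) with a fixed (or prescribed) relaxation sequence; the dense matrix representation, the function name and the zero starting vector are illustrative assumptions, not part of the original formulation.

```python
import numpy as np

def landweber_weighted(A, b, M, lam, n_iter, x0=None):
    """Fully simultaneous iteration (2.2): x^{k+1} = x^k + lam_k A^T M (b - A x^k)."""
    x = np.zeros(A.shape[1]) if x0 is None else x0.copy()
    lams = np.full(n_iter, lam) if np.isscalar(lam) else np.asarray(lam)
    for k in range(n_iter):
        r = b - A @ x                      # residual b - A x^k
        x = x + lams[k] * (A.T @ (M @ r))  # weighted Landweber step
    return x
```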
Algorithm 2

Block simultaneous:

(2.3) $z^{k,t} = z^k + \lambda_t A_t^T M_t (b_t - A_t z^k), \quad t = 1, \ldots, p,$
$z^{k+1} = \frac{1}{p}\sum_{t=1}^{p} z^{k,t}.$

For simplicity when forming zk+1 equal weights are used. However, one may also use convex weights without affecting the subsequent analysis.

Algorithm 3

Block sequential:

$y^{k,0} = y^k,$
(2.4) $y^{k,t} = y^{k,t-1} + \lambda_t A_t^T M_t (b_t - A_t y^{k,t-1}), \quad t = 1, 2, \ldots, p,$
$y^{k+1} = y^{k,p}.$
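As a sketch (under the same illustrative assumptions as above, with the row blocks given as index arrays), one cycle of each block iteration can be organized as follows.

```python
import numpy as np

def block_simultaneous_cycle(A, b, blocks, Ms, lams, z):
    """One cycle of Algorithm 2, eq. (2.3): average the p independent block updates."""
    updates = []
    for t, idx in enumerate(blocks):
        At, bt = A[idx, :], b[idx]
        updates.append(z + lams[t] * (At.T @ (Ms[t] @ (bt - At @ z))))
    return np.mean(updates, axis=0)        # equal weights 1/p

def block_sequential_cycle(A, b, blocks, Ms, lams, y):
    """One cycle of Algorithm 3, eq. (2.4): apply the block updates one after another."""
    for t, idx in enumerate(blocks):
        At, bt = A[idx, :], b[idx]
        y = y + lams[t] * (At.T @ (Ms[t] @ (bt - At @ y)))
    return y
```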

Let a cycle denote one pass through the data. So whereas Algorithms 2 and 3 require $p$ iterations to complete a cycle, Algorithm 1 requires one iteration. The cycles of Algorithm 2 can easily be rewritten as a fully simultaneous method. Indeed, with $M = \mathrm{diag}(\lambda_t M_t)$ (a block diagonal SPD matrix) it is easily shown that

(2.5) $z^{k+1} = z^k + \frac{1}{p} A^T M (b - A z^k).$
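Identity (2.5) is easy to check numerically; the small random data below is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A, b, z = rng.standard_normal((6, 4)), rng.standard_normal(6), rng.standard_normal(4)
blocks = [np.arange(0, 3), np.arange(3, 6)]             # p = 2 row blocks
Ms, lams = [np.eye(3), 2.0 * np.eye(3)], [0.7, 0.4]     # SPD weights and relaxations

# one cycle of Algorithm 2
z_cycle = np.mean([z + lams[t] * (A[idx, :].T @ (Ms[t] @ (b[idx] - A[idx, :] @ z)))
                   for t, idx in enumerate(blocks)], axis=0)

# equivalent fully simultaneous step (2.5) with M = diag(lam_t M_t)
M = np.zeros((6, 6))
M[:3, :3], M[3:, 3:] = lams[0] * Ms[0], lams[1] * Ms[1]
z_sim = z + 0.5 * (A.T @ (M @ (b - A @ z)))
print(np.allclose(z_cycle, z_sim))                       # True
```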

We remark that our relaxation strategies allow the relaxation parameters to be updated using the current iterate. We next present some convergence results taken from the literature. Let

(2.6) $\bar{b} = M^{1/2} b, \quad \bar{A} = M^{1/2} A,$

and consider the weighted least squares problem

(2.7) $\min_x f(x), \quad f(x) = \tfrac{1}{2}\|b - Ax\|_M^2 = \tfrac{1}{2}\|\bar{b} - \bar{A}x\|_2^2.$

Let x* be a solution of (2.7). The following convergence result is well known, e.g., [7], [30, Theorem II.3].

Theorem 1

Let $\rho = \rho(A^T M A)$, and assume that $0 \le \epsilon \le \lambda_k \le (2-\epsilon)/\rho$. If $\epsilon > 0$, or if $\epsilon = 0$ and

$\sum_{k=0}^{\infty} \min(\rho\lambda_k,\, 2 - \rho\lambda_k) = +\infty,$

then the iterates of Algorithm 1 converge to a solution of (2.7). If $x^0 \in R(A^T)$, then the limit is the unique solution of minimal Euclidean norm.

However, as demonstrated in an example (see Section 4) the relaxation parameters in the EMR-based algorithms, although bounded away from zero (Lemma 3), may exceed the upper bound in Theorem 1.

The following convergence results can be deduced from [15, 18].

Theorem 2

Assume that

$0 < \epsilon \le \lambda_t \le \frac{2-\epsilon}{\rho(A_t^T M_t A_t)}, \quad t = 1, 2, \ldots, p.$

Then the cycles $\{y^k\}$ defined by Algorithm 3 converge to a point which satisfies a certain linear system of equations. If in addition $b \in R(A)$ and $y^0 \in R(A^T)$, then $\{y^k\}$ converges towards the solution of $Ax = b$ with minimal Euclidean norm.

We close this section by presenting some well-known examples of weights. Let

(2.8) $a_{t,j}, \quad t = 1, 2, \ldots, p, \; j = 1, 2, \ldots, m(t),$

be the $j$th row in block $A_t \in \mathbb{R}^{m(t)\times n}$.

Example 1 (Cimmino)

Let

(2.9) $M_t = \mathrm{diag}\!\left(\frac{w_{tj}}{\|a_{t,j}\|_2^2}\right), \quad t = 1, 2, \ldots, p,$

where $w_{tj} > 0$, $\sum_{j=1}^{m(t)} w_{tj} = 1$ are user-defined weights; see, e.g., [7], in which Algorithm 3 is analyzed with the Cimmino and other weights.

Example 2 (Block-Kaczmarz)

Let

(2.10) $M_t = (A_t A_t^T)^{\dagger}, \quad t = 1, 2, \ldots, p,$

where $Q^{\dagger}$ is the pseudoinverse of a matrix $Q$. Using Algorithm 3 with $p = m$, the method is also known as the algebraic reconstruction technique (ART) [29, 32].

Example 3 (Block-Iterative Component Averaging (BICAV))

This method was introduced in [9]. Let $s_\nu^t$ be the number of nonzero elements in column $\nu$ of $A_t$. Then

(2.11) $M_t = \mathrm{diag}\!\left(\frac{1}{\|a_{t,j}\|_{S_t}^2}\right), \quad S_t = \mathrm{diag}(s_\nu^t), \quad t = 1, 2, \ldots, p.$

Assuming equal weights for a fully dense matrix, BICAV and Cimmino become identical.

For more examples of weights see [8, 9].
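The weight matrices of Examples 1–3 are straightforward to form for a single block $A_t$; a sketch (assuming $A_t$ has no zero rows, with equal Cimmino weights $w_{tj} = 1/m(t)$):

```python
import numpy as np

def cimmino_weights(At):
    """(2.9) with equal weights w_tj = 1/m(t): M_t = diag(w_tj / ||a_{t,j}||^2)."""
    return np.diag(1.0 / (At.shape[0] * np.sum(At**2, axis=1)))

def block_kaczmarz_weight(At):
    """(2.10): M_t = (A_t A_t^T)^dagger."""
    return np.linalg.pinv(At @ At.T)

def bicav_weights(At):
    """(2.11): M_t = diag(1 / ||a_{t,j}||_{S_t}^2), S_t = diag(s_nu^t)."""
    s = np.count_nonzero(At, axis=0)       # s_nu^t: nonzeros in column nu of A_t
    return np.diag(1.0 / ((At**2) @ s))    # row-wise sum_nu s_nu * a_{t,j,nu}^2
```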

3 Error minimizing relaxation (EMR)

In this section we will formulate the EMR-principle, and derive the corresponding relaxation parameters in the three algorithms. Let x* be a solution of the least squares problem (2.7), and define

(3.1) $B = A^T M A.$

We first show the following results which will become useful in our analysis.

Lemma 1

Let $v = A^T M (b - Az)$, $s = 0, 1, 2, \ldots$, and $z \in \mathbb{R}^n$. Then it holds:

  (a) If $\langle v, B^s v\rangle = 0$, then $v = 0$.

  (b) For $s \ge 2$ and $v \ne 0$,

    $\frac{\langle v, B^{s-1} v\rangle}{\langle v, B^{s} v\rangle} \le \frac{\langle v, B^{s-2} v\rangle}{\langle v, B^{s-1} v\rangle} \le \cdots \le \frac{\langle v, v\rangle}{\langle v, B v\rangle} = \frac{\|v\|^2}{\langle v, B v\rangle}.$

Proof.

(a) It holds that $N(B) = N(A^TMA) = N(\bar{A}^T\bar{A}) = N(\bar{A}) = N(A)$. Hence

(3.2) $v \in R(A^T) \cap N(B) \;\Longrightarrow\; v = 0.$

Since $v \in R(A^T)$ by definition and, for $s \ge 1$, $\langle v, B^s v\rangle = 0$ implies $v \in N(B)$ (the case $s = 0$ is immediate), part (a) follows from (3.2).

(b) Let

$h(t) = \langle x^* - z - tv, \; B^s(x^* - z - tv)\rangle.$

Since $B^s$ is symmetric positive semidefinite, we have $h(t) \ge 0$ for any $t$. Using that

$B(x^* - z) = A^TMA(x^* - z) = A^TM(b - Az) = v,$

we get

$h(t) = \langle x^* - z, B^s(x^* - z)\rangle - 2t\langle x^* - z, B^s v\rangle + t^2\langle v, B^s v\rangle$
(3.3) $= \langle v, B^{s-2}v\rangle - 2t\langle v, B^{s-1}v\rangle + t^2\langle v, B^s v\rangle.$

Let

$t_{\min} = \frac{\langle v, B^{s-1}v\rangle}{\langle v, B^s v\rangle} = \arg\min_t h(t).$

The inequality $h(t_{\min}) \ge 0$ gives, after simple calculations,

$\langle v, B^{s-1}v\rangle^2 \le \langle v, B^s v\rangle \, \langle v, B^{s-2}v\rangle,$

which provides part (b). ∎

3.1 EMR on Algorithm 1

We will need some further notations. Let

(3.4) $u(x^k) = u^k = A^T M r(x^k), \quad r(x^k) = b - A x^k,$

and

(3.5) $g(x, s) = \langle x^* - x, \; B^s(x^* - x)\rangle, \quad s = 0, 1, 2, \ldots.$
Definition 2

The strategy EMR is defined by

(3.6) $\lambda(x^k, s) = \lambda_k(s) = \arg\min_{\lambda} g(x^k + \lambda u^k, s).$

We first discuss specifically the (most important) cases $s = 0, 1, 2$. For $s = 0$ it holds that $g(x, 0) = \|x^* - x\|_2^2$, so in this case the line search is error-reducing. Now recall that $\bar{A}x^* = P_{R(\bar{A})}\bar{b}$, see, e.g., [3], and let $s = 1$; then

$g(x^{k+1}, 1) = \langle x^* - x^{k+1}, A^TMA(x^* - x^{k+1})\rangle = \|\bar{A}x^* - \bar{A}x^{k+1}\|_2^2 = \|P_{R(\bar{A})}\bar{b} - \bar{A}x^{k+1}\|_2^2.$

On the other hand, using that $I = P_{R(Q)} + P_{N(Q^T)}$ for any matrix $Q$, we have

$\|b - Ax^{k+1}\|_M^2 = \|\bar{b} - \bar{A}x^{k+1}\|_2^2 = \|P_{R(\bar{A})}\bar{b} - \bar{A}x^{k+1}\|_2^2 + \|P_{N(\bar{A}^T)}\bar{b}\|_2^2,$

which gives

$g(x^{k+1}, 1) = \|b - Ax^{k+1}\|_M^2 + \mathrm{const}.$

So in this case the EMR-strategy minimizes the $M$-weighted residual norm in each step. Further, we obtain

$g(x^{k+1}, 2) = \langle A^TMA(x^* - x^{k+1}), \, A^TMA(x^* - x^{k+1})\rangle = \|A^TM(b - Ax^{k+1})\|_2^2.$

So when s=2 the EMR-strategy minimizes in each step the norm of the residual for the M-weighted normal equations.

We next derive the relaxation parameters corresponding to using EMR on Algorithm 1. Using (3.5) and (2.2) we get

$g(x^{k+1}, s) = \langle x^* - x^{k+1}, \; B^s(x^* - x^{k+1})\rangle$
(3.7) $= g(x^k, s) - 2\lambda_k(s)\langle u^k, B^s(x^* - x^k)\rangle + \lambda_k^2(s)\langle u^k, B^s u^k\rangle.$

Since $B(x^* - x^k) = u^k$, and, for the case $s = 0$ assuming consistency, $\langle u^k, x^* - x^k\rangle = \langle r^k, M r^k\rangle$, one gets

(3.8) $\lambda(x^k, s) = \lambda_k(s) = \frac{\langle u^k, B^s(x^* - x^k)\rangle}{\langle u^k, B^s u^k\rangle} = \begin{cases} \dfrac{\langle r^k, M r^k\rangle}{\|u^k\|_2^2}, & s = 0,\\[8pt] \dfrac{\langle u^k, B^{s-1} u^k\rangle}{\langle u^k, B^s u^k\rangle}, & s \ge 1.\end{cases}$

We stress that for $s \ge 1$ consistency of (2.1) is not assumed in (3.8). It follows by Lemma 1 that the relaxation parameters are well defined when $u^k \ne 0$ (if $u^k = 0$, then $x^k$ is a solution). We next note the easy but important result that the relaxation parameters are bounded from below.

Lemma 3

Let $\lambda_k(s)$ be given by (3.8). Then

$\lambda_k(s) \ge \frac{1}{\rho(B)}.$

Proof.

For $s \ge 1$ this follows from the fact that $\langle x, Q^s x\rangle \le \rho(Q)\langle x, Q^{s-1} x\rangle$ for any symmetric positive semidefinite matrix $Q$. For $s = 0$ we have

$\|u^k\|^2 = \langle A^TM r^k, A^TM r^k\rangle = \langle (M^{1/2}AA^TM^{1/2})M^{1/2}r^k, \; M^{1/2}r^k\rangle,$

which yields

$\|u^k\|^2 \le \|M^{1/2}AA^TM^{1/2}\|\,\langle M^{1/2}r^k, M^{1/2}r^k\rangle = \rho(A^TMA)\,\langle r^k, M r^k\rangle.$

We end this subsection by remarking on the complexity of using $s = 0, 1, 2$ in Algorithm 1. By forming $q = Au$, $u = A^TMr$, and updating $r$ recursively, two matrix–vector multiplications per iteration are needed both for $s = 0$ and $s = 1$. By forming $Bu$ and updating the vectors $u$ and $r$ recursively, only two matrix–vector multiplications are also needed when $s = 2$ (hence there is a tradeoff between arithmetic operations and memory space).
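A sketch of Algorithm 1-EMR for $s = 0, 1, 2$, implementing (3.8) with the residual updated recursively as described above (dense matrices and the absence of a $u^k = 0$ safeguard are simplifications of this sketch, not part of the paper's scheme):

```python
import numpy as np

def algorithm1_emr(A, b, M, s, n_iter, x0=None):
    """Algorithm 1 with EMR relaxation (3.8), s in {0, 1, 2}."""
    x = np.zeros(A.shape[1]) if x0 is None else x0.copy()
    r = b - A @ x
    for _ in range(n_iter):
        u = A.T @ (M @ r)                        # u^k = A^T M r^k      (matvec 1)
        q = A @ u                                # A u^k                (matvec 2)
        if s == 0:
            lam = np.dot(r, M @ r) / np.dot(u, u)        # <r, Mr> / ||u||^2
        elif s == 1:
            lam = np.dot(u, u) / np.dot(q, M @ q)        # ||u||^2 / <u, Bu>
        else:
            Bu = A.T @ (M @ q)                           # naive extra matvec; the paper
            lam = np.dot(u, Bu) / np.dot(Bu, Bu)         # avoids it by also recurring u
        x = x + lam * u
        r = r - lam * q                          # recursive residual update
    return x
```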

3.2 EMR on Algorithm 2

Define

(3.9) $B_t = A_t^TM_tA_t, \quad u(z^k, t) = u^{k,t} = A_t^TM_t r^{k,t}, \quad r(z^k, t) = r^{k,t} = b_t - A_t z^k,$

where $t = 1, 2, \ldots, p$. Let $z^*$ satisfy

(3.10) $A_t^TM_tA_t z^* = A_t^TM_t b_t, \quad t = 1, 2, \ldots, p.$

Hence $z^*$ solves $\min_z \|A_t z - b_t\|_{M_t}$, $t = 1, 2, \ldots, p$.

Denote the solution sets of (2.1), (3.10) and (2.7) by X1, X2 and X3 respectively. By specifying the matrix M in (2.7) we obtain the following result.

Lemma 4

Let $M = \mathrm{diag}(M_t)$, $t = 1, 2, \ldots, p$. Then

$X_1 \subseteq X_2 \subseteq X_3.$

Proof.

Let $x \in X_1$. Then, since $Ax = b$ if and only if $A_t x = b_t$, $t = 1, 2, \ldots, p$, it follows that $x \in X_2$. The normal equations corresponding to (2.7) are $A^TMAx = A^TMb$. These can be written as $\sum_{t=1}^{p} A_t^TM_tA_t x = \sum_{t=1}^{p} A_t^TM_t b_t$. Hence, using (3.10), we conclude that $x \in X_2$ implies $x \in X_3$. ∎

It now follows

(3.11) $B_t(z^* - z^k) = A_t^TM_t(A_t z^* - A_t z^k) = A_t^TM_t(b_t - A_t z^k) = u^{k,t}.$

We next take, cf. (3.5),

(3.12) $g_t(z, s) = \langle z^* - z, \; B_t^s(z^* - z)\rangle, \quad t = 1, 2, \ldots, p,$

and apply EMR (3.6) using $g = g_t(z^{k,t}, s)$, where by Algorithm 2, $z^{k,t} = z^k + \lambda_t u^{k,t}$. Similarly as in (3.8), and for $s = 0$ assuming consistency of (2.1), we get the following expressions for the EMR-based relaxation parameters in Algorithm 2:

(3.13) $\lambda_{\mathrm{bsim}}(s) = \lambda_{k,t}(s) = \frac{\langle u^{k,t}, B_t^s(z^* - z^k)\rangle}{\langle u^{k,t}, B_t^s u^{k,t}\rangle} = \begin{cases} \dfrac{\langle r^{k,t}, M_t r^{k,t}\rangle}{\|u^{k,t}\|_2^2}, & s = 0,\\[8pt] \dfrac{\langle u^{k,t}, B_t^{s-1} u^{k,t}\rangle}{\langle u^{k,t}, B_t^s u^{k,t}\rangle}, & s \ge 1.\end{cases}$

To make (3.13) well defined we put $\lambda_{k,t}(s) = 1$ if $\langle u^{k,t}, B_t^s u^{k,t}\rangle = 0$. In this case $u^{k,t} = 0$ and we continue with step $t+1$ in the inner iteration loop of Algorithm 2. Also, one easily finds (cf. Lemma 3)

(3.14) $\lambda_{k,t}(s) \ge \frac{1}{\rho(B_t)}.$

3.3 EMR on Algorithm 3

Define

(3.15) $u(y^{k,t-1}, t) = u^{k,t} = A_t^TM_t r^{k,t}, \quad r(y^{k,t-1}, t) = r^{k,t} = b_t - A_t y^{k,t-1},$

where $t = 1, 2, \ldots, p$. Let $y^*$ be any solution of (3.10). We next take

(3.16) $g_t(y, s) = \langle y^* - y, \; B_t^s(y^* - y)\rangle, \quad t = 1, 2, \ldots, p,$

and apply EMR (3.6) using $g = g_t(y^{k,t}, s)$, where by Algorithm 3, $y^{k,t} = y^{k,t-1} + \lambda_t u^{k,t}$. We then obtain

(3.17) $\lambda_{\mathrm{bseq}}(s) = \lambda_{k,t}(s) = \frac{\langle u^{k,t}, B_t^s(y^* - y^{k,t-1})\rangle}{\langle u^{k,t}, B_t^s u^{k,t}\rangle} = \begin{cases} \dfrac{\langle r^{k,t}, M_t r^{k,t}\rangle}{\|u^{k,t}\|_2^2}, & s = 0,\\[8pt] \dfrac{\langle u^{k,t}, B_t^{s-1} u^{k,t}\rangle}{\langle u^{k,t}, B_t^s u^{k,t}\rangle}, & s \ge 1.\end{cases}$

To make (3.17) well defined we put $\lambda_{k,t}(s) = 1$ if $\langle u^{k,t}, B_t^s u^{k,t}\rangle = 0$. In this case $u^{k,t} = 0$ and we continue with step $t+1$ in the inner iteration loop of Algorithm 3. Note that although expressions (3.17) and (3.13) look identical they are not, since $r^{k,t}$ and $u^{k,t}$ have different definitions here than in Subsection 3.2. Again it holds that

(3.18) $\lambda_{k,t}(s) \ge \frac{1}{\rho(B_t)}.$

We finally stress that in formulas (3.13) and (3.17) it is assumed that $X_2 \ne \emptyset$ for $s \ge 1$, and that $X_1 \ne \emptyset$ for $s = 0$.
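A sketch of one cycle of Algorithm 3-EMR based on (3.15)–(3.17) (the simultaneous variant (3.13) differs only in that every block update starts from the same iterate $z^k$); the convention $\lambda_{k,t}(s) = 1$ is applied when the denominator vanishes.

```python
import numpy as np

def algorithm3_emr_cycle(A, b, blocks, Ms, s, y):
    """One cycle of Algorithm 3-EMR: sweep (2.4) with the relaxation rule (3.17)."""
    for t, idx in enumerate(blocks):
        At, bt, Mt = A[idx, :], b[idx], Ms[t]
        r_t = bt - At @ y                         # r^{k,t}
        u_t = At.T @ (Mt @ r_t)                   # u^{k,t}
        q_t = At @ u_t
        if s == 0:
            num, den = np.dot(r_t, Mt @ r_t), np.dot(u_t, u_t)
        elif s == 1:
            num, den = np.dot(u_t, u_t), np.dot(q_t, Mt @ q_t)       # <u,u>, <u,B_t u>
        else:
            Bu = At.T @ (Mt @ q_t)
            num, den = np.dot(u_t, Bu), np.dot(Bu, Bu)               # <u,B_t u>, <u,B_t^2 u>
        lam = num / den if den > 0 else 1.0       # lambda_{k,t}(s) = 1 when <u, B_t^s u> = 0
        y = y + lam * u_t
    return y
```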

4 A common iteration framework

Here we will rewrite all three algorithms in the form

(4.1) $x^{k+1} = T(x^k), \quad T : \mathbb{R}^n \to \mathbb{R}^n.$

Then we will discuss convergence in terms of properties of the operator T of such schemes. This will lead to the convergence results derived in the next section.

Now let

(4.2) $T(x) = x + \lambda(x, s)\, A^T M r(x).$

Then Algorithm 1 becomes an instance of (4.1). We call this Algorithm 1-EMR.

Let

(4.3) $\psi_t : \mathbb{R}^n \to \mathbb{R}^n, \quad \psi_t(z, s) = z + \lambda_t(z, s)\, u(z, t), \quad t = 1, \ldots, p,$

and define

(4.4) $\Psi : \mathbb{R}^n \to \mathbb{R}^n, \quad \Psi(z, s) = \frac{1}{p}\sum_{t=1}^{p} \psi_t(z, s).$

Then, putting T(z)=Ψ(z,s), Algorithm 2 becomes an instance of (4.1), which we call Algorithm 2-EMR.

Finally, let

(4.5)

$\varphi_t : \mathbb{R}^n \to \mathbb{R}^n, \quad \varphi_t(y, s) = y + \lambda_t(y, s)\, u(y, t), \quad t = 1, \ldots, p,$
$\Phi : \mathbb{R}^n \to \mathbb{R}^n, \quad \Phi = \varphi_p \circ \cdots \circ \varphi_1.$

Then with

(4.6)T(y)=Φ(y,s)

we retrieve Algorithm 3 which we call Algorithm 3-EMR.

We next discuss convergence of the iteration (4.1). To this end we recall the following well-known definitions.

Definition 1

(a) The fixed point set of $T$ is defined by $\mathrm{fix}(T) = \{q : q = T(q)\}$.

(b) Assume $\|T(x) - T(y)\| \le \alpha\|x - y\|$. If $\alpha = 1$, the operator $T$ is called nonexpansive (ne), and if $\alpha < 1$, it is called a contraction.

Classical convergence results often rely on these two properties. For example, the Banach fixed point theorem states that if $T$ is a contraction, it admits a unique fixed point, and the iteration (4.1) converges towards this fixed point. However, the following $2\times 2$ example shows that the operator $T$ given in (4.2) is not even ne.

Example

Let $a_1 = (2, 1)$, $a_2 = (1, 2)$ ($a_j$ is the $j$th row of $A$), $b = (1, 8)^T$, $x = (4, 3)^T$, $y = (1, 2)^T$. Put

$q = \frac{\|Tx - Ty\|}{\|x - y\|}.$

Then by direct calculation one finds $q = 1.55$ ($s = 0$) and $q = 1.58$ ($s = 1$). An explanation for this fact could be that using EMR the classical upper bound $2/\rho$ for the relaxation parameter $\lambda$ (cf. Theorem 1) can be exceeded. For example, here $2/\rho = 0.22$, whereas $\lambda(y, 0) = 1.00$.
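The numbers in this example are easy to reproduce; the check below assumes $M = I$ (consistent with $2/\rho = 0.22$ for $\rho(A^TA) = 9$ here).

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 8.0])
B = A.T @ A                                         # B = A^T M A with M = I

def T(x, s):
    r = b - A @ x
    u = A.T @ r
    lam = (r @ r) / (u @ u) if s == 0 else (u @ u) / (u @ (B @ u))
    return x + lam * u

x, y = np.array([4.0, 3.0]), np.array([1.0, 2.0])
for s in (0, 1):
    q = np.linalg.norm(T(x, s) - T(y, s)) / np.linalg.norm(x - y)
    print(s, round(q, 2))    # prints 1.55 for s=0 and 1.58 for s=1, so T is not ne
```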

To handle this we will consider a more general class of operators.

Definition 2

A continuous operator $T : \mathbb{R}^n \to \mathbb{R}^n$ is called a paracontraction (pc) if for any $q \in \mathrm{fix}(T)$ and for any $x \in \mathbb{R}^n$, either

(4.7) $\|q - T(x)\| < \|q - x\|$

or $x \in \mathrm{fix}(T)$.

This definition originates from [20]. When $T$ is linear this coincides with the definition given in [33]. In the recent monograph [5, pp. 47–48], an operator which has fixed points and satisfies (4.7) is called strictly quasi-nonexpansive (SQNE). In [1, Definition 2.1] an operator which is both SQNE and ne is called an attracting operator. A related concept is strict nonexpansivity [31]. Such an operator is always pc, while the inverse implication does not hold [20, Example 4]. In all these references convergence results are given for such and similar operators. We will show in the next section that the operators in our algorithms all are pc. Hence the following result, given in [20, Theorem 1], will become useful.

Theorem 3

Let $\{T_j\}_{j=1}^{p}$ be a family of paracontracting operators with respect to some norm in $\mathbb{R}^n$. Let $\{j_k\}_{k=1}^{\infty}$ with $1 \le j_k \le p$ be admissible, i.e., for any $1 \le i \le p$ there are infinitely many integers $k$ such that $j_k = i$, and let $x^0 \in \mathbb{R}^n$ be given. Then the sequence $\{x^k\}$ defined by $x^k = T_{j_k}(x^{k-1})$, $k = 1, 2, \ldots$, converges if and only if the operators $\{T_j\}_{j=1}^{p}$ have a common fixed point. Moreover, in this case the limit $\lim_{k\to\infty} x^k = q$ is one such common fixed point, namely $T_j(q) = q$, $j = 1, \ldots, p$.

5 Convergence results

Here we will first show that the operator T given in (4.2) is a pc. Hence we can rely on Theorem 3 for the convergence of the iterates of Algorithm 1. We will further use the facts that an average and product of a finite family of paracontracting operators remain paracontracting to conclude that also the iterates of Algorithms 2 and 3 converge.

5.1 Algorithm 1-EMR

We rewrite (3.7) as

(5.1) $g(T(x), s) = g(x, s) - \beta(s),$

where (also using (3.8))

(5.2) $\beta(s) = \frac{\langle u(x), B^s(x^* - x)\rangle^2}{\langle u(x), B^s u(x)\rangle}.$
Lemma 1

Let $T$ be defined by (4.2). Then the set $\mathrm{fix}(T)$ equals the solution set of (2.1) when $s = 0$ and of (2.7) for $s \ge 1$.

Proof.

The vector $x$ is a solution of (2.7) if and only if $A^TM(b - Ax) = 0$. Further, $\mathrm{fix}(T) = \{x : A^TM(b - Ax) = 0\}$ (using Lemma 3). If $b \in R(A)$, then the solution sets of (2.1) and (2.7) coincide. ∎

Lemma 2

Let $T$ be defined by (4.2), and assume $s \ge 1$. Then the operator $T$ is a pc.

Proof.

We have

(5.3) $\|x^* - T(x)\|^2 = \langle x^* - T(x), x^* - T(x)\rangle = \langle x^* - x - \lambda(s)u(x), \; x^* - x - \lambda(s)u(x)\rangle = \|x^* - x\|^2 + \lambda^2(s)\|u(x)\|^2 - 2\lambda(s)\langle x^* - x, u(x)\rangle = \|x^* - x\|^2 + \lambda(s)R(s),$

where

(5.4) $R(s) = \lambda(s)\|u(x)\|^2 - 2\langle x^* - x, u(x)\rangle.$

We show that $R(s) \le 0$ for $s \ge 1$, and start by noting that (since $x^*$ solves (2.7), $A^TM(Ax^* - b) = 0$)

$\langle Ax^* - b, M(b - Ax)\rangle = \langle Ax^* - b, Mb\rangle - \langle Ax^* - b, MAx\rangle$
$= \langle Ax^* - b, Mb\rangle - \langle A^TM(Ax^* - b), x\rangle$
$= \langle Ax^* - b, Mb + MAx^* - MAx^*\rangle$
$= \langle Ax^* - b, M(b - Ax^*)\rangle + \langle Ax^* - b, MAx^*\rangle$
$= -\|b - Ax^*\|_M^2 + \langle A^TM(Ax^* - b), x^*\rangle$
(5.5) $= -\|b - Ax^*\|_M^2.$

We now use (5.5) to rewrite (5.4) as follows:

$R(s) = \lambda(s)\|u(x)\|^2 - 2\langle Ax^* - b + b - Ax, \; M(b - Ax)\rangle$
$= \lambda(s)\|u(x)\|^2 - 2\|b - Ax\|_M^2 - 2\langle Ax^* - b, M(b - Ax)\rangle$
(5.6) $= \lambda(s)\|u(x)\|^2 - 2\|r(x)\|_M^2 + 2\|b - Ax^*\|_M^2.$

Now, since $x^*$ is a minimizer of $\|b - Ax\|_M$, we get

$\|b - Ax^*\|_M^2 \le \|b - AT(x)\|_M^2$
$= \langle b - Ax - \lambda(s)Au(x), \; M(b - Ax - \lambda(s)Au(x))\rangle$
$= \|r(x)\|_M^2 - 2\lambda(s)\langle r(x), MAu(x)\rangle + \lambda^2(s)\|Au(x)\|_M^2$
$= \|r(x)\|_M^2 - 2\lambda(s)\|u(x)\|^2 + \lambda^2(s)\langle u(x), Bu(x)\rangle$
(5.7) $= \|r(x)\|_M^2 - \Gamma(s).$

Here

(5.8) $\Gamma(s) = \lambda(s)\left(2\|u(x)\|^2 - \lambda(s)\langle u(x), Bu(x)\rangle\right).$

Combining (5.6) and (5.7) yields

(5.9) $R(s) \le \lambda(s)\|u(x)\|^2 - 2\Gamma(s).$

Using Lemma 1 we have

$\Gamma(s) = \lambda(s)\left(2\|u(x)\|^2 - \lambda(s)\langle u(x), Bu(x)\rangle\right)$
$\ge \lambda(s)\left(2\|u(x)\|^2 - \frac{\|u(x)\|^2}{\langle u(x), Bu(x)\rangle}\langle u(x), Bu(x)\rangle\right)$
(5.10) $= \lambda(s)\|u(x)\|^2, \quad s \ge 2.$

Further, using (3.8), $\Gamma(1) = \lambda(1)\|u(x)\|^2$. Hence by (5.9) and (5.10) we get

$R(s) \le -\lambda(s)\|u(x)\|^2 \le 0.$

Also, using Lemma 3, $R(s) = 0$ if and only if $u(x) = 0$, i.e., when $x \in \mathrm{fix}(T)$. Therefore $T$ fulfills (4.7). Next we show that $T$ is continuous. If $x \notin \mathrm{fix}(T)$, then it is easy to see that $T$ is continuous at $x$. Now let $\{x^j\}_{j=1}^{\infty}$ be a sequence that converges to $x^* \in \mathrm{fix}(T)$. By (5.3), and using that $R(s) \le 0$, it follows that $\|x^* - \lim T(x^j)\| = 0$, so that $\lim T(x^j) = x^* = T(x^*)$. The lemma is now proved. ∎

Lemma 3

Let $T$ be defined by (4.2), and assume $s = 0$ and that $b \in R(A)$. Then the operator $T$ is a pc.

Proof.

Using (3.5) and (5.1) it follows that

(5.11) $\|x^* - T(x)\|^2 = \langle x^* - T(x), x^* - T(x)\rangle = g(T(x), 0) = g(x, 0) - \beta(0) = \|x^* - x\|_2^2 - \beta(0).$

Now

$\beta(0) = \frac{\langle u(x), x^* - x\rangle^2}{\|u(x)\|^2} \ge 0,$

and $\beta(0) = 0$ if and only if $x \in \mathrm{fix}(T)$, i.e., if and only if $x$ solves (2.1). Thus $T$ satisfies (4.7). It is obvious that $T$ is continuous on $\mathbb{R}^n \setminus \mathrm{fix}(T)$, and its continuity at $x^* \in \mathrm{fix}(T)$ can be shown using (5.11). ∎

The convergence results for Algorithm 1 can now be formulated.

Theorem 4

Let $s = 0$ and $b \in R(A)$. Then the sequence generated by Algorithm 1 using EMR converges to a solution of (2.1). Let $s \ge 1$. Then the sequence generated by Algorithm 1 using EMR converges to a weighted least squares solution, i.e., a solution of (2.7). If in addition $x^0 \in R(A^T)$, then $x^k$ converges to the solution with minimal Euclidean norm.

Proof.

Follows from Theorem 3, Lemmas 3, 2 and 1. ∎

Remark 5

If $\lambda(s)$ is replaced by $\mu(s) = \alpha\lambda(s)$ in Algorithm 1, one finds (cf. (5.10)) $\Gamma(s) \ge \mu(s)\|u(x)\|^2(2 - \alpha)$ when $s \ge 2$, and $\Gamma(1) = \mu(1)\|u(x)\|^2(2 - \alpha)$. One obtains (cf. (5.9))

$R(s) \le \mu(s)\|u(x)\|^2 - 2\Gamma(s) \le \mu(s)\|u(x)\|^2(2\alpha - 3).$

Hence convergence is maintained for $\mu(s) = \alpha\lambda(s)$ when $s \ge 1$ and $\alpha \in [\epsilon, \tfrac{3}{2}]$ with $0 < \epsilon < \tfrac{3}{2}$. For the case $s = 0$, using (5.11), $\beta(0)$ is replaced by $(2 - \alpha)\,\mu(0)\,\langle u(x), x^* - x\rangle$, which preserves the convergence result when $\alpha \in [\epsilon, 2]$ with $0 < \epsilon < 2$.

5.2 The two block-algorithms using EMR

Here we shall use the following result.

Theorem 6

Let $T_t : \mathbb{R}^n \to \mathbb{R}^n$, $t = 1, \ldots, p$, be a finite family of paracontracting operators having a common fixed point. Then $S_1 = \frac{1}{p}\sum_{t=1}^{p} T_t$ and $S_2 = T_p T_{p-1}\cdots T_1$ are both pc, and further $\mathrm{fix}(S_1) = \mathrm{fix}(S_2) = \bigcap_{t=1}^{p}\mathrm{fix}(T_t)$.

This was shown in [1, Propositions 2.10, 2.12] for attracting operators but is true also for paracontracting operators, as observed in [5, pp. 49ff.].

Now arguing as in Lemmas 3 and 2 we find that ψt,φt, t=1,2,,p, all are pc. Hence using Theorem 6 both operators Ψ,Φ are pc. The following result follows.

Theorem 7

For $s = 0$ assume that the set $X_1 \ne \emptyset$, and for $s \ge 1$ assume that the set $X_2 \ne \emptyset$. Then the whole sequences generated by Algorithm 2-EMR and Algorithm 3-EMR converge to a member of $X_1$ ($s = 0$) and $X_2$ ($s \ge 1$), respectively.

Remark 8

Similarly to Remark 5, we may replace $\lambda_{k,t}(s)$ by $\mu_{k,t}(s) = \alpha\lambda_{k,t}(s)$, with $\alpha \in [\epsilon, \tfrac{3}{2}]$ when $s \ge 1$ and $\alpha \in [\epsilon, 2]$ when $s = 0$, in both block-algorithms without ruining convergence.

6 Constraints

The use of a priori information (like nonnegativity) when solving an inverse problem is a well-known technique to improve the quality of the reconstruction. An advantage with projection type algorithms (which includes, e.g., Cimmino and Kaczmarz iterations) is the possibility to adapt them to handle convex constraints. Then usually the iterates can be shown to converge towards a member of the underlying convex feasibility problem [1, 5]. The use of projection type algorithms for treating constraints in tomographic imaging and other applications is also considered in [23, 29, 38, 39].

We now show how the EMR-algorithms can be modified to handle constraints. Let $S$ be a closed convex subset of $\mathbb{R}^n$ and let $P_S$ denote the orthogonal projection onto $S$. We will consider the three iterations

(6.1) $x^{k+1} = P_S T(x^k),$
(6.2) $z^{k+1} = P_S \Psi(z^k),$
(6.3) $y^{k+1} = P_S \Phi(y^k).$

Here the operators T,Ψ,Φ (defined previously) are related to Algorithms 1, 2, 3 respectively.
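A sketch of this constrained scheme, taking $S$ as the nonnegative orthant (the constraint used in the experiments of Section 7); `step` stands for any of the operators $T$, $\Psi$, $\Phi$, e.g., one of the EMR updates sketched earlier.

```python
import numpy as np

def project_nonneg(x):
    """Orthogonal projection onto S = {x : x >= 0}."""
    return np.maximum(x, 0.0)

def constrained_iterate(step, project, x0, n_iter):
    """Constrained scheme (6.1)-(6.3): x^{k+1} = P_S(step(x^k))."""
    x = x0.copy()
    for _ in range(n_iter):
        x = project(step(x))
    return x
```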

Proposition 1

Let $s = 0$. Assume that $Z_1 = X_1 \cap S \ne \emptyset$. Then the iterates of algorithms (6.1), (6.2), (6.3) all converge towards a member of $Z_1$.

Proof.

By [1, Corollary 2.5], the operator $P_S$ is a pc. Using [1, Proposition 2.10], all the mappings $P_S T$, $P_S\Phi$, $P_S\Psi$ are pc. Since $\mathrm{fix}(P_S)\cap\mathrm{fix}(T) = \mathrm{fix}(P_S)\cap\mathrm{fix}(\Phi) = \mathrm{fix}(P_S)\cap\mathrm{fix}(\Psi) = S\cap X_1 = Z_1$, the result follows by Theorem 3. ∎

Similarly it holds:

Proposition 2

Let $s \ge 1$, and assume that $Z_2 = X_2 \cap S \ne \emptyset$. Then the iterates of (6.2) and (6.3) all converge towards a member of $Z_2$.

Proposition 3

Let $s \ge 1$, and assume that $Z_3 = X_3 \cap S \ne \emptyset$. Then the iterates of (6.1) converge towards a member of $Z_3$.

7 Numerical experiments

We will report from tests using examples from the field of image reconstruction from projections.

7.1 Small test example

Table 1

Small test example: Smallest relative error and $k_{\min}$ with noiseless and noisy data using fully simultaneous iteration (Algorithm 1) with EMR and OPT, without (uc) and with (c) constraints. Top: the case $m < n$, bottom: the case $m > n$.

Algorithm | noiseless, uc | noisy, uc | noiseless, c | noisy, c
m < n:
s=0 | 0.20, 19 | 0.27, 9 | 0.12, 18 | 0.20, 10
s=1 | 0.20, 20 | 0.27, 10 | 0.12, 20 | 0.20, 14
s=2 | 0.20, 20 | 0.27, 10 | 0.14, 20 | 0.20, 18
OPT | 0.20, 20 | 0.27, 12 | 0.12, 20 | 0.19, 18
CRAIG (s=0) | 0.21, 6 | 0.35, 3 | x, x | x, x
CGNE (s=1) | 0.20, 6 | 0.24, 4 | x, x | x, x
m > n:
s=0 | 0.10, 20 | 0.64, 13 | 0.07, 19 | 0.26, 10
s=1 | 0.10, 20 | 0.29, 13 | 0.07, 20 | 0.22, 16
s=2 | 0.11, 20 | 0.29, 14 | 0.09, 20 | 0.22, 19
OPT | 0.11, 20 | 0.36, 20 | 0.09, 20 | 0.22, 20
CRAIG (s=0) | 0.13, 6 | 0.75, 2 | x, x | x, x
CGNE (s=1) | 0.10, 8 | 0.30, 6 | x, x | x, x

Here we use the SNARK09 software package [11]. We work with the standard head phantom from [29]. The phantom is discretized into $63\times 63$ pixels, and sixteen projections (evenly distributed between 0 and 174 degrees) with 99 rays per projection are used. The resulting matrix $A$ has dimension $1584\times 3969$, so that the system of equations is highly underdetermined. Although in our application iterative methods are usually more competitive for underdetermined systems, cf. [22, p. 153 and p. 227], this might not be the case in other applications. Therefore we also consider taking more rays per projection, leading to a matrix of dimension $5430\times 3969$. In addition to $A$, the software also produces a noise-free right-hand side $b_{\mathrm{snark}}$ and a phantom (translated into vector form) $x^*$. Apart from using noise-free data we also added independent additive Gaussian noise of mean 0 and relative noise level $\|\delta b\|/\|b_{\mathrm{snark}}\|$ of 5%, where $b_{\mathrm{noisy}} = b_{\mathrm{snark}} + \delta b$. For the choices of $M$, $\{M_t\}$ we always use Cimmino's method with equal weights.

Table 2

Small test example: Smallest relative error, kmin, and spread with noiseless and noisy data using Algorithm 2 with EMR and OPT without (uc) and with constraints (c). Top: the case m<n, bottom: the case m>n.

Algorithm | noiseless, uc | noisy, uc | noiseless, c | noisy, c
m < n:
s=0 | 0.21, 20 | 0.27, 17 | 0.18, 20 | 0.21, 20
spread | 0.41 | 0.36 | 0.36 | 0.35
s=1 | 0.21, 20 | 0.27, 18 | 0.18, 20 | 0.21, 20
spread | 0.41 | 0.36 | 0.36 | 0.35
s=2 | 0.21, 20 | 0.27, 19 | 0.18, 20 | 0.21, 20
spread | 0.41 | 0.36 | 0.36 | 0.35
OPT | 0.20, 20 | 0.27, 12 | 0.13, 20 | 0.19, 18
m > n:
s=0 | 0.11, 20 | 0.36, 20 | 0.10, 20 | 0.32, 20
spread | 0.31 | 0.96 | 0.32 | 0.79
s=1 | 0.13, 20 | 0.31, 20 | 0.13, 20 | 0.29, 20
spread | 0.29 | 0.12 | 0.29 | 0.16
s=2 | 0.14, 20 | 0.31, 20 | 0.14, 20 | 0.29, 20
spread | 0.28 | 0.13 | 0.28 | 0.16
OPT | 0.11, 20 | 0.29, 20 | 0.10, 20 | 0.22, 20
Table 3

Small test example: Smallest relative error, kmin, and spread with noiseless and noisy data using Algorithm 3 with EMR and OPT without (uc) and with constraints (c).

Algorithm | noiseless, uc | noisy, uc | noiseless, c | noisy, c
m < n:
s=0 | 0.21, 7 | 0.37, 3 | 0.13, 4 | 0.27, 2
spread | 0.004 | 0.005 | 0.001 | 0.004
s=1 | 0.21, 6 | 0.35, 3 | 0.13, 4 | 0.26, 3
spread | 0.001 | 0.010 | 0.001 | 0.004
s=2 | 0.21, 6 | 0.35, 3 | 0.13, 4 | 0.25, 3
spread | 0.001 | 0.014 | 0.001 | 0.006
OPT | 0.20, 20 | 0.27, 18 | 0.13, 20 | 0.20, 20
OPT/ART | 0.20, 20 | 0.27, 20 | 0.12, 8 | 0.20, 4
s=1, $(0.05+\|u^{k,t}\|)\lambda$ | 0.21, 20 | 0.27, 16 | 0.18, 20 | 0.21, 20
s=1, $(0.1+\|u^{k,t}\|)\lambda$ | 0.20, 20 | 0.27, 9 | 0.14, 20 | 0.20, 18
m > n:
s=0 | 0.15, 5 | 0.70, 3 | 0.10, 4 | 0.50, 2
spread | 0.07 | 0.83 | 0.05 | 0.07
s=1 | 0.12, 5 | 0.41, 2 | 0.09, 4 | 0.27, 2
spread | 0.07 | 0.20 | 0.05 | 0.16
s=2 | 0.12, 5 | 0.37, 2 | 0.09, 4 | 0.23, 3
spread | 0.08 | 0.21 | 0.06 | 0.19
OPT | 0.09, 19 | 0.21, 4 | 0.07, 20 | 0.15, 6
OPT/ART | 0.09, 20 | 0.24, 2 | 0.07, 20 | 0.15, 5
s=1, $(0.25+\|u^{k,t}\|)\lambda$ | 0.10, 8 | 0.29, 4 | 0.08, 10 | 0.20, 6
s=1, $(0.1+\|u^{k,t}\|)\lambda$ | 0.09, 20 | 0.38, 6 | 0.08, 20 | 0.29, 14
Figure 1: Small test example: Relative error histories in simultaneous iteration using noisy data. Top: the case $m < n$, bottom: the case $m > n$.

Figure 2: Small test example: Relative error histories in simultaneous block-iteration using noisy data. Top: the case $m < n$, bottom: the case $m > n$.

In Tables 1–3, one for each algorithm, we list the minimal relative error, $e_{\min} = \min_{k\le 20}\|x^* - x^k\|/\|x^*\|$, and the corresponding iteration (cycle) number $k_{\min}$ when using EMR with $s = 0, 1, 2$ and OPT, respectively. Here 'OPT' refers to the strategy where a constant value of the relaxation parameter is used, chosen such that it gives rise to the smallest relative error within twenty iterations. The training was done using the phantom itself, cf. [8], [36] (for a proton computed tomography application), and [42] (for use in electron microscopic imaging). Apart from the unconstrained case (uc) we also compare with the constrained versions (denoted by (c)) with $S$ the nonnegative orthant, which is a natural constraint in this application. The algorithms are exposed both to noiseless and noisy data. We also include, using noisy data, figures showing the iteration histories (Figures 1–3). The top and bottom parts of the tables and figures display the cases $m < n$ and $m > n$, respectively. For comparison (when using Algorithm 1) we also tested two Krylov type methods: Craig (corresponding to the case $s = 0$) and CGNE ($s = 1$) [21, 27]. Both were scaled by $M^{1/2}$, i.e., using $\bar{A}$, $\bar{b}$ instead of $A$, $b$. For block-iteration the experiments were organized as follows. Let $n_b$ be the number of blocks used. In the tests $n_b = 16, 32, 43, 86, 172, 344$ were used. The partitionings were taken sequentially along the rows (e.g., when $n_b = 16$ each projection is taken as a block). We always display the results using the value of $n_b$ which resulted in the smallest error (during the first twenty cycles). Further, the spread is defined as the maximum difference after twenty cycles in relative error using the different values of $n_b$. We remark that using EMR on Kaczmarz's method (or ART) would result in (as is easily shown) $\lambda_{k,t}(s) = 1$ for all $s$. This is however not a good choice for ART, see [8, 29]. To improve the behavior of EMR on Algorithm 3 for noisy data we also tested, cf. Remark 8, the choice $\alpha = \epsilon + \|u^{k,t}\|$. Here we used training (from 0.05 to 0.45 in steps of 0.05) to find a good value of $\epsilon$.
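For completeness, the reported quantities $e_{\min}$ and $k_{\min}$ amount to the following bookkeeping; `one_cycle` is a stand-in for a cycle of whichever algorithm is being tested, and the 20-cycle budget mirrors the description above.

```python
import numpy as np

def relative_error_history(one_cycle, x_star, x0, n_cycles=20):
    """Relative errors ||x* - x^k|| / ||x*|| for k = 1, ..., n_cycles."""
    x, errs = x0.copy(), []
    for _ in range(n_cycles):
        x = one_cycle(x)
        errs.append(np.linalg.norm(x_star - x) / np.linalg.norm(x_star))
    return np.array(errs)

# e_min and k_min as listed in Tables 1-3:
# errs = relative_error_history(one_cycle, x_star, np.zeros(n))
# e_min, k_min = errs.min(), int(errs.argmin()) + 1
```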

Figure 3: Small test example: Relative error histories in sequential block-iteration using noisy data. Top: the case $m < n$, bottom: the case $m > n$.

7.2 Large test example

In our second example we used the (Matlab-based) package AIR Tools [28] to produce the phantom, the matrix and the right-hand side (with and without noise). We again used 5% white Gaussian noise. The phantom is now discretized into $365\times 365$ pixels. We take 88 projections (evenly distributed between 0 and 179 degrees) with 516 rays per projection. The resulting projection matrix $A$ has dimension $40892\times 133225$, so that again the system of equations is underdetermined. The outcome of the experiments is seen in Figure 4. Here we used the values of $\lambda_{\mathrm{opt}}$, $\epsilon$ and $n_b$ obtained by training on the big phantom itself. To test the influence of the training sample we considered in Figure 5 the results using three different training strategies: the big phantom itself, a $63\times 63$ version of the big phantom, and a heuristic. This last approach combines training using the small sample with information on the interval of convergence ('int') for the big phantom. Finding 'int' requires an estimate of the largest singular value of (the large matrix) $A$. Then we map, in proportion, the OPT-value obtained using the small sample onto 'int'.

7.3 Discussion of numerical results

Figure 4: Relative error histories in simultaneous iteration and sequential block-iteration using the big phantom.

We find that for noisy data the two methods whose theories rely on consistency (Craig and EMR with $s = 0$) behave less well (also noted for EMR in [10, pp. 497, 504] and in [17]). For both noise-free and noisy data there are only small differences between $s = 1$ and $s = 2$ in all three algorithms. The fully simultaneous algorithm seems more robust against noise than CGNE. Indeed, the error curves indicate that it is less crucial to identify the optimal number of iterations when using Algorithm 1 than when using CGNE. For noise-free data we also find that EMR performs on a par with the 'OPT' strategy, and hence is, in this application, a viable alternative to using a training procedure for picking the optimal relaxation parameter. For noisy data, however, sequential block-iteration does not behave as well using EMR as compared to OPT. To deal with this case we proposed a modification (which does not ruin theoretical convergence) of EMR. With this modification we find that EMR performs quite well also for sequential iteration. We also remark that the dependence of the convergence behavior on the block size was much larger for simultaneous than for sequential block-iteration. It is further seen that the choice of training samples may be quite critical for the behavior of the OPT alternative. We may conclude (cf. Figure 5) that if a good training sample is at hand, the OPT-strategy works well. In situations where the training sample is missing (or only a rough estimate is available), the EMR-strategy could be a viable option.

Remark 1

It is known [3] that for CGNE both $e_1 = \|x^* - x^{k+1}\|_2$ and $e_2 = \|b - Ax^{k+1}\|_M$ decrease monotonically (but not necessarily $e_3 = \|A^TM(b - Ax^{k+1})\|_2$). For Algorithm 1 with $s = 1$ ($s = 2$), $e_2$ ($e_3$) decreases monotonically by construction. These properties may be important when using stopping criteria. Note, however, that the monotonicity properties rely on the assumption that $A^TM(b - Ax^*) = 0$, which may fail to be fulfilled due to noise and/or modelling errors. We will however not discuss stopping criteria in this paper.

8 Numerical experiments with added penalty term

The use of the least squares functional (2.7) as the basis for the iteration can have the side-effect of blurring the reconstruction results. A popular way to overcome this possible smoothing effect is to add a penalty term to (2.7). We remark however that this may in turn introduce new artifacts. In general the choice of penalty term, if any, depends on the intended application area and the character of images arising in that area.

Figure 5: Influence on the rate of convergence using three different training-sample strategies.

We will here demonstrate the performance of EMR with the TV (total variation) penalty term using so-called superiorization. The general idea is to combine optimization (minimizing the penalty) with seeking feasibility (satisfying the linear system). This technique was originally proposed in [4], and later developed and used in several papers, see, e.g., [6]. Let $\{\alpha_k\}$ and $\beta_0 \ge 0$ be given numbers, and $x^0$ a given starting vector. Consider the following superiorization algorithm, which is similar (but not identical) to the one in [35], for computing $x^{k+1}, \beta_{k+1}$ given $x^k, \beta_k$.

Superiorization Algorithm

Let $\{\alpha_k\}$ and $\beta_0 \ge 0$ be given numbers, and $x^0$ a given starting vector.

  1. $y = T(x^k)$

  2. if $\phi(y) > \beta_k$ then $z = y - \dfrac{\phi(y) - \beta_k}{\|y'\|^2}\, y'$, where $y' \in \partial\phi(y)$, and $\beta_{k+1} = \beta_k + \alpha_k$, where $\alpha_k \ge \alpha_0 > 0$

  3. else $z = y$, $\beta_{k+1} = \beta_k$

  4. end if

  5. $x^{k+1} = P_S z$

where $T$ is some algorithmic operator, $P_S$ is the orthogonal projector onto a closed convex set $S$, $\phi$ denotes a convex function defined on $\mathbb{R}^n$, and $\partial\phi(x)$ is its subdifferential (set of subgradients) at $x$.
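A minimal sketch of this loop, with the basic operator $T$, the penalty $\phi$ and a subgradient oracle supplied as functions (the zero-subgradient guard is an added safeguard of this sketch, not part of the algorithm as stated):

```python
import numpy as np

def superiorize(T, phi, subgrad, project, x0, alphas, beta0=0.0, n_iter=20):
    """Superiorization: steer the iterates of T towards smaller values of phi."""
    x, beta = x0.copy(), beta0
    for k in range(n_iter):
        y = T(x)
        if phi(y) > beta:                     # step 2
            yp = subgrad(y)                   # y' in the subdifferential of phi at y
            if np.dot(yp, yp) > 0:
                y = y - (phi(y) - beta) / np.dot(yp, yp) * yp
            beta = beta + alphas[k]           # beta_{k+1} = beta_k + alpha_k
        x = project(y)                        # step 5: x^{k+1} = P_S z
    return x
```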

Proposition 1

Assume that $S$ is a bounded set. Then the Superiorization Algorithm, after a finite number of steps, reduces to (cf. (6.1)) $x^{k+1} = P_S T(x^k)$.

Proof.

Since $x^{k+1}\,(= P_S z)$ is bounded for all $k$, it follows that $y = T(x^k)$ remains bounded, and hence so does $\phi(y)$. Since $\beta_k$ increases by at least $\alpha_0 > 0$ each time step 2 is executed, step 2 can occur only a finite number of times (since otherwise $\beta_k \to \infty$, contradicting $\phi(y) > \beta_k$). ∎

In our numerical tests we use $\phi = \mathrm{TV}(p)$, where for an image $p$ with $K\times L$ pixels the total variation is defined as

$\mathrm{TV}(p) = \sum_{k=1}^{K-1}\sum_{l=1}^{L-1}\sqrt{(p_{k+1,l} - p_{k,l})^2 + (p_{k,l+1} - p_{k,l})^2},$

where $p_{k,l}$ denotes the pixel in the $k$th row and $l$th column of $p$. We will consider two problems. The first is again a tomography test problem, representing a smooth odf image of a porous material, taken from the AIR Tools software package; the second is based on the Lena image. Both noise-free and noisy data (5%) are used. In both cases we chose the convex set $S$ as $S = [0,1]^n \subset \mathbb{R}^n$. Further, we use $s = 1$ and $\beta_0 = 0$, $x^0 = 0$ in the tests.
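The TV value itself is a short computation; a NumPy sketch of the definition above, operating on the image as a $K\times L$ array:

```python
import numpy as np

def total_variation(p):
    """Isotropic TV of a K x L image p, following the definition above."""
    dx = p[1:, :-1] - p[:-1, :-1]     # p_{k+1,l} - p_{k,l}
    dy = p[:-1, 1:] - p[:-1, :-1]     # p_{k,l+1} - p_{k,l}
    return float(np.sum(np.sqrt(dx**2 + dy**2)))
```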

In the first test the original image is discretized into $365\times 365$ pixels, and 36 projections (distributed between 0 and 175 degrees) with 138 rays per projection are used. The resulting matrix $A$ has dimension $6588\times 133225$. The second test is based on the Lena image, which has $512\times 512$ pixels. First, the image is reshaped as a vector of dimension 262144, and a sparse random matrix of dimension $90000\times 262144$ was produced using Matlab. Finally, the right-hand side of the linear system of equations was taken as the product of the matrix and the reshaped vector (simulating a nonstationary blurring of the original image). We consider two strategies for picking $\{\alpha_k\}$. The first is obtained by trial and error (super. I). The actual values used were $\alpha_k = 0.1$ for the odf image and $\alpha_k = 40$ for the Lena image. In the second strategy (super. II) the last row in step 2 is replaced by

βk+1=βk+max{δ,|ϕ(z)||ϕ(z)-ϕ(y)|},

here $0 < \delta \ll 1$ (the value $\delta = 10^{-10}$ was used). In Table 4 we give the TV-values of the original images and of the corresponding reconstructions. The reconstructions were obtained after forty iterations (odf) and twenty iterations (Lena), respectively. In Figures 6, 8 and 7, 9 we display original images, reconstructions and error histories for both examples.

Table 4

TV of original images and reconstructions of odf and Lena images respectively.

Method | odf image (noise-free, noisy) | Lena image (noise-free, noisy)
(6.1) | (5257, 7541) | (27535, 32826)
super. I | (10.23, 238.1) | (731, 714)
super. II | (20.73, 240.6) | (175, 327)
original image | 4.02 | 8787.4

9 Summary

We have defined and used the EMR-functional to derive relaxation parameters in three algorithms, one fully simultaneous (Landweber) and two of row-block type. To our knowledge the expressions for the relaxation parameters for the block-methods, as given in (3.13) and (3.17), are new, whereas (3.8) appears, as mentioned previously, as a special case in [24, (2.6)] (without reference to EMR). We have also shown that the iterates of Algorithm 1-EMR converge towards a weighted least squares solution. This result extends the analysis given in [24] to the case when the null space of $A$ is nontrivial (e.g., when the matrix $A$ is underdetermined). The convergence results given for the two block algorithms are to our knowledge new. Further, it is shown that underrelaxing the EMR-parameters does not ruin convergence. All three methods are also extended to handle convex constraints. Here it is shown that the iterates, under certain consistency assumptions, converge to solutions of the corresponding convex feasibility problems. Our convergence analysis assumes exact data. Therefore we have included a numerical study of the behavior of the algorithms on noisy problems taken from image reconstruction. The conclusions from these experiments can be found in Subsection 7.3. Finally, we tested, using superiorization, the performance of EMR combined with penalization.

Figure 6: Reconstruction of odf (left) and Lena (right) images using noise-free data. Original (top), algorithm (6.1) (2nd row), super. I (3rd row), super. II (bottom).

Figure 7: Relative error of algorithm (6.1), super. I and super. II using noise-free data. Odf image (left) and Lena image (right).

Figure 8: Reconstruction of odf (left) and Lena (right) images using noisy data. Original (top), algorithm (6.1) (2nd row), super. I (3rd row), super. II (bottom).

Figure 9: Relative error of algorithm (6.1), super. I and super. II using noisy data. Odf image (left) and Lena image (right).

Funding statement: The research of the first author was in part supported by a grant from IPM (No. 94650073).

Acknowledgements

We wish to thank anonymous referees for helpful suggestions which improved our paper.

References

[1] Bauschke H. H. and Borwein J. M., On projection algorithms for solving convex feasibility problems, SIAM Rev. 38 (1996), no. 3, 367–426, doi:10.1137/S0036144593251710.

[2] Bauschke H. H., Combettes P. L. and Kruk S. G., Extrapolation algorithm for affine-convex feasibility problems, Numer. Algorithms 41 (2006), no. 3, 239–274, doi:10.1007/s11075-005-9010-6.

[3] Björck Å., Numerical Methods for Least Squares Problems, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, 1996, doi:10.1137/1.9781611971484.

[4] Butnariu D., Davidi R., Herman G. T. and Kazantsev I. G., Stable convergence behavior under summable perturbations of a class of projection methods for convex feasibility and optimization problems, IEEE J. Sel. Topics Signal Process. 1 (2007), no. 4, 540–547, doi:10.1109/JSTSP.2007.910263.

[5] Cegielski A., Iterative Methods for Fixed Point Problems in Hilbert Spaces, Lecture Notes in Math. 2057, Springer, Berlin, 2012.

[6] Censor Y., Davidi R. and Herman G. T., Perturbation resilience and superiorization of iterative algorithms, Inverse Problems 26 (2010), no. 6, Article ID 065008, doi:10.1088/0266-5611/26/6/065008.

[7] Censor Y. and Elfving T., Block-iterative algorithms with diagonally scaled oblique projections for the linear feasibility problem, SIAM J. Matrix Anal. Appl. 24 (2002), no. 1, 40–58, doi:10.1137/S089547980138705X.

[8] Censor Y., Elfving T., Herman G. T. and Nikazad T., On diagonally relaxed orthogonal projection methods, SIAM J. Sci. Comput. 30 (2007/08), no. 1, 473–504, doi:10.1137/050639399.

[9] Censor Y., Gordon D. and Gordon R., BICAV: An inherently parallel algorithm for sparse systems with pixel-dependent weighting, IEEE Trans. Med. Imaging 20 (2001), 1050–1060, doi:10.1109/42.959302.

[10] Combettes P. L., Convex set theoretic image recovery by extrapolated iterations of parallel subgradient projections, IEEE Trans. Image Process. 6 (1997), 493–506, doi:10.1109/83.563316.

[11] Davidi R., Herman G. T. and Klukowska J., SNARK09: A Programming System for the Reconstruction of 2D Images from 1D Projections, The CUNY Institute for Software Design and Development, New York, 2009.

[12] Dax A., Line search acceleration of iterative methods, Linear Algebra Appl. 130 (1990), 43–63, doi:10.1016/0024-3795(90)90205-Q.

[13] De Pierro A. R., Métodos de projeção para a resolução de sistemas gerais de equações algébricas lineares, Doctoral thesis, UFRJ, Cidade Universitária, Rio de Janeiro, 1981.

[14] De Cezaro A., Haltmeier M., Leitão A. and Scherzer O., On steepest-descent-Kaczmarz methods for regularizing systems of nonlinear ill-posed equations, Appl. Math. Comput. 202 (2008), no. 2, 596–607, doi:10.1016/j.amc.2008.03.010.

[15] Eggermont P. P. B., Herman G. T. and Lent A., Iterative algorithms for large partitioned linear systems, with applications to image reconstruction, Linear Algebra Appl. 40 (1981), 37–67, doi:10.1016/0024-3795(81)90139-7.

[16] Eicke B., Iteration methods for convexly constrained ill-posed problems in Hilbert space, Numer. Funct. Anal. Optim. 13 (1992), no. 5–6, 413–429, doi:10.1080/01630569208816489.

[17] Elfving T., Hansen P. C. and Nikazad T., Semiconvergence and relaxation parameters for projected SIRT algorithms, SIAM J. Sci. Comput. 34 (2012), no. 4, A2000–A2017, doi:10.1137/110834640.

[18] Elfving T. and Nikazad T., Properties of a class of block-iterative methods, Inverse Problems 25 (2009), no. 11, Article ID 115011, doi:10.1088/0266-5611/25/11/115011.

[19] Elfving T., Nikazad T. and Hansen P. C., Semi-convergence and relaxation parameters for a class of SIRT algorithms, Electron. Trans. Numer. Anal. 37 (2010), 321–336.

[20] Elsner L., Koltracht I. and Neumann M., Convergence of sequential and asynchronous nonlinear paracontractions, Numer. Math. 62 (1992), no. 3, 305–319, doi:10.1007/BF01396232.

[21] Engl H. W., Hanke M. and Neubauer A., Regularization of Inverse Problems, Kluwer, Dordrecht, 2000.

[22] Epstein C. L., Introduction to the Mathematics of Medical Imaging, 2nd ed., Society for Industrial and Applied Mathematics (SIAM), Philadelphia, 2008, doi:10.1137/9780898717792.

[23] Escalante R. and Raydan M., Alternating Projection Methods, Fundam. Algorithms 8, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, 2011, doi:10.1137/9781611971941.

[24] Friedlander A., Martínez J. M., Molina B. and Raydan M., Gradient method with retards and generalizations, SIAM J. Numer. Anal. 36 (1999), no. 1, 275–289, doi:10.1137/S003614299427315X.

[25] Haltmeier M., Convergence analysis of a block iterative version of the loping Landweber–Kaczmarz iteration, Nonlinear Anal. 71 (2009), no. 12, e2912–e2919, doi:10.1016/j.na.2009.07.016.

[26] Hanke M., Accelerated Landweber iterations for the solution of ill-posed equations, Numer. Math. 60 (1991), no. 3, 341–373, doi:10.1007/BF01385727.

[27] Hansen P. C., Rank-Deficient and Discrete Ill-Posed Problems, SIAM, Philadelphia, 1998, doi:10.1137/1.9780898719697.

[28] Hansen P. C. and Saxild-Hansen M., AIR Tools – A MATLAB package of algebraic iterative reconstruction methods, J. Comput. Appl. Math. 236 (2012), no. 8, 2167–2178, doi:10.1016/j.cam.2011.09.039.

[29] Herman G. T., Fundamentals of Computerized Tomography, 2nd ed., Adv. Pattern Recognit., Springer, London, 2009, doi:10.1007/978-1-84628-723-7.

[30] Jiang M. and Wang G., Convergence studies on iterative algorithms for image reconstruction, IEEE Trans. Med. Imaging 22 (2003), 569–579, doi:10.1109/TMI.2003.812253.

[31] Koltracht I. and Lancaster P., Constraining strategies for linear iterative processes, IMA J. Numer. Anal. 10 (1990), no. 4, 555–567, doi:10.1093/imanum/10.4.555.

[32] Natterer F., The Mathematics of Computerized Tomography, John Wiley, New York, 1986, doi:10.1007/978-3-663-01409-6.

[33] Nelson S. and Neumann M., Generalizations of the projection method with applications to SOR theory for Hermitian positive semidefinite linear systems, Numer. Math. 51 (1987), no. 2, 123–141, doi:10.1007/BF01396746.

[34] Nikazad T. and Abbasi M., An acceleration scheme for cyclic subgradient projections method, Comput. Optim. Appl. 54 (2013), no. 1, 77–91, doi:10.1007/s10589-012-9490-y.

[35] Nikazad T., Davidi R. and Herman G. T., Accelerated perturbation-resilient block-iterative projection methods with application to image reconstruction, Inverse Problems 28 (2012), no. 3, Article ID 035005, doi:10.1088/0266-5611/28/3/035005.

[36] Penfold S. N., Schulte R. W., Censor Y., Bashkirov V., McAllister S., Schubert K. E. and Rosenfeld A. B., Block-iterative and string-averaging projection algorithms in proton computed tomography image reconstruction, in: Biomedical Mathematics: Promising Directions in Imaging, Therapy Planning and Inverse Problems, Medical Physics Publishing, Madison (2009), 347–368.

[37] Pierra G., Decomposition through formalization in a product space, Math. Program. 28 (1984), no. 1, 96–115, doi:10.1007/BF02612715.

[38] Popa C., Extended and constrained diagonal weighting algorithm with application to inverse problems in image reconstruction, Inverse Problems 26 (2010), no. 6, Article ID 065004, doi:10.1088/0266-5611/26/6/065004.

[39] Popa C., Projection Algorithms – Classical Results and Developments. Applications to Image Reconstruction, Lambert Academic Publishing, Saarbrücken, 2012.

[40] Raydan M. and Svaiter B. F., Relaxed steepest descent and Cauchy–Barzilai–Borwein method, Comput. Optim. Appl. 21 (2002), no. 2, 155–167, doi:10.1023/A:1013708715892.

[41] Saad Y., Iterative Methods for Sparse Linear Systems, 2nd ed., Society for Industrial and Applied Mathematics, Philadelphia, 2003, doi:10.1137/1.9780898718003.

[42] Sorzano C. O. S., Marabini R., Herman G. T. and Carazo J. M., Multiobjective algorithm parameter optimization using multivariate statistics in three-dimensional electron microscopy reconstruction, Pattern Recognit. 38 (2005), 2587–2601, doi:10.1016/j.patcog.2005.03.013.

[43] Young D. M., Iterative Solution of Large Linear Systems, Academic Press, New York, 1971.

Received: 2015-9-1
Revised: 2015-10-21
Accepted: 2015-10-27
Published Online: 2015-11-28
Published in Print: 2017-2-1

© 2017 by De Gruyter
