Operator learning approach for the limited view problem in photoacoustic tomography

In photoacoustic tomography, one is interested to recover the initial pressure distribution inside a tissue from the corresponding measurements of the induced acoustic wave on the boundary of a region enclosing the tissue. In the limited view problem, the wave boundary measurements are given on the part of the boundary, whereas in the full view problem, the measurements are known on the whole boundary. For the full view problem, there exist various fast and robust reconstruction methods. These methods give severe reconstruction artifacts when they are applied directly to the limited view data. One approach for reducing such artefacts is trying to extend the limited view data to the whole region boundary, and then use existing reconstruction methods for the full view data. In this paper, we propose an operator learning approach for constructing an operator that gives an approximate extension of the limited view data. We consider the behavior of a reconstruction formula on the extended limited view data that is given by our proposed approach. Approximation errors of our approach are analyzed. We also present numerical results with the proposed extension approach supporting our theoretical analysis.


Introduction
Photoacoustic tomography (PAT) is an emerging non-invasive imaging technique. It is based on the photoacoustic effect, and it has a big potential for a successful use in biomedical studies, including preclinical research and clinical practice. Applications include tumor angiogenesis monitoring, blood oxygenation mapping, functional brain imaging, and skin melanoma detection [5,31,47,49] The principle of PAT is the following. When short pulses of non-ionising electromagnetic energy are delivered into a biological (semi-transparent) tissue, then parts of the electromagnetic energy become absorbed. The absorbed energy leads to a nonuniform thermoelastic expansion depending on the tissue structure. This gives rise to an initial acoustic pressure distribution, which further is the source of an acoustic pressure wave. These waves are detected by a measurement device on the boundary of the tissue. The mathematical task in PAT is to reconstruct the spatially varying initial pressure distribution using these measurements. The values of the initial pressure distribution inside the tissue allow to make a judgment about the directly unseen structure of the tissue. For example, whether there are some abnormal formations inside the investigated tissue, such as a tumor. of our approach. We look at the approximation errors for the unknown wave data and for the corresponding reconstructions obtained by explicit reconstruction formulas. We present the numerical results in Section 5. Finally, we finish the paper with conclusion and outlook in Section 6.

Mathematics of PAT
Let Ω ⊆ ℝ d be a bounded domain with a smooth boundary ∂Ω, where d ≥ 2 denotes the spatial dimension. Further, let C ∞ c (Ω) be the set of all smooth functions f : ℝ d → ℝ that are compactly supported in Ω. In PAT, one is interested to recover an unknown function f ∈ C ∞ c (Ω) from the solution of the wave equation given on parts of the boundary of Ω. Let us mathematically specify this reconstruction problem.

Reconstruction Problem
Let Uf : ℝ d × (0, ∞) → ℝ denote the solution of the following initial value problem for the wave equation: Here ∂ t denotes differentiation with respect to the second variable t, and ∆ x is the Laplacian with respect to x. Then the reconstruction problem in PAT consists in recovering the unknown function f ∈ C ∞ c (Ω) from the corresponding wave boundary data u(x, t) = (Uf)(x, t) for (x, t) ∈ Γ 1 × (0, ∞), (2.1) where Γ 1 ⊆ ∂Ω. If Γ 1 = ∂Ω, then (2.1) is called full view problem; otherwise, if Γ 1 ⊊ ∂Ω, we have the limited view problem (LVP). In this paper, we are particularly interested in the limited view case, which we consider in some detail in Section 2.3. Let us denote the unobservable part of the boundary as Γ 2 := ∂Ω \ Γ 1 . We define also the following restrictions of Uf : Let us note that in practice, the reconstruction problem (2.1) arises in PAT in spatial dimensions two and three. The three-dimensional problem appears when the so-called point-like detectors are used (see, for example, [12,26,49]). When one uses linear or circular integrating detectors, then the reconstruction problem (2.1) is considered in two spatial dimensions (see [6,15,38,53]).

Explicit Inversion Formula
The reconstruction problem (2.1) can be approached by various solution techniques. Among these techniques, the derivation of the explicit inversion formulas of the so-called back-projection type is particularly appealing. A numerical realization of these formulas typically gives reconstruction algorithms that are accurate and robust, and at the same time are faster than iterative approaches.
An inversion formula consists of an explicitly given operator G d that recovers the function f from the data u. Such formulas are currently known only for special domains and only for the full view data, i.e. u must be given for all x ∈ ∂Ω. In this paper, we consider the formula that first has been derived in [6,30,48]. In addition to the data u, the formula G d also depends on the boundary ∂Ω of the domain Ω ⊊ ℝ d and on the reconstruction point x 0 ∈ Ω. The structure of the formula further depends on whether the spatial dimension d is even or odd.
If d ≥ 2 is an even integer, then Here κ d := (−1) (d−2)/2 /π d/2 is a constant, ν x denotes the outward pointing unit normal to ∂Ω, and is the differentiation operator with respect to t 2 . Further, ⟨ ⋅ , ⋅ ⟩ and | ⋅ | denote the standard inner product and the corresponding Euclidian norm on ℝ d , respectively. In the case of odd dimension d ≥ 3, the formula G d is defined as follows: The formula G d has been introduced in [48] for dimension d = 3, and in [6] for dimension d = 2. In [30], it has been studied for the case when Ω is a ball in arbitrary dimension. Further, in [17,18,34], it has been shown that for any elliptical domain Ω, the formula G d exactly recovers any smooth function f with support in Ω from data u = Uf . In [20], it was shown that the same result also holds for parabolic domains Ω with d = 2. The formula G d in arbitrary spatial dimension d ≥ 2 on certain quadric hypersurfaces, including the parabolic ones, has been analyzed in [21].
It should be noted that the formula G d can be in fact used for any convex bounded domain Ω. Then, however, the formula does not recover the function f exactly, and it introduces an approximation error. The form of this error has been analyzed in [17,18,34]. Numerical experiments indicate that this error is rather low for domains that can be well approximated by elliptic domains. This is also suggested by the microlocal analysis in [35].
In the following, we will work with functions f ∈ L 2 (Ω 0 ), and we will assume that the domain Ω is such that the formula G d gives exact recovery of the function f from its wave data u = Uf , i.e. it holds that As we already mentioned, this is, for example, the case for circular and elliptical domains. In such a situation, it can be shown that G d is a continuous extension of U −1 to L 2 (∂Ω × (0, ∞)).

Limited View Problem
In practice, the wave data u is frequently given on a subset Γ 1 of the boundary ∂Ω ( Figure 2). This subset Γ 1 , called observation boundary, defines the so-called detection region (Γ 1 ) (see, for example, [26,37]).
To show the left-hand estimate, we decompose where T is larger than the diameter of Ω. Since the operator U 1 is the sum of two Fourier integral operators of order zero (see [19]), we have ‖χ [0,T] U 1 f‖ L 2 ≤ c 1 ‖f‖ L 2 for some constant c 1 . Moreover, the explicit formulas for U 1 f (see, e.g., [8]) imply also that ‖χ (T,∞) U 1 f‖ L 2 ≤ c 1 ‖f‖ L 2 , which gives the left-hand side estimate in (2.4).
The right-hand side estimate can be found in [19,Theorem 3.4]. The required visibility condition is satisfied due to the assumption that f ∈ L 2 (Ω 1 ).
It is worth to mention that despite the boundedness of U −1 1 , no theoretically exact direct solution methods are available. Let us note that if the condition Ω 1 ⊊ (Γ 1 ) is not satisfied, the visibility condition in [19, Theorem 3.4] is also not valid, and the inverse of the operator U 1 is severely ill-posed (see, e.g., [19,27,44]). Denote 2 := L 2 (Γ 2 × (0, ∞)). From the boundedness of the operator U : L 2 (Ω 0 ) → , we can deduce the boundedness of the operator U 2 : L 2 (Ω 1 ) → 2 . We will use this for the data extension operator below.
Recall that in order to give the exact reconstruction, the formula G d requires the full view wave data u, which is given for all x ∈ ∂Ω (see (2.3)). In spite of the above discussed stable recoverability of f ∈ L 2 (Ω 1 ) from equation (2.1), the use of formula G d on the limited view data u given on Γ 1 ⊊ ∂Ω leads to serious artifacts in the reconstruction; see, e.g., [20], where the numerical results of the application of G 2 on finite parabolas are presented. The reconstruction artefacts in the case of the limited view data are also discussed in [4,13,14,36,45,50].
At the same time, the use of formula G d for reconstructing function f can be attractive from various points of view. For example, as we already pointed out, the reconstruction using a numerical realization of G d is faster than iterative reconstruction algorithms. Another point may be connected with the nature of the software development. Namely, having already a tested and trusted computer code of the numerical realization of formula G d , it could be tempting to develop its extensions for the LVP.
An extension of the limited view data u from the observable part of the boundary Γ 1 ⊊ ∂Ω to the whole boundary ∂Ω may give a possibility to improve the reconstruction quality of the formula G d . In this paper, we propose to realize this extension using the operator learning approach, which we consider in the next section.

Data Extension Using Operator Learning Approach
The extension of the limited view data to the whole boundary can be in principle done by the extension operator that we define in the next subsection. This operator is however not explicitly known, and we propose an operator learning approach to construct its approximation in Section 3.2. In Section 3.3, we discuss computational aspects of the proposed learned approximation of the extension operator.

Extension Operator
Let us recall that Γ 1 ⊊ ∂Ω is the observation boundary, (Γ 1 ) is the corresponding detection region defined in Section 2.3, Γ 2 = ∂Ω \ Γ 1 is the unobservable part of the boundary, and Ω 1 is an open set with Ω 1 ⊊ (Γ 1 ). Further, let us remind that the operators U 1 and U 2 are defined in (2.2).
The operator A : 1 → 2 that maps functions U 1 f to functions U 2 f for f ∈ L 2 (Ω 1 ) realizes the extension of the limited view data u 1 = U 1 f to the unobservable part of the boundary Γ 2 . This operator A can be written as A = U 2 ∘ U −1 1 . Because of this representation and the assumptions on Γ 1 and Ω 1 , the operator A is a linear continuous operator as a superposition of linear continuous operators. Recall that the continuity (or boundedness) of the operators U −1 1 and U 2 is discussed in Section 2.3. With the introduced extension operator A, one could extend the limited view data u 1 to the whole boundary ∂Ω, and then use the formula G d on this extended data. In this way, the disadvantages of the use of the formula G d on the limited view data can be eliminated. However, the form of the operator A is not explicitly known.

Proposed Learned Extension Operator
In this paper, we propose to construct an operatorÂ n that approximates the operator A. The role of the parameter n ∈ ℕ ∪ {0} is described below. The approximate operatorÂ n must satisfy the following two requirements. The first requirement concerns the approximation quality:Â n u 1 must be close to Au 1 . The second requirement is related to the computational effort of the numerical evaluation ofÂ n u 1 . This evaluation must be fast such that the evaluation of the formula G d on the extended limited view data with the help ofÂ n remains to be computationally efficient.
Our construction of the approximate operatorÂ n is inspired by the statistical learning approach (see, e.g., [22]). So, how to construct (or, using the terminology of the statistical learning, how to learn) an approxi-mationÂ n u 1 of Au 1 using the training set Z? It should be noted that many statistical learning algorithms are designed for learning a small number of scalar-valued functions. These algorithms are not applicable in our case because the function that we need to learn is an operator. Recently, the development of the statistical learning methods for learning vector-valued functions and also functions with values in function spaces, i.e. operators, has been started (see, e.g., [2,33]). For obtaining good results, these methods require an a priori knowledge of the dependence between different components of the output vector that is given by the function to be learned. This knowledge is not readily available in our case. However, as we observe below, the linear structure of the extension operator A that we want to learn allows to employ a projection operator for the learning.

Computation of Learned Approximation
How to compute the learned approximationÂ n u 1 using the training set Z for n ≥ 1? First of all, observe that since P n u 1 ∈ V n , the projection P n u 1 has the following representation: where the coefficients c j ∈ ℝ can be determined from the conditions ⟨P n u 1 − u 1 , u 1,i ⟩ = 0 for i = 1, . . . , n. These conditions can be written in the form of the system of linear equations for the coefficients c j n ∑ j=1 c j ⟨u 1,i , u 1,j ⟩ = ⟨u 1 , u 1,i ⟩, i = 1, . . . , n. (3.4) Denote the matrix corresponding to the above linear system as P n , i.e. the elements of P n are Further, denote the vector of unknowns as c n , and the right-hand side as u n , i.e.
The matrix P n is the Gram matrix of the functions in U 1,n , and it is invertible if the set U 1,n is linearly independent. Since the operator U 1 is invertible, the set U 1,n is linearly independent if the set {f i : i = 1, . . . , n} is linearly independent, and for the following, we assume that this is the case.
Note that the matrix P n does not depend on the limited view wave data u 1 that we want to extend. Therefore, the inverse matrix P −1 n can be precomputed once the set of the learning inputs U 1,n is given. This will make the determination of the coefficients c j very fast.
Finally, with the coefficients c j in (3.3), i.e. c n = P −1 n u n , the approximationÂ n u 1 is calculated as follows:

Approximate Reconstructions and Their Error Analysis
For obtaining an approximate reconstruction of f using the limited view data u 1 = U 1 f and the formula G d , we can now proceed as follows. First, we extend the limited view data u 1 to the whole boundary ∂Ω using the learned extension operatorÂ n in the following way: Then we apply the formula G d to this extended wave dataû n in order to obtain an approximate reconstructionf n :f Note thatû 0 is obtained by extending the limited view data u 1 to the whole boundary ∂Ω with zero values on Γ 2 . As we already discussed, the corresponding approximate reconstructionf 0 contains significant errors, and it is desirable to have better reconstructions of f using u 1 . Additionally, one may desire that the reconstructionf n improves as n increases.
In the following theorem, we estimate the L 2 -error of the approximation of Au 1 byÂ n u 1 and of the approximation of f byf n . From the derived estimates, we see that the above aims can be realized if the training functions f i , i = 1, . . . , n, are chosen appropriately.

Theorem 2. Let a set of linearly independent training functions {f
. . , n} ⊆ L 2 (Ω 1 ) be given, and denote W n := {∑ n i=1 c i f i : c i ∈ ℝ}, W 0 := {0} ⊆ L 2 (Ω 1 ). Define the training limited view wave data u 1,i := U 1 f i , the corresponding linear subspace V n in (3.1), and the learned extension operatorÂ n in (3.2). Consider a function f ∈ L 2 (Ω 1 ), its limited view wave data u 1 := U 1 f , and its approximationf n defined in (4.1). Then the following L 2 -error estimate for the unobservable data holds: If additionally, the domain Ω is such that (2.3) holds, then we have the following L 2 -error estimate for the reconstruction: Proof. We first prove (4.2). From the definition of the operatorÂ n , we have From the properties of the projection operators, we also have For an element h ∈ V n , there are unique constants c i ∈ ℝ, i = 1, . . . , n, such that and therefore, there exists an element g ∈ W n such that h = U 1 g. Using this fact, we can estimate Thus, the error estimate (4.3) is obtained from (4.7), (4.8), and the error estimate (4.2).

Remark 1.
Let Q n : L 2 (Ω 1 ) → W n be the orthogonal projection on W n in the space L 2 (Ω 1 ). Then, since we have min g∈W n ‖f − g‖ = ‖f − Q n f‖, we can write ‖f − Q n f‖ instead of min g∈W n ‖f − g‖ in (4.2) and (4.3).
As we see from Theorem 2, the estimates of the L 2 -errors given by our learning procedure depend on the minimal distance from the unknown function f to the linear subspace W n defined by the training functions f i . This gives us an indication for the choice of the training functions. Namely, one should choose the training functions f i such that the unknown function f can be well approximated by their linear combination. Estimates (4.2) and (4.3) also allow us to state the condition for the exact approximation given by our learning procedure and for the convergence of the learned approximation when the number of the training functions n goes to infinity. We present these conditions in the following two corollaries. Let us now compare the errors of the approximationsf n with n ≥ 1 andf 0 . The L 2 -error estimates (4.2) and (4.3) for n = 0 become Comparing the error estimates (4.2) and (4.3) for the learned approximations with n ≥ 1 and the error estimates (4.9) and (4.10) for the approximations using zero extension of the limited view wave data, one sees that these error estimates differ regarding the following factors: correspondingly for learned approximations with n ≥ 1 and approximation using zero extension. The factors (4.11) can be seen as indicators for the expected approximation quality of the considered algorithms. For a fixed non-zero function f , the factor E 0 (f) is a fixed non-zero value, while the factor E n (f) can be zero, or can be made arbitrary small, see Corollaries 1 and 2. Therefore, the approximation quality of the learned approximations is expected to be better than of the approximations using zero extension of the data. This expectation will be confirmed by the numerical results in the next section. In fact, one can show (see Remark 2 below) that the factor E n (f) is always less than or equal to the factor E 0 (f), and the strict inequality E n (f) < E 0 (f) holds under rather mild conditions on the function f and the training functions f i . Generally, this condition can be expected to hold in practice.

Numerical Results
In this section, we present results of the numerical realization of the proposed operator learning approach. We consider the spatial dimension d = 2, and we take the elliptical domain with a 1 = 2, a 2 = 1. We use the following parametrization of the boundary: and we assume that the unobservable part of the boundary is (see Figure 3  Thus, approximately 19 % of the angular values are missing. We work with the function f presented in Figure 3 (a). Its numerical full view wave boundary data u = Uf is given in Figure 3 (b), and we use the corresponding limited view wave boundary data u 1 = U 1 f . The observation boundary Γ 1 is discretized such that the distance between two consecutive points is in the interval [0.0099, 0.0101]. We take the time step size as 0.01. We further assume that we know a rectangular region that contains the support of f (Figure 4 (top and bottom)). We use this region K for defining training functions f i . Namely, we consider partitions of the region K into squares K i , i ∈ {1, . . . , n}. The square K i contains points (x 1 , x 2 ) ∈ ℝ 2 such that where w = 1.75 (width of K), h = 0.8752 (height of K), n w = √ 2n, n h = n w 2 (see Figure 4 (middle)). Then we define the training function f i as the indicator function of the square K i . We take the number of the training functions in the form n = n 1 × n 2 , where n 1 and n 2 are the numbers of the partitioning intervals along the coordinate x 1 and x 2 correspondingly. We present the numerical results for n = 4 × 2, 8 × 4, 16 × 8, 32 × 16.
Let us note that we use the rectangular region K for illustration purpose. If the region containing supp(f) is not known, then one may consider squares filling the whole subset Ω 1 of the detection region (Γ 1 ). Further, note that other type of basis functions can be used in a similar manner. Kaiser-Bessel functions, which are frequently used in computed tomography (see, e.g., [32,43,46]), would be another reasonable choice.
The extended limited view dataû n using the learned extension operatorÂ n for the considered values of n are presented in Figure 5. We observe that as n increases, the extended dataû n approaches the full view data u in Figure 3 (a). Note that the chosen training functions f i satisfy the condition of Corollary 2. Therefore, the approach ofû n to the full view data u is in agreement with our theoretical analysis.
The reconstructionsf n using the extended dataû n are presented in Figure 6 (second and third rows). For comparison purpose, we also present the reconstructionf using the full view wave boundary data u, and the reconstructionf 0 using the zero extended dataû 0 ( Figure 6 (first row)). We evaluate the reconstructions at the points from the discrete set Ω h := {(−2.2 + n 1 h, −2.2 + n 2 h) ∈ ℝ 2 : n 1 , n 2 ∈ {0, 1, . . . , 300}} ∩ Ω, with h = 11 750 . We also consider the discrete L 2 -error of a reconstructionf * defined as follows: Let us discuss the reconstructions in Figure 6. First of all, as expected, one observes strong artifacts in the reconstructionf 0 , especially outside of supp(f). These artifacts are considerably corrected in the reconstructionf 4×2 , and as the number of the training functions n increases, the artifacts become weaker such that the reconstructionf 32×16 is very similar to the reconstructionf . This observation is also reflected in E 2 -errors that are presented in Figure 7. Note thatf differs from f due to the discretization error of the numerical realization of the formula G 2 . Thus, as in the case of the dataû n , the approach off n to f is in agreement with Corollary 2.
Finally, in Table 1, we present the calculation times for the parts involved in the proposed reconstruction approach. Our numerical results are performed with MATLAB version R2015b on the PC lenovo e31 with four processors Intel(R) Xeon(R) CPU 3.20 GHz. We see that the most time consuming part is the calculation of the matrix P −1 n , which is used for solving the system of linear equations (3.4). Here, the calculation of u 1,i = U 1 f i is the most computationally expensive. But for a given set of the training functions f i , u 1,i and the matrix P −1 n have to be calculated only once and prior to the actual image reconstruction process.  The calculation of the learned data extensionÂ n u 1 is fast. In particular, for the biggest considered number n = 32 × 16 of the training functions, the calculation time forÂ n u 1 is near the calculation time for the formula G 2 . Thus, our proposed operator learning approach fulfills the requirements that we stated at the beginning of Section 3.2. Namely, the closeness of the approximationÂ n u 1 to Au 1 , and the fast evaluation ofÂ n u 1 are realized.

Conclusion and Outlook
In this paper, we demonstrated that an approximate extension of the limited view data in PAT can be realized using an operator learning approach. Our numerical results show that the learned extension of the limited view data with a good approximation quality and a low computational cost is possible. A good approximation quality is especially achieved for the biggest number n = 32 × 16 of considered training functions. This makes the proposed learned data extension attractive for the algorithms that are designed for the full view data.
As an example, we demonstrated a satisfactory performance of a reconstruction formula with the proposed learned data extension.
It could be interesting to look at the behavior of the proposed learned data extension without knowledge of a rectangular region K containing supp(f). As we already noted, in this case, one could consider partitions of the whole detection region Ω 1 . Also other training functions, such as generalized Kaiser-Bessel functions (see, e.g., [32,43,46]), can be tried. It is appealing to consider a comparison of the reconstruction quality and computation time of the proposed reconstruction approach and iterative reconstruction algorithms. Implementation of the proposed learned extension of the limited view data to three spatial dimensions is an interesting aspect of future research. In this case, the choice of the generalized Kaiser-Bessel functions as the training functions f i is particularly convenient because for them the wave data u 1,i = U 1 f i , u 2,i = U 2 f i are known analytically (see, e.g., [46]). This makes the determination of the entries of the matrix P n fast. Also, the solution of the system of linear equations (3.4) can be done either using iterative methods, such as conjugate gradient method, or an approximate inverse matrix to P n can be determined.
Finally, it seems to be worth to examine applications of the presented operator learning approach to the limited data problems in other tomographic modalities, such as sparse angle or region of interest computed tomography.