Neural Computing with Coherent Laser Networks

We show that a coherent network of lasers exhibits emergent neural computing capabilities. The proposed scheme is built on harnessing the collective behavior of laser networks for storing a number of phase patterns as stable fixed points of the governing dynamical equations and retrieving such patterns through proper excitation conditions, thus exhibiting an associative memory property. The associative memory functionality is first discussed in the strong pumping regime of a network of passive dissipatively coupled lasers which simulate the classical XY model. It is discussed that despite the large storage capacity of the network, the large overlap between fixed-point patterns effectively limits pattern retrieval to only two images. Next, we show that this restriction can be uplifted by using nonreciprocal coupling between lasers and this allows for utilizing a large storage capacity. This work opens new possibilities for neural computation with coherent laser networks as novel analog processors. In addition, the underlying dynamical model discussed here suggests a novel energy-based recurrent neural network that handles continuous data as opposed to Hopfield networks and Boltzmann machines which are intrinsically binary systems.


INTRODUCTION
In the recent years, there has been a growing interest in developing new platforms for general-purpose or application-specific computing that offer an advantage over classical processors in terms of computational time, energy efficiency and scalability [1].Although quantum computing is widely considered as a promising route, it appears that the classical nonlinear systems exhibit a largely under-explored computational capacity that is not properly utilized in conventional digital computers [2].In this regard, there is great interest in developing alternative hardware platforms, which subsequently demand for compatible new algorithms.
Inspired by the biological brain, an interesting computational platform seems to be a network of nonlinear units, i.e., neurons, with a complex architecture that allows dense long-range interactions [3].In such systems, computing is an emergent nonlinear dynamical behavior of the network, and, in principle, can be much more efficient for certain tasks in comparison with the well-established sequential architecture.Interestingly, in the physics community interest in the subject of neural computation was raised at an early stage by the introduction of Hopfield networks [4,5].In these contexts, mainly influenced by spin systems in statistical mechanics, computing is viewed as finding states that minimize a global network energy function.Analog physical implementations of Hopfield networks with optoelectronics [6] and CMOS circuits [7,8] were demonstrated for a small number of neurons at early stages.More importantly, such networks inspired unconventional methods for solving combinatorial optimization problems [9] as well as energy-based models for machine learning [10].On the other hand, interest in physical implementation of unconventional computing with densely connected architectures has recently regained interest in photonics [11][12][13].In fact, energy-efficiency and the possibility of implementing long-range interactions make photonics an attractive candidate for neural computation.Accordingly, there is interest in developing novel methods and algorithms that allow for taking advantage of the existing photonics systems for unconventioanl computing.
Here, we show that coherent laser networks exhibit collective neural computing capabilities, and devise the fundamental requirements for realizing an associative memory for continuous patterns.What facilitates this work is recent experimental progress in creating large networks of coherently coupled photonic oscillators [11,12,[14][15][16].These activities have been primarily centered around solving computationally-hard problems by optical simulation of classical spin models.In particular, coherent laser networks have been used for solving non-convex optimization problems of the form of the classical XY Hamiltonian [17], while numerical simulation of the governing dynamical models have been shown to be an efficient optimization method [18].Here, it is shown that coherent laser networks hold a great potential as a physical energy-based neural computing platform.The present work is timely due to two important recent realizations that make coupled laser systems an attractive choice as a physical neural network.First, is the possibility of implementing dissipative interaction among laser networks which ensures the presence of fixed point attractors for such dynamical systems [14,19].The presence of dissipative coupling is shown to shift the dynamical model governing laser networks toward a class of reaction-diffusion systems that are known to be the host of exotic phenomena most notably pattern formation, which is the core of the present work [20].In contrast, driven by device applications, traditionally the general trend has been centered around dispersive interaction among laser arrays to avoid power loss, which in turn could result in unstable and chaotic behavior.Second, several recent works show the possibility of creating coupling through complex graph topologies, which is essential for implementing and training a recurrent neural network based on laser networks with desired wiring [14,21].In contrast, in the past the emphasis has been on lattice geometries with nearest neighbor couplings.It is because of this latter that we opt to call the system a laser network rather than a laser array.
Figure 1(a) schematically depicts a network of n lasers that are coherently coupled through diffraction engineering.This coherent laser network can be considered as a complex network represented with a graph as shown in Fig. 1(b).Here, each graph node, represents an artificial neuron associated with a laser that is described by its amplitude and phase, a i (t) = |a i (t)| exp(iφ i (t)), as two dynamical variables.In addition, two representative neurons i and j, interact dynamically through rates (w ij , w ji ), which could in general be non-reciprocal, i.e., w ij = w ji .Assuming that all lasers are identical, starting from an initial condition, under proper conditions the network can reach a phase-locking state where the amplitudes are nearly equal and the phases have a fixed pattern [19].In this regime, the system can be viewed as a network of phase oscillators that are governed by an ndimensional energy landscape function as shown schematically in Fig. 1(c).The equilibrium phase patterns of the laser network are associated with the local minima of this energy landscape function.Thus, the laser network can be viewed as an energybased neural network.The use of such an energy-based model can be best demonstrated through an associative memory functionality.In such a system, by properly choosing the weight matrix, one can suitably engineer the landscape function such that desired patterns are located at its local minima as illustrated in Fig. 1(d).In this manner, the network memorizes a given pattern which can be retrieved when it is suitably initialized.
In this work, first, it is shown that the conservative reciprocal coupling allows for the formation of binary patterns.We show that by using the Hebbian learning desired patterns can be mem-orized by the network, although the storage capacity is limited to only two images.Next, it is shown that these restrictions can be uplifted by considering non-reciprocal coupling that allows for treating continuous patterns, while increasing the storage capacity.A simple learning rule for training such coherent laser networks is introduced, which is based on simultaneously embedding a number of patterns as fixed point solutions of the dynamical models governing laser networks.These results are justified by numerical simulation of the dynamical equations governing laser networks.

FORMULATION A. A Single Laser
Given the importance of a single laser oscillator as an artificial neuron and a building block of the coherent laser network, first we discuss it in the following.Here, laser oscillations is modeled through a second-order nonlinear oscillator as: [22,23] where, a is the complex modal amplitude of the electric field in laser cavity, g 0 is the small signal gain, b represents the complex amplitude of a drive laser, and ξ represents fluctuations.Here, the oscillation frequency ω 0 is gauged out for simplicity, the laser is assumed to be frequency-locked with the drive, and the time is normalized to the photon lifetime, 1/γ, where γ is the passive cavity decay rate.This model, which is similar to the single-sideband Van der Pol [24] or the so-called Staurat-Landau oscillator [25], represents a class-A laser, in which the field decay rate is much less than the decay rates of the atomic degrees of freedom, i.e., atomic polarization and population inversion [26].The analysis presented in this work is based on this minimal model which facilitates integrability.However, it is later discussed that the results are applicable to a more general class of laser systems.
In the absence of seeding, i.e., b = 0, equation 1 admits a stable fixed point at ā = 0 for g 0 < 1.For g 0 > 1, the stationary solution at zero becomes unstable, while the oscillator stabilizes at ā = (g 0 − 1)/g 0 exp(iφ), where, φ is an arbitrary phase.In the presence of seeding with a complex amplitude b = |b| exp(iϕ), the equilibrium state becomes ā = | ā| exp(iϕ).Therefore, while in the absence of seeding the phase is random, seeding can fix the phase of the laser to that of the drive laser, irrespective of the initial conditions and fluctuations.This aspect is best described in the phase space.By considering a Lyapunov function function is plotted in Fig. 2 for three different scenarios of operating below threshold, above threshold, and in presence of seeding.

B. Laser Networks
The extension of the dynamical model to the case of n coupled laser oscillators is straightforward.Considering n identical oscillators, the evolution equations can be written as: ( In this relation, '•' shows entry-wise product, a = (a is the seeding vector, and ξ(t) = (ξ 1 (t), • • • , ξ n (t)) t contains the fluctuation terms.It is important to note that in a passive coupled cavity arrangement the coupling coefficients are subject to the power conservation and reciprocity relations, which respectively demand In the following, it is first assumed that the coupling is of pure dissipative nature, thus the matrix elements w ij are taken to be real, and the coupling matrix is assumed to be restricted to the aforementioned conservation relations.A more general case that involves complex coupling coefficients is discussed later.The symmetry of the coupling matrix allows for writing the dynamical model in terms of the gradient of a Lyapunov function, i.e., ȧi = −∂F/∂a * i + ξ i (t), where [19] It is straightforward to show that along the trajectories of Eq. ( 2) the time derivative of F is negative.This guarantees that starting from a given set of initial conditions, the evolution of the dynamical system (2) is toward the local minima of the multivariate cost function It is important to note that the governing cost function F can be greatly simplified in the strong pump regime, where the amplitudes tend to become uniform and the phase degrees of freedom become the key players in the phase space [19].This can be seen from Eq. ( 3), which shows the pump parameter g 0 as a penalty for intensity inhomogeneity across the laser network.By directly enforcing the condition of equal equilibrium intensity, i.e., |a i | = (g 0 − 1)/g 0 , the cost function reduces to the XY Hamiltonian for the phase degrees of freedom: It is worth recalling that φ i (i = 1, • • • , n) represent the phases of the lasers as dynamical variables that describe the phase space of the system, while ϕ i (i = 1, • • • , n) are constants that represent the phases of the drives.In the following, the attention is focused on the case of the large gain limit, which concerns only the phase degrees of freedom.In addition, for simplicity, the drive term is not considered.

ASSOCIATIVE MEMORY
The cost function of Eq. ( 4) is in general a non-convex function, thus, a coherent laser network with a given weight matrix W could have numerous local minima with different basins of attractions in the phase space.In this case, if the initial point in the phase space is located within the attractor basin of a local minimum, the network will evolve toward the associate stationary state, say Φ = ( φ1 , • • • , φn ) t .For memorizing a given pattern in the network, the inverse problem is of interest.In this case, the weight matrix W should be devised such that a desired pattern Θ = (θ 1 , • • • , θ n ) t becomes a local minimum of the energy function governing the network.In addition, when more than one patterns are to be memorized, of interest is to find a weight matrix W that guarantees the local minima associated with the patterns are located far apart in the phase space such that they can be successfully retrieved.These aspects form the core of training an associative memory, and are discussed in the following.
The cases of binary and continuous patterns are to be treated separately.First, the case of binary pattern, e.g., Θ = (θ 1 , • • • , θ n ) t , where each pixel is limited to two discrete values with contrast π, say θ i = ±π/2, is considered.Next, the analysis is extended to the general case that can treat continuous phase patterns, e.g., Θ = (θ 1 , • • • , θ n ) t , where each pixel takes continuous values, −π ≤ θ i < +π.

A. Binary Patterns
As mentioned earlier, the goal of the training is to find the coupling matrix W that results in the presence of local minima of the energy landscape function f (Eq.4) at desired points.To draw this connection, it is easier to start with identifying the stationary points of the energy landscape function f .Enforcing the condition of stationary solutions ∇ f ≡ 0, results in the following stationary phase relations for the fixed points: Clearly, the stationary state condition is satisfied for any binary pattern φi = θ i = ±π/2 ; i = 1, • • • , n, for any weight matrix.This, however, does not guarantee the presence of stable local minima at such stationary points.On the other hand, a proper weight matrix can be identified that ensures a desired pattern Θ is a local minimum.This is given by: for i = j.This weight matrix clearly respects the reciprocity condition, i.e., w ij = w ji , while the energy conservation can be enforced by choosing the diagonal elements as w ii = ∑ j |w ij |.
It can be shown that for the weight matrix given by Eq. ( 6), the desired pattern is a local minimum.This can be shown by using this weight matrix in the XY Hamiltonian of Eq. ( 6), which results in f = − ∑ i,j ).Now, one can show that the associated Hessian matrix H, with matrix elements which, in turn ensures that the desired pattern is a stable local minimum of the XY Hamiltonian (see Methods).
Figure (3) depicts the reconstruction of a binary pattern in a coherent laser network trained according to Eq. ( 6).Here, a binary 512 × 512 pixel image is considered (Fig. 3(a)).A truncated version of the image is considered as the initial phases of the oscillators (Fig. 3(b)), and the network successfully retrieves the original image after reaching an equilibrium (Fig. 3(c)).It is worth noting that in practice, the initial phases might not be controllable, while instead seeding can be used to suitably drive the network toward the memorized pattern.
It is worth mentioning the similarity of the laser network with the Hopfield network in case of binary patterns.For binary values with π contrast, the XY Hamiltonian of Eq. ( 4) becomes equivalent with the Ising Hamiltonian ∑ i,j w ij s i s j (s i = ±1), which forms the basis of the Hopfield network.Similarly, the weight matrix given by Eq. ( 6) becomes equivalent to the Hebbian learning rule of the Hopfield network, i.e., w ij = − 1 n s i s j [4].However, this similarity could be misleading given that the phase model discussed above is fundamentally different from the Hopfield network.In fact, in the dynamical model proposed by Hopfield, often called the Hopfield-Tank network, nonlinear activation functions enforce binary operation of the underlying neurons, which instead allows for physical realization of a combinatorial model [5,9].On the other hand, in the phase model discussed above, the neurons individually operate in continuous phases.In fact, here the formation of a binary pattern is solely a collective behavior that happens as a result of embedding such a pattern as a local minimum of the XY Hamiltonian through a proper design of the weight matrix.This aspect results in a fundamental challenge in using the XY model with real-valued weights as an associative memory as discussed in the following.
The Hebbian learning of Eq. ( 6) can be readily generalized to store more than one pattern.In this case, for k given patterns {Θ (1) . However, it is shown that the learning capacity of such a network of phase oscillators is very limited [27,28].In fact, using a mean-field formalism it is proven that in a network trained with the aforementioned weights, the exhibits a large number of local minima [27].However, these local minima have significant overlap which prevents successful retrieval of the memorized patterns.
It is worth noting that in case of non-binary patterns the weight matrix of equation ( 6) does not guarantee that a desired continuous pattern is a stationary point.However, it guarantees local convexity of the landscape function at that point (see Methods).Accordingly, a network trained with relation ( 6) can evolve into a nearby local minimum, which, given the highly non-convex nature of the landscape function could be close to the desired pattern.The exact reconstruction of continuous pat- terns is possible by utilizing complex coupling as discussed next.

B. Continuous Patterns
The challenge with embedding a continuous pattern as a stable local minimum of the XY Hamiltonian of Eq. ( 4) can be resolved by making a simple change in the form of the Hamiltonian as suggested in Ref. [27].This is done by considering the training parameters as a phase factor in the sinusoidal function according to f = ∑ i,j cos(φ i − φ j − ψ ij ), where the network can be simply trained to exhibit a stable local minimum at the desired contin- It is important to note that this simple change in the phase cost function demands for complex and non-reciprocal coupling among the lasers that is to be discussed later.In addition, its generalization to storing more than one patterns, according to suffers from large overlap between the memories [27,28].
Inspired by the clock model proposed in Ref. [27], here the following modification of the XY Hamitlonian is suggested: where, w ij = |w ij | exp(iψ ij ) are complex weights.This energy function contains additional parameters, i.e., the amplitudes and phases of the weight matrix elements, which can be trained to store multiple patterns.In the following, it is shown that this phase cost function can be effectively mapped onto a coherent laser network by uplifting the physical limitations of the coupling matrix.
Considering a given continuous pattern (θ 1 , • • • , θ n ) t as the equilibrium phase pattern of a laser network, and assuming that the lasers reach uniform intensities, the associated stationary state complex field amplitude is ā ≡ (e iθ 1 , • • • , e iθ n ) t .To make this a fixed point of the dynamical model governing the coherenet laser network, i.e., d ā/dt ≡ 0, one needs to ensure W ā = 0.This relation can be solved for W, which gives result to W = C(I − ā ā+ ), where, C is an arbitrary n × n matrix, I represents an n × n identity matrix, and ā+ = ā † / ā † ā is the pseudoinverse of ā.For the straightforward choice of C = I, the elements of the weight matrix, , are complex and respecting w ij = w * ji .In this case, apart from the diagonal elements, the elements of the coupling matrix have uniform amplitudes.However, as discussed next, the amplitudes |w ij | play an important role when more than one patterns are involved.Next, consider memorizing k patterns {Θ (1) To make these patterns stationary states of the laser network, one needs to enforce the condition of WA ≡ 0, which can be satisfied by the choice of W = C(I − AA + ) where, again, C is an arbitrary matrix and I is the identity matrix.A convenient choice is C = I which results in W = I − AA + .Assuming that the target k patterns are linearly independent vectors, the weight matrix W is of rank k.Therefore, its physical implementation requires n × k independent matrix elements.In addition, similar to the case of a single pattern, it is straightforward to show that this weight matrix is generally complex but Hermitian, i.e., W † = W.
It is important to note that the presence of non-reciprocal coupling (w ij = w ji ) does not generally rule out the possibility of phase locking of the network [29].In fact, the Hermiticity of the weight matrix allows the system to admit a Lyapunov function, which guarantees asymptotic stability of the laser network.In this case, due to the Hermiticty of the coupling matrix, w * ij = w ji , the Lyapunov function is the same as relation (3).In addition, by taking a i = |a i | exp(iφ i ) and assuming homogeneous amplitudes, the energy function of relation (3) reduces to the desired phase function of relation (7).It is worth stressing that the main difference of the energy function of Eq. ( 7) with the clock Hamiltonian proposed in Ref. [27] is the presence of the amplitudes of the coupling elements |w ij |.This additional degree of freedom allows for increasing the storage capacity of the network through the learning rule of Eq. ( 8).
The proposed learning is tested with a dataset of k = 64 continuous patterns of n = 64 × 64 pixels, shown in Fig. 4(a).These grayscale images are selected from a collection of dog faces from the downsampled ImageNet dataset [30].The amplitude and phase of the complex weight matrix of Eq. ( 8) are plotted in Figs.4(b).Here, the network successfully stores and retrieves all the 64 training patterns.For demonstration purposes, the reconstruction of two exemplary images from their corrupted versions is depicted in Figs.4(c,d).

CONCLUSION AND DISCUSSION
In summary, in this paper the potential of using coherent laser networks for neural computing was proposed.The coherent laser network is governed by a non-convex energy landscape function that can contain a large number of fixed point attractors.The use of the coherent laser network as an energy-based neural network model was demonstrated through an associative memory functionality.It was shown that using non-reciprocal coupling between lasers allows for going beyond binary data and adding the capability of handling continuous patterns.This work, outlines the great potential of coherent laser networks for optical neural computing.In addition, the proposed dynamical model could have applications as a novel continuous-time neural network for conventional digital computing.
It is worth mentioning that the results presented above were built on the idealistic assumption of identical oscillators, while in practice, individual laser cavities can have deviations in their resonance frequencies and linewidths.However, simulation results show that the system exhibits self-organizing behavior and can reach phase-locking in presence of tolerable perturbations.
To explore this aspect, the network of Fig. 4 is simulated under the presence of random frequency and linewidth detunings of the individual network elements.It is found that the laser network reaches phase-locking and the associative memory functionality is preserved.The results are shown in the supporting Supplementary Material [31].
The results presented in this work were based on the the socalled class-A laser model, where the gain can be considered a constant, while many practical lasers fall in the category of class-B lasers, where the gain evolves dynamically [26].The simplified model used here admits a Lyapunov function, which allows for an analytical treatment of the laser network and finding a training method.However, it should be noted that the proposed training method concerns solely the stationary behavior of the network through the coupling matrix.Therefore, the dynamics of the gain is not expected to violate the associative memory functionality, so long as the stability of the fixed points are guaranteed.As shown recently, a large gain lifetime, compared to the photon lifetime, can give rise to destabilization of shallow local minima or metastable states such as vortex states in a lattice of coupled lasers [20].In this case, however, numerical simulations indicate that the patterns embedded through the learning rule of Eq. 8 remain stable even for large gain lifetimes.This is justified by repeating the simulations of Fig. 4 with a class-B laser model as described in the Supplementary Material [31].
Finally, the present work is focused on the associative memory functionality as a generic task for energy-based models, while it remains to examine the full capacity of coherent laser networks as energy-based models in different network architectures and for various functionalities [32].In addition, the proposed pseudoinverse learning is a simplistic approach, which is suitable for experimental realization given that it requires a lowrank weight matrix.However, of interest would be to develop advanced training algorithms that allow for harnessing the full capacity of the coherent laser networks for machine learning.

A. Numerical Simulations
The coherent laser as described by equations ( 2) is in essence a continuous-time energy-based recurrent neural network.Considering the potential importance of the proposed model for unconventional computing through simulations of the underlying model with digital computers, in the following, numerical simulations are briefly discussed.The numerical simulations of Eqs. ( 2) are performed with a forward-difference Euler method according to: For the simulations discussed in this paper, the network converges rapidly (after ∼ 100 steps).In general, the most computationally-costly process in Eq. ( 9) is the matrix-vector multiplication.It should also be noted that Eq. ( 9) deal with complex numbers, which require double-precision floating point format.To consider noise, uncorrelated delta noise is generated for each oscillator, i.e., ξ i (t ) * ξ j (t) = Dδ ij δ(t − t ).The effect of detuing is considering by changing the first term of Eq. ( 9) according to −a(t) → −(1 + δγ + iδω) • a(t), where, δω = (δω 1 , • • • , δω n ) t , and δγ = (δγ 1 , • • • , δγ n ) t .Here, δω i , δγ i ∼ N (0, σ) with σ ∼ 0.05.

B. The Hessian Matrix
The Lyapunov function of Eq. ( 3) is a function of 2n variables, which can be cast in a vector as e = (a 1 , • • • , a n , a * 1 , • • • , a * n ) t .The Lyapunov function near an arbitrary point can be expanded as: where, ∇F is the gradient vector and H is a 2n × 2n Hessian matrix.In this representation, stationary states are points associated with ∇F( ē) = 0.The Hessian matrix can be represented as: where, H d = (g 0 − 1)I − 2g 0 diag(a * • a) − W, and H o = −g 0 diag(a • a).
For the phase cost function of Eq. ( 4), the Hessian matrix is an n × n matrix with elements h ij = ∂ 2 f /∂φ i ∂φ j , which are found to be: For the choice of the weight matrix of Eq. ( 6) for a given pattern Θ = (θ 1 , • • • , θ n ) t , evaluating the Hessian at this pattern, results in the off-diagonal elements h ij = − 1 n cos 2 (θ i − θ j ), and diagonal elements h ii = 1 n ∑ j cos 2 (θ i − θ j ).In this case, the Hessian matrix is of the form of the Laplacian matrix of a weighted graph with adjacency matrix elements 1 n cos 2 (θ i − θ j ) [33].It is straight- forward to show that this Hessian matrix is positive semidefinite given that it is symmetric and diagonally dominant [33].This result is valid for both choices of binary and continuous patterns, however, one should recall that the training of Eq. ( 6) is limited to patterns that pass the stationary test of Eq. ( 5), that is limited to binary patterns.

Fig. 1 .
Fig. 1.(a) A schematic of the coherent laser network, and (b) the associated network graph, and (c) the multivariate energy function.By locating a desired pattern at a local minimum of the energy function, it can be retrieved when the network is suitably initialized to start from the attractor basin of the embedded fixed point.

Fig. 2 .
Fig. 2. The Lyapunov function of a single laser (a) un-pumped, (b) pumped, (c) pumped and seeded, in the in-phase and quadrature phase coordinates, x = (a + a * )/ √ 2 and y = (a − a * )/ √ 2i.(d) The amplitude of the equilibrium solution of a single oscillator versus the pump parameter g 0 .

Fig. 3 .
Fig. 3. Reconstruction of binary images by CLNs with proper weight matrices.(a) A binary 512 × 512 pixel image memorized by the CLN.(b) A portion of the memorized binary image is used as initial phase distribution of the trained CLN.(c) The original image is reconstructed as the CLN evolves to an equilibrium state.

2 Fig. 4 .
Fig. 4. Reconstruction of binary and gray-scale images by CLNs with proper weight matrices.(a) A binary 512 × 512 pixel image memorized by the CLN.(b) A portion of the memorized binary image is used as initial phase distribution of the trained CLN.(c) The original image is reconstructed as the CLN dynamically evolves to an equilibrium state.(d-f) Similar to (a-c) for a grayscale 512 × 512 pixel image.