Identifying topology of leaky photonic lattices with machine learning

We show how machine learning techniques can be applied to the classification of topological phases in leaky photonic lattices using limited measurement data. We propose an approach based solely on bulk intensity measurements, thus avoiding the need for complicated phase retrieval procedures. In particular, we design a fully connected neural network that accurately determines topological properties from the output intensity distribution in dimerized waveguide arrays with leaky channels, after propagation of a spatially localized initial excitation over a finite distance, in a setting that closely emulates realistic experimental conditions.


I. INTRODUCTION
Machine learning holds great promise for solving a variety of problems in nanophotonics. Rather than attempting to model the system of interest exactly from first principles (e.g., by solving Maxwell's equations), machine learning techniques aim to discover or reproduce key features of a system by optimizing parametrized models using a set of training data [1]. A trained model can often predict the properties of a device faster than conventional simulation techniques [2,3]. Machine learning can also be used to solve the inverse problems of how to design a nanophotonic structure with desired functionalities, and how to reconstruct the parameters of a device using indirect measurements [4][5][6][7][8]. The latter is particularly important for nanophotonic devices, since structural parameters may differ substantially from the nominal design due to fabrication imperfections.
Recently developed topological photonic systems provide a useful testbed for better understanding the capabilities and limitations of machine learning approaches in nanophotonics [9,10]. Topological photonic structures host robust edge states which are protected against certain classes of fabrication imperfections. This robustness is explained by the bulk-boundary correspondence, which relates the existence of localized boundary modes to nonlocal topological invariants expressed as integrals of a connection or curvature of the bulk modes [11]. While the direct measurement of a topological invariant entails the reconstruction of both the intensity and phase profiles of the bulk modes of a structure, machine learning models can perform supervised classification of topological phases using a limited set of observables [9].
In general, the performance of machine learning depends on both the quality and quantity of the data used to train the model. Supervised learning approaches, such as deep neural networks, typically require a huge quantity of labelled training data, which may be hard to come by. This has motivated recent interest in the use of unsupervised learning techniques such as manifold learning, which do not require labelled training data to distinguish topological phases [12][13][14][15][16]. Broadly speaking, these techniques are sensitive to sharp changes to observables that occur in the vicinity of topological phase transition points, and thus perform best when one has access to measurements from a large set of different model parameters, which is most feasible when the parameter controlling the phase transition is continuously tunable [14].
The above methods also rely on prior knowledge of the characteristics of the physical system (such as its size, its internal structure, and the parameters of the initial excitation), and are therefore not in line with a realistic experimental framework. Data quality and feature selection can have a significant impact on the machine learning-based reconstruction of topological phase diagrams [17]. For example, missing data arising from incomplete measurements or local perturbations to the data can act as adversarial attacks that fool neural network-based classifiers of topological phases into making incorrect predictions [18]. The existence of adversarial examples highlights the importance of taking platform-specific uncertainties and disorder into account in the selection and design of machine learning classifiers of topological phases.
The aim of this study is to investigate how common obstacles encountered in the characterization of nanophotonic devices (disorder, imperfect alignment, and access to a limited set of output observables) affect the performance of machine learning-based classification and clustering methods for topological phases. Specifically, we focus on the case of one-dimensional waveguide arrays, which have provided a versatile platform for the investigation of topological effects in nanophotonics [19][20][21], and consider the problem of predicting the existence or absence of edge states based on bulk intensity measurements. First, we show that while curated input data can improve the performance of clustering, ambiguity in the training data (in the form of uncertainty in the alignment of the input waveguide) leads to incorrect cluster assignments, requiring the use of supervised learning techniques. We compare the performance of several supervised classification models, including a convolutional neural network, demonstrating the ability to predict the existence of different edge state configurations with high accuracy using bulk intensity measurements. Finally, we show the feasibility of transfer learning for sufficiently weak disorder strengths, i.e., maintaining accurate predictions of topological edge states using a model trained on disorder-free data. Our numerical results reveal the feasibility of using machine learning techniques to distinguish nanophotonic topological phases using incomplete measurements.
The outline of this article is as follows: Section II reviews the properties of the leaky Su-Schrieffer-Heeger (SSH) tight binding model and introduces the datasets which will be used in our study. Section III presents the results of unsupervised clustering according to the edge state configuration using the t-distributed stochastic neighbor embedding (t-SNE) method. We compare the performance of different supervised learning techniques in Sec. IV. As an example of the feasibility of transfer learning, we consider in Sec. V the classification performance for disordered waveguide arrays. We conclude with Sec. VI. The Supplementary Materials contain additional details on the tight binding model parameters, training data, and the employed machine learning models.

II. MODEL AND DATASET PREPARATION
We consider light propagation in waveguide arrays governed by the paraxial wave equation,

$$i\frac{\partial E}{\partial z} + \frac{1}{2k_0}\nabla_\perp^2 E + \frac{k_0 n_L(\mathbf{r}_\perp)}{n_0}E = 0,$$

where $E$ is the envelope of the optical wavepacket propagating along the z (waveguide) axis, $\mathbf{r}_\perp = (x, y)$ are the transverse coordinates, $k_0 = 2\pi n_0/\lambda$ is the wave number, $n_L(\mathbf{r}_\perp)$ is a perturbation of the refractive index forming the waveguide lattice, and $n_0$ is the background refractive index of the medium. Formally, the final state after a propagation distance L can be obtained by projecting the input (z = 0) state $E(0, \mathbf{r}_\perp)$ onto the propagation-invariant modes of the array $\varphi_n(\mathbf{r}_\perp)$ with propagation constants $\beta_n$, i.e.

$$E(L, \mathbf{r}_\perp) = \sum_n A_n \varphi_n(\mathbf{r}_\perp) e^{i\beta_n L}, \qquad (3)$$

where $A_n = \int d\mathbf{r}_\perp\, \varphi_n^*(\mathbf{r}_\perp) E(0, \mathbf{r}_\perp)$ are the amplitudes of the modes excited at the input (z = 0). The intensity of the final state (3) is sensitive to both the modal excitation amplitudes $A_n$ and the propagation length L, so intensity measurements at a single L are generally insufficient to uniquely reconstruct the modal profiles, propagation constants, and topological invariants of the system. Conventional schemes for predicting topological properties of the modes $\varphi_n(\mathbf{r}_\perp)$ based only on measuring intensity profiles require either the large L limit [22,23] or measuring the evolution as a function of z [24,25]. On the other hand, machine learning approaches can in principle infer topological properties using intensity measurements at a fixed propagation distance [26][27][28], at least given access to a sufficient amount of high quality training data.

As a specific example, in the following we consider the leaky Su-Schrieffer-Heeger waveguide lattice shown in Fig. 1(A), a dimerized array composed of N leaky waveguides with elliptical cross-sections of semiaxes $a_{x,y}$ induced by the refractive index perturbations of magnitude $n_A$ [23]. With increasing coupling between the structural elements, some supermodes of the lattice become radiative, acquiring a finite lifetime. The radiation losses can be fine-tuned by optimizing the effective potential of the environment and radiation channels. This will allow us to study how changes to the input dataset affect the performance of machine learning-based classification of the different topological phases of this lattice. One possible implementation of the radiation channels is by coupling the main array to auxiliary arrays, each consisting of $N_{env}$ equidistantly spaced single-mode waveguides with an index contrast $n_B$, as shown in Fig. 1(B,D). Examples of feasible parameters close to those employed in the experimental work Ref. [29] are given in Table I. In Table I, $d_{1,2}$ are the center-to-center distances between waveguides along the vertical axis, ρ is the center-to-center distance between waveguides along the horizontal axis, and the arrays of auxiliary waveguides are set aside from the main array at a distance $d_\epsilon$. Here, λ is the operating wavelength, $n_0$ is the background refractive index of silica glass, and $n_{A,B}$ are the perturbations of the refractive index inside the waveguides of the main array and of the arrays of the environment, respectively.
Provided only one band of the main array overlaps with the dispersion curve of the side-coupled leaky channels, an initially localised excitation with a broad transverse wavenumber spectrum would undergo gradual radiation and decay during propagation. Therefore, only the top branch will remain populated after a certain propagation distance, making it possible to calculate the topological invariant of the band using the projector of the output field distribution, following the method used in Ref. [23]. However, this recipe generally requires knowing the complex-valued field, whereas phase retrieval could be a challenging task. We will demonstrate the possibility of unravelling the topology of the sample lattice based solely on the output intensity profile in a roughly center-positioned floating window, with the use of machine and deep learning methods.
To simplify propagation simulations, we constructed the tight binding model (TBM) corresponding to the schematic in Fig. 1(B) and determined the parameters of the effective Hamiltonian in compliance with the paraxial modeling. In the TBM equations (4), $\psi_m$ and $c_{ml}$ are the amplitudes of the optical field in the main array and in the leaky channels, respectively, $\hat{H}_0$ is the N × N Hamiltonian of the main array, made of the alternating nearest-neighbor (NN) coupling coefficients $J_{1,2}$, $\epsilon$ is the coupling strength between the main array and the environment, $J_{env}$ is the NN hopping coefficient in the leaky channels, and $\Delta$ is a detuning of the propagation constants.
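As a rough illustration of the kind of tight binding simulation described above, the following Python sketch assembles a Hermitian matrix for a dimerized main array side-coupled to auxiliary chains and propagates a single-site excitation. The layout (one auxiliary chain of `N_env` sites attached to every main-array site), the sign convention i dψ/dz = Hψ, the function names, and all numerical values are assumptions made for illustration only; they are not the model equations or the parameters of Table II.

```python
import numpy as np

def leaky_ssh_hamiltonian(N, J1, J2, eps, J_env, delta, N_env):
    """Hermitian TBM matrix: main SSH array plus one auxiliary chain per site.

    Assumed layout (hypothetical): indices 0..N-1 are the main array; the chain
    attached to main site m occupies N + m*N_env ... N + (m+1)*N_env - 1.
    """
    dim = N + N * N_env
    H = np.zeros((dim, dim))
    # Main array: alternating couplings J1, J2, J1, ... starting from the left.
    for m in range(N - 1):
        J = J1 if m % 2 == 0 else J2
        H[m, m + 1] = H[m + 1, m] = J
    for m in range(N):
        start = N + m * N_env
        # Coupling eps between main site m and the first site of its chain.
        H[m, start] = H[start, m] = eps
        for l in range(N_env):
            H[start + l, start + l] = delta  # detuning of the auxiliary guides
            if l < N_env - 1:
                H[start + l, start + l + 1] = H[start + l + 1, start + l] = J_env
    return H

def output_intensity(H, N, site, L):
    """Intensity in the main array after distance L for a single-site input."""
    beta, modes = np.linalg.eigh(H)        # propagation constants and modes
    psi0 = np.zeros(H.shape[0], dtype=complex)
    psi0[site] = 1.0
    amplitudes = modes.conj().T @ psi0      # projection onto the modes
    psi_L = modes @ (np.exp(-1j * beta * L) * amplitudes)
    return np.abs(psi_L[:N]) ** 2

# Illustrative values: the auxiliary band [delta - 2*J_env, delta + 2*J_env]
# is chosen to cover only the lower SSH band [-(J1 + J2), -|J1 - J2|].
H = leaky_ssh_hamiltonian(N=22, J1=0.5, J2=1.0, eps=0.3, J_env=0.6, delta=-1.0, N_env=14)
I_out = output_intensity(H, N=22, site=10, L=3.0)
print(I_out.sum())  # power remaining in the main array (cannot exceed 1)
```

Because the full matrix is Hermitian, the total power is conserved; the apparent loss in the main array is the power that has moved into the auxiliary chains, which is why reflections from the ends of finite chains eventually return power to the main array.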
The dispersion characteristics of the disconnected (at $\epsilon = 0$) uniform lattices representing the main (SSH) array and the environment (env) are given by

$$\beta_{\mathrm{SSH}}(k) = \pm\sqrt{J_1^2 + J_2^2 + 2J_1J_2\cos k}, \qquad \beta_{\mathrm{env}}(k) = \Delta + 2J_{env}\cos k,$$

where k is the Bloch wavenumber in units of the inverse period of the respective lattice, and plotted in Fig. 1(C). As deliberately ensured by design, the environmental array's dispersion curve fully intersects the lower band of the SSH lattice, meaning that only the lower band becomes lossy. Given dimerization, the main array is known to be topologically nontrivial for $J_1 < J_2$ and topologically trivial for $J_1 > J_2$.

To prepare a dataset, the TBM equations (4) were solved numerically. At the input, we excite a single waveguide designated as i in Fig. 1(A). The use of a single-element input is justified by its wide spectrum, which allows populating both bands of the lattice. By iterating over parameters of the photonic lattice in the ranges indicated in Table II, we accumulated data for the analysis of the topology of the main array. While preparing the datasets, $J_{1,2}$ were uniformly sampled from within the specified intervals for each vector. We take into account that the lattice ends can be different, so that N can be odd. We select a sample window composed of a finite number $N_c$ of the central waveguides in the main array. Thereby, we aim to solve the classification problem for a finite lattice sample, i.e., to distinguish between different configurations of the two edges based on the intensity distribution measured at the output of the $N_c$ central waveguides. The edge of the SSH main array can be either trivial (0) or nontrivial (1), depending on whether the lattice is terminated by a strong or a weak bond, respectively. The nontrivial edge supports a midgap topological edge state. This yields four classes in total: 00, 11, 10, 01. The four possible configurations are visualized in Fig. 1(E): 01 (left trivial, right nontrivial), 11 (left nontrivial, right nontrivial), 10 (left nontrivial, right trivial), 00 (left trivial, right trivial). Note that this setup of the problem is different from that in Ref. [23], where both edges of the lattice had the same termination. Also, to calculate the field projector, the field distribution over all elements of the main array was used there, that is, $N_c = N$ with N even.

Our previous work [23] presented a proposal for calculating the topological invariant (Zak phase) for this lattice (of classes 00 or 11) using the field projector of the output distribution. This procedure is summarized in Fig. 2. By analyzing the complex-valued field distribution [note Fig. 2(C,D) only shows the intensity], we compute the Zak phase, which asymptotically approaches π in the nontrivial configuration [see Fig. 2(A)], provided the leaky channels are introduced. At distances 4 cm < z < 9 cm the lower band is completely depopulated as a result of leakage. This depopulation is also evident in the total wavepacket norm, which converges towards 1/2. However, when the propagation distance is increased beyond z = 9 cm, reflections occur from the ends of the finite environment array and the main lattice, resulting in an increase in the total wavepacket norm [see Fig. 2(B)] and rendering the method inapplicable. Thus, accurate reconstruction of the topological invariant requires either a large lattice or a well-controlled propagation length to avoid reflections off the ends.
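The four edge classes introduced in this section can be illustrated with a short labelling routine. This is only a sketch: it assumes a hypothetical convention in which the bonds alternate J1, J2, J1, ... starting from the left end, so that the parity of N decides which bond terminates the right end. The function name `edge_class` is introduced here for illustration.

```python
def edge_class(J1, J2, N):
    """Return the two-character edge label, e.g. '11' or '01'.

    Assumed convention (hypothetical): bonds alternate J1, J2, J1, ... from the
    left, so the left edge ends on a J1 bond and the right edge ends on J1 for
    even N or on J2 for odd N. An edge terminated by the weaker bond is
    labelled nontrivial (1), otherwise trivial (0).
    """
    if J1 == J2:
        raise ValueError("J1 == J2 is the gapless transition point")
    left_bond = J1
    right_bond = J1 if N % 2 == 0 else J2
    weak = min(J1, J2)
    left = 1 if left_bond == weak else 0
    right = 1 if right_bond == weak else 0
    return f"{left}{right}"

for J1, J2, N in [(0.5, 1.0, 22), (1.0, 0.5, 22), (0.5, 1.0, 23), (1.0, 0.5, 23)]:
    print(J1, J2, N, edge_class(J1, J2, N))
```

Under this convention, even N yields only the classes 00 and 11, while odd N yields 10 and 01.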

III. UNSUPERVISED LEARNING
To begin, we perform a preliminary analysis of the prepared datasets using the t-SNE (t-distributed Stochastic Neighbor Embedding) method. t-SNE is a nonlinear dimensionality reduction algorithm which learns a low-dimensional embedding of the input data; points within the input data set that are close to each other will remain close to each other in the embedded space. Ideally, a vector will be most similar to others obtained from the same lattice configuration, resulting in visible clustering in the low-dimensional embedding.
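For readers unfamiliar with t-SNE, a minimal sketch of such an embedding using scikit-learn is given below. The input here is synthetic placeholder data (two groups of normalised 22-element vectors with opposite dimerization patterns), not the simulated intensity vectors of this work, and the perplexity and other settings are arbitrary choices rather than the ones used for Fig. 3.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Placeholder data standing in for output intensity vectors: two synthetic
# groups of 22-element profiles with opposite even/odd modulation.
group_a = rng.random((30, 22)) + np.tile([1.0, 0.0], 11)
group_b = rng.random((30, 22)) + np.tile([0.0, 1.0], 11)
X = np.vstack([group_a, group_b])
X /= X.sum(axis=1, keepdims=True)   # normalise each vector to unit total power

# The perplexity must be smaller than the number of samples.
embedding = TSNE(n_components=2, perplexity=10, init="pca",
                 random_state=0).fit_transform(X)
print(embedding.shape)  # (60, 2)
```

Each row of `embedding` is the two-dimensional image of one input vector; colouring the points by their known class then reveals whether the classes form separate clusters.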
In this approach, we work with the intensity distribution within $N_c = N$ elements (N = 22 or 23, to be more specific), and assume that the pumped waveguide can be shifted from the center of the lattice. Figure 3 shows t-SNE maps of the system with fixed L = 7.6 cm, N = 22 (23) and two different positions of the initially excited waveguide. In the Hermitian case (leakage disabled), the different classes become mixed up in the embedded space, whereas in the case of a lattice with leaky channels they do not. This qualitatively agrees with the theory in Ref. [23], specifically that the different phases will exhibit distinct intensity distributions in their bulk.
However, as soon as we introduce uncertainty, such as in the position of the initial excitation, the topological classes are no longer clearly separable: in the Hermitian case different classes become mixed up [Fig. 3(C)], whereas in the leaky lattice too many clusters are obtained [Fig. 3(F)]. Consequently, unsupervised methods are no longer applicable.
Figure 4 presents the statistical analysis of the data used for the (C,F) panels of Fig. 3. This visualization shows that classes 01 and 00, as well as 10 and 11, can be grouped pairwise. However, the classes with dissimilar edge topologies (01 and 10) are differentiated from the classes with identical edge topologies (00 and 11) by odd N, due to distinct input vector lengths (the 23rd waveguide, present only in the former case, is shown shaded). This postprocessing also reveals significant overlaps of the intensity bars for the 00 and 11 classes in each waveguide of the Hermitian SSH lattice, while the bars overlap less in the leaky lattice, forming shifted dimerized patterns, a feature to be noticed by the neural network.

IV. SUPERVISED LEARNING
For supervised classification of the four topological classes, we apply machine learning (K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Decision Tree) and deep learning (Multi-layer Perceptron (MLP), Convolutional Neural Network (CNN)) methods (see details in the Supplementary Materials, Section III). The numerical experiments were carried out with varying parameters: the propagation distance L, the total number of waveguides N, and the number of central waveguides in the sample window $N_c$. The input waveguide i can be shifted by 1 from the center of the array, according to the expression i = ceil(N/2 + l), where l can be 0 or 1. For each L we obtain a dataset of 32,000 intensity vectors. Subsets of the whole dataset can then be grouped according to a given parameter.

Let us examine how the classification accuracy depends on the different parameters. The metric we use for this nonbinary classification problem is the accuracy, defined as the percentage of correct model predictions,

$$\mathrm{Accuracy} = \frac{100\%}{n}\sum_{i=1}^{n}\mathbb{1}(p_i = y_i),$$

where $p_i$ and $y_i$ are the predicted and the correct answer for the i-th of n test vectors, respectively, and $\mathbb{1}$ is an indicator function equal to one if the condition is met and zero otherwise. Figure 5(A) illustrates how the accuracy of the supervised learning techniques varies with the parameter L. When the value of L is small, theoretical predictions cannot distinguish between different topological phases, and all methods show similar accuracy plateaus. The accuracy of the machine learning methods then increases with increasing L. At the same time, the theoretical curve for the Zak phase in the nontrivial case ceases to converge to the quantized invariant value π for L = 10.6 cm [see Fig. 2(A)], while the power in the main array tends to grow and exceeds one half [see Fig. 2(B)]. This is explained by reflection from the boundaries of the leaky channels, as the field returns back to the main array. The requirement to know both the intensity and phase at the output in the method of Ref. [23] is thus replaced by statistical information about the dynamics, drawn only from intensity distributions at fixed L. Machine learning methods perform better for larger L. This may be due to the fact that, as soon as the radiation reaches the edges, one can distinguish the trivial case from the nontrivial one by considering not only bulk properties but also the edges themselves, and machine learning methods allow us to take this effect into account. For instance, the trivial and nontrivial cases are even visually distinguishable in the dynamics shown in Fig. 2(C,D): in the nontrivial case the bulk modes poorly couple to waveguides at the edges. Note that if we increase the number of auxiliary waveguides $N_{env}$, the theoretical power curve will exhibit convergence to 0.5, but the reflection off the main array edges will still manifest at larger propagation distances. Thus, neural network methods are applicable in a wider range of cases than the theoretical scheme based on the projector calculation.
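The accuracy metric used in this section, i.e. the percentage of predictions that coincide with the correct labels, can be written in a few lines. The label lists below are made up purely to exercise the function and the function name is illustrative.

```python
def accuracy(predicted, correct):
    """Percentage of predictions p_i that equal the correct labels y_i."""
    if len(predicted) != len(correct) or not correct:
        raise ValueError("need two non-empty label lists of equal length")
    hits = sum(1 for p, y in zip(predicted, correct) if p == y)
    return 100.0 * hits / len(correct)

predicted = ["00", "11", "10", "01", "11"]
correct = ["00", "11", "01", "01", "11"]
print(accuracy(predicted, correct))  # 80.0
```

For four balanced classes, random guessing would give an accuracy of about 25%, which sets the baseline against which the curves for the different methods should be read.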
Based on the results summarised in Fig. 5(A), we conclude that the other classical machine learning methods show lower accuracy than the neural networks and the support vector machine (SVM). One of the two most promising models, the MLP method, was chosen for more thorough examination in Fig. 6.
As noted above, training was carried out using $N_c < N$ central waveguides. Figure 6(A) shows the dependence of the classification accuracy on the number of central waveguides, with all L included in the training batches. In the initially proposed theoretical scheme, we calculated the field projector for $N_c = N$ elements, but we can formally calculate it for any $N_c < N$, as shown in Fig. 6(B). The Zak phase is seen to converge better to the correct quantised value for larger $N_c$, and this condition is also necessary to increase the accuracy of machine learning algorithms: in Fig. 6(A) the accuracy increases as the $N_c/N$ ratio increases.
To better understand the performance of the supervised classification approach at distinguishing the different edge types, we compare the topological SSH lattice with an even number of elements and its non-topological counterpart, where dimerization is produced by the alternating difference in propagation constants ($\Delta_1$ and $\Delta_2 = -\Delta_1$), whereas the coupling between neighboring elements is uniform and equal to J, as schematically shown in Fig. 7(A). To prepare the corresponding datasets, the parameters of the non-topological lattice ($\Delta_1$ and J) are chosen such that its band structure coincides with the topological one (see Supplementary Materials, Section I). We introduce trivial edge defects as detunings of the propagation constant in the edge elements. Thereby, the defect potential for the left end is $\tilde{\Delta}_1 = \Delta_1(1 - q_1)$, whereas the defect potential for the right end is $\tilde{\Delta}_2 = \Delta_2(1 - q_2)$. We compare the accuracy of the neural network at three propagation distances [see Fig. 7(B)] for the topological SSH array and the non-topological array with the edge defects in distinguishing the two classes: both edges either support confined solutions (class 11) or not (class 00). We find that for small amplitudes of the defect the accuracy for the non-topological lattice is low compared to the topological one, since the defect is not connected to its bulk properties (unlike in the topological case). When the defect amplitude becomes large, however, the bulk modes also change, leading to an increase in the model accuracy.
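The requirement that the two lattices share the same band structure can be made concrete for infinite chains. The sketch below is a textbook-level check, not the procedure of the Supplementary Materials: it assumes bulk bands ±sqrt(J1² + J2² + 2 J1 J2 cos k) for the SSH chain and ±sqrt(Δ1² + 2J²(1 + cos k)) for the uniform-coupling chain with alternating detunings ±Δ1, which coincide for all k when J = sqrt(J1 J2) and Δ1 = |J1 − J2|. The function names and numerical values are illustrative.

```python
import math

def ssh_band(k, J1, J2):
    """Upper band of the infinite SSH chain (the lower band is its negative)."""
    return math.sqrt(J1**2 + J2**2 + 2 * J1 * J2 * math.cos(k))

def detuned_band(k, J, delta):
    """Upper band of a uniform-coupling chain with alternating detunings +-delta."""
    return math.sqrt(delta**2 + 2 * J**2 * (1 + math.cos(k)))

def matched_parameters(J1, J2):
    """Uniform coupling J and detuning delta reproducing the SSH bulk bands."""
    return math.sqrt(J1 * J2), abs(J1 - J2)

J1, J2 = 0.5, 1.0
J, delta = matched_parameters(J1, J2)
for k in (0.0, 1.0, 2.0, math.pi):
    print(k, ssh_band(k, J1, J2), detuned_band(k, J, delta))
```

Since the bulk spectra are identical by construction, any difference in classification accuracy between the two lattices must originate from how the edges relate to the bulk modes rather than from the band structure itself.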

V. DISORDER AND TRANSFER LEARNING
Transfer learning refers to the use of a model trained on one set of data to make accurate predictions on a new task. Here we consider the performance of models trained on ideal data in classifying data obtained from different model parameters. If the quality metric falls only slightly, we can conclude that the model has a generalization ability. This is particularly important in the context of nanophotonic circuits, where inevitable disorder will lead to sample-to-sample variations of device parameters.
First, we note that the generalization ability is not observed for the parameter L: the accuracy drops significantly when testing on L different from the propagation distance used for the training data. On the other hand, we observe generalization over certain values of N, corresponding to attaching dimers to both edges of the main array. This stems from the fact that such an addition of elements does not qualitatively change the topology of the lattice (see the cross-validation control map for parameter N in Supplementary Materials, Section IV).
Next, we examine a transfer learning approach that allows for the reuse of models pretrained at a fixed propagation distance of L = 10.6 cm [referring to the last point in Fig. 5(A)] on lattices with disorder. We introduce perturbations of two types into the SSH Hamiltonian coefficients: off-diagonal disorder in the inter-site coupling strengths and on-site disorder in the propagation constants. Incorporating disorder involves adding random variables to the coefficients of the Hamiltonian. For example, the off-diagonal disorder perturbs each coupling coefficient by the random variable $l\langle d\rangle\,\mathrm{mean}(J_1, J_2)$, where l is uniformly distributed in the range [−1/2, 1/2] and $\langle d\rangle$ is the disorder strength. This is a chiral type of disorder in the sense that the Hamiltonian describing the disordered system respects the chiral symmetry, thus its topological edge states will remain at zero energy. We train the neural network using a non-disordered array and test it on the disordered lattice. We have identified a range of disorder strengths in which the previously trained neural network can operate with high confidence.
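The off-diagonal disorder prescription above can be sketched as follows. The function name, the alternation of the clean couplings starting with J1, and the numerical values are illustrative assumptions; the normalisation of the on-site disorder is not specified in the text and is therefore not shown.

```python
import random

def disordered_couplings(J1, J2, n_bonds, strength, rng):
    """Alternating couplings J1, J2, ... with off-diagonal disorder added.

    Each bond is shifted by l * strength * mean(J1, J2), with l drawn uniformly
    from [-1/2, 1/2], following the prescription in the text.
    """
    mean_J = 0.5 * (J1 + J2)
    couplings = []
    for m in range(n_bonds):
        J = J1 if m % 2 == 0 else J2
        l = rng.uniform(-0.5, 0.5)
        couplings.append(J + l * strength * mean_J)
    return couplings

rng = random.Random(0)  # fixed seed so the realisation is reproducible
clean = disordered_couplings(0.5, 1.0, n_bonds=21, strength=0.0, rng=rng)
noisy = disordered_couplings(0.5, 1.0, n_bonds=21, strength=0.2, rng=rng)
print(max(abs(a - b) for a, b in zip(clean, noisy)))  # at most 0.2 * 0.75 / 2
```

Because only the couplings are perturbed and the on-site terms stay equal, the bipartite structure of the chain, and with it the chiral symmetry mentioned in the text, is preserved for every realisation.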
To quantify the impact of the disorder on the data, we compute the similarity between the output intensities. Specifically, we compute the output fields $\psi_m(\langle d\rangle, i)_{1,2}$, where the indices 1 and 2 correspond to diagonal and off-diagonal disorder, respectively, and i labels the specific disorder realization. We then introduce the intensity overlap $O_{1,2}(\langle d\rangle, i)$ between these fields and the disorder-free output, where the summation is taken over the waveguides of the main array and $\psi^0_m$ is the output distribution in the disorder-free case. This overlap measures the similarity between the two distributions and thus quantifies how much the output changes due to disorder. To plot the overlap measure, we calculate $O_{1,2}(\langle d\rangle, i)$ over 4000 disorder realizations for each value of $\langle d\rangle$. To standardize the plotted functions, we divide them by the value of $O_{1,2}(\langle d\rangle, i)$ at $\langle d\rangle = 0$. This normalization allows us to compare the variability of the overlap measure across different scenarios. The dotted areas in Fig. 8(A) show that for a given $\langle d\rangle$ the two forms of disorder have a similar effect on the overlap measure.

To demonstrate transfer learning for disordered arrays, we train the neural network using a non-disordered array and test it on the disordered lattice [see Fig. 8(B)], with the ranges of parameters as in Table II. The accuracy curves are similar for both types of disorder, showing a decrease in accuracy as the disorder amplitude increases. A broadening range of the overlap measure reflects a significant change in the output intensity, which ultimately leads to a sharp decline in the classification accuracy.
To estimate the confidence of the trained neural network, we study the output of the last layer [see Fig. 5(B)] in detail. The softmax function returns the probabilities of the four classes. Here we fix the class 00 (both ends are trivial), but the results are comparable for the other classes as well. If the model assigns a high probability to a particular class, it is more confident in that prediction than if it assigns a lower probability.
We create a set of test vectors for each disorder amplitude and select the vectors that have the highest probability of belonging to class 00. If such a vector indeed belongs to class 00, we label the probability as true; otherwise, it is labeled as false. We then average the probabilities of the false and true answers to plot Fig. 8(C,D). Interestingly, as the accuracy of the neural network decreases, its level of certainty in both accurate and inaccurate responses increases. In other words, the neural network will more confidently give the wrong answer as the disorder strength is increased, indicating that the fabrication disorder can act as an adversarial perturbation.
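The confidence analysis described above can be sketched as follows. The ordering of the classes in the network output, the function names, and the last-layer values are hypothetical and serve only to show how the true and false probabilities are separated and averaged.

```python
import math

CLASSES = ["00", "11", "10", "01"]  # assumed ordering of the network outputs

def softmax(logits):
    """Numerically stable softmax turning last-layer outputs into probabilities."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_confidence(logit_rows, true_labels, target="00"):
    """Average probability assigned to `target` when it is the predicted class,
    split into correct ('true') and incorrect ('false') predictions."""
    buckets = {"true": [], "false": []}
    for logits, label in zip(logit_rows, true_labels):
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        if CLASSES[best] != target:
            continue  # keep only vectors classified as the target class
        buckets["true" if label == target else "false"].append(probs[best])
    return {k: (sum(v) / len(v) if v else None) for k, v in buckets.items()}

# Made-up last-layer outputs for three test vectors (illustration only).
logit_rows = [[4.0, 0.0, 0.0, 0.0], [2.0, 1.0, 0.0, 0.0], [0.0, 3.0, 0.0, 0.0]]
true_labels = ["00", "11", "11"]
confidence = mean_confidence(logit_rows, true_labels)
print(confidence)
```

A rising value in the "false" bucket with increasing disorder corresponds to the behaviour reported in the text, where the network becomes more confident in its wrong answers.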

VI. CONCLUSION
We have studied the performance of a variety of machine learning techniques at distinguishing different topological phases of leaky photonic lattices using measurements of the bulk intensity profile after a fixed propagation distance. First, we found that uncertainty in the initial conditions (such as the excited waveguide) reduces the quality of unsupervised clustering, leading to either mixing between different classes or the prediction of too many classes. We then compared the performance of different supervised learning methods, finding that high accuracy can be achieved for sufficiently large propagation distances. The classification accuracy can be further improved by increasing the number of bulk waveguide intensities used. Finally, we studied the transfer learning ability of neural network-based classifiers. While the accuracy drops significantly if the network is trained on data obtained using a different propagation distance, the networks can accurately classify data from systems with sufficiently weak disorder, thus avoiding extensive training on each new system. Our approach for classifying lattices based on incomplete measurements can be further developed to solve the more general problem of reconstructing the lattice Hamiltonian with some a priori knowledge of its symmetries in various fields including photonics, condensed matter physics, and quantum computing.

FIG. 1. (A) Schematic of a dimerized lattice of single-mode dielectric waveguides with tunable radiative losses and a possible experiment: the waveguide indexed by i is excited at the input as indicated by a yellow circle, and the intensity distribution is measured in the central area of $N_c$ elements at the output of the sample (the gray rectangle) to generate a dataset for learning the topological properties. (B) Tight binding model visualization of the photonic lattice in (A). The red and orange circles depict the main array, a one-dimensional dimerised SSH-like array of coupled elements. Gray circles illustrate auxiliary arrays constituting leaky channels attached to the main array. The differing dashing between the elements denotes different coupling strengths. (C) Band structures of the main (dashed red lines) and auxiliary (gray solid line) arrays in the designed leaky photonic lattice inscribed in glass. (D) Different configurations of the two edges in a finite lattice. (E) The output intensity distribution (colored) overlaid with the proposed lattice cross-section. (F,G) Intensity distribution, numerically obtained in paraxial modeling at the output facet of the waveguide array for (F) the Hermitian (lossless) lattice and (G) the lattice with leaky channels.

FIG. 2. (A,B) Evolution characteristics of the field in the main array in the lattice with fixed parameters obtained in the TBM of the nontrivial SSH array with (gold curves) and without (green curves) leaky channels. The Zak phase at z > 4 cm converges to the quantised π value, provided $N_{env} = 14$ elements in leaky channels. (C,D) Field evolution in N elements of the main array assembled in a nontrivial (C) and trivial (D) configuration with fixed parameters of the lattice. The gray line on the right side marks the area of $N_c$ central waveguides, the intensity of which is fed to the input of the neural network.

FIG. 3. t-SNE maps of the system having 4 topological classes depending on its 2 edges: (A-C) Hermitian lattice, (D-F) lattice with leaky channels. The waveguide excited at the input is indexed by i. (A,B,D,E) correspond to the case of single-waveguide excitation: (A,D) i = 11 is odd, (B,E) i = 12 is even, (C,F) the excited waveguide is randomly chosen within a dimer. For each point in the two-dimensional parameter space there is a corresponding intensity distribution vector of dimension N = 22 (or N = 23), depending on the topological class. The four classes are color-coded: 00 (blue), 11 (red), 10 (green), 01 (black).

FIG. 4. Statistical characteristics of intensity distributions in waveguides. The datasets were prepared for the Hermitian (A) and leaky (B) cases assuming two possible positions i = 11, 12 of the initial excitation at L = 7.6 cm. The mean value is indicated by markers in the middle of horizontal lines, while the standard deviation is represented by the borders of the lines. The classes are color-coded: 00 (blue squares), 11 (red circles), 01 (black right-facing triangles), 10 (green left-facing triangles). The total number of waveguides N is 22 (even) for classes 00 and 11, and 23 (odd) for classes 01 and 10.

FIG. 5. (A) Accuracy of supervised learning methods as a function of the propagation distance L. (B) Scheme of the convolutional neural network, which takes the intensity distribution at z = L as the input and determines the topology of the lattice edges, $N_c = 16$.

FIG. 6. (A) Accuracy of classification by deep learning methods depending on parameters: the total number of waveguides N and the number of the central waveguides $N_c$ involved in the training. (B) Theoretical dependence of the Zak phase on the propagation distance and $N_c$ in the nontrivial lattice of N = 22 elements.

FIG. 7. (A) Schematics of the topological (upper row) dimerized array and the non-topological (lower row) dimer lattice with defect potentials $\tilde{\Delta}_{1,2}$ at the edges. (B) The accuracy of the neural network trained for the non-topological case for different values of the edge defect detuning $q_1$, introduced as $\tilde{\Delta}_{1,2} = \Delta_{1,2}(1 - q_1)$, and different propagation distances L = 7.6 cm (red dots), L = 8.6 cm (blue left-facing triangles), L = 10.6 cm (black right-facing triangles). For comparison, the colored horizontal lines depict the accuracy in the topological case for the corresponding L. (C) The band structure of the finite non-topological lattice depending on the defect detuning, at the fixed number of elements within the main array N = 22. The shading shows bands for all possible coupling coefficients, J, and detunings, $\Delta_1 = -\Delta_2$, that were utilized to generate the datasets. (D) Profiles of the modes bound to the ends of the non-topological lattice. Colors and shapes of the markers in (C) in the representative spectral positions correspond to the profiles in (D).

FIG. 8. (A) Overlap measure variation induced by the disorder: shaded areas are ranges of variance due to disorder over an ensemble of 4000 disorder realizations (green is for diagonal disorder, gray for off-diagonal disorder), asterisks and dots are mean values. All parameters of the lattice are fixed. (B) Transfer learning for the disordered lattice. We train the neural network in the absence of disorder, $\langle d\rangle = 0$, and test the prediction accuracy for different values of disorder. All parameters of the lattice are varied according to Table II. (C,D) Probability assigned to false (C) and true (D) answers of the neural network for different values of disorder (green bars are for diagonal disorder, gray bars are for off-diagonal disorder).

TABLE I. Parameters of the designed leaky photonic lattice: semiaxes of elliptical single-mode waveguides $a_{x,y}$;

TABLE II. Ranges of parameters used in data set preparation. Average values of the listed TBM parameters correspond to the physical quantities in Table I, as established in paraxial modeling.