Christopher Yeung, Ju-Ming Tsai, Brian King, Benjamin Pham, David Ho, Julia Liang, Mark W. Knight and Aaswath P. Raman

Multiplexed supercell metasurface design and optimization with tandem residual networks

De Gruyter | 2020

Abstract

Complex nanophotonic structures hold the potential to deliver exquisitely tailored optical responses for a range of applications. Metal–insulator–metal (MIM) metasurfaces arranged in supercells, for instance, can be tailored by geometry and material choice to exhibit a variety of absorption properties and resonant wavelengths. With this flexibility, however, comes a vast space of design possibilities that classical design paradigms struggle to effectively navigate. To overcome this challenge, here, we demonstrate a tandem residual network approach to efficiently generate multiplexed supercells through inverse design. By using a training dataset with several thousand full-wave electromagnetic simulations in a design space of over three trillion possible designs, the deep learning model can accurately generate a wide range of complex supercell designs given a spectral target. Beyond inverse design, the presented approach can also be used to explore the structure–property relationships of broadband absorption and emission in such supercell configurations. Thus, this study demonstrates the feasibility of high-dimensional supercell inverse design with deep neural networks, which is applicable to complex nanophotonic structures composed of multiple subunit elements that exhibit coupling.

1 Introduction

Nanophotonic materials, including metasurfaces and metamaterials, have greatly expanded our ability to tailor light–matter interaction and deliver new functionalities for information processing and sensing applications [1], [2], [3], [4]. As demand for advanced capabilities and high-performance nanophotonic devices grows, multimodal implementations with interconnected ensembles of optical subcomponents, including supercells, have shown great promise in delivering tailored responses with respect to many optical characteristics [5], [6], [7], [8]. For example, complex spatial arrangements within photonic crystal circuits have yielded high-efficiency spatial mode conversion [9]. Similarly, by employing metasurfaces that contain periodic arrays of meta-atoms with different geometric parameters, a range of useful behaviors including out-of-plane beam deflection and mirroring can be demonstrated [10]. Although the incorporation of numerous distinct subunit elements within a photonic structure is desirable, it is accompanied by an exponential increase in design costs as a result of the increased dimensionality of the associated design space [11].

A particular category of periodic metasurface structures that has shown promise in supercell configurations is the metal–insulator–metal (MIM) metasurface absorber. Periodic MIM absorbers yield strong resonances that are narrowband in nature, where the wavelength of the resonance peak can be shifted by changing the shape of the resonator [12], [13], [14], [15]. By adopting simple supercell configurations, which contain more than one resonator geometry, multiresonant and broadband absorption behavior has previously been realized [16], [17], [18]. The design and optimization of more complex supercells with hybridized behavior, however, remains an open challenge, but holds the potential of yielding a broader range of spectral responses than previously achieved.

Conventional design processes for periodic and complex supercell metasurfaces rely on electromagnetic (EM) simulations that are iteratively optimized by tuning key design parameters until the desired optical properties are obtained. Techniques that have been employed include evolutionary algorithms [19], topology optimization [20], [21], [22], and adjoint-based methods [23], [24]. In the context of supercells and complex/nonperiodic arrangements, methods such as Schur complement domain decomposition and overlapping-domain approximation have yielded compelling results [25], [26]. As the unit cell of a metasurface increases in size and complexity, however, computation times from iterative optimization can rapidly escalate from hours to potentially days or weeks. Additionally, optimizations must be repeated and reconfigured for every new target, thus requiring a substantial amount of computational resources and, oftentimes, prior intuition on the capability of a particular class of nanophotonic structures. These computational costs are further compounded by the fact that only the final optimized results are preserved; any prior data generated in an optimization cycle is not typically reused in the future [27]. As a result, iterative design methods also become increasingly inefficient over time [28].

In response to the need for more efficient design strategies, data-driven approaches based on machine learning, such as deep neural networks (DNNs), have found applications in nanophotonic design [29]. DNNs are now well established in many fields, including natural language processing, drug discovery, materials design, and medical diagnosis [30], [31], [32]. In the photonics context, DNNs have shown promise in designing a diverse range of high-performance structures by directly predicting key geometric parameters (e.g., resonator widths, lengths, and radii). By leveraging a one-time investment of EM simulation training data, DNNs can generate designs orders of magnitude faster than traditional optimization algorithms. An accurate DNN can also be paired with numerical optimization methods to save simulation time, where the DNN identifies solutions near the global minimum and the optimization refines the performance further [33]. With training datasets ranging from several hundred [34] to several thousand [27] instances, previously explored machine learning and DNN-based photonics design includes the forward and inverse modeling of multishell nanoparticles, multilayer thin films, and various classes of metasurfaces [35], [36], [37], [38], [39]. A forward-modeling DNN takes structural parameters as inputs and predicts optical properties such as the absorption spectra. In contrast, an inverse-modeling DNN accepts target optical properties as inputs and generates matching structural parameters. Further advancements in DNNs have led to the development of the tandem network, which is designed to overcome the nonuniqueness scattering problem [27], [37]. However, prior tandem networks have relied on traditional fully connected or dense networks, while tandem implementations with recent architectural advancements such as residual networks or ResNets (which address the well-known vanishing gradient problem [40]) remain unexplored.

While promising, prior studies of DNNs for nanophotonic design have primarily focused on individual scatterers or periodic structures with single-unit cell elements and relatively narrowband operation [41], [42], [43]. Recent studies involving the design of structures with multiple optical elements have assumed that they are separately constructed and then assembled into a multielement structure [44], [45]. This approach makes the limiting assumption that the coupling between adjacent elements is sufficiently weak, and further does not affect their cross-section. Moreover, in studies where coupling cannot be neglected, separately trained models were required in order to design metasurfaces with specific numbers of elements [45], which limits scalability. Other work with unit cells consisting of multiple neighboring elements does not solve the inverse problem, but instead develops a fast and accurate proxy or surrogate model for forward design [46]. Therefore, an ML-based strategy for complex supercells that: 1) directly solves the inverse design problem, 2) generates structures with a wide range of unique elements, and 3) considers strong coupling interactions or mode hybridization between individual elements is lacking today, but could allow for the demonstration of complex nanophotonic architectures with a broader range of spectral responses.

In this article, we investigate the inverse design of large multiplexed supercell metasurfaces with over 100 subunit elements that can achieve a diverse set of broadband spectral responses. Specifically, we focus on engineering arbitrary bandwidth absorbers operating in the mid- and long-wave infrared regime (4–12 µm) by designing supercell MIM metamaterial absorbers through a deep learning approach. To navigate the large design space that comes with the increased dimensionality of supercells and to address the vanishing gradient problem associated with deep network architectures, we employed a tandem residual network (shown conceptually in Figure 1A). We demonstrate that with a training dataset of several thousand simulations, in a high-dimensional design space with over three trillion possible design combinations, the network can successfully design narrowband, multiresonance, and broadband-absorption supercell metasurfaces with high degrees of accuracy. Furthermore, we show that the network itself can be harnessed to approximate the structure–property relationships of the explored class of metasurfaces.

Figure 1: Inverse design of supercell metasurfaces with a range of underlying symmetries using a tandem residual network approach. (A) A target absorption spectrum is defined, and the matching design parameters for a multiplexed array of metal–insulator–metal (MIM) resonators are generated. (B) Data preparation schematic for deep learning. Supercell design parameters (1) representing resonator lengths and positions (2) are converted into 3D models (3). Full-wave electromagnetic simulations are performed on the models (4). Design parameters along with corresponding “ground truth” (5) and predicted (6) absorption spectra are used to train the tandem residual network.

2 Results and discussion

2.1 Data preparation for deep learning

Nanophotonic supercell structures such as MIM metasurfaces are capable of producing unique optical responses that extend beyond the sum of their parts. Specifically, in addition to the superposition of individual responses, distinct responses may also arise from the interaction or hybridization between neighboring elements [47]. Several examples of such interactions are presented in Figure S1, where the absorption spectra and EM field profiles of various supercell designs are shown. In this figure, we show that specific arrangements of identical subunit resonators can yield absorption peaks with different amplitudes and wavelengths, or new peaks entirely, in comparison to the response of the individual elements. These examples reveal that the relative positions of individual elements are critical as the characteristics of the peaks can depend strongly on which resonators are adjacent to each other. Therefore, supercell-class metasurfaces can potentially access new domains of functionalities by leveraging multiscale optical phenomena. To this end, a comprehensive inverse design scheme for supercell structures must consider the geometries and physical arrangements of individually integrated structures as well as their collective EM interactions.

Figure 1B presents the detailed implementation of our supercell inverse design strategy. First, we defined a supercell layout of MIM resonators (labeled “1” in Figure 1B). The layout contains an assortment of 100 nm-thick gold cross-shaped resonators with a 100 nm gold backing and 200 nm Al₂O₃ spacer. This class of metasurfaces was derived from existing literature on selective thermal emitters and exhibits narrowband resonances in the mid-infrared (MIR) range [48]. A single supercell design is represented by an array of cross-shaped resonator lengths (ranging from 1.4–3 µm in 0.2 µm steps), each with fixed widths (500 nm). The resonator arrays (labeled “2” in Figure 1B) embody a quadrant of the supercell and resemble a hexagonal close-packed (HCP) lattice with a twin boundary, where the individual resonators are mirrored along the diagonal plane. The quadrant is then mirrored along the x- and y-axes to create a fourfold symmetric supercell. The HCP configuration is designed to maximize the area density (and therefore the resonance efficiency) of the supercell, while the fourfold symmetry ensures the structure is s- and p-polarization independent under normal incidence. We limited our supercell size to 25 unique resonators per quadrant (12.8 × 12.8 µm² before fourfold symmetry) to maximize the resonance modes within the 4–12 µm window while simultaneously attempting to minimize simulation time. Thus, a unique supercell design is represented by $D_A = [l_1, l_2, \ldots, l_{25}]$, with $l_{25}$ being the length of the 25th resonator (where $D_A, \ldots, D_n$ are vectors with distinct $l$-values). These vectors were then used as the supercell design parameters for deep learning.
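To make the design-vector representation concrete, the following is a minimal sketch (not the authors' code) of how one pseudo-random design vector could be drawn and normalized. The function names, the independent sampling of each entry, and the linear min-max normalization are assumptions made for illustration; the sketch does not attempt to encode the twin-boundary or fourfold symmetry described above.

```python
import numpy as np

# Allowed cross lengths from the text: 1.4-3.0 um in 0.2 um steps (9 values).
LENGTHS_UM = np.round(np.linspace(1.4, 3.0, 9), 1)
N_RESONATORS = 25  # resonators per supercell quadrant, as stated in the text


def sample_design(rng):
    """Draw one pseudo-random design vector D = [l_1, ..., l_25] in um.

    Assumption: every entry is sampled independently and uniformly.
    """
    return rng.choice(LENGTHS_UM, size=N_RESONATORS)


def normalize(design_um):
    """Map lengths linearly from [1.4, 3.0] um onto [0, 1] (assumed scheme)."""
    lo, hi = LENGTHS_UM[0], LENGTHS_UM[-1]
    return (design_um - lo) / (hi - lo)


rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily
D_A = sample_design(rng)             # physical lengths in um
D_A_norm = normalize(D_A)            # values a network would receive
```

Each simulated pair (D, A) would then combine such a vector with its 800-point absorption spectrum.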

We converted the supercell design parameters into three-dimensional MIM structure models (labeled “3” in Figure 1B) and performed full-wave EM simulations (Lumerical FDTD) on these models over the spectral range of 4–12 μm at normal incidence (labeled “4” in Figure 1B), obtaining an 800-point “ground truth” absorption spectrum for each structure (labeled “5” in Figure 1B). Using this approach, we simulated the absorption spectrum (A) for pseudo-randomly generated design parameters (D) to create training data pairs (D, A) for the neural network. As discussed in the next section, the deep learning model “learns” by comparing the ground truth spectra (A) to the network-predicted spectra (A′, labeled “6” in Figure 1B).

2.2 Network characterization and evaluation

The performance of a tandem network hinges on the accuracy of the forward-modeling network as well as the breadth and size of the training dataset. Thus, we sought to optimize the architecture of the forward-modeling network and to ensure that the size of our training dataset maximizes the network’s implementation efficiency and predictive capabilities. Unlike previous implementations of the tandem architecture, our approach utilizes one-dimensional convolutional neural networks (1-D CNNs) instead of dense networks. 1-D CNNs have been used in various scientific domains [49], [50], with recent works showing that they are capable of outperforming dense networks in terms of regression fidelity and generalization capabilities [51]. This is enabled by the convolutional layers of the CNN, which are optimized to extract highly discriminative features using a large set of 1-D filter kernels [49]. Furthermore, our particular CNN consists of residual building blocks, which leverage identity shortcuts or skipped connections to address the vanishing gradient problem and achieve better performance than “plain” networks of the same depth [52], [53], [54]. The corresponding ResNet was trained in the forward-modeling configuration to predict an absorption spectrum (A′), given a set of design parameters (D) as inputs. We evaluated and compared the performances of the dense network, CNN, and ResNet in Figures S2 and S3, where it can be observed that the ResNet achieved the lowest validation loss out of the three model types. Our optimized forward-modeling ResNet architecture consists of a 25-neuron input layer (matching the vector size of the supercell design parameters D, with values normalized from 0 to 1), two residual blocks, followed by an 800-neuron dense layer. Each residual block contains two 1-D convolutional layers with 32 filters, kernel size of 3, and zero-padding. In addition, the Adam optimizer, batch size of 10, and ReLU activation functions yielded the lowest validation loss.
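The layer sequence described above can be sketched as a plain NumPy forward pass. This is an illustrative sketch with random, untrained weights, not the authors' implementation. The 1 × 1 "stem" convolution that lifts the single input channel to 32 channels, the placement of the ReLU after the shortcut addition, the linear output layer, and the weight initialization are all assumptions; the text specifies only the input size, the two residual blocks (two 1-D convolutions each, 32 filters, kernel size 3, zero-padding), and the 800-neuron dense output.

```python
import numpy as np


def conv1d_same(x, w, b):
    """1-D convolution with zero-padding so the output length equals the input.

    x: (length, c_in), w: (kernel, c_in, c_out), b: (c_out,)
    """
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.empty((x.shape[0], w.shape[2]))
    for i in range(x.shape[0]):
        # Contract the (k, c_in) window with the (k, c_in, c_out) kernel.
        out[i] = np.tensordot(xp[i:i + k], w, axes=([0, 1], [0, 1])) + b
    return out


def relu(x):
    return np.maximum(x, 0.0)


def residual_block(x, params):
    """Two convolutions plus an identity shortcut (skip connection)."""
    (w1, b1), (w2, b2) = params
    y = relu(conv1d_same(x, w1, b1))
    y = conv1d_same(y, w2, b2)
    return relu(y + x)


def init_params(rng, n_in=25, n_out=800, filters=32, kernel=3):
    def conv(c_in, c_out, k):
        return rng.normal(0.0, 0.1, (k, c_in, c_out)), np.zeros(c_out)

    return {
        "stem": conv(1, filters, 1),  # assumed 1x1 lift to 32 channels
        "blocks": [[conv(filters, filters, kernel) for _ in range(2)]
                   for _ in range(2)],
        "dense": (rng.normal(0.0, 0.01, (n_in * filters, n_out)),
                  np.zeros(n_out)),
    }


def forward(design, params):
    """Map a normalized 25-element design vector to an 800-point output."""
    x = np.asarray(design, dtype=float).reshape(-1, 1)
    x = conv1d_same(x, *params["stem"])
    for block in params["blocks"]:
        x = residual_block(x, block)
    w, b = params["dense"]
    return x.reshape(-1) @ w + b


params = init_params(np.random.default_rng(0))
A_pred = forward(np.linspace(0.0, 1.0, 25), params)  # shape (800,)
```

The identity shortcut is what distinguishes the block from a "plain" stack: if both convolutions output zero, the block still passes its (rectified) input through unchanged.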

Using the same hyperparameters as the forward network, except with an inverted sequence of input and output layers (with 800 and 25 filters, respectively), we designed an inverse-modeling network for the prediction of design parameters (D′) given an input A. However, plain inverse modeling networks are known to encounter the nonuniqueness problem [27], [37], where the multiple mappings between an EM response and its available structural parameters may confound the network’s learning process. To illustrate this problem in the context of our training data, Figure S4 shows several examples where two substantially different supercell design layouts map to nearly identical dual-band and triple-band responses. Due to the considerable degrees of freedom in a supercell design, the nonuniqueness problem in a supercell architecture is exacerbated relative to single-element and periodic structures, and is crucial to address. Thus, to account for this issue, we implemented the tandem architecture by coupling the inverse-modeling network with a pretrained forward-modeling network.

First, we trained a standard tandem dense network by minimizing the loss function between the input absorption spectrum (A) and the spectrum predicted by the forward-modeling network (A′), where A′ is generated by the same D′ predicted by the inverse-modeling network from above. As in the study by Liu et al. [37], we define the tandem network’s loss as the mean squared error between A and A′: $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(A_i - A'_i)^2$. This loss calculation is distinct from the plain inverse modeling network, which is programmed to minimize the loss between the designs from the training dataset (D) and the predicted designs (D′). Since D′ may offer a completely different solution than D that correctly maps to the target response (due to the issue of nonuniqueness), a plain inverse-modeling network can struggle to minimize loss or converge, whereas in the tandem network, the loss function converges so long as the target and predicted spectra (A and A′) are similar [27], [37]. In other words, since the task at hand requires solving a one-to-many problem, the tandem network finds the optimum response for an input target rather than mixing the corresponding outputs (which can lead to a suboptimal solution).
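The difference between the two loss definitions can be illustrated with a toy example. The forward model below is a hypothetical stand-in (not the trained network or an EM solver): it places one Gaussian peak per design entry, so permuting the entries leaves the spectrum unchanged. This deliberately mimics a nonunique mapping in which two different designs share one response.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two equally sized arrays."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))


def toy_forward(design, n_points=800):
    """Hypothetical stand-in for the pretrained forward network.

    Sums one Gaussian peak per design entry, so any permutation of the
    entries gives the same spectrum (a deliberately nonunique mapping).
    """
    x = np.linspace(0.0, 1.0, n_points)
    peaks = [np.exp(-((x - d) ** 2) / (2 * 0.05 ** 2)) for d in design]
    return np.clip(np.sum(peaks, axis=0), 0.0, 1.0)


D = np.array([0.2, 0.8])       # design stored in the training set
D_pred = np.array([0.8, 0.2])  # different design, same toy spectrum

A = toy_forward(D)
A_pred = toy_forward(D_pred)

inverse_loss = mse(D, D_pred)  # design-space loss of a plain inverse network
tandem_loss = mse(A, A_pred)   # spectrum-space loss of the tandem network

print(f"inverse (design) loss:  {inverse_loss:.3f}")
print(f"tandem (spectrum) loss: {tandem_loss:.3e}")
```

In this toy case the design-space loss penalizes a prediction that reproduces the target spectrum exactly, while the spectrum-space loss does not, which is the behavior the tandem formulation relies on.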

To validate that the tandem network reaches said optimum response, in Figure S5, we show the tandem neural network architecture’s ability to resolve the nonuniqueness issue by testing the accuracy of designs for which an explicit nonunique relationship exists, which were found in Figure S4. As seen in Figure S5A, dual-band and triple-band spectra were passed into the inverse modeling network, and a poor match between the input spectra and the simulated design parameters can be observed. However, when the same spectra were passed into the tandem network (Figure S5B), the accuracy between the input spectra and the simulated parameters is substantially improved. Thus, we find that the tandem dense network effectively addresses the nonuniqueness issue, and yields superior accuracy over a plain inverse modeling network for the multiplexed supercell metasurfaces evaluated here. Furthermore, in Figure S6, we verify that the quantity of our training data was capable of maximizing the network’s ability to learn supercell designs.

Using the customized loss function approach described above, we implemented a tandem residual network and compared its performance to the tandem dense network. Figure 2 presents several example test results from both networks, with new spectral targets from the validation dataset. These test results are obtained by simulating the predicted D′, then comparing the target and simulated spectra (shown as blue and orange lines, respectively). It can be observed that across various spectral response patterns, the tandem residual network (Figure 2B) produces supercell designs with greater accuracy than the tandem dense network (Figure 2A). A larger statistical evaluation using the entire validation dataset (360 input spectra) reveals that the average validation MSE is approximately 2.5 × 10⁻³ for the tandem dense network (Figure 2C) and 8.2 × 10⁻⁴ for the tandem residual network (Figure 2D). Moreover, we observe that the tandem residual network exhibits a lower overall distribution of errors. As a result, we demonstrate that a tandem architecture composed of 1-D convolutional layers and residual building blocks is well-suited for supercell design and can outperform a tandem network (of the same network depth) that is based on fully connected layers.

Figure 2: Performance evaluation based on mean-squared error (MSE) of target spectra and simulated designs for (A) the tandem dense and (B) tandem residual networks. The tandem residual network achieves a lower MSE for three distinct targets. Inset images show the design parameters for the corresponding spectra. Statistical analyses across the entire validation dataset (over 300 spectra) for the tandem dense (C) and residual (D) networks indicate that the latter achieves a lower average MSE.

2.3 Inverse design of multiresonance and broadband metasurfaces

We utilized the tandem residual network to generate new supercell metasurface designs with a broad range of spectral properties. Figure 3 presents a series of test cases comparing the target network inputs to the simulated results of the corresponding output designs. The inset images show the spatial geometries of each supercell designed by the network. For example, as shown in Figure 3A, after specifying a narrowband target with a full width at half maximum (FWHM) of 0.5 µm, the network generated a periodic layout that matches the target spectrum with over 90% accuracy and is consistent with results from prior literature [48]. Similarly, in Figure 3B and C, dual-narrowband and triple-narrowband designs were created (with sharp resonances at two and three discrete wavelengths) that closely match their respective targets. In these multiresonance structures, the supercells include additional cross dimensions that are associated with distinct resonances.

Figure 3: Inverse design of new supercell metasurfaces with the tandem residual network. The structures exhibit (A) narrowband, (B) dual-narrowband, (C) triple-narrowband, (D) broadband, (E) dual-broadband, and (F) graybody behaviors. Blue lines indicate the target spectra used as inputs to the network, and orange lines represent the simulated results of the output design parameters. Inset images show the physical layouts of the network-generated supercells.

The ability to construct an array of resonator geometries suggests that different resonant modes can be superimposed to achieve responses of arbitrary bandwidth [55]. Accordingly, we tasked the neural network with designing metasurfaces with various broadband characteristics (FWHM > 1 µm). In Figure 3D, a broadband structure with an FWHM of 1.5 µm is shown, and in Figure 3E, we increased the complexity of the target to design a structure with dual-broadband absorption peaks. Lastly, in Figure 3F, we demonstrate the design of a broadband graybody structure that encompasses the entire MIR range of resonance wavelengths captured by the training dataset (5–9 µm).

In the design of the aforementioned multiresonance and broadband structures, the network not only defined the resonator dimensions required to achieve resonances at the target wavelengths, but also determined their appropriate placements within the lattice in order to reach the target absorption amplitudes. For example, as shown in Figure 4A, the network-designed triple-narrowband structure possesses three primary cross lengths that are responsible for resonances at 5.2, 7.2, and 8.6 µm. The high absorption amplitudes are attributed to the periodic and short-range ordered arrangements (repeating patterns spanning 1–2 subunit cell distances) of the resonators, which result in the strong dipole resonances seen in the electric field enhancement plots. When short-range order is converted to long-range order (patterns spanning beyond 2 subunit cell distances), additional response types are enabled. In particular, we observe in Figure 4B that the network can alter the relative positions of the same subunit resonators to produce a new response with lower peak amplitudes and a peak shift. As seen in the EM field profiles, this response is achieved by the new interactions that emerge from the modified arrangement of resonators, as well as modifications to the cross-section of a given resonator by its neighbors. By introducing a larger assortment of cross geometries with more complex interactions, the net absorption spectra can also produce a graybody response (Figure 4C). Thus, by systematically predicting the subunit resonator dimensions as well as their spatial positions, the trained network can modulate absorption peak phase and amplitude by designing multiplexed metasurfaces with a range of underlying symmetries.

Figure 4: Relationships between supercell absorption properties and their subunit resonator spatial distributions. Simulated absorption spectra (of the tandem network-designed structures) and corresponding electric field profiles are shown for triple-narrowband structures with (A) high and (B) low absorption peaks and a (C) graybody structure. These plots reveal the dependence of absorption response on resonator geometry and position relative to other elements.

2.4 Estimating the structure–property relationships of the explored metasurface class

It is a challenging design task to combine multiple distinct resonant modes in a single metastructure while leveraging, or alternatively minimizing, hybridization between modes and maintaining high absorption per unit area [55], [56]. Thus, multiplexed resonator structures impose an inherent tradeoff between broadband response and maximum absorption. Here, we seek to investigate the structure–property relationships of the explored metasurface class by leveraging the near-instantaneous calculation speed of the neural network. Previous studies have used pretrained ML models for design space exploration and pattern discovery [42], [57], [58], [59]. For instance, it has been shown that for a constrained domain, the fast inference speed of ML models can produce reasonably accurate estimates of an optical system’s physical responses so that unnecessary exploration of the solution space can be avoided [60]. Similarly, to enable the exploration of our supercell design space, we use the pretrained forward-modeling network as a validation mechanism for the design parameters predicted by the tandem network. As one example, we specified design targets using Lorentzian functions of increasing bandwidth (FWHM of 0.2–4 µm centered at 7 µm), illustrated in Figure 5A. The tandem network outputs were then fed into the forward-modeling network, and the resulting design predictions were compared to the initial targets. Full-wave simulation results of the tandem network-designed structures are also presented in Figure 5A, indicating that the network-predicted results match well with the ground truth. In this approach, the forward network effectively serves as a high-speed surrogate EM solver, replacing the FDTD software that was used to generate the training data.
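As an illustration of how such Lorentzian targets could be constructed, the sketch below builds unit-peak Lorentzian spectra on an 800-point grid spanning 4–12 µm. The uniform sampling in wavelength, the unit peak height, and the helper names are assumptions; the FWHM values are those listed in the Figure 5 caption.

```python
import numpy as np

# Assumption: the 800 spectral points are uniformly spaced in wavelength.
WAVELENGTHS_UM = np.linspace(4.0, 12.0, 800)


def lorentzian_target(fwhm_um, center_um=7.0, wavelengths=WAVELENGTHS_UM):
    """Unit-peak Lorentzian absorption target with the given FWHM (in um)."""
    half = fwhm_um / 2.0
    return half ** 2 / ((wavelengths - center_um) ** 2 + half ** 2)


def measured_fwhm(spectrum, wavelengths=WAVELENGTHS_UM):
    """Width of the region at or above half of the peak value (grid-limited)."""
    above = wavelengths[spectrum >= 0.5 * spectrum.max()]
    return above[-1] - above[0]


# FWHM values taken from the Figure 5 caption.
targets = {fwhm: lorentzian_target(fwhm) for fwhm in (0.6, 1.0, 2.0, 3.0)}

for fwhm, spectrum in targets.items():
    print(f"requested FWHM {fwhm:.1f} um, "
          f"measured {measured_fwhm(spectrum):.2f} um")
```

Each target array could then be passed to the tandem network, and the returned design evaluated with the forward network, as described above.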

Figure 5: Probing metasurface design relations using the tandem network. (A) Tandem network input targets (blue lines) for various Lorentzian functions (FWHM of 0.6, 1, 2, and 3 µm centered at 7 µm) and the corresponding forward network-predicted results (orange lines). The network is unable to identify designs that exceed the model’s estimate of bandwidth/maximum absorption of the explored class of supercell metasurfaces. Full-wave simulation results are shown (green lines) for comparison, indicating that the network-predicted results match well with the ground truth. (B) Network-determined design trends and metrics, including the mean-squared error (MSE) between target and design responses, emissivity of the metasurface, and max absorption as functions of FWHM (THz).

The design predictions reveal that when an unobtainable target is specified, the network designs a structure with the closest possible solution within the supercell design space on which it was trained. As a result, we observe that as the target bandwidth increases, the discrepancy between the target response and the closest design (measured by MSE) increases as well (Figure 5B). This, in turn, allows us to numerically infer a relationship between the broadband response and the maximum obtainable absorption for this class of resonant metasurfaces, as estimated by the machine learning algorithm. By fitting these observations, we can further derive an estimate for maximum absorption at various bandwidths (R² = 0.98):

(1) $A_{\max} = 0.0004 f^2 - 0.0302 f + 1.0154$,
where $A_{\max}$ is the model’s estimate of maximum absorption and $f$ is the FWHM in THz. While we show that the structure–property relationships of a design space can be easily represented using a forward-modeling network, we note that the relation in Eqn. (1) (or any relation captured in a similar manner) is not universally applicable, but subject to the same parametric restrictions that were imposed on the training data (i.e., cross widths fixed at 500 nm and cross lengths within the range of 1.4–3 µm). Metasurfaces with dimensions beyond the range restricted by the training data may exhibit a relationship that is different from Eqn. (1). However, the training dataset may simply be updated to incorporate a wider range of geometries to account for such parametric restrictions.
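A short worked evaluation of the fitted relation is sketched below. Two assumptions are made and should be checked against the published equation: the linear term is taken as negative, consistent with the reported decrease of maximum absorption with bandwidth, and a wavelength FWHM is converted to a frequency FWHM by differencing the frequencies at the two half-maximum wavelengths. The source does not state how the conversion was performed.

```python
C_UM_THZ = 299.792458  # speed of light expressed in um * THz


def fwhm_um_to_thz(fwhm_um, center_um=7.0):
    """Frequency width between the two half-maximum wavelengths.

    Assumption: the band edges lie at center -/+ FWHM/2 in wavelength.
    """
    lo = center_um - fwhm_um / 2.0
    hi = center_um + fwhm_um / 2.0
    return C_UM_THZ / lo - C_UM_THZ / hi


def a_max_estimate(f_thz):
    """Quadratic fit of Eqn. (1); linear-term sign assumed negative."""
    return 0.0004 * f_thz ** 2 - 0.0302 * f_thz + 1.0154


f = fwhm_um_to_thz(1.0)  # 1 um FWHM centered at 7 um
print(f"FWHM = {f:.2f} THz, estimated A_max = {a_max_estimate(f):.3f}")
```

As with Eqn. (1) itself, the numbers produced this way apply only within the parameter ranges covered by the training data.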

As an additional example of discovering application-specific design insights through the neural network, we can calculate the average normal-incidence emissivity of the optimized supercell metasurfaces within defined target bandwidths:

$$\bar{\varepsilon} = \frac{\int_{\nu_1}^{\nu_2} I_{BB}(T,\nu)\,\varepsilon(\nu)\,\mathrm{d}\nu}{\int_{\nu_1}^{\nu_2} I_{BB}(T,\nu)\,\mathrm{d}\nu}. \tag{2}$$
Here, $I_{BB}(T,\nu) = \frac{2h\nu^{3}}{c^{2}}\,\frac{1}{e^{h\nu/(k_B T)} - 1}$ is the spectral radiance of a blackbody at temperature $T$, where $h$ is Planck’s constant, $k_B$ is the Boltzmann constant, $c$ is the speed of light, and $\nu$ is frequency. The lower and upper bounds of the integral ($\nu_1$ and $\nu_2$) are derived from the evaluated spectral range (4–12 µm). $\varepsilon(\nu)$ is the metasurface’s spectral emissivity, which is equal to the absorption $A(\nu)$ by Kirchhoff’s law. In this case, by repeatedly querying the neural network to solve for emissivity (at various temperatures) as a function of the target bandwidth, we can find the relationship between the two parameters (Figure 5B) in a remarkably short time frame (less than 1 min). Overall, we observe that as the sought bandwidth (FWHM) increases, the MSE between the target (with near-unity absorption across the entire bandwidth) and the design response increases, while the maximum absorption of the achievable design decreases. Furthermore, the integrated normal-incidence emissivity increases because the additional bandwidth compensates for the decrease in the peak absorption/emissivity value. However, as can be seen in Figure 5B, the precise relationship is complex and depends both on the specified bandwidth and on the temperature of the metasurface, because the blackbody spectral radiance changes with temperature. Thus, by training a neural network that is tasked with the inverse design of complex supercell metasurfaces, we demonstrate that the same framework can be strategically leveraged to rapidly identify design trends and dependencies associated with application-specific properties (within the parameter ranges represented by the trained class of metasurfaces).
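The blackbody-weighted average in Eq. (2) can be evaluated numerically as sketched below. This is an illustrative implementation under stated assumptions, not the authors' code: the 2001-point frequency grid, the trapezoidal integration, the 300 K example temperature, and the Lorentzian stand-in for a network-predicted spectrum are all hypothetical choices; the physical constants are the exact SI values.

```python
import numpy as np

H = 6.62607015e-34  # Planck constant, J*s
KB = 1.380649e-23   # Boltzmann constant, J/K
C = 299792458.0     # speed of light, m/s


def planck_radiance(nu_hz, temperature_k):
    """Blackbody spectral radiance I_BB(T, nu) per unit frequency."""
    nu_hz = np.asarray(nu_hz, dtype=float)
    return (2.0 * H * nu_hz**3 / C**2) / np.expm1(H * nu_hz / (KB * temperature_k))


def trapezoid(y, x):
    """Trapezoidal rule written out explicitly to avoid NumPy version differences."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def average_emissivity(nu_hz, emissivity, temperature_k):
    """Eq. (2): emissivity averaged over frequency with the Planck radiance as weight."""
    weight = planck_radiance(nu_hz, temperature_k)
    return trapezoid(weight * emissivity, nu_hz) / trapezoid(weight, nu_hz)


# Integration bounds nu_1 and nu_2 from the evaluated 4-12 um range (grid size assumed).
nu = np.linspace(C / 12e-6, C / 4e-6, 2001)

# Stand-in emissivity: a Lorentzian in wavelength centered at 7 um with 2 um FWHM.
wavelength_um = C / nu * 1e6
emissivity = 1.0 / (1.0 + ((wavelength_um - 7.0) / 1.0) ** 2)

eps_bar_300k = average_emissivity(nu, emissivity, temperature_k=300.0)
```

In the workflow described above, `emissivity` would instead be the forward-network prediction for each optimized design, and the call would be repeated over the target bandwidths and temperatures of interest.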

3 Conclusions

In this article, we demonstrated a machine learning approach to the inverse design of multiplexed supercell metasurfaces with over 100 subunit elements. The added degrees of freedom offered by a supercell architecture, relative to periodic single-element structures, yield new tailored capabilities including multiresonant and broadband responses. By forming a cascaded architecture with inverse-modeling and forward-modeling networks, we show that a tandem network effectively overcomes the nonuniqueness problem present in supercell architectures and can successfully learn a high-dimensional design space of over three trillion possible designs using only 3600 data instances. Moreover, we present a network architecture based on 1-D convolutional layers and residual building blocks that is capable of generating designs with greater accuracy than a conventional tandem network based on fully connected layers. Through the superposition and coupling of multiple resonant modes in a compact region, the tandem residual network can efficiently design supercell structures with a range of symmetries that yield narrowband, broadband, and multiresonant responses. The network predicts not only the geometric parameters for an array of resonators (e.g., resonator widths, lengths, and radii), but also their optimum spatial arrangement for satisfying a specified target. Therefore, the presented approach enables additional degrees of complexity in metasurface design by directly generating structures with a wide range of unique elements while accounting for coupling between these elements. Though we sought to maximize implementation efficiency by minimizing the required training data, we expect that the performance of our tandem network can be improved with more training data and a larger network architecture.
Furthermore, we demonstrate that the network itself can be utilized to approximate the structure–property relationships of the investigated class of metasurfaces (within the parameter ranges represented by the training data). By using the forward-modeling network as a fast surrogate for a full-wave EM simulator, high-speed parameter sweeps can be performed to capture property-specific design trends such as maximum absorption and emissivity as a function of bandwidth. Importantly, our results show that DNN-based approaches can efficiently design and characterize large-scale supercell metasurfaces with numerous discrete resonators. We believe our results can expedite the development of supercell-class nanophotonic structures and materials, which may in turn yield new tailored capabilities not achievable through conventional periodic nanostructures.

Acknowledgments

This work was supported by the Sloan Research Fellowship from the Alfred P. Sloan Foundation.

References

[1] J. Olthaus, P. Schrinner, and D. Reiter, “Optimal photonic crystal cavities for coupling nanoemitters to photonic integrated circuits,” Adv. Quantum Technol., vol. 3, p. 1900084, 2020, https://doi.org/10.1002/qute.201900084.

[2] H. Yoshimi, T. Yamaguchi, Y. Ota, Y. Arakawa, and S. Iwamoto, “Slow light waveguides in topological valley photonic crystals,” Opt. Lett., vol. 45, pp. 2648–2651, 2020, https://doi.org/10.1364/ol.391764.

[3] F. Bin Tarik, A. Famili, Y. Lao, and J. D. Ryckman, “Robust optical physical unclonable function using disordered photonic integrated circuits,” Nanophotonics, p. 20200049, 2020.

[4] V. Mittapalli and H. Khan, “Excitation schemes of plasmonic angular ring resonator-based band-pass filters using a MIM waveguide,” Photonics, vol. 6, no. 2, p. 41, 2019, https://doi.org/10.3390/photonics6020041.

[5] F. Ding, Z. Wang, S. He, V. M. Shalaev, and A. V. Kildishev, “Broadband high-efficiency half-wave plate: a supercell-based plasmonic metasurface approach,” ACS Nano, vol. 9, no. 4, pp. 4111–4119, 2015, https://doi.org/10.1021/acsnano.5b00218.

[6] R. A. Aoni, M. Rahmani, L. Xu, et al., “High-efficiency visible light manipulation using dielectric metasurfaces,” Sci. Rep., vol. 9, no. 1, pp. 1–9, 2019, https://doi.org/10.1038/s41598-019-42444-y.

[7] P. C. Wu, W. Y. Tsai, W. T. Chen, et al., “Versatile polarization generation with an aluminum plasmonic metasurface,” Nano Lett., vol. 17, no. 1, pp. 445–452, 2017, https://doi.org/10.1021/acs.nanolett.6b04446.

[8] Q. Ma, L. Chen, H. B. Jing, et al., “Controllable and programmable nonreciprocity based on detachable digital coding metasurface,” Adv. Opt. Mater., vol. 7, no. 24, p. 1901285, 2019, https://doi.org/10.1002/adom.201901285.

[9] V. Liu, D. A. Miller, and S. Fan, “Ultra-compact photonic crystal waveguide spatial mode converter and its connection to the optical diode effect,” Opt. Express, vol. 20, no. 27, pp. 28388–28397, 2012, https://doi.org/10.1364/oe.20.028388.

[10] X. Guo, Y. Ding, X. Chen, Y. Duan, and X. Ni, “Molding free-space light with guided-wave-driven metasurfaces,” 2020, arXiv preprint, arXiv:2001.03001, https://doi.org/10.1364/cleo_qels.2020.fth4q.3.

[11] R. S. Hegde, “Deep learning: a new tool for photonic nanostructure design,” Nanoscale Adv., vol. 2, pp. 1007–1023, 2020, https://doi.org/10.1039/c9na00656g.

[12] S. Ogawa and M. Kimata, “Metal-insulator-metal-based plasmonic metamaterial absorbers at visible and infrared wavelengths: a review,” Materials, vol. 11, no. 3, p. 458, 2018, https://doi.org/10.3390/ma11030458.

[13] A. Y. Vorobyev, A. N. Topkov, O. V. Gurin, V. A. Svich, and C. Guo, “Enhanced absorption of metals over ultrabroad electromagnetic spectrum,” Appl. Phys. Lett., vol. 95, no. 12, p. 121106, 2009, https://doi.org/10.1063/1.3227668.

[14] Y. Q. Ye, Y. Jin, and S. He, “Omnidirectional, polarization-insensitive and broadband thin absorber in the terahertz regime,” JOSA B, vol. 27, no. 3, pp. 498–504, 2010, https://doi.org/10.1364/josab.27.000498.

[15] H. H. Chen, Y. C. Su, W. L. Huang, et al., “A plasmonic infrared photodetector with narrow bandwidth absorption,” Appl. Phys. Lett., vol. 105, no. 2, p. 023109, 2014, https://doi.org/10.1063/1.4890514.

[16] Y. Ma, Q. Chen, J. Grant, S. C. Saha, A. Khalid, and D. R. Cumming, “A terahertz polarization insensitive dual band metamaterial absorber,” Opt. Lett., vol. 36, no. 6, pp. 945–947, 2011, https://doi.org/10.1364/ol.36.000945.

[17] X. Shen, T. J. Cui, J. Zhao, H. F. Ma, W. X. Jiang, and H. Li, “Polarization-independent wide-angle triple-band metamaterial absorber,” Opt. Express, vol. 19, no. 10, pp. 9401–9407, 2011, https://doi.org/10.1364/oe.19.009401.

[18] H. Luo, Y. Z. Cheng, and R. Z. Gong, “Numerical study of metamaterial absorber and extending absorbance bandwidth based on multi-square patches,” Eur. Phys. J. B, vol. 81, no. 4, pp. 387–392, 2011, https://doi.org/10.1140/epjb/e2011-20115-1.

[19] A. Gondarenko and M. Lipson, “Low modal volume dipole-like dielectric slab resonator,” Opt. Express, vol. 16, pp. 17689–17694, 2008, https://doi.org/10.1364/oe.16.017689.

[20] C. Y. Kao, S. Osher, and E. Yablonovitch, “Maximizing band gaps in two-dimensional photonic crystals by using level set methods,” Appl. Phys. B, vol. 81, pp. 235–244, 2005, https://doi.org/10.1007/s00340-005-1877-3.

[21] A. Y. Piggott, J. Lu, K. G. Lagoudakis, et al., “Inverse design and demonstration of a compact and broadband on-chip wavelength demultiplexer,” Nat. Photonics, vol. 9, no. 6, pp. 374–377, 2015, https://doi.org/10.1038/nphoton.2015.69.

[22] B. Shen, P. Wang, R. Polson, and R. Menon, “An integrated nanophotonics polarization beamsplitter with 2.4x2.4 μm2 footprint,” Nat. Photonics, vol. 9, pp. 378–382, 2015, https://doi.org/10.1038/nphoton.2015.80.

[23] A. Oskooi, A. Mutapcic, S. Noda, et al., “Robust optimization of adiabatic tapers for coupling to slow-light photonic-crystal waveguides,” Opt. Express, vol. 20, pp. 21558–21575, 2012, https://doi.org/10.1364/oe.20.021558.

[24] P. Seliger, M. Mahvash, C. Wang, and A. Levi, “Optimization of aperiodic dielectric structures,” J. Appl. Phys., vol. 100, p. 034310, 2006, https://doi.org/10.1063/1.2221497.

[25] S. Verweij, V. Liu, and S. Fan, “Accelerating simulation of ensembles of locally differing optical structures via a Schur complement domain decomposition,” Opt. Lett., vol. 39, no. 22, pp. 6458–6461, 2014, https://doi.org/10.1364/ol.39.006458.

[26] Z. Lin and S. G. Johnson, “Overlapping domains for topology optimization of large-area metasurfaces,” Opt. Express, vol. 27, no. 22, pp. 32445–32453, 2019, https://doi.org/10.1364/oe.27.032445.

[27] L. Gao, X. Li, D. Liu, L. Wang, and Z. Yu, “A bidirectional deep neural network for accurate silicon color design,” Adv. Mater., vol. 31, no. 51, p. 1905467, 2019, https://doi.org/10.1002/adma.201905467.

[28] Y. Elesin, B. S. Lazarov, J. S. Jensen, and O. Sigmund, “Time domain topology optimization of 3D nanophotonic devices,” Photonic Nanostruct., vol. 12, no. 1, pp. 23–33, 2014, https://doi.org/10.1016/j.photonics.2013.07.008.

[29] C. Yeung, J. M. Tsai, B. King, et al., “Elucidating the behavior of nanophotonic structures through explainable machine learning algorithms,” ACS Photonics, 2020.

[30] O. I. Abiodun, A. Jantan, A. E. Omolara, K. V. Dada, N. A. Mohamed, and H. Arshad, “State-of-the-art in artificial neural network applications: a survey,” Heliyon, vol. 4, no. 11, p. e00938, 2018, https://doi.org/10.1016/j.heliyon.2018.e00938.

[31] W. Muhammad, G. R. Hart, B. Nartowt, et al., “Pancreatic cancer prediction through an artificial neural network,” Front. Artificial Intelligence, vol. 2, p. 2, 2019, https://doi.org/10.3389/frai.2019.00002.

[32] B. Conduit, N. Jones, H. Stone, and G. Conduit, “Design of a nickel-base superalloy using a neural network,” Mater. Des., vol. 131, pp. 358–365, 2017, https://doi.org/10.1016/j.matdes.2017.06.007.

[33] R. Hegde, “Photonics inverse design: pairing deep neural networks with evolutionary algorithms,” IEEE J. Sel. Top. Quant. Electron., vol. 26, no. 1, p. 2933796, 2019.

[34] J. Jiang, D. Sell, S. Hoyer, J. Hickey, J. Yang, and J. A. Fan, “Free-form diffractive metagrating design based on generative adversarial networks,” ACS Nano, vol. 13, no. 8, pp. 8872–8878, 2019, https://doi.org/10.1021/acsnano.9b02371.

[35] S. So and J. Rho, “Designing nanophotonic structures using conditional deep convolutional generative adversarial networks,” Nanophotonics, vol. 8, pp. 1255–1261, 2019, https://doi.org/10.1515/nanoph-2019-0117.

[36] Z. Liu, D. Zhu, S. P. Rodrigues, K. T. Lee, and W. Cai, “Generative model for the inverse design of metasurfaces,” Nano Lett., vol. 18, pp. 6570–6576, 2018, https://doi.org/10.1021/acs.nanolett.8b03171.

[37] D. Liu, Y. Tan, E. Khoram, and Z. Yu, “Training deep neural networks for the inverse design of nanophotonic structures,” ACS Photonics, vol. 5, no. 4, pp. 1365–1369, 2018, https://doi.org/10.1021/acsphotonics.7b01377.

[38] J. Peurifoy, Y. Shen, L. Jing, et al., “Nanophotonic particle simulation and inverse design using artificial neural networks,” Sci. Adv., vol. 4, no. 6, p. eaar4206, 2018, https://doi.org/10.1126/sciadv.aar4206.

[39] S. An, C. Fowler, B. Zheng, et al., “A deep learning approach for objective-driven all-dielectric metasurface design,” ACS Photonics, vol. 6, no. 12, pp. 3196–3207, 2019, https://doi.org/10.1021/acsphotonics.9b00966.

[40] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.

[41] W. Ma, F. Cheng, and Y. Liu, “Deep-learning-enabled on-demand design of chiral metamaterials,” ACS Nano, vol. 12, no. 6, pp. 6326–6334, 2018, https://doi.org/10.1021/acsnano.8b03569.

[42] S. Inampudi and H. Mosallaei, “Neural network based design of metagratings,” Appl. Phys. Lett., vol. 112, no. 24, p. 241102, 2018, https://doi.org/10.1063/1.5033327.

[43] E. S. Harper, E. J. Coyle, J. P. Vernon, and M. S. Mills, “Inverse design of broadband highly reflective metasurfaces using neural networks,” Phys. Rev. B, vol. 101, no. 19, p. 195104, 2020, https://doi.org/10.1103/physrevb.101.195104.

[44] Z. Liu, D. Zhu, K. Lee, A. Kim, L. Raju, and W. Cai, “Compounding meta-atoms into metamolecules with hybrid artificial intelligence techniques,” Adv. Mater., vol. 32, p. 1904790, 2020, https://doi.org/10.1002/adma.201904790.

[45] P. Naseri and S. Hum, “A generative machine learning-based approach for inverse design of multilayer metasurfaces,” 2020, arXiv preprint, arXiv:2008.02074.

[46] M. Zhelyeznyakov, S. Brunton, and A. Majumdar, “Deep learning to accelerate Maxwell’s equations for inverse design of dielectric metasurfaces,” 2020, arXiv preprint, arXiv:2008.10632.

[47] E. Prodan, C. Radloff, N. J. Halas, and P. Nordlander, “A hybridization model for the plasmon response of complex nanostructures,” Science, vol. 302, p. 1089171, 2003, https://doi.org/10.1126/science.1089171.

[48] X. Liu, T. Tyler, T. Starr, A. F. Starr, N. M. Jokerst, and W. J. Padilla, “Taming the blackbody with infrared metamaterials as selective thermal emitters,” Phys. Rev. Lett., vol. 107, no. 4, p. 045901, 2011, https://doi.org/10.1103/physrevlett.107.045901.

[49] T. Ince, S. Kiranyaz, L. Eren, M. Askar, and M. Gabbouj, “Real-time motor fault detection by 1-D convolutional neural networks,” IEEE Trans. Ind. Electron., vol. 63, pp. 7067–7075, 2016, https://doi.org/10.1109/tie.2016.2582729.

[50] B. Xiao, Y. Xu, X. Bi, J. Zhang, and X. Ma, “Heart sounds classification using a novel 1-D convolutional neural network with extremely low parameter consumption,” Neurocomputing, vol. 392, pp. 153–159, 2020, https://doi.org/10.1016/j.neucom.2018.09.101.

[51] Q. Chao, J. Tao, X. Wei, Y. Wang, L. Meng, and C. Liu, “Cavitation intensity recognition for high-speed axial piston pumps using 1-D convolutional neural networks with multi-channel inputs of vibration signals,” Alexandria Eng. J., 2020, https://doi.org/10.1016/j.aej.2020.07.052.

[52] M. Tahersima, K. Kojima, T. Koike-Akino, et al., “Deep neural network inverse design of integrated photonic power splitters,” Sci. Rep., vol. 9, p. 1368, 2019, https://doi.org/10.1038/s41598-018-37952-2.

[53] I. Sajedian, J. Kim, and J. Rho, “Finding the optical properties of plasmonic structures by image processing using a combination of convolutional neural networks and recurrent neural networks,” Microsyst. Nanoeng., vol. 5, p. 27, 2019, https://doi.org/10.1038/s41378-019-0069-y.

[54] J. Jiang and J. A. Fan, “Multiobjective and categorical global optimization of photonic structures based on ResNet generative neural networks,” Nanophotonics, 2020, https://doi.org/10.1038/s41578-020-00260-1.

[55] R. H. Fan, B. Xiong, R. W. Peng, and M. Wang, “Constructing metastructures with broadband electromagnetic functionality,” Adv. Mater., p. 1904646, 2019, https://doi.org/10.1002/adma.201904646.

[56] W. Ma, Y. Wen, and X. Yu, “Broadband metamaterial absorber at mid-infrared using multiplexed cross resonators,” Opt. Express, vol. 21, no. 25, pp. 30724–30730, 2013, https://doi.org/10.1364/oe.21.030724.

[57] S. An, B. Zheng, M. Shalaginov, et al., “A freeform dielectric metasurface modeling approach based on deep neural networks,” 2019, arXiv preprint, arXiv:2001.00121.

[58] D. Melati, Y. Grinberg, M. Dezfouli, et al., “Mapping the global design space of nanophotonic components using machine learning pattern recognition,” Nat. Commun., vol. 10, p. 4775, 2019, https://doi.org/10.1038/s41467-019-12698-1.

[59] C. C. Nadell, B. Huang, J. M. Malof, and W. J. Padilla, “Deep learning for accelerated all-dielectric metasurface design,” Opt. Express, vol. 27, no. 20, pp. 27523–27535, 2019, https://doi.org/10.1364/oe.27.027523.

[60] W. Ma, Z. Liu, Z. Kudyshev, A. Boltasseva, W. Cai, and Y. Liu, “Deep learning for the design of photonic structures,” Nat. Photonics, pp. 1–14, 2020.

Supplementary Material

Included in the supplementary material are details regarding hyperparameter optimization and training data size.

The online version of this article offers supplementary material (https://doi.org/10.1515/nanoph-2020-0549).