Progress in nanophotonics has yielded numerous extraordinary optical properties such as cloaking of objects, imaging beyond the diffraction limit, and negative refractive index. In nanophotonics, a sub-wavelength antenna interacts with light, so precisely designed components can provide useful functionalities. Although several systematic design approaches for desired optical behaviors have been introduced, the inverse design of nanophotonic structures still relies mostly on laborious optimization. Such a conventional design method requires time-consuming iterative simulations.
Recently, data-driven design approaches have been proposed to overcome this problem. These approaches use artificial neural networks (NNs) to design nanophotonic structures. Previous studies first fixed the shape of the structures to be predicted, such as multilayers or H-antennae, and then trained NNs to output the structural parameters that achieve the desired optical properties. Once the NNs are trained, they provide the corresponding design parameters without additional iterative simulations. Such attempts have greatly reduced the effort and computational cost of designing nanophotonic structures. So far, however, these approaches have only been applied to cases in which the basic structure is predefined, so that only its structural parameters can be predicted. Most recently, a generative adversarial network (GAN) model has been used to inversely design metasurfaces, providing arbitrary patterns of the unit-cell structure.
In this article, we present the first use of a conditional deep convolutional generative adversarial network (cDCGAN) to design nanophotonic structures. cDCGAN is a recently developed algorithm that resolves the instability problem of GANs and provides a very stable Nash equilibrium solution. The generated designs are presented as images, so they can represent essentially any possible design for the desired optical properties and are not limited to specific structures. Our network produces designs as a 64×64 pixel probability distribution function (PDF) over a domain of 500 nm×500 nm, which allows 2^(64×64) degrees of freedom of design.
2 Results and discussion
2.1 Deep-learning procedure
For deep learning, we first collect a dataset consisting of 10,150 silver antennae with six representative shapes (circle, square, cross, bow-tie, H-shaped, and V-shaped). Each entry in the dataset is composed of a reflection spectrum with 200 spectral points and its corresponding cross-sectional structural design as a 64×64 pixel image. Sixty-four coarse meshes are used along both the x- and y-directions to keep the calculation simple. The cross-sectional structure designs are prepared as images with a physical domain size of 500 nm×500 nm. The 30-nm-thick antenna is placed on a 50-nm MgF2 spacer, a 200-nm silver reflector, and a silicon substrate (Figure 1). To obtain the reflection spectrum of each structure, a finite-difference time-domain (FDTD) electromagnetic simulation is performed using the commercial program Lumerical FDTD Solutions. The simulation covers the spectral range f=250–500 THz, from which 200 spectral points are extracted. Periodic boundary conditions with a periodicity of 500 nm are used along the x- and y-directions, and perfectly matched layer (PML) boundary conditions are used along the z-direction. In each simulation, y-polarized light is normally incident on the antenna. The current deep-learning setting solves the structural design problem for a fixed physical domain and a fixed wavelength range; designing structures with different periodicities or wavelengths would require additional data collection and training.
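The dataset described above pairs each 200-point spectrum with its 64×64 design image. A minimal sketch of how such pairs could be wrapped for training is shown below; the class name and array layout are our illustrative assumptions, not the authors' code.

```python
import torch
from torch.utils.data import Dataset

class AntennaDataset(Dataset):
    """Pairs a 200-point reflection spectrum with a 64x64 cross-sectional
    design image of the silver antenna (1 = metal present, 0 = absent)."""

    def __init__(self, spectra, images):
        # spectra: (N, 200) array; images: (N, 64, 64) array
        self.spectra = torch.as_tensor(spectra, dtype=torch.float32)
        self.images = torch.as_tensor(images, dtype=torch.float32)

    def __len__(self):
        return len(self.spectra)

    def __getitem__(self, idx):
        # add a channel axis so the image matches Conv2d input: (1, 64, 64)
        return self.spectra[idx], self.images[idx].unsqueeze(0)
```

A standard `DataLoader` over this dataset then yields (spectrum, image) batches for the two networks.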
As a next step, we implement a deep-learning algorithm using the PyTorch framework. Artificial intelligence has recently revolutionized the field of computer vision. A convolutional neural network (CNN) is among the most widely used techniques, inspired by the natural visual perception mechanism of the human brain. A CNN uses convolution operators to extract features from the input data, which are usually images. It greatly increases the efficiency of image recognition, because every channel extracts important features of the images. Meanwhile, the development of GANs has driven major progress in computer vision. A GAN is composed of a generator network (GN) that generates images and a discriminator network (DN) that distinguishes the generated images from real ones. The GN is trained to generate authentic images to deceive the DN, and the DN is trained not to be deceived by the GN. The two networks compete with each other at every training step; ultimately, the competition leads to mutual improvement, so that the GN can generate higher-quality realistic images than when it learns alone. DCGAN combines the ideas of CNNs and GANs to provide a very stable Nash equilibrium solution. We employed the cDCGAN algorithm, which adds a condition: in our case, the input reflection spectrum.
The cDCGAN architecture to design nanophotonic structures is presented in Figure 2. cDCGAN is composed of two networks: a GN that generates structural cross-sectional images, and a DN that distinguishes the generated images from a user-given group of target designs. The GN consists of four transposed convolutional layers with 1024, 512, 256, 128, and 1 channels, respectively; the DN is a CNN with four layers. The GN takes as input both a 100×1 random noise vector (z) and the 200×1 input spectrum, and outputs a probability distribution function (PDF) of the antenna generated from the random noise. The input spectrum guides the GN to generate a PDF with the corresponding optical properties. In contrast, the DN takes as input a structural image from either the user-provided group of target designs (x) or the PDF images generated by the GN, GN(z), and plays the role of discriminating GN(z) from the target designs. Ultimately, GN and DN are trained simultaneously and competitively: the GN is trained to generate authentic structural designs to deceive the DN, and the DN is trained to distinguish target designs from those generated by the GN. Mathematically, GN and DN are trained to minimize or maximize, respectively, the objective function:
$$\min_{GN}\max_{DN}\ \mathbb{E}_{x}[\log DN(x)]+\mathbb{E}_{z}[\log(1-DN(GN(z)))]\qquad(1)$$
where DN(x) represents the probability that a structural image comes from the target design group (x), and DN(GN(z)) represents that probability for a design GN(z) generated by the GN. The DN is trained to maximize the expectation values, assigning DN(x)=1 to an image coming from the target designs and DN(GN(z))=0 to an image generated by the GN. Conversely, the GN is trained to minimize the objective in order to deceive the DN. This adversarial training allows the GN to generate high-quality structural images.
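The alternating min-max training of GN and DN can be sketched as a single competitive update, with binary cross-entropy standing in for the log-likelihood terms. The helper below is a hedged illustration; the function and module interfaces are our assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def adversarial_step(GN, DN, opt_GN, opt_DN, spectrum, target, z_dim=100):
    """One competitive update: DN maximizes the objective, GN minimizes it.
    `spectrum` conditions both networks; `target` is a batch of real designs."""
    z = torch.randn(target.size(0), z_dim)

    # DN step: push DN(x) toward 1 for real designs, DN(GN(z)) toward 0
    fake = GN(z, spectrum).detach()           # detach: GN is frozen here
    d_real, d_fake = DN(target, spectrum), DN(fake, spectrum)
    loss_DN = (F.binary_cross_entropy(d_real, torch.ones_like(d_real))
               + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
    opt_DN.zero_grad(); loss_DN.backward(); opt_DN.step()

    # GN step: push DN(GN(z)) toward 1, i.e. try to deceive DN
    d_fake = DN(GN(z, spectrum), spectrum)
    loss_GN = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
    opt_GN.zero_grad(); loss_GN.backward(); opt_GN.step()
    return loss_DN.item(), loss_GN.item()
```

Calling this once per batch realizes the competition described above: each network improves against the other's current state.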
$$l_{GN}=l_{GN,\mathrm{design}}+\rho\, l_{GN,\mathrm{adv}}\qquad(2)$$
where l_GN,design is the design loss, l_GN,adv is the adversarial loss defined in Eq. (1), and ρ is the ratio of the adversarial loss. The design loss is introduced to explicitly guide the GN to generate accurate structural images. It directly measures the quantitative difference between the two probability distributions of the target design (x_i) and the generated design using a binary cross-entropy criterion:
$$l_{GN,\mathrm{design}}=-\sum_{i}\left[x_i\log\sigma(GN(z)_i)+(1-x_i)\log\bigl(1-\sigma(GN(z)_i)\bigr)\right]\qquad(3)$$
where σ is the sigmoid function.
We optimized ρ to make the GN generate high-quality realistic designs. For a low ρ, no competition effect can be expected, whereas a high ρ can cause confusion in the learning process. Therefore, an appropriate value of ρ=0.5 was chosen to maximize the ability of the GN to produce convincing structural designs. During each training step, the network weights are optimized to describe the mapping between the input spectrum and the PDF (see Supporting Information for details on the deep-learning procedure and network optimization).
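With ρ fixed at 0.5, the total generator objective combines the design loss and the adversarial loss. A hedged sketch of this combination (the tensor layouts and use of pre-sigmoid logits are our assumptions):

```python
import torch
import torch.nn.functional as F

def generator_loss(gen_logits, target, d_fake, rho=0.5):
    """Design loss plus rho times the adversarial loss.
    gen_logits: pre-sigmoid output of GN; target: 0/1 design image;
    d_fake: DN's probability that the generated image is real."""
    # design loss: binary cross-entropy between sigmoid(gen_logits) and target
    l_design = F.binary_cross_entropy_with_logits(gen_logits, target)
    # adversarial loss: GN wants DN to label its images as real (1)
    l_adv = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
    return l_design + rho * l_adv
```

Setting `rho=0` recovers a plain supervised image-regression loss; increasing it strengthens the competition with the DN.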
After training, the cDCGAN suggests designs as a 64×64 pixel PDF p(i, j), which represents the probability that a silver antenna exists at location (i, j). To reduce the PDF to a binary image representing the existence of antennae at each location, we employed a post-processing step based on Otsu's method. This method determines the binary threshold t that minimizes the intra-class variance of the black and white pixels as
$$\sigma_w^2(t)=\omega_0(t)\,\sigma_0^2(t)+\omega_1(t)\,\sigma_1^2(t)\qquad(4)$$
where ω0 and ω1 represent the weights for the probabilities of the two classes separated by t, σ0² is the variance of the black pixels, and σ1² is the variance of the white pixels. In summary, for a given reflection spectrum, the cDCGAN produces a PDF, which is then converted to a binary design image in the post-processing step. At each training step, 2000 validation samples are used to validate the trained network. The average loss of the validation set converged to 5.564×10−3 after 1000 training steps. On a single GTX 1080-Ti GPU, training the network for one epoch takes about 4 min; once trained, however, the network can generate a design for a desired spectrum within 3 s.
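Otsu's threshold can be computed from a histogram of the PDF; minimizing the intra-class variance is equivalent to maximizing the between-class variance, which is cheaper to evaluate. A self-contained sketch (the bin count and edge conventions are our choices, not taken from the paper):

```python
import numpy as np

def otsu_threshold(pdf, bins=256):
    """Binary threshold for a probability map in [0, 1].
    Maximizes the between-class variance, which is equivalent to
    minimizing the intra-class variance w0*s0^2 + w1*s1^2."""
    hist, edges = np.histogram(pdf.ravel(), bins=bins, range=(0.0, 1.0))
    p = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(p)               # weight of the "black" class
    w1 = 1.0 - w0                   # weight of the "white" class
    mu0 = np.cumsum(p * centers)    # unnormalized running class mean
    mu_t = mu0[-1]                  # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * w0 - mu0) ** 2 / (w0 * w1)
    sigma_b = np.nan_to_num(sigma_b)
    k = int(np.argmax(sigma_b))
    return edges[k + 1]             # upper edge of the optimal bin
```

`(pdf > otsu_threshold(pdf)).astype(np.uint8)` then yields the binary antenna image used in the simulations.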
2.2 Network evaluation
The trained cDCGAN is evaluated on test data that were not used in the training or validation steps. Randomly chosen test results are shown in Figure 3. Target designs of various nanophotonic antennae (upper-right panel in Figure 3) and the corresponding suggested PDFs (lower-right panel in Figure 3) show good qualitative agreement. For a quantitative evaluation of the suggested PDFs, FDTD simulations based on the suggested designs were conducted, with the PDFs converted to binary designs for import into the simulations. The reflection spectra of the suggested designs agree well with the given input spectra. We introduce a mean absolute error (MAE) criterion
$$\mathrm{MAE}=\frac{1}{N}\sum_{i=1}^{N}\left|Y_i-\hat{Y}_i\right|\qquad(5)$$
to quantitatively measure the average error of the reflection spectrum per spectral point between the FDTD simulation result (Y_i) obtained from the suggested design and the input spectrum (Ŷ_i) originally fed into the network, with N=200 spectral points. The average MAE over 12 test samples is 0.0322, which indicates that the trained network can provide an appropriate structural design with the desired reflection spectrum. Interestingly, even when the antennae have similar shapes, there can be a discrepancy between the predicted and input spectra, as shown in the second column, bottom row of Figure 3. This is due to small artifacts in the provided PDF, which can be removed using a different image filter in the post-processing step (see Supporting Information for the effect of post-processing).
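The MAE criterion itself reduces to a one-line average over the 200 spectral points; a minimal sketch:

```python
import numpy as np

def mean_absolute_error(simulated, target):
    """Average per-point error between the FDTD spectrum of the suggested
    design (Y_i) and the input spectrum fed to the network (Yhat_i)."""
    simulated = np.asarray(simulated, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean(np.abs(simulated - target)))  # here N = 200 points
```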
We also test our cDCGAN with completely new triangle and star antennae, shapes that appear nowhere in the training and validation dataset (Figure 4A, B). The cDCGAN generated new designs that are distorted forms of the antennae used for training. The results imply that our cDCGAN can suggest arbitrary designs, unconstrained by structural parameters. The generated images differ from the target designs, but their reflection spectra are similar to the input reflection spectra. This is a consequence of the non-uniqueness of the correlation between optical properties and designs: several different designs can have the same optical property. Among the possible designs, the generated results are most likely to be found in regions that do not deviate far from the trained dataset space.
Finally, our cDCGAN was further tested with randomly generated, hand-drawn spectra of the Lorentzian-like function
with four cases: (A) a=120, b=900, c=150; (B) a=70, b=850, c=70; (C) a=100, b=1150, c=70; and (D) a=80, b=900, c=80. For each case, the generated image and its corresponding reflection spectrum are shown in Figure 5. The MAEs of the reflection spectra for the four examples are (A) 0.0496, (B) 0.0396, (C) 0.0409, and (D) 0.0408. The predicted responses show reasonably good agreement with the input spectra in terms of overall behavior. Most interestingly, the generated images (insets in Figure 5A–D) deviate substantially from the shapes used for training. Such extraordinary structural shapes are not constrained by predefined structures and are not even easily describable; this is the key advantage of our method over previous ones, which can only suggest predefined structural parameters. The results also indicate that the cDCGAN genuinely learns the correlation between structural designs and their overall optical responses and hence can be widely used to systematically design nanophotonic structures.
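The exact normalization of the Lorentzian-like function is not reproduced in this excerpt; assuming a standard three-parameter Lorentzian with amplitude-like a, center b, and width c (wavelength in nm), the hand-drawn test spectra could be sampled as:

```python
import numpy as np

def lorentzian_spectrum(a, b, c, wavelengths):
    """Lorentzian-like line peaking near b with half-width c.
    NOTE: the normalization a*c/((x-b)^2 + c^2) is an assumption for
    illustration; the excerpt does not specify the exact form."""
    return a * c / ((wavelengths - b) ** 2 + c ** 2)

# e.g. case (A): a=120, b=900, c=150, sampled at 200 spectral points
# on 600-1200 nm (the wavelength window matching f = 250-500 THz)
wl = np.linspace(600, 1200, 200)
spectrum_A = lorentzian_spectrum(120, 900, 150, wl)
```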
In conclusion, we demonstrated the first use of a cDCGAN to design nanophotonic structures. The two networks, GN and DN, learn competitively to suggest appropriate designs of nanophotonic structures with the desired reflection properties. Our cDCGAN is not limited to suggesting predefined structures but can also generate new designs, with 2^(64×64) = 2^4096 design possibilities. The 7.8-nm pixel size of the current design makes fabricating the suggested designs at this resolution very difficult; this difficulty can be overcome by increasing the pixel size to a feasible fabrication scale of 20–30 nm. Here, we limited the input to a single reflection spectrum. The method can also be extended to the diffraction regime, where additional diffraction orders may appear, by using multiplexed input spectra with several diffraction orders. Because of the limitations of the training dataset used, it is not always possible to generate structural images for extraordinary reflection spectra; this weakness may be overcome by collecting additional data that represent such spectra. Although our examples fix the thickness of each layer and the antenna material, these could also be added as output parameters to be suggested. This modification would allow artificial intelligence to design nanophotonic devices completely independently and would thereby greatly reduce the time and computational cost of manual design. We believe that our findings will accelerate the development of nanophotonics by solving the central problem of structure design.
4 Supplementary Material
Supplementary Material is available online on the journal’s website or from the author.
Malkiel I, Nagler A, Mrejen M, Arieli U, Wolf L, Suchowski H. Deep learning for design and retrieval of nano-photonic structures. 2017; arXiv preprint, arXiv:1702.07949.
Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks. 2015; arXiv preprint, arXiv:1511.06434.
Russell SJ, Norvig P. Artificial intelligence: a modern approach. Malaysia: Pearson Education Limited, 2016.
Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 2012:1097–105.
Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets. Adv Neural Inf Process Syst 2014:2672–80.
Mirza M, Osindero S. Conditional generative adversarial nets. 2014; arXiv preprint, arXiv:1411.1784.
Pathak D, Krahenbuhl P, Donahue J, Darrell T, Efros AA. Context encoders: feature learning by inpainting. Proc IEEE Conf Comput Vis Pat Recogn 2016:2536–44.
Isola P, Zhu JY, Zhou T, Efros AA. Image-to-image translation with conditional adversarial networks. 2017; arXiv preprint.
The online version of this article offers supplementary material (https://doi.org/10.1515/nanoph-2019-0117).
About the article
Published Online: 2019-06-22
This work was financially supported by the National Research Foundation grants (NRF-2017R1E1A1A03070501, NRF-2019R1A2C3003129, CAMM-2019M3A6B3030637, NRF-2018M3D1A1058998, and NRF-2015R1A5A1037668) funded by the Ministry of Science and ICT, Korea. S.S. acknowledges a global Ph.D. fellowship (NRF-2017H1A2A1043322) from the NRF-MSIT, Korea.
Citation Information: Nanophotonics, Volume 8, Issue 7, Pages 1255–1261, ISSN (Online) 2192-8614, DOI: https://doi.org/10.1515/nanoph-2019-0117.
© 2019 Junsuk Rho et al., published by De Gruyter, Berlin/Boston. This work is licensed under the Creative Commons Attribution 4.0 Public License.