Inverse Design of Nonlinear Metasurfaces for Sum Frequency Generation

Sum frequency generation (SFG) has multiple applications, from optical sources to imaging, where efficient conversion requires either long interaction distances or large field concentrations in a quadratic nonlinear material. Metasurfaces provide an essential avenue to enhanced SFG due to resonance with extreme field enhancements with an integrated ultrathin platform. In this work, we formulate a general theoretical framework for multi-objective topology optimization of nanopatterned metasurfaces that facilitate high-efficiency SFG and simultaneously select the emitted direction and tailor the metasurface polarization response. Based on this framework, we present novel metasurface designs showcasing ultimate flexibility in transforming the outgoing nonlinearly generated light for applications spanning from imaging to polarimetry. For example, one of our metasurfaces produces highly polarized and directional SFG emission with an efficiency of over 0.2 cm^2/GW in a 10 nm signal operating bandwidth.


Introduction
Sum frequency generation (SFG) is a fundamentally important second-order nonlinear process with many applications ranging from wavelength conversion of optical sources [1] and infrared imaging [2]- [5] to nonlinear polarimetry [6].This phenomenon arises from induced polarizations in the medium, and in general, it can be observed in the presence of strong optical fields.Efficient second-order nonlinear frequency conversion, such as the SFG, traditionally requires long interaction lengths in bulky nonlinear crystals.As a result, only certain crystalline orientations and input polarizations can satisfy the phase-matching conditions for producing sizable nonlinear effects.This limits the types of polarization transformations and the directionality of emission that are possible in nonlinear bulk crystals.
Recent advances in nanotechnologies have facilitated the development of ultrathin single-layer dielectric metasurfaces, where optical nano-resonators can enhance and tailor the nonlinear interactions with functionalities beyond the capabilities of traditional bulky crystals [7]- [11].To generate optical resonances, previous metasurface designs have often relied on semi-analytical approaches in the limiting cases of Mie-type modes for individual nano-resonators [12]- [17], or bound state in the continuum resonances [18]- [25].The angular-dependent properties of nonlocal metasurfaces could also be utilized to tune the nonlinear interactions over a range of wavelengths [26].There is ongoing research on the enhancement of SFG in metasurfaces with resonances at non-degenerate wavelengths [27]- [30].Non-planar structures with broken 3D symmetries have been identified for designing effective nonlinear susceptibility response [31], [32].Furthermore, several studies have considered inverse design or machine learning approaches to optimize for strong resonances [33]- [37].The resulting geometries have highly counterintuitive nanostructured geometries, which prove superior to conventional designs in their respective applications.
However, these examples fall short of controlling nonlinear generation beyond the mere enhancement of conversion efficiency, which is only a part of the advantages that metasurfaces offer.For many practical applications such as imaging, it is often important to consider the input and output polarizations as well as the directional distribution of the generated emission.Control of the nonlinear polarization and the regulation of diffraction orders while maintaining high overall efficiency still remain a major challenge in the field.Several works demonstrate switching of diffraction behaviour by tuning the input polarization [38], [39], allowing for different routing schemes.In many situations, it is desirable to reduce diffraction, which can be achieved with exploiting selection rules [40] or having subwavelength unit cell for all wavelengths [41].Highefficiency nonlinear conversion generally requires strong field enhancements in the nonlinear region, which can be characterized by the Q-factor of the resonance, its matching with the pulse bandwidth, and the overlap between high Q-factor modes.The task of engineering overlapping high Q-factor resonances with a prescribed bandwidth in materials at non-degenerate wavelengths is difficult in itself.The additional objectives of engineering emission polarization and direction expand the complexity of the task even further.
In this manuscript, we address the aforementioned engineering challenges by developing a novel inversedesign framework for optimizing metasurfaces that enables the simultaneous enhancement of the SFG efficiency over a desired pulse bandwidth, tailoring the polarization transformation matrix, and increasing the emission directionality in a multi-objective manner.Our resulting metasurfaces account for the intrinsic χ(2) of the material and combine it with the optimized structural geometries to form an effective χ (2) eff of the device (Figure 1(a)), enabling ultimate control of nonlinear interactions.Thereby, we broaden the flexibility and increase the benefits of using nonlinear metasurfaces, from wavelength conversion to imaging and other applications.

Adjoint optimization of SFG
Beginning from the perspective of classical nonlinear optics, a polarization field is induced in the medium that has a nonlinear relationship with applied electric field [42], [43] P =  0 [  (1) : E +  (2) : EE +  (3) : This polarization then generates a nonlinear current that excites the fields at harmonic frequencies.Our work of the SFG field can have an independent effective  (2)   eff associated with it.
By optimization of the structure, each of these  (2)   eff can be tailored as desired.(b) The energy levels of the SFG process with each field's respective frequencies.
focuses on the second-order process, so we explicitly write this nonlinear current at the sum-frequency  3 as for input fields E 1 and E 2 at the frequencies  1 and  2 , respectively (Figure 1(b)).The implementation of an additional position-dependent parameter (r) allows the strength of the  (2) interaction to be modified simultaneously with the value of the refractive index in the presence or absence of material (Figure 2(a)).We assume the nondepleted pump approximation, where the driving fields will not suffer a significant loss of intensity as it propagates due to low enough conversion efficiency.Then the fields E 1 and E 2 are found as solutions of linear Maxwell's equations, and then E 3 at  3 may be obtained by solving the inhomogeneous wave equation with the current from Eq. ( 2).We aim to maximize the value of an objective function that depends on the generated sum-frequency field, which in turn depends on the medium parameters (p) and on the incident fields: We choose the optimization parameter as p = (r) and define the permittivity at each position of the design space as where  d and  c are the relative permittivity for the patterned and cladding materials, respectively.Thereby, (c) corresponding adjoint simulations.The forward fields are labelled E i , and the adjoint fields are labelled v E, j for frequency  i,j .p = 0 or p = 1 corresponds to the absence or presence of a nonlinear material at a particular spatial location according to Eqs. ( 2) and ( 4), while the values of 0 ≤ p ≤ 1 are used during the intermediate stages of the optimization process as we discuss in the following paragraph.
We formulate the adjoint optimization approach to nonlinear metasurfaces following the general principles developed in Refs.[44]- [49].Previously, nonlinear cavities [50] and metasurfaces [34], [37] were optimized to maximize the total second harmonic generation.Here, we focus on the SFG process, which includes three fields with different spatial distributions, adding further complexity to the optimization problem.Furthermore, we perform optimization for SFG radiation over targeted diffraction orders, rather than just the total converted radiation in all directions, as well as the tailored nonlinear polarization transformations.For this purpose, we aim to computationally efficiently calculate the derivatives dT∕dp simultaneously for all spatial points in the nonlinear layer, which then allows for the fast gradient descent optimization of the metasurface patterns as sketched in Figure 2(a).Instead of repeating the forward calculation for material variations at every spatial point (Figure 2(b)), we identify the adjoint linear problems at the three wavelengths (Figure 2(c)), which solutions allow the calculation of objective function derivative at all spatial locations through the overlaps between the fields at the three non-degenerate wavelengths and each of their corresponding adjoint fields.Detailed mathematical expressions for the material derivative and its formulation can be found in Supplementary S1.In the main manuscript, we summarize the ways in which the necessary adjoint electric fields are obtained.
For optimization of sum-frequency emission in the outward far field and on the surface Ω, we can define the objective function through the complex amplitude a of a particular wave or mode with a field (E 3, f , H 3, f ).This mode can be of any form, including plane waves, Gaussian beams, vortices, and other beam shapes, in combination with any polarization structure.Then, the mode amplitude can be defined, consistent with the derivations based on the unconjugated Lorentz reciprocity theorem [51]- [53], as where n is the unit normal vector outward the surface Ω, (E 3,b , H 3,b ) is a direction-reversed wave of (E 3, f , H 3, f ), and the normalization coefficient is We now consider the most common case, in which all materials are reciprocal with symmetric permittivity and permeability, and we only have the electric field-induced electric current source.Then, the adjoint field v E,3 at  3 is a result of linear scattering from the metasurface for a source The adjoint fields v E,1,2 at  1,2 are obtained by solving Maxwell's equations with a current source where The subscript E refers to the adjoint field v E being an electric field.We find that there is no contribution from magnetic adjoint fields.These currents only exist where (r) (2)   ijk is non-zero, i.e. in the nonlinear material.We note that these currents have a dependency on v E,3 , and so it must be obtained before solving the aforementioned equations.
Finally, the objective function gradient is where indices q = 1, 2 are for  1,2 .This equation allows the calculation of derivatives for any functions T(a 3 ) using a chain rule.For example, for the maximization of SFG light power into a particular mode, we can set T = |a 3 | 2 , and In total, a set of six simulations is required to calculate the gradient using the above equations for an arbitrarily large number of spatial positions determined by the computational grid: three forward simulations to calculate E q , and three adjoint simulations for v E,q at the three nondegenerate wavelengths  q with q = 1, 2, 3, respectively.

Optimization for multiple polarizations in SFG
We now discuss the methodology for multi-objective optimization by extending the results in the previous section that were formulated assuming fixed input waves E 1 and Of particular interest is to simultaneously tailor the sumfrequency polarization states for multiple combinations of different input polarizations.Then, we can define a figure of merit (FOM) as where m enumerates different input-wave polarization combinations and a (m) 3 are the sum-frequency amplitudes of the chosen polarization and spatial mode profiles.The derivative can be found using a chain rule d d 2 ) d 2 ) d , (13) where the derivatives on the right-hand-side are determined using the adjoint formulation in Eq. ( 10).
In the most general case, the input polarizations for  1,2 can each be decomposed into pairs of orthogonal polarization states.Each of M = 4 combinations of input waves can generate a different nonlinear current.Then, a total of 14 unique simulations are required for the material derivative to fully capture all possible input and output polarization combinations, including 8 forward (2 at  1,2 and 4 at  3 ) and 6 adjoint (2 at each wavelength) simulations.
Furthermore, our method can also optimize diffraction outputs by specifying the adjoint source waves' k-vectors.Diffraction into particular orders can be enhanced (suppressed) by increasing (decreasing) the corresponding mode power |a 3 | 2 , respectively.

Numerical implementation
We iteratively update the function (r) at all spatial locations inside the quadratically nonlinear material.At each iteration, the material permittivity is updated via gradient descent, or another gradient-based method, at each discretized point in the domain [47], [54].For a single-layer metasurface design, we consider the pattern to be independent of the longitudinal coordinate z.The optimization concludes when the FOM no longer increases from one iteration to the next or the set maximum number of iterations has been reached.
In our optimization, we employ several different techniques to ensure that the free-form structures converge to a state that is both physical and realizable.We introduce an increasing binarization factor as the optimization progresses that eventually forces the final pattern to be either material or air [55], [56].The patterns are also periodically blurred with a Gaussian filter that ensures the minimum feature is larger than limitations imposed by fabrication precision [57], [58].Importantly, an artificial absorption coefficient is added onto the material, which will prevent convergence to structures that have arbitrarily large quality factors (Q factor) [50], [59].This non-radiative decay rate defines the finite bandwidth of the SFG process, which is a practical consideration for future experimental verification and distinguishes our optimization algorithms from other works purely relying on high-quality factor resonances.This artificial absorption coefficient is later removed for the analysis of the actual metasurfaces.

Results
In general, we can optimize for any combination of polarizations and elements in the effective  (2) tensor, as discussed above.However, in this work, we tackle a simpler problem that nevertheless still highlights the strength of our optimization scheme.Specifically, we consider a signal having any transverse polarization while the pump has a fixed polarization.Such functionality may be beneficial for the operation of upconversion infrared imaging, where the image can have any arbitrary polarization and the pump is a light source of fixed polarization.For our examples, we have a signal wavelength  1 = 1, 550 nm and a pump wavelength  2 = 1, 350 nm (resulting in  3 = 721.6 nm).The transformation of the signal state into the SFG state can be expressed in the form (2) , The fields E q represent the orthogonal complex amplitudes q , E y q ) T in the transverse components basis.We can make a connection of Eq. ( 14) to the Jones vectors in polarization optics in the following way.Let M represent the nonlinear scattering matrix analogous to the Jones matrix in linear materials.Individual elements of M can be interpreted as complex scattering amplitudes.For an unpatterned film, M is constrained by the crystal orientation and is set by the elements of  (2) and the input state of E 2 .One such case is provided in Supplementary S3.1 for unpatterned nonlinear film.Then E 1 and E 3 can be interpreted as the input and output Jones vectors of the system, respectively.Now, with the ability to pattern structures into the nonlinear film, the permittivity  is no longer uniform across the domain but instead can be engineered to achieve the desired transformation of M. In all the examples, the nonlinear material is indium gallium phosphide (InGaP) that is of (100) crystalline orientation and 300 nm thick.The film is resting on the fused silica substrate, with a 900 nm × 900 nm unit cell.This unit-cell size is chosen because it produces the best performance from preliminary studies.Ideally, to reduce diffraction, a subwavelength unit-cell should be used for all wavelengths, but it was found that such cells are too small to support strong resonances in contrast to larger unit-cells.Future work may explore subwavelength unit-cells at all wavelengths.
Details regarding InGaP parameters are contained in Supplementary S2.The electromagnetic simulations are performed using the commercial COMSOL Multiphysics software package suite.Each simulation takes approximately 100 s when using a discretization of 40 nm.The resulting fields are exported to the MATLAB programming language, where we implement our inverse-design optimization.

Polarizing nonlinear metasurface
At the SFG wavelength, multiple diffraction orders exist due to the periodicity of the metasurface.In infrared imaging applications, the higher diffraction orders should be ideally suppressed so that the majority of the SFG light is propagating in the zeroth order.In previous works [5], [29], it has been a challenge to suppress these higher-order propagating modes.Simultaneously, with our method, we can also optimize diffraction outputs by specifying the adjoint source waves' k-vectors.This is a significant advance compared to previous inverse-design approaches for quadratically nonlinear metasurfaces [34], [37] where only total second-harmonic conversion but not directionality could be optimized.
We denote the zeroth order of the scattering matrix as M 0 , whose elements can be optimized individually.In the first example, the signal is an unpolarized state while the pump at  2 is polarized in the |V⟩ state.We intend for the SFG zeroth order diffraction to also be polarized in the |V⟩ state (Figure 3(a)).In this scheme, the metasurface can be considered to be a nonlinear polarizer by taking an unpolarized input and polarizing it into the |V⟩ state at the SFG output.The figure of merit is formulated as , and its derivative is calculated with Eq. ( 13).We note that unpolarized light can be decomposed into equal powers of any pair of orthogonal states, which are the |V⟩ and |H⟩ pair in our modelling.In this notation, the states | 1 ⟩ denote the input signal state, while ⟨ 3 | denote the outgoing SFG state.
The optimization converges to a design that is highly non-trivial, as shown in Figure 3(b).We calculate the expected transformation of input unpolarized light and find that it is almost fully converted into the |V⟩ state at the SFG wavelength, as depicted on the Poincaré sphere in Figure 3(c).The numerical values of M 0 matrix elements can be found in Supplementary S3.2, where further analysis is also provided.In Figure 3(d), we show the predicted SFG conversion efficiency into all the propagating diffraction orders, of which the order comprises approximately 80 % of the total SFG light (see Supplementary S4.1 for precise values).Therefore, the optimized metasurface directs the vast majority of SFG light perpendicular to the surface, which greatly benefits imaging applications.We note that the suppression of higher order diffraction orders is not explicitly included in the cost function, which only includes the maximization of the desired zeroth diffraction order.In initial studies, we did not find a significant improvement in performance by incorporating the suppression of higher order diffraction, while it slowed down the optimization process.
We now determine the frequency bandwidth by fixing the wavelength of the pump at  2 = 1,350 nm and calculating the SFG efficiency as an unpolarized signal wavelength is swept around the operating wavelength of  1 = 1,550 nm (Figure 3(e)).The transmission into the desired |V⟩ state is significantly larger than the |H⟩ state and reaches a maximum SFG efficiency of 0.25 cm 2 GW −1 , which is more than 3 orders of magnitude larger than an unpatterned film of the same nonlinear material and thickness.This performance is echoed when we fix the signal  1 = 1,550 nm and sweep the pump wavelength instead (Figure 3(f)).The full-width at half maximum (FWHM) of conversion efficiency for the signal is approximately 20 nm, while it is approximately 3 nm for the pump, representing a reasonably large operating bandwidth suitable for efficient conversion with short optical pulses.From these plots, it is evident that there are resonances present within the metasurfaces at the input wavelengths  1 and  2 that greatly enhance the SFG process.
We provide the field distributions and further analysis in the Supplementary S5.

Polarization independent nonlinear metasurface
In this example, we focus on a metasurface whose SFG efficiency enhancement is independent of the polarization of  1 with fixed polarization for  2 .Previously, this has been achieved in various waveguides [60], [61].A metasurface that has this property can enhance SFG conversion efficiency for all input signal polarization states equally.This is particularly useful for imaging, where the source is typically unpolarized or partially polarized.Such an upconverted image will preserve the relative amplitudes of the original image, even if there are spatial variations in the polarization.This is in contrast to previous demonstrations of SFG imaging, where the metasurfaces typically rely on polarization-sensitive resonant modes of bound states in the continuum [5].Therefore, for this case, the target M 0 matrix is close to unitary after scaling, or its singular values are close to equality.The FOM for this case is where s 1 and s 2 are the ordered singular values of M 0 , and for this demonstration, the pump polarization at  2 is fixed at |V⟩.By maximising s 2 , which is always defined as the smaller of the two singular values, we are effectively increasing the SFG enhancement of the worst-performing polarization input state.For a unitary matrix, the ratio of the singular values must be s 2 ∕s 1 = 1.
The optimized metasurface is again a non-trivial freeform pattern, as shown in Figure 4(b).For this metasurface, we then calculate the transformation of various input states at the signal wavelength and plot them on a Poincaré sphere (Figure 4(c)).We see that the metasurface imparts a rotation of the eigenstates to the input states during the SFG process.We provide further analysis of the matrix M 0 in Supplementary S3.3.Importantly, the eigenstates are nearly orthogonal, which indicates that the transformation is indeed near unitary after scaling.The SFG is primarily channelled into the zeroth diffraction order for an unpolarized signal (Figure 4(d)), with reduced light leakage into higher order diffraction modes (see Supplementary S4.2 for numerical values).
We calculate the SFG efficiency for a range of unpolarized input signal wavelengths and pump wavelength fixed at  2 = 1,350 nm and find a maximum in average efficiency of 6 × 10 −3 cm 2 GW −1 (Figure 4(e)).Again in this example, the FWHM of conversion efficiency for the signal is approximately 20 nm, while it is approximately for 10 nm for the pump.The SFG efficiency for two orthogonal polarizations of |H⟩ and |V⟩ are almost equal.On the right axis, we show the ratio of singular values that reaches a maximum of 0.8, reasonably close to the ratio of 1 for a truly unitary transformation.This analysis is repeated for a fixed signal ( 1 = 1,550 nm) and varying pump wavelengths (Figure 4(f)), with the maximum efficiency peaking at  2 = 1,350 nm, as expected.Therefore, the transformation of input light into SFG output from the metasurface can be considered to be nearly polarization-independent.The preservation of polarization information leads to the ability to perform upconversion polarimetric imaging [6] with greater efficiency than previously reported.

Structure of resonances in the optimized metasurfaces
Finally, we perform linear simulations to inspect the resonances that we expect to be present in the metasurfaces at  1 ,  2 ,  3 (Supplementary S5.1).Because the two metasurfaces presented in this work in Sections 3.1 and 3.2 above are optimized for different functionalities, they also have different resonant characteristics.For the nonlinear polarizer metasurface, only |H⟩ produces a resonance at  1 , while for the polarization-independent nonlinear metasurface, resonances appear for both |H⟩ and |V⟩ close to  1 .As expected for both nonlinear metasurfaces, only |V⟩ produces a resonance at the pump wavelength because the polarization at  2 was fixed.We note that the Q-factors are on the order of 70-300 for the different wavelengths, which allows for a reasonably broad imaging bandwidth and for SFG with short optical pulses.We also provide field enhancement distributions for the metasurfaces at all the interacting wavelengths in Supplementary S5.2.

Discussion
We note that our derived equations governing nonlinear inverse design are general, and can be implemented for any domain with any combination of boundary conditions.In our work, we optimized for SFG in a periodic metausurface.
In the future, one may explore using our optimization in different regimes, for example, in the quasi-periodic approximation.This has been extensively used for the construction of various linear metasurface devices, and has recently been extended to demonstrate nonlinear metalens [62] and nonlinear metasurface beam shaping [63].Full wave simulations of large-area metasurfaces have recently been achieved [64], promising the ability to simulate entire devices.As our nonlinear optimization is decoupled into a series of linear equations, we can leverage these powerful solvers in the future to accomplish more complex objectives.

Figure 1 :
Figure 1: Inverse-designed nonlinear metasurface for sum frequency generation.(a) Schematic of the possible processes that can occur through SFG.The input fields E 1 and E 2 can have arbitrary polarization states, which then generate the SFG field E 3 .Each diffraction order

Figure 2 :
Figure 2: Process of metasurface topology optimization for sum frequency generation.(a) Depiction of topology optimization.The material derivative at each discrete point is calculated through a series of forward and adjoint simulations.(b) Forward simulations of sum frequency generation, and

Figure 3 :
Figure 3: Topology-optimized metasurface for SFG polarization and diffraction order engineering.(a) Schematic of an SFG polarizing metasurface.The polarization state of E 2 is fixed, while the state for E 1 is unpolarized.The SFG light from the optimized metasurface is polarized in the |V⟩, with the higher diffraction orders also being suppressed.(b) Optimized pattern of the metasurface.(c) Poincaré sphere representation of all input states of E 1 being transformed into the same SFG output (S 1 , corresponding to |H⟩) state.(d) SFG efficiency from an unpolarized source for different diffraction orders.(e) Signal wavelengths sweep of SFG efficiency spectra for a pump wavelength fixed at 1,350 nm, and (f) pump wavelengths sweep of SFG efficiency spectra for a signal wavelength fixed at 1,550 nm.

Figure 4 :
Figure 4: Polarization-independent SFG from a topology-optimized metasurface.(a) Schematic of an SFG waveplate metasurface.The polarization state of E 2 is fixed, while the state for E 1 is unpolarized.The SFG light from the optimized metasurface is rotated in the |V⟩, with the higher diffraction orders also being suppressed.(b) Optimized pattern of the metasurface.(c) Poincaré sphere representation of all input states of E 1 being transformed into different SFG output states while preserving their orthogonality.(d) SFG efficiency from an unpolarized source for different diffraction orders.(e) Signal wavelengths sweep of SFG efficiency spectra for a pump wavelength fixed at 1,350 nm, and (f) pump wavelengths sweep of SFG efficiency spectra for a signal wavelength fixed at 1,550 nm.The dashed orange curves show the ratio of singular values (SV) for each plot at their respective wavelengths.