2D photonic memristor beyond graphene: progress and prospects

Photonic computing and neuromorphic computing are attracting tremendous interest as routes to breaking the memory wall of the traditional von Neumann architecture. Photonic memristors, which combine light sensing, data storage, and information processing capabilities, are important building blocks of optical neural networks. In recent years, two-dimensional materials (2DMs) have been widely investigated for photonic memristor applications, where they offer additional advantages in geometry scaling, a wide detectable spectral range, and abundant structural designs. Herein, the recent progress made toward the exploitation of 2DMs beyond graphene for photonic memristor applications is reviewed, as well as their application in photonic synapses and pattern recognition. Different materials and device structures are discussed in terms of their light-tuneable memory behavior and underlying resistive switching mechanisms. Following the discussion and classification of device performance and mechanisms, the challenges facing this rapidly progressing research field are discussed, and routes to realizing commercially viable 2DM photonic memristors are proposed.


Introduction
With the emergence of the Internet-of-Things (IoT), the vast increase in data workloads creates new challenges in improving computation speed and energy efficiency. However, the latency of transferring instructions between processor and memory imposes a fundamental limitation on modern computers, known as the memory wall [1]. One revolutionary approach to breaking the memory wall is neuromorphic computing with resistive switching memories (memristors), which aims to carry out calculations in situ, where the data are located [2][3][4][5][6][7][8]. The memristor, a portmanteau of memory and resistor, is a non-volatile memory whose resistance state can be programmed by the voltage history it has experienced [2,5]. This approach is similar to the computing scheme of the human brain, where information is processed in biological neural networks (BNNs) of neurons and synapses, without any physical separation between computation and memory [3,7]. When memristors are built into large-scale crossbar arrays to form artificial neural networks (ANNs), they can perform efficient neuromorphic computing to enable artificial intelligence functions such as pattern storage and image recognition [8].
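The in-memory computation performed by a memristor crossbar can be illustrated with a minimal sketch. It assumes an ideal array (no wire resistance and no sneak paths), in which each column current follows from Ohm's and Kirchhoff's laws. The function name and all conductance and voltage values are hypothetical and chosen only for illustration.

```python
def crossbar_column_currents(conductances, voltages):
    """Return the column currents of an ideal memristor crossbar.

    conductances[i][j] is the conductance (siemens) of the cell joining
    row i to column j; voltages[i] is the read voltage (volts) on row i.
    Each column current is I_j = sum_i V_i * G_ij.
    """
    if len(conductances) != len(voltages):
        raise ValueError("one voltage is required per crossbar row")
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


# Hypothetical 3x2 array: three input rows, two output columns.
G = [[1e-6, 2e-6],
     [3e-6, 4e-6],
     [5e-6, 6e-6]]
V = [0.1, 0.2, 0.3]
currents = crossbar_column_currents(G, V)
print(currents)
```

With these illustrative values the two column currents come out at approximately 2.2 µA and 2.8 µA: the stored conductances act as the weight matrix and the multiply-accumulate operation happens where the data reside.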
However, for a single-layer neural network, the number of artificial synapses is in general the square of the number of artificial neurons. For example, a fully interconnected network of 10^4 neurons requires 10^8 synapses, which approaches the limit of state-of-the-art VLSI technology. The interplay between photonics and ANNs offers the advantages of parallel processing and massive interconnection capabilities for the design of large-scale photonic neuromorphic computers [9][10][11][12]. Because light beams can pass through three-dimensional space without interfering with each other, optical systems can provide a higher space-bandwidth product and faster propagation than their electronic counterparts. These unique properties of optics have stimulated great interest in designing photonic ANNs, which can perform neuromorphic functions at high speed. Moreover, combining photonic stimuli with electronic ones in synaptic devices can enable the integration of synaptic functions with biometric sensing elements such as vision, auditory, and olfactory sensors [13][14][15].
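The quadratic growth of the synapse count quoted above can be checked with a few lines of arithmetic. The function name is hypothetical, and the sketch assumes that every neuron connects to every neuron.

```python
def synapses_fully_connected(n_neurons):
    """Synapse count of a fully interconnected single-layer network (N * N)."""
    return n_neurons * n_neurons


for n in (10**2, 10**3, 10**4):
    print(f"{n} neurons -> {synapses_fully_connected(n)} synapses")
```

A tenfold increase in neurons costs a hundredfold increase in interconnects, which is the scaling pressure that motivates optical interconnection.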
Owing to the exceptional optoelectronic potential promised by the new class of 2D materials, considerable efforts have been made to demonstrate and realize their applications in photonic memory and photonic synapses. Beyond electronic stimuli, in novel 2DM-based photonic memristors and synaptic devices, either a light signal or a gate voltage is typically used for the programming/erasing operation, and an electrical source-drain voltage is used to read out the current. The synaptic weight is encoded in the conductance of the 2DM channel. Up to now, both the two-terminal photonic memristor and the three-terminal floating-gate photonic memory have been extensively studied, mainly based on molybdenum disulphide (MoS2), tungsten diselenide (WSe2), and black phosphorus (BP). The two-terminal vertical memristor, which relies on Schottky barrier modulation and conductive filament formation, offers the advantages of lower switching voltages and higher integration density. The three-terminal lateral photonic memristor typically exploits photo-generated charge trapping/detrapping on the floating gate, which provides more design flexibility and new functionality. Beyond this device-level emulation of synaptic dynamics, these 2DM photonic synaptic devices have also been used to build photonic ANNs. Pattern recognition tasks, including image preprocessing and colour-mixed pattern classification, have subsequently been verified on these ANNs [13,14].
Here, we review the progress achieved in this fast-evolving area over the past few years and highlight the issues that need to be addressed toward practical application in future photonic ANNs. In the following sections, we divide the realization of photonic memory into two configurations, i.e. vertical and lateral structures. The two configurations are dealt with separately, with emphasis on the switching mechanisms and the performance of devices built from 2DMs. Finally, the challenges facing this rapidly progressing research field are discussed and routes to realizing commercially viable photonic ANNs are proposed. As our focus is on photonic memristors in 2DMs, we refer readers to several important reviews [8,10,46-60] for a basic understanding of memristors, of electronic and photonic synaptic devices based on hybrid material systems including 2DMs, organics, and transition metal oxides (TMOs), and of their application to ANNs and neuromorphic computing.

2DMs preparation methods
The growth of large-area, high-quality 2DMs is perhaps one of the most challenging tasks. In general, production methods can be classified as either top-down, where bulk layered materials are directly exfoliated to yield mono- and few-layer flakes, or bottom-up, where individual flakes are grown or synthesized on substrates [61,62]. The most common top-down methods are mechanical exfoliation and chemical exfoliation such as solution processing; the most commonly used bottom-up method is chemical vapor deposition (CVD).
CVD is a promising method for large-scale growth [24,63,64]. The synthesis process depends on the properties of the substrate, the temperature, and the atomic gas flux. Several CVD techniques have been developed to prepare 2D transition metal dichalcogenides (TMDs). One effective method for growing large-area, continuous TMDs is the "two-step method" [64], in which a transition metal thin film (e.g. Mo, W, Nb) is first deposited on the substrate (usually Si/SiO2) and then reacted thermally with chalcogen (S, Se, Te) vapor. This reaction forms a stable 2D TMD during the CVD process at high temperature (300-700°C) in an inert atmosphere. The "two-step method" has demonstrated wafer-scale growth (∼2 in.) and successful thickness modulation of MoS2 layers (monolayer to multilayer) on SiO2/Si substrates. Metal organic chemical vapor deposition (MOCVD) is similar to conventional CVD except that metal-organic or organic compound precursors are used as the source materials [65]. In the MOCVD reaction, the desired atoms are combined with complex organic molecules and flowed over a substrate, where the molecules are decomposed by heat and the target atoms are deposited on the substrate atom by atom. The quality of the films can be engineered by varying the composition at the atomic scale, which yields the desired thin film with high crystallinity.
Compared with graphene and TMDs, the synthesis of BP is still in its infancy [66][67][68]. In 2018, Li et al. demonstrated a synthesis approach for crystalline BP by conversion from red to black phosphorus at 700°C and 1.5 GPa [68]. This is achieved by depositing a red phosphorus (RP) thin film on a sapphire substrate and then converting it to BP at elevated temperature and pressure. The synthesized BP thin film exhibits a multi-crystalline structure with crystal domain sizes ranging from 40 to over 70 µm, and a BP-FET built from it demonstrated a field-effect mobility of 160 cm^2/Vs. However, CVD presents challenges in cost-effectiveness and scalability due to the demanding production conditions (e.g. high temperature, specific gases and precursors) and expensive sacrificial growth substrates. In contrast, top-down chemical exfoliation (i.e. solution processing) allows mono- and few-layer 2D material flakes to be exfoliated from bulk crystals in a liquid medium in large quantities, with low setup and production costs (cost-effective equipment, raw materials, and material processing).
Solution processing generally refers to exfoliation via ion intercalation, ion exchange, or pure shear forces, the last being known as liquid-phase exfoliation (LPE) [69][70][71][72][73]. LPE was first reported for graphene production by Hernandez et al. in 2008, who recorded the exfoliation of mono- and few-layer graphene flakes in organic solvents (e.g. N-methyl-2-pyrrolidone (NMP)) by ultrasonication. The approach starts with immersing graphite in a liquid medium. Under the effect of ultrasound waves, localized bubbles are generated in the liquid. The bubbles then collapse and generate high shear forces, which overcome the interlayer van der Waals (vdW) forces to yield exfoliated graphene flakes. This approach has since been successfully extended to a wide range of 2D materials, including TMDs and BP [74]. In addition to ultrasound-assisted exfoliation, other commonly adopted methods for shear force generation include high-shear mixing, high-pressure mixing, and ball milling. As LPE exploits shear forces to achieve exfoliation, the intermolecular interaction between the 2D materials and the liquid medium strongly influences the exfoliation process and the subsequent dispersion of the exfoliated 2DMs in the liquid.

Photonic memristors based on 2DMs
Owing to their diverse electronic band gaps, 2DMs can cover a broad electromagnetic spectrum [75], as shown in Figure 1. Graphene, with its zero bandgap, can interact with light from microwave to ultraviolet wavelengths, but its semimetallic nature makes it a candidate for electrodes rather than for the resistive switching (RS) medium. Thus, in this review, we mainly focus on 2DMs beyond graphene. The wide range of material properties, together with the possibility of fabricating 2DM heterostructures from different materials, allows for the realization of various photonic memristors covering a wide spectral range from the ultraviolet to the mid-infrared, as shown in Figure 1. Photonic memristor structures can be mainly divided into two-terminal vertical RRAM and three-terminal phototransistor memory. The existing switching mechanism of photonic RRAM relies on charge trapping/detrapping at the MoS2 surface. The lateral memristor is constructed as a phototransistor based on 2DMs or 2DM heterostructures. Its switching mechanism lies in building potential wells between the semiconducting channel and the two electrodes, which can store/release charge carriers as modulated by light stimuli. In the following sections, we discuss the structure, switching mechanism, and device performance of photonic memristors based on MoS2, WSe2, and BP, respectively. The device performance of 2DM photonic memristors is benchmarked in detail in Table 1.

Molybdenum disulfide (MoS2) based photonic memory
MoS2 has a direct bandgap of 1.8 eV when its thickness is scaled down to a monolayer (0.7 nm) [24,77,78], which enables photonic memristors to operate at visible wavelengths. To date, MoS2 is the most studied RS material among the 2DMs. Photonic memristors in both two-terminal RRAM and three-terminal phototransistor configurations are discussed as follows.
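The link between the 1.8 eV bandgap and visible-wavelength operation follows from the photon energy relation E = hc/λ. The sketch below converts a bandgap to the longest wavelength that can be absorbed across the gap. It assumes that the absorption edge sits exactly at the bandgap and ignores excitonic and defect-related absorption; the function name is hypothetical.

```python
PLANCK = 6.62607015e-34            # Planck constant, J s
LIGHT_SPEED = 299792458.0          # speed of light, m/s
ELECTRON_CHARGE = 1.602176634e-19  # joules per electronvolt


def cutoff_wavelength_nm(bandgap_ev):
    """Longest absorbable wavelength (nm) for a given bandgap (eV)."""
    if bandgap_ev <= 0:
        raise ValueError("bandgap must be positive")
    energy_joule = bandgap_ev * ELECTRON_CHARGE
    return PLANCK * LIGHT_SPEED / energy_joule * 1e9


mos2_cutoff = cutoff_wavelength_nm(1.8)  # monolayer MoS2 value from the text
print(f"Monolayer MoS2 cutoff: {mos2_cutoff:.0f} nm")
```

For 1.8 eV the cutoff is about 689 nm, i.e. in the red part of the visible spectrum, which is consistent with the visible-range operation described above.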

Two-terminal vertical RRAM
The two-terminal vertically stacked memristor, named resistive random access memory (RRAM), is a widely studied memory prototype that is typically configured in a metal/insulator/metal structure [79][80][81]. RRAM offers the advantage of excellent scalability, showing great potential to meet the high demand for constructing large-scale, high-density neuromorphic chips. However, RRAM in a crossbar configuration typically suffers from inaccurate reading and programming due to current flowing through unselected cells ("sneak path current"). One practical approach to preventing this unwanted leakage current is to integrate each memristor with a MOS transistor (i.e. a 1T1R cell) [82] or a selector (i.e. a 1S1R cell) [83], but this results in a larger circuit footprint. An alternative that achieves a passive array without any connected access device is for the memristor itself to have a nonlinear I-V characteristic [8]. Ultrathin MoS2 with a large self-rectification ratio is therefore a promising choice. Mechanically exfoliated MoS2 without intrinsic defects does not initially display significant RS behavior. Successful strategies for achieving MoS2 RRAM include incorporating MoS2 with polymers [84,85], embedding the 1T MoS2 phase [86], introducing an oxide layer of MoO_x [21,87], and printing an active Ag electrode above MoS2 ink [22,23], among others. Notably, RS behavior has also been observed in monolayer MoS2 grown by chemical vapor deposition (CVD) [20], which opens a path to the fabrication of RRAM crossbar arrays. Figure 2 compares the electronic and photonic memristors fabricated on CVD-grown continuous monolayer MoS2 films [20,40]. Figure 2A shows the schematic of a memristor in which CVD-grown monolayer TMDs serve as the RS layer [20].
This device shows non-volatile characteristics at a current compliance of 10 mA: after being set into the low resistance state (LRS), a voltage of opposite polarity is required to reset the device into the high resistance state (HRS), as shown in Figure 2B. The transitions from HRS to LRS and from LRS to HRS are equivalent to the "writing" and "erasing" processes in digital memory devices. The devices achieve a switching ratio of 10^4 and a fast switching time of 15 ns. The non-volatile behavior extends to several prototypical semiconducting TMDs, including MoS2, MoSe2, and WS2, which points to the universality of this phenomenon in TMD monolayers. Temperature-dependent conduction measurements in the LRS and HRS, together with area-dependent studies, reveal that the switching mechanism is driven by conductive filament formation. In the SET process, electrons from the metal electrode are transported through the grain boundaries of MoS2 to form a conductive path, and in the RESET process the conductive path is disrupted, resulting in a Schottky barrier at the device interfaces [20].
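The bipolar, non-volatile switching described above can be captured in a simple behavioral sketch: a two-state resistor that is SET by a sufficiently positive voltage and RESET by a sufficiently negative one. This is a threshold abstraction, not the filament physics of the reported device. Only the 10^4 switching ratio is taken from the text; the class name, threshold voltages, and LRS resistance are hypothetical.

```python
class BipolarMemristor:
    """Two-state behavioral model of a bipolar, non-volatile memristor."""

    def __init__(self, r_lrs=1e3, switching_ratio=1e4, v_set=1.0, v_reset=-1.0):
        self.r_lrs = r_lrs                     # ohms, hypothetical value
        self.r_hrs = r_lrs * switching_ratio   # HRS/LRS ratio of 10^4
        self.v_set = v_set                     # volts, hypothetical threshold
        self.v_reset = v_reset                 # volts, opposite polarity
        self.state = "HRS"                     # pristine device is off

    def apply(self, voltage):
        """Apply a voltage, update the stored state, return the current (A)."""
        if voltage >= self.v_set:
            self.state = "LRS"                 # SET ("writing")
        elif voltage <= self.v_reset:
            self.state = "HRS"                 # RESET ("erasing")
        resistance = self.r_lrs if self.state == "LRS" else self.r_hrs
        return voltage / resistance


cell = BipolarMemristor()
i_off = cell.apply(0.1)     # small read voltage, state unchanged
cell.apply(1.2)             # write
i_on = cell.apply(0.1)      # state persists between pulses
cell.apply(-1.2)            # erase with the opposite polarity
i_erased = cell.apply(0.1)
print(i_off, i_on, i_erased)
```

Reading at a voltage well below both thresholds leaves the stored state untouched, which is the property that makes the cell usable as a memory element.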
A photonic memristor operating under light illumination has been demonstrated in a sputtered W/MoS2/SiO2/p+ Si structure, as shown in Figure 2C [40]. In contrast to the electronic memristor, its switching mechanism relies on charge traps at the MoS2/SiO2 interface. Low-energy electrons and holes are localized in potential wells, where the trap sites originate from dangling Si-O bonds at the MoS2/SiO2 interface and from sulfur vacancies in the CVD-grown monolayer MoS2. Without light illumination, the I-V sweep shows a current hysteresis that is attributed to the native SiO2 interfacial trap layer (Figure 2D). Since the retention time in the dark is less than 150 s, the behavior is effectively volatile: the resistance returns to the HRS naturally once the voltage is removed. Under light illumination, photo-excited electrons and holes are created and separated by the built-in electric field of the n-MoS2/p-Si heterostructure, leading to photodiode-like behavior. The illumination thus generates a persistent photocurrent, which lengthens the relaxation time and results in a long-lasting current increment.
Another vertical photonic memristor configuration is an Al/UCNPs-MoS2 heterostructure/ITO/substrate stack [33]. UCNPs are upconversion nanoparticles based on lanthanide ion (Yb^3+, Er^3+)-doped NaYF4. The MoS2-UCNP heterostructure can act both as a near-infrared (NIR) sensitizer and as an exciton generation/separation center, which significantly improves photocarrier generation and NIR light-controlled RS [33]. The transfer curve shows non-volatile bipolar switching behavior, with a resistive switching mechanism related to NIR light-manipulated charge trapping/detrapping. The NIR-generated charge carriers can be localized in MoS2 by charge traps, which in turn modulate the conductivity of MoS2 via the photo-gating effect. The RS process is illustrated by the energy band diagram in Figure 2E. In the initial HRS, since the Fermi energy of the aluminum (Al) top electrode is very close to the valence band of MoS2, a high electron Schottky barrier is formed between Al and the heterostructure. In the dark, when a low voltage bias is applied, most of the free charge carriers are trapped at the MoS2-UCNP interface and only a few can cross the barrier. Under NIR irradiation, the UCNPs emit visible light, which is reabsorbed by MoS2 to form excitons. Under the external electric field, the photo-generated excitons are separated at the interface between MoS2 and the UCNPs. The resulting free charge carriers further increase the band bending and lower the Schottky barrier height, resulting in the LRS.

Three-terminal phototransistor memory
Although the three-terminal structure requires a relatively high programming voltage and more complex circuit integration, the third terminal provides more design flexibility for performing synaptic functions. The floating-gate FET (FG-FET) is a conventional memory configuration in which an additional floating gate (FG) is introduced within the gate dielectric of a phototransistor. The switching mechanism is typically based on charge trapping/detrapping by the FG [88][89][90]. The storing and releasing of charge carriers by the FG shifts the device threshold voltage (V_th) and thereby modulates the drain current (I_ds). Figure 3A and C show simplified photonic FG-FET memories in which the conventional SiO2 gate dielectric itself functions as both the charge-trapping and the charge-blocking layer. As exemplified by the multi-layered CuIn7Se11 (CIS) based photonic FG-FET shown in Figure 3A, the design concept lies in building potential wells between the semiconducting channel and the two electrodes when a positive gate voltage is applied to the n-type semiconductor [35]. The working principle has been successfully extended to other 2DMs (i.e. InSe and MoS2). The switching mechanism is explained in Figure 3B. Under dark conditions (step I in Figure 3B), the inversion-charge electrons are trapped by the interfacial charge while the Schottky barriers prevent electrons from being injected from the electrodes into the channel, so the device is in the HRS, which is set as the initial state. Laser pulse illumination then generates electron-hole pairs in the channel (step II). The photo-generated holes escape from the channel through the upward bending of the energy band, leaving the photo-generated electrons behind. In addition, the release of trapped electrons through surface electron-hole recombination alleviates the gate screening effect, resulting in more electrons in the conduction band.
During the waiting time, when the laser is turned off while the gate voltage is maintained, the information delivered by the light is stored and retained. Finally, when a −10 V read-out bias is applied to the drain electrode, the accumulated electrons are released, since the Schottky junction on the source side of the potential well is forward biased and lowered (step III). The stored information is read out in this step. At that early stage, only a short retention time (less than 50 s) and a low switching ratio (less than 10) were achieved. To increase the retention time and switching ratio, an improved structure creates artificial charge trap sites at the SiO2/MoS2 surface by oxygen plasma treatment [36]. Figure 3C shows the photonic FG-FET memory with a monolayer MoS2 channel on an oxygen plasma-treated SiO2 substrate. The surface treatment is found to induce a number of functional silanol groups (Si-OH) on the SiO2 substrate, which trap more electrons at a high positive gate voltage, as evidenced by the larger hysteresis in the transfer curve compared with the MoS2 FET on a pristine SiO2 substrate. Moreover, the energy band structure of MoS2 with functional groups indicates that the local potential fluctuations induced by the artificial trap sites are able to trap electrons energetically. The operating mechanism of the optical memory cell is similar to that of the CIS device discussed above, taking advantage of the potential well between the MoS2 channel and the electrodes. The photonic FG-FET memory is programmed in a sequence of reset, readout (off-state), light exposure, and readout (on-state) (Figure 3D). In this work, the introduction of artificial trap sites increases the retention time and switching ratio to ≈10^4 s and ≈4700, respectively.
Even though the retention time and switching ratio are improved compared with the first work, charge trapping by the trap sites at the SiO2/MoS2 interface is sensitive to the environment and limits the data storage capacity.
Alternatively, Lee et al. demonstrated a genuine photonic FG-FET memristor made from gold nanoparticle/crosslinked poly(4-vinylphenol)/MoS2 heterojunction FETs [31], as shown in Figure 3E. Instead of relying on charge trapping at the MoS2/SiO2 interface, they used thermally deposited metallic gold nanoparticles (AuNPs) inside the gate stack for charge trapping, which delivered superior performance in terms of a long retention time of >10^4 s and a high switching ratio of 10^6 [31]. As can be seen from the energy band diagram in Figure 3F, when the tunnelling time of electrons from the AuNPs to MoS2 is shorter than the recombination time of photo-generated electron-hole pairs in MoS2 (1 fs vs 50 ps), the electrons in the AuNPs are transferred to the valence band of MoS2, where they recombine with the photo-generated holes and thereby prevent recombination of the photo-excited electron-hole pairs in the MoS2 layer. Thus, the photo-excited excess electrons sustain a high current level for a long time, until a positive erasing gate voltage is applied. The incident light intensity can therefore be stored as a persistent photocurrent in the photonic FG-FET memory. Moreover, pulsed light at seven different optical powers was applied to create eight-level data storage over a current span of seven orders of magnitude (Figure 3G).

Lateral van der Waals heterostructure
One advantage of 2DMs is that their dangling-bond-free surfaces allow van der Waals (vdW) integration to create heterostructures with atomically clean and electronically sharp interfaces [27]. A 2DM vdW heterostructure is formed by physically assembling the building blocks through vdW interactions, offering an alternative bond-free integration strategy without lattice and processing limitations [28]. 2DM vdW heterostructures provide a great platform for extending the absorption range [38], increasing the number of storage levels [32], and shortening the switching time [41] of the photonic memristor.
By exploiting a PbS/MoS2 heterostructure, the optical sensing range can be extended to the infrared wavebands, which are used for night vision, military communication, and medical diagnosis [38]. As shown in Figure 4A, infrared-sensitive PbS nanoplates are grown on top of few-layer MoS2 flakes by CVD. Under infrared laser illumination (808, 1340, 1550, and 1940 nm), the photo-induced electrons in the PbS nanoplates are injected into the MoS2, possibly through electron injection over the barrier from PbS to MoS2 or through photon-enhanced thermionic emission (Figure 4B). When the laser pulse is switched off, an interface barrier (FR) prevents the reverse diffusion of electrons from MoS2 to PbS, so the localized holes cannot recombine and hence induce a long-lasting persistent photocurrent (PPC). The exponentially fitted lifetime of the localized holes in the MoS2/PbS heterostructure is 5125 s (Figure 4C). Subsequently, the electrical erasing operation is performed via a positive V_g pulse, which induces electron tunnelling from MoS2 to PbS. However, the memory has to operate at 80 K to avoid thermal fluctuations at higher temperatures. Small-bandgap 2DMs such as BP are therefore natural candidates for mid-infrared illumination, as discussed in Section 3.3.
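To give a feel for what a 5125 s lifetime means for retention, the sketch below evaluates a single-exponential decay, I(t) = I0 exp(−t/τ). The single-exponential form is an assumption made here for illustration (the text only states that the lifetime was obtained from an exponential fit), and the function names are hypothetical.

```python
import math

LIFETIME_S = 5125.0  # fitted lifetime quoted in the text


def remaining_fraction(t_seconds, lifetime_s=LIFETIME_S):
    """Fraction of the initial persistent photocurrent left after t seconds."""
    return math.exp(-t_seconds / lifetime_s)


def time_to_fraction(fraction, lifetime_s=LIFETIME_S):
    """Time (s) for the photocurrent to decay to the given fraction."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must lie in (0, 1]")
    return -lifetime_s * math.log(fraction)


half_life = time_to_fraction(0.5)
print(f"Fraction left after 10 min: {remaining_fraction(600):.2f}")
print(f"Time to decay to half: {half_life:.0f} s")
```

Under this assumption the stored photocurrent falls to half its initial value in roughly one hour, so the stored state remains readable over that timescale but is not retained indefinitely.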
To improve the switching speed, Figure 4D demonstrates a photonic memristor based on a MoS2/single-walled carbon nanotube (SWCNT) network vdW heterostructure, which achieves a program/erase time of ∼32/0.4 ms [41]. The MoS2/SWCNT heterostructure forms a Gaussian heterojunction transistor. This device works in an optical-write, electrical-erase mode. In the dark, the device is in the HRS, since the large negative gate bias (V_g = −50 V) completely depletes the carriers in MoS2. Once the laser (542 nm, 421 mW/cm^2, duration 0.1 s) is turned on, the photo-generated holes of MoS2 enter the SWCNT network under the negative electric field and are trapped by water or hydroxyl groups on the polar substrate surface of the SWCNT network. Since large numbers of photo-generated electrons accumulate in MoS2, the device forms a p-n junction together with the p-type SWCNT network. A high current is therefore maintained in this forward-biased p-n junction during read-out, even after the light is removed. Subsequently, a positive V_g (50 V) pulse is applied to erase the LRS, under which both MoS2 and the SWCNTs become n-type conductive. The excess electrons in the SWCNTs recombine with the previously trapped holes and weaken the gate screening effect. Thereafter, when the next negative gate pulse is applied, MoS2 becomes non-conductive again and the device returns to the HRS. Figure 4E shows a two-terminal multibit optical memory based on a MoS2/hexagonal boron nitride (h-BN)/graphene vdW heterostructure [32]. The top monolayer MoS2 is used as both the conducting channel and the light absorption layer, while the bottom graphene layer serves as a floating gate and the thin h-BN flake sandwiched between them acts as a dielectric tunnelling layer. Unlike the three-terminal FG-FET memory, this memory operates in an electrical-writing and optical-erasing mode.
As shown by the energy band diagram in Figure 4F, the device operates in a sequence of electrical programming (V_ds = −10 V), off-current reading (V_ds = 0.5 V), optical erasing, and on-current reading. In the dark, a negative source-drain voltage (V_ds) pulse induces electrons to tunnel from MoS2 through the h-BN layer via the Fowler-Nordheim mechanism and to be stored in the bottom graphene floating gate [91]. When V_ds is switched to 0.5 V, the channel is in the HRS, since the electrons are confined and spread out in the floating graphene. Moreover, the electrons in graphene exert a negative gate bias on the MoS2 channel, which further depletes the electrons in MoS2. Subsequently, upon exposure to a light pulse, the photo-generated holes in MoS2 can easily tunnel through the small triangular hole barrier of MoS2/h-BN to neutralize the electrons stored in the graphene, which removes the negative gate bias. Consequently, the Schottky barrier between MoS2 and the metal contacts is lowered and narrowed, facilitating electron injection from the metal contacts into the MoS2 channel. Together with the photo-generated electrons in the MoS2 channel, this switches the device into the LRS. This photonic memristor achieves a long retention time of more than 3.6 × 10^4 s and a switching ratio of 10^6.
Moreover, 18 successive light pulses generate 18 current levels, spanning six orders of magnitude in current with clear gaps between them (Figure 4G), achieving improved data storage capacity.
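The gain in storage capacity from multi-level operation can be quantified as the number of bits per cell, log2 of the number of distinguishable levels. The sketch below applies this to the eight-level and 18-level devices mentioned in this section. It assumes every level can be resolved reliably at read-out; the function name is hypothetical.

```python
import math


def bits_per_cell(levels):
    """Information capacity (bits) of a cell with the given number of levels."""
    if levels < 1:
        raise ValueError("a cell needs at least one level")
    return math.log2(levels)


for levels in (2, 8, 18):
    whole_bits = math.floor(bits_per_cell(levels))
    print(f"{levels} levels -> {bits_per_cell(levels):.2f} bits "
          f"({whole_bits} whole bits)")
```

Eight levels correspond to exactly three bits, while 18 levels give slightly more than four bits, so a binary encoding would use 16 of the 18 levels.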

Tungsten diselenide (WSe2) based memristor
Tungsten diselenide (WSe2) is among the rising stars in the TMD family, with distinctive features for applications in spintronics, transistors, and photodetectors. In 2018, a WSe2/h-BN heterostructure photonic memristor was demonstrated that exhibited a retention time of 4.5 × 10^4 s, a high switching ratio of 1.1 × 10^6, and a data storage capability of 128 distinct states [37]. The schematic of the device structure is shown in Figure 5A, in which a monolayer WSe2 flake is transferred on top of an h-BN flake on the SiO2 substrate. Unlike the MoS2 heterostructure photonic memristor, both the programming and the erasing processes involve a gate voltage pulse and light stimuli together. Figure 5B and C show the switching mechanism and the dynamic behavior. Under a short wavelength of 405 nm, the light stimuli excite electrons from the mid-gap donor-like states of h-BN to its conduction band. During programming, the negative gate bias of −20 V drives the photogenerated electrons into WSe2 and leaves the holes localized in the middle of the h-BN bandgap. After programming, the intrinsically p-type WSe2 FET changes to n-type behavior. The positive charges in h-BN screen the negative gate bias and terminate the programming process. During read-out, a positive gate bias of 50 V induces a high electron Schottky barrier at the WSe2/h-BN interface, which maintains a long retention time. Finally, when a light pulse is applied again, the photogenerated electrons fill the ionized positive defects in h-BN, and under a positive gate voltage the electrons in WSe2 transfer into h-BN to recombine with the photogenerated holes. WSe2 recovers its hole-dominated transport after erasing. During programming, the current rises progressively with the number of pulses to generate 128 distinct data storage levels.
Based on CVD-grown monolayer WSe2, Figure 5D shows an integrated memory matrix of 27 WSe2/h-BN photonic memristors, which successfully realizes the function of a color image sensor. Three laser beams with different wavelengths (red 638 nm, green 515 nm, and blue 473 nm) are applied to expose the selected pixels in sequence, while the other pixels are left unexposed. Figure 5E demonstrates an image of a "NUS" logo in which the letters "N", "U", and "S" record the red, green, and blue light, respectively.

Black phosphorus (BP) based memristor
BP is a semiconductor with a naturally small bandgap (0.3-2 eV), so it is not ideally suited for RRAMs. For this reason, existing studies of BP photonic memristors mainly focus on lateral phototransistor structures. Figure 6A shows the schematic of a photonic memristor based on a ferroelectric PZT-gated black phosphorus transistor (Fe-FET) [29]. This structure takes advantage of the interfacial charge effect induced by the ferroelectric polarization of lead zirconate titanate (PZT) to modulate the BP channel resistance. A positive voltage pulse applied to the PZT substrate induces negative polarization charges (P up state) on the top surface of the PZT, which draw holes into the BP channel to increase its conductivity and, at the same time, screen the PZT polarization. Conversely, a negative voltage pulse induces positive polarization charges (P down state), which deplete holes and decrease the conductivity. Interestingly, the BP/PZT photonic memristor can exhibit both negative photoconductivity (NPC) and positive photoconductivity (PPC; here distinct from the persistent photocurrent discussed above), depending on the PZT polarity, as can be seen from Figure 6B. That is, the photocurrent is smaller than the dark current in the P up state, whereas it is larger than the dark current in the P down state. The authors attribute the different photoresponses to the interface charge level (E_t). In the P up state, E_t behaves as a donor-like state. Because recombination between photo-generated electrons from the donor-like traps and holes in the valence band of BP dominates, the hole concentration decreases and the photocurrent exhibits NPC behavior. In the P down state, E_t behaves as an acceptor-like state, and the photo-excited holes from the acceptor-like traps supplement the hole concentration, resulting in PPC.
Besides the commonly used interfacial charge effect, another method employs optical stimuli of different wavelengths to trigger positive and negative photocurrents, as shown in Figure 6C [39]. Figure 6D shows the photocurrent under 280 nm and 365 nm laser illumination. Since BP is reactive to the environment, the native phosphorus oxide (POx) layer can act as charge trap sites that reduce the conductance of the device under 365 nm illumination. The role of POx is confirmed by thermal treatment: when the surface adsorbates are removed by thermal treatment, the negative photocurrent disappears. One effective way to trigger the positive photocurrent is to use high-energy UV (280 nm) exposure. The photo-generated carriers then far outnumber the trap sites and thus dominate, giving an overall increase in photocurrent.
The positive and negative photocurrent could be used to emulate the excitatory postsynaptic current (EPSC) or inhibitory postsynaptic current (IPSC) in an artificial synapse, which will be discussed in the next section.

Photonic synaptic devices
Photonic synaptic electronics is an important application of photonic memristors and an emerging field of research that aims to build photonic neuromorphic circuits [10]. Neurons and synapses are the two basic computational units in the brain [107,108]. A neuron performs signal-processing tasks by integrating the inputs coming from other neurons and generating spikes once a threshold is reached [109]. The synapses contribute to the computation by altering their connection strength in response to neuronal activity, which is known as synaptic plasticity [16]. Synaptic plasticity is believed to be the mechanism underlying learning and memory in the biological brain.
In a biological synapse, as shown in Figure 7A, the presynaptic neuron is connected to the postsynaptic neuron via a synapse [22]. Neurotransmitters are released into the synaptic gap by the synaptic vesicles as Ca2+ ions diffuse into the neuron [46]. The neurotransmitters then bind to the receptor sites of Na+ gated ion channels at the postsynaptic neuron. This causes the ion channels to open and allows Na+ ions to diffuse into the cell. When the membrane potential of the postsynaptic neuron becomes more positive and crosses a threshold, the neuron fires an action potential, a spiking activity responsible for the information flow and complex computations performed by the brain. In a photonic memristor, the tunable conductance is regarded as the synaptic weight, while photonic or electric stimuli are considered as the synaptic spikes. Photonic memristors that integrate storage capability and synaptic functions show great potential to mimic the associative learning of biological neural systems via olfactory sensors, leading to the development of novel devices for future photonic neuromorphic circuits. Recently, synaptic plasticity has been successfully implemented in several types of electronic and photonic memristors based on 2DMs. Different device structures, material systems, and RS mechanisms are utilized to mimic the ion flux and neurotransmitter release dynamics in chemical synapses. To date, a wide variety of synaptic functions have been achieved, such as short-term plasticity, long-term plasticity, paired-pulse facilitation, spike-rate dependent plasticity (SRDP), and spike-time dependent plasticity (STDP). The two-terminal artificial synapse closely mimics the structure and function of the biological synapse [16,110]. The stimulus spikes applied on the top electrode (bottom electrode) are considered as presynaptic spikes (postsynaptic spikes).
The stimulus spikes are applied to the top electrode, and the current in the bottom electrode is simultaneously monitored. The increase of the conductance corresponding to EPSC can be used to mimic the potentiation of the synaptic strength, while the decrease of the conductance corresponding to IPSC can be used to mimic the habituation of the synaptic strength. Short-term and long-term plasticity can be realized by modulating the stimulus frequency and relaxation time [22]. In a two-terminal photonic RRAM, optical stimuli applied above the top electrode can also induce synaptic plasticity in the channel. In the photonic MoS2 RRAM in Figure 2B, short-term potentiation and long-term potentiation were successfully implemented by applying frequency-varied photonic stimuli [40]. At a low frequency of 0.1 Hz, the device conductance is first enhanced, but because the time interval between two successive photonic stimuli is long enough for the electrons and holes to recombine, the current relaxes to its initial state after the stimulation. In contrast, when UV pulses at a higher frequency of 1 Hz are used, the current exhibits a long-lasting increase, which corresponds to long-term potentiation.
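The frequency dependence described above can be illustrated with a minimal phenomenological sketch. It assumes, purely for illustration, that each optical pulse adds a fixed conductance increment and that the photo-induced conductance decays exponentially between pulses; the increment `delta_g`, the time constant `tau_s`, and the function name are hypothetical and are not taken from [40]. A single exponential only captures the accumulation of residual conductance at short pulse intervals, not the true long-term retention of a real device.

```python
import math

def conductance_after_pulses(freq_hz, n_pulses, delta_g=1.0, tau_s=2.0):
    """Excess conductance (arbitrary units) sampled just before each next pulse.

    Assumptions (hypothetical): fixed increment per pulse and a single
    exponential relaxation with time constant tau_s between pulses.
    """
    interval = 1.0 / freq_hz
    g = 0.0
    trace = []
    for _ in range(n_pulses):
        g += delta_g                      # a light pulse adds photo-carriers
        g *= math.exp(-interval / tau_s)  # recombination during the dark interval
        trace.append(g)
    return trace

low = conductance_after_pulses(0.1, 10)   # 10 s between pulses
high = conductance_after_pulses(1.0, 10)  # 1 s between pulses
print("residual conductance at 0.1 Hz:", low[-1])
print("residual conductance at 1 Hz:  ", high[-1])
```

With these illustrative parameters the residual conductance at 0.1 Hz is negligible before each new pulse, whereas at 1 Hz it builds up from pulse to pulse, which is the qualitative behaviour reported for the two stimulation frequencies.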
In the three-terminal phototransistor structure, the gate terminal serves as the presynaptic input, the source/drain electrodes are regarded as the postsynaptic output terminal, and the channel conductance serves as the synaptic weight [76]. Postsynaptic responses (I_ds) triggered by presynaptic pulses (V_g) are recorded as a function of pre- and postsynaptic pulse width, pulse interval, frequency, and number of repetitions. As shown in Figure 7B, optical stimuli can serve as a fourth, presynaptic input terminal to enable synaptic plasticity via modulation of the channel conductance. Compared with two-terminal RRAM, the additional gate control and optical control provide augmented functionality, which could match the degree of complexity and number of neurons in the human brain. Figure 7B shows a simplified photonic FG-FET memristor whose switching mechanism is discussed in Section 3.1.2. Upon repeated light stimuli, the conductance initially increases rapidly and then drops rapidly when the optical illumination is terminated; finally, the conductance remains in the LRS (Figure 7C). This long-lasting LRS is called persistent photoconductivity (PPC, here denoting persistent rather than positive photoconductivity), which is attributed to defect- or trap-centred slow recombination in accordance with the random local potential fluctuation (RLPF) model. STP and LTP can be implemented by modulating the photon dosage, wherein an increased photodose enhances the device conductance and modulates the decay dynamics.
One application that tests the associative learning ability of a photonic synapse is to use optical stimuli to emulate the classical conditioning in Pavlov's dog experiment [39,76,111,112], as illustrated in Figure 7D. The food and salivation are the unconditioned stimulus and unconditioned response, which are emulated by light pulses and by a postsynaptic current beyond 500 nA (the salivation response threshold), respectively [76]. The conditioned stimulus, the bell, is emulated by voltage pulses applied to the back gate to activate a conditioned response [86]. At stage (1) (10 light pulses, λ = 445 nm, with a gate voltage of 2 V), Pavlov's dog starts to salivate (unconditioned response) on noticing food. These 10 pulse cycles with gate voltage bias mimic training routines with simultaneous feeding and ringing. However, 10 training cycles are insufficient for the dog to learn to associate the ring (conditioned stimulus) with food (unconditioned stimulus), as evidenced by stage (2), in which the conditioned stimulus alone (a 2 V gate voltage pulse) does not produce salivation. When the repeated training with simultaneous feeding and ringing is increased to 40 cycles, the subsequent conditioned stimulus alone successfully produces a postsynaptic current beyond the salivation threshold (stages (3) and (4)). This is the hallmark of an efficient association (learning) between the food/unconditioned stimulus and the bell/conditioned stimulus. In stages (5) and (6), the conditioned response after 40 training cycles remains so strong that the subsequent conditioned stimulus alone produces a current output higher than the salivation threshold for ≈2 h. As a result, Pavlov's dog now salivates when it hears the ringing of the bell alone. After 2 h, as shown in stage (7), the conditioned stimuli applied to the "well-trained" dog fail to trigger salivation, which mimics the forgetting process.
This conditioned association can be rebuilt. As shown in stage (8), only 15 retraining cycles, fewer than in the initial training, successfully re-established the association.

Photonic artificial neural network and pattern classification
To obtain a better understanding of the brain, brain science has evolved in two directions: biological neural networks (BNNs) and artificial neural networks (ANNs) [10]. Studies of BNNs use advanced electrophysiology and imaging techniques to investigate the biological, structural, and functional features of the brain [113,114]. The other direction builds ANNs with electronics, photonics, or direct growth of biological neuron cells to emulate BNNs [8,10]. The concept of mimicking BNNs with ANNs, also called neuromorphic computing, was introduced in the late 1980s [4]. Studies of ANNs include software simulation to enable neuromorphic computing and experimental demonstrations that realize pattern classification tasks [8]. In 2018, Wang et al. demonstrated a fully memristive neural network on chip, which consists of an 8 × 8 crossbar of 1T1R artificial synaptic devices integrated with 8 artificial neurons [109].
ANNs follow a simplified model inspired by BNNs, which consists of artificial neurons and synapses as computing elements. Figure 8A shows a simple ANN topology proposed by McCulloch and Pitts in 1943 [115]. The neurons (circles) in the input layer (x) receive signals and pass them through one or more hidden layers. In the hidden layers, the weighted sum of the inputs is processed through a nonlinear activation function (f, e.g. a step function or a sigmoid function) and then propagated to the output layer to produce the outputs (y). The results at the output layer are evaluated via a loss function by comparison with the target values. This calculation is typically referred to as "feedforward". Subsequently, an optimization algorithm called "gradient descent" is performed by taking the derivatives of the loss function and updating the synaptic weights (a process often referred to as "backpropagation"). The "feedforward" and "backpropagation" processes are iterated until a satisfactory accuracy is reached. The synapses are connections of different strength (or weight w_ij, shown as arrows) that are tuneable during training (Figure 8B). At each hidden layer, the input values, organized as a vector, undergo vector-matrix multiplication with the weights w_ij, which are represented by the memristor conductances [7,116]. At each column, the total current is the sum of the currents at each cross point, according to Kirchhoff's current law. This current is the output of one layer and the input of the next layer. A network in which the hidden layers and the output layer number n in total is typically referred to as an n-layer neural network.
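The column-wise current summation described above amounts to a vector-matrix multiplication. The following minimal sketch assumes an ideal crossbar (no wire resistance or sneak paths) and uses hypothetical voltages and conductances in arbitrary units; it computes the column currents and passes them through a sigmoid activation. The function names are illustrative only.

```python
import math

def crossbar_currents(voltages, conductances):
    """Column currents of an ideal crossbar: I_j = sum_i V_i * G_ij.

    Each row i is driven by voltage V_i; each cross point contributes
    V_i * G_ij (Ohm's law), and the contributions add up along column j
    (Kirchhoff's current law).
    """
    n_rows = len(conductances)
    n_cols = len(conductances[0])
    assert len(voltages) == n_rows
    return [sum(voltages[i] * conductances[i][j] for i in range(n_rows))
            for j in range(n_cols)]

def sigmoid(x):
    """Nonlinear activation applied to the summed current."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical 3-input, 2-output layer (arbitrary units).
V = [1.0, 0.5, 0.0]
G = [[0.2, 0.8],
     [0.4, 0.1],
     [0.9, 0.3]]

I = crossbar_currents(V, G)
y = [sigmoid(i) for i in I]
print("column currents:", I)
print("layer outputs:  ", y)
```

In a multilayer network the outputs `y` of one such layer would be applied as the row voltages of the next crossbar.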
To achieve good recognition accuracy with the back-propagation learning algorithm, synaptic devices are required to have linear conductance responses, a sufficient number of effective conductance states, and high stability in each state [117,118]. Ideally, the potentiation/depression characteristic is linearly proportional to the number of input pulses. However, realistic devices based on RRAM, phase change memory, etc. do not follow such an ideal trajectory; instead, the conductance changes rapidly at the beginning of LTP/LTD and then gradually saturates [117,118]. To extract the nonlinearity, the experimental LTP/LTD curve is typically fitted with a weight update formula that depends on the maximum and minimum conductance values and on the step size of the conductance change [118]. The number of effective conductance states is defined as the number of conductance states whose change exceeds a threshold ∆G (Figure 8C) [13]. For example, if 36 out of 100 ∆G points do not exceed the threshold ∆G, the number of effective conductance states becomes 64 out of 100.
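As an illustration of how nonlinearity reduces the number of effective conductance states, the sketch below counts the conductance steps that exceed a threshold ∆G for an ideal linear curve and for a saturating curve. The exponential saturation model, the parameter `a`, and the expression of the threshold as a fraction of the total conductance range are assumptions made for this example; they are not the weight update formula of [118].

```python
import math

def effective_states(conductances, threshold_fraction):
    """Count pulse-to-pulse conductance steps larger than the threshold.

    Assumption: the threshold is threshold_fraction * (Gmax - Gmin).
    """
    g_range = max(conductances) - min(conductances)
    threshold = threshold_fraction * g_range
    steps = [abs(b - a) for a, b in zip(conductances, conductances[1:])]
    return sum(1 for step in steps if step > threshold)

def linear_curve(n_pulses):
    """Ideal potentiation: equal conductance step for every pulse."""
    return [k / n_pulses for k in range(n_pulses + 1)]

def saturating_curve(n_pulses, a):
    """Hypothetical nonlinear potentiation: large early steps, then saturation."""
    norm = 1.0 - math.exp(-n_pulses / a)
    return [(1.0 - math.exp(-k / a)) / norm for k in range(n_pulses + 1)]

n_linear = effective_states(linear_curve(100), 0.003)
n_saturating = effective_states(saturating_curve(100, a=20.0), 0.003)
print("effective states, linear curve:    ", n_linear)
print("effective states, saturating curve:", n_saturating)
```

For the linear curve every one of the 100 steps counts as an effective state, while for the saturating curve the late steps fall below the threshold and are lost.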
Compared with ANNs built from electronic memristive devices, ANNs with photonic memristors show great potential for building artificial visual systems. The human visual system mainly comprises the eyes, the lateral geniculate nucleus (LGN), and the visual cortex [14]. The retina first captures the light and pre-processes the information; the extracted information is then delivered through the optic nerves and processed in the visual cortex. Cones in the human eye provide colour vision by absorbing the spectral radiation according to its wavelength (red, green, and blue). Like the human visual system, photonic synapses not only respond directly to optical stimuli but are also equipped with data storage and visual information processing capabilities. For the development of artificial visual systems, photonic ANNs show great potential to enable new realms in image sensing, image memorization, colour differentiation, and real-time pre-processing, and further to reduce the amount of hardware and the power consumption. 2DM-based photonic synapses have been introduced into the neuromorphic computing area and have successfully achieved image pre-processing [14] and coloured pattern recognition [13]. Figure 9A shows the schematic of a MoOx photonic RRAM, which can be arranged in a crossbar array to enable an image contrast enhancement function [14]. As shown in Figure 9B, the photonic array pre-processes the image to enhance its contrast before it is input to the ANN. A preliminary test is performed by applying repeated optical stimuli of different power to different pixels. Because the currents of pixels stimulated with lower-intensity pulses decay faster than those of pixels stimulated with higher-intensity pulses, the contrast of the output image increases over time. In this way, the main features of an image are highlighted.
The simulation therefore integrates the pre-processing function before carrying out image recognition with an ANN. An image database consisting of 6 × 7 pixel images of the letters P, U, and C is adopted as the training and testing samples. A neural network with an input layer (42 neurons), a hidden layer (20 neurons), and an output layer (3 neurons) was established in the simulation for image recognition. Figure 9C compares the input images before and after the pre-processing. The body features of the letters are clearly highlighted after pre-processing by the photonic RRAM array. As a result, the recognition efficiency improves compared with the ANN without image pre-processing: the recognition rate with image pre-processing reaches 0.986 after 1,000 training epochs, whereas it reaches only 0.980 after 2,000 training epochs without it.
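The contrast enhancement by intensity-dependent decay can be sketched with a toy model. It assumes, as a simplification that is not taken from [14], that both the initial photocurrent and the decay time constant of a pixel scale linearly with the optical power; the scaling factors `i_per_power` and `tau_per_power` are hypothetical.

```python
import math

def pixel_current(power, t, i_per_power=1.0, tau_per_power=10.0):
    """Photocurrent of one pixel at time t after the optical stimuli stop.

    Assumptions (hypothetical): initial current and decay time constant
    both grow linearly with optical power, so dim pixels decay faster.
    """
    i0 = i_per_power * power
    tau = tau_per_power * power
    return i0 * math.exp(-t / tau)

def contrast(bright_power, dim_power, t):
    """Ratio of bright-pixel to dim-pixel current at time t."""
    return pixel_current(bright_power, t) / pixel_current(dim_power, t)

c0 = contrast(1.0, 0.5, t=0.0)    # contrast right after illumination
c1 = contrast(1.0, 0.5, t=10.0)   # contrast after some relaxation time
print("contrast at t = 0: ", c0)
print("contrast at t = 10:", c1)
```

Because the dim pixel relaxes faster, the ratio between the two currents grows with time, so reading the array after a delay yields a higher-contrast image than reading it immediately.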
Another interesting function achieved by ANNs based on 2DM photonic synapses is the recognition of colored and color-mixed patterns [13]. Figure 9D shows the schematic of the optic-neural synaptic (ONS) device, which connects an optical-sensing device and a synaptic device in series on the same h-BN/WSe2 heterostructure. The optical-sensing device is exposed to red (655 nm), green (532 nm), and blue (405 nm) light under a constant pre-synaptic drain bias, and the post-synaptic current is measured by applying a voltage pulse to the synaptic device. The optical stimuli decrease the resistance of the optical-sensing device while simultaneously increasing the density of carriers trapped in the charge trapping layer of the synaptic device. This in turn allows the synaptic dynamic properties of the ONS device to be adjusted. Under 600 excitatory and 600 inhibitory pulses, the ONS device achieves 599 effective conductance states at a 0.3% threshold ∆G and a nonlinearity of 1.5 (potentiation)/1.5 (depression) with a pulse amplitude of ±0.3 V. The ONS device thus provides good linearity and a sufficient number of states for deep neural networks, which require 64-256 states.
Based on the dynamic response of the ONS device, an artificial optical neural network (ONN) is built using the extracted device parameters and a simple perceptron network model. The effectiveness of this system is verified by applying the ONN to colored and color-mixed pattern recognition tasks. As shown in Figure 9E, two neural networks are developed for recognizing the target colored number within complex color-mixed numerical digits (similar to a color-blindness test): in one, the synaptic functions are performed by a conventional neural network, while in the other they are performed by the ONN. The ONN constructed with ONS synaptic devices is equipped with optical-sensing functions. In the simulation, the MNIST (Modified National Institute of Standards and Technology) dataset is modified to generate color-mixed numerical patterns, as shown in Figure 9F. The objective of the pattern recognition task is to recognize "1" and "4" in the color-mixed test images, where the target numbers are buried in the color-mixed patterns. In the input layer, each modified image has 28 × 28 pixels, which are reshaped into a 784 × 1 vector. Different voltages are applied to the input neurons to represent the RGB colours (red = 1 V, green = 0.5 V, blue = 0.3 V). In the conventional neural network (Figure 9G, left panel), the black weight connections represent synaptic devices with the LTP/LTD characteristics measured under dark conditions. In the ONN (Figure 9G, right panel), the synaptic connections have LTP/LTD characteristics that depend on the RGB illumination, as expressed by the RGB-coloured lines. Finally, the currents obtained at the output neurons, given by the matrix product of the input signal and the synaptic weights (W), yield the output neuron signals (f). The delta value (δ), which is the difference between the output neuron signals and the label values, then guides the synaptic weight update until a satisfactory recognition accuracy is reached.
Finally, the pattern recognition rate of the ONN exceeds 90% after the 50th epoch, while that of the conventional NN stays below 40% (Figure 9G). These results confirm the effectiveness of the ONS system in implementing colored and color-mixed pattern recognition tasks.
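The training loop described above (matrix product of input voltages and weights, followed by a weight update guided by δ) can be sketched as a single-layer perceptron trained with the delta rule. This is a minimal illustration, not the simulation of [13]: the colour-to-voltage mapping is the one quoted in the text, whereas the 4-pixel patterns, the sigmoid activation, the learning rate, and the ideal (perfectly linear) weight update are hypothetical simplifications; device-specific LTP/LTD nonlinearity is not modelled.

```python
import math

# Input voltages per colour, as quoted in the text (in volts).
COLOR_VOLTAGE = {"R": 1.0, "G": 0.5, "B": 0.3}

def encode(pattern):
    """Map a string of colour letters to a vector of input voltages."""
    return [COLOR_VOLTAGE[c] for c in pattern]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(weights, x):
    """Output signals f: activation of the weighted sum at each output neuron."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def train(samples, n_outputs, epochs=50, lr=0.5):
    """Delta-rule training: each weight moves by lr * delta * input."""
    n_inputs = len(samples[0][0])
    weights = [[0.0] * n_inputs for _ in range(n_outputs)]
    for _ in range(epochs):
        for x, label in samples:
            f = forward(weights, x)
            for j in range(n_outputs):
                target = 1.0 if j == label else 0.0
                delta = target - f[j]   # label value minus output signal
                for i in range(n_inputs):
                    weights[j][i] += lr * delta * x[i]
    return weights

def predict(weights, x):
    """Index of the output neuron with the largest signal."""
    f = forward(weights, x)
    return f.index(max(f))

# Hypothetical 4-pixel "images"; the text uses 784-pixel modified MNIST digits.
samples = [(encode("RRGB"), 0), (encode("GBRR"), 1)]
W = train(samples, n_outputs=2)
print([predict(W, x) for x, _ in samples])
```

In a hardware-aware simulation, the ideal update `lr * delta * x[i]` would be replaced by the measured LTP/LTD response of the synaptic device, which is where the linearity and the number of effective conductance states enter.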

Summary and Outlook
Herein, we reviewed the recent progress on photonic memristor devices based on emerging 2DMs beyond graphene. The fundamental resistive switching mechanisms in different device configurations based on different types of 2DMs were discussed. The fabrication methods, performance metrics (including light absorption range, switching voltage, and switching ratio), and applications were summarized. Over the past years, significant progress has been achieved in demonstrating their potential through the fabrication of photonic memristors and photonic synapses with different materials and structures. As seen from Table 1, in the vertical configuration, photonic RRAM can be created from CVD-grown or solution-processed MoS2. Lateral structures such as FETs, FG-FETs, and vdW heterostructures have been fabricated on MoS2, WSe2, and BP through mechanical exfoliation and transfer techniques, without lattice mismatch concerns. The two-terminal vertical structure benefits from superior on-chip integration capability, while the three-terminal lateral transistor configuration is more appealing due to (1) the abundant material options available, (2) a wide light absorption range from UV to infrared, (3) more linear potentiation/depression characteristics, and (4) the number of available conductance states.
However, challenges remain in enhancing their performance to meet the demands of practical applications. Firstly, the device performance is still far from meeting the ITRS (International Technology Roadmap for Semiconductors) requirements for non-volatile memories, such as low operating voltages of <1 V, low power consumption of ∼10 pJ/transition, high operating speed of less than 10 ns/transition, and high endurance of >10^9 cycles. While 2DMs are extremely suitable for fabricating next-generation memristors from the scaling perspective, efforts are still needed to further reduce the switching voltage and switching time and to improve the switching ratio, retention time, and endurance. Secondly, most studies focus on simple prototype demonstrations that use different 2DMs or programming/erasing processes rather than on optimizing the device performance. Access to controllable high-quality material growth, a scalable fabrication process, and device variation engineering are all indispensable. Thirdly, while the photonic memristor concept has been demonstrated, there is still a lack of effort in scaling up for commercial use [119]. Many memristive materials and production methods are not compatible with standard silicon CMOS processes, which prevents the production of devices in large numbers and with sufficient quality. Overcoming these challenges is crucial for commercialization.
Moreover, although significant advances have been achieved in building ONNs for pattern recognition tasks, the use of 2DMs for photonic ANNs is still in its infancy. As mentioned above, many challenges remain in building large-scale optoelectronic neural networks for practical application in future neuromorphic computing. In the late 1980s, Carver Mead made the first attempt to use electronic analog circuits to mimic the neural networks of the brain [3]. In 1985, Farhat et al. proposed the first optoelectronic neural network [9]. In their proposed architecture, the input vector (represented by a light-emitting diode, LED, array) is interconnected to a weight matrix to carry out a one-dimensional matrix-vector operation [9]. The output vector is then detected by a photodiode array for amplification and thresholding before being fed back for the next iteration [9]. Since then, extensive effort has been made to emulate BNNs by building ANNs or ONNs. Recently, a fully functioning all-optical neural network (AONN) was demonstrated, in which linear operations are programmed by spatial light modulators and Fourier lenses, while nonlinear optical activation functions are realized in laser-cooled atoms with electromagnetically induced transparency [12]. Even though the development of 2DM-based ONNs is still at an early stage, recent advances are closing that gap by integrating optical sensing functions into conventional ANNs. With continued efforts dedicated to tackling these issues and the rapid ongoing progress in 2DM research, high-performance photonic memristors and synaptic devices based on 2DMs can be expected in the years to come.