High-rate laser processing with ultrashort laser pulses by combination of diffractive elements with synchronized galvo scanning

Abstract: The combination of diffractive optical elements or spatial light modulators with fully synchronized galvo scanners offers a possibility to scale up machining processes with ultrashort pulses to several 100 W of average power with minimal thermal impact. This is demonstrated with two high-rate applications, multi-pulse drilling on the fly and material removal with special intensity distributions, up to an average power of 162 W and a removal rate of 16.5 mm³/min. Based on the experimental results, strategies to achieve drilling rates of several 10,000 holes/s or removal rates of multiple 10 mm³/min are discussed.


Introduction
Since the first demonstrations of its high machining quality [1][2][3], ultrashort pulsed laser processing has found its way into different applications, not only for micromachining and metals but also for nanoprocessing and materials like semiconductors, glasses, and plastics, as e.g. summarized in [4]. In this work we concentrate on metals, but most results will be applicable to other materials as well.
For industrial applications, the total cost of a process often represents the limiting factor for a wider use of ultrashort pulsed laser systems. Among other factors, throughput is key for the successful implementation of this technology, and throughput not only demands an optimized process but also scales directly with the average power of the laser system. Thus, in the past, higher average powers were often demanded. Nowadays, industry-ready ultrashort pulsed laser systems capable of 24/7 operation offer average powers up to 200 W, whereas developments in research go beyond the kW level. E.g. in 2018, an average power of 3.5 kW was demonstrated for coherently combined ultrafast fiber lasers with 430 fs pulse duration and 80 MHz repetition rate [5], a value which was recently raised to 10.4 kW of average power [6], resulting in a pulse energy of about 130 µJ with an even shorter pulse duration of 254 fs. Much higher pulse energies at lower repetition rates are achieved with disc amplifiers, e.g. in [7] a pulse energy of 97.5 mJ at a repetition rate of 2 kHz was demonstrated for pulses with 1 ps pulse duration. High average powers are also achieved with the Innoslab technology [8], where already in 2010 an average power of 1.1 kW was demonstrated for a repetition rate of 20 MHz and a pulse duration of 615 fs [9], and recently 530 W for a pulse duration of 30 fs at a repetition rate of 500 kHz was demonstrated [10]. Thus, missing average power will not be an issue in the future; the challenge will rather be to deal with it while keeping the high machining quality, as will be shown in the following sections.
Demands and limits for high-rate laser processing

Ablation process
Following the logarithmic ablation law [1,11,12], the ablation depth $z_{abl}$ depends logarithmically on the fluence, $z_{abl} = \delta \cdot \ln(\phi/\phi_{th})$, with $\delta$ the energy penetration depth and $\phi_{th}$ the threshold fluence. Depending on the pulse duration and wavelength, the ablation depth can show different regimes having different threshold fluences and energy penetration depths, usually a low fluence regime and a regime with moderate fluences, where $\delta$ denotes the optical penetration depth and the diffusion length of the hot electrons, respectively [11]. Ultrashort pulsed systems generally emit a beam of very good quality near $M^2 \approx 1$, i.e. a Gaussian beam with the local fluence

$$\phi(r) = \phi_0 \cdot e^{-2r^2/w^2}, \qquad \phi_0 = \frac{2E_p}{\pi w^2} = 2\bar{\phi} \tag{1}$$

with $r$ the distance to the beam center, $w$ the spot radius, $E_p$ the pulse energy, $\phi_0$ the peak fluence in the beam center, and $\bar{\phi}$ the average fluence.
Assuming a single threshold model and, first, a top-hat intensity distribution (constant intensity over the whole spot) with a spot area $A$, the removed volume per pulse reads:

$$\Delta V = A \cdot \delta \cdot \ln\left(\frac{\bar{\phi}}{\phi_{th}}\right), \qquad \bar{\phi} = \frac{E_p}{A} \tag{2a}$$

For a Gaussian beam, the volume per pulse is deduced by introducing (1) into the logarithmic ablation law and reads [13][14][15]:

$$\Delta V = \frac{\pi w^2 \delta}{4} \cdot \ln^2\left(\frac{\phi_0}{\phi_{th}}\right) \tag{2b}$$

For the energy-specific volume, i.e. the removed volume per energy, representing the efficiency of the ablation process, one gets:

$$\frac{dV}{dE} = \frac{\delta}{\bar{\phi}} \cdot \ln\left(\frac{\bar{\phi}}{\phi_{th}}\right) \quad \text{(top hat)} \tag{3a}$$

$$\frac{dV}{dE} = \frac{\delta}{2\phi_0} \cdot \ln^2\left(\frac{\phi_0}{\phi_{th}}\right) \quad \text{(Gauss)} \tag{3b}$$

The normalized energy-specific volume (in terms of $\delta/\phi_{th}$) as a function of the relative average fluence ($\bar{\phi}/\phi_{th}$) is shown in Figure 1. For both intensity distributions (Gauss and top hat) the energy-specific volume shows a maximum value, i.e. the ablation process is most efficient for an optimal fluence. A short calculation leads to:

$$\bar{\phi}_{opt} = e \cdot \phi_{th}, \qquad \frac{dV}{dE}\bigg|_{max} = \frac{1}{e} \cdot \frac{\delta}{\phi_{th}} \quad \text{(top hat)} \tag{4a}$$

$$\phi_{0,opt} = e^2 \cdot \phi_{th}, \qquad \frac{dV}{dE}\bigg|_{max} = \frac{2}{e^2} \cdot \frac{\delta}{\phi_{th}} \quad \text{(Gauss)} \tag{4b}$$

Both measures, the threshold fluence $\phi_{th}$ as well as the energy penetration depth $\delta$, can be subject to the incubation effect [14,16-18], which lowers their values with the number of pulses applied. However, for the applications discussed here, the number of pulses per area is high and the values for the threshold fluence and energy penetration depth can therefore be considered constant [17,18]. Figure 2 shows the energy-specific volume as a function of the peak fluence (Gaussian beam) for the six metals copper, brass, steel AISI 304, nickel, silver, and gold, machined with 10 ps pulses with a wavelength of 1064 nm at a repetition rate of 200 kHz. As can be seen from the curves, the ablation behavior of the metals follows the presented model (3b) for the energy-specific volume well. Only in the case of steel AISI 304 is there a deviation between the measured energy-specific volume and the model for higher fluences above the optimum value. This deviation is caused by the formation of cavities [19], also referred to as cone-like protrusions (CLP) in literature [20], appearing when peak fluences above about two times the optimum value are applied, as illustrated in Figure 3.
These cavities start to grow and finally cover the whole machined surface, leading to a reduction in the energy-specific volume. This cavity formation is even more pronounced for fs pulses [21].

Figure 1: Normalized energy-specific volume as a function of the relative average fluence for a top-hat (blue) and a Gaussian (orange) beam following (3a) and (3b). Note that the Gaussian beam ablation starts at a relative fluence of 0.5, as the peak fluence amounts to twice the average value. For both intensity distributions, the energy-specific volume shows a maximum value at an optimum value of the relative fluence (dashed lines).
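The existence of an optimum fluence can be checked numerically. The following minimal sketch assumes the standard single-threshold forms of the energy-specific volume, ln(x)/x for a top-hat beam and ln²(2x)/(4x) for a Gaussian beam, both in units of δ/ϕ_th with x the relative average fluence; the function names and the brute-force grid search are illustrative choices only.

```python
import math

def specific_volume_top_hat(rel_avg_fluence):
    """Normalized energy-specific volume (units of delta/phi_th), top-hat beam.

    rel_avg_fluence is the average fluence divided by the threshold fluence.
    """
    if rel_avg_fluence <= 1.0:
        return 0.0  # below threshold: no ablation
    return math.log(rel_avg_fluence) / rel_avg_fluence

def specific_volume_gauss(rel_avg_fluence):
    """Same quantity for a Gaussian beam; the peak fluence is twice the average."""
    rel_peak = 2.0 * rel_avg_fluence
    if rel_peak <= 1.0:
        return 0.0  # ablation starts at a relative average fluence of 0.5
    return math.log(rel_peak) ** 2 / (2.0 * rel_peak)

def find_optimum(model, lo=0.5, hi=20.0, steps=19500):
    """Brute-force grid search for the relative fluence with maximum efficiency."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    best = max(grid, key=model)
    return best, model(best)

opt_top_hat, max_top_hat = find_optimum(specific_volume_top_hat)
opt_gauss, max_gauss = find_optimum(specific_volume_gauss)
print(f"top hat : optimum at {opt_top_hat:.2f} phi_th, maximum {max_top_hat:.3f} delta/phi_th")
print(f"Gaussian: optimum at {opt_gauss:.2f} phi_th, maximum {max_gauss:.3f} delta/phi_th")
```

Under these assumptions the search should land close to the analytical optima, an average fluence of e·ϕ_th for the top hat and a peak fluence of e²·ϕ_th (average e²/2·ϕ_th) for the Gaussian beam.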
The influence of the pulse duration on the energy-specific volume was extensively investigated in the last decade [19,21-25]. For metals in general, an increasing maximum specific removal rate is observed for shorter pulse durations. A tremendous drop occurs when the pulse duration is increased from about 5 ps to several hundreds of ps, followed by a slight increase when the pulse duration is further raised to 4 ns, as illustrated in Figure 4 for wavelengths in the near-infrared (NIR). This dependence of the maximum energy-specific volume on the pulse duration is much less pronounced for UV radiation and pulse durations ranging from 400 fs to 14 ps, as shown in [25]. Further, for short wavelengths (visible or UV) and/or short pulse durations, the ablation behavior can show two different regimes having different threshold fluences and energy penetration depths. In this case, an adapted and more sophisticated model considering the two regimes can lead to a better agreement with the experimentally deduced values [21].

Figure 2: Energy-specific volume for copper, brass, steel AISI 304, nickel, silver, and gold machined with 10 ps pulses, a wavelength of 1064 nm, a spot radius of w = 15.5 µm, and a repetition rate of 200 kHz. All metals except steel AISI 304 follow the presented model (blue curves) for the energy-specific volume and a Gaussian beam. The given threshold fluences and energy penetration depths were deduced by least-squares fits with the model function.

Figure 3: Microscope images (optical microscope: upper line, scanning electron microscope: lower line) of steel AISI 304 surfaces machined with 10 ps pulses, a wavelength of 1064 nm, a spot radius of w = 15.5 µm, and a repetition rate of 200 kHz with the optimum peak fluence (left), 2.5 times the optimum fluence (middle), and 10 times the optimum fluence (right).

Beam guiding concepts
Line scanning with single pulses

Line scanning with galvo scanners
Surface structuring and engraving are usually realized as a 2.5D process where the structure is sliced and divided into several layers. For each layer, the area to be removed is filled with a hatch, i.e. in general only straight lines are marked with a galvo scanner. To prevent deep markings at the line start and line end, the laser is usually switched on only when the demanded marking speed $v_{mark}$ is reached, a method often denoted as skywriting. In conventional mode, the scanner trajectory consists of many acceleration/deceleration, jump, and mark vectors as illustrated in Figure 5a, with the jump vectors again consisting of acceleration/deceleration phases and a phase with constant speed. Depending on the pattern, this method can become rather slow. In the case of shifted laser surface texturing [26,27], the beam is guided with a constant speed and the laser pulse train is switched on and off, as illustrated in Figure 5b. This mode allows taking benefit from the optimum scanner speed leading to the shortest time for a line of given length [28]. Depending on the exact pattern, this mode can lead to a massive reduction of the processing time. However, as the laser pulse train is usually not synchronized to the movement of the galvo scanner, a jitter is observed at the line start and the line end. A significantly higher accuracy, while keeping the benefits of shifted laser surface texturing, can be achieved by synchronizing the galvo scanner and the laser pulse train [28][29][30] as illustrated in Figure 5c. In this case, the structuring information can be coded in a b/w bitmap where e.g. white pixels mean laser on and black pixels laser off.
In any case, the marking speed $v_{mark}$, the spot-to-spot distance $p_x$, often called pitch, and the repetition rate $f_{rep}$ of the laser pulse train are linked together by:

$$v_{mark} = p_x \cdot f_{rep} \tag{5}$$

In [28] it is shown that for a given line length and maximum acceleration an optimum marking speed leading to the shortest processing time exists, i.e. to fully optimize the processing time the repetition rate of the laser system should be adapted to the line length if the pitch $p_x$ is fixed. For longer lines, the deduced optimum speed can exceed the maximum speed of the scanning system, and then the latter is used. A good machining quality with low surface roughness is usually achieved with a pitch corresponding to ¼ to ½ of the spot diameter $2w$, i.e. an overlap $o = 1 - p_x/(2w)$ of 50-75% [30]. To further optimize the processing time, the line length should be dynamically adapted to the pattern [31]; a full optimization of the processing time would thus demand a dynamic adaption of the laser repetition rate from line to line, a feature which is not offered by today's ultrashort pulsed laser systems. Therefore, the repetition rate is set to a fixed value leading to a short processing time, taking the maximum speed of the scanner system into account.
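The link between marking speed, pitch, and repetition rate can be illustrated with a short calculation. This is a sketch only: it assumes the overlap is defined with respect to the spot diameter, o = 1 − p_x/(2w), and the function names are hypothetical. The spot radius of 16 µm and the marking speed of 30 m/s are example values.

```python
def pitch_um(spot_radius_um, overlap):
    """Spot-to-spot distance p_x, assuming the overlap o = 1 - p_x / (2 w)."""
    return 2.0 * spot_radius_um * (1.0 - overlap)

def repetition_rate_hz(mark_speed_m_s, pitch_in_um):
    """Repetition rate required so that v_mark = p_x * f_rep holds."""
    return mark_speed_m_s / (pitch_in_um * 1e-6)

# Example values: spot radius 16 um, marking speed 30 m/s
for overlap in (0.50, 0.75):
    p_x = pitch_um(16.0, overlap)
    f_rep = repetition_rate_hz(30.0, p_x)
    print(f"overlap {overlap:.0%}: pitch {p_x:.1f} um, f_rep {f_rep / 1e6:.3f} MHz")
```

With a fixed repetition rate, the same relation can be read the other way round: the pitch and the repetition rate fix the marking speed the scanner has to deliver.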
Working at the optimum point with maximum efficiency (4) defines the pulse energy for a given spot radius. Considering a Gaussian beam and combining (4) and (5) with the pitch $p_x = 2w(1-o)$, this leads to an expression for the marking speed as a function of the threshold fluence $\phi_{th}$, the overlap $o$, the spot radius $w$, and the average power $P_{av}$ of the laser system:

$$v_{mark} = \frac{4 \cdot (1-o)}{\pi \cdot e^2} \cdot \frac{P_{av}}{w \cdot \phi_{th}} \tag{6}$$

Thus, the average laser power which can be applied in the optimum point with the highest efficiency is limited by the marking speed, defined either by the maximum speed of the scanner system or by the optimum speed with the lowest processing time. E.g. assuming a spot radius of w = 16 µm and a maximum marking speed of 30 m/s (which is almost the limit of today's galvo scanners with the assumed spot size), this results in an average power of 20 W and 40 W for an overlap of 50% and 75%, respectively, in the case of copper. For steel, the average power drops to 4 and 8 W due to the significantly lower threshold fluence. In the case of copper, the usable average power can principally be increased by working at higher peak fluences. This leads to a reduction in the energy-specific volume, but also to a higher applied average power going with a higher removal rate. Working at a fluence corresponding to $n$ times the optimum value (4b), a short calculation for a Gaussian beam leads to

$$\frac{dV}{dE}\bigg|_{n \cdot \phi_{0,opt}} = \frac{1}{n} \cdot \left(1 + \frac{\ln n}{2}\right)^2 \cdot \frac{dV}{dE}\bigg|_{max} \tag{7}$$

But this reduced energy-specific volume is obtained at an $n$-times higher average power, and therefore the removal rate $\dot{V}$ increases compared to the one at the optimum point (for the identical repetition rate):

$$\frac{\dot{V}(n \cdot \phi_{0,opt})}{\dot{V}(\phi_{0,opt})} = \left(1 + \frac{\ln n}{2}\right)^2 \tag{8}$$

E.g. for a fluence that is five times higher than the optimum value, the energy-specific volume is reduced to 65% of its optimum value, but the achieved removal rate amounts to 3.25 times the one obtained for the optimum fluence. If the machining quality is not affected by choosing a higher fluence, this is a method to deal with higher average powers. But e.g. for steel, increasing the peak fluence strongly affects the machining quality as illustrated in Figure 3. Therefore, the average power which can be applied is still limited to values below 10 W (spot radius of w = 16 µm, maximum marking speed 30 m/s), and higher marking speeds are highly demanded for higher average powers.
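The trade-off between efficiency and removal rate at elevated fluence can be reproduced with a few lines. The sketch below assumes the Gaussian-beam relations of the single-threshold model: an efficiency factor (1 + ln(n)/2)²/n at n times the optimum peak fluence, and a pitch of 2w(1 − o). The function names are hypothetical, and the threshold fluence used in the power example is a placeholder, not a value taken from the text.

```python
import math

def relative_specific_volume(n):
    """Energy-specific volume at n times the optimum peak fluence, relative to its maximum."""
    return (1.0 + 0.5 * math.log(n)) ** 2 / n

def relative_removal_rate(n):
    """Removal rate at n times the optimum fluence relative to the optimum point (same f_rep)."""
    return n * relative_specific_volume(n)

def max_average_power_w(mark_speed_m_s, spot_radius_um, overlap, phi_th_j_cm2):
    """Average power usable at the optimum point for a given marking speed.

    Assumes a Gaussian beam, optimum peak fluence e^2 * phi_th and pitch 2 w (1 - o).
    """
    v_cm_s = mark_speed_m_s * 100.0
    w_cm = spot_radius_um * 1e-4
    return v_cm_s * math.pi * w_cm * math.e ** 2 * phi_th_j_cm2 / (4.0 * (1.0 - overlap))

print(f"n = 5: efficiency {relative_specific_volume(5):.2f}, removal rate x{relative_removal_rate(5):.2f}")

PHI_TH_PLACEHOLDER = 0.35  # J/cm^2, illustrative value only, not taken from the text
for overlap in (0.50, 0.75):
    power = max_average_power_w(30.0, 16.0, overlap, PHI_TH_PLACEHOLDER)
    print(f"overlap {overlap:.0%}: about {power:.0f} W usable at the optimum point")
```

Independent of the chosen threshold, halving the pitch (50% to 75% overlap) doubles the usable average power at a fixed marking speed.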

Line scanning with polygon scanners
Higher marking speeds can be achieved with polygon line scanners. Indeed, speeds up to $v_{mark}$ = 1000 m/s with a spot radius of w = 29 µm and $v_{mark}$ = 2000 m/s with a spot radius of w = 37.5 µm were e.g. demonstrated in [32,33]. These high speeds principally allow the use of higher average powers, but there is another effect which has to be taken into account. The maximum line length $l_{max}$ is given by the number of facets of the polygon and the used objective. But due to the limits of this objective and the transition between the polygon facets, the effective marking length $l_{eff}$ is much shorter. This defines the maximum facet utilization rate or duty cycle $\eta_{f,max} = l_{eff}/l_{max}$ [33], which is in the range of 50% for the polygon scanners discussed above. If the real processing length $l_{proc}$ is even shorter than $l_{eff}$, the facet utilization rate is further reduced. Nevertheless, this group has demonstrated maximum effective area processing rates of 1.50 m²/min with fs pulses having an average power of 416 W and a repetition rate of 40 MHz for surface roughening to increase the static friction coefficient [34]. For the generation of riblets having a spacing of 121 µm, a depth of 62 µm, a tip width <19 µm, and a tip angle of α = 30° in an aluminum alloy (EN-AW 5005A), an effective area processing rate of about 6 cm²/min was demonstrated [35]. A duty cycle of almost 100% was demonstrated in digital printing applications by distributing the laser beam between two polygons, each having a duty cycle slightly below 50% [36].

Figure 5: Illustration of the skywriting (a), shifted laser surface texturing (b), and synchronized scanning (c) methods for the identical pattern. The spot-to-spot distance is denoted by $p_x$, whereas the line-to-line distance is $p_y$. Skywriting consists of many mark, jump, and acceleration vectors, which may lead to rather high processing times. In the case of skywriting (a) and shifted laser surface texturing (b), the movement of the galvo mirror is not synchronized to the laser pulse train, leading to jitter in the line start position and finally in the pattern; the highest accuracy is therefore achieved in the synchronized mode (c).
A polygon system having a duty cycle of 71%, reaching marking speeds up to 100 m/s, having an effective line length of $l_{eff}$ = 170 mm, and a spot radius of w = 22.5 µm for NIR was presented in [37,38]. Its applicability for scribing in CIGS solar cells and for 2.5D structuring applications was investigated in [39][40][41]. Steel AISI 304 was machined with 43.5 W of average power at a repetition rate of 6.83 MHz with 10 ps pulses having a wavelength of 1064 nm with the highest surface quality. Surface texturing applications with 400 fs pulses in IR and green are e.g. presented in [42].
Thus, due to the achievable high speeds, polygon line scanners seem to offer opportunities to scale up laser ablation processes into the multi-hundred-watt regime. But scaling can be limited by effects appearing at high repetition rates and high average power. Heat accumulation can lead to bumpy surfaces, as described in [43] for steel AISI 304 machined with 6 ps pulses. For line scanning, it was shown that a low-quality bumpy surface with high roughness appears if the surface temperature just before the next pulse impinges on the surface exceeds 610 °C. Simulations following the analytical models presented in [43] showed that a scale-up with spot sizes w ≈ 20 µm and an overlap between 50 and 75% is limited to several tens of W of average power before bumpy surfaces would appear. A scaling above 100 W average power is principally possible by enlarging the pitch and switching to an interlaced mode as illustrated in Figure 6. But this would require marking speeds of several 100 m/s, laser repetition rates of several tens of MHz, single-pulse switching capabilities at these high repetition rates, as well as synchronization of the line start with the laser pulse train, all representing rather demanding tasks.
For metals like copper, gold, or silver, the threshold fluence is significantly higher compared to steel, which reduces the repetition rate due to higher pulse energies at the optimum point. Additionally, a further increase of the pulse energy, finally leading to higher usable average powers following (8), does not strongly affect the surface quality. Thus, a scale-up to average powers of several hundreds of watts seems to be possible. But there is another effect hindering this scale-up process, the plasma and particle shielding [44]. If the time separation between two pulses is too short, the second pulse can be partially or fully absorbed by the plasma and particles produced by the previous pulse and can even redeposit already removed material, as e.g. demonstrated in [45][46][47], supported by numerical simulations [48], and e.g. used for thrust enhancement and propellant conservation [49].
The power scale-up process to average powers slightly above 300 W was investigated in [50] for 3 ps pulses having a wavelength of 1030 nm and a fast polygon offering marking speeds up to 480 m/s. It was shown that already for an average power of 62 W heat accumulation represents an issue in the case of steel AISI 304, and that it can be completely suppressed by decreasing the overlap from 85 to 12.5% for a repetition rate of 10.15 MHz. However, for higher average powers the maximum polygon speed was too low and bumpy surfaces were always obtained. Bumpy surfaces on steel machined with average powers of 100 W and higher were also obtained in [32,33]. In the case of copper, the maximum energy-specific volume dropped by more than 50% when the repetition rate was increased from 2 to 40 MHz. Similar reductions were also observed for brass. Nevertheless, for copper, a removal rate of 40 mm³/min was achieved at a repetition rate of 5 MHz, an average power of 306 W, and a peak fluence of about 4 J/cm², which is above its optimum value. In the case of brass, a similar removal rate of 41 mm³/min was demonstrated for an average power of 243 W, also at a repetition rate of 5 MHz. The exact influence of the overlap remains partially unclear, and further investigations would be needed to clarify whether a reduced overlap could reduce the observed limitations due to shielding effects.
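To compare such removal rates with energy-specific volumes, the removal rate can be divided by the average power. The helper below is a plain unit conversion (mm³/min and W to µm³/µJ); the function name is hypothetical.

```python
def specific_volume_um3_per_uj(removal_rate_mm3_per_min, average_power_w):
    """Energy-specific volume in um^3/uJ from removal rate and average power."""
    um3_per_s = removal_rate_mm3_per_min * 1e9 / 60.0  # 1 mm^3 = 1e9 um^3
    uj_per_s = average_power_w * 1e6                   # 1 W = 1e6 uJ/s
    return um3_per_s / uj_per_s

copper = specific_volume_um3_per_uj(40.0, 306.0)  # copper at 5 MHz
brass = specific_volume_um3_per_uj(41.0, 243.0)   # brass at 5 MHz
print(f"copper: {copper:.2f} um^3/uJ, brass: {brass:.2f} um^3/uJ")
```

For the quoted values this gives roughly 2.2 µm³/µJ for copper and 2.8 µm³/µJ for brass.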

Energy splitting strategies
The previous considerations clearly showed that the power scale-up process with single pulses and line scanning is limited by heat accumulation and plasma/particle shielding. The interlaced mode would significantly reduce these problems but would demand extremely high marking speeds and repetition rates as well as synchronized scanning not available today. One resort is to split high pulse energy at moderate repetition rates among several spots or pulses to get the peak fluence nearer to its optimum value for a single spot.

Pulse bursts as temporal energy splitting strategy
In a burst, the single pulses in the pulse train are replaced by pulse packages including several pulses with a temporal distance much shorter than the one given by the repetition rate. This allows the repetition rate to be kept rather low while simultaneously reducing the energy of the single pulses in a pulse package, e.g. for an average power of 100 W at a repetition rate of 2 MHz and with 10-pulse bursts, the single pulses in a burst package would have an energy of only 5 µJ. For a spot radius of w = 20 µm this would result in a peak fluence of about 0.8 J/cm², which is near the optimum value for steel AISI 304 (see Figure 2). Bursts can also show additional benefits, such as an increased maximum energy-specific volume, e.g. in the case of a burst consisting of three pulses on copper [19,50,51] or for three pulses and more on silicon [52]. A further benefit is the polishing effect on steel which can be achieved when bursts are applied [19,53-56]. But negative effects are also observed, like a strong shielding effect for an even number of pulses per burst in the case of metals like copper, brass, gold, and silver [19,46,47,49]. Recently, bursts with very short intra-burst delays of 2 ns and less, so-called GHz bursts, gained a lot of interest due to reported high removal rates [57][58][59][60]. However, for a moderate number of pulses per burst, the energy-specific volume is tremendously lower than for single pulses [61][62][63][64] and approaches the values of ns pulses for a high number of pulses per burst [58,59]. A gain of about a factor of 2 in the ablation efficiency, going with increased melting effects as also observed for ns pulses, is reported in [56] for 500 MHz bursts of 10 ps pulses with eight pulses in a burst at a repetition rate of 2 MHz. Craters in silicon were machined with fs pulses and even shorter intra-burst delays of about 4000 ps to less than 1 ps, i.e. THz bursts with 2, 4, 8, and 16 pulses [65,66].
It is shown in [65] that for 2 THz bursts the measured energy-specific volumes exceed the values for single pulses for two pulses in the burst and, at high energies, also for four pulses. This finding could not be explained by the numerical simulations also presented in that work. A former publication [66] of this group shows that for an intra-burst delay of about 4 ps (corresponding to a 250 GHz burst) the crater depth is lower for all bursts and applied fluences compared to single pulses, whereas for an intra-burst delay of about 1 ns (1 GHz) the crater depth increases with the number of pulses per burst. Different burst regimes therefore lead to different behavior, which is valid not only for silicon but also for metals, as shown in a recent review about laser micromachining of metals with pulse bursts [67]. Thus, there are still many open questions to be solved.
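The burst example given earlier in this section (100 W, 2 MHz, 10 pulses per burst, w = 20 µm) can be verified with a short calculation. The sketch assumes an equal energy split among the pulses of a burst and a Gaussian beam with peak fluence 2E_p/(πw²); the function names are hypothetical.

```python
import math

def pulse_energy_uj(average_power_w, f_rep_hz, pulses_per_burst=1):
    """Energy of one pulse in uJ, assuming equal splitting within a burst."""
    return average_power_w / f_rep_hz / pulses_per_burst * 1e6

def peak_fluence_j_cm2(pulse_energy_in_uj, spot_radius_um):
    """Peak fluence of a Gaussian beam, phi_0 = 2 E_p / (pi w^2), in J/cm^2."""
    w_cm = spot_radius_um * 1e-4
    return 2.0 * pulse_energy_in_uj * 1e-6 / (math.pi * w_cm ** 2)

e_p = pulse_energy_uj(100.0, 2e6, pulses_per_burst=10)
phi_0 = peak_fluence_j_cm2(e_p, 20.0)
print(f"pulse energy {e_p:.1f} uJ, peak fluence {phi_0:.2f} J/cm^2")
```

Without the burst, the same average power and repetition rate would put ten times this fluence into a single pulse.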

Multi spot machining as spatial energy splitting strategy
Diffractive optical elements (DOEs) can be used to generate a pattern with a defined number of m × n spots, allowing parallel processing of identical structures in the dimension of the spot separation achieved by the DOE, as illustrated in Figure 7a). This technology has e.g. been applied in the machining or cutting of thin steel sheets [68,69] or in the machining of solar cells [70,71]. A specific scanner set-up including a relay optic for multi-beam micromachining applications is shown in [72] and extended with active elements for correcting beam distortions for two-dimensional and three-dimensional applications in [73,74].
Another interesting approach is presented in [75], where the beam of a 300 W and a 500 W high-power laser is divided into either 9 or 17 sub-beams by a DOE. Of these, 8 or 16 sub-beams could be individually controlled in their pulse energy by one or two multi-channel acousto-optic modulators (AOM). This set-up was then combined with a linear and a rotating axis for the rotogravure of embossing cylinders with high throughput and high average power.
Individually addressable multi spots can also be generated by a spatial light modulator (SLM), which additionally offers a dynamic change of the spot pattern compared to the fixed pattern obtained with a DOE. Scanning of different SLM-generated multi-beam patterns with a galvo scanner was e.g. demonstrated in [76]. Patterning with galvo scanners and an SLM with alternating holograms for generating different spot patterns or individually controlled intensities extends the flexibility of this method, as demonstrated in [77][78][79]. In [80] it is shown that with 20 spots the ablation efficiency still amounts to 90% of the value obtained for single spots, i.e. an 18 times higher removal rate at 20 times higher average power can be achieved in stainless steel.

Direct beam forming
Another method to work with higher pulse energies is the direct forming of the desired pattern. One method for regular patterns is direct laser interference patterning (DLIP), where the laser is divided into sub-beams that can be individually modulated, e.g. in polarization. An interference pattern is then generated by superimposing the sub-beams, whereas the size and periodicity of the pattern are defined by the relative angle of superposition of the sub-beams. This method is mainly used to generate functional surfaces [81], e.g. super-hydrophobic [82] and antibacterial [83] surfaces, or for generating surfaces with diffraction-based colors [82]. The DLIP method was extended for generating nonsymmetrical periodic microstructures [84] and can also be combined with a galvo scanner [85]. An SLM additionally offers the possibility to directly form the beam and to generate a desired pattern, as e.g. illustrated in [86][87][88][89]. As an SLM is a phase-only device, the beam forming directly goes with speckle generation, leading to a non-optimized intensity distribution. Using single-shot ablation with different holograms generated for the identical intensity distribution, the surface quality is significantly improved by temporal averaging of these speckle effects [90,91]. However, as the hologram can only be changed at a rate of a few tens of Hz, this method is not suited for high-power laser machining. Another approach was recently shown in [92], where a laser beam is split into two orthogonally linearly polarized sub-beams, each of which passes an SLM forming a specific pattern. The two sub-beams are combined again and superimposed by a polarizing beam splitter, leading to more homogeneous intensity distributions, e.g. more uniform top hats could be generated than with a single-SLM set-up.

Combining spatial multi-spot or beamforming with synchronized scanning
The aim of this work is to combine spatial multi-spot strategies or beam forming with synchronized galvo scanning [31] to demonstrate its potential for high-power and high-throughput applications. In the case of synchronized scanning, the optical axis of the original laser beam hits fully regular positions with the highest accuracy, as illustrated in Figure 5c. If a DOE or SLM is introduced in front of the scanner, the pattern is changed, but the regularity of the positions of the optical axis remains (just replace the spots in Figure 5c with an arbitrary beam pattern). This beam pattern is then shifted by a constant distance $p_x$ from pulse to pulse, which is given by the marking speed and the repetition rate of the laser (5). As the identical positions are also hit from layer to layer, it is possible to realize an optical stamping process on the fly with the minimum local thermal load. The beam pattern can also consist of multiple elementary cells as illustrated in Figure 7(b) and (c). If this beam pattern, e.g. Figure 7(b), is moved by one elementary cell from pulse to pulse (see Figure 8a-g), it finally forms a regular pattern (Figure 8h) where the outer area is machined with fewer pulses. The example shown in Figure 8 also represents a multi-pulse drilling on the fly process where in every second line only every second hole is drilled. The on the fly process, where the holes are drilled within several layers, leads to the lowest possible thermal impact. E.g. [72] shows an example of a thin foil where holes are drilled with 12 × 12 spots. In a standard drilling process, the optical axis is placed at the desired position, the holes are machined by percussion drilling, and the optical axis is then moved to the next position; this can lead to a thermal coloring of the foil in the center of the spot pattern due to heat accumulation effects.
Using the same spot pattern and synchronized scanning would help to avoid these negative effects, as shown by the following consideration: Let us assume that $n_p$ pulses are needed to drill a hole. For a single spot, the temperature rise in the center due to heat accumulation, $\Delta T_{HA}(n_p)$, can be estimated following [93] (equation (9)), where $Q$ is the residual energy per pulse, $\rho$ the density, $c_p$ the specific heat capacity, and $\kappa$ the thermal diffusivity. For n × n spots, the temperature rise $\Delta T_{HA,n}$ depends on the spacing between the spots and on the individual spot considered, but $\Delta T_{HA,n} \geq \Delta T_{HA}$ always holds, so assuming $\Delta T_{HA}$ as the temperature rise always underestimates the real heat accumulation. For multi-pulse drilling on the fly, each spot is hit with $n$ pulses at the identical repetition rate. The material is then assumed to cool down before the next $n$ pulses impinge on the same position when the pattern is moving in the opposite direction (as shown in Figure 8). Thus, the temperature rise $\Delta T_{HA,fly}$ in this situation can be approximated by replacing $n_p$ with $n$ in equation (9). This temperature is also underestimated, which compensates for the identical underestimation made by assuming $\Delta T_{HA,n} \approx \Delta T_{HA}$. Comparing the two temperature rises (equation (10)) shows that e.g. for $n_p$ = 1000 and $n$ = 12, the heat accumulation is about 20 times higher for the standard drilling process. This value increases to 40 for $n_p$ = 1000 and $n$ = 5, or to 25 for $n_p$ = 500 and $n$ = 5. Thus, multi-pulse drilling on the fly significantly reduces heat accumulation effects.
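The stamping principle can be illustrated with a small counting experiment. The sketch below is a simplification: it assumes a fully filled n × n spot pattern that is shifted by exactly one spot spacing from pulse to pulse and from line to line (not the sparser pattern of Figure 8), and it only counts how often each hole position is hit within one layer. All function names are hypothetical.

```python
from collections import Counter

def stamp_hits(n, pulses_per_line, lines):
    """Count the hits per hole position for one layer of on-the-fly stamping.

    Positions are given in units of the spot spacing; the optical axis moves by
    one spacing per pulse (x) and by one spacing per line (y).
    """
    hits = Counter()
    for line in range(lines):
        for pulse in range(pulses_per_line):
            for dy in range(n):
                for dx in range(n):
                    hits[(pulse + dx, line + dy)] += 1
    return hits

def hits_per_line_pass(n, pulses_per_line, x):
    """Number of consecutive pulses of a single line pass that hit column x."""
    return sum(1 for pulse in range(pulses_per_line) if 0 <= x - pulse < n)

layer = stamp_hits(n=3, pulses_per_line=10, lines=6)
print("interior hole:", layer[(5, 3)], "hits per layer")
print("corner hole  :", layer[(0, 0)], "hit per layer")
print("consecutive hits in one pass (interior):", hits_per_line_pass(3, 10, 5))
```

In this simplified picture an interior hole receives only n consecutive pulses per line pass before the pattern moves on, while the holes at the border of the processed area are hit less often, in line with the reduced pulse number in the outer area described above.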

Experimental set-up
The principal experimental set-up is sketched in Figure 9. The pulse energy of the linearly polarized beam of a laser system is controlled by a half-wave plate and a polarizer. The beam is further guided via folding mirrors onto a first telescope to adapt the beam size and then directed into a beam forming element located in front of the synchronized galvo scanner. Different combinations of laser systems, beam forming elements, and scanning systems have been used.

Set-up 1
The first set-up consisted of a FUEGO 10 ps laser system from Lumentum (former time-bandwidth-products), an SLM-based beam shaper FBS-G3 from Pulsar Photonics, and an excelliSCAN 14 from scanlab with a telecentric f = 100 mm objective. All experiments were performed with a wavelength of 1064 nm, a fixed repetition rate of 200 kHz, and a linear polarized beam with a beam quality of M 2 < 1.3. The polarization direction was controlled by an additional half-wave plate and the beam was enlarged to the beam shaper entrance diameter of 5 mm. Internally the beam was shaped using an SLM from Hamamatsu (HighRes, SXGA with 1280 × 1024 pixel). The maximal power on the workpiece amounted to 20 W going with pulse energy of 100 μJ at the used repetition rate of 200 kHz. The holograms for the desired spot pattern were calculated by using an  iterative Fourier transform algorithm IFTA [94]. Two applications were tested with this set-up first the multipulse drilling on the fly process and second the stamping of the Olympic rings. The on the fly drilling process was tested according to the strategy shown in Figure 8 with a regular n × n spot pattern where n was ranging from three to five.
The corresponding holograms could be directly calculated by the control software of the beam shaper unit. However, as can be seen for 5 × 5 spots in Figure 10(a), the resulting pattern is of low quality; it can be significantly improved by a training algorithm that uses the actual internal camera image, as shown in Figure 10(b). The holes were spaced by 90 μm and drilled in a 10 μm thin stainless steel foil (AISI304) placed in the focal plane of the objective. After processing, the foils were cleaned in an ultrasonic bath with isopropanol and analyzed using an optical microscope. For the stamping of the Olympic rings, an axicon-like phase mask was applied, leading to a ring of 270 μm diameter in the focal plane of the scanner as illustrated in Figure 10(c). The stamping process was realized on a 100 μm thick stainless steel foil (AISI304), again placed in the focal plane of the objective. The process is illustrated in Figure 11: the scanner moves the beam from left to right and back in synchronized mode. The single points denote the positions of the optical axis at which a pulse could be emitted. The spacing between these positions was set to 150 µm, which corresponds to a marking speed of v mark = 30 m/s at the repetition rate of 200 kHz. This speed cannot be reached with the present combination of scanner and objective. Therefore, the repetition rate was reduced to 100 kHz for these experiments, which also reduced the demanded marking speed to v mark = 15 m/s and the maximum average power to about 10 W. By blanking out the corresponding pulses with the AOM of the laser system, the so-called pulse on demand (PoD) option, three rings with a distance of 300 µm to each other were machined in the first line and another two rings 150 µm below in the second line. The third and fourth lines (denoted by light gray points in the figure) were skipped to reduce the processing time by a factor of two. By repeating this pattern for each of the multiple machined layers, the pattern is stamped into the steel foil.
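The hologram calculation mentioned for set-up 1 can be illustrated with a minimal Gerchberg-Saxton style sketch of an iterative Fourier transform algorithm. This is not the algorithm of [94] or of the beam shaper software; the function name, the grid size, the spot pitch in pixels, and the uniform illumination are assumptions chosen only to keep the example small.

```python
import numpy as np

def ifta_spot_hologram(n_spots=5, pitch_px=4, grid=64, iterations=30, seed=0):
    """Phase-only hologram for an n_spots x n_spots array (illustrative sketch)."""
    rng = np.random.default_rng(seed)

    # Target amplitude in the focal (Fourier) plane: one pixel per spot,
    # centered on the zero-frequency pixel of the shifted spectrum.
    target = np.zeros((grid, grid))
    start = grid // 2 - (n_spots - 1) * pitch_px // 2
    idx = start + pitch_px * np.arange(n_spots)
    target[np.ix_(idx, idx)] = 1.0

    # Assumption: uniform illumination of the hologram plane
    # (a real laser beam would be closer to Gaussian).
    source = np.ones((grid, grid))

    phase = rng.uniform(-np.pi, np.pi, (grid, grid))
    for _ in range(iterations):
        # Propagate hologram plane -> focal plane.
        far = np.fft.fftshift(np.fft.fft2(source * np.exp(1j * phase)))
        # Keep the phase, impose the desired spot amplitudes.
        far = target * np.exp(1j * np.angle(far))
        # Propagate back and keep only the phase (phase-only device).
        near = np.fft.ifft2(np.fft.ifftshift(far))
        phase = np.angle(near)

    # Fraction of the focal-plane energy that ends up in the desired spots.
    intensity = np.abs(np.fft.fftshift(np.fft.fft2(source * np.exp(1j * phase)))) ** 2
    efficiency = intensity[target > 0].sum() / intensity.sum()
    return phase, efficiency

phase, efficiency = ifta_spot_hologram()
print(f"energy fraction in the 5 x 5 spots: {efficiency:.2f}")
```

The energy that does not end up in the desired spots appears as background and higher orders, which is one reason why a camera-based training step, as used in the experiments, can still improve the pattern.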

Set-up 2
The desired n × n multi-spot patterns generated by the beam shaper system could be improved by training algorithms, but they can be generated in even higher quality by a corresponding DOE due to its higher resolution. Therefore, a specially designed DOE generating a 5 × 5 spot pattern was placed in front of the scanner system, and a spot-to-spot separation of 160 μm was achieved in the focal plane of the telecentric f = 100 mm objective. Additionally, shorter pulses are expected to lead to higher removal rates [19,21]. To evaluate this, a PHAROS PH1-20 laser source from Light Conversion emitting 230 fs pulses at a wavelength of 1028 nm with a beam quality of M 2 < 1.35, a maximum average power of 20 W, and a repetition rate between 1 kHz and 1 MHz was used. The system was operated at a constant repetition rate of 100 kHz, resulting in a maximum pulse energy of about 200 µJ and a demanded marking speed of v mark = 16 m/s. The holes were again drilled into the 10 μm thick stainless steel foil.
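The pulse energies and demanded marking speeds quoted for set-ups 1 and 2 follow from two simple relations: pulse energy equals average power divided by repetition rate, and marking speed equals spot pitch times repetition rate. The short sketch below reproduces the quoted numbers; the function names are illustrative and not taken from the paper.

```python
def pulse_energy_uJ(average_power_W, repetition_rate_kHz):
    """Pulse energy in microjoules (W / kHz gives mJ, hence the factor 1e3)."""
    return average_power_W / repetition_rate_kHz * 1e3

def marking_speed_m_per_s(pitch_um, repetition_rate_kHz):
    """Scanner speed needed to place consecutive pulses one pitch apart."""
    return pitch_um * 1e-6 * repetition_rate_kHz * 1e3

# Set-up 1: 20 W at 200 kHz, 150 um pitch for the ring stamping
print(pulse_energy_uJ(20, 200), marking_speed_m_per_s(150, 200))
# Set-up 1 after reducing the repetition rate to 100 kHz
print(marking_speed_m_per_s(150, 100))
# Set-up 2: 20 W at 100 kHz, 160 um spot-to-spot separation
print(pulse_energy_uJ(20, 100), marking_speed_m_per_s(160, 100))
```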

Set-up 3
In this set-up, the 5 × 5 spot DOE was replaced with a DOE generating a pattern of 8 × 8 squares with top-hat intensities at four relative levels, namely 0, 33, 66, and 100%, as illustrated in Figure 12(a). The averaged relative intensity for each line and row of the pattern amounts to 50%. Thus, if this pattern is shifted by one square along a scanned line and from line to line, a constant average energy per area is obtained, except at the border of this area, as illustrated in Figure 8(h). As described in the previous section, the DOE is a phase-only device and therefore the final pattern suffers from speckles, as illustrated in Figure 12(b), which shows the calculated intensity distribution that should be achieved with a Gaussian beam. The DOE efficiency was calculated to be 87%, assuming 100% transmission through the system consisting of DOE and scanner. The speckle effects should be averaged and therefore reduced by the synchronized scanning described above. However, in the present set-up, the scanner aperture of 14 mm cut higher diffraction orders from the DOE, which strongly limited the quality of this optical transformation as illustrated in Figure 12
Figure 10: Generated spot patterns with the beam shaping unit. (a) 5 × 5 spots spaced by 90 µm, directly generated from the control software. (b) Optimized pattern with improved phase mask calculation by training the system using the actual camera image of the system. (c) Ring with 270 µm diameter generated with an axicon-like phase mask. Higher-order rings could not be observed by the used camera.

Multi-pulse drilling on the fly
Multi-pulse drilling on the fly was first tested with set-up 1. First experiments with non-optimized patterns were performed with an average power of 7.2 W and 278 repetitions, 12.8 W and 157 repetitions, and 20 W and 100 repetitions for 3 × 3, 4 × 4, and 5 × 5 spots, respectively, leading to about 2500 pulses per hole in each case. Due to the non-optimized patterns, the hole entrance shape is slightly elliptical, as can be seen from the light-optical microscope (LM) micrographs in Figure 13(a) for 5 × 5 spots. The ellipticity of the holes can be reduced significantly, to below 10%, with the optimized phase mask as shown in Figure 13(b). In this case, the entrance diameter amounted to about 40 μm whereas the diameter at the exit was 21 μm, resulting in a calculated taper angle of almost 45°. For a very high number of applied pulses, the drilling process stops when the local fluence at the wall reaches the threshold value, and the taper cannot be improved further. The rather high taper angle points toward a large diameter of the single spots and therefore to a still non-ideal spot pattern, as can be seen from Figure 10(b). It should be mentioned, however, that a better taper angle might have been obtained by raising the number of pulses.
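The pulse counts quoted for set-up 1 can be checked with a few lines, assuming that one repetition of the full scan deposits n × n pulses in every hole (n pulses per line pass and n line passes), which is consistent with the layer count n L = ceil(n p /n 2 ) used later in the text. The function names are illustrative.

```python
def pulses_per_hole(n, repetitions):
    """Pulses per hole for an n x n spot pattern (n*n pulses per repetition)."""
    return n * n * repetitions

def power_per_spot_W(average_power_W, n):
    """Average power carried by one spot if the power is split evenly."""
    return average_power_W / (n * n)

# (n, average power in W, repetitions) as quoted for set-up 1
for n, power, reps in [(3, 7.2, 278), (4, 12.8, 157), (5, 20.0, 100)]:
    print(n, pulses_per_hole(n, reps), round(power_per_spot_W(power, n), 3))
```

With these numbers, each case gives roughly 2500 pulses per hole at about 0.8 W per spot, so the three experiments differ only in the number of parallel spots.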
Higher quality was achieved with 5 × 5 spots generated with the DOE in set-up 2. The set-up was first tested without the DOE in a single-spot drilling process on the fly. Good-quality holes were achieved with 765 repetitions and an average power of 640 mW, corresponding to a pulse energy of 6.4 μJ at the repetition rate of 100 kHz, i.e. about 50 times the threshold fluence. For the 5 × 5 spot pattern, the average power was then increased to 16 W and the number of repetitions reduced to 27. The achieved holes, shown in Figure 14, had diameters of 33 and 18 μm at the entrance and the exit, respectively, resulting in a calculated taper angle of about 37°. Compared to the experiments with the SLM, the hole diameter as well as the taper angle are significantly reduced, pointing to a better quality of the spots generated by the DOE. The drilling process took about 6 s for 100 × 100 holes, resulting in a drilling rate of approximately 1700 holes per second. In comparison, the single-spot process took about 150 s, corresponding to a drilling rate of only 75 holes per second. In both cases, the circularity and regularity of the holes demonstrate the almost perfect synchronization of the scanner system with the laser pulse train. Even a small jitter in scan direction from repetition to repetition would have led to elliptical holes with their longer axis along this direction.
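The quoted taper angles are reproduced if the taper is taken as the half-angle of a straight wall between the entrance and the exit diameter through the 10 μm foil. This definition is an assumption, since the text does not state how the angle was calculated; the function names are illustrative.

```python
import math

def taper_half_angle_deg(d_entrance_um, d_exit_um, thickness_um):
    """Wall half-angle of a conical hole (assumed definition of the taper)."""
    return math.degrees(math.atan((d_entrance_um - d_exit_um) / (2.0 * thickness_um)))

def drilling_rate_per_s(holes, time_s):
    """Average number of holes finished per second."""
    return holes / time_s

# Set-up 1 (SLM): 40 um entrance, 21 um exit, 10 um foil
print(round(taper_half_angle_deg(40, 21, 10), 1))
# Set-up 2 (DOE): 33 um entrance, 18 um exit, 10 um foil
print(round(taper_half_angle_deg(33, 18, 10), 1))
# 100 x 100 holes in about 6 s with the 5 x 5 DOE
print(round(drilling_rate_per_s(100 * 100, 6.0)))
```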

Optical stamping with SLM
The Olympic rings were stamped with set-up 1 into a steel foil as described in the previous section. Due to the reduced repetition rate of 100 kHz and average power of 10 W, the pulse energy amounted to only 100 μJ, which is low for a ring pattern with 270 μm diameter (see Figure 10c), especially when one takes into account that higher-order rings could contain a significant amount of energy. Figure 15 shows the obtained results for 50 machined layers. As for the holes, no "smearing" of the rings is observed, i.e. a very precise stamping process is achieved.
… formation (Figure 3) is obtained. For lower average powers (5-15 W), remaining pillars are observed whose density decreases with increasing average power. As expected (see Figure 8h), the depth in the outer area is only slightly decreasing, and at the bottom of the squares a regular structure in the horizontal and vertical direction is visible. All squares were analyzed with the WLI as illustrated in Figure 17(a)-(c). This measurement shows that in the outer area the depth increases linearly within 160 μm, which corresponds to the size of the generated pattern (Figure 12d). Further, Figure 17(c) shows the above-mentioned regular structure with a period of exactly the dimension of the small squares, i.e. the pitch of 20 μm. The ablation depth and the s a -value, as a measure of the surface roughness, are shown in Figure 18(a). From the measured depth, the removal rate (neglecting the dead times of the galvo scanner) and the energy-specific volume are deduced and shown in Figure 18(b). A maximum removal rate of about 3 mm 3 /min is achieved for the highest average power of 30 W. The maximum energy-specific volume amounts to about 2.5 μm 3 /μJ and is achieved at an average power between 10 and 20 W. If an ideal pattern, as shown in Figure 12(a), is assumed, the fluence of the single small squares can be calculated for a given total pulse energy (including the calculated DOE efficiency of 87%). The expression for a top-hat intensity distribution (3a) then allows the corresponding energy-specific volume to be calculated if the threshold fluence ϕ th and the energy penetration depth δ are given. For ϕ th = 0.072 J/cm 2 and δ = 5.97 nm, deduced from the experiments with a Gaussian beam (see Figure 2), the calculated energy-specific volume is shown as an orange solid line in Figure 18(b). The deduced maximum value of 2.8 μm 3 /μJ is slightly higher than the measured value. The orange dotted curve represents the least-squares fit to the measured data, again assuming an efficiency of 87%, resulting in ϕ th = 0.086 J/cm 2 and δ = 6.22 nm. The small dent between 5 and 10 W is located at the point where the fluence of the squares with the lowest intensity exceeds the threshold fluence.
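To illustrate how an energy-specific volume follows from ϕ th and δ, the sketch below assumes the widely used logarithmic ablation law for a single top-hat spot, i.e. an ablated depth of δ · ln(ϕ/ϕ th ) per pulse. Whether this is identical in detail to equation (3a) is an assumption, and the example treats one ideal top hat without DOE efficiency, so its peak value is not directly comparable to the 2.8 μm 3 /μJ quoted for the four-level pattern.

```python
import math

def tophat_specific_volume(phi_J_cm2, phi_th_J_cm2, delta_nm):
    """Energy-specific volume in um^3/uJ for a top-hat spot.

    Assumed model: depth per pulse = delta * ln(phi / phi_th) above threshold,
    so volume per energy = delta / phi * ln(phi / phi_th).
    """
    if phi_J_cm2 <= phi_th_J_cm2:
        return 0.0  # no ablation below the threshold fluence
    delta_um = delta_nm * 1e-3
    phi_uJ_um2 = phi_J_cm2 * 1e-2  # 1 J/cm^2 = 1e-2 uJ/um^2
    return delta_um / phi_uJ_um2 * math.log(phi_J_cm2 / phi_th_J_cm2)

phi_th, delta = 0.072, 5.97  # values deduced from the Gaussian-beam experiments
for factor in (1.5, 2.0, math.e, 4.0, 8.0):
    value = tophat_specific_volume(factor * phi_th, phi_th, delta)
    print(f"phi = {factor:.2f} * phi_th -> {value:.2f} um^3/uJ")
```

Under this assumed law the efficiency peaks at a fluence of e times the threshold and falls off on both sides, which is why a pattern with several fluence levels cannot keep all squares at the optimum simultaneously.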

Results with PB3
High-rate material removal was demonstrated with the PB3 laser system, set-up 3, and the 4-level DOE, again by ablating squares. The average power of the laser was much higher and amounted to 20, 30, 40, 50, 75, 100, 125, 150, and 162 W. The repetition rate was set to 1 MHz, which is five times higher than for the PB2. LM micrographs of the achieved surfaces are shown in Figure 19. The results are very similar to the ones obtained with the PB2 system. Again, neither CLP formation nor a bumpy surface due to heat accumulation [43] can be observed, and all squares show a rather high quality. The ablation depth, shown in Figure 20(a), is slightly higher than for the PB2 experiments, and the surface roughness does not exceed a value of s a = 0.6 µm and is therefore even better than for the PB2 experiments with lower power. The corresponding removal rate and energy-specific volume are shown in Figure 20(b). A maximum removal rate of about 16.5 mm 3 /min is achieved for the maximum power of 162 W with a surface roughness of s a = 0.5 µm. The maximum energy-specific volume again amounts to about 2.5 μm 3 /μJ and is achieved at an average power of about 75 W. This corresponds to 15 W at 200 kHz, in very good agreement with the PB2 experiments. This agreement is also shown by the orange line in Figure 20(b), which represents the calculated energy-specific volume based on the fit parameters ϕ th = 0.086 J/cm 2 and δ = 6.22 nm obtained from the PB2 experiments.
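The relation between removal rate, average power, and energy-specific volume is a pure unit conversion, and the comparison between the two lasers rests on equal pulse energy. The sketch below applies both to the quoted maximum-power values; the function names are illustrative.

```python
def specific_volume_um3_per_uJ(removal_rate_mm3_per_min, average_power_W):
    """Energy-specific volume from removal rate and average power."""
    um3_per_s = removal_rate_mm3_per_min * 1e9 / 60.0  # 1 mm^3 = 1e9 um^3
    uJ_per_s = average_power_W * 1e6                   # 1 W = 1e6 uJ/s
    return um3_per_s / uJ_per_s

def equivalent_power_W(power_W, rate_from_kHz, rate_to_kHz):
    """Average power giving the same pulse energy at another repetition rate."""
    return power_W * rate_to_kHz / rate_from_kHz

print(round(specific_volume_um3_per_uJ(3.0, 30.0), 2))    # PB2 at 30 W
print(round(specific_volume_um3_per_uJ(16.5, 162.0), 2))  # PB3 at 162 W
print(equivalent_power_W(75.0, 1000.0, 200.0))            # 75 W at 1 MHz -> 200 kHz
```

At maximum power both systems therefore operate at about 1.7 μm 3 /μJ, below the maximum of about 2.5 μm 3 /μJ that is reached at lower pulse energy.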

Multi-pulse drilling on the fly
The discussed examples clearly show that very high drilling rates can be realized with the multi-pulse drilling on the fly process by using a DOE or SLM. In principle, the drilling rate can be calculated including the dead times of the scanner as follows. If a square area with N × N holes is to be drilled with a pattern containing n × n spots spaced by the distance p x , the side length s of the square that has to be marked by the scanner is s = (N + 2 ⋅ n) ⋅ p x . Following [28], there exists an optimum marking speed v opt for a given line length, taking the maximum acceleration and velocity into account, and together with p x this defines the demanded repetition rate by equation (5). The realized synchronization demands a repetition rate that is a multiple of 100 kHz; therefore, the value nearest to the optimum one has to be chosen. For the used excelliSCAN 14 with an f = 160 mm objective and p x = 160 μm, the switch between 100 and 200 kHz is located at about N = 125 holes per side. From all these data, the real machining time Δt L for one layer can be deduced. The number of layers n L is then calculated from the number of pulses n p used to drill one hole by n L = ceil(n p /n 2 ), and the drilling rate is finally given by N 2 /(n L ⋅ Δt L ). Corresponding calculations for n p = 900 pulses per hole and p x = 160 μm are shown in Figure 21(a). A drilling rate of almost 15,000 holes/s would be achieved for N = 500 holes per side and 10 × 10 spots. Compared with the experimental results from set-up 2, where an average power of 16 W was applied for 5 × 5 spots at a repetition rate of 100 kHz, this drilling rate would be achieved at an average power of 128 W.
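The bookkeeping of this estimate can be sketched as follows. The layer time Δt L depends on the scanner model of [28], which is not reproduced here, so it enters as an input and the value used in the example is purely hypothetical. The power scaling assumes that the power per spot and per 100 kHz of repetition rate stays the same as in set-up 2, which is one reading of the 128 W figure; all function names are illustrative.

```python
import math

def layers_needed(n_p, n):
    """n_L = ceil(n_p / n^2): every layer places n*n pulses in each hole."""
    return math.ceil(n_p / (n * n))

def side_length_mm(N, n, p_x_um):
    """Side length s = (N + 2n) * p_x of the square marked by the scanner."""
    return (N + 2 * n) * p_x_um * 1e-3

def drilling_rate_per_s(N, n, n_p, layer_time_s):
    """Drilling rate N^2 / (n_L * dt_L) for a given machining time per layer."""
    return N * N / (layers_needed(n_p, n) * layer_time_s)

def scaled_power_W(ref_power_W, ref_n, ref_rate_kHz, n, rate_kHz):
    """Average power for equal pulse energy per spot (assumed scaling)."""
    return ref_power_W * (n * n) / (ref_n * ref_n) * rate_kHz / ref_rate_kHz

print(layers_needed(900, 10))                    # layers for 10 x 10 spots
print(side_length_mm(500, 10, 160))              # marked side length in mm
print(scaled_power_W(16.0, 5, 100.0, 10, 200.0)) # power for 10 x 10 spots at 200 kHz
# Hypothetical layer time of 2.0 s, chosen only to show the call
print(round(drilling_rate_per_s(500, 10, 900, layer_time_s=2.0)))
```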
These results should be compared with a corresponding step-and-repeat process where the n p pulses are applied, the pattern is then shifted to the next position with the optimum speed (shortest time), and the next n p pulses are applied. The corresponding calculated drilling rates are shown for 100 and 200 kHz repetition rates in Figure 21(b). For 10 × 10 spots and 200 kHz, a drilling rate of more than 20,000 holes/s could be achieved, which is 33% higher than the rate for the on-the-fly drilling process. However, as described e.g. in [72] and discussed in Section 2.2.4, heat accumulation can represent a serious issue. With the given numbers (n p = 900, n = 10), equation (10) leads to a 22 times higher surface temperature for the percussion drilling process than for drilling on the fly. Depending on the material, this could clearly justify the application of a multi-pulse drilling on the fly process.

High rate material removal
As shown in Figure 18(b), the measured energy-specific volume approximately follows the values calculated for the ideal intensity distribution (Figure 12a) with the threshold fluence and energy penetration depth of ϕ th = 0.072 J/cm 2 and δ = 5.97 nm obtained from experiments with single spots. A least-squares fit to the measured data with the theoretical efficiency of 87% results in ϕ th = 0.086 J/cm 2 and δ = 6.22 nm. The scanner input aperture of 14 mm cuts a few higher diffraction orders, and this affects not only the finally achieved distribution but also the efficiency. Reducing the efficiency to 81% leads to ϕ th = 0.080 J/cm 2 and δ = 5.79 nm, and a further reduction to 77% leads to ϕ th = 0.076 J/cm 2 and δ = 5.50 nm, which is in line with the assumption that higher diffraction orders are cut. For more accurate considerations, the measured intensity distribution should be used instead of the ideal one. Nevertheless, all calculations show that the theoretical model for a top-hat intensity distribution, i.e. equations (2a) and (3a), holds.
The 30 W of average power represents the maximum available power in set-up 3 with PB2. The repetition rate of 200 kHz and the pitch of 20 μm between two pulses lead, following equation (5), to a marking speed of 4 m/s, which is five times slower than the maximum speed of 20 m/s offered by this set-up. Therefore, the experiments were repeated with the PB3 with a maximum average power of 162 W and a repetition rate of 1 MHz. The obtained results show that this process can be scaled up to a removal rate of at least 16.5 mm 3 /min at the maximum average power. The very good agreement between the measured energy-specific volumes and the model from the PB2 experiments further supports the assumption of scalability. The high surface quality and the shiny surface are somewhat surprising when one takes into account that an average power of about 160 W is applied in a 1.4 × 1.4 mm square for a few seconds. With a three-pulse burst, which reduces the energy-specific volume by about 20% compared to the case of single spots and a Gaussian beam [19], the average power and the removal rate might be further increased to about 480 W and 40 mm 3 /min. However, it first has to be investigated whether these high powers would lead to bumpy surfaces due to heat accumulation, as observed and described in [43], when the temperatures exceed a certain limit in the case of the used stainless steel (AISI304).
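The burst extrapolation can be reproduced with simple arithmetic, assuming that every pulse of the burst carries the same energy as the single pulse used before, so that the average power scales with the number of pulses per burst while the removal rate is additionally reduced by the loss in energy-specific volume. The function name and this reading of the scaling are assumptions.

```python
def burst_scaling(power_W, removal_rate_mm3_per_min, pulses_per_burst,
                  specific_volume_loss):
    """Scaled average power and removal rate for an n-pulse burst (assumed model)."""
    scaled_power = power_W * pulses_per_burst
    scaled_rate = (removal_rate_mm3_per_min * pulses_per_burst
                   * (1.0 - specific_volume_loss))
    return scaled_power, scaled_rate

# 162 W and 16.5 mm^3/min, three-pulse burst, about 20% lower specific volume
power, rate = burst_scaling(162.0, 16.5, 3, 0.20)
print(power, round(rate, 1))
```

The result of roughly 486 W and 39.6 mm 3 /min matches the rounded values of about 480 W and 40 mm 3 /min given in the text.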
Such intensity patterns could in principle also be generated with the SLM used in set-up 1, but one has to keep in mind that the maximum average power is limited to values between 100 and 200 W and that the maximum resolution of the SLM further limits the minimum structure size in the produced pattern. The advantage of the SLM could be the possibility to change the pattern dynamically, e.g. after a few layers. This strategy could possibly be used to reduce the regular structures observed at the bottom of the machined squares. A larger scanner aperture, leading to a better quality of the intensity distribution, could have a similar effect.

Conclusion and outlook
In the case of single-spot machining with ultrashort pulses, the scale-up process is often limited by the extremely high demanded marking speeds and laser repetition rates. The presented examples clearly demonstrate the potential of DOEs or SLMs combined with synchronized scanning for high-speed, high-throughput, and high-rate material removal applications with minimal thermal impact. A drilling rate of about 1700 holes/s has been demonstrated with 5 × 5 spots and only 16 W of average power. Moreover, the strategy offers the potential to achieve several 10,000 holes/s with average powers above 100 W and minimal thermal impact. Further experiments with higher power and a higher number of more closely spaced spots are needed to estimate the limit of this scaling strategy.
Very high removal rates can be achieved with specially designed intensity distributions, as demonstrated with a four-level top-hat intensity distribution. With an average power of 162 W, a removal rate of about 16.5 mm 3 /min was obtained. This strategy also offers the potential for further scale-up to several 100 W of average power and for removal rates of multiple 10 mm 3 /min in stainless steel. However, the presented experiments show that the blocking of the higher diffraction orders by the scanner aperture limits the quality of the generated intensity distribution and finally leads to a regular structure at the bottom of the machined structure. Using a scanner with a larger aperture could minimize this effect but would also reduce the maximum marking speed and acceleration, which would again limit the scale-up process. The limit of the applicable average power that does not lead to bumpy surfaces due to heat accumulation in stainless steel also has to be investigated further. Therefore, additional experiments with higher average power and a scanner with a larger aperture are needed to clarify this open question.
In conclusion, the presented strategies, i.e. using DOEs and synchronized scanning, have the potential to pave the way for using ultrashort pulses with several 100 W of average power for laser micromachining.
Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission. Research funding: None declared. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.