Abstract
In the reservoir computing literature, the information processing capacity is frequently used to characterize the computing capabilities of a reservoir. However, it remains unclear how the information processing capacity connects to the performance on specific tasks. We demonstrate on a set of standard benchmark tasks that the total information processing capacity correlates poorly with task-specific performance. Further, we derive an expression for the normalized mean square error of a task as a weighted function of the individual information processing capacities. Mathematically, the derivation requires the task to have the same input distribution as used to calculate the information processing capacities. We test our method on a range of tasks that violate this requirement and find good qualitative agreement between the predicted and the actual errors as long as the task input sequences do not have long autocorrelation times. Our method offers deeper insight into the principles governing reservoir computing performance. It also increases the utility of the evaluation of information processing capacities, which are typically defined on i.i.d. input, even if specific tasks deliver inputs stemming from different distributions. Moreover, it offers the possibility of reducing the experimental cost of optimizing physical reservoirs, such as those implemented in photonic systems.
1 Introduction
Reservoir computing is a versatile, fast-trainable machine learning scheme inspired by the human brain [1, 2]. It avoids difficulties in the training of recurrent neural networks, like the vanishing gradient in time [3], by using a high-dimensional dynamical system instead of a network with optimized weights, and only training the linear output weights. Recently, it was shown that the universal approximation property holds for a wide range of reservoir computers [4], demonstrating the generality of the reservoir computing approach. Furthermore, because there is no need to train weights within a reservoir, a wide range of hardware implementations have been shown to be feasible [5–9]. Of particular interest are optical implementations, due to the potential speed-up in computation times [10, 11].
Generally, two different approaches to reservoir computing exist. First, the so-called echo state approach, which typically uses a reservoir constructed out of randomly connected nonlinear nodes [1] and has been implemented both experimentally [6, 7, 9, 12] and computationally [13–15]. Second, the alternative delay-based approach introduced in [16], where a single dynamical node, e.g., a laser subjected to external time-delayed feedback, serves as a time-multiplexed reservoir. This approach has the benefit of relatively simple implementation and exploits the dynamic complexity of time-delayed systems [17], introducing so-called virtual nodes. There have been various experimental realizations of the delay-based scheme, including optoelectronic [8, 16, 18, 19], optical [20–22] and electrical [23] ones. Potential applications are time series prediction [24, 25], fast word recognition [10], signal conditioning [26] and optical communication [27].
Aside from speed and power consumption issues, one wants reservoir computers (RCs) that perform well on various regression or classification tasks. In order to evaluate the performance of a reservoir computer (RC), there are various benchmark measures, e.g., the very commonly used NARMA10 task. A task-independent measure is the (linear) memory capacity [14], which measures the capability of the reservoir to reconstruct previous inputs. The linear memory capacity was generalized to the information processing capacity by Dambre et al. [28] in 2012, which measures the capability of a reservoir to memorize previous inputs and perform nonlinear computations on them. The information processing capacities (IPCs) are also referred to as nonlinear memory capacities in the literature. This measure has been used in a number of experimental and theoretical publications [29–32] as a classification of the computing abilities of a reservoir. Besides measuring the IPC in theoretical or experimental frameworks, there are multiple recent advances in calculating and manipulating the IPC. For instance, it was recently possible to calculate the linear part of the IPC (the linear memory capacity) through a linearization of the operating reservoir [33], to systematically manipulate the memorizable inputs via delay-time tuning in a delay-based approach [34], and to manipulate the orders of nonlinear transformation performed via manipulating the input gain [29, 35]. Very recently, the measure of IPC was generalized to systems that are not time invariant [36]. The corresponding measure was introduced as the temporal information processing capacity (TIPC) and has potential relevance for biological systems, as it could be measured in neural cortices [36].
However, despite the extensive research that has been carried out on the IPC, the general connection between the IPC and task-specific performance remains unclear. As an additional challenge, the IPC is typically defined on i.i.d. input; however, tasks often require a different input distribution. Moreover, to the authors’ knowledge, there is no known measure that characterizes a reservoir and at the same time strongly correlates with the performance on specific tasks. It is the aim of this paper to work towards addressing these issues by presenting a method to explicitly relate the IPCs to task-specific performances, providing estimates of an RC's performance on specific tasks using its IPCs. The relevance of this work for hardware-implemented reservoirs, such as photonic reservoirs, is that if the IPC is experimentally determined, our approach would allow for an efficient optimization of the performance of the reservoir on a range of tasks.
This work is structured as follows. First, we briefly explain the concept of reservoir computing, using the example of delay-based reservoir computing, and introduce the information processing capacity, as well as the benchmark tasks used in this work. Second, we motivate our new approach by showing that the commonly used sums of IPCs are in general only weakly correlated with task-specific performance. Third, we analytically derive an explicit relation between weighted sums of IPCs and task-specific performance for the case that the task has the same input distribution as used to calculate the IPCs, thereby obtaining an estimate of a task's error from its IPCs. We then analyse the validity of our method when the input deviates from the mathematical constraints.
2 Methods
In reservoir computing, a dynamical system, called a reservoir, is fed with input information and the nonlinear response of the reservoir is used to perform a linear approximation of an input-dependent specific task. In the original approach, the reservoir consists of many randomly coupled nonlinear nodes with, e.g., tanh-function dynamics [1]. The input enters into the system via a weighted input matrix. The nonlinear response to the input is then read out via a linear combination of the internal node states. The output weights are trained to minimize the Euclidean distance between the generated output and the target. We refer the reader to the literature [1, 37–40] for more in-depth discussions of the concepts and mathematical foundations. In the alternative delay-based approach [16], instead of multiplexing in space, the system is multiplexed in time by measuring the system's response at multiple times. In the simplest case, one has a system with one dynamical variable x(t) (“real node”) subjected to a linear delayed feedback term [16]. Several extensions of the original concept with more than one real node were discussed in the literature [41–44].
In this manuscript, we use the delay-based approach, which is briefly introduced in the following. For a more thorough discussion, consider, e.g., [16, 39]. However, our analytical derivations (Section 3) make no reference to what system is used as a reservoir. All of the results are expected to hold for the echo state network approach as well.
2.1 Time-delayed reservoir computing
In the delay-based approach of reservoir computing [16, 27, 34, 39, 40, 45], a nonlinear system subjected to one or multiple delays is fed by an input series (u_1, u_2, …, u_M) and the response of the system is measured multiple times during each input interval. The latter procedure is called time-multiplexing. A corresponding setup is shown in Figure 1. Using a so-called sample-and-hold procedure, each input is fed into the system for an interval T, called the input clock cycle. Inside each input interval, a T-periodic mask function g is applied to the inputs. The mask is typically a piecewise constant function with random step heights. Applying it reduces the linear dependency of the virtual nodes. If the system contains multiple real nonlinear nodes, e.g., a laser network with time-delayed coupling, each node obtains its own input with its own mask function.
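The sample-and-hold and masking procedure can be sketched as follows. This is a minimal sketch: the function name is illustrative, and a random binary mask with values ±1 is assumed, not the exact mask used in the simulations of Section 4.

```python
import numpy as np

def masked_input(u, n_virtual, seed=0):
    """Sample-and-hold each input u_m for one clock cycle T and apply a
    piecewise constant random binary mask with one step per virtual node.
    Returns the masked drive g(t) * u_m, evaluated once per virtual-node
    interval theta, as a flat array of length len(u) * n_virtual."""
    rng = np.random.default_rng(seed)
    # one random binary mask value per virtual node, repeated every cycle
    mask = rng.choice([-1.0, 1.0], size=n_virtual)
    # hold each input for n_virtual steps, then modulate with the mask
    held = np.repeat(u, n_virtual).reshape(len(u), n_virtual)
    return (held * mask).ravel()

drive = masked_input(np.array([0.5, -0.2]), n_virtual=4)
```

Note that the mask is T-periodic: the same binary pattern multiplies every input interval, which is what decorrelates the virtual-node responses.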
The virtual nodes are the states x(t) evaluated at different times t = (m − 1)T + jθ, with m ∈ {1, 2, …, M} and j ∈ {1, 2, …, N_v}. The separation between the virtual nodes is denoted as

θ = T/N_v.

The reservoir output is formed as a linear combination of the N_v virtual node states with a constant bias b. In the training phase, the reservoir is fed M_tr successive inputs and the response of the reservoir is sampled N_v times for each input. These responses are written into an M_tr × (N_v + 1) state matrix S, whose additional column accounts for the bias b.
Here, ζ denotes the Tikhonov regularization parameter, introduced to avoid overfitting. Using the Moore–Penrose pseudoinverse, the solution to Eq. (2) is given by

W_out = (SᵀS + ζ𝟙)⁻¹ Sᵀ ŷ,

where ŷ denotes the vector of training targets and 𝟙 the identity matrix.
To quantify the quality of the prediction, we use the normalized mean square error (NMSE). It is defined as

NMSE = (1/(M var(ŷ))) ∑_{m=1}^{M} (y_m − ŷ_m)²,

where y_m is the reservoir output, ŷ_m the target value and var(ŷ) the variance of the target series.
When the reservoir contains multiple real nodes x _{1}(t), x _{2}(t), …, each real node contains its own virtual nodes and the state matrix S is expanded accordingly. The reservoir used in this manuscript is described in Section 4.
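The training step described above amounts to ridge (Tikhonov-regularized) regression and can be sketched as follows. The state matrix here is synthetic and the names are illustrative; only the algebra of Eq. (2) and the NMSE definition are taken from the text.

```python
import numpy as np

def train_readout(S, y_target, zeta=1e-6):
    """Tikhonov-regularized least squares for the output weights:
    W_out = (S^T S + zeta * I)^(-1) S^T y_target."""
    n = S.shape[1]
    return np.linalg.solve(S.T @ S + zeta * np.eye(n), S.T @ y_target)

def nmse(y, y_target):
    """Normalized mean square error between output y and target."""
    return np.mean((y - y_target) ** 2) / np.var(y_target)

# toy check: a target that is an exact linear combination of the state
# columns (plus a bias column) is reproduced almost perfectly
rng = np.random.default_rng(1)
S = np.hstack([rng.normal(size=(200, 10)), np.ones((200, 1))])  # bias column
y_target = S @ rng.normal(size=11)
W = train_readout(S, y_target)
```

For small ζ the trained output matches a linearly representable target up to the regularization bias, i.e., the NMSE is essentially zero.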
2.2 Information processing capacity
The information processing capacity is a task-independent method of characterizing the performance of a reservoir computer. It quantifies the capability of the reservoir to reconstruct a set of basis functions in the Hilbert space of fading memory functions [28] and is a generalization of the concept of linear memory capacity [14]. In the following, we give a brief introduction to the IPC, using definitions suitable for the later calculations. As mentioned before, the corresponding measures are also referred to as nonlinear memory capacities. Generally, the capacity to reconstruct a certain task target ŷ is defined as

C[ŷ] = 1 − NMSE[y, ŷ],   (6)

where y denotes the reservoir output optimally trained on the target ŷ. If the capacity equals 1, the approximation is perfect, i.e., the reservoir can fully reconstruct the target; a capacity of 0 means that the reservoir output contains no linear information about the target.
In order to construct a basis on the Hilbert space of fading memory functions, first consider the sequence {u^{−h}} = {u_0, u_{−1}, …, u_{−h}} of a current input u_0 and the h previous inputs. Then, one can construct a basis out of finite products of Legendre polynomials p_d(u_{−i}) [28], where u_{−i} denotes the input i time steps into the past and p_d(u_{−i}) denotes the Legendre polynomial of order d applied to that input. The resulting basis functions are P_n(u^{−∞}), with

P_n(u^{−∞}) = ∏_i p_{d^i}(u_{−i}),   (7)

where the multi-index n = {d^i} specifies the polynomial order d^i assigned to each delay i.
We can define the order d(P_n) of a basis element P_n as the sum of the orders of the individual Legendre polynomials in the sequence {d^i}, yielding d(P_n) = ∑_i d^i. The degree is used to define linear, quadratic, and higher-order capacities as the sum of all capacities corresponding to a certain degree. Using Eq. (7), the IPC of degree d is defined as

IPC^d = ∑_{n : d(P_n) = d} C[P_n],   (8)

where the summation runs over all capacities corresponding to basis elements P_n with order d(P_n) = d.
Finally, the total IPC is defined as the sum over all degrees, IPC_total = ∑_d IPC^d.
In [28], it was shown that the total information processing capacity is always smaller than or equal to the output dimension, i.e., the number of virtual nodes.
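The construction of the Legendre product basis and the estimation of a single capacity can be sketched as follows. This is a sketch assuming i.i.d. input in [−1, 1]; the helper names are illustrative, and `np.roll` is used for delays, which wraps around at the sequence start (negligible for long sequences).

```python
import numpy as np
from numpy.polynomial.legendre import legval

def basis_function(u, degrees):
    """Evaluate P_n(u^{-inf}) = prod_i p_{d^i}(u_{-i}) along an input
    sequence u.  `degrees` maps a delay i to a polynomial order d^i,
    e.g. {0: 1, 2: 2} -> p_1(u_k) * p_2(u_{k-2})."""
    out = np.ones(len(u))
    for delay, order in degrees.items():
        coeffs = np.zeros(order + 1)
        coeffs[order] = 1.0  # selects the single order-d Legendre polynomial
        out *= legval(np.roll(u, delay), coeffs)
    return out

def capacity(S, target):
    """C = 1 - min_W NMSE: the fraction of the target's power that the
    state matrix S can linearly reconstruct (0 <= C <= 1)."""
    W, *_ = np.linalg.lstsq(S, target, rcond=None)
    return 1.0 - np.mean((S @ W - target) ** 2) / np.mean(target ** 2)
```

Summing `capacity` over all basis elements of a given order d yields IPC^d; a memoryless state matrix, for instance, has essentially no capacity for a delayed target.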
2.3 Benchmark tasks
Four standard benchmark tasks are considered in this study: NARMA10 [47], one-step-ahead prediction of the x variable of the Lorenz 63 system [48], nonlinear channel equalization, and one-step-ahead prediction of the Mackey–Glass system [49]. Descriptions of all four tasks can be found in Section 1 of the Supplementary material.
2.4 Reservoir – network of ring-coupled Stuart–Landau oscillators
As a numerical testing setup, we use a system of delay-coupled Stuart–Landau oscillators in a delayed ring topology, modeled by

dx_j/dt = [λ_j + η_j g_j(t) u(t) + iω_j + (γ_j + iα_j)|x_j(t)|²] x_j(t) + κ_j e^{iϕ_j} x_{j+1}(t − τ_j) + ξ(t),
where λ_j denotes the pump rate of oscillator j, ω_j the frequency, γ_j the nonlinearity parameter, α_j the rescaled shear parameter, κ_j the feedback strength, ϕ_j the feedback phase, τ_j the delay time, and ξ(t) models Gaussian white noise with amplitude D_noise = 10^{−8}. If the number of oscillators is N, then the ring topology implies x_{N+1} ≡ x_1. The input is denoted as η_j g_j(t)u(t), where u(t) is the original (unmasked) input, g_j(t) denotes the jth mask function and η_j the input strength. The mask functions were chosen to be piecewise constant random binary. It is important to note that each oscillator j has a distinct mask. Figure 2a illustrates the oscillator setup with two nodes (j ∈ {1, 2}) and Figure 2b with one node (j = 1). The system is similar to a system of coupled lasers, as the Stuart–Landau oscillator is the generic form of a Hopf bifurcation and behaves like a laser near the lasing threshold [31]. The parameters chosen for the simulations are as in Table 1, unless stated otherwise.
Table 1: Parameters used in the simulations.

Parameter  Description                   Value
τ          Feedback delay time           425
T          Clock cycle                   200
λ          Pump rate                     0.01
κ          Feedback strength             0.18
ϕ          Feedback phase                0
η          Input strength                0.06
α          Shear parameter               0
ω          Frequency                     0
γ          Nonlinearity                  −0.1
N_v        Virtual nodes per oscillator  100
3 Connection between information processing capacity and taskspecific performance
In the literature, in the context of the IPC, mostly measures consisting of sums of capacities of a certain order are investigated (IPC^d in Eq. (8)). However, the summed IPC^d values are in general only weakly correlated with the error of a specific task and are, therefore, a poor measure for predicting task-specific performance. As an example, we consider a system of two delay-coupled Stuart–Landau oscillators in a ring coupling configuration (see Section 2.4 for details). In this setup, we find that the ring delays have a strong influence on the IPCs and the task performance, but the parameter dependencies of these quantities differ strongly.
In Figure 3, a scan over the two ring delays τ_1, τ_2 is performed and the linear, quadratic, cubic and total IPC are depicted. Figure 4 shows the corresponding plots for the performance on various regression benchmark tasks. For all tasks, a strong dependence on the delay times is evident and various resonance lines can be found, as explained in [34]. Importantly for the predictive power of the summed IPC^d values, however, the delay dependence differs from task to task.
To quantify the predictive power of the summed IPC^d values, we calculate the Pearson correlation coefficient between the depicted IPC^d values and the NMSE of the benchmark tasks. A value close to −1 indicates a low NMSE at high IPC, whereas a high positive value indicates a counterproductive relation between the memory capacity and the task NMSE. Zero indicates no linear statistical correlation between the two quantities. As one can see in Table 2, the obtained correlation coefficients are generally low. Furthermore, some values are positive, meaning that high values of the corresponding sum of IPCs tend to correspond to low task performance. These results demonstrate that choosing reservoir hyperparameters based on summed capacities, such as the total IPC, is not helpful for optimizing task-specific performance. The underlying reason is that most tasks have specific IPC requirements and do not need all available capacities. Different past inputs are in general not equally relevant for the performance of a specific task. A simple sum of all IPCs is, therefore, a very limited, yet often considered, measure.
Table 2: Pearson correlation coefficients between the summed IPCs and the NMSE of the benchmark tasks.

Task            IPC^1   IPC^2   IPC^3   Total IPC
Lorenz X         0.18    0.12   −0.20   −0.07
NARMA10          0.07    0.03   −0.13   −0.08
Channel equal.   0.14    0.08   −0.12   −0.02
Mackey–Glass    −0.62   −0.63   −0.02   −0.53
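The correlation coefficients in Table 2 follow the standard Pearson definition; for two parameter maps (e.g., an IPC map and an NMSE map over the (τ_1, τ_2) grid) it can be computed as in the following illustrative sketch:

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient between two flattened arrays,
    e.g. an IPC map and an NMSE map over the same parameter grid."""
    a = np.ravel(a) - np.mean(a)  # center both quantities
    b = np.ravel(b) - np.mean(b)
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))
```

Values near −1 mean low NMSE coincides with high capacity; values near 0 mean the capacity map carries no linear information about the task error.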
3.1 Analytic derivation of an explicit relationship between IPCs and NMSE
The appeal of the information processing capacity is that it quantifies the computational capabilities of a reservoir in a task-independent manner. In this section, we introduce a method of connecting the IPCs with task-specific performance. The motivation for this is that once the IPCs of a reservoir are determined, these can be used to predict the performance of the reservoir on a multitude of tasks. This can reduce the cost of hyperparameter optimization, which would otherwise need to be carried out individually for each task.
To generate a more task-relevant measure than simple sums of information processing capacities, we build a weighted sum of IPCs instead of the unweighted sums usually considered. The corresponding ansatz is

NMSE_pred = 1 − ∑_n w_n C_n,   (10)

where C_n are the individual capacities and w_n are task-dependent weights that remain to be determined.
The exact calculations are shown in the Supplementary material; here, only the most important steps are given. Consider the capacity to compute an arbitrary task target ŷ (Eq. (6)) [46]:

C[ŷ] = 1 − NMSE[y, ŷ].   (11)

The idea is to develop the task target ŷ into the orthogonal basis functions P_n, for which the individual capacities C_n are known.
It is known that any time-invariant dynamical system with fading memory can be approximated by a finite discrete Volterra series [51]. The discrete Volterra series is a series of products of monomials of previous inputs. We, therefore, assume that the task target ŷ can be expressed in such a series, i.e., as a fading memory function of the previous inputs. We can, therefore, expand the target function ŷ in the orthogonal basis:

ŷ = ∑_n a_n P_n(u^{−∞}) + ξ,   (12)

where P_n are the basis functions, a_n the corresponding coefficients, and ξ is a noise term independent of the input. We set P_0 = 1 and, therefore, a_0 is the constant bias term. The choice of (useful) basis functions depends on the probability distribution of the input series. In the context of the IPC, the inputs are often i.i.d. random numbers drawn uniformly from [−1, 1]. In this case, an orthogonal set of basis functions is given by products of Legendre polynomials of different past inputs (see Section 2.2). We use this basis in Section 4, but the analytical calculations are generalizable and do not refer to a specific basis. We emphasize that other orthogonal bases could be used as well; see, e.g., [36] for the use of different bases in an IPC context.
Having developed the target into the above series (Eq. (12)) and obtained the coefficients a_n, we expand the reservoir output in the same basis:

y = ∑_n a_n p_n(u^{−∞}),   (13)

where p_n is the prediction of basis function P_n, i.e., the optimal linear approximation of P_n by the reservoir states.
We further assume that the input series for the task comes from the same probability distribution as used in the definition of the IPC, for example i.i.d. random numbers in [−1, 1]. Inserting Eqs. (12) and (13) into Eq. (11), we obtain a formula for the NMSE of a given task:

NMSE[y, ŷ] = (∑_{n≥1} a_n² ‖P_n‖² (1 − C_n) + ⟨ξ²⟩) / (∑_{n≥1} a_n² ‖P_n‖² + ⟨ξ²⟩).   (14)

Here, a_n are the coefficients of the series expansion of the task (Eq. (12)), ‖P_n‖² the squared norms of the basis polynomials, C_n the individual capacities of the reservoir with respect to the basis functions P_n, and ⟨ξ²⟩ the variance of the input-independent noise term.
In order for Eq. (14) to be used, the expansion coefficients a_n must be determined. To evaluate the coefficients a_n, we use a linear regression approach, constructing a linear model out of the basis elements P_n(u^{−∞}). The corresponding model is

y = ∑_n a_n P_n(u^{−∞}),   (15)

where y denotes the prediction of the target vector ŷ and the coefficients a_n are determined via a least-squares fit to the target.
Our approach predicts the NMSE corresponding to a task directly from the individual capacities C_n of the reservoir and the expansion coefficients a_n of the task target, without the task input having to be fed into the reservoir.
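Assuming Eq. (14) combines the task coefficients a_n, the basis norms ‖P_n‖² and the capacities C_n as a capacity-weighted fraction of the signal power (our reading of the derivation, not a verbatim reproduction of the published formula), the two steps of the method can be sketched as follows:

```python
import numpy as np

def fit_coefficients(basis_matrix, target):
    """Least-squares estimate of the expansion coefficients a_n; the
    columns of basis_matrix are the basis functions P_n evaluated along
    the task input sequence (cf. the linear model of Eq. (15))."""
    a, *_ = np.linalg.lstsq(basis_matrix, target, rcond=None)
    return a

def predict_nmse(a, p_norm_sq, capacities, noise_var=0.0):
    """Sketch of the error estimate: the signal power a_n^2 * ||P_n||^2
    of each basis element is weighted by the missing capacity (1 - C_n).
    The exact closed form of Eq. (14) is an assumption reconstructed
    from the derivation in the text."""
    a, p_norm_sq, capacities = map(np.asarray, (a, p_norm_sq, capacities))
    power = a ** 2 * p_norm_sq
    num = np.sum(power * (1.0 - capacities)) + noise_var
    den = np.sum(power) + noise_var
    return num / den
```

In the two limiting cases the sketch behaves as the derivation requires: if the reservoir has full capacity for every basis element the task needs, the predicted NMSE is zero; if it has none, the predicted NMSE is one.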
4 Numerical evaluation
In this section, we check the validity of Eq. (14) derived in the previous section.
4.1 Impact of deviating from an i.i.d. input distribution
In order to derive the explicit expression relating the NMSE of a task to the IPCs, the assumption was made that the task input distribution equals the IPC input distribution. It is possible to evaluate IPCs for arbitrary input distributions [36] and, therefore, to match arbitrary tasks. However, the IPC input distribution has to be predefined when measuring the IPCs. Therefore, we investigate how well Eq. (14) performs under systematic violations of this assumption. To do this, we require a task where the input distribution can gradually be altered away from i.i.d. random inputs in [−1, 1]. We use the NARMA10 task (given in the Supplementary material). It can be operated with non-i.i.d. random inputs, linearly transformed to lie in [−1, 1], without having to change the equation that defines the target series.
There are two main factors to be considered. First, it is assumed that different inputs u_i, u_j, i ≠ j are not correlated, and therefore, the autocorrelation function of the input series {u_0, u_1, …} vanishes for nonzero lags. This is a severe restriction, since most realistic input series have a nonzero autocorrelation function. Second, it is assumed that the inputs u_i are drawn from a uniform probability distribution p(u_i). Deviating from a uniform distribution, even in the absence of autocorrelation, could also degrade the predictive power of Eq. (14). We analyse both effects with input defined via

u_i = ρ u_{i−1} + (1 − ρ) χ_i,   (16)
where ρ denotes a correlation parameter and χ_i denotes independently drawn random numbers from a probability distribution p(χ). With this construction, ρ = 0 corresponds to fully independent u_i with zero autocorrelation, and ρ = 1 corresponds to u_i = u_{i−1}, whereby the autocorrelation equals one for every time lag. We can feed the input series u_i defined above into the iterative formula for the NARMA10 task.
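One realization of such a correlated input, consistent with the stated limits ρ = 0 (fully independent inputs) and ρ = 1 (constant input), is the following AR(1)-style sketch; the exact recursion and normalization of the published construction are assumptions here, and the default draw of χ_i is uniform in [−1, 1].

```python
import numpy as np

def correlated_input(n, rho, draw=None, seed=0):
    """Input series u_i = rho * u_{i-1} + (1 - rho) * chi_i.
    `draw(size)` samples the i.i.d. numbers chi_i from p(chi);
    by default they are uniform in [-1, 1]."""
    rng = np.random.default_rng(seed)
    if draw is None:
        draw = lambda size: rng.uniform(-1.0, 1.0, size)
    chi = draw(n)
    u = np.empty(n)
    u[0] = chi[0]
    for i in range(1, n):
        # rho = 0 reproduces chi_i; rho = 1 holds the first value forever
        u[i] = rho * u[i - 1] + (1.0 - rho) * chi[i]
    return u
```

Swapping `draw` for a binary, gamma or Gaussian sampler reproduces the other input distributions studied in Figure 5a.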
To test the predictive power of Eq. (14), we consider the system of one Stuart–Landau oscillator with delayed feedback (see Figure 2b). While fixing the input clock cycle, we scan over the delay time for 50 linearly distributed delays in a range where the NARMA10 NMSE varies strongly (τ ∈ [20, 1600]). This means we investigate 50 different reservoirs and evaluate the performance of our new method for the NARMA10 task with different input distributions. We evaluate the Pearson correlation coefficient, as well as the mean square error, between the NMSE predicted by Eq. (14) and the simulated NMSE.
Figure 5a shows these quantities in a scan over the correlation parameter ρ for different distributions p(χ) (different symbols). The line without symbols corresponds to the uniform distribution in [−1, 1], where ρ = 0 corresponds to i.i.d. inputs and, therefore, the assumptions made in the derivation of Eq. (14) hold (a correlation parameter of ρ = 1 corresponds to identical inputs and, therefore, an infinite autocorrelation time). It can be seen that for low ρ, Eq. (14) works well. With increasing ρ, the mean squared deviation between the predicted and measured NMSE grows and the Pearson correlation coefficient decreases. The lines with symbols in Figure 5a show the results for binary, gamma and Gaussian probability distributions p(χ). We chose a binary distribution between 0 and 0.42, a gamma distribution with shape 1.5 and scale 0.1, and a Gaussian distribution with zero mean and standard deviation 0.3. Note that, after the target series is created, the input is linearly normalized such that it approximately fits into the interval [−1, 1]; see the Supplementary material for further details. Even though these distributions strongly violate the assumption of uniform input, the method works well as long as ρ does not become too high. As a rule of thumb, if one sets a threshold of a Pearson correlation coefficient of at least 0.5, this roughly corresponds to values of ρ ⪅ 0.8 (which corresponds to an autocorrelation time t_a ⪅ 4.9, using the definition given below in Eq. (17)).
The results indicate that Eq. (14) is able to qualitatively predict a reservoir computer's performance on a task, even if the input distribution is not uniform and uncorrelated. For high autocorrelation times, however, Eq. (14) with weights evaluated via the multilinear model of Eq. (15) performs insufficiently.
4.2 Numerical results for benchmark tasks
In this section, we test our method on a set of benchmark tasks. Here, we once again use the system of two delay-coupled Stuart–Landau oscillators in a ring coupling configuration (see Section 2.4 for details) as the test reservoir. In this setup, we obtain complex patterns for the NMSE of the benchmark tasks if we scan over the two delay times τ_1, τ_2 (Figure 4a–d). In Figure 6a–d, we apply our new method (Eq. (14)) to the NARMA10, Lorenz X, channel equalization and Mackey–Glass tasks. The prediction of the NARMA10 NMSE nearly perfectly reproduces the structure of the simulated NARMA10 NMSE without any knowledge of the system other than the IPCs, which is to be expected in this case, as the assumptions made to derive Eq. (14) hold for the NARMA10 input. The Pearson correlation coefficient between the predicted and simulated NMSE is depicted in the respective insets of Figure 6. It reaches a value of 0.99 for NARMA10. In this case, we evaluated capacities up to second order and up to 20 previous inputs. See the Supplementary material for a discussion of the cutoff orders.
For the other benchmark tasks evaluated in Figure 6b–d, the assumption of i.i.d. input does not hold. This is, first, because these tasks have a nonzero autocorrelation time (Figure 7a) and, second, because the distribution of the inputs strongly deviates from a uniform input distribution (Figure 7b). However, if we apply Eq. (14) to the Lorenz task, the structure is still well reproduced (Figure 6b), with a high correlation coefficient of 0.84. For the Lorenz task, we evaluated capacities up to fifth order and up to 5 steps into the past. For the channel equalization task (Figure 6c), the absolute values of the predicted NMSE are inaccurate, in this case even negative. However, the correlation between the predicted and true NMSE is 0.99, a near maximum value. Therefore, for parameter optimization purposes, the predicted NMSE is usable. For the channel equalization task, we evaluated capacities up to third order and up to 10 steps into the past. Both the Lorenz and the channel equalization tasks indicate that Eq. (14) is useful even when the assumption of i.i.d. input is not fulfilled. These results show that evaluating IPCs for an i.i.d. input distribution still yields information on the performance of an RC fed with inputs from a different distribution.
If we consider the Mackey–Glass task (Figure 6d), the NMSE is poorly approximated by Eq. (14) with coefficients a_n obtained from Eq. (15). This is because, in comparison with the other tasks, the Mackey–Glass input shows a high input autocorrelation (Figure 7a), which leads to an ambiguous determination of the coefficients a_n, as well as a larger deviation from the i.i.d. input assumption. For the Mackey–Glass task, we evaluated capacities up to third order and up to 20 previous inputs.
To compare the timescales over which the various inputs are autocorrelated, we calculate the autocorrelation time t_a of the input series according to

t_a = ∑_{j=1}^{j_max} ψ(j),   (17)
where ψ(j) is the value of the normalized autocorrelation function at time lag j (see Supplementary material for details). To avoid summing up noise, we only evaluated autocorrelation values greater than a threshold of 0.02 and evaluated up to j_max = 100 time lags. For the NARMA10 task, we obtain a value of t_a ≈ 0 (i.i.d. input), for the Lorenz task t_a ≈ 4.4, for the channel task t_a ≈ 0.2 and for the Mackey–Glass task t_a ≈ 44.5. This indicates a great difference in autocorrelation times between the Mackey–Glass task and the other considered tasks. In Section 4.1, we systematically tuned the autocorrelation of a NARMA10 input in order to investigate how deviations from i.i.d. input harm the performance of Eq. (14). There, the Pearson correlation between the estimated and the actual NMSE was above 0.5 for ρ up to 0.8, which corresponds to an autocorrelation time t_a ⪅ 4.9. This autocorrelation time is similar to that of the Lorenz task.
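The autocorrelation time t_a can be estimated as in the following sketch. The thresholded summation rule is an assumption based on the description in the text, and the biased sample estimator of ψ(j) is one common choice among several.

```python
import numpy as np

def autocorrelation_time(u, j_max=100, threshold=0.02):
    """Sum the normalized autocorrelation psi(j) over lags j = 1..j_max,
    counting only values above a noise threshold (cf. Eq. (17))."""
    u = np.asarray(u, dtype=float)
    u = u - u.mean()
    var = np.mean(u ** 2)
    t_a = 0.0
    for j in range(1, min(j_max, len(u) - 1) + 1):
        psi = np.mean(u[:-j] * u[j:]) / var  # sample autocorrelation at lag j
        if psi > threshold:                  # skip values at the noise floor
            t_a += psi
    return t_a
```

For i.i.d. input this estimate is close to zero, while an input held constant over many consecutive steps yields a large t_a, reproducing the qualitative gap between the NARMA10 and Mackey–Glass inputs.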
5 Conclusions
We have investigated the relationship between the commonly used information processing capacity and the performance on regression tasks, e.g., time series prediction tasks, in reservoir computing. To characterize the computational capabilities of a reservoir, the total information processing capacity or the IPCs summed over each order are commonly considered. However, these simple sums of IPCs are an insufficient measure for predicting task-specific performance. We analytically derived an explicit relation between weighted sums of individual information processing capacities and the expected computing error for specific tasks. We have further tested the extent to which the expression for the NMSE that we have derived deviates from the true NMSE when the problem deviates from the strict mathematical assumptions, i.e., when the input distribution of the task differs from that used to characterize the IPC. We found high correlation between the predicted and the actual NMSE for the NARMA10, channel equalization and Lorenz time series prediction tasks. For the Mackey–Glass task, we found poor agreement. Our results indicate that the autocorrelation time of the input sequence is crucial in determining the accuracy of the trends of the predicted NMSE, with longer autocorrelation times reducing the applicability of our proposed method.
The approach derived above can be exploited to reduce the experimental cost of optimizing a reservoir computing setup, such as a photonic reservoir, for multiple tasks. Previously, in order to obtain performance measures for different tasks, the input series for each task had to be fed into the reservoir to measure the responses. With Eq. (14), however, one obtains a direct link between the performance on a task and the individual IPCs of a reservoir computer, and the IPCs can be measured with only one sufficiently long input series and post-processing. By establishing a link between the IPC and the performance on specific tasks, we have demonstrated the utility of the IPC as a means of characterizing physically implemented reservoir computing setups.
Funding source: Deutsche Forschungsgemeinschaft
Award Identifier / Grant number: LU 1729/31
Award Identifier / Grant number: SFB910 B9

Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

Research funding: This research was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) in the framework of the SFB910 (project B9) and Projektnummer 445183921 (project LU 1729/31).

Conflict of interest statement: The authors declare no conflicts of interest regarding this article.
References
[1] H. Jaeger, "The 'echo state' approach to analysing and training recurrent neural networks," GMD – German National Research Institute for Computer Science, GMD Rep., vol. 148, 2001.
[2] W. Maass, T. Natschläger, and H. Markram, "Real-time computing without stable states: a new framework for neural computation based on perturbations," Neural Comput., vol. 14, pp. 2531–2560, 2002. https://doi.org/10.1162/089976602760407955.
[3] S. Hochreiter, "The vanishing gradient problem during learning recurrent neural nets and problem solutions," Int. J. Uncertain. Fuzziness Knowledge-Based Syst., vol. 6, pp. 107–115, 1998. https://doi.org/10.1142/s0218488598000094.
[4] L. Gonon and J. P. Ortega, "Reservoir computing universality with stochastic inputs," IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 1, pp. 100–112, 2020. https://doi.org/10.1109/tnnls.2019.2899649.
[5] P. Antonik, F. Duport, M. Hermans, A. Smerieri, M. Haelterman, and S. Massar, "Online training of an optoelectronic reservoir computer applied to real-time channel equalization," IEEE Trans. Neural Netw. Learn. Syst., vol. 28, no. 11, pp. 2686–2698, 2016. https://doi.org/10.1109/tnnls.2016.2598655.
[6] K. Dockendorf, I. Park, P. He, J. C. Principe, and T. B. DeMarse, "Liquid state machines and cultured cortical networks: the separation property," Biosystems, vol. 95, no. 2, pp. 90–97, 2009. https://doi.org/10.1016/j.biosystems.2008.08.001.
[7] C. Fernando and S. Sojakka, "Pattern recognition in a bucket," Advances in Artificial Life, pp. 588–597, 2003. https://doi.org/10.1007/978-3-540-39432-7_63.
[8] L. Larger, M. C. Soriano, D. Brunner, et al., "Photonic information processing beyond Turing: an optoelectronic implementation of reservoir computing," Opt. Express, vol. 20, no. 3, pp. 3241–3249, 2012. https://doi.org/10.1364/oe.20.003241.
[9] K. Vandoorne, P. Mechet, T. Van Vaerenbergh, et al., "Experimental demonstration of reservoir computing on a silicon photonics chip," Nat. Commun., vol. 5, p. 3541, 2014. https://doi.org/10.1038/ncomms4541.
[10] L. Larger, A. Baylón-Fuentes, R. Martinenghi, V. S. Udaltsov, Y. K. Chembo, and M. Jacquot, "High-speed photonic reservoir computing using a time-delay-based architecture: million words per second classification," Phys. Rev. X, vol. 7, p. 011015, 2017. https://doi.org/10.1103/physrevx.7.011015.
[11] M. Nakajima, K. Tanaka, and T. Hashimoto, "Scalable reservoir computing on coherent linear photonic processor," Commun. Phys., vol. 4, p. 20, 2021. https://doi.org/10.1038/s42005-021-00519-1.
[12] S. Sackesyn, C. Ma, J. Dambre, and P. Bienstman, "Experimental realization of integrated photonic reservoir computing for nonlinear fiber distortion compensation," Opt. Express, vol. 29, no. 20, pp. 30991–30997, 2021. https://doi.org/10.1364/oe.435013.
[13] M. Bauduin, A. Smerieri, S. Massar, and F. Horlin, "Equalization of the nonlinear satellite communication channel with an echo state network," in 2015 IEEE 81st Vehicular Technology Conference (VTC Spring), 2015. https://doi.org/10.1109/VTCSpring.2015.7145827.
[14] H. Jaeger, "Short term memory in echo state networks," GMD – Forschungszentrum Informationstechnik GmbH, GMD Rep., vol. 152, 2002.
[15] M. Sorokina, S. Sergeyev, and S. Turitsyn, "Fiber echo state network analogue for high-bandwidth dual-quadrature signal processing," Opt. Express, vol. 27, pp. 2387–2395, 2019. https://doi.org/10.1364/oe.27.002387.
[16] L. Appeltant, M. C. Soriano, G. Van der Sande, et al., "Information processing using a single dynamical node as complex system," Nat. Commun., vol. 2, p. 468, 2011. https://doi.org/10.1038/ncomms1476.
[17] J. D. Hart, L. Larger, T. E. Murphy, and R. Roy, "Delayed dynamical systems: networks, chimeras and reservoir computing," Philos. Trans. R. Soc. A, vol. 377, no. 2153, p. 20180123, 2019. https://doi.org/10.1098/rsta.2018.0123.
[18] Y. Chen, L. Yi, J. Ke, et al., "Reservoir computing system with double optoelectronic feedback loops," Opt. Express, vol. 27, no. 20, pp. 27431–27440, 2019. https://doi.org/10.1364/oe.27.027431.
[19] Y. Paquot, F. Duport, A. Smerieri, et al., "Optoelectronic reservoir computing," Sci. Rep., vol. 2, p. 287, 2012. https://doi.org/10.1038/srep00287.
[20] D. Brunner, M. C. Soriano, C. R. Mirasso, and I. Fischer, "Parallel photonic information processing at gigabyte per second data rates using transient states," Nat. Commun., vol. 4, p. 1364, 2013. https://doi.org/10.1038/ncomms2368.
[21] Y. S. Hou, G. Q. Xia, W. Y. Yang, et al., "Prediction performance of reservoir computing system based on a semiconductor laser subject to double optical feedback and optical injection," Opt. Express, vol. 26, no. 8, pp. 10211–10219, 2018. https://doi.org/10.1364/oe.26.010211.
[22] Q. Vinckier, F. Duport, A. Smerieri, et al., "High-performance photonic reservoir computer based on a coherently driven passive cavity," Optica, vol. 2, no. 5, pp. 438–446, 2015. https://doi.org/10.1364/optica.2.000438.
[23] Z. Q. Zhong, D. Chang, W. Jin, et al., "Intermittent dynamical state switching in discrete-mode semiconductor lasers subject to optical feedback," Photon. Res., vol. 9, no. 7, pp. 1336–1342, 2021. https://doi.org/10.1364/prj.427458.
[24] J. Bueno, D. Brunner, M. C. Soriano, and I. Fischer, "Conditions for reservoir computing performance using semiconductor lasers with delayed optical feedback," Opt. Express, vol. 25, no. 3, pp. 2401–2412, 2017. https://doi.org/10.1364/oe.25.002401.
[25] Y. Kuriki, J. Nakayama, K. Takano, and A. Uchida, "Impact of input mask signals on delay-based photonic reservoir computing with semiconductor lasers," Opt. Express, vol. 26, no. 5, pp. 5777–5788, 2018. https://doi.org/10.1364/oe.26.005777.
[26] A. Argyris, J. Cantero, M. Galletero, et al., "Comparison of photonic reservoir computing systems for fiber transmission equalization," IEEE J. Sel. Top. Quantum Electron., vol. 26, no. 1, p. 5100309, 2020. https://doi.org/10.1109/jstqe.2019.2936947.
[27] A. Argyris, "Photonic neuromorphic technologies in optical communications," Nanophotonics, vol. 11, no. 5, pp. 897–916, 2022. https://doi.org/10.1515/nanoph-2021-0578.
[28] J. Dambre, D. Verstraeten, B. Schrauwen, and S. Massar, "Information processing capacity of dynamical systems," Sci. Rep., vol. 2, p. 514, 2012. https://doi.org/10.1038/srep00514.
[29] M. Goldmann, C. R. Mirasso, I. Fischer, and M. C. Soriano, "Exploiting transient dynamics of a time-multiplexed reservoir to boost the system performance," in 2021 International Joint Conference on Neural Networks (IJCNN), IEEE, 2021, pp. 1–8. https://doi.org/10.1109/IJCNN52387.2021.9534333.
[30] K. Harkhoe and G. Van der Sande, "Task-independent computational abilities of semiconductor lasers with delayed optical feedback for reservoir computing," Photonics, vol. 6, no. 4, p. 124, 2019. https://doi.org/10.3390/photonics6040124.
[31] F. Köster, D. Ehlert, and K. Lüdge, "Limitations of the recall capabilities in delay based reservoir computing systems," Cogn. Comput., pp. 1–8, 2020. https://doi.org/10.1007/s12559-020-09733-5.
[32] S. Ortín and L. Pesquera, "Delay-based reservoir computing: tackling performance degradation due to system response time," Opt. Lett., vol. 45, no. 4, pp. 905–908, 2020. https://doi.org/10.1364/ol.378410.
[33] F. Köster, S. Yanchuk, and K. Lüdge, "Master memory function for delay-based reservoir computers with single-variable dynamics," 2021 [Online]. Available at: https://arxiv.org/abs/2108.12643. https://doi.org/10.1109/TNNLS.2022.3220532.
[34] T. Hülser, F. Köster, L. C. Jaurigue, and K. Lüdge, "Role of delay-times in delay-based photonic reservoir computing," Opt. Mater. Express, vol. 12, no. 3, pp. 1214–1231, 2022. https://doi.org/10.1364/ome.451016.
[35] B. Vettelschoss, A. Röhm, and M. C. Soriano, "Information processing capacity of a single-node reservoir computer: an experimental evaluation," IEEE Trans. Neural Netw. Learn. Syst., vol. 33, no. 6, pp. 2714–2725, 2021. https://doi.org/10.1109/TNNLS.2021.3116709.
[36] T. Kubota, H. Takahashi, and K. Nakajima, "Unifying framework for information processing in stochastically driven dynamical systems," Phys. Rev. Res., vol. 3, no. 4, p. 043135, 2021. https://doi.org/10.1103/physrevresearch.3.043135.
[37] D. Brunner, B. Penkovsky, B. A. Marquez, M. Jacquot, I. Fischer, and L. Larger, "Tutorial: photonic neural networks in delay systems," J. Appl. Phys., vol. 124, no. 15, p. 152004, 2018. https://doi.org/10.1063/1.5042342.
[38] M. Lukosevicius and H. Jaeger, "Reservoir computing approaches to recurrent neural network training," Comput. Sci. Rev., vol. 3, no. 3, pp. 127–149, 2009. https://doi.org/10.1016/j.cosrev.2009.03.005.
[39] G. Tanaka, T. Yamane, J. B. Héroux, et al., "Recent advances in physical reservoir computing: a review," Neural Netw., vol. 115, pp. 100–123, 2019. https://doi.org/10.1016/j.neunet.2019.03.005.
[40] G. Van der Sande, D. Brunner, and M. C. Soriano, "Advances in photonic reservoir computing," Nanophotonics, vol. 6, no. 3, p. 561, 2017. https://doi.org/10.1515/nanoph-2016-0132.
[41] M. Goldmann, F. Köster, K. Lüdge, and S. Yanchuk, "Deep time-delay reservoir computing: dynamics and memory capacity," Chaos, vol. 30, no. 9, p. 093124, 2020. https://doi.org/10.1063/5.0017974.
[42] S. Ortín and L. Pesquera, "Reservoir computing with an ensemble of time-delay reservoirs," Cogn. Comput., vol. 9, no. 3, pp. 327–336, 2017. https://doi.org/10.1007/s12559-017-9463-7.
[43] A. Röhm and K. Lüdge, "Multiplexed networks: reservoir computing with virtual and real nodes," J. Phys. Commun., vol. 2, p. 085007, 2018. https://doi.org/10.1088/2399-6528/aad56d.
[44] C. Sugano, K. Kanno, and A. Uchida, "Reservoir computing using multiple lasers with feedback on a photonic integrated circuit," IEEE J. Sel. Top. Quantum Electron., vol. 26, no. 1, p. 1500409, 2020. https://doi.org/10.1109/jstqe.2019.2929179.
[45] Y. K. Chembo, "Machine learning based on reservoir computing with time-delayed optoelectronic and photonic systems," Chaos, vol. 30, no. 1, p. 013111, 2020. https://doi.org/10.1063/1.5120788.
[46] F. Köster, S. Yanchuk, and K. Lüdge, "Insight into delay based reservoir computing via eigenvalue analysis," J. Phys. Photonics, vol. 3, no. 2, p. 024011, 2021. https://doi.org/10.1088/2515-7647/abf237.
[47] A. F. Atiya and A. G. Parlos, "New results on recurrent network training: unifying the algorithms and accelerating convergence," IEEE Trans. Neural Netw., vol. 11, no. 3, pp. 697–709, 2000. https://doi.org/10.1109/72.846741.
[48] E. N. Lorenz, "Deterministic nonperiodic flow," J. Atmos. Sci., vol. 20, p. 130, 1963. https://doi.org/10.1175/1520-0469(1963)020<0130:DNF>2.0.CO;2.
[49] M. C. Mackey and L. Glass, "Oscillation and chaos in physiological control systems," Science, vol. 197, p. 287, 1977. https://doi.org/10.1126/science.267326.
[50] L. C. Jaurigue, E. Robertson, J. Wolters, and K. Lüdge, "Reservoir computing with delayed input for fast and easy optimization," Entropy, vol. 23, no. 12, p. 1560, 2021. https://doi.org/10.3390/e23121560.
[51] S. Boyd and L. O. Chua, "Fading memory and the problem of approximating nonlinear operators with Volterra series," IEEE Trans. Circuits Syst., vol. CAS-32, p. 1150, 1985. https://doi.org/10.1109/tcs.1985.1085649.
[52] S. Oladyshkin and W. Nowak, "Data-driven uncertainty quantification using the arbitrary polynomial chaos expansion," Reliab. Eng. Syst. Saf., vol. 106, pp. 179–190, 2012. https://doi.org/10.1016/j.ress.2012.05.002.
[53] D. Zhang, L. Lu, L. Guo, and G. E. Karniadakis, "Quantifying total uncertainty in physics-informed neural networks for solving forward and inverse stochastic problems," J. Comput. Phys., vol. 397, p. 108850, 2019. https://doi.org/10.1016/j.jcp.2019.07.048.
[54] O. G. Ernst, A. Mugler, H. J. Starkloff, and E. Ullmann, "On the convergence of generalized polynomial chaos expansions," ESAIM Math. Model. Numer. Anal., vol. 46, no. 2, pp. 317–339, 2012. https://doi.org/10.1051/m2an/2011045.
[55] D. J. Gauthier, E. M. Bollt, A. Griffith, and W. A. S. Barbosa, "Next generation reservoir computing," Nat. Commun., vol. 12, no. 1, p. 5564, 2021. https://doi.org/10.1038/s41467-021-25801-2.
Supplementary Material
The online version of this article offers supplementary material (https://doi.org/10.1515/nanoph-2022-0415).
© 2022 the author(s), published by De Gruyter, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 International License.