Analysis of the ROA of an anaerobic digestion process via data-driven Koopman operator

This work is licensed under the Creative Commons Attribution alone 4.0 License. Nonlinear Engineering 2021; 10: 109–131 Camilo Garcia-Tenorio*, Eduardo Mojica-Nava, Mihaela Sbarciog, and Alain Vande Wouwer Analysis of the ROA of an anaerobic digestion process via data-driven Koopman operator https://doi.org/10.1515/nleng-2021-0009 Received Aug 20, 2020; accepted Mar 23, 2021. Abstract: Nonlinear biochemical systems such as the anaerobic digestion process experience the problem of the multi-stability phenomena, and thus, the dynamic spectrum of the system has several undesired equilibrium states. As a result, the selection of initial conditions and operatingparameters to avoid such states is of importance. In this work, we present a data-driven approach, which relies on the generation of several system trajectories of the anaerobic digestion system and the construction of a data-driven Koopman operator to give a concise criterion for the classi cation of arbitrary initial conditions in the state space. Unlike other approximation methods, the criterion does not rely on di cult geometrical analysis of the identi ed boundaries to produce the classi cation.


Introduction
The biological process through which multiple organisms break down organic matter in the absence of oxygen is called the Anaerobic Digestion process. What motivates the use of the anaerobic digestion process on an industrial scale is its capacity of degrading strong and resilient substrates, the low sludge production, and the possibility of making a pro t out of the production of methane gas [1]. However, the complexity of the process related to the several inhibition phenomena it di cult to operate because knowledge and expertise are necessary to maintain the stability of the process in the desired working condition. Varying operating variables such as the organic load at the input feed, the dilution rate, or an accumulation of intermediate compounds can destabilize the process leading it to a washout condition [2,3].
This paper analyzes the inherent di culties of the process, such as nonlinearities and the multi-stability phenomena with a data-driven linear operator framework. This analysis allows the identi cation of essential conditions in the operation of the reactor for control purposes. The analysis of multi-stability phenomena present in some nonlinear systems is closely related to the concept of attraction or stability regions. This subject is as important as the notion of stability because the stable operating points of a nonlinear system are rarely globally stable [4]. For the anaerobic digestion process, it is essential to know the attraction region of the desired operating point where the biological species interact and coexist, and the attraction region of the di erent washout points where bacterial life has disappeared.
There are some traditional methods to approximate the region of attraction (ROA) of nonlinear systems with multiple asymptotically stable equilibrium points. First, Lyapunov based methods approximate the ROA by computing the level sets of the Lyapunov function [5]. Additionally, model-based techniques approximate the ROA by the backward integration of the system from the vicinity of saddle equilibrium points in its boundary. Finally, energy function based techniques approximate the boundary of the ROA by using the level sets of the local energy function (which are constant along the trajectories) [4].
All of these techniques have their inherent di culties, Lyapunov and energy based techniques require the ability to nd complex functions of the state variables that satisfy certain conditions with respect to the nonlinear state space representation. Finding these functions becomes particularly di cult when the dimension of the state space gets higher. The backward integration of the system generates a set of points that represent the stable manifold of saddle points that give the boundary of the ROA. These points are a set of hyperplanes that ultimately represent the division of the ROA for the di erent asymptotically stable points. However, the analysis relies on the ability to nd conditions according to the geometry of these sets of points to perform an accurate classi cation of the initial conditions. Similarly to the Lyapunov functions, the complexity of this problem grows with the dimension of the state space.
The rst step to determine the proper operation of a reactor and the ROA for feedback control purposes is the development of a dynamic model of the system. There are several such models in the literature, varying from very detailed models with various bacterial populations and substrates [6], to mass-balance systems of two populations and two substrates [7]. These representations give a proper description of the dynamics, even though the task of modeling, parameterizing or calibrating the system by numerical methods is di cult [8]. For the particular case of the AD (see [9] and the references therein), the task depends on the measurements from a live digester or the simulated dynamics of detailed models such as the International Water Association anaerobic digestion model number one (ADM1) [6].
In this paper we propose a new approach based on the integration of such numerical representations from different initial conditions to estimate the discrete-time approximation of the Koopman operator. The approximation comes from the extended dynamic mode decomposition (EDMD) [10,11], where the available spectrum of the approximated operator gives a concise criterion that does not rely on the ability to nd or estimate Lyapunov or energy functions, or the geometrical analysis of identi ed hyperplanes in the state space. The approach relies on the fact that the EDMD algorithm gives a series of eigenfunctions whose associated eigenvalue is unitary. These eigenfunctions are invariant along the trajectories of the system and as a consequence have the ability to capture the stable manifold of saddle equilibrium points that give the boundary of the asymptotically stable points ROA. To achieve this objective in a data-driven way, the proposed algorithm is able to rst approximate the discrete-time Koopman operator with the least amount of data. Regarding this approximation, the contribution of this paper is in the description of the p-q-quasi norm, and polynomial accuracy methods that allow the reduction of the necessary basis of orthogonal polynomials for the accurate approximation. Next, it is necessary to approximate the location of the system xed points. For this purpose, this paper describes how the linear predictors that come from the EDMD algorithm are suitable for the formulation of a multi-start optimization algorithm that accurately locates these points. Addition-ally, this paper describes the methods to determine their stability. As the EDMD algorithm also gives a non-linear model of the system, the linearization and local analysis of this model gives the xed points' stability: asymptotically stable, unstable or saddle. Coupled with the identication of the saddle points, and given that there are several of these for the proposed parametrization of the system, this paper shows how to determine which saddle points are in the boundary of the ROA for their subsequent analysis. Finally, the main contribution of this paper is to show how to set a classi cation criterion for an arbitrary initial condition of the system based on the eigenfunction of the Koopman operator. This criterion determines to which attractor (asymptotically stable point) will an arbitrary initial condition converge with a simple algebraic comparison. It is important to stress that the only use of the di erential equation model is in the generation of the sample trajectories that provide the data to the algorithm, and although there is a linearization of the system that the EDMD algorithm provides, this model is still a consequence of the data-driven method.
Although the proposed algorithm needs a su cient set of data that comes from the numerical integration of a dynamical system representation of the anaerobic digestion process, the fact that it does not subsequently rely on the manipulation of the di erential equation model is an advantage over traditional techniques. Indeed, the algorithm can be extended to operate on real measured data from a digester bypassing the di culties of identifying and parameterizing a dynamical system with the same measurements.
The rest of the paper is organized as follows: in Section 2 we present a detailed description of the dynamics under analysis, and the inherent di culties that the proposed methodology overcomes over the traditional techniques. Section 3 presents the mathematical tools that we need for the development of the algorithm, including a detailed description of the concept of regions of attraction, the Koopman operator, and the method to get its discretetime approximation via the EDMD algorithm. In this section we also include some of our previous results for the connection between the eigenfunctions with the approximation of the ROA, and subsequently in Section 4 we present some improvements on the EDMD algorithm for reducing the approximation of the discrete-time Koopman operator and the accuracy of the calculation. Furthermore, in Section 5 we present the results of the method and a detailed description of the necessary steps to perform the analysis. Finally, in Section 6 we present some analysis and conclusions.

Notation
C denotes the eld of complex numbers. R and R + denote the eld of real and nonnegative real numbers, respectively. For any matrix A ∈ R n×n , A * denotes complex conjugate transpose, A + denotes its pseudoinverse, and ||x|| represents the Euclidean norm. For a complex number λ, |λ| represents its norm. For any set A,Ā denotes the closure of A. The operator • is de ned as the Hadamard product. The vector exponentiation M ±η is de ned term by term. A level set of an arbitrary function h(x) for any constant c is

Problem statement
The model of the anaerobic digestion system under study considers two reactions: acidogenesis and methanogenesis. The following reaction network describes the biological transformations: In the rst reaction, the acidogenic bacteria ξ consumes the organic substrate ξ for growth and produces volatile fatty acids ξ . In the second reaction, the methanogenic bacteria ξ uses the volatile fatty acids as a substrate for growth and produces methane. It is necessary to maintain the balance between the acidogenesis and the methanogenesis states in the operation of the AD process to avoid acidi cation, which is the state of accumulation of volatile fatty acids in the reactor that produces the washout of methanogenic bacteria.
For an ideal continuously stirred tank reactor, the following di erential equations describe the system dynamics of the reaction network (1): with a methane out ow rate where ξ = [ξ ξ ξ ξ ] T ∈ R + is the state vector, u ∈ R + is the dilution rate, ξ in , ξ in ∈ R + are the concentrations of organic substrate and volatile fatty acids in the in uent, a, b, c ∈ R + are the stoichiometric coe cients, q > is the yield for the methane production, and r (ξ ), r (ξ ) are the reaction rates de ned as where the growth functions, f (ξ ) and f (ξ ) are based on Monod and Haldane kinetics respectively as A canonical state space transformation of the AD system can be obtained by considering the partition ξ , and a linear transformation of the states x a = ξ a and [8,12] aṡ where is the stoichiometric matrix, and the reaction rates in the canonical state space are Unlike the four-state representation (2), the canonical state representation is a two-state variable nonlinear differential equation system in the bacterial populations, and a two-state variable linear di erential equation. This linear part is assumed to be fast in comparison with the nonlinear part and therefore its dynamics can be neglected for the analysis of the behavior of the system in the species concentration. As a consequence the problem reduces to a two-state system for which the analysis is easier and can be illustrated in 2D gures of the bacterial concentrations.
Depending on the magnitude of the dilution rate and substrate concentration at the in ow, the system may possess up to six equilibrium points. In [2] and [13], the authors present the regions of the input space for which various numbers of equilibria occur, and their stability. This paper will consider the multi-stability case, where there are two asymptotically stable equilibrium points, one representing the desired working point of the system, where the acidogenic and methanogenic bacteria coexist, and the other, the acidi cation point. The objective is to nd the attraction region of the coexistence point.
Consider the benchmark presented in [14] with system parameters as in Table 1, where the authors present the stability boundaries for the problem using backward integration. For the analysis, the value of the dilution rate is xed at u = .
[day -1 ], and the concentration of substrates at the input are set in a grid, where ≤ w ≤ [g/l] and ≤ w ≤ [mmol/l]. This paper will not consider a unique dilution rate, instead, it considers a three dimensional grid of dilution rate, and input concentrations to perform the analysis. Figure 1 depicts the phase plane orbits of bacteria population for di erent values of the inlet for the dilution rate and substrate concentration. The system has six equilibrium points; two asymptotically stable, the working point x * A in which the species coexist and the acidi cation point x * B of methanogenic wash-out. Additionally, the unstable saddle points are: the point that separates the two asymptotically stable x * C , the two acidogenic wash-o points x * D , x * E , and the total wash-out of the system x * F . Each of these points if of type-k, where k is the number of unstable modes of the equilibrium point.
The objective is to get the separation of the state space for the di erent parameterizations. This separation species if an initial condition converges to either of the asymptotically stable equilibrium points.

. Traditional methods
Traditionally, the ROA in the multi-stability problem comes from the maximal Lyapunov set. This is a domain that contains the equilibrium point where the local Lyapunov function is positive de nite and its derivative along the trajectories is negative, i.e., the estimate of the ROA is the largest level set of the Lyapunov function where the aforementioned conditions hold [5], where the constant value for this largest level set is often called the critical level value.
Even though there are several optimization methods to nd local Lyapunov functions from linear programing or linear matrix inequalities [15], this is not an easy task, especially when the dimensionality of the system grows. Additionally, the level sets of the Lyapunov function are a conservative approach, given that the Lyapunov function is not unique, and it is a greater challenge to nd a function whose level sets maximize the estimate of the ROA.
In addition to the Lyapunov methods, energy function methods give a less conservative approximation of the ROA because the conditions that an energy function needs to satisfy are less restrictive than those of Lyapunov functions. For instance, the derivatives of an energy function along the trajectories of the system are non-positive (in contrast with the positive de niteness of Lyapunov functions, and the negative de niteness of their derivatives). The approximation of the ROA comes from the same maximal level set where its critical level value corresponds to the evaluation of the closest saddle point in the boundary with the energy function.
The feasibility of getting a solution with either of these methods is questionable, starting from the di culties in getting a function that satis es the conditions, coupled with the di culties of calculating level sets. In addition, those level sets are points of a n − dimensional hyperplane of the state space, where the classi cation of an arbitrary initial condition on higher than three dimensional systems is not trivial. This classi cation depends on the geometry of the stable manifold, and the ability to interpolate between the identi ed points to perform the comparison.
Alternatively, the backward integration method, for approximating the boundary of the ROA proceeds as follows: • Locate all the equilibrium points of the system and determine their local stability by the Jacobian matrix. • Find the stable eigenvectors of the saddle points. • Find the intersection of these eigenvectors with an ϵball of the saddle point, and integrate numerically to check which of the asymptotically stable points ROA is under analysis. • Finally, perform the backward integration, or a trajectory reversing algorithm from several initial conditions close enough to the intersection point to get the boundary.
In summary, the method gives a set of trajectories that converge to the saddle point in the boundary. Although this method can yield an accurate approximation of the boundary, it is computationally intensive, and has the same geometry and interpolation problems of the Lyapunov and energy function methods. For a detailed approximation of the ROA in the problem at hand using backward integration we refer to [14]. Within the theoretical framework under consideration, that is, the use of an approximation of the Koopman operator to analyze the system, there are some recent studies [16][17][18][19] that lead to our theoretical framework and algorithm. Most notably, in [16] the authors propose a visual method where the time average of an observable (i.e., an arbitrary function of the state space) approximates the eigenfunction of the Koopman operator whose eigenvalue is approximately unitary. Although this method provides visual information in two-dimensional systems or slices of three dimensional ones, the method does not give a criterion to determine the convergence of an arbitrary point, and is not feasible to analyze higher dimensional systems. Another notable approach is [18], where the authors calculate isostables of the system. An isostable of a stable equilibrium point is a set of points, or initial conditions that converge synchronously to the attractor, that is, they simultaneously intersect subsequent isostables along the trajectory to the stable point. The de nition of an isostable comes from the magnitude of the Koopman operator slowest eigenfunction, whose level sets give the isostables. Similar to the Lyapunov and energy function based methods, this approach requires the calculation of the level sets of a particular function, which yields the same problems related to those methods regarding the classi cation of an arbitrary initial condition in higher dimensional systems.

Materials and methods . Stability boundary
Consider the autonomous nonlinear discrete-time dynamical system (M; T(x); k), with state variables x ∈ M where M ⊆ R n is the nonempty compact state space, k ∈ Z + is the discrete-time, and T : M → M is the di erentiable vector-valued evolution map, i.e., The solution to (8) is the successive application of T from an initial condition x ∈ M at k = , i.e., x k = T k (x ) ∈ M, which is an in nite sequence called a trajectory of the system. Suppose x * ∈ M is a xed point of (8); i.e., T k (x * ) = x * . The linearization principle de nes the local stability of hyperbolic xed points, i.e., points that satisfy the Hartman-Grobman theorem¹ [5,20]. This prin-ciple states that a xed point x * s is asymptotically stable if the modulus of all the eigenvalues of the Jacobian matrix evaluated at the xed point are less than one, and unstable otherwise with an index equal to the number of eigenvalues with modulus greater than one. Eigenvalues with modulus less than one are stable eigenvalues with stable eigenvectors that span the stable eigenspace of the xed point, and conversely for the eigenvalues with modulus greater than one. As the Hartman-Grobman theorem establishes a one-to-one correspondence between the nonlinear system and its linearization, locally, the stable and unstable eigenspaces are tangent to the stable and unstable manifolds of the hyperbolic xed point. The de nitions of these manifolds are for the stable and unstable manifold respectively, assuming there is an inverse for T k . When the index of unstable xed points is equal or greater than one, and less than n, the xed point is called a saddle, denoted byx * . With these de nitions, the theorem characterizing the ROA for asymptotically stable xed points, de ned as Th. 9-(10,11)], where the system satis es the following assumptions: A1 All the xed points on the stability boundary are saddle points. A2 The stable and unstable manifolds of the xed points on the stability boundary satisfy the transversality condition². A3 Every trajectory on the stability boundary approaches one of the saddle points as k → ∞.

Theorem 1.
Consider an autonomous nonlinear discretetime dynamical system (8), satisfying assumptions (A1-A3). Let {x * i } P i= be the P saddle points on the stability boundary of an asymptotically stable xed point. Then, Hence, the stability boundary is the union of the stable manifolds of the hyperbolic equilibrium points on the stability boundary.

. The Koopman operator
The Koopman operator framework provides a set of eigenfunctions that are observables or measures of a particular dynamical system, whose evolution relates to that of the states of the system. This approach allows for an analysis of the eigenfunctions rather than the analysis of the di erential equation [21]. The trade-o with this approach concerns linearity and dimensionality, the nitedimensional non-linear systems being described by an in nite-dimensional linear operator. The linearity of the operator allows having a spectral decomposition that contains information about the underlying dynamical system. Consider the autonomous nonlinear discrete-time dynamical system (8). The operator U k , so-called Koopman operator de nes the evolution of observable functions The spectral decomposition yields tuples {(µ i , ϕ i (x), v i )} i=N of eigenvalues, eigenfunctions and modes that contain information on the underlying dynamical system. The eigenfunctions and eigenvalues satisfy the condition meaning that the evolution of a particular eigenfunction is given by its eigenvalue, and the modes serve to recover the values of the original set of observables A family of polynomial di erence equations has a closed analytical Koopman operator representation given by Bunton in [22]. An extension of this example, where the role of the spectral decomposition is clear, is given in [23].
Given that the closed analytical representation of the Koopman operator is not available for arbitrary dynamical systems, there are available algorithms that calculate an approximation based on data from the numerical integration of a system or real measurements such as the EDMD algorithm [10].

. Extended dynamic mode decomposition
The objective of the EDMD algorithm [10] is to get a truncated approximation of the discrete-time Koopman operator based on sampled data of the underlying system [24,25]. Consider the autonomous nonlinear discrete dynamical system (8), the EDMD algorithm to approximate the Koopman operator in discrete-time requires N pairs of data snapshots, either from a real system or a computationally integrated one at a speci c sampling ∆t. The snapshot pairs, The "extended" part of the EDMD algorithm consists in the approximation of the Koopman operator of a "lifted" space of the state variables, rather than approximating the state space as in the DMD algorithm [11]. The "lifting" procedure consists in evaluating the state space of the system on a vector-valued function of observables Ψ = [ψ · · · ψ d ] T : M → C ×d [10]. The approximation of the Koopman operator will be of dimension d, and it will be denoted U d . This operator has to satisfy the condition where r(x) ∈ F is the residual term to minimize in order to nd U d . This problem is closely related to the least mean squares problem, polynomial chaos expansion [26], statistical estimation [27], and machine learning [28] problems.
A possible solution to nd U d is the ordinary leastsquares solution [10], which has a closed analytical form given by with matrices G, A ∈ C d×d de ned by This solution is suitable when matrix G is not illconditioned. Otherwise, solutions based on the Moore-Penrose pseudoinverse or regularized least-squares give a more accurate result, the former being the most accurate, and the latter being computationally e cient and able to provide sparse approximations of U d . For the regularized least-squares case, the objective is to nd a set of d linear tting functions space, where j indexes the rows of the operator that minimize the error function: where L is the loss function that measures the model t, R is the regularization term, and α ∈ R + is a nonnegative parameter. The choice of L gives di erent types of solutions to the regularized least-squares problem [29], while di erent regularization terms R provide various solutions to the over tting problem that arises from ill-conditioned problems.
The common choices for the loss function are the Ridge regression, absolute value loss, and support vector machines (epsilon-intensive) functions [29]. For the regularization term, the common choices are l , l , and a linear combination of the two (elastic net) norms.
With the solution to (15), and the spectral decomposi- and (12) gives the evolution of eigenfunctions according to their eigenvalue as With the evolution of eigenfunctions, the right eigenvectors or modes serve to recover the original set of observables and the time evolution of observables according to the spectrum of the Koopman operator [10] is To recover the state, we de ne a matrix B ∈ R (d×n) with a unitary value per column in the index of an injective observable-function and zeros otherwise. This matrix selects the functions whose inverse gives the value of the state. If we de ne this set of functions selected by B as Ψ B (x), then, with (23), the approximation of the state evolution map T k (x) = x k from the initial condition x in terms of the approximation of the Koopman decomposi- Given that the approximation of the discrete-time Koopman operator comes from a set of orbits of the system, the comparison between these theoretical orbits against the evolution map of the Koopman operator gives a measure for the algorithms accuracy. The empirical error for a number of N s trajectories, where each has a number of N data points is A low error over a set of test trajectories of the Koopman state evolution map, gives a Koopman operator representation whose eigenfunctions are accurate enough for the subsequent calculations related to them. An error criterion based on the accuracy of eigenfunctions increases the computational e ort and does not improve the results of the approximation.

. Spectrum based region of attraction
The results from Theorem 1 state that the boundary of the ROA is the union of the stable manifold of saddle points in the boundary, meaning that it is necessary to nd the xed points, determine their stability, and get the stable manifold of the saddle points.
An equilibrium point of the discrete-time nonlinear system (8) is an invariant subspace that has the property that whenever the state starts in it, it will remain in it for all future time, i.e., T t (x * ) = x * , where x * is an equilibrium point of the dynamical system. If the evolution of the state evolution map (24) is invariant then the system is at a xed point, and, an optimization algorithm from di erent initial conditions within the state space gives the approximate location of the xed points [23]. (24) is a nonlinear, discrete-time evolution mapping, and under the de nition of xed point, i.e., T k (x * ) = x * it is possible to get the approximation of the hyperbolic xed points by solvinḡ

Remark. Equation
In practice, this procedure poses two problems. First, as the nonlinear mapping (24) comes from a set of observables, the dimension and complexity of these functions a ect the possibility of getting a solution in polynomial time. In high order polynomial expansions, even if it is possible to nd a solution, the computational time is expensively high.
Second, when there is a solution, it is not clear which elements of the subset represent an actual xed point of the system, i.e., not all the points of the subset are xed points of the system, while the converse is true. Considering the data driven case, the xed points of the system are not known a priori and therefore using this de nition to approximate the xed points is not feasible.
Equation (24) is a nonlinear discrete-time state evolution map, and a classical analysis based on the eigenvalues of its Jacobian matrix evaluated at the xed points provides information on stability.
The method to get the stable manifold of these points with the spectrum of the Koopman operator is to build a set of eigenfunctions with unitary associated eigenvalue [23]. This eigenfunction is where k , k ∈ C. This is possible because the product of two arbitrary eigenfunctions is meaning thatφ(x) is also an eigenfunction of the discretetime Koopman operator with associated eigenvalueμ = µ k µ k . With these eigenfunctions, it follows that their level set evaluated at the saddle point gives the stable manifold of the point [23], as stated by Theorem 2. Let (M; T(x); k) be a dynamical system that accepts an operator representation (F d ; U d ; k) based on EDMD. If there exists a nontrivial eigenfunction with associated eigenvalue equal to one de ned as ϕ + (x), then the stable manifold of a type-one saddle point is a subset of the level set Remark. The EDMD gives a set of eigenfunctions of the Koopman operator whose associated eigenvalue is one or close to one. There is a particular eigenfunction with unitary associated eigenvalue that maps the state space into a constant value, i.e., ϕ µ= = . This is called the trivial eigenfunction and does not give information about the system trajectories.
This level set gives the ROA boundary, and therefore, it is possible to nd a classi cation rule of the di erent initial conditions in the state space that converge to any of the asymptotically stable equilibrium points. A geometrical approach to the classi cation problem is identifying the hyperplane that represents the boundary. This is a common approach when using the backwards integration of the dynamical system to nd the set of points that compose the hyperplane. Although this is a possible approach, classifying an initial condition based on the geometry of the manifold is often di cult, especially when the hyperplane describing the boundary is not a function, or the system is of higher than three dimensions. The approximation of the discrete-time Koopman operator from the EDMD algorithm using orthogonal polynomials as observables gives a set of eigenfunctions whose eigenvalue is unitary or close to unitary. Considering the di culties of classifying the initial conditions based on the geometry of the stable manifold of saddles points, we propose two di erent methods based on this set of eigenfunctions to determine the convergence of an arbitrary initial condition.  This criterion is closely related to the geometric one, especially because there is the need for a clear separatrix given by the real part eigenfunction evaluation in the boundary saddle point. This implies that for every pair of asymptotically stable xed points to be classi ed there must be a an eigenfunction that separates the two. When the dimension of the problem starts to grow, especially in the amount of asymptotically stable xed points, the rule given by Denition 3 becomes a combination of the evaluations of the initial condition, compared to the evaluation of the di erent saddle points in the boundaries.

De nition 3 (Saddle classi er
Because of the invariant nature of the eigenfunction with associated eigenvalue equal to one along the trajectories of the system, we propose a second criterion to classify the initial conditions that is independent on the aforementioned boundary of the ROA given by the level set. Theorem 1 states that the stable manifold of type-one saddle points can be captured with an eigenfunction with associated eigenvalue equal to one. Extrapolating this concept to the asymptotically stable xed points, it is possible to nd a unitary eigenfunction that is invariant along its ROA. Remark. The ideal case is to have a real valued constant eigenfunction where C l (ϕ(x)) = ϕ(x). Although this is theoretically possible, i.e., to nd an eigenfunction invariant along the ROA of the asymptotically stable xed point, the necessary conditions or the characteristics for this to happen is still an open question. For C l (ϕ(x)) = ∠(ψ(x)), C l (ϕ(x)) = |ϕ(x)| and C l (ϕ(x)) = R(ϕ(x)) there is a chance of getting an accurate classi cation based on the natural eigenfunction from the EDMD approximation or by constructing the eigenfunction. Nevertheless, the current method to nd an eigenfunction that accurately classi es according to either of these functions of ϕ is by checking the necessary conditions for the eigenfunction to be accurate under each rule. As well as the constant case, the conditions that guarantee an accurate behavior is still an open question.
Remark. Although this eigenfunction is suitable for the classi cation of all the approximated state space into its respective point of convergence, note that the equality is replaced by a subset because not all eigenfunctions with an associated eigenvalue equal to one satisfy this criterion. To the best of the authors knowledge, the criteria to make it an equality are still unknown.
Section 5 exempli es the di erent results of performing the analysis with the di erent classi cation schemes. In general, the available eigenfunctions with a unitary eigenvalue that come directly from the EDMD algorithm or a constructed one from (28) give an accurate result by using De nition 3. For the classi cation based on the asymptotically stable equilibrium points from De nition 4, nding an eigenfunction that satis es this criterion is not trivial.

Polynomial base reduction
The choice of orthogonal polynomials for the approximation of the Koopman operator has the advantage that the rank and dimension of the vector-valued function of observables Ψ can be reduced [30,31], and the state can be recovered by the inverse of injective observable-functions. A sequence of orthogonal polynomials {π α (x)} p α= where π α (x) is a univariate (i.e., x ∈ R) polynomial of degree α ∈ N + up to order p. This sequence is de ned over a range [a, b] where some inner product between distinct elements is zero, i.e., π i (x), π j (x) = for i ≠ j, and satis es a particular ordinary di erential equation. For example, the set of Laguerre Polynomials is de ned over the range [ , ∞], satis es the ordinary di erential equation With these univariate polynomials, every element of the vector-valued function of observables is the tensor product of n univariate polynomials, that is, Consider a full degree and full dimension vector-valued function of observables, the order and dimension for a particular system grows exponentially with the number of state variables n. The maximum degree of the polynomials in Ψ(x) is p n and the dimension of the whole set is The exponential growth of the order produces an overtting problem in the discrete approximation of the Koopman operator via the EDMD algorithm, and the growth in the dimension has the course of dimensionality problem, especially in the de nition of the discrete Koopman matrix from (16) that needs the inverse of matrix G. To avoid these problems it is necessary to restrict the maximum order of the polynomial basis and its dimension.

. Reduction by p-q-quasi norms
The reduction of the order and dimension of the polynomial basis proposed by the authors [23] is via a p-q-quasi norm³ reduction rst introduced in the solution of polynomial chaos problems [30]. De ne the q-quasi norm as, for a polynomial element ψ l (x) consider a maximum degree p ∈ N, a quasi norm q ∈ R + and de ne the set of indexes α l as Given that each element of the polynomial basis is the tensor product of the univariate polynomials in the retained indexes given by (36), this approach reduces the maximum order, and the overall dimension of the vectorvalued function of observables.

. Reduction by polynomial accuracy
Recall from Section 3.3 that the discrete-time approximation of the Koopman operator comes from a set of orbits of the system that gives the linear evolution of a set of observables Ψ(x) given in (15). Consider now the evolution of a single observable function of this set, i.e., where U d l is the l-th row of the Koopman operator matrix, and r l is the residual term or error for that particular polynomial element. The overall error of every element in the polynomial basis over a set of orbits of length N gives a metric of the contribution of that particular polynomial element into the accuracy of the approximation of the discrete-time Koopman operator. All the metrics for the individual elements serve to de ne an overall error criterion ϵ = (ϵ , . . . , ϵ d ) as From the de nition of this error, there are some elements that will remain in the basis for the approximation of the Koopman operator because their error is below a threshold, and others that will be eliminated.
The de nition of the elimination threshold depends on the selection of the injective functions that must be present in the basis to recover the state. Recall from Section 3.3 that to recover the state we need the matrix B that selects the n injective functions that recover the state.
The easiest way to select these injective functions is to choose the order-one polynomial elements in the basis, which gives a set of linear injective functions of the observations whose inverse is linear and easy to calculate. This means that every column of the matrix B is B n = {e n : α n = }.
As a consequence, the error thresholdε that guarantees that the order-one polynomial elements remain in the basis isε = max(ϵ : α l = ), and the multivariate polynomial elements that stay in the set of the reduced vector-valued function of observables In contrast to the previous approach, this reduction only a ects the overall dimension of the vector-valued function of observables. Although, this reduction by polynomial accuracy requires two iterations of the EDMD algorithm, increasing the computational burden, reducing the dimension of the linear predictors on observables from the discrete-time Koopman operator approximation is benecial for the subsequent analysis of eigenfunctions and possible control algorithms.

Analysis of the anaerobic digestion process
The previous sections present the necessary tools to perform the analysis, a methodology to get an approximation of the discrete-time Koopman operator based on orthogonal polynomials, and a couple of degree and dimension reduction methods. This representation gives the possibility to approximate the location of the xed points of the system, provides their stability, and use the set of eigenfunctions with an associated eigenvalue equal to one in conjunction with the type-one saddle points in the boundary of the ROA to determine the convergence of an arbitrary point in the state space. In summary, the required steps to perform the analysis are: 1 From a set of system trajectories, and with the EDMD algorithm of Section 3.3, calculate the discrete-time approximation of the Koopman operator. 2 From the discrete-time representation, use (26) to nd the xed points of the system. 3 Calculate the Jacobian of (24), and evaluate it on the xed points to get their local stability according to the spectral decomposition of the matrix. It is important to remark that these eigenvalues come from the classical stability analysis of linear systems, and are not related to the eigenvalues from the Koopman operator. 4 Analyze the ROA of the system, this includes nding an eigenfunction with associated eigenvalue equal to one for the case where the EDMD algorithm fails to provide an accurate one. This eigenfunction comes from (28).
Given that the anaerobic digestion system in its reduced form (2) is bistable under some operating conditions, where the separatrix between the stability regions is the stable manifold of a unique type-one saddle point, we can show the accuracy of the method analyzing this system. Consider 100 di erent operating conditions of system (2) in a 3D grid of four di erent dilution rates u, ve different values for the organic substrate ξ in and ve di erent values for the volatile fatty acids ξ in with values given in Table 2.

. Train and test data
The EDMD algorithm captures the dynamics of the portion of the state space where the data for the calculation lies. For the case of the eigenfunctions that approximate the ROA of the process, it is especially important to have an even distribution of data on the portion of the state space subject to evaluation.
There are some geometrical properties that hold for the reduced Anaerobic Digestion System [2]. Speci cally, it is important to note that the reduced system (2) is nonnegative by de nition; concentrations cannot be negative, and the canonical representation is also compartmental, where all the trajectories lie inside an invariant set S x given by From the boundary given by this subspace of R we de ne a set of initial conditions for every one of the cases. Figure 2 depicts the selection of such initial conditions and their respective orbits for an arbitrary realization, specically, for u = . , ξ in = , and ξ in = . The uniform distribution of these initial conditions on the boundary gives enough snapshot data for the approximation and testing of the discrete-time Koopman operator. Given that these trajectories come from the numerical integration of (7), an important consideration is the nal integration time. The integration algorithm⁴ stops when a speci c orbit reaches steady state because repeated data 4 MATLAB(R) ode23s in the asymptotically stable xed points a ects the condition number of G in (17), and as a consequence it a ects the accuracy of the approximation of the Koopman operator from (16). When the integration stops, the nal ∆t of the integration is inconsistent with the xed time-step of the integration, therefore, a ecting also the accuracy of the approximation. The solution to this problem is to discard the last value of every trajectory.

. Approximating the Koopman operator
The rst step to approximate the discrete-time Koopman operator is to de ne a set of orthogonal polynomials that act as observables for the system. Consider for example a set of Jacobi polynomials π (η,ν) α (x) of degree α with parameters η and ν that satisfy the di erential equation The solution of this di erential equation for a set of nonnegative integers α = { , , , }, and parameters η = . and ν = is summarized in Table 3, where the subscript on α indicates the univariate solutions for the rst component of the state vector. Table 3: Basis for the rst state variable x . Solution to the Jacobi ODE with parameters η = . and ν = .
Every element of the polynomial basis comes from (36), which gives a total of 16 polynomial elements for the full basis without truncation that correspond to all the available combinations of the products between two univariate polynomials, i.e., the whole basis is The rst reduction method is to apply a truncation scheme on this basis. From the de nition of the q-quasi norm in (38), and the p-q-quasi norm truncation scheme in (39), take for example the values of p = and q = . to reduce the degree and dimension of the basis. Figure 3 depicts the result of applying the truncation scheme, where the resulting number of polynomial elements is 8, e ectively reducing the maximum degree from 6 to 3 and the dimension by half. The rede ned basis for the observables is With the de nition of the train, test and observables set, the discrete-time approximation of the Koopman operator comes from the calculation of matrices A from (18), G from (17), and nally U d from (16).
Recall that for the state evolution map of the approximation of the Koopman operator (24), and the error threshold (43) for the second reduction method, an elemental matrix B with l rows, and n columns with a single unitary element per column in the index of an injective observable needs to be de ned to recover the state and set the error threshold for the elimination of polynomial elements. A simple solution to this problem is to select the univariate polynomial elements of degree one as stated in (42) from the basis. This type of selection also gives the advantage that the inverse of the selected function by B is linear. By applying this criterion, in our application example, that selects the second and fth element of the polynomial basis which are the univariate, order-one, and injective linear functions of the set of observables that de ne the set Ψ B (x) = [ψ (x), ψ (x)], and whose inverse gives the linear function of observations that recover the state as With the de nition of B, the approximation is further reduced via the second method, where the threshold for the elimination of polynomial elements from the basis comes from (41), and (43) as giving a lower degree, and reduced order approximation of the discrete-time Koopman operator that accurately satis es condition (15) for the set of observables.
With the reduced discrete-time Koopman operator matrix U d , the set of observables Ψ(x), the set Ψ B (x), and its corresponding inverse, the linear predictor that approximates the dynamics of the system from an arbitrary initial condition comes from (24). Although it is useful to have this linear approximation of the evolution of the states according to the spectral decomposition of the discrete-time approximation of the Koopman operator, in practice, it is enough to use a linear predictor based on the original set of observables as in (15). Considering again that y = T k (x), Equation (53) is linear, and it describes the linear evolution of observations of the state space. Along with matrix B, it only describes the linear evolution of a set of injective functions whose inverse recovers the state. Given that B selects the order-one observables, (54) is also a linear function. The bene t of having the evolution according to the polynomial basis of observables instead of the eigenfunctions of the discrete-time approximation of the Koopman operator is in terms of accuracy. Equation (24) needs the left eigenvectors W * of matrix U d to get the eigenfunctions as in (20), and the right eigenvectors Ξ to recover the value of the evolved observations. To have this full representation it is necessary to calculate the inverse of Ξ, adding additional error because of the numerical approximation of the higher order inverse. As a consequence, the set of eigenfunctions also su er accuracy issues. For eliminating this induced numerical error, the solution is to get matrix W * as the right eigenvectors of U d , and apply (20) to get the eigenfunctions.
Using the linear evolution of observables (53) and (54) instead of eigenfunctions, the linear predictor for the case of Jacobi polynomials (48) is In conclusion, these changes in the computation of the approximation give a linear predictor and a set of eigenfunctions that do not have an additional inverse numerical error.
There are still some open problems to solve in the approximation of the discrete-time Koopman operator. For our particular case, in which we select a type of orthogonal polynomial and then apply a p-q-quasi norm truncation scheme, there is no current solution to determine the type of polynomial, and the values of the p-q parameters that give the best performance. Therefore, we perform the approximation of the discrete-time Koopman operator for di erent types of polynomials with di erent values for the p-q parameters and evaluate their performance using the empirical error (25) to select the suboptimal polynomial type and parameter values. Table 2 shows the di erent type of polynomials, the values for the p-q sweep, and the minimum, maximum, and average number of training and testing points. Table 4 shows the result of averaging the empirical error given by every one of the polynomials, and the total time required to perform the calculation⁵. Despite the fact that a Legendre polynomial has the lowest average empirical error within the set, there are some cases where the accuracy of the Legendre polynomial basis does not hold well enough to determine an accurate classi er. Figure 4 shows the empirical error for each of the parameterizations in the grid, for the case, u = , ξ in = . and ξ in = . , the Legendre polynomial has an error of . , that is high enough for the EDMD algorithm not to give an accurate approximation of the discrete-time Koopman operator. Therefore, the selection of the type of basis for the whole analysis is Jacobi with parameters η = . , and ν = . Note that some of the cases have a larger empirical error in which the approximation is not ideal. The reason to have these inconsistencies in the results will be covered in the next section. . Jacobi ( . , ) .

. Discrete-time approximation result
Recall from Theorem 1, and De nitions 3 and 4 that it is necessary to have an accurate location and stability of the xed points of the system. Either to determine the stable manifold of the type-one saddle points in the boundary of the ROA for the classi cation criterion 3, or to apply the classi cation criterion 4 on the asymptotically stable ones.
For either of the classi cation criteria to perform correctly it is only necessary to get the location and stability of the working point x * A , the acidi cation point x * B , and the point that separates the these two x * C . Recall from 3.4 that the xed points are given by the solution of (26), and in practice, the solution to this equation from di erent start points gives the whole set of xed points according to the discrete-time approximation of the Koopman operator. The start points for the identi cation of the xed points comes from the trajectories of the test set, where the initial condition of every trajectory is a starting point for the solution. In addition to these points, and for this particular case it was necessary to select a random point along these trajectories because counting only with the initial conditions, there was no available starting point that gave an identi cation of the acidi cation wash-out point.
For the stability of the xed points, consider the linear evolution of observations (56). Although this is a linear function of observations, it is also a nonlinear function of the state that gives an approximation of its nonlinear evolution operator. From traditional control systems theory, the analysis of the eigenvalues from the linearization of the nonlinear evolution mapping evaluated at the xed points gives their local stability. Therefore, as proposed before by the authors [23], where J is the Jacobian matrix and, H is the Jacobian evaluated at a xed point. The local stability of a xed point x * with respect to the eigenvalues {µ i } n i= from the spectral decomposition of the matrix H are given by, • if |λ i | < for all i = , . . . , n then x * is asymptotically stable.
• if |λ i | > for all i = , . . . , n then x * is unstable. • if |λ i | > for some i = , . . . , n k and |µ i | < for some i = n k + , . . . , n, then x * is also unstable but has modal components that converge to it, making it a saddle point.
The application of (58) on (56), and with U d = (U i,j ) the Jacobian matrix J = (J i,j ) for the nonlinear evolution operator on the states is x − Table 5: Anaerobic digestion xed points for u = . , x in = , and x in = .
Theoretical Algorithmic x + x + (62) Evaluating the identi ed xed points in the Jacobian matrix and getting the respective eigenvalues give their local stability. Table 6 shows the result of this process for the same arbitrary case of the anaerobic digestion system. These result comes from a Jacobi polynomial basis with η = . , ν = , p = , and q = . . Figure 5 shows the theoretical, and approximated xed points of an arbitrary realization of the anaerobic digestion system from the solution of (5) from 24 initial conditions. These come from the orbits of the train set, one for every starting point of the train set, and one from an arbitrary point along those trajectories. The error of the working point, the separating saddle and the acidi cation point is low enough for every realization of the system for an accurate analysis of the ROA.
The results of approximating the discrete-time Koopman operator are summarized in Figure 6 that shows in the rst row the most accurate realizations of the anaerobic digestion system, and in the second row, the realizations with the highest empirical error. Although the trajectories for the training set of each of the cases come from the same parameterization for the process with a di erent value for the dilution rate and input concentrations, not all of the orbits for the training set give the same amount of information. The reduced, nonlinear canonical model for the anaerobic digestion system (7) has a movement and bifurcation phenomena for the xed points depending on the input parameters [2]. For the bi-stability case, the displacement of the separating saddle x * C is directly proportional to the values of input concentrations, especially for the volatile fatty acids concentration ξ in (see Figure 1). As a consequence, the size of the ROA for the asymptotically stable xed points changes. The fact that the selection of initial conditions for all the realizations is uniform along the boundary of the state space, makes the available data for the acidi cation ROA scarce, and reduces the accuracy of the algorithm. A solution to this problem is to generate the data with an experimental design that gives enough information on both regions. The reason to avoid that approach is because, it defeats the purpose of analyzing the system without prior knowledge of the system, otherwise we would have to know the ROA to nd the ROA.

. Approximating the ROA
The spectral decomposition of the discrete-time approximation of the Koopman operator gives a set of eigenvalues with their respective eigenfunctions. In the approximation from a Jacobi-type polynomial basis, and under the selected p-q sweep, each of the system gets a subset of the eigenfunctions with a unitary or near unitary eigenvalue. For the reduced-order anaerobic digestion system, and under the selected set of orbits for the approximation of the Koopman operator, the set of eigenfunctions is rich enough so that it contains at least one unitary eigenfunction that characterizes the stable manifold of the separating saddle point for the classi cation of an arbitrary initial condition. If that was not the case, the eigenfunction with unitary associated eigenvalue comes from (28). Several tests over the construction of unitary eigenfunctions suggest that selecting the eigenfunctions with a real valued eigenvalue closest to one improves the performance of the algorithm. Also note that (28) has an in nite number of solutions, and as a consequence not all eigenfunctions constructed this way give an accurate result. The selection of the base eigenfunctions and the solutions to the construction to guarantee an accurate approximation for either of the selection criteria is still an open question.
Also, it is important to note that the constant eigenfunction is also part of the unitary eigenfunctions set, i.e., ϕ + (x) = . In particular, because of the presence of the constant observable when approximating the discretetime Koopman operator, this trivial eigenfunction is often Table 6: Anaerobic digestion xed points for u = . , x in = , and x in = .
. Saddle present in the approximation. As a result, the eigenfunction that gives an accurate classi cation is a near unitary, or a constructed one. In our application example, and for the 100 di erent parameterizations, the approximation of the discrete-time Koopman operator gives a set of eigenfunctions whose unitary or near unitary eigenvalue give an accurate classication by the saddle selection rule (3). This result is depicted in Figure 7, where the associated eigenvalue is real, exactly one, and whose associated eigenfunction is nontrivial. The accuracy of this eigenfunction and the application of the saddle classi cation rule is the comparison of an additional test set of 900 initial conditions of the state space against the algorithm. For the realization under consideration, the error is 3 misclassi ed initial conditions out of the 900.
Consider next a realization where the approximation of the discrete-time Koopman operator gives a real valued nontrivial eigenfunction whose associate eigenvalue is near unitary, i.e., µ = .
. Although the accuracy decreases, with the number of misclassi ed initial conditions being seven out of 900, it is till a good approximation of the stable manifold and the convergence of an arbitrary initial condition in the state space.
For illustration purposes, consider now a case where the associated eigenvalue is not real-valued or unitary, e.g., µ = .
i. The eigenfunction along with the saddle classi cation algorithm gives an error of 35 misclassi ed initial conditions out of 900. The result is given in Figure 9 and shows that the eigenfunction is not monotonic with respect to x , and the non-monotonic portion is completely below the threshold given by the real part of the eigenfunction evaluated at the saddle point. If that was not the case the eigenfunction would give a big classi cation error for the ROA of x * A , meaning that it would classify all the initial conditions in the vicinity of x * A as convergent into x * B . Recall from Section 2 that the driving forces in the reduced anaerobic digestion process are the dilution rate, which is the controlled input of the process, and the uncontrolled uctuations of inlet organic matter and volatile fatty acids concentration. These driving forces produce a displacement of the xed points, and consequently a displacement of the stable manifold of the separating saddle point in the boundary of the ROA [13]. Additionally, consider the canonical system representation as it allows for graphical representations of the bacterial dynamics in a 2D plane. As a consequence the movement of the separating saddle stability boundary can be illustrated in x − x − x i  3D plots where x i = {u, ξ in , ξ in } that sweep over one of the parameters while the other two are constant. Figure 10 illustrates two di erent cases for the x − x − u case, varying the dilution rate between . ≤ u ≤ . as shown in Table 2 and keeping constant values of ξ in = { , . }, and ξ in = { , }. The plot shows the displacement of the xed points, especially how the stable manifold and the working point x * A move toward each other in the x axis, decreasing the methane production rate according to (3), and reducing the portion of the state space that converges to the working point.
Consider now the x − x − ξ in that describes an increase in the organic matter input concentration as ≤ ξ in ≤ for the values given in Table 2 for two di erent constant values of dilution rate and volatile fatty acid concentration of u = { .
, . }, and ξ in = { , }, using the same xed values for the input concentration as the previous case. Figure 11 depicts the approximation of the stable manifold of the dividing saddle, and shows that the relation between the organic matter and the dividing saddle is directly proportional in both axis. As a consequence, an increase in the organic matter concentration  at the input has the potential of increasing the methane production with the drawback of having a reduction of the working point ROA. The nal case considers the same constant values for the organic matter concentration at the input feed of ξ in = { , . } as in the rst case, and evaluates two di erent values of the dilution rate u = { .
, . } while varying the di erent values of volatile fatty acids concentration in the input feed given by Table 2, i.e., ≤ ξ in ≤ . Figure 12 shows the approximation of the dividing saddle stable manifold that gives the boundary of the ROA for the asymptotically stable xed points. In this case, the movement of the stable manifold and the working point are on the x axis and is directly proportional to the volatile fatty acids concentration at the input. As a consequence, there is an increase in the methane production and a reduction of the e ective state space that converges to the working point.

Discussion
In general, not only for the case of the anaerobic digestion system, the analysis of the ROA is static, i.e., it considers a xed value of the input. For the process at hand these are the input concentrations of substrates and the dilution rate. The rationale behind this approach is the movement and bifurcation phenomena present in the equilibrium points of nonlinear systems. For example, consider the movement of the xed points with respect to the input concentrations and dilution rate as depicted in Figures 10 to 12. The working point ROA size is inversely proportional to all the variations in the input, while the converse applies for the movement of the working point. Meaning that a transient and steady state optimization of biogas production bene ts from a dynamical analysis instead of a static one, which can reduce the computational burden for this type of optimizations, and can improve the current heuristic approach [32]. Under the Koopman operator approach and the discrete-time approximation given by the EDMD algorithm, it is possible to formulate an expansion of the state space de ned as the product of the original space and the input space, which is equivalent to setting the inputs as state variables without speci c dynamics. Using a polynomial expansion on the input space rather than considering the approximation given by Korda [33] as an input a ne system gives the possibility to handle it as a closed one, and as a consequence, analyze the movement of the equilibrium points and the stability boundary in a dynamic way.
The approximation of the discrete-time Koopman operator, either in static or dynamic ways via the EDMD algorithm still has some important problems to be resolved. The rst one is the amount of necessary data to have an accurate representation. For the anaerobic digestion process at hand, the analysis is possible based on a process model that can be numerically integrated from a set of initial conditions. Even though the reduction methods and the possibility to test with di erent polynomials give an accurate representation with a reasonable amount of data, the analysis would be unfeasible if the solely available trajectories were from a real digester.
Additionally, the type of data available also poses a problem for the analysis given that the species concentrations are not commonly available, whereas biogas out ow is. This implies that the algorithm needs to be extended so that the eigenfunctions from an approximation with partial information are still able to capture the dynamic behavior of the system to approximate all the necessary information: xed points, stability, unitary eigenfunctions, and classi cation criteria. Furthermore, for the dynamical analysis, it is necessary to get a classi cation criteria that not only depends on the current state but also depends on the input concentrations and dilution rate of the system.

Conclusion
In this paper we replicate the analysis of the region of attraction for the reduced anaerobic digestion process using only data from a set of trajectories of the dynamical system. Speci cally, we use the extended dynamic mode decomposition algorithm to approximate a discrete-time Koopman operator. This operator has a spectrum of eigenfunctions where a particular set of these, which have an associated eigenvalue equal to one, are the basis for formulating a clear algebraic criterion to classify an arbitrary point in the state space into either of the asymptotically stable equilibrium points. In contrast to the traditional methods, this approach has the bene t of being strictly data-driven. The analysis provides a clear criterion that does not depend on the speci c geometry of a manifold or the ability to calculate level sets and interpolate between data-points for the classi cation of an arbitrary point in the state space which is the case for any method that relies on the approximation of level sets. As a result, the analysis can be extended to models with more than two reactions and state variables while preserving a clear and simple criterion for the classi cation.
Given that it is di cult and costly to acquire data from a reactor, this approach also opens up the possibility of developing an analysis based on the commonly available measurements such as the ow of biogas at the output. To make that possible, it is necessary that the approximation and reduction techniques presented here are extended so that they achieve an accurate approximation of the discrete-time Koopman operator for a reduced set of forced system trajectories.

Funding information:
The authors state no funding involved.
Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.

Con ict of interest:
The authors state no con ict of interest.