Study on online soft sensor method of total sugar content in chlorotetracycline fermentation tank

Abstract To solve the problem that the total sugar content of a chlortetracycline fermentation tank cannot be detected automatically online, this paper proposes a prediction method that combines an output recursive wavelet neural network (ORWNN) with Gaussian process regression (GPR). A soft sensor model was established between the measurable parameters (inputs) and the total sugar content (output) of the chlortetracycline fermentation tank, and the model was trained with a self-updating algorithm. The accuracy and generalization ability of the soft sensor model were analyzed on field data, with the mean relative error (MRE) and the root mean square error (RMSE) used to evaluate its performance. The results show that the combined model predicts more accurately than either single model. The prediction precision of the soft sensor model based on the ORWNN-GPR combination remains relatively high over the long fermentation period, so the model is suitable for online prediction of the total sugar content of the chlortetracycline fermentation tank. The soft sensor method can effectively reduce the labor intensity of the analysts and save production costs for the enterprise.


Introduction
Chlortetracycline is a broad-spectrum tetracycline antibiotic that is widely used in medical treatment, agriculture and animal husbandry. At present, the industrial production of chlortetracycline mainly relies on biological fermentation: Streptomyces aureofaciens is cultured, and chlortetracycline is obtained as a metabolite of the mycelium [1]. Modern biological fermentation depends on a series of complex biochemical reactions carried out by microbes. During production, a large number of parameters must be measured to ensure that the fermentation conditions suit the metabolic state of the mycelium, which is of great significance for improving the production efficiency of the industrial process. Most parameters of the chlortetracycline fermentation process can be detected directly by industrial instruments, but some, such as the total sugar content, the biological potency of chlortetracycline and the amino nitrogen content, can only be determined by off-line analysis of manually taken samples [2]. The total sugar content has a great influence on the growth and fermentation of the microorganism. Because the fermentation broth is highly viscous, no instrument is available for monitoring the total sugar content online, and the current detection method is off-line analysis of samples taken manually on site. This method is labor-intensive, has a large time lag and low measurement efficiency, and can hardly meet the needs of modern industrial production [3].
Bai Jianyun [4] used an artificial neural network for soft sensor modeling and achieved on-line detection of NO_x mass concentration. Huang Yonghong [5] used a fuzzy neural network to study soft sensing of the key parameters of the lysine fermentation process. Zhang Haiying [6] applied the least squares support vector machine to soft sensing of cutting force. Qiao Zongliang [7] proposed an improved support vector machine for soft sensing. Zhong Huaibing [8] proposed an on-line soft sensor method based on the GPR machine learning principle. These soft sensor methods each have their own characteristics and can be applied to soft sensing of the relevant parameters in different industrial processes.
The whole process involves about 25 fermentation tanks, and 3-5 parameters must be analyzed for the samples from each tank. In view of production and labor costs, the factory currently samples each fermentation tank every 4-8 hours.
In this paper, an artificial intelligence method is used to establish an online soft sensor model of total sugar content for the chlortetracycline fermentation process. The measurable data and the total sugar content analysis data of the fermentation process were used to train the soft sensor model, and data not used in training were reserved for model verification.
The experimental results show that the soft sensor model predicts the total sugar content with high accuracy and can meet the prediction requirements for hard-to-measure parameters in industrial fermentation processes. There are more than 20 fermentation tanks at the chlortetracycline production site. Samples are taken from each fermentation tank, and several parameters must be analyzed after the samples have been filtered. This takes a lot of time, and the labor intensity of the analysts is very high. The sampling interval at the production site is therefore set to 4-8 hours. With such a long sampling interval, however, the operators cannot learn the total sugar content in the fermentation tank in time and have to feed blindly, which causes the total sugar content in the tank to fluctuate and affects the output and quality of the product. Soft sensing of the total sugar content is an online prediction method that can reduce labor intensity and save production costs.

Output recursive wavelet neural network
WNN is a feed-forward neural network with one or more hidden layers. It is an extension of the radial basis function neural network in which a radial wavelet is used as the activation function of the hidden layer. The wavelet functions are obtained from the mother wavelet by translation and scaling. Wavelet analysis decomposes the original signal into a superposition of a series of wavelet functions [9]. The wavelet transform takes the inner product of the signal $f(t)$ with translated and scaled versions of a radial wavelet function $\phi(t)$, as shown in Eq. (1).
$$W_f(a, \tau) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} f(t)\, \phi\!\left(\frac{t - \tau}{a}\right) dt \quad (1)$$
where $a > 0$ is the scale factor and $\tau$ is the displacement factor.
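As a rough numerical illustration of the wavelet transform described above, the sketch below approximates the inner product of a sampled signal with a scaled and shifted wavelet by a Riemann sum. It assumes the common form $W(a, \tau) = a^{-1/2} \int f(t)\, \phi((t - \tau)/a)\, dt$ and takes $\phi$ to be the first derivative of a Gaussian. The function names, the uniform sampling grid and the choice of wavelet are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def gauss_wavelet(t):
    """Assumed mother wavelet: first derivative of a Gaussian (up to sign)."""
    return -t * np.exp(-0.5 * t ** 2)

def wavelet_coefficient(signal, t, a, tau):
    """Approximate W(a, tau) = a**-0.5 * integral f(t) * phi((t - tau) / a) dt.

    signal : samples f(t) on the uniform grid t
    a      : scale factor, must be positive
    tau    : displacement factor
    """
    if a <= 0:
        raise ValueError("scale factor a must be positive")
    dt = t[1] - t[0]  # uniform grid spacing (assumption)
    return float(np.sum(signal * gauss_wavelet((t - tau) / a)) * dt / np.sqrt(a))

# Usage: coefficients of a sampled signal at two scales around tau = 0.
t = np.linspace(-10.0, 10.0, 2001)
f = np.sin(t)
coarse = wavelet_coefficient(f, t, a=2.0, tau=0.0)
fine = wavelet_coefficient(f, t, a=0.5, tau=0.0)
```

Because this wavelet has zero mean, a constant signal yields a coefficient of essentially zero, which is a quick sanity check on the discretization.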
In this paper, an improved wavelet neural network, the Output Recursive Wavelet Neural Network (ORWNN) [10,11], is used to build the soft sensor model of total sugar content. Figure 1 shows the structure of the ORWNN. It has four layers: the input layer, the wavelet layer, the summing layer and the output layer.
L1: Input Layer
$q$ is the output vector of the input layer and $n$ is the number of input layer nodes. The recursive input is the output value delayed by one interval unit.
L2: Wavelet Layer
The Gauss wavelet function is used in this layer, and $n$ is the number of wavelet bases in the layer. Each node in the wavelet layer evaluates the wavelet function, in which $b_i$ and $a_i$ are the two factors that must be revised continually during training.

L3: Summing Layer
In the summing layer, the generalized T-norm used in fuzzy neural networks is applied to compute the output of each node in this layer.
L4: Output Layer
Each node in the output layer computes a linear combination of its inputs to produce the model output, where $\omega_i$ is the weight of each node.
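The four-layer structure described above can be sketched as a single forward pass. This is only a minimal illustration under stated assumptions: the Gauss wavelet is taken to be the first derivative of a Gaussian, the generalized T-norm is taken to be a product over the inputs, and the recursive input is the previous model output. The function names, array shapes and parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np

def gauss_wavelet(z):
    """Gauss wavelet, assumed here to be the first derivative of a Gaussian."""
    return -z * np.exp(-0.5 * z ** 2)

def orwnn_forward(x, y_prev, a, b, w):
    """One forward pass of a hypothetical four-layer ORWNN.

    x      : (n,) measurable inputs at the current step
    y_prev : float, model output delayed by one step (recursive input)
    a, b   : (m, n + 1) scale and translation factors of m wavelet nodes
    w      : (m,) output-layer weights
    """
    # L1, input layer: current inputs plus the delayed output.
    q = np.append(np.asarray(x, dtype=float), y_prev)
    # L2, wavelet layer: every node applies the wavelet to every input.
    phi = gauss_wavelet((q - b) / a)
    # L3, summing layer: product T-norm across the inputs (an assumption).
    psi = np.prod(phi, axis=1)
    # L4, output layer: linear combination of the node outputs.
    return float(w @ psi)

# Usage: feed each prediction back in as the next recursive input.
a = np.ones((4, 3))
b = np.full((4, 3), -0.5)
w = np.full(4, 0.25)
y = 0.0
for x in ([0.5, 1.0], [0.6, 0.9]):
    y = orwnn_forward(x, y, a, b, w)
```

In training, the factors `a`, `b` and the weights `w` would be adjusted to reduce the prediction error; that update rule is outside the scope of this sketch.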

Gaussian process regression model
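A Gaussian process (GP) is an important stochastic process that is ubiquitous in nature; any finite set of its samples follows a joint Gaussian distribution [12,13]. Suppose the input and output sample set is $D = \{(x_i, y_i)\}_{i=1}^{n}$. The Gaussian process regression (GPR) model can be described as
$$y = f(x) + \varepsilon$$
where $f$ is the unknown function and $\varepsilon$ is Gaussian white noise with a mean of 0 and a variance of $\sigma_n^2$. For a normalized data set, the output variables of GPR obey a Gaussian distribution with zero mean, that is,
$$\mathbf{y} \sim N(0, C)$$
where the covariance matrix $C$ is an $n \times n$ symmetric positive definite matrix, written as
$$C = \begin{bmatrix} C(x_1, x_1) & \cdots & C(x_1, x_n) \\ \vdots & \ddots & \vdots \\ C(x_n, x_1) & \cdots & C(x_n, x_n) \end{bmatrix}$$
Common covariance functions include the constant, linear, squared exponential, periodic, Matérn and rational quadratic covariance functions [14]. In this paper, the Matérn covariance function with a noise term is used:
$$C(x_i, x_j) = k_M(x_i, x_j) + \sigma_n^2 \delta_{ij} \quad (9)$$
where $k_M$ denotes the Matérn kernel, $\theta$ is the set of non-negative hyperparameters of the GPR covariance function, and $\delta_{ij}$ takes only two values: $\delta_{ij} = 1$ if $i = j$, and $\delta_{ij} = 0$ otherwise. The hyperparameter set $\theta$ is estimated by maximizing the log-likelihood function
$$L(\theta) = -\frac{1}{2} \mathbf{y}^{T} C^{-1} \mathbf{y} - \frac{1}{2} \log |C| - \frac{n}{2} \log 2\pi \quad (10)$$
The maximum is found from the partial derivatives with respect to the hyperparameters, that is,
$$\frac{\partial L(\theta)}{\partial \theta_k} = \frac{1}{2} \mathbf{y}^{T} C^{-1} \frac{\partial C}{\partial \theta_k} C^{-1} \mathbf{y} - \frac{1}{2} \mathrm{tr}\!\left(C^{-1} \frac{\partial C}{\partial \theta_k}\right) \quad (11)$$
where $\mathrm{tr}(\cdot)$ denotes the trace of a matrix. For a new test sample $x_*$, the properties of the Gaussian process imply that the test sample and the training samples belong to the same distribution, and the joint distribution is
$$\begin{bmatrix} \mathbf{y} \\ y_* \end{bmatrix} \sim N\!\left(0, \begin{bmatrix} C & k(x_*) \\ k(x_*)^{T} & C(x_*, x_*) \end{bmatrix}\right) \quad (12)$$
where $k(x_*)$ is the vector of covariances between the test sample $x_*$ and the training samples, and $C(x_*, x_*)$ is the covariance of $x_*$ with itself; all of these elements are also obtained from the covariance function in Eq. (9). Therefore, the predicted output $y_*$ of the GPR model obeys Eq. (13) and Eq. (14).
$$E(y_*) = k(x_*)^{T} C^{-1} \mathbf{y} \quad (13)$$
$$\mathrm{Var}(y_*) = \sigma_*^2 = C(x_*, x_*) - k(x_*)^{T} C^{-1} k(x_*) \quad (14)$$
where $E(\cdot)$ denotes the mean and $\mathrm{Var}(\cdot)$ denotes the variance. The final predicted output of the GPR model is the predicted mean $y_*$, and $\sigma_*^2$ indicates the credibility of the predicted value.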
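The GPR prediction step described above can be sketched with NumPy alone. The sketch assumes a Matérn kernel with smoothness 3/2 (the paper does not state which Matérn variant it uses) and hyperparameters fixed by hand rather than estimated by maximum likelihood. The names `matern32`, `gpr_fit`, `gpr_predict`, `length`, `sigma_f` and `sigma_n` are hypothetical.

```python
import numpy as np

def matern32(X1, X2, length, sigma_f):
    """Matern covariance with nu = 3/2 (the value of nu is an assumption)."""
    d = np.sqrt(((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2))
    s = np.sqrt(3.0) * d / length
    return sigma_f ** 2 * (1.0 + s) * np.exp(-s)

def gpr_fit(X, y, length=1.0, sigma_f=1.0, sigma_n=0.1):
    """Build C with the noise term on its diagonal and precompute C^{-1} y."""
    n = len(X)
    C = matern32(X, X, length, sigma_f) + sigma_n ** 2 * np.eye(n)
    L = np.linalg.cholesky(C)  # C = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # Log-likelihood: -1/2 y^T C^{-1} y - 1/2 log|C| - n/2 log(2 pi),
    # using log|C| = 2 * sum(log(diag(L))).
    log_lik = -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)
    return {"X": X, "L": L, "alpha": alpha, "log_lik": float(log_lik),
            "params": (length, sigma_f, sigma_n)}

def gpr_predict(model, X_new):
    """Predictive mean k^T C^{-1} y and variance C(x*, x*) - k^T C^{-1} k."""
    length, sigma_f, sigma_n = model["params"]
    k_star = matern32(model["X"], X_new, length, sigma_f)  # shape (n, m)
    mean = k_star.T @ model["alpha"]
    v = np.linalg.solve(model["L"], k_star)
    # C(x*, x*) includes the noise term because the delta term equals 1 there.
    var = sigma_f ** 2 + sigma_n ** 2 - (v ** 2).sum(axis=0)
    return mean, var

# Usage with toy data (not fermentation data).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.sin(X[:, 0])
model = gpr_fit(X, y, sigma_n=1e-3)
mean, var = gpr_predict(model, np.array([[1.5], [100.0]]))
```

Far from the training data the predicted mean returns to zero and the variance approaches its prior value, which is how the variance signals low credibility of a prediction.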

The method of model training and evaluation
A soft sensor model is built on artificial neural network and machine learning theory, and the cumulative update learning method is used to train it. Experimental data on the process parameters of normally operating chlortetracycline fermentation tanks in a factory form the original data set. The process parameters are shown in Table 1. Fermentation time, temperature, pH, dissolved oxygen (DO), air flow rate, cumulative air flow, feeding rate, cumulative feed and cumulative ammonia are easy to measure on site; they are the inputs of the soft sensing model, and the total sugar content is the model output. The input and output training data sets are constructed from the 15 batches of data in the original data set, ordered in time for each fermentation tank, and contain all the fermentation data of the production process.
Pucheng Zhengda Fujian Biochemical Co., Ltd. in China has nine 120 m³ fermentation tanks, as well as more than 10 seed tanks, primary fermentation tanks and secondary fermentation tanks. Several batches of data were selected from each fermentation tank to train the soft sensor models, and several batches not used for training were kept as verification data.
In cumulative update training, the model is trained on the training data set of historical tank batches and tested on the forecast data set. The new input and output data are then added to the fermentation history data set to form a new training data set, which realizes cumulative training of the soft sensor model. The flow of the cumulative update training algorithm is shown in Figure 2.
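The cumulative update procedure described above can be sketched as a simple loop: predict a new batch, append its measured data to the history, and retrain. The sketch uses a least-squares model purely as a stand-in so that the example is self-contained; it is not the ORWNN-GPR model, and the names `LeastSquaresModel` and `cumulative_update` are hypothetical.

```python
import numpy as np

class LeastSquaresModel:
    """Stand-in for the soft sensor model; NOT the ORWNN-GPR model."""

    def fit(self, X, y):
        A = np.column_stack([X, np.ones(len(X))])  # add an intercept column
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self

    def predict(self, X):
        return np.column_stack([X, np.ones(len(X))]) @ self.coef_

def cumulative_update(model, history_X, history_y, new_batches):
    """Predict each new batch, then add it to the history and retrain."""
    X_hist = np.asarray(history_X, dtype=float)
    y_hist = np.asarray(history_y, dtype=float)
    model.fit(X_hist, y_hist)
    predictions = []
    for X_new, y_new in new_batches:
        # Predict before the measured outputs of the new batch are used.
        predictions.append(model.predict(X_new))
        # Update the history data set with the new inputs and outputs.
        X_hist = np.vstack([X_hist, X_new])
        y_hist = np.concatenate([y_hist, y_new])
        # Retrain on the enlarged training set.
        model.fit(X_hist, y_hist)
    return predictions, len(y_hist)

# Usage with toy data (not fermentation data).
X0 = np.array([[0.0], [1.0], [2.0]])
y0 = 2.0 * X0[:, 0] + 1.0
batches = [(np.array([[3.0], [4.0]]), np.array([7.0, 9.0])),
           (np.array([[5.0]]), np.array([11.0]))]
preds, n_samples = cumulative_update(LeastSquaresModel(), X0, y0, batches)
```

Any model that exposes `fit` and `predict` could be substituted for the stand-in, so the update loop itself stays independent of the model type.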
The fermentation tank data used in this paper come from a chlortetracycline production site. The input variables of the soft sensor model, that is, the measurable parameters of the fermentation tank, are readings from industrial instruments in the fermentation field. The total sugar content in the fermentation tank is obtained from manual sampling and analysis. The prediction model was trained on data from multiple fermentation batches, and data from batches not used in training served as test data. The predicted total sugar content was compared with the manually analyzed value to assess the prediction accuracy of the soft sensor method.
To analyze the prediction error of the soft sensing model, the mean relative error (MRE) and the root mean square error (RMSE) are introduced. They are effective measures for testing whether the soft sensing model meets the measurement requirements for total sugar content.
$$\mathrm{MRE} = \frac{1}{N} \sum_{i=1}^{N} \frac{|y_i - \hat{y}_i|}{\hat{y}_i} \times 100\%$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2}$$
where $N$ is the number of samples, $y_i$ is the predicted value of the $i$-th sample and $\hat{y}_i$ is the real value of the $i$-th sample.
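The two error indices can be computed as in the following sketch. It assumes the usual definitions, with MRE as the mean of the absolute error divided by the real value and reported as a percentage, and RMSE as the square root of the mean squared error. The function names are illustrative.

```python
import numpy as np

def mean_relative_error(y_pred, y_true):
    """MRE in percent: mean of |predicted - real| / |real| (assumed definition)."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)) * 100.0)

def root_mean_square_error(y_pred, y_true):
    """RMSE: square root of the mean squared prediction error."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

# Usage with made-up numbers (not measured data).
real = [2.0, 4.0]
predicted = [2.2, 3.6]
mre = mean_relative_error(predicted, real)      # about 10 (percent)
rmse = root_mean_square_error(predicted, real)  # about 0.316
```

MRE is scale-free and so is easy to compare against a tolerance such as 10%, while RMSE is expressed in the units of the total sugar content and penalizes large individual errors more heavily.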

Analysis of experimental results
The fermentation broth in the chlortetracycline fermentation process is turbid, complex in composition and highly viscous, so existing total sugar content instruments cannot be placed in direct contact with it. Laboratory analysts therefore have to sample the broth on site, and the total sugar content is measured with a special analytical instrument after filtration and other operations (called the "manual measurement value" here). There are more than 20 fermentation tanks (including primary and secondary seed tanks) at the chlortetracycline fermentation site, and the analysts must sample, filter and analyze them one by one. Besides the total sugar content, they also have to determine a number of other parameters, so the process is very time-consuming. We therefore used the manual measurement value as the real value (benchmark value) of total sugar content and compared it with the online predicted value.
After the soft sensor model had been trained, field process data from two batches of the factory's T01 and T02 fermentation tanks that were not used in training served as the model input. The total sugar content was predicted and compared with the value measured by off-line manual analysis (taken as the real value), as shown in Figure 3 and Figure 4.
Figure 3 compares the predictions of the ORWNN-GPR integrated model, the ORWNN model and the GPR model with the real values (manual measurement values), based on field data from two batches (No. 1 and No. 2) of the T01 chlortetracycline fermentation tank. The deviation between the predicted and true total sugar content is smaller for the ORWNN-GPR integrated model than for the ORWNN and GPR models, so the integrated model predicts the total sugar content online more accurately than either single model.
Figure 4 illustrates that the prediction method proposed in this paper can be applied to different fermentation tanks. Based on field data from two batches of the T02 fermentation tank, the prediction results are compared with the real values (manual measurement values). The deviation between the predicted and true total sugar content is again smaller for the ORWNN-GPR integrated model than for the ORWNN and GPR models. This shows that the ORWNN-GPR integrated model has higher accuracy and better generalization ability for online prediction of total sugar content.
To verify and compare the generalization ability of the soft sensor models, the training data set was reduced to 50% of the original data set. After training, the total sugar concentration was predicted from the parameter data of one batch of the T01 fermentation tank.
The total sugar content changes in a complex, dynamic way during chlortetracycline fermentation, and a single soft sensor method cannot maintain high prediction accuracy throughout the fermentation cycle. The ORWNN-GPR combination method can maintain high accuracy in online prediction of total sugar content over the chlortetracycline fermentation process.
In this paper, the ORWNN-GPR model and the single ORWNN and GPR models were used to predict the total sugar content. The root mean square error (RMSE) and the mean relative error (MRE) were used as indices for statistical analysis, and the results are shown in Table 2.
Both the ORWNN soft sensor model and the GPR soft sensor model keep the average error of the predicted total sugar concentration within 10%. With a large amount of training data, the ORWNN soft sensor model has the smaller prediction error.
With a small amount of training data, the GPR soft sensing model has the smaller prediction error. As training sample data accumulate, the accuracy of the ORWNN soft sensor model improves relative to that of the GPR model. The ORWNN-GPR combined soft sensing method ensures high prediction accuracy in the early stage of model training, and its accuracy increases further with the cumulative update of training samples.
Throughout the working cycle of the chlortetracycline fermentation tank, the soft sensing method based on the ORWNN-GPR model maintains high precision in predicting the total sugar content. It effectively solves the long delay and serious lag of manual sampling analysis and provides rapid and reliable data support for optimal control of the sugar feeding rate, which can reduce production costs and improve the production efficiency of the chlortetracycline fermentation tank. The research object and data in this paper come from an actual production site rather than from simulation or the laboratory, which distinguishes this work from related articles published by other research groups.

Conclusion
The cumulative update training method adds the new input and output data to the fermentation history data set to form a new training data set every time a prediction is performed, and thereby realizes self-updating of the soft sensor model. The chlortetracycline fermentation process studied in this paper is a continuous industrial production process. The cumulative update training method can continuously use new detection data to update the model parameters and maintain the prediction accuracy of the model. The main contributions of this paper are as follows:
(1) A soft sensor model was established between the measurable parameters (inputs) and the total sugar content (output) of the chlortetracycline fermentation tank, addressing the fact that the total sugar content cannot be detected online during chlortetracycline fermentation.
(2) Based on neural network structure and the basic principles of machine learning, the ORWNN-GPR combination method is adopted to realize online prediction of total sugar content in the chlortetracycline fermentation process.
(3) The experimental results show that the soft sensor method based on the ORWNN-GPR combination has high prediction accuracy, effectively reduces the labor intensity of analysts, reduces production costs and stabilizes the production process, so it has good practical application value.
The total sugar content is an important parameter for on-line automatic measurement in the chlortetracycline fermentation process. This paper combines the output recursive wavelet neural network with Gaussian process regression to establish an online soft sensor model of the total sugar content of the fermentation tank. The prediction results of the ORWNN, GPR and ORWNN-GPR methods were compared with field data. The experimental results show that the ORWNN-GPR combined soft sensing model is more accurate than the single ORWNN and GPR soft sensor models and can meet the online prediction requirements for the total sugar content of the fermentation tank in chlortetracycline production. The combined soft sensor method has practical application value.