Prediction of oxygen-blowing volume in BOF steelmaking process based on BP neural network and incremental learning

Abstract: In view of the characteristics of the dynamic basic oxygen furnace (BOF) steelmaking process, prediction models based on a backpropagation neural network and incremental learning (BPNN-IL) are proposed for total blow oxygen volume and second blow oxygen volume. The incremental learning adjusts the weights and thresholds of the BPNN according to the difference between the predicted value and the actual value of each heat, so that the model adapts to changes in furnace conditions. The combined BPNN-IL models are trained and tested on actual production data and are further compared with multiple linear regression models and BPNN models. The results show that, for both total blow oxygen volume and second blow oxygen volume, the BPNN-IL models provide the most accurate prediction, and that introducing incremental learning further improves the predictive accuracy. The BPNN-IL method is therefore effective in predicting the oxygen-blowing volume in the BOF steelmaking process.


Introduction
Basic oxygen furnace (BOF) steelmaking is a complex process involving physical and chemical reactions at high temperatures. Blowing oxygen into the molten pool of the BOF oxidizes the impurities in the liquid metal and stirs the pool at the same time, thereby decarburizing the melt, raising its temperature, and adjusting the composition of the molten steel. The control of oxygen-blowing volume is therefore very important: it directly determines the smelting effect and the quality of the steel and thus affects the end-point control of the BOF steelmaking process.
At present, the oxygen-blowing volume in the BOF steelmaking process is controlled mainly through a static model and a dynamic model. The static model is based on the initial conditions of steelmaking and the target requirements of the steel grade; it uses material balance and heat balance to calculate the total blow oxygen volume and the additives needed to reach the required end-point conditions, and thus provides a guide for operation. However, because of the complexity of the steelmaking process, the many influencing factors, and unstable operations, the static model cannot be adjusted in real time, so relying completely on a static model cannot control the end-point carbon content and temperature in the BOF well. To solve this issue, in the final stage of BOF steelmaking, instruments such as a sub-lance or an off-gas analyzer are employed to measure the temperature and carbon content of the molten steel. From the measured temperature and carbon content and the end-point target parameters, a dynamic model calculates and adjusts the amounts of oxygen and coolant to control the end-point of the steelmaking process; this is called dynamic control.
As the basis of dynamic model control, the precision of the static model affects the effectiveness of dynamic control, so the study of static models is also of great importance. At present, static models mainly include mechanism models [1][2][3], statistical models [4,5], incremental models [6], and intelligent models [7][8][9]. Compared with the other models, intelligent models handle the nonlinearity of the steelmaking process and achieve good results; they overcome problems that the other models have, such as the large number of influencing factors, the difficulty of describing them with exact mathematical equations and statistical methods, and poor control precision. For example, Wang and Han [7] presented a causality-based case-based reasoning model for the static control of converter steelmaking. Zhao et al. [8] established a static model for the prediction of oxygen consumption in BOF based on a genetic algorithm and extreme learning machine. Gao et al. [9] proposed a static control model of BOF steelmaking based on wavelet transform weighted twin support vector regression. Li et al. [10] proposed an improved deep belief network model, based on deep learning and massive historical data, for the converter of a steel mill.
For most BOFs, process control relies on static and dynamic models [11]: the static model guides the early and middle stages of BOF steelmaking, while the dynamic model guides the final stage, which directly affects control precision at the end-point. Improving the accuracy of the dynamic model can therefore help increase the hit rate at the end-point of BOF steelmaking. To date, the most commonly used dynamic models are the exponential decarburization model and the intelligent model. Most exponential decarburization models are based on the sub-lance and off-gas analyzer system, and the amount of oxygen to be blown in the dynamic stage of BOF steelmaking can be calculated with them; an example is the dynamic model of Linz-Donawitz (LD) converters established by Carlucci et al. [12]. Some exponential decarburization models do not depend on the sub-lance and off-gas analyzer system, such as the BOF quasi-dynamic control model established by Chen et al. [1]. The exponential decarburization model reflects the regularities of the decarburization rate in the final stage of BOF steelmaking reasonably well, but dynamic models built on it still have some problems. For example, many parameters in the model are difficult to determine, yet they play a decisive role in its precision. When the conditions of the furnace and raw materials are unsteady, the precision of the model is often unsatisfactory, so self-learning and adjustment of the model parameters are critical.
To further improve the control accuracy of the dynamic model, many scholars have built BOF dynamic models using intelligent modeling techniques. For example, Cox et al. [13] and Fileti et al. [14] presented prediction models of end-blow oxygen and coolant based on artificial neural networks to improve the hit rate of BOF end-point temperature and carbon content. Rajesh et al. [15] developed a multi-layered feedforward neural network model for the prediction of end blow oxygen in the LD converter using a two-step process. Han et al. [16] established a BOF dynamic control model using case-based reasoning, an adaptive-network-based fuzzy inference system, and a robust relevance vector machine.
The above-mentioned intelligent models have played an important role in the development of BOF static and dynamic models, but most of them use batch learning, which requires all training data to be prepared before learning; once the samples have been learned, the learning process terminates and no new knowledge is acquired. This does not meet the actual requirements of the BOF, because in practice training samples are not obtained all at once but gradually over time, and the information reflected in the samples is unsteady and varies with time. Relearning all the data whenever new samples arrive would waste a great amount of time and storage, so batch-learning models cannot meet such requirements. An incremental learning algorithm, by contrast, updates knowledge progressively: it corrects and strengthens previous knowledge so that the updated knowledge adapts to the newly arrived data, without relearning all the data. Incremental learning reduces the demand for time and storage and can better meet the actual control requirements of the unstable and time-varying conditions of the BOF steelmaking process.
To address the above problems and the actual characteristics of the BOF steelmaking process, this article presents a method that combines a backpropagation neural network with incremental learning (BPNN-IL) to construct prediction models of total blow oxygen volume and second blow oxygen volume in the dynamic steelmaking process of the BOF. In practical application, the incremental learning method performs self-learning: it adjusts the weights and thresholds of the current BPNN according to the difference between the predicted value and the actual value of each heat, so that the model adapts to changes in furnace conditions and its prediction accuracy improves. This enables accurate control of the oxygen-blowing volume in the steelmaking process, which reduces oxygen consumption, increases the hit rate at the end-point of BOF steelmaking, reduces the number of overblows and reblows, and lowers the production cost.
The number of neurons in the output layer determines the dimension of the output vector. The hidden layer plays a decisive role in the structure of the BPNN and may consist of a single layer or multiple layers. In a BP neural network model, the prediction of the output layer is obtained by forward propagation of information. The errors between the predicted values and the expected values are then propagated backward using the gradient descent method, and the connection weights between input-layer and hidden-layer neurons, the connection weights between hidden-layer and output-layer neurons, and the thresholds of the hidden-layer and output-layer neurons are revised iteratively. The goal is to find a combination of weights and thresholds that minimizes the network error. The BPNN model has a strong nonlinear processing ability and has been applied to the prediction and control of the BOF steelmaking process by many scholars [17][18][19][20][21]. However, the BPNN models proposed in these studies are based on batch learning and have no online self-learning ability. The unstable and time-varying characteristics of raw materials and operating conditions in the actual BOF steelmaking production process will affect the generalization ability of the models.
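The forward-propagation step described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name `forward`, the list-of-lists weight layout, and the convention that thresholds are added as biases are all assumptions made for the example, and the random weights stand in for trained values.

```python
import math
import random


def forward(x, w_ij, b_j, w_jk, b_k):
    """Forward pass of a three-layer network (assumed layout, not the paper's code).

    w_ij[i][j] links input i to hidden node j; w_jk[j][k] links hidden node j
    to output node k. Thresholds b_j and b_k are added as biases (assumption).
    """
    # Hidden layer: tangent sigmoid (tanh) transfer function
    hidden = [
        math.tanh(sum(x[i] * w_ij[i][j] for i in range(len(x))) + b_j[j])
        for j in range(len(b_j))
    ]
    # Output layer: linear transfer function
    output = [
        sum(hidden[j] * w_jk[j][k] for j in range(len(hidden))) + b_k[k]
        for k in range(len(b_k))
    ]
    return hidden, output


# Illustrative 10-9-1 network with random (untrained) weights
random.seed(0)
n_in, n_hidden, n_out = 10, 9, 1
w_ij = [[random.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
b_j = [0.0] * n_hidden
w_jk = [[random.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_hidden)]
b_k = [0.0] * n_out
hidden, output = forward([0.1] * n_in, w_ij, b_j, w_jk, b_k)
print(len(hidden), len(output))
```

The output list has one entry per output neuron, which matches the statement that the output layer size fixes the dimension of the output vector.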
Construction of a BP neural network model can be summarized as the selection of model input and output variables, the preparation and preprocessing of data, and the training and testing of the network to determine the optimal network structure. The selection of input and output variables is the basis of the model and directly affects its final prediction performance. Preparation and preprocessing of data mainly serve to obtain effective training samples and test samples. To eliminate the impact of different dimensions on the data, all samples are normalized to the range (−1, 1) according to formula (1) before the network is trained. Training and testing of the network determine the optimal network structure, such as the number of hidden layers, the number of neurons in each hidden layer, the transfer function, and the weights and biases in the network. In this study, a BP network with three layers is used. The tangent sigmoid function is used as the transfer function of the hidden layer, as shown in formula (2). The linear transfer function is used as the transfer function of the output layer, as shown in formula (3).
In formula (1), y is the normalized value of the variable, and x_max and x_min are the maximum and minimum of each variable x.
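Formulas (1) to (3) are not reproduced in the text here, so the sketch below uses the common forms that match the description: min-max scaling onto [−1, 1], the tangent sigmoid 2/(1 + e^(−2z)) − 1 (which equals tanh z), and the identity function. These forms, and the function names `normalize`, `denormalize`, `tansig`, and `purelin`, are assumptions for illustration.

```python
import math


def normalize(x, x_min, x_max):
    """Min-max scaling of x from [x_min, x_max] onto [-1, 1] (assumed form of formula (1))."""
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0


def denormalize(y, x_min, x_max):
    """Inverse of normalize: map a value in [-1, 1] back to the original scale."""
    return (y + 1.0) * (x_max - x_min) / 2.0 + x_min


def tansig(z):
    """Tangent sigmoid, 2 / (1 + exp(-2z)) - 1, computed as tanh(z)."""
    return math.tanh(z)


def purelin(z):
    """Linear transfer function: returns its input unchanged."""
    return z


# The midpoint of the range maps to 0, the extremes to -1 and +1
print(normalize(5.0, 0.0, 10.0), normalize(0.0, 0.0, 10.0), normalize(10.0, 0.0, 10.0))
```

A prediction made on normalized data has to be passed through `denormalize` with the output variable's own minimum and maximum before it is compared with the measured oxygen volume.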

Incremental learning method
Incremental learning is a widely used intelligent data mining and knowledge discovery technique. Its basic idea is a learning system that can continuously learn new knowledge from new samples while retaining most of the previously learned knowledge. As samples gradually accumulate, the learning accuracy also improves. In the BOF steelmaking process, the actual furnace conditions and raw materials are unstable and time-varying and involve many uncertainties, which a BP neural network model alone cannot take into account. Therefore, this article proposes to introduce incremental learning on the basis of the BPNN model, as shown in Figure 2.
In this way, the established prediction model can learn from the change in operating conditions of each heat in practical application, which improves its prediction performance. The combined BPNN-IL model is described as follows. First, the BPNN model is established according to Section 3.1, and the optimal network structure is obtained. For example, for the three-layer BPNN model in Figure 2, training on effective historical data yields the optimal network structure, which mainly comprises the weights w_ij and w_jk, the thresholds b_j and b_k, and the number of hidden layer nodes. This step is offline batch learning.
Then, on the basis of the optimal BP neural network structure, in practical application, as soon as the input variable data are collected, the BP neural network model predicts the target output from the real-time data of the current heat. Subsequently, when the actual output of the current heat is obtained, a gradient descent algorithm with a momentum term [22] is used to update the weights w_ij and w_jk and the thresholds b_j and b_k according to the difference between the predicted output and the actual output, as shown in formulas (4) to (11). The next heat uses the updated weights and thresholds to calculate the target output. This step is online incremental learning. It continuously adjusts the weights and thresholds of the BP neural network according to the difference between the predicted value and the actual value of the control target of each heat, which enables the prediction model to adapt to the changing conditions of each heat. The learning formulas of the weights and thresholds in Figure 2 are the results obtained after substituting formulas (2) and (3). In these formulas, b_j(t) and b_j(t + 1) are the thresholds of hidden layer node j for the current heat (t) and the next heat (t + 1), respectively; η and α are parameters between 0 and 1, which can be determined by experiments and comparisons in steps of 0.001.
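The predict-then-update cycle per heat can be sketched as follows. Formulas (4) to (11) are not reproduced in the text here, so this sketch assumes the classical momentum form Δw(t + 1) = η · δ · input + α · Δw(t) with a squared-error loss, a tanh hidden layer, and a linear output layer; the class name `IncrementalBPNN` and its methods are hypothetical, and the paper's exact update rule may differ.

```python
import math


class IncrementalBPNN:
    """Three-layer network updated once per heat (illustrative sketch only)."""

    def __init__(self, w_ij, b_j, w_jk, b_k, eta=0.015, alpha=0.03):
        # Weights and thresholds from offline batch training
        self.w_ij, self.b_j, self.w_jk, self.b_k = w_ij, b_j, w_jk, b_k
        self.eta, self.alpha = eta, alpha  # learning rate and momentum factor
        # Previous updates (momentum memory), initially zero
        self.dw_ij = [[0.0] * len(b_j) for _ in w_ij]
        self.db_j = [0.0] * len(b_j)
        self.dw_jk = [[0.0] * len(b_k) for _ in w_jk]
        self.db_k = [0.0] * len(b_k)

    def predict(self, x):
        """Return hidden activations and outputs for one heat."""
        h = [math.tanh(sum(x[i] * self.w_ij[i][j] for i in range(len(x))) + self.b_j[j])
             for j in range(len(self.b_j))]
        y = [sum(h[j] * self.w_jk[j][k] for j in range(len(h))) + self.b_k[k]
             for k in range(len(self.b_k))]
        return h, y

    def learn(self, x, s):
        """Update weights and thresholds once the actual output s is known."""
        h, y = self.predict(x)
        # Output error terms (linear output layer)
        delta_k = [s[k] - y[k] for k in range(len(y))]
        # Hidden error terms use the tanh derivative 1 - h^2 and the old w_jk
        delta_j = [(1.0 - h[j] ** 2) * sum(delta_k[k] * self.w_jk[j][k] for k in range(len(y)))
                   for j in range(len(h))]
        for j in range(len(h)):
            for k in range(len(y)):
                self.dw_jk[j][k] = self.eta * delta_k[k] * h[j] + self.alpha * self.dw_jk[j][k]
                self.w_jk[j][k] += self.dw_jk[j][k]
        for k in range(len(y)):
            self.db_k[k] = self.eta * delta_k[k] + self.alpha * self.db_k[k]
            self.b_k[k] += self.db_k[k]
        for i in range(len(x)):
            for j in range(len(h)):
                self.dw_ij[i][j] = self.eta * delta_j[j] * x[i] + self.alpha * self.dw_ij[i][j]
                self.w_ij[i][j] += self.dw_ij[i][j]
        for j in range(len(h)):
            self.db_j[j] = self.eta * delta_j[j] + self.alpha * self.db_j[j]
            self.b_j[j] += self.db_j[j]
        return y  # prediction that was made before the update
```

In use, `predict` would be called when the input data of a heat arrive and `learn` when its measured oxygen volume becomes available, so that the next heat is predicted with the refreshed weights. The default `eta` and `alpha` simply echo the values reported later for the total blow oxygen model.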

Establishment and experiments of prediction model for oxygen-blowing volume in BOF steelmaking process
In this article, based on the dynamic steelmaking process of the converter with sub-lance system, prediction models of total blow oxygen volume and second blow oxygen volume have been established using the BPNN-IL method.
The control flow of oxygen-blowing volume in the whole steelmaking process is shown in Figure 3.

[Figure labels (incremental learning flow): real-time data of current heat, input variables x_i (i = 1, 2, ...); calculate the output value of each neuron in the hidden layer; calculate the output value of each neuron in the output layer; prediction value of the model; actual value of output variables of current heat, s_k (k = 1, 2, ...); self-learning of weight w_jk and bias b_k; self-learning of weight w_ij and bias b_j; refresh weights and biases for next heat.]

Determination of model input and output variables
The selection of input and output variables is very important because it directly affects the prediction accuracy of the model. For prediction model 1 (the prediction model of total blow oxygen volume), the output variable is total blow oxygen volume, and the input variables are determined by its influencing factors. For prediction model 2 (the prediction model of second blow oxygen volume), the output variable is second blow oxygen volume, and the input variables are determined by its influencing factors.
In this article, based on the characteristics of the actual BOF steelmaking process (see Figure 3) and the analysis of historical production data, the input and output variables of the prediction models of total blow oxygen volume and second blow oxygen volume have been determined, as shown in Table 1. Pearson correlation analysis has been conducted on the historical production data for each input variable and the output variable of the prediction models, and the resulting correlation coefficients are shown in Tables 2 and 3. The Pearson correlation coefficient (R) is calculated by formula (12), where the sample is expressed as (X_i, Y_i), and X̄ and Ȳ are the means of X and Y, respectively.
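Formula (12) is not reproduced in the text here; the standard definition of the Pearson coefficient is R = Σ(X_i − X̄)(Y_i − Ȳ) / sqrt(Σ(X_i − X̄)² · Σ(Y_i − Ȳ)²). A minimal sketch of that calculation follows; the function name `pearson_r` is chosen for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("samples must have the same length and at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either sample is constant
    return s_xy / math.sqrt(s_xx * s_yy)


# Perfectly linear toy data: positive and negative slope
print(pearson_r([1, 2, 3], [2, 4, 6]))
print(pearson_r([1, 2, 3], [6, 4, 2]))
```

The same function serves both uses of R in the article: screening input variables against the output variable, and comparing predicted with actual oxygen volumes.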
The correlation coefficient of two variables reflects the degree of influence between them. Based on the correlation coefficients in Tables 2 and 3, influencing factors with a correlation coefficient greater than 0.1 or less than −0.1 are selected as the main influencing factors of total blow oxygen volume. The main influencing factors of total blow oxygen volume are therefore scrap weight, BOF endpoint temperature, and hot metal [Si] content. According to the size of the correlation coefficient between second blow oxygen volume and each influencing factor, the influence degree of each influencing factor on

Data preparation and preprocessing
In this study, actual production data from the converter in the M steel plant are collected according to the model input and output variables. These data are then pretreated by removing incomplete and obviously wrong records and retaining the data that conform to the normal process range as effective data samples; for example, TSC

Prediction model of total blow oxygen volume
For the MLR model, the maximum number of iterations is set to 1,000, and the number of iterations for convergence is 27. The second method is BPNN. A BPNN model is also established for prediction of total blow oxygen volume. The model is designed as follows: a multi-layer BP neural network is adopted, the transfer function in the hidden layer is the tangent sigmoid function, the transfer function in the output layer is the linear transfer function, and the LM optimization algorithm is used to train the network. Following the analysis of Section 3.1, the input layer consists of 10 nodes representing the 10 factors, and the output layer consists of a single node representing the predicted total blow oxygen volume. Four BPNN models with different numbers of hidden layer nodes have been developed, as listed in Table 5. The maximum number of epochs is 2,000.
The third method is BPNN-IL. The BPNN-IL model is developed from the best of the above four BPNN models. For incremental learning in the model, η and α are determined by experiments to be 0.015 and 0.03, respectively.
To evaluate the prediction effect, these models are compared on the same test data from 171 heats. The results are shown in Table 5. The correlation coefficient (R) between the predicted values and the actual values reflects the potential of the models in actual application. For a perfect fit of data, R is equal to 1. For the MLR model, the correlation coefficient is only 0.5362, which shows that the predicted values are not very consistent with the actual values. For the BPNN models, the correlation coefficient improves markedly. For the BPNN(MLP 10-9-1)-IL model, the correlation coefficient is 0.7214. This indicates that the combination of BPNN and IL can further improve the correlation coefficient.
At the same time, from Table 5, when the predictive errors of total blow oxygen volume are within ±1,000 N·m³, the hit rate of the MLR model is 84.80%, the hit rate of the best BPNN(MLP 10-9-1) model is 87.13%, and the hit rate of the BPNN(MLP 10-9-1)-IL model is 89.47%. When the predictive errors of total blow oxygen volume are within ±800 N·m³, the hit rate of the MLR model is 71.93%, the hit rate of the best BPNN(MLP 10-9-1) model is 80.70%, and the hit rate of the BPNN(MLP 10-9-1)-IL model is 84.21%. This shows that the BPNN achieves higher prediction accuracy than the MLR, and that the introduction of incremental learning further improves the predictive accuracy of the BPNN model.

Prediction model of second blow oxygen volume
For the MLR model, the number of iterations for convergence is 19. The second method is BPNN. Similar to the BPNN models for prediction of total blow oxygen volume, BPNN models for prediction of second blow oxygen volume are also established. Following the analysis of Section 3.1, the input layer consists of seven nodes representing the seven factors, and the output layer consists of a single node representing the predicted second blow oxygen volume. Five BPNN models with different numbers of hidden layer nodes have been developed, as shown in Table 6. The maximum number of epochs is 2,000. The third method is BPNN-IL. The BPNN-IL model for prediction of second blow oxygen volume is developed from the best of the above five BPNN models. For the incremental learning in the model, η and α are determined by experiments to be 0.018 and 0.05, respectively.
To evaluate the prediction effect, these models are compared on the same test data from 280 heats. The results are shown in Table 6. For the MLR model, the correlation coefficient between the predicted values and the actual values of second blow oxygen volume is 0.8423. For the BPNN models, the correlation coefficient improves markedly. For the BPNN(MLP 7-5-1)-IL model, the correlation coefficient is 0.9226. This indicates that the combination of BPNN and IL can further improve the potential of the prediction model in actual application.
Meanwhile, from Table 6, when the predictive errors of second blow oxygen volume are within ±500 N·m³, the hit rate of the MLR model is 92.50%, the hit rate of the best BPNN(MLP 7-5-1) model is 95.71%, and the hit rate of the BPNN(MLP 7-5-1)-IL model is 97.86%. When the predictive errors of second blow oxygen volume are within ±300 N·m³, the hit rate of the MLR model is 77.50%, the hit rate of the best BPNN(MLP 7-5-1) model is 83.21%, and the hit rate of the BPNN(MLP 7-5-1)-IL model is 85.71%. The BPNN(MLP 7-5-1)-IL model thus achieves the best prediction accuracy among these models, which indicates that the incremental learning method can further improve the prediction accuracy of second blow oxygen volume. As shown in Figure 6, for the BPNN(MLP 7-5-1)-IL model, the predicted values of second blow oxygen volume agree well with the actual values. The weights and biases matrices of the best BPNN(MLP 7-5-1)-IL model for prediction of second blow oxygen volume are shown in Figure 7.
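The hit rates quoted above are the share of test heats whose absolute prediction error falls within a tolerance. A minimal sketch of that metric follows; the function name `hit_rate` and the choice to count an error exactly equal to the tolerance as a hit are assumptions, and the toy arrays are synthetic, not the plant data.

```python
def hit_rate(predicted, actual, tolerance):
    """Percentage of heats with |predicted - actual| <= tolerance."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return 100.0 * hits / len(actual)


# Synthetic check of the arithmetic: 153 hits out of 171 heats
predicted = [0.0] * 171
actual = [0.0] * 153 + [2000.0] * 18
print(round(hit_rate(predicted, actual, 1000.0), 2))  # 89.47
```

As the synthetic check shows, a hit rate of 89.47% on 171 test heats corresponds to 153 heats inside the tolerance band.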

Sensitivity analysis
Sensitivity analysis investigates the influence of the input parameters of a prediction model on the predicted output parameters, and it is important for evaluating the model and studying the robustness of its predictions. In this article, sensitivity analysis has therefore been carried out for the above prediction models of total blow oxygen volume and second blow oxygen volume. Each tested parameter (a key input parameter of the prediction model) is varied from its minimum value to its maximum value while the other input parameters are kept at their average levels, and the prediction model is used to observe the resulting change in blow oxygen volume.
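The single-factor procedure described above can be sketched as a simple sweep. The function name `sensitivity_sweep`, the number of steps, and the toy linear model are assumptions for illustration; in practice `model` would be the trained prediction model.

```python
def sensitivity_sweep(model, mins, maxs, means, index, steps=11):
    """Vary input `index` from its minimum to its maximum, others held at their means.

    Returns a list of (input value, model prediction) pairs.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    results = []
    for s in range(steps):
        x = list(means)  # all other inputs stay at their average level
        x[index] = mins[index] + (maxs[index] - mins[index]) * s / (steps - 1)
        results.append((x[index], model(x)))
    return results


# Toy stand-in for a trained model: a linear function of two inputs
toy_model = lambda x: 2.0 * x[0] + x[1]
for value, prediction in sensitivity_sweep(toy_model, [0.0, 0.0], [10.0, 1.0], [5.0, 0.5], 0, steps=3):
    print(value, prediction)
```

Plotting the returned pairs for each key input would give curves of the kind discussed for Figures 8 and 9.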

Sensitivity analysis of the parameters affecting prediction of total blow oxygen volume
Through the analysis in Section 4.1, it is determined that the main influencing factors of total blow oxygen volume are scrap weight, BOF endpoint temperature, and hot metal [Si] content. Single-factor sensitivity analysis is performed on the prediction model of total blow oxygen volume for each of these three factors, as shown in Figure 8. Figure 8a and c show that the predicted value of total blow oxygen volume increases with the increase in scrap weight and hot metal [Si] content, respectively. Figure 8b shows that when BOF endpoint temperature is less than 1,640°C, the predicted value of total blow oxygen volume decreases with the increase in BOF endpoint temperature. When BOF endpoint temperature is greater than 1,690°C, the predicted value of total blow oxygen volume increases with the increase in BOF endpoint temperature. When BOF endpoint temperature is between 1,640 and 1,690°C, the predicted value of total blow oxygen volume changes little. Controlling the BOF endpoint temperature within this range (1,640 to 1,690°C) therefore helps to reduce cost and energy consumption. The sensitivity patterns of the main factors shown in Figure 8 are consistent with actual production experience, so the prediction of the model is relatively robust.

Sensitivity analysis of the parameters affecting prediction of second blow oxygen volume
The main influencing factors of second blow oxygen volume are TSC temperature, TSC [C] content, BOF endpoint [C] content, BOF endpoint temperature, and scrap weight. Single-factor sensitivity analysis is performed on the prediction model of second blow oxygen volume for each of these five factors, as shown in Figure 9.
Figure 9a shows that the predicted value of second blow oxygen volume decreases with the increase in TSC temperature and that the variation is large. Figure 9b shows that when TSC [C] content is within 0.3-0.75%, the predicted value of second blow oxygen volume increases with the increase in TSC [C] content and the variation is large; in other ranges, the predicted value of second blow oxygen volume fluctuates little.
Figure 9c shows that when BOF endpoint [C] content is less than about 0.05%, the predicted value of second blow oxygen volume decreases with the increase in BOF endpoint [C] content and the change is large; when BOF endpoint [C] content is more than about 0.05%, the predicted value changes little. Figure 9d shows that when BOF endpoint temperature is greater than 1,640°C, the predicted value of second blow oxygen volume increases with the increase in BOF endpoint temperature and the change is large; when BOF endpoint temperature is less than 1,640°C, the predicted value changes little. Figure 9e shows that the predicted value of second blow oxygen volume increases with the increase in scrap weight. Based on the above analysis and Figure 9, TSC temperature, TSC [C] content, and BOF endpoint temperature have a great influence on the prediction of second blow oxygen volume. The sensitivity patterns of the main factors shown in Figure 9 are also consistent with actual production experience, so the model shows good robustness.

Conclusion
For the dynamic steelmaking process of the BOF with a sub-lance system, the MLR, BPNN, and BPNN-IL methods have been used to establish prediction models of total blow oxygen volume and second blow oxygen volume. To validate their prediction effect, comparative experiments on the same data set have been carried out. The results show that, for both total blow oxygen volume and second blow oxygen volume, the BPNN-IL method provides the most accurate prediction among these methods, and that the introduction of incremental learning further improves the predictive accuracy. For the BPNN-IL method, the hit rate of total blow oxygen volume is 89.47, 87.13, and 84.21% when prediction errors are within ±1,000, ±900, and ±800 N·m³, respectively. The hit rate of second blow oxygen volume is 97.86, 95.00, and 85.71% when prediction errors are within ±500, ±400, and ±300 N·m³, respectively. The correlation coefficient between the actual and predicted values of second blow oxygen volume is as high as 0.9226. The experimental results also show that the predicted values of oxygen-blowing volume agree well with the actual values. Furthermore, the influence of the main input parameters on the predicted output of the BPNN-IL method has been investigated by sensitivity analysis in order to evaluate the robustness of the method. The results of the sensitivity analysis show that the BPNN-IL method predicts the total blow oxygen volume and second blow oxygen volume robustly and could be applied to the actual production process.

Figure 1: Blowing process of the BOF steelmaking.
BPNN structure: w_ij, w_jk, b_j, b_k, number of hidden layer nodes, etc.

Figure 3: Flow diagram of the control of oxygen-blowing volume in BOF steelmaking process.

Figure 4 shows a comparison of total blow oxygen volume between the predicted values and the actual values of the best BPNN(MLP 10-9-1)-IL model; the results indicate that the predicted total blow oxygen volume is close to the actual total blow oxygen volume. The weights and biases matrices of the best BPNN(MLP 10-9-1)-IL model for prediction of total blow oxygen volume are shown in Figure 5.

Figure 4: Comparison of total blow oxygen volume between predicted values and actual values of the BPNN(MLP 10-9-1)-IL model.

Figure 5: The weights and biases matrices of the best BPNN(MLP 10-9-1)-IL model for prediction of total blow oxygen volume.

Figure 6: Comparison of second blow oxygen volume between predicted values and actual values of the BPNN(MLP 7-5-1)-IL model.

Figure 7: The weights and biases matrices of the best BPNN(MLP 7-5-1)-IL model for prediction of second blow oxygen volume.

Figure 8: (a) Sensitivity analysis between scrap weight and total blow oxygen volume predicted; (b) sensitivity analysis between BOF endpoint temperature and total blow oxygen volume predicted; and (c) sensitivity analysis between hot metal [Si] content and total blow oxygen volume predicted.

Table 1: Input and output variables for prediction models

content, BOF endpoint [C] content, BOF endpoint temperature, scrap weight, coolant addition, and hot metal weight in descending order. In the same way, the influencing factors with a correlation coefficient greater than 0.1 or less than −0.1 are also selected as the main influencing factors of second blow oxygen volume. The main influencing factors of second blow oxygen volume are therefore TSC temperature, TSC [C] content, BOF endpoint [C] content, BOF endpoint temperature, and scrap weight. When the input variables of the prediction models are selected, the main factors affecting the output variables must be retained, while other factors can be retained selectively.
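The selection rule stated above (keep factors whose correlation coefficient is greater than 0.1 or less than −0.1, ranked by size) can be sketched as a small filter. The function name `select_main_factors` is hypothetical, and the coefficient values in the example are made-up placeholders, not the values from Tables 2 and 3.

```python
def select_main_factors(correlations, threshold=0.1):
    """Return factor names with |R| > threshold, ordered by |R| descending."""
    picked = [(name, r) for name, r in correlations.items() if abs(r) > threshold]
    picked.sort(key=lambda item: abs(item[1]), reverse=True)
    return [name for name, _ in picked]


# Placeholder coefficients for illustration only (not the paper's data)
example = {"factor A": 0.45, "factor B": -0.30, "factor C": 0.04}
print(select_main_factors(example))
```

Factors below the threshold are not necessarily discarded: as the text notes, they can still be retained selectively as model inputs.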
The data of 1,471 heats mentioned before are divided into a training set and a test set. The data from 1,300 heats are used as the training set, and the data from the other 171 heats are used as the test set. Based on the same training set, three methods are used to construct the prediction model of total blow oxygen volume. The first method is multiple linear regression (MLR). The MLR model is first established for prediction of total blow oxygen volume. It is implemented in the SPSS software. The optimization algorithm is the Levenberg-Marquardt (LM) method.

Table 2: Correlation analysis of the process variables for prediction model 1

Table 3: Correlation analysis of the process variables for prediction model 2

Table 4: Descriptive statistics of model variables

Table 5: Hit rate of predicted total blow oxygen volume with different models for 171 heats

Table 6: Hit rate of predicted second blow oxygen volume with different models for 280 heats