Evaluation of sodium salt scaling in black liquor evaporators using existing process data

Scaling causedby sodiumcarbonate and sodium sulphate salts is a common problem during black liquor evaporation and mills currently have no proper tools to monitor or follow it up. The method proposed enables automatic evaluation of scaling rates and behaviours, together with washing performance, by using measurements that are already available at most mills, such as the boiling point elevation and the overall heat transfer coefficient. The method identified the scaling correctly in most cases, normally >90%, although fast and sudden scaling proved to be challenging; its performance was also dependent on the quality of the data used. Historical data from threemills was evaluated and it was found that the scaling rate generally increased at higher concentrations of black liquor: one of the mills had almost no scaling below 70% dry solids content. Another mill, however, deviated from this trend and had significant problems with fast and sudden scaling in one effect at around 60% dry solids content. The correlation of process parameters revealed that this scalingwas triggered, to a large extent, by the absence of tall oil brine, which is a side stream from the production of tall oil that is normally fed to the evaporators.


Introduction
Black liquor is an important process stream in a pulp mill. It is generated during the kraft pulping of wood and contains spent cooking chemicals and dissolved wood components. This liquor is processed in the recovery cycle, where the first step is to remove most of the water by evaporation to allow it to be combusted in the recovery boiler. The black *Corresponding author: Erik Karlsson, Chalmers University of Technology, Gothenburg, Sweden; and RISE Innventia AB, Stockholm, Sweden (present address), e-mail: erik.karlsson@ri.se, ORCID: https://orcid.org/0000-0002-0759-9031 liquor is normally concentrated in a multiple effect evaporation plant from about 20 % to 70-85 % dry solids content (DS) (Adams et al. 1997). The liquor contains relatively high levels of sodium carbonate and sodium sulphate, together constituting around 15 % of the dry mass; they form double salts that start to precipitate above approx. 50 % DS (Grace 1976, Schmidl andFrederick 1998).
The abundance of these sodium salts in black liquor implies that the evaporator effects operating above 50 % DS will operate as crystallizers. This, in turn, causes a great risk of scaling, which creates an insulating layer on the heat transfer surface and thus decreases the heat transfer performance (Bott 1997). Normally, the effects are designed with larger heat transfer surfaces to allow operation even under scaled conditions and have an automated washing sequence installed. However, even with these measures, scale formation can be so fast that it creates severe process disturbances leading to lost production capacity for the whole pulp mill DeMartini 2006, Karlsson 2017). It has been shown that one important factors affecting the degree of sodium salt scaling is the crystal species, which is determined by the ratio between carbonate and sulphate (Frederick et al. 2004, Gourdon et al. 2010. Other factors are the concentration profile along the heat transfer surface (Gourdon et al. 2010), DS (Karlsson et al. 2013a) and size of the formed bulk crystals ). In addition, the solubility is important for when the crystallization starts (Bialik et al. 2008).
Even if measures are taken in the industry to mitigate evaporator scaling, there are no systems available (at least not widespread) that quantify the scaling rate and thereby allow for adequate monitoring and follow-up of evaporator scaling. Improved monitoring of important process parameters can also allow for optimization of the operation, which, for examples, has already been shown for the pulp washing process by Shan and Tang (2020). Traditionally and today, the wash frequency is the most common measurement used for determining the extent of evaporator scaling because it has the advantage of being easy to both measure and understand (Grace 1975, Schmidl andFrederick 1998). It is, however, common that washing is carried out according to a predetermined schedule and its frequency does not then provide any information of scaling.
Other interesting process parameters are also logged by the mills automation system, but none measure the scaling rate directly. The scaling rate can nevertheless be extracted from heat transfer measurements, even though it is difficult to isolate from noise and other factors influencing the measurements: it is probably why this is not done in the pulp industry at present. Automatic detection of scaling or fouling can be classified as online or offline: online is used more for controlling the operation and offline for follow-ups. These techniques have been tested for heat exchangers, e. g. those used in district heating (Gudmundsson et al. 2016) and in the food industry (Wallhäußer et al. 2012). Fault detection methods, which is a more general term used in control engineering for the detection of malfunctional equipment, have been demonstrated for use in detecting scaling in, among other things, the alumina evaporation process (Chen et al. 2016). In the studies mentioned, several different methods for detecting scaling were tested; there are many more available, of course, of varying complexity. The more advanced methods use a model to predict clean conditions or subtract variations caused by changes in operation, i. e. variations not caused by fouling. These models can either be physical, based on knowledge of the system (white box), or data-driven (black box) (Kwak et al. 2020). The latter have gained especial attention in the past years due to the large interest in machine learning (e. g. neural network models), but also thanks to progress in the field of statistics (e. g. canonical correlation analysis and co-integration analysis), often in combination with methods from control systems engineering (e. g. Kalman filtering and fuzzy logic) (Chen et al. 2016, Garcia 2012, Kwak et al. 2020, Lalot and Lecoeuche 2003. Under clean conditions, the overall heat transfer coefficient, U, in falling film evaporators is affected mainly by viscosity, which is determined by the temperature and concentration of the black liquor (Gourdon et al. 2016, Karlsson et al. 2013b. During normal operation, however, these parameters are relatively constant (i. e. if washing is excluded) which means that, at least as a first step, scaling can be detected directly from deviations in U. During these kinds of steady-state conditions, methods such as CUSUM (cumulative sum control chart) have been suggested for online drift detection, e. g. of biofouling in seawater heat exchangers (Boullosa-Falces et al. 2020). It should also be mentioned that scaling in black liquor evaporators can occur significantly more rapidly than is common in other applications, making change point detection (or step detection) methods suitable. These methods are also suitable for offline detection, which is the focus in this study.
Access to quantitative data of scaling rates would: 1) help mill personal in their efforts to improve the availability of the evaporator, 2) allow scaling characteristics between different effects and mills to be compared and 3) enable correlation analysis between process parameters as a way of understanding the cause of scaling. Therefore, the aims of this work were the following: 1. Develop an automatic method for offline evaluation of scaling and other related parameters of interest from measurements already available in industry. Validate the method using industrial data. 2. Evaluate scaling in a few pulp mills to improve understanding of the scaling situation in industrial black liquor evaporators. Use correlation analysis to determine which process parameters affect the scaling rate and washing performance.

Method development
The first part in this work was to develop a method for automatic scaling evaluation, which is described in this section in the following order: 1) initial considerations leading to the selected strategy for how to evaluate the scaling rate, 2) method for identification of the operational mode, i. e. if the evaporator effect is in operation or washing, 3) general trend analysis for calculation of the average scaling rate, 4) advanced trend analysis for identification and calculation of the actual scaling rate and 5) calculation of additional parameters (e. g. the washing performance). The data was divided into operational cycles, defined to begin after a wash and end after the subsequent wash, as shown in Figure 1. The cycles were numbered, starting from 1 after the first wash in the dataset, as the data for the cycle before was incomplete. Figure 1 also shows how the data was interpreted and analysed, which will be described in the subsequent sections.

Initial considerations
The overall heat transfer coefficient, U (W m −2 K −1 ), is calculated from the heat load, Q (which is measured as the condensate flow rate,ṁ cond , multiplied by the heat of vaporization, ΔH vap ), the temperature difference between steam and liquor, ΔT (which is constant along the heat transfer surface in the cases of condensation and evapo- Figure 1: Example of results from evaluation of data from 1C at Mill 3. Washes (yellow) were identified using BPE and for each operational cycle (defined to begin after a wash and end after the subsequent wash), the heat transfer coefficient, U, was analysed to find key values (e. g. U scaled , U clean and SR CP ) and fit a model to filter the trend. ration) and the heat transfer area, A, thus: To develop a measure of the scaling rate (SR), U was found to be a better parameter to use than ΔT (which is easier to measure and also commonly used for scaling detection in industry today) because it is normalized in terms of heat load (i. e. not affected by the steam pressure) and therefore has a more direct connection to scaling. U is, however, affected by the properties of the black liquor, where the concentration is the parameter variating most and is important since it affects viscosity. Fortunately, it was found that the influence of concentration could be neglected with reasonable accuracy, and that fluctuations in U due to other reasons (probably measurement uncertainties) generally had a stronger influence. Other factors affecting U are the liquor flow rate, although this is relatively constant because the evaporators have a recirculation pump running at constant speed, and the steam condensation rate, which is dependent on the heat load but has a lesser influence compared to other factors.
It is also desirable that as many properties as possible of the black liquor are known. Unfortunately, online sensors are generally limited in evaporation plants: refractometers, and maybe even density meters are, however, common for measuring DS in a few positions. The boiling point elevation (BPE) is normally also measured in each effect and, since it is based on a simple measurement technique (temperature and pressure), it is generally very robust. BPE could therefore be used as a measure of liquor concentration in the calculations.

Identifying the operational mode
In order to interpret the data, the operational mode of the effect, i. e. operating or washing, had to be identified. Following the trend in the BPE was found to be a reliable method for this: when the BPE was less than 60 % of the mean value, it was almost always due to washing or nonoperation. The exact points for the start and stop of operation then corresponded to a clear change in the trend of the BPE, often referred to as a knee or elbow, where the slope of the curve increased drastically. These two points were found by rotating the curve 15°around the z-axis using linear transformation and then taking the maximum value. Some filtration was also implemented to improve identification according to the following: -If U was lower than 50 W m −2 K −1 or higher than 2950 W m −2 K −1 after ending a wash, the end was postponed by 20 min. -If the wash was less than 20 min, it was not considered as being a real wash and removed.
-If the time between two washes was less than 1 h, they were combined into one.
This gave a robust method that was accurate enough to identify the start and stop of operation; Figure 1 illustrates the identification of washes. Wash 3 in Figure 1 can be considered as being a typical wash that lasted for 2.5 h (the time for scale dissolution, i. e. the duration when the liquor in the effect was below solubility, was around 1 h). Wash 4 was much longer, and was caused by a process stop in other parts of the mill. It can be discussed whether this event should be called a wash or not but, as the evaporators are set in wash mode when not in operation, in principal all process stops can be treated as washes.
Calculating the average growth of scales U was first smoothened, giving U smooth using the LOWESS (locally weighted scatterplot smoothing) method, and then the general trend was analysed by extracting the following for each operational cycle: -U at start (i. e. just after the previous wash), U clean , extracted from U smooth . -U just before wash, U scaled , extracted from U smooth . -Whether scaling occurred, true if more than 20 % of the heat transfer was lost during the operation, U scaled < 0.8·U clean . -The amount of scales formed, δ formed , calculated using the apparent scale thickness, δ (Equation (2)). -The time in operation (or the wash interval), i. e. the time between the washes. -The average scaling rate, SR avg , calculated as δ formed divided by the time in operation.
The apparent scale thickness, δ, was calculated from the definition of fouling resistance, R f , since the scales act as an insulating layer (Welty et al. 2009): Here, k scale is the thermal conductivity of the scales; it was assumed to be similar to burkeite, which was measured by Smith (2000) as being 1.73 W m −1 K −1 . Since the exact composition of the scales was unknown, as well as how they were distributued, δ might not give the true scale thickness: it should be seen as a relative measure that can be understood intuitively.

Identification and calculation of actual growth of scales
The average scaling rate, SR avg , is a relatively robust measure because it is not very sensitive with respect to the quality of the data. It does not reflect the actual scale growth, however, as it assumes linear growth during the whole operational cycle. Scaling can nevertheless begin rapidly after a long period of absence, which was the case before Wash 3 in Figure 1, leading to an urgent need for washing: SR avg will then have underestimated the scaling rate. Similarly, the scaling rate will also be underestimated if a long time elapses before the evaporator is washed after significant growth of scale. Another disadvantage of SR avg is that it does not report when (fast) scaling events occur. Consequently, the trend in U needs to be analysed in more detail to obtain more accurate information of scale growth. The trends of U are normally noisy and often show strange deviations due to measurement errors (e. g. Operational cycle 5 in Figure 1). Therefore, it was important to find an efficient way of processing the signal before the variations of the trends related to scaling could be extracted. Different types of filters were tested but these did not provide satisfying results: mild filtering gave too much fluctuation and strong filtering flattened the fast scaling (for example the event before Wash 3 in Figure 1). More sophisticated methods were required, of which the following were found to be effective: 1. Discovering the abrupt decrease of U due to scaling using change point detection. 2. Fitting a logistic model to U that filters the data but can still detect fast changes in the trend caused by scaling.
These two methods work in distinctively different ways and can therefore be said to complement each other: if the results they give are similar, the results can be considered as robust. Change point detection is more sensitive to fast scaling and is more likely to capture this behaviour compared to the fitted model, which filters the data more. On the other hand, the fitted model is less sensitive to noise and measurement errors, which can lead to overestimations of the scaling rates: it is therefore better at capturing slow scaling behaviours. In addition, change point detection only gives the scaling rate at one point, while the fitted model can be converted to the scaling rate for the whole operational cycle.

Change point detection
Only one change point per operational cycle was considered, as the data also contains fluctuations due to reasons other than scaling: the method was not robust enough to find several cases of fast scaling even though it sometimes occurred in the data. A parametric global method was used, which basically tests how to divide the data in two sets to minimize the total residual error between the data points and the mean value of each set, i. e. it discovers when the fastest scaling event occurred in the operational cycle. Then the scale growth was calculated in this point, but first U was filtered by fitting a line from 1 h before the point to 1 h after. The change point scaling rate, SR CP , was then calculated from this line using the expression for scale growth rate, Δδ/Δt, which is based on the same principles as Equation (2): where t 1 and t 2 are two consecutive times in the data set (e. g. the detected point and the one after), the difference t 2 − t 1 is the sampling time in hours. Mean values of U 5 h before and after were compared to ensure that the change point detected corresponded to an actual drop in U: if it decreased by more than 20 %, then the calculated scaling rate was considered valid. Otherwise, the scaling rate was recalculated with more extensive filtering by fitting a line from 10 h before the change point to 10 h after. Examples of change points detected and corresponding scaling rates (SR CP ) can be seen in Figure 1.

Fit of logistic model
A five-parameter logistic function was found to be the most suitable model to fit to U as it can describe both fast and slow scaling behaviours, but only one event per operational cycle. It has an s-shape and consists of five parameters: where a = minimum asymptote, b = slope of the hill (steepness of the curve), c = inflection point, d = maximum asymptote and e = asymmetry factor. A successful fitting of Equation (4) can be seen in Figure 1 for Operational cycle 5. If the trend in U showed a normal behaviour, the fitted model gave a good representation of the situation. Abnormal behaviours were, however, common, e. g. an increase in U with time (fast or slow), low frequent fluctuations and spikes, all of which could not be explained by changes in the BPE or temperature. These behaviours sometimes gave a poor fit, so fits for R 2 < 0.5 were therefore rejected. Equation (4) was then fitted to U smooth instead, which sometimes improved it: if not, a seconddegree polynomial was fitted (as after Wash 4 in Figure 1).
The polynomial was found to be more robust for the abnormal trends but could not capture fast scaling.
The scaling rate was calculated from the fitted model, SR model , for the whole operational cycle using Equation (3). This data was used to calculate the following: -The initial scaling rate, which was measured 2 h after washing. -The maximum scaling rate, SR model max , which is the maximum value of SR model during the operational cycle.
Some filtering was necessary for SR model max to avoid unreasonable values being attained: -Values prior to 2 h after washing were not used, as the system was often instable then. -Values were not considered if U < 100 W m −2 K −1 , as they were probably measurement errors. -The scaling event had to last for at least 3 h to be accepted as valid.

Categorisation of scale growth
The growth of scale was categorised using two logical parameters calculated from U, namely: -Initial scaling: true if U model dropped more than 30 W m −2 K −1 h −1 2 h after a wash, which indicates that scaling started already at the beginning of the operational cycle. -Fast scaling: true if SR CP or SR mode max , or their sum, exceeded 0.5 mm/h during an operational cycle.

Calculating the wash performance
Each wash was evaluated in terms of how successful the scales were removed. To evaluate if U clean was truly clean it was compared to U estimated from heat transfer correlations, U corr , which was calculated from the correlation for steam condensation from Numrich and Müller (2010), the correlations of black liquid properties from Adams et al. (1997) and the heat transfer correlation for a black liquor falling film from Gourdon et al. (2016)). The following parameters of wash performance were calculated for each wash: -The wash success, i. e. if the evaporator could be considered clean after wash: true if U clean is both at least 80 % of the mean value of U clean and at least 80 % of U corr , or 1.5 times U corr alone. -The amount of scales removed during the wash, δ removed , calculated from the apparent scale thickness, δ, (Equation (2)), which is similar to δ formed , but δ removed used U clean after current wash while δ formed used U clean after previous wash. -The lowest value of the BPE during the wash, BPE min . -The dissolution time, which is approximated by the time the BPE is below 60 % of the mean. -The wash interval, which is the time elapsed since the last wash.

Final treatment of the data
As a final step, any negative values of the calculated scaling rates and scale thicknesses were set to zero. Negative values originated from periods with data of poor quality, e. g. when U was unreasonably low at the beginning of the operational cycle and then returned to normal levels. The results were also made less sensitive to noise by setting the calculated scaling rates to zero if no scaling occurred according to the evaluation of the general trend.

Estimating DS from BPE
The concentration, i. e. DS, of the black liquor in the effects was estimated from the BPE. Three different models were tested and compared to the data provided by the mills. Despite its simplicity, the following empirical model was found to give the best predictions of DS in the evaporator effects: This model is used inhouse at Chalmers, while the other two were from Gullichsen and Fogelholm (1999) and Zaman et al. (1998).

Implementation
The method proposed for the evaluation of scaling was implemented in MathWorks' Matlab R2019b. This had the advantage that the mathematical methods used were already available, e. g. change point detection, curve fitting and regression. The implementation by Cardillo (2020) was used to fit the five-parameter logistic function.

Validation
Data from industrial evaporation was used to validate the model. This allowed the method to be challenged in terms of process disturbances and errors in the data, as well as detecting different type of scaling behaviours. The same data sets as used for evaluation of the mills were used for the validation process as well.

Methods for evaluation of scaling in the mills
The second part of this work was to evaluate industrial evaporator operation: the scaling rates and behaviours were compared between different evaporator effects and different mills. A few process parameters that had the potential of affecting scaling were selected to permit correlation analyses. There are many parameters in the whole pulp mill that can be considered relevant for these analyses, but significant time delays can make them difficult to correlate. Various side streams are added to the evaporator plant (i. e. short time delays) and, as they affect the chemical composition of the black liquor, they have the potential of affecting scaling. Side streams were therefore considered as being the most interesting to be included in this study. In addition, the type of wood has a strong influence on black liquor composition and is changed in a controlled and predictable way: it was therefore also included in the study.

Gathering process data
Contact was established with five mills in Sweden and Finland at the beginning of the project. One of these mills did not have measurements of U and was therefore not included in the evaluation. A second mill was also excluded as their measurement of U was not reliable. For the three remaining mills, the following data was requested as Excel sheets for each effect operating above 50 % DS (i. e. 1 st effect(s) and super concentrator): -Heat transfer: U and ΔT -Liquor properties in the evaporator: BPE and temperature -Side streams: flow rates of e. g. tall oil brine and ash liquor -Type of wood A sampling period of 1-2 year was considered reasonable to give a good overview of the scaling behaviour. A sampling time of 10 min was found to be a good compromise between accuracy, performance and file size, but 1 min is recommended for studying the wash process in detail.
All mills had a super concentrator (SC) heated by medium pressure steam and were comprised of two units, Data for side streams was collected if available, where the flow rate was most important. Mill 1 alternated between softwood and hardwood, and therefore data for when these shifts occurred was collected. Mill 2 and 3 only used softwood.

Correlation analysis
Correlation matrixes were used to find relationships between variables. The correlation coefficient is then a measure of the strength of the linear correlation: 0.0 means no correlation, 1.0 means perfect positive correlation and −1.0 means perfect negative correlation. How far from 0 the correlation should be for it to be of interest can be debated: here, ±0.5 is considered the limit for a clear correlation. The P-value was also calculated, which is the probability that the correlation is just a coincidence, where values below 0.05 were considered as statistically significant.
Timing is important when correlating process variables. SR CP was therefore used in the correlation analysis, as it was the most accurate method for detecting when (fast) scaling occurred. A mean flow rate of the side streams investigated in Table 1 was calculated for the period from 6 hours before the scaling event to 0.5 hour after, since this would allow for some time delay and deviation from when the event actually started. This gave one value for each operational cycle, which could be correlated to the corresponding scaling rate. No additional measures regarding time delays were used: the side streams were added close to the effects investigated (i. e. Effect 1) and the retention time in an effect is relatively short (16-25 min was calculated by Lagerberg-Nilsson (2012)).
The scaling rates in Mill 1 were also correlated to the wood type, which changed at intervals of 1 week or slower. The slow change made the exact timing less critical and therefore all three ways of calculating the scaling rate could be included in the analysis. In this case, however, the time delay could not be neglected as it takes a long time for the change to affect the black liquor: it was estimated as being 2 days, which was then used in the correlation analysis. Correlation coefficients can also be calculated for binary variables such as wood type. Another alternative would have been to use the ANOVA test, but that actually gives the same P-value.
The washing performance, i. e. how clean the evaporator was after washing (U clean ), was also evaluated by correlation analysis. The dissolution time, BPE min , and U clean were considered as the main parameters affecting U clean , and were therefore included in the analysis.

Results and discussion
The results are divided in two main parts: 1) validation of the developed method and 2) evaluation of scaling in the mills.

Validation of the developed method
The scaling behaviour and quality of the data affected the ability of the evaluation method to make an accurate interpretation of the scale growth. Not surprisingly, the best results were achieved when BPE and U had low levels of noise, fluctuations and other types of disturbances unrelated to scaling.
In the case of Mill 1, which had data with the most errors, 1-3 % of the washes were significant misinterpretations due to abnormal behaviour of BPE. In addition, in 3-6 % of the operational cycles were not possible to interpret at all due to abnormal behaviour of U. An example of the latter can be seen in Figure 2: U fluctuates a lot for no clear reason, making sudden jumps at times to extremely low or high values. The same type of errors was also observed at the other two mills but at lower levels. Most of these points were, however, clear outliers that were already treated by the filtration methods suggested (e. g. setting negative scaling rates and scale thicknesses to zero) or could be removed manually. When considering milder errors in the data from Mill 1, these had a more general effect on the quality of the evaluation For at least 80 % of the operational cycles, however, the parameters could be calculated accurately. Effect 1B had most errors, while the super concentrator was much better with an accuracy of the interpretations around 90 % or higher.
The next challenge was the scaling behaviour. The easiest scaling to interpret correctly is slow and continuous growth, as exemplified in Figure 3. Nevertheless, the three different methods for calculating scaling rate do not give exactly the same value, as Table 2 shows. The SR avg naturally gives the lowest values since it calculates a mean value of the whole operational cycle, while for SR CP and SR model max , it is often not clear which is the most correct. As can be seen in the example, it is impossible to know if a particular little bend in the curve is due to scaling or noise, and consequently it is not possible to say which is the correct slope for every point. SR model , however, has the clear advantage that it allows the scaling rate during the whole operational cycle to be calculated, and thereby provides much more information than the change point detection (as shown in the lower graph of Figure 3).
The most challenging behaviour to evaluate was fast and sudden scaling. The longer the evaporator was in operation before scaling was initiated, and the faster it was set to wash afterwards, the more difficult it was found to be. This is because the data points with scaling become fewer in relation to the total: the scaling becomes more difficult to distinguish from all other types of variations in U. In Figure 1, both the fitted model and change point detection successfully identified the fast and sudden scaling event prior to Wash 3. A more challenging example is shown in Figure 4, where only the change point detection identified the fast and sudden scaling before Wash 33.
Effect 1C in Mill 3 had a lot of sudden scaling and was therefore the most difficult evaporator to evaluate in the data sets investigated. In order to obtain a better understanding of the performance of the evaluation method, this evaporator was studied in detail to evaluate the extent to which the method gave correct results, which are summarized in Table 3. In terms of interpreting the trends correctly, the accuracy was above 90 % for most of the parameters. To also have the absolute value fully correct, the accuracy was slightly lower.  The identification of washes, i. e. when they start and stop, worked generally very well. Prior to the start of washing, U was also very stable, thus giving accurate values of U scale . The end of washing, i. e. the start of a new operational cycle, was more difficult to determine since the shift was often more gradual, and U fluctuated considerably. This was the case for most effects at all mills and made the evaluation more sensitive to when the evapo-rator was considered as shifting from wash to operation, and therefore U clean showed larger errors. Figure 2 shows a severe example of these fluctuations in U, so whether the higher errors in U clean are due to poor data quality or an inefficient evaluation method is at least partly subjective: it depends on how much noise the method is expected to manage. The uncertainties in U clean mainly affected the parameter wash success, as it is dependent Table 3: Performance of the evaluation method in 1C in Mill 3. Each figure is a percent of the total number of operational cycles (i. e. 241). 67 % of the cycles had scaling. Correct interpretation refers to situations when the general interpretation was correct, whereas fully correct refers to when the absolute value was fully correct as well. on correct absolute values to determine if the effect is clean. SR avg is a robust measure that had a low rate of obvious errors (Table 3) but, as already mentioned, the true scaling rate is generally underestimated which was not considered here. Although SR CP had more errors, it still had a reasonable accuracy in general since 92 % of the operational cycles were correctly interpreted. SR model max had more difficulties dealing with the sudden drops after a long period without scaling, as seen before Wash 33 in Figure 4, causing an underestimation of the scaling rate for 19 % of the cycles. The performance errors were not mainly due to the model itself, but the fitting of its parameters: consisting of 5 parameters, finding a good fit to noisy data can be difficult. If the R 2 of the model fit was less than 0.5 a second-degree polynomial was fitted instead, which cannot capture fast scaling and cause the tendency of underestimation. In some situations, the fitting improved drastically when U smooth was used instead of U.

Evaluation of scaling in the mills
The results from the evaluation of the mills are divided in two main parts: 1) evaluation of mill data and 2) correlation analysis. First, however, some general lessons learned from discussions with the mills are presented to give an overview of the current strategies for monitoring and control of scaling.

Current strategies for monitoring and control of scaling
Discussions with the mills' personnel confirmed that scaling in the evaporator is mainly monitored in the form of qualitative judgements based on the experience of the operators, and whether or not the rest of the mill is affected. Also, wash frequency was sometimes used for internal investigations. There were three main strategies for washing: 1. The evaporator effect(s) that was the bottleneck in the process was washed when necessary.
2. The other effects with potential scaling were often washed as a preventive measure when this did not disturb the operation; this could be done as they were not bottlenecks. 3. If the rest of the mill had problems that, in turn, lead to a low demand on the evaporators, the evaporator plant (whole or in part) was washed.
These strategies imply that wash frequency is often not a good measure of scaling (i. e. applies only in Case 1). It is an even more uncertain measure when comparing mills, as they most likely have different strategies for initiating washing.

Evaluation of mill data
The mean values of the most interesting parameters used to evaluate scaling in the three mills are presented in Table 4. The parameters related to scaling rate respectively washing performance are discussed in two separate sections. The parameters in Table 4 were also investigated for variations in time, e. g. due to seasonal variations. No clear patterns were observed, so these results were not considered interesting enough to be presented here.

Scaling rates
One of the most striking findings of the scaling evaluation is that Mill 2 had almost no scaling in Effects 1D-1A, which is shown by the low value of δ removed (scales removed during washing) and the scaling rates (Table 4). Effect 1B in Mill 1 was also (almost) free from scaling but, as it is the effect with lowest concentration (around 60 % DS according to the estimation used here, which is still quite high), it can be suspected that it is operating below the solubility limit (or at least the metastable limit) of the sodium salts. All other effects in the study had clear scaling.
A general trend was that the higher the concentration (DS), the higher the scaling: in Mills 1 and 2, the super concentrator had most scaling (both SCB and SCA), which was indicated by the δ removed and SR avg (Table 4 and Figure 5). However, in Mill 3, 1A had the highest δ removed and 1C had the highest SR avg . The super concentrator in Mill 1 had the most scaling (at least in terms of amount) of all effects: about 2.5 mm were formed per day on average, resulting in 7 mm of scales having to be removed during each wash. The theory suggests that operation close to the solubility limit gives the highest scaling rates, but that does not seem to be the case here. Nucleation events can still be the cause of fast scaling in 1C in Mills 1 and 3, whereas for the other effects, it is more related to continuous crystallization. It has, however, been shown that primary nucleation can occur also at around 70 % DS due to a shift in crystal species (Frederick et al. 2004), but then a more fast and sudden scaling behaviour might be expected. Finally, a remark is in place here: the temperature of the liquor correlates strongly with DS, hence an alternative explanation for the higher scaling rates in the super concentrators could be the higher temperature. For the parameter "Fast scaling" (in Table 4), which indicates how many of the operational cycles experienced a scale growth event of 0.5 mm/h or more, the super concentrator at Mill 1 still has the most. This is because the steady growth was not linear and often experienced periods with higher scaling rates instead. Fast scaling was also high for 1C at Mill 3 but there, as also described in the method validation section, it was rapid in combination with being sudden and unpredictable: sometimes 1C operated without any scaling for several days and sometimes fast scaling was initiated as soon as operation started, leading to a situation where it was washed several times per day. In this sense, 1C in Mill 3 was the evaporator with the most severe scaling. Effect 1C in Mill 1 showed the same behaviour of sudden fast scaling but it was much less severe.
The parameter "Initial scaling" provides information of whether scaling starts 2 h after washing or not. In this case, Effect 1A in Mill 3 stands out clearly, with the highest value of 45 %. This indicates that the operation or design can be improved, e. g. there might be a problem with distributing the liquor on the heat transfer surface (i. e. insufficient wetting). Moreover, the figures for Mill 3 were higher in general, so it might also be related to the behaviour of the black liquor. The general trend in the three mills was that scaling started directly after washing for 15 %-30 % of the operational cycles in the effects that had scaling, and it did not seem to be directly connected to how the washes were performed. For the effects with very low scaling, the initial scaling was also naturally low. SCB in Mill 1 had very low initial scaling too, but this might be related more to measurement errors.

Washing performance
The wash success of all effects in all mills was above 75 %, with the exception of the super concentrator in Mill 1 (i. e. SCB and SCA), where SCB had the lowest of only 35 % (Table 4). This was because U clean was considerably lower than U corr : the heat transfer after washing was lower than what can be expected for the operational conditions used. The value of U clean in SCB is also lower than SCA, which runs at higher concentrations and is therefore expected to be lower (since viscosity decreases U). There are three possible explanations here: 1. Permanent scaling, either due to inefficient washing or formation of insoluble scales as calcium carbonate. 2. Measurement errors, probably in the condensate flow rate, as ΔT showed reasonable values. 3. Insufficient wetting of the tube surface: according to the mill personal, the circulation flow was actually lower in SCB than in SCA.
When U clean and U corr for the other effects are compared, the latter is generally lower. This can be due to that the heat transfer calculation methods employed underestimate U in industrial conditions. On the other hand, the absolute values of U during industrial evaporation can probably not be (fully) trusted: the value of U clean in SCA in Mill 3, for example, is among the highest and is actually unreasonably high, especially since it is the effect with the highest DS (and thereby the highest viscosity) and should, in theory, have the lowest value of U (it also has low temperature compared to the other super concentrators so temperature cannot explain the deviation). Finally, it is also of relevance to compare the wash interval. Effect 1B at Mill 1 was the effect with the greatest interval (7.2 days on average), which is reasonable as it also had the least amount of scaling. However, it is interesting that the effects that were washed the most were 1D-1A at Mill 2, which had almost no scaling: they were washed as a preventive measure, as they did not limit the capacity of the plant. The super concentrator is the bottleneck in Mill 2, and it would be advantageous if the load could be redistributed in some way by feeding it with higher concentrations.

Correlation analysis
The results from the correlation analysis related to scaling rate respectively washing performance are discussed in two separate sections.

Scaling rate
The results of the correlation analysis, with parameters potentially affecting the scaling rate, are summarized in Ta- Table 5: Summary of the results obtained from the correlation analysis with parameters potentially affecting the scaling rate. The parameters were mainly correlated to SR CP , but wood type was also correlated to SR avg and SR model max . Only correlations with P-value < 0.05 are given. In Effect 1A at Mill 2, there was a weak correlation (R = 0.27) between the scaling rate and the flow rate of ash liquor, implying that the addition of ash increases scaling. This might be due to the increased load of sodium carbonate and sodium sulphate, especially if the ash is dissolved in the liquor (as it then need to crystallise again in the evaporators). Another explanation might be that the variables are not independent, i. e. they are both affected by a third variable, and then there will be a correlation but not a causality. It should be noted here that although the flow rate of ash liquor is a measure of the amount of liquor that is feed through the ash mixing vessel, it is not necessarily a measure of how much ash is added to the evaporators and might therefore be a poor measure of this.
Mill 2 also showed some correlation for the CTMP flow, with R ≈ −0.2 in Effects 1D, 1C and 1B. The correlation is very weak but, as the result is the same for all three effects, it might actually be that the CTMP decreases the scaling slightly. However, it might once again be caused by dependent variables.
There was a weak correlation between the scaling rate and wood species in several effects in Mill 1, where softwood gave higher scaling rates. As the results were consistent, i. e. the same sign of the correlation coefficient, it indicates that there was a true relationship, although weak. Increasing the time delay to 3 and 4 days actually gave slightly higher correlations, but 2 days were considered as being more reasonable. The correlation decreased for delays of 0 and 1 day compared to 2 days.
The correlation between scaling rate and brine flow in Effect 1C at Mill 3 is examined further in Figure 6, where all the data points are shown. There was a clear negative trend with a very low P-value, but there was also a big spread in the data points. There were some cases with no brine flow and no scaling, but also with high brine flow and high scaling. When studying the trends, it is clear that a drastic increase in concentration caused by washing in other effects (corresponding to the peak in BPE in Figure 4) could also trigger scaling. This can be seen in Figure 6, where BPE tended to be higher at high scaling rates and high brine flows. R = −0.51 corresponds to R 2 = 0.26, i. e. about one quarter of the variation in the scaling rate could be explained by the brine flow. A linear model, with both the brine flow and BPE, would predict the scaling rate just slightly better, as R 2 = 0.28 (adjusted R 2 ).
The same behaviour as seen in Figure 6 are also observed when the trends are studied manually. In Figure 7, a typical period of fast scaling caused by the absence of brine can be seen. The washing frequency was increased drastically after Wash 126 as a counter measure to maintain operation but, as the washing process itself also decreases capacity, the overall capacity of the evaporator plant decreased. As soon as the brine flow returned, the fast scaling disappeared. Not all situations, however, were as clear as in Figure 7; periods of fast scaling and high brine flow, along with those of no scaling and no brine flow, as shown in Figure 6, could be confirmed. Nevertheless, the true dependency might be stronger than it appears in Figure 6, since er-rors in interpretation of the data decrease the correlation.

Washing performance
Several correlations were found when the parameters related to the washing performance were analysed, i. e. U clean , and are shown in Table 6. U scaled showed a positive correlation with U clean in most of the effects studied, especially in Mill 2 where the correlation was relatively clear, i. e. more scales make cleaning more difficult, which is not surprising. However, when studying the trends in more detail, it is obvious that the correlation is not necessarily connected to scaling. This is especially true for Effect 1D in Mill 2, which has (almost) no scaling but still the strongest R: here, the correlation is caused by low frequent fluctuations in U not related to scaling or concentration, which both U clean and U scaled follow.
The results show a negative correlation with BPE min in several effects (even though 1A in Mill 2 showed a positive value), indicating that a lower DS during cleaning gives better washing. This is reasonable but, as the correlation is very weak, it is probably not so important. Some correlation with dissolution time was also found, but the results are contradictory: 3 effects showed a positive R while 3 effects showed a negative R, and the correlation is weak so there is probably no relationship. In summary, it is probable that no true relationships exist for washing performance in spite of several significant correlations being found. This is surprising, as it can be expected that the parameters investigated influence washing, and can probably be explained by two factors: 1) measurement errors that mask the true relationships and 2) the washing procedures used (almost) always removed most of the scales, resulting in there being insufficient variation in the data (i. e. the data was not set up to test these relationships and the results would therefore be poor). Although there is a potential for attaining better understanding of the process by correlating variables, it is important that the quality of the data and the true cause of a correlation are considered to avoid the wrong conclusions being drawn.

Conclusions
The following conclusions can be drawn regarding the evaluation method: -The evaluation method proposed makes it possible to quantify both the scaling rate and washing performance automatically from process data that is already available in most mills. It provides new possibilities for the monitoring and follow-up of the evaporator operation, allows scaling rates to be correlated with other process parameters and enables comparisons to be made with other mills. -The scaling rate can be calculated from the trends in the overall heat transfer coefficient with an accuracy above 90 % for most cases: the change point detection is better at detecting fast and sudden scaling, whilst fitting a five-parameter logistic model gives good filtering for determining the rate of scaling in general.
-Process data from industry is not perfect: the evaluation method needs to be able to handle different types of errors in the data and disturbances in the operation. Interpretation of the trends is also somewhat arbitrary and it will therefore not always be possible to determine the exact scaling rate.
The following conclusions can be drawn regarding the results obtained from the evaluation: -The current strategy for monitoring of scaling is mainly in the form of qualitative judgements or sometimes the wash frequency, where the latter is found to be an inadequate measure of the scaling rate. -Scaling increased with concentration, as most of the scales were formed in the super concentrator operating above 70 % DS. However, it is also possible that this increase in scaling is caused by an increase in temperature. -Closer to the solubility limit, i. e. 55-65 % DS, the scaling behaviour can be fast and sudden, making it difficult to predict: one mill had almost no scaling here, while it was severe at another mill. Above 65 % DS, however, the scaling is strong, continuous and predictable, but not necessarily as fast, as it can be at lower DS. -Tall oil brine can inhibit scaling: a negative correlation between fast and sudden scaling and the flow of tall oil brine was detected in one effect in one of the mills. -No relationship between washing performance and washing procedure was found. -It is important that care is taken when drawing conclusions from correlation analysis of process variables: it is not only true relationships that are the cause of correlations.

Future work
The natural continuation of this work to improve the accuracy of evaluating scaling would be to use more advanced methods. The first step is to develop a prediction model for U. Here, U corr can be used although it probably needs an adjustment term to address the deviations, in absolute value, that appear in industrial operation. An alternative is to use a simpler black box model. The goal is to be able to estimate the variations in U caused by variations in viscosity (i. e. variations in BPE and temperature), and possibly also other variables. Then, control systems engineering methods, such as soft sensor or state observers, can be used to combine the model output with measurements of U. This could also be done in combination with information from other measurements, e. g. ΔT, to provide a better estimation of the scaling rate. A similar approach can probably be used to monitor scaling in effects where measurements of U are absent (but with lower accuracy) and the total load of the evaporator plant (which is normally measured) can then be used to estimate the rate of scaling from ΔT instead. Another, although simpler, development of the method would be to improve the fitting function of the five-parameter logistic faction to capture the fast scaling behaviour better. In addition, scaling caused by calcium carbonate, which in contrast to the sodium salts considered in this study grows much slower and are not soluble under the conditions during washing, can probably be detected by the long-term trend in U clean .
Finally, the evaluation method suggested can also be used to develop a tool for online monitoring of scaling, which can improve the daily operation and help optimizing the washing procedures.