Fault Signal Recognition in Power Distribution System using Deep Belief Network

Abstract Nowadays, the electrical power system is considered one of the most complicated artificial systems in the world, and social and economic development depends on its intact, consistent, stable and economic operation. Owing to diverse random causes, accidental failures occur in electrical power systems. Considering this issue, this article proposes the use of a deep belief network (DBN) for detecting and classifying fault signals such as transient, sag and swell in the transmission line. Here, wavelet-decomposed fault signals are extracted, and the DBN model diagnoses the fault based on the decomposed signal. Further, this article provides a performance analysis by determining the type I and type II measures and the root-mean-square error (RMSE). The analysis compares the DBN model with conventional models, namely linear support vector machine (SVM), quadratic SVM, radial basis function SVM, polynomial SVM, multilayer perceptron SVM, Levenberg-Marquardt neural network and gradient descent neural network models. The simulation results validate that the proposed DBN model detects and classifies fault signals in the power distribution system more effectively than the traditional models.


Introduction
Since the demand for power is rising, insufficient power generation is a fundamental problem today. Under such circumstances, it is essential to make the best use of the existing power transmission capacity of a power system. An electrical power distribution system is usually affected by several kinds of faults. Factors such as lack of maintenance, equipment breakdown, fire, animals and trees are the main sources of these failures. Therefore, optimal diagnosis of the faulty phase, its location and the classification of signals is an essential task in power transmission maintenance. In fact, determining the fault location is necessary for clearing the fault and restoring power transmission. Initially, the type of fault must be recognized, and it is classified as line-to-line, single line-to-ground, multi-location, transforming and triple-line faults.
Accurate fault location recognition and categorization of fault signals in the distribution network have been addressed by several techniques [2,10,11]. Those techniques include an artificial neural network (ANN)-based faulty-phase selection scheme [1], an ANN-based fault location scheme [14], fuzzy schemes [4,21], combined fuzzy wavelet schemes [17,18], particle swarm optimization [26] and a decision tree-based method [9]. Moreover, the introduction of multi-objective fault detection techniques that combine an adaptive neuro-fuzzy inference system with wavelets [20,27] has been a major contribution. However, the above-mentioned methods suffer from several limitations, such as low classification accuracy, dependence on a large input space, the existence of multiple optimal solutions, long computation time and computational complexity.
Additionally, almost all of these techniques consider too few relevant parameters, which also remains a challenge.
The next section presents the state of the art of fault signal recognition in the field of distributed systems. As a challenge to those techniques, the proposed method attempts to promote an effective fault signal diagnosis by overcoming those drawbacks. Support vector machine (SVM) has been used for attaining significant performance on classifying binary labels.
The main aim of this article is to propose the deep belief network (DBN) model, as the fault diagnosis problem can be formulated as a sequence of binary classification problems. In addition, the wavelet-decomposed fault signal is extracted, and the fault signal is diagnosed based on the decomposed signal. Thus, accurate fault diagnosis of the power distribution system is accomplished by the wavelet signal descriptors and the DBN model. The advantages of DBNs have been exploited in image processing; although DBNs are applied to two-dimensional image data, they are rarely tested on object recognition and three-dimensional data. Other applications of DBNs include information retrieval, handwritten character recognition, modeling and capturing data motions, machine transliteration, modeling electroencephalography for classification of waveforms and anomaly detection, music emotion recognition, document classification and phone recognition.
The overall structure of the article is organized as follows. Section 2 portrays the literature review along with problem definition related to fault recognition system. Section 3 depicts the power distribution system and signal decomposition. Section 4 provides the fault diagnosis by the DBN model. Section 5 illustrates the results and discussions, and Section 6 concludes the article.

Related Works
In 2014, Zhao et al. [29] proposed a genetic algorithm (GA) for recognizing the optimal location and dimension of distributed generation (DG). They generally function on the susceptible nodes in the network. Weak nodes were identified using small-world network theory, which minimized the ranges in DGs. As the weak nodes cause more power fluctuations than active nodes, it is essential to eliminate the susceptible nodes from the DG system. Further, in the experimental analysis, the GA method with voltage boundary constraints was used to keep the bus voltage from entering the boundary. The results showed that the proposed method was effective in determining the optimal location with better convergence.
In 2015, Yadav and Swetapadma [28] developed three distinct fuzzy inference systems (FIS) and applied them to the security of transmission line directional relaying. The faulty stage in the transmission line was identified through this method with reduced computational complexity. The fault localization scheme was considered a setting-free scheme, and the operation was done in less time. Moreover, the fault recognition system using FIS was accurate, secure and consistent. Its performance was superior to other traditional methods like SVM, decision tree and ANN. This method recognized faults in both the forward and reverse directions, positioned the DG and determined the 10 types of shunt faults. Thus, this method provided high accuracy, with protection of 95% of the line length and backup, and low computational complexity.
In 2015, Swetapadma and Yadav [24] suggested the adoption of ANN for fault location estimation. This method determined multi-location faults, transforming faults and shunt faults in a thyristor-controlled series capacitor-compensated transmission line. The three-phase current and voltage signals were preprocessed using the DB-4 wavelet. Moreover, one-cycle-per-fault and the two-cycle post-fault samples in the phase current were used for the estimation of coefficients. Through this method, various parameters like types of fault, location of fault, fault resistance and fault location angle were determined. One of the main contributions of this method was the recognition of both single and multiple locations of the shunt faults.
In 2012, Decanini et al. [5] introduced an advanced technique for detecting short-circuit faults in the power distribution system. The developed algorithm was mainly adopted to improve the restoration process. That improvement, which was based on fuzzy ARTMAP neural network-aided evidence theory, was intended to provide security and profitability to utilities. Furthermore, methods such as the discrete wavelet transform, the energy concept and multiresolution analysis were used on the current waveforms, along with statistical and direct analyses, to extract the features of voltage and current.
In 2015, Ma et al. [12] put forth a fault diagnosis process for the distribution system using a parallel heuristic reduction-based method. As the number of attributes in power system fault detection had increased, the proposed parallel scheme reduced it. The method was implemented on the Hadoop MapReduce platform, and comparison with other algorithms confirmed that the accuracy and the reduction process of the system were effective.
In 2016, Jamali and Bahmanyar [8] developed a fault location system for reducing the power outage cost in the distribution system. The proposed method used sparse measurements based on an iterative state estimation algorithm. Here, the fault was detected by examining all the lines linked to the concerned node. The algorithm used measurements of power, current and voltage. Further, the simulation experiments were done on a real 13.8-kV, 134-node distribution system using various fault types, resistances and positions, and the results showed the superiority of the proposed algorithm.

Review
Classifiers such as NN [5] and fuzzy inference systems [5,28], and optimization algorithms like GA [29], are the well-known methods adopted in most of the literature on fault diagnosis. Yet, the advancement and expansion of the power system limit the performance of these algorithms. Moreover, the main limitation of FIS is that it requires expert knowledge, which is not always available. Furthermore, the fuzzy logic system does not ensure consistent performance within specific data bounds. The NN model [5,24], in turn, suffers more delay when the training database is inadequate and vague. Although GA [29] is a renowned optimization algorithm, owing to its flexibility and simplicity, it needs accurate fitness function modeling. Some researchers have drawn on interdisciplinary theory by developing a map-reduce framework [12], rough set theory [12], wavelet transformation [24] and an iterative state estimation approach [8]. Those methods are considered web mining techniques, probabilistic models, domain transformation techniques and nonlinear programming models, respectively.
Although the high searching ability of the map-reduce framework has been affirmed, it has limitations such as a built-in optimization process, repeated access to similar data, adherence to the map-shuffle-sort-reduce sequence and complex operation. In addition, the main advantage of rough set theory is its high efficiency in a distributed system, but it causes delays in handling noisy data and depends on the overall data of even a simple system. Wavelet transformation transforms the current domain into numerous bands. However, it cannot provide an exact model of the distributed system because of the high variations in the coefficient bounds. As reported in [8], the iterative algorithm localizes the fault signal accurately, yet computational complexity arises in the estimation process because of its bi-level characteristics.
Figure 1 shows the proposed model of fault diagnosis and classification of transmission signals in a power system. Here, the features of the signals are extracted using wavelet descriptors, and signal diagnosis and classification are done using the DBN model. The specimen bus system provides the training signals for the DBN model. The specimen bus system comprises the normal signal as well as the corresponding fault signals such as transient, sag and swell signals. Before the signals are applied to the DBN model, the wavelet descriptors of each signal are determined. Then the fault signal is diagnosed, and the DBN model classifies the types of signals.

Signal Decomposition
The number of transmission lines in a power bus system varies, along with their lengths. Let us consider a bus system composed of transmission lines with three different distances (in km), $L_1$, $L_2$ and $L_3$. The location of the fault may also vary, as each transmission line has its own load. Therefore, before localizing the fault, the measurements of each transmission line should be combined with each other. However, it is a challenging task to localize the fault under such circumstances.
The layout showing the integration of the diverse power sources in the power system is given in Figure 2, and the signals from the three transmission lines and the combined signals are shown in Figure 3. Let $y$ be the combined voltage measurements, where the signal descriptors are denoted as $v_1, v_2, v_3, \ldots, v_N$ and $N$ refers to the number of voltage sources. Let $N_p$ be the number of observation records and $L$ be the number of observation instances, so that the size of the combined voltage sources is $N_p \times L$. Here, Equation (1) represents the wavelet decomposition of the combined voltage measurements, where $\psi_{m,n}(t)$ indicates the wavelet coefficient [shown in Equation (2)], which is described by two integers, $m$ and $n$, indicating the scaling and sampling numbers, respectively. The terms $(a_0, b_0)$ in Equation (2) specify constants that are fixed as (2, 1). Moreover, $|D(y)|_R \times N_p \times L$ refers to the size of the decomposed signal, where $|x|_R$ specifies the number of rows in a matrix $x$. Since a higher dimension increases the computational complexity, it is essential to reduce the dimension of $D(y)$. Thus, principal component analysis (PCA) is used to reduce the dimension of $D(y)$.
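As an illustration of a dyadic decomposition with $(a_0, b_0) = (2, 1)$, the following minimal sketch applies a multi-level Haar transform to a combined signal. The Haar wavelet, the function names `haar_step` and `haar_decompose`, and the synthetic record `y` are assumptions made for this example; the text does not state which mother wavelet is used, and the PCA step is not shown.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail), each half the input length.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[k] + signal[k + 1]) * s for k in range(0, len(signal), 2)]
    detail = [(signal[k] - signal[k + 1]) * s for k in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Dyadic decomposition: [detail_1, ..., detail_levels, final_approximation]."""
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands

# Hypothetical combined record: three sinusoidal sources summed, 64 samples.
n = 64
y = [sum(math.sin(2 * math.pi * f * k / n) for f in (1, 2, 4)) for k in range(n)]
bands = haar_decompose(y, levels=3)
print([len(band) for band in bands])  # [32, 16, 8, 8]
```

Each level halves the time resolution, which corresponds to scaling by $a_0 = 2$ with unit translation steps. Because the transform is orthonormal, the total energy of the bands equals the energy of the input record.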

DBN Model
The DBN [13,25] is an intelligent model built on the restricted Boltzmann machine, which Smolensky [22] introduced in 1986. It is a type of deep neural network (DNN) consisting of multiple layers; each layer contains visible neurons constituting the input layer and hidden neurons constituting the output layer. Every hidden neuron is connected to the input neurons. However, there is no connection among the hidden neurons and no connection among the visible neurons. The connection between the visible and hidden neurons is symmetric and exclusive.
The hybrid DBN-DNN consists of multiple layers of stochastic, unsupervised models, namely restricted Boltzmann machines (RBMs). These are used to initialize the network in a region of the parameter space from which good minima of the supervised objective can be found. The RBMs are well-known probabilistic graphical models constructed from two types of binary units, hidden and visible neurons. Here, the visible units correspond to the components of an observation and constitute the first layer. The hidden units model the dependencies between the components of the observations. The network is built constructively by training one layer at a time, each step adding one layer of weights to the network. This layer-wise training follows unsupervised learning at each layer to conserve information from the input. A deterministic neuron model determines the exact output for a given input. The output of a stochastic neuron used in a Boltzmann network, by contrast, is probabilistic: Equation (3) represents the output and Equation (4) provides the sigmoid-shaped probability function, where $T$ specifies the pseudo-temperature. The deterministic form of the stochastic model is given in Equation (5).
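The role of the pseudo-temperature can be illustrated with a small sketch. Equations (3)-(5) are not reproduced in this text, so the code below assumes the usual Boltzmann-machine form, in which a binary unit switches on with probability $1 / (1 + e^{-x/T})$ and the deterministic form is the hard threshold reached as $T$ approaches zero. The function names are hypothetical.

```python
import math
import random

def firing_probability(net_input, temperature=1.0):
    """Sigmoid-shaped probability that the unit is on (assumed standard form)."""
    z = net_input / temperature
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # avoids overflow for large negative z
    return e / (1.0 + e)

def stochastic_output(net_input, temperature=1.0, rng=random):
    """Return 1 with the firing probability, otherwise 0."""
    return 1 if rng.random() < firing_probability(net_input, temperature) else 0

def deterministic_output(net_input):
    """Hard-threshold limit of the stochastic unit as the temperature goes to zero."""
    return 1 if net_input > 0 else 0

rng = random.Random(0)
samples = [stochastic_output(0.5, temperature=1.0, rng=rng) for _ in range(1000)]
# The empirical on-rate should be close to firing_probability(0.5).
print(sum(samples) / len(samples), firing_probability(0.5))
```

Lowering `temperature` sharpens the sigmoid, so the stochastic unit behaves more and more like `deterministic_output`.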
The architecture of the DBN model is shown in Figure 4, where the feature extraction task is done by a set of Bernoulli-Bernoulli RBM layers and the classification task is done using a multilayer perceptron (MLP). The mathematical model gives the energy of the Boltzmann machine for a configuration of the binary neuron states $c$ in Equation (6), where $w_{i,j}$ indicates the weights between the neurons and $\theta_i$ specifies the biases.
The energy definitions for the joint configuration of the visible and hidden neurons $(a, b)$ are expressed in Equations (7)-(9). In these definitions, $a_i$ indicates the binary state of the visible unit $i$, $b_j$ indicates the binary state of the hidden unit $j$, and $u_i$ and $v_j$ denote the biases applied in the network.
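For orientation, the energy functions commonly used for a Boltzmann machine and for an RBM are written out below. These are the standard textbook forms and are assumed rather than taken from the article, whose equations are not reproduced in this text. Here $c_i$ is a binary neuron state, $a_i$ and $b_j$ are the visible and hidden states, $w_{i,j}$ are the weights, $\theta_i$ are the Boltzmann machine biases, and $u_i$ and $v_j$ are the visible and hidden biases.

```latex
% Assumed standard forms, not a transcription of the article's equations.
\begin{align}
E(c)    &= -\sum_{i<j} w_{i,j}\, c_i\, c_j \;-\; \sum_{i} \theta_i\, c_i \\
E(a, b) &= -\sum_{i}\sum_{j} a_i\, w_{i,j}\, b_j \;-\; \sum_{i} u_i\, a_i \;-\; \sum_{j} v_j\, b_j
\end{align}
```

The second expression has no visible-visible or hidden-hidden terms, which reflects the restriction that units within a layer are not connected.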
The learning pattern of the RBM is the probability distribution of the input data encoded into the weight parameters. RBM training maximizes the assigned probabilities, and the weight assignment is determined using Equation (10).
The probability that the RBM model assigns to every feasible pair of visible and hidden vectors is given in Equation (11), where $Y$ refers to the partition function expressed in Equation (12).
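The relation between energy, partition function and joint probability can be checked by brute force on a very small RBM. The sketch below assumes the standard form $p(a, b) = e^{-E(a,b)} / Y$ with the usual RBM energy; the weights and biases are hypothetical values chosen only for illustration, and full enumeration is practical only for a handful of units.

```python
import itertools
import math

def rbm_energy(a, b, W, u, v):
    """E(a, b) = -sum_ij a_i W[i][j] b_j - sum_i u_i a_i - sum_j v_j b_j."""
    interaction = sum(a[i] * W[i][j] * b[j]
                      for i in range(len(a)) for j in range(len(b)))
    visible_bias = sum(ui * ai for ui, ai in zip(u, a))
    hidden_bias = sum(vj * bj for vj, bj in zip(v, b))
    return -interaction - visible_bias - hidden_bias

def partition_function(W, u, v):
    """Sum of exp(-E) over every binary (visible, hidden) configuration."""
    return sum(math.exp(-rbm_energy(a, b, W, u, v))
               for a in itertools.product((0, 1), repeat=len(u))
               for b in itertools.product((0, 1), repeat=len(v)))

def joint_probability(a, b, W, u, v):
    """Probability the RBM assigns to the pair (a, b)."""
    return math.exp(-rbm_energy(a, b, W, u, v)) / partition_function(W, u, v)

# Hypothetical 2 x 2 RBM.
W = [[0.5, -0.2], [0.1, 0.3]]
u = [0.0, 0.1]
v = [-0.1, 0.2]
total = sum(joint_probability(a, b, W, u, v)
            for a in itertools.product((0, 1), repeat=2)
            for b in itertools.product((0, 1), repeat=2))
print(total)  # the probabilities sum to 1 up to rounding
```

The number of terms in the partition function doubles with every added unit, which is why exact evaluation is replaced by sampling-based training in practice.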
Since it is difficult to sample the expectations under the distribution defined by the model, the contrastive divergence (CD) learning technique is used. The steps of the CD algorithm are summarized as follows.
1. Select a training sample $a$ and clamp it onto the visible neurons.
2. Calculate the probabilities of the hidden neurons $p_b$ as the product of the visible vector $a$ and the weight matrix $W$, $p_b = \sigma(a \cdot W)$, based on Equation (13).
3. Sample the hidden states $b$ from the probabilities $p_b$.
4. Calculate the outer product of the vectors $a$ and $p_b$ as the positive gradient, $\phi^{+} = a^{T} \cdot p_b$.
5. Sample the reconstruction of the visible states $a'$ from the hidden states $b$ per Equation (16). After that, resample the hidden states $b'$ from the reconstruction of the visible states $a'$.
6. Calculate the outer product of $a'$ and $b'$ as the negative gradient, $\phi^{-} = a'^{T} \cdot b'$.
7. Calculate the updated weight from the positive gradient minus the negative gradient, as shown in Equation (15), where $\eta$ specifies the learning rate.
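The CD steps above can be sketched for a single training sample as follows. This is a minimal illustration that uses weights only, as in the description; bias terms are omitted, and the update is assumed to be $W \leftarrow W + \eta(\phi^{+} - \phi^{-})$. The helper names, the layer sizes and the initial weights are hypothetical.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def sample_binary(probabilities, rng):
    """Draw one 0/1 state per unit from its on-probability."""
    return [1 if rng.random() < p else 0 for p in probabilities]

def hidden_probabilities(visible, W):
    """p_b = sigma(a . W) for a row vector a and an n_visible x n_hidden matrix W."""
    return [sigmoid(sum(visible[i] * W[i][j] for i in range(len(W))))
            for j in range(len(W[0]))]

def visible_probabilities(hidden, W):
    """Reconstruction probabilities sigma(W . b) for the visible units."""
    return [sigmoid(sum(W[i][j] * hidden[j] for j in range(len(W[0]))))
            for i in range(len(W))]

def outer(x, y):
    """Outer product of two vectors as a nested list."""
    return [[xi * yj for yj in y] for xi in x]

def cd1_update(a, W, eta, rng):
    """One contrastive divergence step for one clamped sample a."""
    p_b = hidden_probabilities(a, W)                 # hidden probabilities
    b = sample_binary(p_b, rng)                      # sampled hidden states
    positive = outer(a, p_b)                         # positive gradient
    a_rec = sample_binary(visible_probabilities(b, W), rng)      # reconstruction a'
    b_rec = sample_binary(hidden_probabilities(a_rec, W), rng)   # resampled b'
    negative = outer(a_rec, b_rec)                   # negative gradient
    return [[W[i][j] + eta * (positive[i][j] - negative[i][j])
             for j in range(len(W[0]))] for i in range(len(W))]

rng = random.Random(42)
W = [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(4)]  # 4 visible, 3 hidden
sample = [1, 0, 1, 1]
W_new = cd1_update(sample, W, eta=0.1, rng=rng)
```

Many implementations use the hidden probabilities rather than sampled states in the negative gradient to reduce sampling noise; the sketch follows the sampled form described in the steps.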
Therefore, Equation (17) provides the squared error of pattern m, followed by the mean squared error in Equation (18).
The procedure for DBN training, which integrates pre-training (RBM) and normal training (MLP), is as follows:
1. The network model is initialized with randomly selected weights, biases and other relevant parameters.
2. The first RBM is initialized with the input data, which serve as potentials in its visible neurons, and unsupervised training is performed.
3. The input to the next layer is obtained by sampling the potentials generated in the hidden neurons of the previous layer, and unsupervised training follows.
4. These steps are repeated for the desired number of layers. Thus, the pre-training stage by RBM continues until it reaches the MLP layer.
5. The MLP phase refines the learning in a supervised manner and is repeated until it reaches the target error rate.
Figure 5 shows the pattern of signal descriptors for the normal signal and fault signals such as transient, sag and swell, with their corresponding wavelet descriptors. Let N be the time domain signal and $N_w$ be the corresponding wavelet descriptors, which is expressed as x. The transient signal is distinguished from the normal signal in the region of distorted dimension. At the initial stage, the voltage is assumed to be very low and then rises for a particular duration. This condition remains for some time, after which the voltage falls to a very low value. After a definite time the voltage rises again, which results in a normal sine wave. The sag signal begins with a reduced voltage and then returns to a normal sine wave. The swell signal, in contrast, begins with an overvoltage, that is, a voltage higher than normal. Since the temporary signal has only slight variation, it is difficult to differentiate the signals' behavior accurately. Thus, wavelet descriptors are adopted to differentiate the signals effectively.
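The qualitative description of the sag and swell signals can be illustrated with synthetic per-unit waveforms. In the sketch below, the disturbance occupies the first quarter of the record and the amplitude factors 0.5 and 1.5 are arbitrary assumptions; none of these values come from the article, and the transient case is not modeled.

```python
import math

def make_signal(kind, n_samples=400, cycles=8, window=(0.0, 0.25)):
    """Return a per-unit sine wave with an optional sag or swell.

    kind   -- "normal", "sag" or "swell"
    window -- fraction of the record during which the disturbance is active
    """
    scale = {"normal": 1.0, "sag": 0.5, "swell": 1.5}[kind]  # assumed factors
    lo = int(window[0] * n_samples)
    hi = int(window[1] * n_samples)
    signal = []
    for k in range(n_samples):
        amplitude = scale if lo <= k < hi else 1.0
        signal.append(amplitude * math.sin(2 * math.pi * cycles * k / n_samples))
    return signal

normal = make_signal("normal")
sag = make_signal("sag")
swell = make_signal("swell")
# Peak magnitude inside the disturbance window for each signal type.
print(max(abs(x) for x in normal[:100]),
      max(abs(x) for x in sag[:100]),
      max(abs(x) for x in swell[:100]))
```

Outside the disturbance window the three waveforms are identical, which is why descriptors that localize changes in time and scale, such as wavelet coefficients, are useful for telling them apart.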

Simulation Procedure
The experimentation on fault signal recognition in the distributed power system is simulated in MATLAB. Figure 6 shows the diagrammatic representation of the proposed simulation model. For the simulation, 70% of the data are used for training and 30% of the data are used for testing. Three transmission lines with diverse distances are integrated into the bus system at different load conditions. Here, the experimentation is done at a load of $10 \times 10^3$ W, which is further varied to $20 \times 10^3$ W, and the results are observed using the DBN model, revealing the behavior of fault signals such as transient, sag and swell signals under load variations. The source feeder helps to observe the characteristics of the normal signal. Figure 7 provides the generated signals at the two loading conditions. In addition, the performance of the proposed DBN model is compared with linear SVM [16], quadratic SVM [3], radial basis function (RBF) SVM [19], polynomial SVM [15], MLP SVM [6], Levenberg-Marquardt neural network (LM-NN) [7] and gradient descent neural network (GD-NN) [23] methods to validate the effectiveness of the proposed fault recognition model.
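A 70/30 partition of the data can be sketched as follows. The article does not say whether the split is random or stratified, so the shuffled split, the function name `split_dataset` and the toy data below are assumptions for illustration only.

```python
import random

def split_dataset(samples, labels, train_fraction=0.7, seed=0):
    """Shuffle the indices and split samples and labels into train and test parts."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    cut = round(train_fraction * len(order))
    train_idx, test_idx = order[:cut], order[cut:]
    train = ([samples[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([samples[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test

# Hypothetical data: 20 feature vectors covering four signal classes.
classes = ["normal", "transient", "sag", "swell"]
samples = [[float(k), float(k % 4)] for k in range(20)]
labels = [classes[k % 4] for k in range(20)]
(train_x, train_y), (test_x, test_y) = split_dataset(samples, labels)
print(len(train_x), len(test_x))  # 14 6
```

Fixing the seed keeps the partition reproducible, so that every classifier in a comparison is trained and tested on the same records.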

Performance Analysis
This section determines performance measures such as accuracy, sensitivity, specificity, precision, false-positive rate (FPR), false-negative rate (FNR), negative predictive value (NPV), false discovery rate (FDR), F1-score, G-mean and Matthews correlation coefficient (MCC) to validate the superiority of the proposed method. The performance analysis of the proposed DBN technique and the existing models for recognizing the normal signal is shown in Table 1. Similarly, the performance analysis of the proposed and conventional models for recognizing one of the fault signals, the transient signal, is shown in Table 2. Here, the accuracy of the proposed fault recognition model is 5.26% better than linear, quadratic and polynomial SVM, 2.04% better than RBF SVM and 7.52% better than both LM-NN and GD-NN methods. In the sensitivity measure, the proposed fault recognition model is 90% superior to linear SVM, MLP SVM, LM-NN and GD-NN models, the same as quadratic and polynomial SVM and 80% superior to the RBF SVM model. Moreover, Table 3 shows the performance analysis of sag signal recognition using both the proposed and conventional models.
Here, the accuracy of the proposed DBN model is 8.89% and 30.67% superior to quadratic SVM and MLP SVM, 3.15% better than linear SVM, LM-NN and GD-NN methods and the same as RBF SVM and polynomial SVM methods. In addition, the proposed DBN technique attains high precision, which is better than linear SVM, LM-NN and GD-NN methods by 1.11% and quadratic SVM by 2.13%, and worse than RBF SVM and polynomial SVM methods by 9%. In the case of the FPR measure, the DBN model is the same as linear SVM, LM-NN and GD-NN models, 3% worse than RBF SVM, polynomial SVM and MLP-NN methods and 70% better than RBF SVM model. Likewise, the NPV of the proposed DBN model is the same as linear SVM, MLP-NN, LM-NN and GD-NN models, 7.78% worse than quadratic SVM and 3% better than polynomial and RBF SVMs. Furthermore, the fault signal recognition analysis on the swell signal is shown in Table 4. The proposed DBN model is
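The measures listed at the start of this section can all be derived from the four counts of a binary confusion matrix. The sketch below uses the standard definitions; the G-mean is taken as the geometric mean of sensitivity and specificity, which is a common convention but an assumption here, since the article does not define it. The counts in the example are hypothetical.

```python
import math

def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def classification_measures(tp, tn, fp, fn):
    """Positive and negative performance measures from confusion-matrix counts."""
    sensitivity = safe_div(tp, tp + fn)
    specificity = safe_div(tn, tn + fp)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": safe_div(tp, tp + fp),
        "FPR": safe_div(fp, fp + tn),
        "FNR": safe_div(fn, fn + tp),
        "NPV": safe_div(tn, tn + fn),
        "FDR": safe_div(fp, fp + tp),
        "F1": safe_div(2 * tp, 2 * tp + fp + fn),
        "G-mean": math.sqrt(sensitivity * specificity),
        "MCC": safe_div(tp * tn - fp * fn, mcc_denominator),
    }

# Hypothetical counts for one signal class treated as the positive class.
measures = classification_measures(tp=40, tn=45, fp=5, fn=10)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

For a multi-class problem such as normal, transient, sag and swell, the counts are typically obtained one class at a time by treating that class as positive and all others as negative.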

RMSE Evaluation
The RMSE of the proposed DBN model and the traditional LM-NN method at different fault locations in the distribution bus system is shown in Table 5. For the 10-km transmission line, the performance of DBN is 71.74% better than LM-NN for the first localization scenario, 100% better for the second scenario, 60.66% better for the third scenario, 51.11% better for the fourth scenario and 56.86% worse for the fifth scenario. Even though the RMSE value is higher for the proposed DBN model in some cases, the DBN outperforms LM-NN when the mean values are taken. Accordingly, the RMSE performance of the proposed DBN model is 39.52%, 36.96% and 8.70% superior to the LM-NN model for transmission lines of 10, 15 and 20 km, respectively. Thus, the DBN is more effective than the LM-NN model in recognizing faults in the power distribution system.
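The article does not state how the percentage differences are computed. One plausible reading, assumed in the sketch below, is the relative reduction in error with respect to the baseline, so that a positive value means the proposed model has the lower RMSE. The numeric values in the example are hypothetical and are not taken from Table 5.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equally long sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def percent_improvement(proposed_error, baseline_error):
    """Relative error reduction in percent; negative when the proposed model is worse."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error

# Hypothetical fault distances (km) and two sets of estimates.
actual = [2.0, 4.0, 6.0, 8.0]
proposed = [2.1, 3.9, 6.2, 7.9]
baseline = [2.4, 3.5, 6.5, 7.4]
print(rmse(proposed, actual), rmse(baseline, actual))
print(percent_improvement(rmse(proposed, actual), rmse(baseline, actual)))
```

Under this reading, a reported improvement of 100% corresponds to a proposed-model RMSE of zero for that scenario.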

Loading Effects
The recognition accuracy of different fault signals such as transient, sag and swell as well as the normal signal in the power distribution system using the proposed and conventional intelligent models at different load
Likewise, the proposed DBN model attains high accuracy at a load of $10 \times 10^3$ W for recognizing the swell signal, which is better than linear SVM and MLP SVM by 30%, 39.28% better than RBF SVM and the same as quadratic SVM, polynomial SVM, LM-NN and GD-NN methods. In addition, the accuracy of the proposed DBN is 2.63%, 11.42% and 63% better than linear SVM, RBF SVM and polynomial SVM, 30% better than quadratic SVM and MLP SVM and 25.80% better than LM-NN and GD-NN methods. Hence, the efficiency of the proposed DBN model for recognizing different fault signals in the power distribution system is ensured.
Moreover, Figure 8 shows the mean performance of fault signal recognition in the power distribution system at low loading conditions such as $10 \times 10^3$ and $20 \times 10^3$ W. As shown in Figure 8A, the accuracy of the proposed method is 10% better than linear SVM and RBF SVM, 4.21% better than quadratic SVM, 2.06% better than polynomial SVM, 98% better than MLP SVM and 1.02% better than both LM-NN and GD-NN methods. In the case of the sensitivity measure, the proposed DBN model is 40% superior to both linear SVM and RBF NN models, 8.89% superior to quadratic SVM, 2.08% superior to polynomial SVM, 91.83% superior to MLP SVM and 8.89% superior to LM-NN and GD-NN techniques. In addition, the specificity of the proposed model is better than linear SVM and RBF SVM methods by 11.11%, quadratic SVM and polynomial SVM by 2.03%, MLP SVM by 42.84% and LM-NN and GD-NN methods by 3.08%. In Figure 8B, for the negative measures, the DBN model attains low FDR, which is 90%, 93.33%, 60% and 96% better than linear SVM, RBF SVM, polynomial SVM and MLP SVM and 80% better than quadratic SVM, LM-NN and GD-NN methods.
For positive measures at a load of $20 \times 10^3$ W, the proposed DBN model in Figure 8C is 11% superior to linear SVM, LM-NN and GD-NN models, 12.5% superior to RBF NN, 15.38% superior to polynomial SVM and 26% superior to MLP-NN methods in terms of accuracy. Similarly, the precision of the proposed DBN is 43%, 50%, 2.29%, 53% and 56% better than linear SVM, quadratic SVM, RBF SVM, polynomial SVM, MLP SVM and LM-NN and GD-NN methods. In Figure 8D, the proposed model is 71%, 33%, 80% and 83% superior to quadratic SVM, RBF SVM, polynomial SVM and MLP SVM and 60% superior to linear SVM, LM-NN and GD-NN techniques. Hence, the proposed DBN model is more effective than traditional methods in recognizing and classifying different fault signals in the power distribution system.

Conclusions
This article has proposed a fault diagnosis methodology for a power distribution system based on the DBN technique. The fault signals considered in this simulation included transient, sag and swell signals. Initially, wavelet-decomposed fault signals were extracted, and the DBN model was then used to diagnose the fault based on the decomposed signal. After the implementation, type I, type II and RMSE measures were determined for the purpose of the analysis. Here, the performance of the DBN model in recognizing faults was compared with traditional models, namely linear SVM, quadratic SVM, RBF SVM, polynomial SVM, MLP SVM, LM-NN and GD-NN methods. From the mean performance analysis on different fault signals, the accuracy of the proposed method was 10% superior to linear SVM and RBF SVM, 4.21% superior to quadratic SVM, 2.06% superior to polynomial SVM, 98% superior to MLP SVM and 1.02% superior to both LM-NN and GD-NN methods at a load of $10 \times 10^3$ W. Likewise, the accuracy of the proposed DBN model was 11% better than linear SVM, LM-NN and GD-NN models, 12.5% better than RBF NN, 15.38% better than polynomial SVM and 26% better than MLP-NN methods at a load of $20 \times 10^3$ W. Thus, the DBN model outperforms the traditional models in diagnosing fault signals in the power distribution system.