An innovative learning approach for solar power forecasting using genetic algorithm and artificial neural network

Abstract Analysing the output power of a solar photovoltaic (PV) system at the design stage, and predicting its performance under different weather conditions, is essential work that must be carried out before any installation. Owing to the large penetration of solar PV systems into the traditional grid and the growing construction of smart grids, clean and economical power must be injected into the grid so that grid disturbances can be avoided. The level of solar power that a PV system can generate depends on the environment in which it operates and, in particular, on two important factors: the amount of solar insolation and the temperature. Because these two factors are intermittent in nature, forecasting the output of a solar PV system is a difficult task. In this paper, a comparative analysis of different solar PV forecasting methods is presented. A MATLAB Simulink model based on real-time data collected from Odisha (20.9517°N, 85.0985°E), India, is used to forecast the performance of the solar PV system.


Introduction
Power plants based on renewable energy systems have drawn the attention of power researchers over the last decade owing to their widespread deployment. Large-scale expansion of these sources has helped to meet the increase in demand for electrical energy. This expansion is driven not only by economic or political reasons but also by the need to create a suitable environment for the next generation, in which power is produced from clean sources such as solar and wind with zero environmental pollution. Governments are also making considerable efforts, such as carbon credit incentives, subsidies for the installation of solar photovoltaic systems, and promotion of the green building concept for educational institutes. A survey found that by the year 2035, electricity generation based on renewable energy sources (RES) will account for one third of the total electricity produced by the country.
For large-scale interconnection of solar photovoltaic systems, the daily solar insolation available in the geographical area where the system is likely to operate must be forecast from an operation and maintenance point of view. Power engineers must also be informed about the different power quality issues that will be faced throughout the day because of the intermittent nature of solar PV output. Unit commitment is another essential consideration for any type of power generating unit. Day-ahead unit commitment of a renewable energy generating system allows the reserve power generation system to be run more efficiently, which not only minimizes both time and cost but also increases grid reliability by injecting clean power into the traditional grid.
Forecasting and unit commitment for a day-ahead system help the generating station engineer to manage the power demand properly, thereby maintaining a balance between generation and demand. Because many environmental parameters are involved, such as temperature, cloud quantity and dust, exact prediction of PV power output becomes a difficult task. A number of forecasting methods have been introduced by many researchers in the last decade, all of them for long-term prediction of solar PV systems. From the literature, four basic types of power forecasting method can be identified: the numerical approach, the hybrid approach, the AI technique approach and the physical approach. The numerical approach is equivalent to the statistical approach, which applies regression analysis to past historical data to predict the output. Artificial intelligence (AI), a slight modification of the statistical approach, uses back-propagation and forward algorithms to arrive at a particular result. Apart from these methods, physical prediction of solar PV data from weather conditions, using numerical methods and satellite images, has long been in use. Combining all these approaches in a single unit yields a hybrid system that can predict the solar PV output from satellite images, AI techniques and numerical analysis. Some other statistical methods start with a mathematical function that describes the linear and nonlinear relationships between the data sets and their behaviour with respect to the environmental parameters, with the objective of minimizing the variation of that function. In this case the analysis takes a long time, which delays the convergence of the optimized system parameters.
This paper presents a comparative analysis of the forecasting methods mainly used by researchers over the past decade. It describes an artificial-intelligence-based extreme learning algorithm for solar forecasting in which the hidden-layer weights and hidden biases of the network, which are otherwise selected arbitrarily, are chosen by applying a genetic algorithm to the master real-time data collected from open-source meteorological department data. The remainder of the paper is arranged as follows. Section 1 gives a brief description of forecasting. Section 2 deals with the modelling of PV cells along with different MPPT techniques, with special focus on the incremental conductance method. Sections 3 and 4 present the result analysis and a comparison with the new technique. Section 5 presents the conclusion along with future developments.

PV Model
The main aim of solar PV forecasting is to forecast the weather conditions, such as temperature and solar radiation, and from them the PV output of a particular system. A standardised model is always helpful in predicting the performance of solar PV systems of different capacities under any environmental condition.

Solar PV plant
Different methods of PV modelling are described in the literature, such as the one-diode and two-diode models. Increasing the number of diodes in the model allows the losses occurring in the system to be calculated more exactly. Wolf proposed a mathematical description of the solar cell consisting of a current source, a diode connected in anti-parallel, and two resistors, one in series and one in parallel. In Wolf's model, $G$ represents the solar radiation, $G_{stc}$ the standard solar radiation, $I_{ph,stc}$ the photo-generated current under standard test conditions (STC), and $T$ and $T_{stc}$ the temperature and the temperature at STC respectively. Similarly, in the expression for the maximum power generated by the solar PV module, the total conversion efficiency of the entire solar PV array is represented by $\eta$, the total area covered by the array by $A$ (m²), the solar insolation falling on the array by $I$ (kW/m²), and the ambient temperature of the array by $t$ (°C). The real-time model developed in MATLAB Simulink consists of 72 cells with a total maximum output power of 300 Wp ($P_{max}$). The maximum short-circuit current is 5.&@ A and the open-circuit voltage is 23.4 V. The shunt and series resistances representing the lead connection resistance are 1200 ohm and 0.1 milliohm respectively.
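As an illustration of how the quantities defined above are typically combined, the following minimal Python sketch uses two widely quoted textbook forms: a photo-current that scales linearly with irradiance and is corrected for temperature, and a maximum power with a linear temperature derating. These forms, the temperature coefficient `k_i`, the derating factor `temp_coeff = 0.005` and the reference temperature of 25 °C are assumptions made for illustration; they are not necessarily the exact expressions used in this work.

```python
def photo_current(g, t_cell, i_ph_stc, k_i, g_stc=1000.0, t_stc=25.0):
    """Photo-generated current (A).

    Assumed form: I_ph = (G / G_stc) * (I_ph,stc + k_i * (T - T_stc)).
    k_i (A per degree C) is a hypothetical temperature coefficient.
    """
    return (g / g_stc) * (i_ph_stc + k_i * (t_cell - t_stc))


def max_power(eta, area_m2, insolation_kw_m2, t_ambient,
              temp_coeff=0.005, t_ref=25.0):
    """Maximum array power (kW).

    Assumed form: P = eta * A * I * (1 - temp_coeff * (t - t_ref)).
    temp_coeff and t_ref are illustrative defaults, not values from the paper.
    """
    derating = 1.0 - temp_coeff * (t_ambient - t_ref)
    return eta * area_m2 * insolation_kw_m2 * derating


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the functions.
    print(photo_current(g=800.0, t_cell=35.0, i_ph_stc=5.0, k_i=0.003))
    print(max_power(eta=0.15, area_m2=2.0, insolation_kw_m2=0.8, t_ambient=35.0))
```

At the reference conditions both functions reduce to their nominal values, which is a quick sanity check on any implementation of this kind.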

Aspect of PV power Forecasting
The choice of input variables and the effect of environmental factors both affect the accuracy of the developed model. Prediction of PV generation in a given environment depends on the following factors. a) Historical data of the PV generating system, for example from the past decade. b) Meteorological variables such as ambient temperature, cloud coverage, wind speed, shading due to dust, irradiance and global solar insolation. Generally there are four kinds of forecasting, which are as follows.

(1) Intraday Forecasting
In the competitive energy market, ensuring the availability of electrical energy at the point of demand is the most challenging job. Intraday forecasting, which usually covers a horizon of a few seconds to minutes, helps to ensure the availability of the storage device connected to the solar PV system and of the PV system as a whole. This increases the efficiency and reliability of the grid-connected PV system.

(2) Short term forecasting
Economic load dispatch, and thereby easy distribution of power, is an essential part of any power distribution network. Short-term forecasting is usually carried out for 2-3 days. Day-ahead forecasting enables power purchasers and distribution companies to allocate the load according to the availability of power or energy.

(3) Medium term Forecasting
A power system network always faces some kind of breakdown, which requires periodic maintenance of the network. Medium-term forecasting usually covers 3 to 7 days. This enables the operation and maintenance staff to reconnect the system and bring it back to the level required for power transmission and distribution.

(4) Long term forecasting
Long-term forecasting usually covers a horizon from a week to a month, or even a year. It involves many parameters, and extensive rigorous calculation is usually carried out to forecast the power in watts. From the above discussion it can be seen that forecasting of solar PV power helps in deciding the generation commitment of generating units, economic load dispatch, real-time unit commitment and storage system selection for the electricity market. Of the four forecasting methods, short-term forecasting is the one usually carried out by power researchers for solar PV systems.

(5) Data Synthesis
Processing and synthesizing a large volume of data is always a challenge. In our simulation and analysis work, priority was given to minimizing the error between two search algorithms. The objective function is defined in terms of $X_{min}$ and $X_{max}$, which represent the minimum and maximum values of temperature and wind speed between two data sets. This process is repeated in subsequent iterations until it converges to the best possible year series. $X$ represents each month of the corresponding year for which the analysis is carried out.
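One common way in which minimum and maximum values of a variable enter a pre-processing step is min-max scaling. The following Python sketch is given under the assumption that such a scaling is applied to each monthly series before comparison; the function name `min_max_normalise` and the sample values are hypothetical and are not taken from the data used in this work.

```python
def min_max_normalise(values):
    """Scale a series to the range [0, 1] using its own minimum and maximum.

    Assumed form: x_norm = (x - x_min) / (x_max - x_min).
    A constant series is mapped to all zeros to avoid division by zero.
    """
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


if __name__ == "__main__":
    # Hypothetical monthly temperatures in degrees Celsius.
    monthly_temperature = [22.5, 25.0, 29.5, 33.0, 34.5, 31.0]
    print(min_max_normalise(monthly_temperature))
```

Scaling temperature and wind speed to a common range in this way keeps one variable from dominating an error measure simply because of its units.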

(6) Data Analysis
In this research paper, different statistical analysis tools were used to analyse the predicted results. To quantify how far the predicted data lie from the line of best fit, the root mean square error (RMSE) is used. This measure is commonly applied in climate prediction and in regression analysis to verify experimental results. RMSE can be found using equation 3,

$$\mathrm{RMSE} = \sqrt{\overline{(f - o)^2}} \tag{3}$$

where $f$ represents the forecasted or predicted value, $o$ represents the observed or base value for which the forecast was conducted, and the bar represents the mean of that quantity. Equation 3 can be rewritten as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Z_{f_i} - Z_{o_i}\right)^2} \tag{4}$$

where $Z_{f_i} - Z_{o_i}$ represents the difference between the two quantities and $N$ represents the sample size of the observed quantity. The difference between two continuous variables can also be represented by the mean absolute error (MAE). MAE represents the average vertical distance between each predicted point and the identity line. Equation 5 can be used to calculate MAE,

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|e_i\right| \tag{5}$$

where $e_i$ represents the error between the two quantities at sample $i$ and $n$ represents the number of samples. The mean absolute percentage error (MAPE), also called the mean absolute percentage deviation (MAPD), is generally used in statistics to express the accuracy of a prediction. It is usually represented as

$$\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{A_t - F_t}{A_t}\right| \tag{6}$$

where $A_t$ represents the actual value, $F_t$ represents the forecasted value and $n$ represents the number of samples. A combination of these three techniques can be used to assess forecasts of the performance of a solar PV system under different weather conditions.
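The three error measures described above (RMSE, MAE and MAPE) can be sketched in a few lines of Python using their standard definitions. The function names and the sample series are illustrative assumptions; the numbers are not results from this study.

```python
import math


def rmse(forecast, observed):
    """Root mean square error between forecasted and observed values."""
    n = len(observed)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)


def mae(forecast, observed):
    """Mean absolute error between forecasted and observed values."""
    n = len(observed)
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / n


def mape(forecast, actual):
    """Mean absolute percentage error; actual values must be non-zero."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - f) / a) for f, a in zip(forecast, actual))


if __name__ == "__main__":
    # Hypothetical monthly AC energy values (kWh), for illustration only.
    observed = [410.0, 395.0, 430.0, 450.0]
    forecast = [400.0, 400.0, 425.0, 460.0]
    print("RMSE:", rmse(forecast, observed))
    print("MAE :", mae(forecast, observed))
    print("MAPE:", mape(forecast, observed))
```

RMSE penalises large individual errors more heavily than MAE, while MAPE expresses the error relative to the actual value, which is why the three are usually reported together.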

Introduction
Loni J. et al. [69], in their paper on cloud advection forecasting, demonstrated a forecasting method that uses estimated cloud motion vectors. They collected the data from rooftop PV systems. Target locations are then calculated from the median of the transposed measurements. A correlation approach was used to test the accuracy of the forecast. Yarg et al. [71] analysed the forecasting problem using the "Lasso" parameter shrinkage method. The method is trained on the recent measurement history, and the motion "upwind" and "downwind" is assumed to be static. Achleitner et al. [76] introduced a peak matching algorithm, which matches the peak values of the measured data and the PV farm in order to establish the momentary time lag between the clouds.

Problem Definition
Forecasting methods based on NWP/satellite resolution, statistical methods and black-box methods of comparison are not time dependent. Regression analysis generally uses a number of historical data points to predict or forecast the solar PV output. However, an approach based on historical data may not be applicable to a fast-changing environment and is sometimes unsuitable for dynamic analysis. Therefore, in this paper an AI-based fractional order derivative (AI-FOD) is introduced to measure the observed values and process them under dynamically changing environmental conditions before forecasting.

(7) Pearson's Correlation Coefficient
It measures the correlation between two variables. The Pearson correlation coefficient $\rho$ can be evaluated through equation (7),

$$\rho = \frac{\operatorname{cov}(P, \hat{P})}{\sigma_{P}\,\sigma_{\hat{P}}} \tag{7}$$

where the numerator represents the covariance of the actual power $P$ and the forecasted power $\hat{P}$, and the denominator is the product of the standard deviations of the two quantities. From equation (7) it can be concluded that the larger the value of Pearson's coefficient, the smaller the error between the forecasted values and the data.
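The Pearson correlation coefficient between actual and forecasted power can be computed directly from its definition as covariance divided by the product of the standard deviations. The sketch below uses only the Python standard library; the function name `pearson` and the sample values are illustrative assumptions.

```python
import math


def pearson(actual, forecast):
    """Pearson correlation: cov(actual, forecast) / (std_actual * std_forecast)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_f = sum(forecast) / n
    cov = sum((a - mean_a) * (f - mean_f) for a, f in zip(actual, forecast)) / n
    std_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual) / n)
    std_f = math.sqrt(sum((f - mean_f) ** 2 for f in forecast) / n)
    return cov / (std_a * std_f)


if __name__ == "__main__":
    # Hypothetical actual and forecasted power values (kW).
    actual = [1.2, 1.8, 2.5, 2.9, 2.1]
    forecast = [1.1, 1.9, 2.4, 3.0, 2.0]
    print(pearson(actual, forecast))
```

A value close to 1 indicates that the forecast follows the measured series closely; the function assumes that neither series is constant, since a zero standard deviation would make the ratio undefined.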

(8) Skewness and Kurtosis
Skewness represents the asymmetry present in the probability distribution function of the system. The skewness index is given by

$$\gamma = E\left[\left(\frac{e - \mu_e}{\sigma_e}\right)^{3}\right]$$

where $\gamma$ represents the skewness index, $e$ represents the error between the actual and forecasted results, and $\mu_e$ and $\sigma_e$ represent the mean and standard deviation of the forecast error respectively. Similarly, the kurtosis $K$ measures the magnitude of the peak of the distribution. $K$ can be calculated as

$$K = \frac{\mu_4}{\sigma^{4}}$$

where $\mu_4$ represents the fourth moment about the mean and $\sigma$ represents the standard deviation of the forecast error.
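Skewness and kurtosis of the forecast error can be computed from the standardised third moment and from the ratio of the fourth central moment to the squared variance. The following Python sketch assumes population (not sample-corrected) moments and does not subtract 3 from the kurtosis; both choices, and the sample errors, are assumptions for illustration.

```python
import math


def skewness(errors):
    """Third standardised moment of the forecast errors."""
    n = len(errors)
    mu = sum(errors) / n
    sigma = math.sqrt(sum((e - mu) ** 2 for e in errors) / n)
    return sum(((e - mu) / sigma) ** 3 for e in errors) / n


def kurtosis(errors):
    """Fourth central moment divided by the squared variance (no excess correction)."""
    n = len(errors)
    mu = sum(errors) / n
    variance = sum((e - mu) ** 2 for e in errors) / n
    mu4 = sum((e - mu) ** 4 for e in errors) / n
    return mu4 / variance ** 2


if __name__ == "__main__":
    # Hypothetical forecast errors (actual minus forecast), in kWh.
    errors = [-12.0, 4.0, 7.5, -3.0, 1.5, 9.0, -6.0]
    print("skewness:", skewness(errors))
    print("kurtosis:", kurtosis(errors))
```

A skewness near zero suggests that over-forecasts and under-forecasts are balanced, while a large kurtosis points to occasional large errors.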

Application Of Genetic Algorithm To Forecasting
Optimizing a data set in order to calculate a particular value usually involves a long iterative calculation and an initial guess. Traditional optimization has two basic characteristics: it assumes continuity, and it converges to a particular value that depends on the initial assumption. In contrast to the traditional techniques, GA-based optimization works only on the objective function and its boundary values to find the best possible solution. Some of the distinguishing characteristics of GA are as follows.
i) GA works on binary-coded data rather than on the original data. ii) GA usually works on a population of data points rather than a single point, which enables it to find the best possible result or fittest value. iii) GA uses probabilistic logic and works directly on the objective function, so the need for extra parameters to evaluate the objective function may be eliminated.
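The first characteristic listed above, working on binary-coded data, can be illustrated with a small encoding and decoding sketch. The number of bits, the variable range and the function names `encode` and `decode` are hypothetical choices made for illustration, not parameters reported in this work.

```python
def encode(value, lower, upper, n_bits=10):
    """Map a real value in [lower, upper] to a fixed-length binary chromosome."""
    levels = 2 ** n_bits - 1
    index = round((value - lower) / (upper - lower) * levels)
    return format(index, f"0{n_bits}b")


def decode(bits, lower, upper):
    """Map a binary chromosome back to a real value in [lower, upper]."""
    levels = 2 ** len(bits) - 1
    return lower + int(bits, 2) / levels * (upper - lower)


if __name__ == "__main__":
    # Hypothetical example: a temperature between 0 and 50 degrees Celsius.
    chromosome = encode(31.7, 0.0, 50.0)
    print(chromosome, decode(chromosome, 0.0, 50.0))
```

The decoded value differs from the original by at most half a quantisation step, so the number of bits sets the resolution with which the GA can search the variable.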
The fitness function of each individual is defined in terms of the following quantities. $f(x)_{ind}$ is the fitness of each Pareto individual involved. $obj_{ind}$ is the objective of each individual chromosome. $obj_{ind(min)}$ is the objective of the smallest individual chromosome. $obj_{ind(max)}$ is the objective of the largest individual chromosome. The objective function shown in (3) must satisfy a constraint in which the two $G_s(x)$ terms represent the calculated and limiting values of the constraint, and $s$ denotes the inequality constraints.
Elite individuals, which are inherent in GA, usually occur about twice as often as individually selected ones. The GA population therefore generates a considerable number of elite individuals, and their excessive penetration into the next generation must be prevented. Hence the fitness function described in (3) is modified, where $f(x-1)_{ind}$ is the fitness function at $(x-1)$ and $f(u) = 1/\text{population size}$. Eq. (5) shows that the objective function removes the excessive presence of each individual elite member and thereby restricts its entry into the next generation. Duplicated individuals are redundant to the population, so their fitness is set to zero to avoid duplication.
The selection operator finds the best chromosomes to be transferred to the next stage, based on their fitness values. A chromosome with a lower fitness value is much less likely to be selected than a chromosome with a larger fitness value.
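The selection step described above, together with the rule that duplicated individuals receive zero fitness, can be sketched as fitness-proportionate (roulette wheel) selection. The min-max fitness scaling in `scaled_fitness` assumes a minimisation problem and is an illustrative choice, not necessarily the fitness function used in this work; all function names and sample chromosomes are hypothetical.

```python
import random


def scaled_fitness(objectives):
    """Scale objectives to [0, 1]; a smaller objective gives a larger fitness (assumed)."""
    lo, hi = min(objectives), max(objectives)
    if hi == lo:
        return [1.0] * len(objectives)
    return [(hi - obj) / (hi - lo) for obj in objectives]


def zero_duplicates(population, fitness):
    """Set the fitness of every repeated chromosome to zero, keeping the first copy."""
    seen = set()
    result = []
    for chromosome, fit in zip(population, fitness):
        result.append(0.0 if chromosome in seen else fit)
        seen.add(chromosome)
    return result


def roulette_select(population, fitness, k, rng=random):
    """Pick k parents with probability proportional to fitness.

    At least one fitness value must be positive.
    """
    return rng.choices(population, weights=fitness, k=k)


if __name__ == "__main__":
    # Hypothetical binary chromosomes and objective values.
    population = ["0101", "0101", "1100", "0011"]
    objectives = [2.0, 2.0, 5.0, 1.0]
    fitness = zero_duplicates(population, scaled_fitness(objectives))
    print(fitness)
    print(roulette_select(population, fitness, k=4, rng=random.Random(0)))
```

Because the selection probability is proportional to fitness, the duplicated chromosome and the worst chromosome (both with zero fitness here) are never passed on, while fitter chromosomes are chosen more often.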

Experimental Setup & Result Analysis
Forecasting or unit commitment for a solar photovoltaic system depends strongly on solar radiation and temperature. Apart from these parameters, others such as humidity and shading also affect the performance of unit commitment. In this experiment the data set was collected from Odisha (20.9517°N, 85.0985°E), India. Based on the historical data, data validation was carried out through an ANN implemented in MATLAB. A novel criterion, a two-point stop method, was adopted for validating the data; the algorithm is shown in Figure 1. All computations were carried out on a Linux PC with a 2.4 GHz processor and 1 GB of RAM. Data such as solar insolation and temperature were validated with an artificial neural network written in Python. Data from 2000 to 2017 were collected and processed with the ANN, and the best results for each month were collected in an Excel sheet. After applying the regression and root mean square error functions, it was found that all the approximated results are satisfactory and that the error between them is within the tolerance limit. It is worth mentioning that the error calculated using RMSE and MAE deviates only slightly from the ANN-based approximated result. Table 1 shows the observed and calculated values of the AC system output along with the calculated AC output energy of the system. Columns 5 and 6 give the observed ambient temperature and the ambient temperature calculated using the ANN. In the present research, ambient temperature is considered to be the variable quantity that ultimately affects the performance of the solar photovoltaic system. Solar irradiance for each month is assumed to be constant. The MSE and the hidden layer details of the ANN model are shown in Table 2.
From Table 2 it can be concluded that, for a validation value of 0.0131, the numbers of hidden layers in the first part and second part are 20 and 10 respectively. For a test result of 0.0131, the training value was only 0.0023. Hence the corresponding temperature was taken as the final calculated value, or best value, for which the error between the observed and forecasted values is minimal. Tables 3 and 4 show the statistical analysis, namely the regression and probability analysis. From Table 4 it can be seen that the probability of prediction lies in the range of 12.5 to 95.833 percentile. The normal error present in the statistical analysis is 1123.718. To validate the forecast, GA is applied for each month on the objective function. The objective function shown in equation 14 was derived from the polynomialisation of temperature and AC output (kWh), which is shown in Figure 2. Figure 3 shows the simulation and data validation of the forecasted values with the genetic algorithm. The forecasted results shown in Table 1 are found to be within the normal range. A statistical analysis of temperature, similar to that of the AC output forecast, was also carried out. Tables 5 and 6 show the statistical analysis of the temperature histogram over 18 years and the regression analysis using ANOVA for temperature respectively. One-way analysis of variance (ANOVA) is used to determine whether there are any statistically significant differences between the means of two or more independent groups. Here df represents the degrees of freedom, which are 1 for regression and 16 for residual in the present study. Significance F represents the ratio of mean square error to sum square error, which is unity in the present case. This signifies that the forecasted result for solar PV is highly accurate with respect to temperature. The F-test is used for comparing the factors of the total deviation.
In the present ANOVA, F is found to be 22.1812, which is inside the prescribed limit of F.
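The regression ANOVA quantities discussed above can be reproduced for a simple linear regression with one predictor. With 18 annual values (2000 to 2017), the regression has 1 degree of freedom and the residual has 18 − 2 = 16, which matches the df values quoted. The sketch below is a minimal illustration assuming an ordinary least-squares fit; the function name and the synthetic data are hypothetical and do not reproduce the F value reported in this study.

```python
def regression_f_statistic(x, y):
    """F statistic for a simple linear regression of y on x.

    Returns (F, df_regression, df_residual), where F is the regression
    mean square divided by the residual mean square.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]
    ss_regression = sum((fi - mean_y) ** 2 for fi in fitted)
    ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    df_regression, df_residual = 1, n - 2
    f_value = (ss_regression / df_regression) / (ss_residual / df_residual)
    return f_value, df_regression, df_residual


if __name__ == "__main__":
    # Synthetic data: 18 yearly temperatures and a noisy linear output (illustrative only).
    temperature = [26.0 + 0.1 * i for i in range(18)]
    output = [400.0 - 3.0 * t + (2.0 if i % 2 else -2.0)
              for i, t in enumerate(temperature)]
    print(regression_f_statistic(temperature, output))
```

The resulting F value is then compared with the critical value of the F distribution for (1, 16) degrees of freedom at the chosen significance level.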

Conclusion
The level of solar power that can be generated by a solar photovoltaic system depends on the environment in which it operates and on two other important factors: the amount of solar insolation and the temperature. The application of GA to forecasting the AC output of a solar system is discussed in this paper. It is found that forecasting using GA is more convenient and accurate than the statistical method of analysis. In the next paper of this series, optimisation of solar PV output with respect to two variables, temperature and solar radiation, will be presented. Issues of grid-connected solar photovoltaic systems based on the forecasted results, and their mitigation techniques, will be discussed in future work.