The energy transition entails a rapid uptake of renewable energy sources. Besides physical changes within the grid infrastructure, energy storage devices and their smart operation are key measures to master the resulting challenges like, e. g., a highly fluctuating power generation. For the latter, optimization based control has demonstrated its potential on a microgrid level. However, if a network of coupled microgrids is considered, iterative optimization schemes including several communication rounds are typically used. Here, we propose to replace the optimization on the microgrid level by using surrogate models either derived from radial basis functions or neural networks to avoid this iterative procedure. We prove well-posedness of our approach and demonstrate its efficiency by numerical simulations based on real data provided by an Australian grid operator.

The share of renewable energy sources is increasing rapidly, driven in part by the growing number of installed devices such as household-level solar panels. Hence, households become *prosumers*, i. e., power is not only consumed but also produced. Therefore, energy generation and distribution take place in a distributed way. In particular, energy can be transmitted bidirectionally between the grid and the prosumers, which results in a paradigm shift in the grid organization. In addition, prosumers may also possess some kind of energy storage device in order to manipulate their power demand profiles by either charging or discharging. From the grid operator's perspective it might be beneficial that charging decisions are not made based on local information only. Instead, taking into account information on the entire grid may improve the system-wide operation, e. g., by flattening the overall power demand within the grid in order to facilitate the power supply [1]. In order to achieve this goal, communication is needed. In the future, each household shall be equipped with a smart meter, which yields so-called *smart homes*. Smart meters collect data and communicate with the grid operator automatically.

A straightforward way to optimally operate the overall system is to formulate one large-scale optimization problem and to solve it in a centralized way, see, e. g., [2]. This approach, however, is hard to realize in practice. One of the disadvantages is that some central node needs the complete information about the grid, which is not desirable, e. g., due to data privacy. Alternatives are decentralized or distributed optimization algorithms. In [3] the authors propose a decentralized approach to steer energy storage systems in order to avoid over-capacity of pole transformers while maintaining a high charging amount of energy storage systems in low-voltage distribution systems. The other option mentioned above is the class of distributed optimization methods such as distributed dual ascent [4], the Alternating Direction Method of Multipliers (ADMM) [5] or Augmented Lagrangian based Alternating Direction Inexact Newton (ALADIN) [6]. These algorithms use a star-shaped communication topology, i. e., each smart home communicates only with the grid operator and does not share any information with its neighbours. Nevertheless, in every iteration each household has to transmit specific (personal) data to the grid operator, see also [7] and [8] for an application of ADMM and ALADIN to electrical networked systems, respectively. In order to exploit the potential of these algorithms they are typically embedded within a Model Predictive Control (MPC) framework. MPC is a state-of-the-art technique to tackle optimal control problems by solving finite-dimensional optimization problems successively, see e. g., [9] for an introduction to MPC and [10], [11] for MPC approaches in electrical networks.

An alternative option to steer the power demand of local agents besides battery control is to schedule so-called controllable loads. Controllable loads can be shifted in time to avoid bottlenecks in the energy supply, see e. g., [12], [13]. There is also a large potential in the context of stochastic optimization of smart grids. For weather forecasting methods we refer to [14]. How to integrate electric vehicles into the electricity network under uncertainties is described in [15].

Considering the power networks described so far, it is assumed that exchange of energy within the grid is possible at any time and does not cause any losses or additional costs, which might (approximately) hold for domestic nets, e. g., a town. In this paper, we refer to these grids as microgrids (MGs). In [16], [17], the concept of coupled MGs is used to tackle large-scale problems incorporating several MGs. In the latter, the authors show that even if each single MG is optimally operated, there is still room for improvement if energy can be exchanged among MGs. Therefore, a second optimization problem is solved on a higher grid level in order to optimally exchange energy resulting in a bilevel optimization problem [18].

In [19], the authors propose to replace the distributed optimization routine on the lower grid level by a surrogate model in order to speed-up the calculation and further reduce communication effort. Here, Radial Basis Functions (RBFs) [20] are used to approximate the input-output behaviour of ADMM within the framework of coupled MGs established in [17]. Besides RBFs there are various methods to learn the behaviour of a complex function. Artificial Neural Networks (NNs) are one of the most popular representatives of modern artificial intelligence techniques and are often used in practice due to their success in various application fields, see e. g., the survey article [21]. In [22] the authors forecast loads in a power grid using NNs, whereas in [23] NNs are used in an optimal power flow framework. The main advantage of using surrogates is that communication effort can be reduced.

In this paper, we extend the idea of coupled microgrids established in [17] by proposing an iterative *bi-directional* optimization routine in order to improve the overall performance. Due to its iterative structure, however, our method comes along with a strong need for communication between smart homes and grid operator. As a remedy we present two approaches to reduce the communication effort by substituting the optimization on microgrid level via surrogate models. A main difference compared to [19] lies in the different input-output map that is replaced by the surrogate models, for which we can show that each input uniquely determines an (optimal) output. Furthermore, we also take NNs as potential surrogate models into account and study the performance of the resulting approximations numerically in an MPC framework. Our simulations show that the proposed method approximately recovers the performance based on using ADMM but significantly reduces the communication burden. The effect of applying surrogate models within MPC extends our previous work [19] where a surrogate model based on RBFs was only applied in a static optimization problem.

The paper is structured as follows: In Section 2 we formulate a mathematical model for coupled microgrids that consists of two hierarchy levels, and introduce optimization problems corresponding to each of them. In the subsequent section, we propose an iterative scheme that requires the solution of a distributed optimization problem on the lower level, which is solved using ADMM. In Section 4, we investigate the impact of disturbances w. r. t. the lower-level solution on the performance measured in terms of the upper-level objective function. Based on the results, we propose to replace ADMM by surrogates in order to reduce communication effort and computation time. The performance of the optimization scheme incorporating surrogates is analysed in an MPC framework in Section 5.

We consider a system of coupled microgrids (MGs) and call it a *smart grid*. Each MG consists of several residential energy systems (agents) coupled through the grid operator, which can be seen as a Central Entity (CE). The MGs are coupled through a network in which some MGs are connected by a transmission line and others are not, cf. Figure 1.

We assume that we have
*κ*, for
*κ* has an average power demand
*n*. Given this, we can compute the total power demand
*κ*, but this is not necessary for the rest of the discussion.

Let
*ν* to MG *κ*. We enforce
*κ* is given by its own total power demand
*k* for *N* timesteps of each MG in a least-squares sense. The objective function is thus given by

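$$
(\bar{z}, \delta) \mapsto \sum_{n=k}^{k+N-1} \sum_{\kappa=1}^{\Xi} \Bigl( \bar{\zeta}(n)\, \mathcal{I}_\kappa - \sum_{\nu=1}^{\Xi} \delta_{\nu\kappa}(n)\, \eta_{\nu\kappa}\, \mathcal{I}_\nu\, \bar{z}_\nu(n) \Bigr)^{2}. \tag{1}
$$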
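To make the structure of (1) concrete, the following minimal sketch evaluates the double sum for given data. The function name and the nested-list layout (`delta[n][nu][kappa]`, `z_bar[nu][n]`) are assumptions made for illustration only; they are not taken from the paper.

```python
def upper_level_cost(zeta_bar, z_bar, delta, eta, sizes):
    """Evaluate the upper-level objective (1) on a window of N time steps.

    Assumed (hypothetical) data layout:
      zeta_bar[n]         reference value at window step n
      z_bar[nu][n]        average power demand of MG nu at step n
      delta[n][nu][kappa] exchange rate from MG nu to MG kappa at step n
      eta[nu][kappa]      coefficient eta_{nu kappa} appearing in (1)
      sizes[kappa]        number of residential systems in MG kappa
    """
    n_mg = len(sizes)
    total = 0.0
    for n in range(len(zeta_bar)):
        for kappa in range(n_mg):
            # Inner sum over nu in (1): demand assigned to MG kappa.
            assigned = sum(
                delta[n][nu][kappa] * eta[nu][kappa] * sizes[nu] * z_bar[nu][n]
                for nu in range(n_mg)
            )
            total += (zeta_bar[n] * sizes[kappa] - assigned) ** 2
    return total
```

With `delta` equal to the identity for every time step, the inner sum reduces to each MG's own total demand, so the value measures the deviation from the reference without any exchange.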

Here, the vector
We are interested in minimizing (1) under the following constraints: All exchange rates
*δ*,

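$$
\begin{align}
\min_{\delta \in [0,1]^{\Xi \times \Xi \times N}} \quad & J(\bar{z}, \delta) \tag{2a} \\
\text{s.t.} \quad & \sum_{\kappa=1}^{\Xi} \delta_{\nu\kappa}(n) = 1 \tag{2b} \\
& \delta_{\nu\kappa}(n) \cdot \delta_{\kappa\nu}(n) \le 0, \quad \kappa \neq \nu \qquad \forall\, \nu, \kappa \in [1:\Xi],\ n \in [k:k+N-1], \tag{2c}
\end{align}
$$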

where
As we have seen in the previous section, we consider an average power demand at each MG as well as some desired quantity
*i*-th system,
*κ*, can be described by the discrete time system dynamics,

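$$
\begin{align}
x_{\kappa_i}(n+1) &= \alpha_{\kappa_i}\, x_{\kappa_i}(n) + T \bigl( \beta_{\kappa_i}\, u_{\kappa_i}^{+}(n) + u_{\kappa_i}^{-}(n) \bigr) \tag{3a} \\
z_{\kappa_i}(n) &= w_{\kappa_i}(n) + u_{\kappa_i}^{+}(n) + \gamma_{\kappa_i}\, u_{\kappa_i}^{-}(n), \tag{3b}
\end{align}
$$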

where
The system can be controlled by charging

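$$
\begin{align}
0 \le x_{\kappa_i}(n) &\le C_{\kappa_i} \tag{4a} \\
\underline{u}_{\kappa_i} \le u_{\kappa_i}^{-}(n) &\le 0 \tag{4b} \\
0 \le u_{\kappa_i}^{+}(n) &\le \bar{u}_{\kappa_i} \tag{4c} \\
0 \le \frac{u_{\kappa_i}^{-}(n)}{\underline{u}_{\kappa_i}} + \frac{u_{\kappa_i}^{+}(n)}{\bar{u}_{\kappa_i}} &\le 1. \tag{4d}
\end{align}
$$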
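As an illustration of the model (3)–(4), the sketch below performs one time step of the storage dynamics and checks the constraints for a single residential system. The function names and all numerical values used with them are hypothetical; the formulas follow (3a), (3b) and (4a)–(4d) term by term.

```python
def battery_step(x, u_plus, u_minus, w, alpha, beta, gamma, T):
    """One step of (3): returns the next state x(n+1) and the power demand z(n)."""
    x_next = alpha * x + T * (beta * u_plus + u_minus)   # (3a)
    z = w + u_plus + gamma * u_minus                      # (3b)
    return x_next, z


def is_admissible(x, u_plus, u_minus, capacity, u_lower, u_upper):
    """Check (4a)-(4d) for one time instant; u_lower < 0 < u_upper is assumed."""
    return (
        0.0 <= x <= capacity                                   # (4a)
        and u_lower <= u_minus <= 0.0                          # (4b)
        and 0.0 <= u_plus <= u_upper                           # (4c)
        and 0.0 <= u_minus / u_lower + u_plus / u_upper <= 1.0  # (4d)
    )
```

Constraint (4d) couples the two inputs: using both at their respective bounds at the same time instant is rejected, even though each input individually satisfies (4b) and (4c).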

Here,
For a concise notation we introduce the set

The output quantity in (3b) is the power demand
*κ*. The average power demand

$$
\bar{\zeta}(n) = \frac{1}{\mathcal{I} \cdot \min\{N, n+1\}} \sum_{j = n - \min\{n, N-1\}}^{n} \sum_{i=1}^{\mathcal{I}} w_i(j) \tag{5}
$$
where
Let us for the moment ignore the coupling described in Subsection 2.1. Then,

$$
\min_{z^{(\kappa)} \in \mathbb{D}^{(\kappa)}} g(\bar{z}_\kappa) \tag{6}
$$

per MG with local objective function
We are interested in optimizing the function (1). This function, in general, depends on *δ* as well as on
*u*, which we have to find in such a way that

$$
\min_{\delta \in \mathbb{D}_\delta,\ z^{(\kappa)} \in \mathbb{D}^{(\kappa)}} J(\bar{z}, \delta). \tag{7}
$$

Note that due to constraint (2c) the optimization of
We propose to tackle the optimization problem (7) in a *bidirectional* way, i. e., we first find an optimal
*δ* being the identity and then optimize (7) w. r. t. *δ* for fixed

Assume that each MG *κ*,

$$
\min_{z^{(\kappa)} \in \mathbb{D}^{(\kappa)}} g_\kappa(\bar{z}_\kappa) = \bigl\| \bar{\zeta}_\kappa^{+} - \bar{z}_\kappa \bigr\|_2^2, \tag{8}
$$

where
Based on the updated reference value we solve (8) and (2) to improve the battery usage and the energy exchange and repeat the optimization until some terminal condition is satisfied, e. g., performance improvement less than a pre-defined tolerance or maximal number of iterations exceeded. This procedure is summarized in Algorithm 1. Note that we only update the reference

Neither convergence nor the interpretation of a potential limit of Algorithm 1 is clear a priori. Figure 3, however, experimentally shows convergence of the proposed scheme and a continuous improvement of the upper-level performance index. Here, we ran 10 iterations of the optimization scheme and plotted both the objective function values before and after the energy exchange within each iteration. The values stagnate after four iterations indicating that additional iterations do not further improve the overall performance. The next subsection elaborates on how to solve (8) in a fully distributed way using ADMM.

In this section we briefly discuss how to solve the lower-level optimization problem (6) or (8) using an Alternating Direction Method of Multipliers (ADMM) approach. We consider a single MG and therefore omit the index *κ*. Since the averaged output quantity appears in the objective function (6) or (8), we need to introduce an auxiliary variable *a* in order to decouple the lower-level optimization in the following way,

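$$
\begin{align}
\min_{z,\, a} \quad & g(\bar{a}) = \| \bar{a} - \bar{\zeta} \|_2^2 \tag{9a} \\
\text{s.t.} \quad & \frac{1}{\mathcal{I}} \sum_{i=1}^{\mathcal{I}} a_i - \bar{a} = 0, \qquad z_i - a_i = 0 \tag{9b} \\
& z_i \in \mathbb{D}_i \qquad \forall\, i \in [1:\mathcal{I}]. \tag{9c}
\end{align}
$$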

Note that (9c) is a short-hand notation for the battery dynamics (3)–(4), and yields a fully decoupled constraint in the variable
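$$
\begin{align}
z_i^{\ell+1} &= \operatorname*{arg\,min}_{z_i \in \mathbb{D}_i} \; z_i^\top \lambda_i^{\ell} + \frac{\rho}{2} \bigl\| z_i - a_i^{\ell} \bigr\|_2^2 \tag{10a} \\
a^{\ell+1} &= \operatorname*{arg\,min}_{a \in \mathbb{R}^{\mathcal{I} N}} \; g(\bar{a}) - \sum_{i=1}^{\mathcal{I}} a_i^\top \lambda_i^{\ell} + \frac{\rho}{2} \sum_{i=1}^{\mathcal{I}} \bigl\| z_i^{\ell+1} - a_i \bigr\|_2^2 \tag{10b} \\
\lambda_i^{\ell+1} &= \lambda_i^{\ell} + \rho \bigl( z_i^{\ell+1} - a_i^{\ell+1} \bigr) \tag{10c}
\end{align}
$$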
until some termination condition is satisfied. Note that (10b) is an unconstrained optimization problem and can be solved explicitly. The problem (10a) can be solved in parallel by each battery in the MG introduced for our model in Section 2.2. Note that scheme (10) assumes communication within the MG, more precisely, each system
According to Theorem 3.1 in [7] the optimization scheme (10) converges in the following sense.

*Consider Problem* (9) *with g being strictly convex, closed and proper and let the iterates*
*be computed according to* (10)*. Then the following statements hold true:*

According to [7] and the references [5, Section 3] and [4, Appendix C] therein, problem (6) fulfils the assumptions of Theorem 1.
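The iteration (10) can be sketched in a few lines for a simplified setting. The sketch below is not the authors' implementation: it assumes that each feasible set in (9c) is a simple box `lo[i] <= z_i <= hi[i]` instead of the battery constraints (3)–(4), so that (10a) reduces to a clipping operation. The explicit solution of (10b) used in the code follows from setting the gradient with respect to each `a_i` to zero, which gives `a_i = z_i + lam_i / rho - c * (a_bar - zeta_bar)` with `c = 2 / (I * rho)` and `a_bar = (mean_i(z_i + lam_i / rho) + c * zeta_bar) / (1 + c)`. All function and variable names are hypothetical.

```python
import numpy as np


def admm_box(zeta_bar, lo, hi, rho=1.0, iters=200):
    """ADMM iteration (10) for problem (9) with box sets D_i = [lo_i, hi_i].

    zeta_bar : array of shape (N,), reference trajectory
    lo, hi   : arrays of shape (I, N), assumed box bounds replacing (3)-(4)
    """
    n_sys, horizon = lo.shape
    a = np.zeros((n_sys, horizon))
    lam = np.zeros((n_sys, horizon))
    c = 2.0 / (n_sys * rho)
    for _ in range(iters):
        # (10a): separable quadratic over a box, solved by projection (clipping);
        # each row can be computed in parallel by the respective system.
        z = np.clip(a - lam / rho, lo, hi)
        # (10b): unconstrained quadratic, solved in closed form.
        v = z + lam / rho
        a_bar = (v.mean(axis=0) + c * zeta_bar) / (1.0 + c)
        a = v - c * (a_bar - zeta_bar)
        # (10c): multiplier update.
        lam = lam + rho * (z - a)
    return z, a, lam
```

In this sketch only the row-wise clipping step needs local data, while the closed-form update of `a` uses the aggregated quantity `v.mean(axis=0)`, which mirrors the star-shaped communication between the systems and the CE.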

This section is dedicated to surrogate models for the optimization routine (10) within a single MG. For simplicity of notation we omit the index *κ*.

Due to the distributed structure of ADMM, the residential energy systems do not need to share information with their neighbours but only with the CE, see also the star-structure in Figure 2. In each iteration *ℓ* of ADMM, subsystem *i* has to transmit its solution

$$
\varphi(\bar{w}, x(k), \bar{\zeta}) = \bar{z}, \tag{11}
$$

for all feasible
Figure 4 (top) shows that if the approximation (11) is sufficiently accurate, the impact on the performance of the optimization scheme is negligible. Here, the costs

Note that (11) might yield approximations to the solution

The following proposition states that for equality in (11), a proper mapping is defined. For a concise notation we replace the index
*i* here.

*Consider φ given by* (11)*, where*
*describes the optimal solution of* (6) *computed via ADMM, i. e.,*
*. We assume all hyper-parameters to be fixed, meaning that*
*in* (3)*–*(4) *are constant over time for all*
*. Then φ is a mapping, i. e., for all*
*, there exists a uniquely determined*
*such that*
*is the solution to the optimization problem* (6)*.*

First note that ADMM yields the unique solution of (6), see e. g., [5]. Furthermore, there are no constraints on

Radial Basis Functions (RBFs) are used to interpolate functions based on a set of sampling data. We briefly recap some basics on RBFs. For a detailed introduction to theory and application see e. g., [20], for a similar approach where RBFs are used to replace ADMM we refer to [19].

Let

$$
\bar{z} \approx \varphi_{\mathrm{RBF}}(\chi) = \sum_{m=1}^{M} \psi_m(\chi)\, \alpha_m + q(\chi), \tag{12}
$$

where
In Figure 5, a possible fit via RBFs is visualized. Here, we interpolated given data from two weeks of optimization (4540 data points) based on sampling data, picking every 25th data point to train (12). Then, we tested
`Matlab` toolbox `DACE` [27]. Note that the evaluation time of the RBF approximation grows with the number of data points used. Already with 180 data points to train (12) with
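For readers who want to experiment with an interpolant of the form (12) outside of `Matlab`, the following minimal NumPy sketch may serve as a starting point. It does not reproduce the `DACE` toolbox: the Gaussian kernel, the constant polynomial term for `q`, the kernel width, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(A, B, width):
    """Matrix of psi_m evaluations: exp(-||a - b||^2 / (2 width^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))


def fit_rbf(X, Y, width):
    """Fit coefficients alpha_m and a constant term q for samples (X, Y).

    X : (M, d) sample inputs chi_m, Y : (M, p) sample outputs.
    Solves the standard augmented interpolation system with the side
    condition sum_m alpha_m = 0.
    """
    M = X.shape[0]
    Psi = gaussian_kernel(X, X, width)
    P = np.ones((M, 1))
    A = np.block([[Psi, P], [P.T, np.zeros((1, 1))]])
    rhs = np.vstack([Y, np.zeros((1, Y.shape[1]))])
    sol = np.linalg.solve(A, rhs)
    return sol[:M], sol[M:]


def eval_rbf(chi, X, alpha, q, width):
    """Evaluate (12) at one or several query points chi."""
    chi = np.atleast_2d(chi)
    return gaussian_kernel(chi, X, width) @ alpha + q
```

Each evaluation computes one kernel value per training sample, which is consistent with the remark that the evaluation time grows with the number of data points used.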

Neural Networks (NNs) are a state-of-the-art method in artificial intelligence frameworks. Based on huge amounts of data
*l*-layers as an approximation to the mapping (11), i. e.,

$$
\bar{z} \approx \varphi_{\mathrm{NN}}(\chi) = \sigma\Bigl( W^{[l]} \ldots \sigma\bigl( W^{[2]} \chi + b^{[2]} \bigr) \ldots + b^{[l]} \Bigr), \tag{13}
$$

where
The overall goal of the approximation (13) is to be sufficient in the sense of the MPC performance shown in Figure 6. Our experiments in Figure 4 (bottom) and Figure 5 show that with only one hidden layer of ten neurons, a satisfying approximation on a 24-hour time window can be achieved if the training data set is large enough. Note that NNs benefit from big data. In our case study, we trained the NN only on data corresponding to two weeks.
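The nested structure of (13) corresponds to a simple loop over the layers. The sketch below shows only the forward evaluation, not the training, for a network with one hidden layer of ten neurons. The sigmoid activation, the random initialisation and all names are assumptions for illustration; since the activation is also applied in the last layer, as written in (13), the outputs lie in (0, 1) and the targets would have to be scaled accordingly.

```python
import numpy as np


def sigmoid(v):
    """Assumed activation function sigma, applied componentwise."""
    return 1.0 / (1.0 + np.exp(-v))


def nn_forward(chi, weights, biases):
    """Evaluate (13): apply sigma(W a + b) layer by layer, starting from chi."""
    a = np.asarray(chi, dtype=float)
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a


def init_network(n_in, n_hidden, n_out, seed=0):
    """Random weights for one hidden layer (hypothetical initialisation)."""
    rng = np.random.default_rng(seed)
    weights = [
        rng.standard_normal((n_hidden, n_in)),
        rng.standard_normal((n_out, n_hidden)),
    ]
    biases = [np.zeros(n_hidden), np.zeros(n_out)]
    return weights, biases
```

A call such as `nn_forward(chi, *init_network(len(chi), 10, N))` costs only two matrix-vector products, which illustrates why a single surrogate evaluation is cheap compared with an iterative optimization routine.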

Model Predictive Control (MPC) is a method to tackle optimal control problems on an infinite time horizon by solving a series of finite dimensional optimization problems instead, see e. g., [9] for an introduction to non-linear MPC.
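The receding-horizon principle can be illustrated with a deliberately small toy example. The sketch below is not the scheme used in this paper: it assumes a single lossless storage device, a finite set of admissible charging rates, perfect demand forecasts, and a brute-force search over all control sequences on the horizon in place of a proper optimization routine. All names and parameter values are hypothetical.

```python
from itertools import product


def solve_horizon(x, w_pred, zeta, T, capacity, controls):
    """Enumerate all control sequences on the horizon and return the best one."""
    best_cost, best_seq = float("inf"), None
    for seq in product(controls, repeat=len(w_pred)):
        soc, cost, feasible = x, 0.0, True
        for u, w in zip(seq, w_pred):
            soc += T * u                      # lossless storage dynamics
            if soc < 0.0 or soc > capacity:   # state constraint violated
                feasible = False
                break
            cost += (w + u - zeta) ** 2       # deviation from the reference
        if feasible and cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq


def mpc_closed_loop(x0, w, zeta, N, K, T=1.0, capacity=1.0,
                    controls=(-1.0, 0.0, 1.0)):
    """Run K MPC steps with prediction horizon N; w needs K + N - 1 entries."""
    x, applied, states = x0, [], [x0]
    for k in range(K):
        seq = solve_horizon(x, w[k:k + N], zeta, T, capacity, controls)
        u = seq[0]            # apply only the first element of the sequence
        x = x + T * u
        applied.append(u)
        states.append(x)      # then shift the horizon and repeat
    return applied, states
```

Only the first control of each optimized sequence is implemented before the horizon is shifted by one step; this is the mechanism by which a series of finite-dimensional problems yields a control sequence of arbitrary length.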

Consider the optimal control problem (6). In order to provide an optimal control sequence over an arbitrary long time horizon we use MPC. To this end, at current time instance
*δ*. Then, only the first instances

Note that Problems (6) or (8) and (2) have to be solved in order to determine

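$$
\sum_{\kappa=1}^{\Xi} \Bigl( \bar{\zeta}(k)\, \mathcal{I}_\kappa - \sum_{\nu=1}^{\Xi} \delta_{\nu\kappa}^{\star}(k)\, \eta_{\nu\kappa}\, \mathcal{I}_\nu\, \bar{z}_\nu^{\star}(k) \Bigr)^{2} \tag{14}
$$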
are realized at each time step
We compare the performances using ADMM, RBFs, and NNs on the lower level, i. e., in Step 4(b) of Algorithm 1. In all numerical simulations we set
^{[1]} The battery parameters were randomly chosen with mean values

Results on the MPC closed loop can be found in Figure 6 and Table 1. In Figure 6 the closed-loop performances of ADMM (black line) compared to perturbed ADMM, and ADMM (black line) compared to the two surrogate models are visualized. Similar to the open-loop case, small disturbances in ADMM have little impact and RBFs outperform the NN. The first column of Table 1 compares the sum of all MPC closed-loop performances using ADMM, RBFs and a NN while in column 2 the average runtimes of these approaches are reported. Note that when using a surrogate, we call ADMM once per MPC iteration. As elaborated in [7] in each ADMM iteration an *N*-dimensional vector has to be transmitted twice. Hence, both surrogates reduce the need for communication. Two great advantages of ADMM are that the local optimization (10a) can be parallelized and the global optimization is independent of the size of the MG. However, a single function evaluation such as (12) or (13) is faster than running the entire ADMM optimization routine.

Table 1

| | closed-loop cost | runtime [ms] |
|---|---|---|
| no control | 12,228 | — |
| ADMM | 4,416 | 2.5 |
| RBFs | 4,529 | 1.2 |
| NNs | 5,598 | 0.05 |

Note that in column 2 of Table 1 we ignored the communication between smart homes and CE which is needed to apply ADMM in practice. Hence, the runtime of ADMM deteriorates when executed in an actual smart grid, whereas surrogates do not require additional communication.

In order to improve the performance of the NN, more sampling data has to be generated to increase the training set significantly. To avoid large offline computation times, we chose

In this paper we recalled an optimization problem arising in large-scale electrical networks. We proposed an iterative *bidirectional* optimization scheme to tackle this problem in a distributed way, and showed numerically that a small error on the lower level does not have noticeable impact on the performance. Based on this observation, we replaced the lower-level optimization by surrogate models using radial basis functions and artificial neural networks. The numerical results show the potential of using these surrogates to reduce communication effort and computational time in MPC while preserving the overall performance.

1. S. Parhizi, H. Lotfi, A. Khodaei and S. Bahramirad. State of the Art in Research on Microgrids: A Review. IEEE Access, 3(1):890–925, 2015. DOI: 10.1109/ACCESS.2015.2443119.

2. D. E. Olivares, C. A. Cañizares and M. Kazerani. A centralized optimal energy management system for microgrids. In 2011 IEEE Power and Energy Society General Meeting, pages 1–6, 2011.

3. R. Okubo, S. Yoshizawa, Y. Hayashi, S. Kawano, T. Takano and N. Itaya. Decentralized Charging Control of Battery Energy Storage Systems for Distribution System Asset Management. In 2019 IEEE Milan PowerTech, pages 1–6, 2019.

4. D. P. Bertsekas and J. N. Tsitsiklis. Parallel and Distributed Computation: Numerical Methods. Belmont, MA, USA: Athena Scientific, 1989.

5. S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Foundations and Trends in Machine Learning, 3(1):1–122, 2011.

6. B. Houska, J. Frasch and M. Diehl. An Augmented Lagrangian Based Algorithm For Distributed Nonconvex Optimization. SIAM J. Optim., 26(2):1101–1127, 2016. DOI: 10.1137/140975991.

7. P. Braun, T. Faulwasser, L. Grüne, C. M. Kellett, S. R. Weller and K. Worthmann. Hierarchical distributed ADMM for predictive control with applications in power networks. IFAC J. Syst. Control, 3:10–22, 2018. DOI: 10.1016/j.ifacsc.2018.01.001.

8. A. Engelmann, Y. Jiang, T. Mühlpfordt, B. Houska and T. Faulwasser. Towards Distributed OPF using ALADIN. IEEE Trans. Power Syst., 34(1):584–594, 2019. DOI: 10.1109/TPWRS.2018.2867682.

9. L. Grüne and J. Pannek. Nonlinear Model Predictive Control. Theory and Algorithms. Springer, London, 2nd edition, 2017.

10. M. Khalid and A. V. Savkin. A model predictive control approach to the problem of wind power smoothing with controlled battery storage. Renewable Energy, 35(7):1520–1526, 2010. DOI: 10.1016/j.renene.2009.11.030.

11. A. Parisio, E. Rikos and L. Glielmo. A model predictive control approach to microgrid operation optimization. IEEE Trans. Control Syst. Technol., 22(5):1813–1827, 2014. DOI: 10.1109/TCST.2013.2295737.

12. G. Graditi, M. L. Di Silvestre, R. Gallea and E. R. Sanseverino. Heuristic-Based Shiftable Loads Optimal Management in Smart Micro-Grids. IEEE Trans. Ind. Informat., 11(1):271–280, 2015. DOI: 10.1109/TII.2014.2331000.

13. P. Braun, L. Grüne, C. M. Kellett, S. R. Weller and K. Worthmann. Model Predictive Control of Residential Energy Systems Using Energy Storage & Controllable Loads. Progress in Industrial Mathematics at ECMI 2014. Mathematics in Industry, 22:617–623, 2016.

14. R. R. Appino, J. Á. G. Ordiano, R. Mikut, T. Faulwasser and V. Hagenmeyer. On the use of probabilistic forecasts in scheduling of renewable energy sources coupled to storages. Applied Energy, 210:1207–1218, 2018. DOI: 10.1016/j.apenergy.2017.08.133.

15. R. R. Appino, M. Muñoz-Ortiz, J. Á. G. Ordiano, R. Mikut, V. Hagenmeyer and T. Faulwasser. Reliable Dispatch of Renewable Generation via Charging of Time-varying PEV Populations. IEEE Trans. Power Syst., 34(2):1558–1568, 2018.

16. R. H. Lasseter. Smart distribution: Coupled microgrids. Proceedings of the IEEE, 99(6):1074–1082, 2011. DOI: 10.1109/JPROC.2011.2114630.

17. P. Braun, P. Sauerteig and K. Worthmann. Distributed optimization based control on the example of microgrids. In M. J. Blondin, P. M. Pardalos and J. S. Sáez, editors, Computational Intelligence and Optimization Methods for Control Engineering, volume 150 of Springer Optimization and Its Applications. Springer International Publishing, 173–200, 2019.

18. A. Sinha, P. Malo, and K. Deb. A Review on Bilevel Optimization: From Classical to Evolutionary Approaches and Applications. arXiv preprint arXiv:1705.06270, 2017.

19. S. Grundel, P. Sauerteig and K. Worthmann. Surrogate Models For Coupled Microgrids. In I. Faragó, F. Izsák and P. Simon, editors, Progress in Industrial Mathematics at ECMI 2018, 30, 1st edition, Springer International Publishing, 2019. DOI: 10.1007/978-3-030-27550-1.

20. M. D. Buhmann. Radial basis functions: Theory and implementations, volume 12. Cambridge University Press, 2003.

21. O. I. Abiodun, A. Jantan, A. E. Omolara, K. V. Dada, N. A. Mohamed and H. Arshad. State-of-the-art in artificial neural network applications: A survey. Heliyon, 4(11), 2018.

22. H. Zhang, F. Xu and L. Zhou. Artificial neural network for load forecasting in smart grid. In 2010 International Conference on Machine Learning and Cybernetics, volume 6, pages 3200–3205, 2010.

23. P. Siano, C. Cecati, H. Yu and J. Kolbusz. Real Time Operation of Smart Grids via FCN Networks and Optimal Power Flow. IEEE Trans. Ind. Informat., 8(4):944–952, 2012. DOI: 10.1109/TII.2012.2205391.

24. K. Worthmann, C. M. Kellett, P. Braun, L. Grüne and S. R. Weller. Distributed and Decentralized Control of Residential Energy Systems Incorporating Battery Storage. IEEE Trans. Smart Grid, 6(4):1914–1923, 2015. DOI: 10.1109/TSG.2015.2392081.

25. E. L. Ratnam, S. R. Weller, C. M. Kellett and A. T. Murray. Residential load and rooftop PV generation: an Australian distribution network dataset. Internat. J. Sustain. Energy, 2015.

26. J. Nocedal and S. J. Wright. Numerical Optimization. Springer, 2006.

27. S. N. Lophaven, H. B. Nielsen and J. Søndergaard. DACE-A Matlab Kriging toolbox, version 2.0. Technical report, 2002.

28. I. Goodfellow, Y. Bengio and A. Courville. Deep Learning. MIT Press, 2016.

29. C. F. Higham and D. J. Higham. Deep Learning: An Introduction for Applied Mathematicians. arXiv preprint arXiv:1801.05894, 2018.