1 Introduction
Research on multiple objective optimization (MO) has been attracting significant attention from the engineering community since the 1980s; with the aid of fast computers, solutions to many complex optimization problems have become possible. The Vector Evaluated Genetic Algorithm (VEGA) [1] is one of the earliest examples of Multi-Objective Evolutionary Algorithms (MOEAs). More recent developments include NSGA-II [2] and its modified versions, as well as Particle Swarm based methods [3]. A comprehensive review of problem definitions and non-EA based solution methods may be found in [4].
An increasing number of indicator-based MOEAs have been proposed in recent years. The indicator serves as a fitness measure for a set of Pareto points and, by optimizing the indicator function, the MO problem essentially becomes a single objective optimization problem, as the solver only needs to locate the optimal value of the indicator and update the generation accordingly. One of the best-known indicators is the hypervolume [5]; it has been successfully applied to both EAs and surrogate-based algorithms. Despite its unique feature of being strictly monotonic with respect to Pareto improvements [6], it suffers from high computational cost in higher dimensions.
EAs are generally considered advantageous for solving MO problems because they are population based, so multiple solutions can be obtained in a single run. However, solutions to practical problems may be expensive in terms of computational time and effort. In the context of electromagnetic devices, the finite element method is a common design tool; it often takes hours or even days to obtain a single solution, and therefore surrogate model based algorithms are often preferred.
In this study we propose a new indicator-focused Localized Probability of Improvement (LPoI) approach for MO problems. Its implementation requires the predicted mean and mean square error to be available, hence it is not applicable to general EAs; for Gaussian based surrogate models (including those relying on kriging), however, it has the advantage of being linearly scalable to problems with a higher number of objectives.
2 Kriging theory
Modern engineering design often relies on deterministic computer simulations; in electromagnetic design, time consuming finite element models (FEM) are often built to represent the actual devices, so that designs can be analyzed and optimized before being put into production. In such problems, the optimization can be very time consuming due to the large number of FEM calls needed; therefore surrogate modeling techniques are often used to reduce the number of expensive FEM simulations.
where x is the location of any design site.
where xi and xj are a pair of observations, k is the problem dimension, while θn and pn are hyperparameters controlling the shape of the correlation function.
where y denotes all observations and R is the correlation matrix.
The kriging prediction and the predicted mean square error (MSE) at a given location x are given as follows
with
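A minimal computational sketch of the kriging prediction and MSE described above is given below, assuming an ordinary kriging formulation with the Gaussian-type correlation model and fixed hyperparameters; all function and variable names are illustrative.

```python
import numpy as np

def correlation(xi, xj, theta, p):
    # Gaussian-type correlation between two design sites, with hyperparameters theta and p
    return np.exp(-np.sum(theta * np.abs(xi - xj) ** p))

def kriging_predict(x, X, y, theta, p):
    """Ordinary-kriging predicted mean and MSE at location x, given observations (X, y)."""
    m = X.shape[0]
    R = np.array([[correlation(X[i], X[j], theta, p) for j in range(m)] for i in range(m)])
    r = np.array([correlation(x, X[i], theta, p) for i in range(m)])
    ones = np.ones(m)
    R_inv = np.linalg.inv(R)
    mu = ones @ R_inv @ y / (ones @ R_inv @ ones)      # generalized least-squares mean estimate
    sigma2 = (y - mu) @ R_inv @ (y - mu) / m           # process variance estimate
    y_hat = mu + r @ R_inv @ (y - mu)                  # kriging prediction
    mse = sigma2 * (1.0 - r @ R_inv @ r
                    + (1.0 - ones @ R_inv @ r) ** 2 / (ones @ R_inv @ ones))
    return y_hat, max(mse, 0.0)
```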
3 Localized probability of improvement
where yt is the target of improvement, ŷ is the kriging predicted mean at location x, ŝ is the square root of the mean square error at location x and Φ(⋅) is the standard normal cumulative distribution function.
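In its standard form for minimization (assumed here), the probability of improvement over the target yt reads

```latex
\mathrm{PoI}(x) \;=\; \Phi\!\left(\frac{y_t - \hat{y}(x)}{\hat{s}(x)}\right).
```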
where ŷn, ŝn, yextn and PoIextn are the corresponding measures of the nth objective function.
where yref is the calculated reference point.
where Yn is the collection of the nth objective values for all of the points in that Pareto set.
Taking a bi-objective problem as an example, assume the reference point yref is to be determined for Pareto points P1 and P2, whose coordinates are denoted by [P1.x1, P1.x2] and [P2.x1, P2.x2], respectively; here xn is the nth objective value at the location in the search space associated with P. The x1 and x2 coordinates (in the objective space) of the reference point are then described as follows
where yrefn, ŷn, ŝn, PoIrefn and PoIextn are the corresponding measures of the nth objective function.
where the term PoIext, as described by (9), reflects the fact that the minimum of each individual objective function is always present in the Pareto front; thus the PoI at each location x, over the optimal target of that function, is always considered. This term also contributes to the diversification of the Pareto front.
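With the quantities defined above, a plausible per-objective form of this term, assuming that the extreme target yextn is the best value min(Yn) currently attained by the nth objective on the Pareto set, is

```latex
\mathrm{PoI}^{\mathrm{ext}}_{n}(x) \;=\; \Phi\!\left(\frac{y^{\mathrm{ext}}_{n} - \hat{y}_{n}(x)}{\hat{s}_{n}(x)}\right),
\qquad y^{\mathrm{ext}}_{n} = \min(Y_{n}),
```

with the aggregation of these terms across the objectives given by (9).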
Furthermore, LPoIref, as described by (15), can be treated as the maximum of the minimum potential improvements towards a local target. This term helps to improve the Pareto front both towards the origin and in the direction of the objective values; it contributes to the diversification of the Pareto front, while the max-min construction also promotes its uniformity.
To obtain the next infill sampling point, the algorithm finds the location x in the search space associated with the maximum LPoI measure.
The parameter p, as seen in (7) and (10), is associated with the magnitude of the target improvement; it controls the convergence rate of the algorithm. A smaller amount of improvement will guide the solver towards existing Pareto points, while a larger value will encourage exploration of the design space. It is crucial to choose p properly, since too small a value may lead to a false Pareto front, while too large a value may result in a slow convergence rate or a zero probability of improvement at all unknown sites. It is therefore advisable to adjust the value dynamically while monitoring the convergence.
where LPoIprev is the complete set of LPoI values at the previous iteration.
The next infill point is taken at the location with the maximum LPoI; as sampling proceeds, the remaining localized probability of improvement is driven down and the solver converges towards the Pareto front. When the design space is well explored, or p is especially small, the solver will converge towards the existing Pareto front; at this stage it is common for the LPoI to be equal, or close, to 1 at multiple unknown sites, which are then extremely likely to improve over the target point. In order to obtain a uniformly distributed Pareto front, the algorithm selects as the next infill sampling point the candidate with the largest Euclidean distance to the existing Pareto points. For this reason, the maximum value of LPoI can be capped between 0.9 and 1 for faster exploitation of the existing Pareto front without degrading the overall performance.
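A minimal sketch of this selection step is given below; the candidate set, their LPoI values and the cap value are treated as inputs, the distance to the existing Pareto points is measured here (as an assumption) in the objective space using the kriging-predicted values, and all names are illustrative.

```python
import numpy as np

def select_infill(candidates, lpoi, predicted_objectives, pareto_front, cap=0.95):
    """Pick the next infill site: maximize the (capped) LPoI and break ties by
    preferring the candidate farthest from the existing Pareto points.

    candidates           : (m, d) array of unknown design sites
    lpoi                 : (m,)   LPoI value of each candidate
    predicted_objectives : (m, n) kriging-predicted objective values
    pareto_front         : (k, n) objective values of the current Pareto set
    """
    capped = np.minimum(lpoi, cap)                         # cap LPoI, e.g. between 0.9 and 1
    tied = np.flatnonzero(capped >= capped.max() - 1e-12)  # sites with (near-)maximal LPoI
    # Distance from each tied candidate to its nearest Pareto point; taking the
    # farthest candidate promotes a uniformly distributed front.
    dists = np.min(np.linalg.norm(predicted_objectives[tied, None, :]
                                  - pareto_front[None, :, :], axis=2), axis=1)
    return candidates[tied[np.argmax(dists)]]
```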
4 Test examples
The top graph in Figure 1 shows the kriging model (solid line) after 45 iterations, with the red crosses plotted at the true Pareto front, while the bottom plot shows the proposed indicator value at the unknown sites. As can be seen, the algorithm has correctly converged to all four Pareto point clusters in the search space, and thus further sampling will lead to more exploitation of the Pareto front. The sampled design sites in the objective space are plotted in Figure 2, where the red dots indicate the location of the true Pareto front. The improvement directions imposed by the two improvement targets are also illustrated in Figure 2, where the yellow arrows show the improvement direction for the first improvement target and the blue arrows indicate that for the second.

Figure 1: The kriging model and the LPoI criterion in the search space

Figure 2: Existing design sites in the criterion space after 20 iterations

Figure 3: Existing design sites in the criterion space after 45 iterations
5 Solving the new TEAM problem
The problem is specified as follows: given the current density J within the coil and the prescribed flux density, find the optimal distribution of radii r(z), −d ≤ z ≤ d, that yields the prescribed flux density B0(z).
An initial arrangement of turns was given in the extended version of [7]; the width w and the height h of each turn are 1 mm and 1.5 mm, respectively.

Figure 4: Example of radii distribution; geometry and magnetic flux lines
The model consists of 20 turns connected in series and symmetrically distributed, hence there are 10 radii which need to be optimized with respect to the main objective f1. Two additional objectives were proposed to complement the first objective f1. The three objectives f1, f2 and f3 may be described as follows:
f1: find the optimal distribution of r, so that the discrepancy between the prescribed flux density B0 and the actual induction field B is minimized;
f2: minimize the sensitivity function;
f3: minimize the power loss related function.
Mathematically the three objective functions are expressed as
where B+ = B(r(ξl + Δξ), zq), B− = B(r(ξl − Δξ), zq), l = 1, …, nt, q = 1, …, np, and Δξ = 0.5 mm. At this stage it was suggested to consider only two objectives at a time: f1 and either f2 or f3.
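By way of illustration only, one possible discretized reading of the three objectives is sketched below, assuming a least-squares discrepancy for f1, the ±Δξ perturbation of each turn radius for f2, and the total conductor length as a proxy for the power loss in f3; compute_B is a hypothetical stand-in for the field solver (e.g. a FEM call) returning the flux density at the np control points, and the exact formulations are those given in [7].

```python
import numpy as np

DELTA_XI = 0.5  # radius perturbation of 0.5 mm (radii assumed to be expressed in mm)

def objectives(r, z_points, B0, compute_B):
    """Illustrative discretized forms of f1, f2 and f3 for a radii distribution r
    (nt values); compute_B(r, z_points) is a hypothetical field solver returning
    the flux density at the np control points z_points."""
    B = compute_B(r, z_points)
    f1 = np.sum((B - B0) ** 2)        # assumed discrepancy between actual and prescribed field
    f2 = 0.0                          # sensitivity: perturb each radius by +/- DELTA_XI
    for l in range(len(r)):
        r_plus, r_minus = r.copy(), r.copy()
        r_plus[l] += DELTA_XI
        r_minus[l] -= DELTA_XI
        f2 += np.sum(np.abs(compute_B(r_plus, z_points) - compute_B(r_minus, z_points)))
    f3 = np.sum(r)                    # assumed power-loss proxy: loss grows with total conductor length
    return f1, f2, f3
```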
The optimization results are illustrated by Figs. 5 and 6, where objectives 2 and 3 are plotted against objective 1, respectively. The globally optimal points A and B (identified by their closeness to the respective utopia points) correspond to the radii distributions [11.4, 8.6, 9.1, 12.1, 8.9, 8.3, 7.0, 6.4, 6.8, 5.9] and [7.2, 10.6, 7.2, 6.6, 9.0, 5.2, 9.2, 5.0, 5.4, 6.9], respectively.

Figure 5: Pareto front of f1 and f2 in the objective space

Figure 6: Pareto front of f1 and f3 in the objective space
6 Conclusion
A novel approach to kriging-based multi-objective optimization is put forward, relying on the Localized Probability of Improvement. For illustration purposes a bi-objective test problem is provided, as well as the recently introduced TEAM benchmark problem. It is shown that the proposed method efficiently addresses both the diversification and the uniformity of the Pareto solution, is computationally efficient and scales linearly with the number of objectives.
References
- [1] Schaffer J.D., Multiple objective optimization with vector evaluated genetic algorithms, Proceedings of the 1st International Conference on Genetic Algorithms, 1985, 93-100
- [2] Deb K., Agrawal S., Pratap A., Meyarivan T., A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II, in Schoenauer M. et al. (Eds.), Parallel Problem Solving from Nature PPSN VI, 2000
- [3] Parsopoulos K.E., Vrahatis M.N., Particle swarm optimization method in multiobjective problems, SAC’02 Proceedings of the 2002 ACM Symposium on Applied Computing, 2002, 603-607
- [4] Marler R.T., Arora J.S., Survey of multi-objective optimization methods for engineering, Structural and Multidisciplinary Optimization, 2004, 26, 6, 369
- [5] Zitzler E., Thiele L., Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach, IEEE Transactions on Evolutionary Computation, 1999, 3, 4, 257-271
- [6] Knowles J., Corne D., On metrics for comparing nondominated sets, Proceedings of the Congress on Evolutionary Computation (CEC’02), Honolulu, 2002, 711-716
- [7] Barba P.D., Mognaschi M.E., Song X., Lowther D.A., Sykulski J.K., A benchmark TEAM problem for multiobjective Pareto optimization of electromagnetic devices, IEEE Transactions on Magnetics, 2017, PP, 99