Mathematical morphology based on stochastic permutation orderings

: The extension of mathematical morphology to multivariate data has been an active research topic in recent years. In this paper we propose an approach that relies on the consensus combination of several stochastic permutation orderings. The latter are obtained by searching for a smooth shortest path on a graph representing an image. This path is obtained with a randomized version nearest of neighbors heuristics on a graph. The construction of the graph is of crucial importance and can be based on both spatial and spectral information to enable the obtaining of smoother shortest paths. The starting vertex of a path being taken at random, many different permutation orderings can be obtained and we propose to build a consensus ordering from several permutation orderings. We show the interest of the approach with both quantitative and qualitative results.


Introduction
Mathematical morphology (MM) is a nonlinear image processing framework with many applications in filtering, segmentation and classification [19,37]. The classical construction of algebraic morphological operators relies on complete lattices [35] that impose the need of an ordering relationship between the elements to be processed. Within this theoretical background, morphological operations can be described as mappings between complete lattices. A complete lattice (T, ≤) is a non-empty set equipped with a total (or a partial) ordering relation [2,11], such that every non-empty subset P of T has a lower bound ∧P and an upper bound ∨P. Total orders are generally preferable to avoid the appearance of false colors, even if, as mentioned by Chevallier and Angulo [11], they can introduce irregularity issues. In this context, images are modeled by functions mapping their domain space Ω, into a complete lattice T, and are expressed as f : Ω ⊂ Z l → T ⊂ R n where l is the image dimension, n the number of channels, and T is a non-empty set of vectors. If MM is well defined for gray scale functions, there exists no general admitted extension that permits to perform morphological operations on vectors since there is no natural ordering of vectors [4].
To cope with this well-known issue, several approaches have been proposed (see [3,23,43,44,47] and references therein that provide a recent overview of the field of multivariate mathematical morphology). Some recently proposed ones define an ordering relation between the vectors of a set T with the use of h-orderings [18]. This corresponds to defining a surjective transform h from T to L where L is a complete lattice equipped with the conditional total ordering [18]. We refer to ≤ h as the h-ordering given by: (1) Then, T is no longer required to be a complete lattice, since the ordering of T can be induced upon L by means of h [3]. When h is bijective, this corresponds to defining a space filling curve [9] or equivalently a rank transform [26].
Of course there are many other alternatives to h-orderings for establishing a total ordering on a multivariate domain. For colors defined in specific color spaces, many approaches have been proposed. In the CIELAB color space, Valle and Valente [41] have proposed a total order based on the distance between colors and their relative positions. Bibiloni et al. [6] have proposed a fuzzy approach for multivariate images that totally order the colors of an image in a specific neighborhood. For colors spaces that contain a hue information, Burgeth and Kleefeld [8] have propose to represent colors as tensors in the HSL color space viewed as a bi-cone manifold. For the more general case of multivariate images such as multispectral or hyperspectral images, some recent approaches have also been proposed. A good overview can be found in papers by Deborah et al. [13,14]. They provide a state-of-the-art narrowed to multivariate ordering relations that are directly extensible to the spectral domain. In particular, they argue that to obtain a valid multivariate ordering in the spectral domain, the characteristics of the data must be taken into account. In our case this is typically what we want to avoid and we want to define orders that can be applied to any kind of data in an unsupervised manner without any assumptions (of course this has drawbacks as explained in [14]). In this case, the distance metric used to compare multivariate vectors can be of prime importance, as stated in [22]. In contrast, very few recent approaches have also considered the challenging case of patches associated to pixels. The interest of such a representation has been revealed with the advent of the nonlocal means filter [7]. Some works have proposed to extend morphological processing with patches from non-flat structuring elements [36], adaptive PDEs [40], nonlocal means extension in the max-plus algebra [45], manifold learning [23], to spatially-variant morphological operators [38]. For patches, the use of a priori information is much more difficult to express than for a specific spectral domain. In this paper, we are particularly interested in patch-based morphological processing.
Most approaches for ordering vectors (including h-orderings) have focused on building an ordering on the definition domain of the spectral vectors of the set T, without taking into account how the corresponding vectors are located in the image support. This provides a general framework, but it can be interesting to find the best order for a given image rather than looking for the best order in general, as argued in [11]. Recent h-ordering approaches have searched to define total orders that are adapted to only the set T of spectral vectors of an image [23,43,46]. In addition, this adaptation can consider also the spatial organization of the spectral vectors in the image. Indeed, some vectors of the set T will never have to be compared since they will not appear altogether within a given structuring element. Consequently, having an ordering that considers spatial proximity in addition to spectral proximity can be of interest. Some works have considered that trend of spectral-spatial ordering with the use of hierarchical clustering [10], binary partition tree [42] or local Hamiltonian path [27].
In this paper we propose to investigate the use of stochastic permutation orderings that are obtained by constructing greedily a global Hamiltonian path on an image, starting from a randomly selected pixel. Permutations ordering have recently been considered in [11,42]. In the approach [42], the authors propose to construct permutation orderings from a binary partition tree constructed on the image. Then among all the possible permutation orderings induced by the tree, they select one ordering that optimizes a criterion based on dissimilarity. A similar approach is explored in [11] with a top-down hierarchical clustering, but it is much more computationally demanding and only a subset of the set of vectors T is considered. The advantage of these approaches is that the ordering can take into account both spatial and spectral information by means of the induced tree, whatever its construction is top-down or bottom-up. Our approach will also construct several permutation orderings. However, instead of selecting the best one according to a given criterion, we propose to combine all of them into a consensus ordering that makes the most of all the permutation orderings. We will compare the obtained orderings regarding their smoothness. In addition, as we will built the stochastic permutation orderings on graphs representing images, we can easily take advantage of both spec-tral and spatial similarity by varying the graph construction, as explored in [42] and in our preliminary work [25] that we extend here.
The paper is organized as follows. In section 2 we present the construction of a global Hamiltonian path on an image with a one-dimensional permutation ordering on graphs. We study the induced orders in terms of ordering smoothness and of processing differences. The influence of the underlying graph is also considered to account for the interest of using spatial-spectral graphs. In section 3, as the use of an arbitrary stochastic order is not satisfying, we propose to combine several of them into a single consensus ordering with different ranking aggregation methods. We study the latter and show their interest for processing multivariate images with colors or patches. In section 4 we present some results and comparison with the state-of-the-art, as well as how to use several stochastic permutation orderings to construct a watershed-based segmentation. Last section concludes.

Hamiltonian path
We consider that the domain Ω of the image is a graph G = (V, E) where vertices V = {v 1 , . . . , vm} correspond to pixels and edges e ij = (v i , v j ) connect vertices. The classical graph used to represent images is a grid-graph where pixels are connected with 8-adjacency. However, as we will explore it in the sequel, it can be interesting to go beyond this classical grid-graph construction and to consider graphs of the higher connectivities (either by taking into account spatial or spectral similarity, or even both). The notation v i ∼ v j is used to denote two adjacent vertices. With this definition, images are represented as graphs signals [24] that associate vectors to vertices and are defined by the mapping f : G → F ⊂ R n where F is a non-empty set of vectors (we will consider only RGB color vectors, i.e., n = 3). To each vertex v i ∈ G is associated a vector v i = f (v i ). The set T = {v 1 , · · · , vm} denotes the vectors associated to all the vertices of the graph (with a row-major numbering of the image graph vertices). Consequently, one has |V| = m. For the sake of simplicity, we consider that each vertex v i of the graph has a specific associated vector v i even if ∃v j |f (v j ) = f (v i ). Therefore, there can be duplicated vectors in T and F ⊂ T. We will use the notation T[i] = v i to denote the i-th element of the set T (which is in fact a multi-set since it can contain duplicated elements).
To construct an image-adaptive h-ordering, we propose to construct a space-filling curve [9] on the graph G (we will consider grid-graphs for the moment). This corresponds to the construction of a global Hamiltonian path on all the image: a path that goes through all the vertices of the grid graph and traverses each vertex only once. This was previously explored in [27] but with local Hamiltonian paths. However, with such a local ordering, the associated dilation and erosion are not strictly speaking morphological operators since they do not commute with the sup. We consider a more general approach and build an order for all the set T in the form of a global Hamiltonian path.
Given the set T, the construction of an Hamiltonian path amounts to define a sorted permutation P = PT of the vectors of T with P a permutation matrix of size m × m. Let σ be a permutation of the index set J = {1, · · · , m}. If σ(i) = j, then P ij = 1 and 0 otherwise. The induced permutation is is the identity. Any permutation is not of interest, and spatial and spectral constraints have to be taken into account [11]. To this end we consider the construction of a smooth permutation. The smoothness of an ordered set is expressed by the Total Variation of its elements: The optimal permutation operator P can be obtained by minimizing the total variation of PT: This optimization problem provides a permutation such that the corresponding Hamiltonian path is the shortest one and is equivalent to solve the traveling salesman problem. This problem being too computationally demanding for large sets, it can be solved using a randomized version of nearest neighbors heuristics as presented in [33,34]. We recall the principle of this algorithm in Algorithm 1. This algorithm starts from an arbitrary vertex and continues by finding the two nearest unexplored neighbor vertices and choosing one of them at random.
After the construction of the permutation, we define the h-ordering from the constructed permutation as h(v i ) = σ(i) and this defines the complete lattice (T, ≤ h ). Given a graph signal f : G → T, a new representation is obtained in the form of the pair (I, P) with I(v i ) = σ(i). When a graph signal is encoded in this way, the spectral information is not directly carried by the index I, but is stored in a separate piece of data called a palette: the set P of sorted vectors. Figure 1 presents the obtained permutation and index from an original image. A drawback of the described ordering is its greediness nature. Indeed, as can be seen in Figure 1, the last part of the permutation is not necessarily smooth. As we will see it in the next sections, this can be problematic for morphological processing, in particular for a dilation.
The original graph signal f can be directly recovered since f (v i ) = P[I(v i )] = T[i] = v i . As exposed in [23,24], from such a representation, morphological operations can be defined and applied to the graph signal. The erosion and dilation of a graph signal f at vertex v i ∈ G by a structuring element B k ⊂ G are defined as: A structuring element B k (v i ) of size k defined at a vertex v i is a subgraph of the processed graph. Given a vertex v i and a radius size k, the structuring element is composed of all the vertices that can be reached from v i in k walks: The number of vertices in a given k-hop neighborhood B k (v i ) depends on the vertex v i when the graph is irregular. However, the associated erosion and dilation are symmetry preserving operators [20]. Moreover, as stated in [46], morphological operators defined using a constructed h-ordering ≤ h inherit the standard algebraic properties of morphological operators [39]. Indeed, as the palette P is built only once and remains fixed [44], the morphological processing operates on the index image I which is a scalar image.

Study of the induced orders
We just have presented an approach to build a complete lattice by the construction of a global Hamiltonian path. However this construction depends on one important thing: the starting vertex, that is taken at random. One question that naturally arises then is: is this really a problem ? We propose to study the induced orders to show that they can be very different and that this can lead to very different results. First we compare the induced permutation orderings in terms of their smoothness, i.e., we compare their ‖P i T‖ TV values with P i a permutation obtained by our global Hamiltonian path approach on the set T. A global Hamiltonian path representing a solution of the traveling salesman problem, the smoothest the path, the "better" it should be.
To study this, we consider the 24 images of the Kodak dataset [21]. We have chosen this dataset because it contains very colorful images. For each image I i , we compute 10 different permutations orderings on its set of colors T i with different starting vertices. This leads to 10 different palettes P k i = P k i T i for an image i with k ∈ [1, 10] and we measure their TV as ‖P k i ‖ TV . We compute then for each image the average and standard deviations of the 10 different paths TV measures. Figure 2 shows a radar view of these average and standard deviation values for the 24 images. First it can be seen that the smoothness varies a lot among the images, assessing the fact that in some images the colors are very different and this complicates the obtaining of a smooth ordering. But the most important point is on the variation of the results. Indeed, as it can be seen, the standard deviation can be very high for some image. This means that starting from an arbitrary vertex has a very strong influence on the quality (in terms of smoothness) of the induced permutation ordering. Now this result is not that surprising and was somehow attended. The next question is: does this influences much the results of the morphological operators if they use two different permutation orderings for the same image? This question is this time of prime importance. Indeed, if two users consider Algorithm 1 to build a complete lattice on the same image, they will potentially obtain very different orderings, as we just have shown it. If the morphological operators using these orderings do not produce similar results, this can be very problematic in terms of reproducibility. So we now study the variations of the results obtained with the standard morphological operators (erosion, dilation, opening, closing) with different orderings on the images of the Kodak dataset. To do so, we compute the result of a given morphological processing with a structuring element B k of size k on an image I i with 10 different permutation orderings. Then we compute all the pairwise mean differences between the 10 different results : ∑︀ i∈ [1,24] ∑︀ i∈ [1,24] |g(I i ) − g(I j )| for a given processing g. This provides 90 × 24 values and we compute an histogram of these values. Ideally for each processing, the histogram should have high peaks for small values of mean differences and the range of the histogram should not grow too much. Unfortunately this is absolutely not the case. Figure 3 presents the obtained result for erosion ϵ, dilation δ, opening and closing with square structuring elements of increasing sizes on a grid-graph. For a given processing, the shape of the histogram should have its peaks close to zero. As it can be seen, this is not the case. In addition, the larger the structuring element is, the most spread out are the results among the different permutation orderings. This confirms that the results obtained by two permutation orderings can be very different. In addition this difference increases when one considers large structuring elements. So this confirms our intuition that taking a specific permutation ordering from a random starting vertex does not allow a good reproducibility of the results. Figure 4 illustrates this visually for an image of the Kodak dataset. Two different orderings are constructed and the results of the morphological processing are very different (the structuring element is a square of size 3 × 3). In particular, for the second ordering, black pixel are not at the beginning of the order and therefore erosion and opening do not produce the classical effect of making the image darker. Dilation and closing results are much closer but still exhibit differences.

Study of the graph influence
In Algorithm 1, there is no restriction on the considered graph. However, in the previous experiments we have restricted ourselves to 8-adjacency grid graphs, but there is no reason to do so. Now we consider several possible graphs for the construction of the permutation. The graph can use only spatial or spectral information, or both.
-B-adjacency graph (denoted G B ): this graph construction connects each vertex v i to all the vertices contained in a square box of size (2B+1)×(2B+1) around v i . 8-adjacency grid graphs do correspond to graphs G 1 . -Global K-Nearest Neighbor graph (denoted G K * ): this graph construction connects each vertex v i to its K nearest neighbors (in terms of spectral distance) within the set of all vertices. To have a fast construction of such graphs, we use FLANN [29,30]  The first graphs G B consider only the spatial proximity information to connect vertices whereas graphs G K * consider only the spectral proximity information and are fully nonlocal graphs [17]. It can be interesting to construct graphs that combine the use of both spatial and spectral proximity to infer the set of edges. For graphs G K * , it is even recommended since their construction cannot ensure that the graph is connected. In this special case, Algorithm 1 will not be able to continue the construction of the path in the vertices of the neighborhood if they have been all explored. This case was foreseen in Algorithm 1, and in the case where |N(v σ −1 (i) )| = 0, meaning that no vertex is available to continue the path, the search of the nearest neighbor is performed on the set of all the remaining vertices J \ I. We will call this a restart in the algorithm. This enables to cope with the problem of having graphs that are not connected (but this can also append on connected graphs if one is trapped in a local minima). Obviously having too many restarts occurring during the execution of the algorithm has a dramatic influence on its computational cost since when a restart occurs a search is performed on the whole set of unexplored vertices, that can be huge. One way to cope with this is to lower the number of restarts and this leads to consider graphs of the higher connectivities. This can be easily obtained by considering the union of the set of edges from the spatially and spectrally connected graphs. Given such a graph construction, with Algorithm 1, a spectral-spatial permutation ordering is then obtained. This can be very interesting since both spatial and spectral proximities are taken into account to build the permutation. This can also reduce the computational complexity and ensure that relevant nearest neighbors are found. In addition, the spectral proximity used to determine the nearest neighbors within the set of adjacent vertices can consider any distance measure. In particular, instead of using a classical L 2 distance between the vectors v i associated to vertices, one can consider patches to obtain a smoother ordering. Figure 5 presents several permutation orderings obtained with different graphs. It can be seen that with G 1 the permutation is not very smooth and the result obtained with a box-constrained graph is much better (graph G 10 ). Finally the permutation ordering can be very different by considering both spatial and spectral Original image  information. Last rows of Figure 5 illustrates this with two permutations built on the union of a 8-adjacency grid graph G 1 and a nearest neighbor graph G 20 * with measuring spectral similarity either with color vectors v i or patches p 3 i of size 3 × 3. As it can be seen, the use of both spatial and spectral information in the graph construction can produce smoother orderings. This also puts forward the importance of the distance measure used in Eq. (2). We have considered here a simple L 2 norm between vectors, but for specific multivariate vectors a more adapted metric could be considered [22].
We confirm these visual results by a more quantitative study, the result of which are shown in Figure 6. We have considered again the images of the Kodak dataset. For each one we have built 10 different permutation orderings using either a spatially-based graph G l , a spectrally-based graph G k * , or a spatial-spectral-based graph G l ∪ G k * with values of k varying between 3 to 21 with an increment of 2 and l = ⌊k/2⌋ (for spatial graphs, k corresponds to the size of the square box). For a given value of k, we compute the smoothness of the 10 permutation orderings (normalized by the number of pixels) for the 24 images and compute the average of these 240 values. We do the same with the number of restarts, that we give in percentage (with respect to the number of vertices of the graph). For spatially-based orderings, both smoothness and percentage of restarts do decrease as the connectivity of the graph becomes higher. For spectrally-based orderings, the average smoothness is much smaller but not the percentage of restart. This means that a large number of restarts do occur when the connectivity of the graph is not high and does not ensure connectedness. The range of the values is much more important here, since even with a G 21 * graph, the percentage of restarts is higher than with a spatially-based graph G 2 , meaning a very huge computational cost. A good compromise is obtained with spatial-spectral-based graphs. The number of restarts is lower than for both spectrally and spatially based graphs. The smoothness is lower than for spatially-based graphs and a little higher than for spectrally-based graphs. We can therefore conclude that spatial-spectral graphs are better suited for the use of our permutation orderings, both for ordering smoothness and computational cost reasons.
As we will see it in the next sections, spectrally-based orderings would be preferable as spatially-based orderings can introduce severe artifacts due to the lack of visual coherence of the induced orderings. Theoretically, if we consider a complete graph in the spectral domain G * * on scalar values (e.g., for a gray scaled image), the constructed ordering will recover exactly the classical scalar ordering if one starts from the darkest vertex. However, the computational complexity of using spectrally-based orderings makes them difficult to use in practice, this is why we will consider spatially-spectrally-based ordering as a compromise. This also enables to ensure that the considered graph is connected, which is important as we have shown it above.  To end these studies on the induced orders, we present the Hamiltonian paths that are obtained on a small image. This is for illustration purposes only but this enables to really clear up the point on the matter. In Figure 7, from an original image and a random starting vertex, we construct one Hamiltonian path from for different graph topologies. The starting vertex, even if taken at random, is set as the same for all graphs. We present the constructed paths and the obtained palettes. This directly confirms what we have shown in Figure  6. With spatial graphs, few restarts can be seen (characterized by edges joining very far vertices) and this is reduced when each vertex is connected within a larger neighborhood. This also enhances the smoothness of the palette. With spectral graphs, the palette is much more smooth, but a lot of restarts can be seen. With spatial-spectral graphs, we have a behavior in between the other two types of graphs. Finally it is interesting to notice that some of the important structures of the image are revealed by the edges of the graph.

Consensus ordering
Algorithm 1 is a randomized process that starts from an arbitrary vertex of the graph. As a consequence, at each run of the permutation ordering, if a different starting vertex is chosen, a different result is obtained. We have seen in the previous section that the produced orders can be very different, even if we can control their quality by adjusting the graph topology. This is problematic because choosing one best permutation ordering among several ones is a difficult task [42]. In this paper, we tackle this problem in another way. In [33,34], the authors have made the most of this by using several permutations, but this was for denoising purposes. We propose to proceed differently and to combine several permutation orderings into a single one. We denote as h i a given permutation ordering and we consider a set of l available orderings. The principle of combining several orderings (that can also be called ranks or permutations) into a single one is known as ranking aggregation, the aim of which being to provide a consensus ranking [15]. Finding the consensus permutation can be formulated as an optimization problem, which is usually NP-hard [16] and heuristic ranking aggregation methods have been proposed. In this paper we consider the following two ones [15] and propose also a new third one adapted to our context.

Instant-Runoff aggregation
This aggregation method determines the ranking position of the item with the fewest votes and iterates on the resting items. The process is as follows. One begins by counting the first place votes according to the rankings with respect to the remaining items with indicates the first place votes according to permutation h i and is defined by The item with the fewest votes is assigned the lowest position in the obtained consensus rank.

Borda count
This methods determines the ranking position of all items simultaneously. It assigns each item v j (j ∈ J) a score B i (v j ) based on the positions provided by the permutations h i (i = 1, . . . , l): According to the scores, Borda count generates a final score using mean aggregation BS(v j ) = 1 The fused ranking is obtained by sorting the aggregated scores of items.

Weighted Borda count
We propose to modify the Borda count score in order to take into account the smoothness of each permutation. Indeed, in our case, we aggregate permutation orderings that represent smooth shortest paths and we would   like to obtain a consensus permutation that is also smooth. To do so, we consider a weighted score defined by where ∇P i (v j ) is a 1D-gradient measured on the permutation ordering from the right and left neighbors of v j in P i . Figure 8 illustrates the principle of permutation ordering combination. Given an original image (appears on first row), five different permutation orderings (P 1 , · · · , P 5 ) have been generated using different starting vertices with a spectral-spatial graph G 15 ∪ G 21 * . The obtained indexes I i and palettes P i of each ordering h i are shown in second and third rows of Figure 8. The obtained permutations can be very different from one to another due to the stochastic nature of Algorithm 1. Last rows of Figure 8 presents the different consensus permutations that can be obtained with Instant-Runoff, Borda count or weighted Borda count. As it can be seen, the Borda count provides a more coherent consensus than the Instant-Runoff, and this is enhanced with the proposed weighted version of the Borda count. In this example we have combined 5 different orders, but a question naturally arises: how many orders do we need to combine to have a consensus ordering that becomes stable (or consensual/coherent)?

How many orders to combine ?
To answer to that question, we have, for all the images of the Kodak dataset, performed the following experiment. We have generated 15 different consensus orderings with a number of combined orders that varies between 3 to 31 with an increment of 2. The number of combined orderings for a consensus ordering h i is therefore 2 × i + 1. Then we have computed the mean of the absolute difference between two consecutive consensus orderings h i and h i+1 to evaluate their proportion of agreement. These values are averaged over all the images of the Kodak dataset: 1 24 ∑︀ j∈ [1,24] |P j i − P j i+1 | with P j i the palette obtained for the j-th image of the Kodak dataset with the consensus ordering h i combining 2 × i + 1 orders. Results are presented in Figure 9. As it can be seen, the difference between two consensus orderings becomes stable only when a minimum of 25 orders are combined. As the Kodak dataset contains a set of very different images (both in terms of colors and spatial structures), we believe that this number of orders to combine could be lower when the set of images to be processed are similar. Indeed, in this case, similar local configurations and colors will be found in the palette and the consensus will be easier to define. Figure 10 presents consensus ordering results for several images of the Kodak dataset with 31 orders obtained on a spectral-spatial graph G 15 ∪ G 21 * and combined with the three presented consensus methods. As evidenced, this is in accordance with our observations on Figure 8, and there is much gain in terms of coherence between each one of consensus, from Instant runoff to weighted Borda count. One thing remains however to be set. Indeed, the usual complete lattices for colors such as the Lexicographic ordering do place dark colors at the beginning of the order and bright colors at the end of it. With our approach we have no control on that aspect (just have a look at the palette of the airplane in the 4-th row of Figure 10). Using the produced palette as is, an erosion will make the image brighter, which is not necessarily what could be wanted. To cope with this aspect, we will systematically reorder the final palette to have its first element having the lowest norm. To do this we simply compare the norm of the first and last colors of the order and if the last color has a lowest norm that the first one, we reverse the ordering. In this paper we have considered this simple approach to establish the extrema but more complex strategies could be considered. In a supervised setting, if the user provides two subsets Bg , Fg ⊂ T such that Bg ∩Fg = ∅, we can try to directly construct h-orderings that satisfy the conditions: h(b) = ∧T and h(f) = ∨T for b ∈ Bg, f ∈ Fg. To do so, a dummy vertex v ′ can be added to the graph and its distances to any vertex v i defined such that f (v i ) ∈ Bg ∪ Fg are set to 0 and to +∞ for the others. The dummy vertex is then removed at the end of the path construction.

Original image
Instant-Runoff I Borda count I Weighted Borda count I Instant-Runoff P Borda count P Weighted Borda count P

Computation complexity
We analyze the computational complexity of our approach with respect to the state-of-the-art Lexicographic total ordering approach. The latter has a complexity of O(n · m · log(m)) due to applying a quicksort on m vectors of dimension n. In our case, our approach has several steps that have the following complexities with a weighted Borda count consensus performed on l orders constructed on a spatially-spectrally-based graph G B ∪ G K * of m vertices described by vectors of size n: -Construction of G B : it has a complexity of O(n · m · K B ) with K B = (2B + 1) 2 − 1 as this corresponds to add edges between each vertex and all the vertices contained in a square box of size (2B+1)×(2B+1). -Construction of G K * : a naïve implementation being too costly, we use FLANN [29] that has a com- )︁ [30]. -Construction of an Hamiltonian path: once the graph is constructed, this has a complexity of O(m(K B + K)), as each vertex has at most K B + K neighbors. -Given l orderings, computing the consensus ordering has a complexity of O(l · n · m) plus the complexity of a quicksort applied on the aggregated fusion scores of O(m · log(m)).

Finally it gives a complexity of O(n
)︁)︁ +O(l·m·(K B +K +1))+O(m·log(m)). Obviously this complexity is higher than for the Lexicographic ordering. Computing the l Hamiltonian paths can easily be done in parallel, but the graph construction is the most costly part of the approach, that can be lowered only with small values of B and K. If we compare this complexity with those of palette reordering methods [32] that also solve an Hamiltonian path problem to order the colors of a palette, one has complexities of O(m 3 ) for Zeng's method [51], of O(m 4 ) for Memon's method [28] and O(m 2 · log(m)) for Battiato's method [5]. For the adaptive ordering proposed by Chevallier and Angulo [11] the complexity is very high and in O(m 3 ). Our approach is therefore comparable with approaches that consider similar problems of color ordering with the minimization of a cost function.

Results
In this section we provide some studies and experimental results of the proposed approach and compare it to state-of-the-art approaches. First we analyze the constructed orderings with respect to other state-of-the-art orderings. This is shown in Figure 11. The following orderings are considered.

Original image
Bit mixing [9] Lexicographic [  -Two ordering relations with a prioritization function: Bit mixing [9] and Lexicographic.
-Our proposed stochastic orderings: the ordering ≤ S is obtained with only one Hamiltonian path whereas ≤ SwBc is the combination of 31 orders with weighted Borda count. Given the studies we have led in the previous section, we will consider orders that are constructed on a spatial-spectral graph G 15 ∪ G 21 * . The bit mixing ordering provides the lowest ordering smoothness value. This was expected as this ordering was specifically designed so as to minimize the topological distortion induced by a mapping from R 3 to R. The lexicographic provides a less smooth order. These two orders are defined in the spectral domain: they do not consider any spatial constraints from the image. If we consider a single stochastic ordering ≤ S based on our proposed Hamiltonian path, one can see that we can obtain an ordering that has a smoothness very close to that of the bit mixing, assessing that the use of spatial constraints to built an ordering can be of interest to built smooth orderings. Unsupervised [23] and supervised approaches [44,47] provide less smooth orderings but the organisation of the colors in the ordering is more visually coherent than the one obtained with a single Hamiltonian path where strong color changes are present (even if this does not affect much the smoothness). When we consider the combination of several Hamiltonian orderings with weighted Borda count ≤ SwBc the ordering is visually close but less smooth, due to the difficulty to efficiently optimize this problem. The use of patches to compare pixels instead of colors can enhance the smoothness of the ordering. What we can conclude is that using one single Hamiltonian path ordering provides orderings close to classical ordering relations with a prioritization function and that the combination of several Hamiltonian path orderings pro-vides orderings close to supervised and unsupervised approaches. However having a smooth ordering does not ensure that the obtained morphological processing of images will be efficient. In Figures 12 and 13 we show the morphological processing (erosion, dilation, opening, and closing with a square structuring element of size 5 × 5) of an image with an ordering ≤ S obtained from a single Hamiltonian path. As it can be seen the results are not satisfying at all for dilation, and combined operators closing and opening. This shows that even if an ordering as a low smoothness values, it is not necessarily suited for morphological processing. We believe this is due to the fact that such orderings are not coherent and : (i) contain many color changes, (ii) the end of the ordering can contain many different colors that were explored at the end of the path due to the greediness nature of the Hamiltonian path construction. This is typically what affects the behavior of the dilation: the supremum between two colors of the ordering is not enough coherent with the rest of the neighboring colors of the ordering. This problem can be attenuated with the consensus orderings presented in Section 3. Figure 12 presents such the obtained orderings. As it can be seen, the smoothness of the ordering increases from instant runoff, Borda count to weighted borda count. We have also measured this on the whole Kodak dataset and found the same result (the average smoothness is: Instant runoff: 19.25, Borda count: 65.80, weighted Borda count: 72.89). However, the coherence of the ordering visually increases conversely to the smoothness and this has a strong influence on the result of the morphological processing. As it can be seen on Figure 12, the instant runoff has enhanced the erosion, closing and opening results but not of the dilation. This is due to the fact that the instant runoff ordering still has similar drawback of color changes and contains many different color at its end. In particular, similar white colors are considered as lower in the ordering than brown colors whereas this the converse at the the end of the palette. This is a strong drawback toward the use of such an ordering. Only the use of the weighted Borda count consensus ordering enables to obtain better results for the dilation, assessing the interest of this consensus. Now that we know that only the results with weighted Borda count consensus ordering are the most satisfying, we compare this approach with classical RGB Lexicographic ordering processing. Figure 13 presents examples of erosion, dilation, opening, and closing with a square structuring element of size 5 × 5 for both these approaches. The results are close visually but there are some important differences. In the original image, there are few black pixels, and the consensus ordering has not placed these black colors at the beginning of the ordering. This results in having the dilation produced by our approach that enlarges black areas. On can remark that the edges are better preserved with the Lexicographic ordering than with our approach, especially for the dilation, but ours produces visually sharper results. As these findings are only visually assessed, we have led a more quantitative analysis. For the Kodak and BSDB datasets, we have measured the values of several quality measures: PSNR (Peak Signal to Noise Ratio [48]), SSIM (Structural SIMilarity [49]), smoothing level (the ratio ‖f − f 0 ‖/‖f ‖ with f the filtered image and f 0 the original), and Tenengrad criterion [12,50]. The PSNR estimates the fidelity to the original image (the higher the better). The SSIM estimates the perceived change in structural information with respect to the original image (the higher the better). The smoothing level estimates the level of simplification applied on the image (the higher the smoother). The Tenengrad criterion value is correlated with perceived sharpness [12] (the higher, the sharper). Figure 14 provides the mean values of these quality measures for the morphological processing with Lexicographic ≤ Lex and weighted Borda count orderings ≤ SwBc . The level of simplification is higher with the ≤ Lex erosion than for the ≤ SwBc erosion and similar for the other operators. The Tenengrad criterion is always higher for the ≤ SwBc , assessing that the results are sharper. The PSNR values are higher for ≤ SwBc with erosion and dilation but lower for closing and opening, and this does not provide a significant difference. However, for the SSIM, the results are always in favor of ≤ Lex assessing less structural changes and a better preservation of the edges. To end up theses results we can conclude that the ≤ SwBc produces sharper results but that less preserve edges than the classical ≤ Lex . In Figure 15 we show openings and closings by geodesic reconstruction with the Lexicographic ordering ≤ Lex and our proposed consensus ordering with weighted Borda count ≤ SwBc . We can see similar defects of our approach than with classical openings and closings in Figure 13. As we had mentioned it previously, as our approach relies on a classical L 2 distance between the vectors v i associated to vertices, this can highly influence the results. In particular, if we compare the colors in the L * a * b * color space instead of the RGB color space, we can obtain a result that is more competitive with Lexicographic ordering.  Figure 15: Comparison of openings and closing by reconstruction with the Lexicographic ordering ≤ Lex and our proposed consensus ordering with weighted Borda count in the RGB or L * a * b * color spaces (≤ RGB SwBc and ≤ L * a * b * SwBc ).
In Figure 16 we provide a comparison with the approach of [42] that is similar in spirit to our approach. Indeed, they have also considered permutation orderings but that are obtained from the leaves of a Binary Partition Tree. Then, they have proposed to select the best permutation ordering among the possible ones with a top-down sorting of the BPT with respect to minimum dissimilarity. As in their paper, we have performed erosion, dilation, opening, and closing with a square structuring element of size 11 × 11. Our approach competes well with their results. This is particularly visible on the dilation and closing results where we have a much better preservation of the boy shape. These results are in favor of considering the combination of several permutation orderings as we have proposed, instead of selecting one over the set of possible ones.
In Figure 17 we provide a comparison with the approach of [11] and classical lexicographic and bitmixing orderings. In [11], Chevallier and Angulo have proposed an order adapted to an image. Their approach consists in the definition of a cost function that measures the quality of the order regarding rank based operators with respect to a topological criterion. As they have shown it on the morphological processing of the original image of Figure 17, the lexicographic ordering produces important aliasing around the blue spots. The bit-mixing ordering is better but still produces artifacts for the closing. With our approach and [11] the results are different but both preserve the regularity of boundaries. The image-adapted ordering proposed in [11] is a spectrallybased one whereas ours is a spatially-spectrally-based one. As a consequence the complexity of the approach of [11] is very high and in O(m 3 ), almost not computationally feasible without the quantization they propose. The authors mention a processing time of 8 minutes for an image of 256×256 pixels to construct the ordering. Our approach takes only 2 minutes for the same image size. This shows that even if our approach has a high complexity, it is competitive with other similar approaches of the state-of-the-art that build an order adapted to a given image.
In Figure 18 we show the interest of modifying the distance used in Algorithm 1. The latter searches the nearest neighbors of a vertex in terms of color distance but other features can be considered. To show the interest of modifying this distance, we consider the use of patches p 5 i to describe a pixel instead of its color v i . As evidenced in Figure 18 this effectively has a strong effect on the obtained results of a closing with a square structuring element of size 5×5. We had already shown in Figure 11 that the use of patches enables to produce smoother consensus orderings. But patches are also known as an efficient texture descriptor and using this patch feature can enable to better preserve the texture while still simplifying the image (see the shirts of the polo players, the grass, and the face of the man in suit, etc.). This is clearly assessed by the PSNR and  [42] Our approach Figure 16: Comparison between the approach of [42], and our approach with a 11 × 11 SE (one approach per line).
SSIM scores. We believe that this is a strong benefit of our approach as there exist very few works that have considered the use of patches for morphological simplification. Indeed if specific orders can be conceived for specific color spaces such as CIELAB [41], this is much more difficult to conceive an ordering for patches. However, other approaches have also considered the use of patches with morphological operators. In [45], the authors have proposed patch-based morphological operators as an extension of non-local means in the max-plus algebra. Figure 19 presents the results of classical MM, non-local morphology with [45] and with our approach. Patches of size 5×5 patches are considered, and the structuring element is a square of size 5×5. As it can be seen, the approach of [45] has much enhanced the background noise in the image (particularly visible for the dilation). Their results preserves well the texture but provides few simplification of the image and acts more like a gamma correction operator. In contrast, our approach has an interpretation that is similar to the classical morphological operators while leveraging the simplification effect to preserve important edges and textures in the image. The gradient image illustrates this and enables a much more accurate extraction of the edges.
To conclude these experiments, we show an innovative use of several permutation orderings for segmentation. This constitutes a novel approach that we will call stochastic permutation watershed since it is related to the stochastic watershed [1]. The latter is based on several segmentations obtained from random markers to build a probability density function using a Parzen estimator. This probabilistic gradient is then combined with the initial gradient to produce a final gradient that can be used for watershed segmentation. The stochastic nature of this approach comes from the use of random markers. We propose to make the most of our approach to conceive a novel stochastic permutation watershed. The principle is the following. We built M stochastic permutation orderings using Algorithm 1. This provides M representations (I i , P i ) with i ∈ [1, M]. Since the orderings can be very different as we have seen in Section 2, the minima extracted from index images I i will not be located at the same position in all images and we can obtain different segmentations from these minima, as in the original stochastic watershed. However in our case the germs are not randomly generated but are the minima of the index images generated from stochastic permutation orderings. Figure 20 presents an image and its original gradient (computed with the Di Zenzo approach [52]) and the minima extracted on several index images from different permutations (they have been dilated by a factor 2 for better visualization). As it can be seen, these minima are not always at the same position and this is due to the stochastic aspect of the permutation orderings. From these minima, as in the approach [1], watershed segmentation are computed from the minima, the segmentation boundaries extracted and averaged with a Parzen estimator. This probabilistic gradient is then combined with the original gradient by computing their mean. The results are shown in first line of Figure 20. The probabilistic gradient enables to well capture the object boundaries in the image and when combined with the original gradient, it provides a more robust gradient estimator. Finally, Figure 21 presents two examples of segmentation with our stochastic permutation watershed, using 11 different permutation orderings. The combined gradient is estimated using permutations based on color or patches enabling to obtain different gradients. As it can be seen, the patch combined gradient has lower gradients in areas where the texture is the same, and this enables to slightly enhance the segmentation.

Conclusion
In this paper we have considered the use of stochastic orderings for mathematical morphology. The latter relies on complete lattices that can equivalently be formulated as space-filling curves. Therefore, we have proposed to build complete lattices as permutation orderings that are obtained by constructing a Hamiltonian path on a graph that connects the pixels of an image. To do so a randomized version nearest of neighbors heuristics on a graph is used. This algorithm starts from a vertex taken at random. We have shown that depending on the starting vertex, the constructed paths can be very different and this has a very strong influence on the results of morphological operators. In addition the underlying graph topology is also of prime importance and we have put forward the interest of graphs that account both for spatial and spectral Original image Classical MM NL MM [45] Our approach Figure 19: Comparison between classical MM, NL MM [45], and our approach with a 5 × 5 SE (one approach per line). First line presents the original image. Columns present respectively an erosion, a dilation and a morphological gradient.

Original image Original gradient Probabilistic gradient Combined gradients
Minima extracted from several stochastic permutation orderings similarities. Instead of choosing an arbitrarily permutation ordering, we propose to make the most of several ones and to combine them into a single consensus ordering by favoring the path smoothness. This consensus ordering needs a certain number of orders to be combined to become stable and we have studied this experimentally. The experimental results as well as their comparison with state-of-the-art algorithms have shown that our approach has some benefits and is competitive with existing approaches. In particular, with a patch-based ordering, we can perform image simplification with texture preservation, Nevertheless our approach has some limitations. The introduction of spatial constraints for the construction of an Hamiltonian path is efficient in terms of smoothness of the ordering and ensures connected graphs, but this does not always produce coherent orderings and affects the performance of morphological operators. This notion of coherence should be more formally defined. Indeed, in palette reordering methods [32], in addition to the notion of smoothness of the constructed ordering path, the entropy of the index image is also considered to ensure that the latter does preserve the level lines of the original image. In addition, it might be therefore beneficial to construct the Hamiltonian paths only in the spectral domain. This will however require further developments as there is a computational challenge to address [31]. We plan to address these issues in our future works.