Void-hole aware and reliable data forwarding strategy for underwater wireless sensor networks

Reliable data transfer and energy efficiency are essential considerations for network performance in resource-constrained underwater environments. One efficient approach to data routing in underwater wireless sensor networks (UWSNs) is clustering, in which data packets are transferred from sensor nodes to a cluster head (CH). The CH then forwards the packets to a sink node over a single hop or multiple hops, which can deplete the energy of the CH faster than that of other nodes. While several mechanisms have been proposed for cluster formation and CH selection to ensure efficient delivery of data packets, less attention has been given to the heavy data traffic exchanged with the sink node. As a result, the failure of communicating nodes can lead to a significant network void-hole problem. Considering the limited energy resources of nodes in UWSNs and the heavy load on CHs in the routing process, this paper proposes a proactive void-holes aware and reliable data forwarding strategy (VHARD-FS) to control data packet delivery from CH nodes to the sink in UWSNs. In the proposed strategy, each CH node is aware of its neighbors' performance ranking index and uses it to transmit packets reliably to the sink via the most energy-efficient route. Extensive simulation results indicate that VHARD-FS outperforms existing routing approaches in terms of energy efficiency and network throughput. This study helps to alleviate the resource limitations associated with UWSNs by extending network lifetime and increasing service availability even in a harsh underwater environment.


Introduction
Recent progress in underwater wireless sensor networks (UWSNs) has drawn great attention because of their extensive real-world applications, for instance marine data gathering, equipment monitoring, pollution monitoring, catastrophe monitoring and prevention, offshore exploration, underwater robotics, and marine military activities [1][2][3]. Sensors in UWSNs are deployed at different depths to collect data and forward it to a sink node located on the water surface. The sink node then either processes these data packets further or forwards them to the data center [4,5]. In the data forwarding process, each node selects either a single candidate node or a set of candidate nodes to transfer data toward the sink node. The candidates for the next forwarding node are prioritized according to agreed selection metrics, such as energy efficiency or link quality. The data are then forwarded toward the destination by the highest-priority node [6][7][8].
Designing and developing a routing protocol for underwater networks is difficult because the underwater environment is characterized by energy limitations, low bandwidth, large propagation delays, and high packet error rates [9][10][11]. Several routing protocols for UWSNs have been proposed with the aim of improving data packet delivery while minimizing energy consumption. In UWSNs, cluster-based routing algorithms are widely preferred because of their acceptable performance in terms of energy conservation [12][13][14]. In a cluster-based routing procedure, a cluster is formed by several adjacent sensor nodes, and one sensor node is selected as the cluster head (CH) using an election algorithm. After collecting the sensor readings in their own clusters, the CHs transmit the data to the sink through multi-hop routing [15,16]. The data delivery from clusters to the sink node deserves attention because each CH transfers a large amount of data through intermediate nodes, which act as bridges between the sink and the clusters. This repeated data transfer gradually exhausts the energy of the intermediate nodes, shortens their lifetime relative to other nodes, and eventually results in a loss of connection. The failure of these bridges disconnects the CHs from the sink; this phenomenon is called the void-hole problem in sensor networks, and it reduces the network lifetime considerably. Therefore, several data forwarding approaches have been suggested for data routing in UWSNs to minimize such uneven energy consumption and to handle network void problems [17][18][19].
The most recent void-handling solutions have been proposed for shallow waters with a limited number of void areas. There are also some critical nodes outside the void-hole area, and these nodes serve as a bridge between the clusters and the sink node through the nodes inside the void-hole area. The failure of such nodes could disrupt the communication of data between the sink node and the clusters. By avoiding the loss of bridge node connections, and thereby achieving greater throughput, an effective void-handling technique can reduce the total transmission cost per packet. Therefore, when choosing the next forwarder node, a routing path that is reliable and has high link quality should be preferred [20,21].
In this context, this study proposes a void-holes aware and reliable data forwarding strategy (VHARD-FS) for UWSNs, focusing on next-hop node selection by covering all of the CHs to deal with the multi-holes environment under varying depth in UWSNs. Further to this, the present study also addresses network stability by considering the node performance ranking index (Node-PRI) metrics, in which node status such as residual energy, node depth, and void-indicator are considered to identify whether the current forwarder is a void node or not. At this stage, CHs transmit data packets via the most reliable and energy-efficient paths.
The remainder of this article is arranged as follows: Section 2 reviews and evaluates the current trends of the related works, whereas Section 3 outlines the important aspects of underwater networks along with assumptions and energy model applied in the proposed strategy. Section 4 provides a detailed description of VHARD-FS, whereas Section 5 explains the outcomes of performance evaluation and simulation of the proposed strategy. The conclusion of this paper and ideas for prospective research are presented in the last section.

Related work
Routing protocols with void-handling techniques can be classified into two classes, namely depth-based and location-based [22,23]. In the location-based class, a void node is determined according to the geographical advancement of its neighboring nodes. In other words, when a node cannot reach any neighbor that offers a shorter Euclidean distance toward the destination, it is declared a void node [24]. Conversely, in the depth-based class, a void node is defined according to the depth advancement of its neighboring nodes toward the water surface [25]. Depth information refers to the vertical distance from the water surface to each node. When a node is unable to connect to any neighboring node with a lower depth, it is known as a void node. Because the two classes differ in their characteristics, their void nodes require different handling techniques. Providing void-handling and scalable routing services in UWSNs is a challenging task, and several routing protocols have been proposed to overcome these challenges; this section describes some existing protocols for UWSNs.
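To make the depth-based definition concrete, the following minimal Python sketch flags a node as void when no node within its communication range is shallower than itself. The `Node` record, the function names, and the toy coordinates are hypothetical and serve only as an illustration; sinks are not modeled, so the shallowest node in the toy set is also reported as void.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: int
    x: float      # horizontal position in metres (illustrative)
    y: float      # horizontal position in metres (illustrative)
    depth: float  # vertical distance below the water surface in metres


def distance(a: Node, b: Node) -> float:
    """Euclidean distance between two nodes."""
    return math.dist((a.x, a.y, a.depth), (b.x, b.y, b.depth))


def is_depth_void(node: Node, all_nodes: list, comm_range: float) -> bool:
    """Return True when no node within comm_range is shallower than `node`."""
    for other in all_nodes:
        if other.node_id == node.node_id:
            continue  # a node is not its own neighbor
        if distance(node, other) <= comm_range and other.depth < node.depth:
            return False  # a shallower neighbor is reachable
    return True


# Toy deployment: three nodes stacked vertically, 50 m communication range.
nodes = [Node(1, 0, 0, 100), Node(2, 0, 0, 60), Node(3, 0, 0, 300)]
void_ids = [n.node_id for n in nodes if is_depth_void(n, nodes, comm_range=50)]
```

In this toy set, node 1 can reach the shallower node 2, whereas node 3 has no neighbor in range at all, which is the situation a void-handling technique has to recover from.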
Yan et al. [26] proposed depth-based routing (DBR), a protocol that does not require full-dimensional location information for the sensor nodes; it needs only local depth information, which can be obtained easily with an inexpensive sensor. The main benefit of DBR is that it handles network dynamics efficiently without requiring a localization service. However, DBR has no recovery mechanism: when a packet reaches a void node, it is dropped after a few attempts. As a result, DBR suffers from a local-extremum problem in sparse environments.
Because energy efficiency is the most significant concern in UWSN routing, Liu Guangzhong et al. introduced a routing protocol called depth-based multi-hop routing (DBMR) [27]. Each node sends packets in multi-hop mode, which reduces the cost of communication. DBMR operates in two phases, route discovery and packet sending. The multiple sinks used in DBMR help to minimize energy consumption, particularly for nodes located close to a sink. Although using residual energy in the weight calculation helps to balance energy consumption, DBMR lacks a communication void avoidance mechanism, which is its main drawback.
Li et al. [28] introduced an improved version of the LEACH protocol, called LEACH-L, which works in two phases, an initial phase and an update phase. The main difference is that in LEACH all nodes are updated round by round, whereas in the improved version only a few nodes are updated locally. In LEACH-L, the CH position changes gradually because the current CH estimates the residual energy of its adjacent nodes and the node with the highest energy is selected as the subsequent CH. In LEACH, the relative distance between a CH and its members can vary significantly from one iteration to the next, whereas in LEACH-L it remains about the same. This reduces energy consumption, so the improved version outperforms the original LEACH protocol and can be seen as more suitable for UWSNs.
Wu et al. [29] suggested a two-tier clustering-based routing (TTCB) protocol for two-dimensional shallow underwater monitoring. The TTCB protocol consists of clustering and routing approaches that use a heterogeneous underwater architecture for the nodes and the base station. When the process is initiated, the first-level CH node is selected. Two regulatory factors, based on clustering intervals and node energy, are considered to ensure a reasonable distribution of CHs. Then, the CH eligibility threshold is set based on the initial energy of a node along with its residual energy. In addition, the energy consumption of the last iteration can be considered by the CH. When a node crosses the energy consumption threshold, it elects itself as CH and broadcasts strong acoustic signals. The TTCB protocol enhances the viability of the network and is therefore claimed to be more appropriate for large numbers of nodes.
Hydraulic pressure-based anycast (HydroCast) is another efficient and reliable pressure-based routing protocol designed for UWSNs [30]. HydroCast operates in two modes, namely greedy routing and void handling. The next forwarding node is selected in greedy routing mode. For this purpose, each receiving node calculates a link quality metric, and the nodes are then sorted according to their depth information. In other words, the closer a node is to the sink, the higher its priority, followed by nodes with shorter holding times. The forwarder node broadcasts the data packet with an embedded ID to its neighboring nodes. When the extracted ID is not listed at a receiver node, the data packet is discarded; otherwise, the packet is forwarded according to the calculated holding time. When data packets reach a void node, they can detour to a shallower node, which tackles the communication void problem. The principal advantage of the HydroCast protocol is therefore that it can address the communication void problem. However, the authors of HydroCast did not consider energy metrics when selecting forwarding nodes, which results in high network overhead. Furthermore, the void-handling approach in this protocol requires several iterations, which in turn increases energy demand and network overhead. Table 1 provides a head-to-head comparison of different routing protocols. In this table, we highlight several parameters that are measured and compared to give an overall view of these protocols, their main strengths, and their drawbacks.

Network assumption and definitions
In this section, we highlight the key network features and assumptions of the proposed routing strategy, as well as the important aspects of the adopted energy consumption model of the acoustic channel and the attenuation effects present in this channel in an underwater environment.

Network structure and assumption
A UWSN typically consists of underwater sensor nodes (CHs or cluster members), sinks, underwater acoustic channels, radio channels, and a monitoring center. The sensor nodes are deployed randomly at different depths to collect oceanographic data. We assume a network architecture with multiple static sinks to increase network reliability as well as network throughput. This architecture can also reduce the energy consumption of the nodes near the sinks. The static sinks are assumed to have unlimited energy and are equipped with both radio-frequency and acoustic modems. The sinks communicate with the sensor nodes over acoustic links and use radio links to connect to the monitoring center. All underwater sensor nodes are outfitted with an acoustic modem to communicate, and they can both generate and relay data packets. The cluster-based architecture for the UWSN is shown in Figure 1, in which a CH manages each cluster. Other nodes operate as cluster members and report their sensing results to the CH, which can reach the adjacent CHs in a single hop to forward the collected data toward the sink nodes.
Because of reflection and refraction, acoustic communication in shallow water is affected by ambient noise, surface effects, and multi-path effects, in addition to temperature gradients. Therefore, the cluster size is inversely related to noise attenuation: the smaller the attenuation of noise, the larger the cluster size, and vice versa [31]. For energy balance, we assume that clusters in deep water are larger than those in shallow water.

Energy consumption model
The conditions of the underwater environment for acoustic communication, including the working frequency range and the signal attenuation, must be considered when establishing the underwater energy model used to calculate energy consumption [32]. Equation (1) gives the attenuation of the acoustic channel over a distance d for a signal of frequency f, where k is the spreading factor of the signal and A_0 is a normalizing coefficient. The spreading factor describes the propagation geometry (1 ≤ k ≤ 2); for a practical scenario, k = 1.5. The quantity q is defined in equation (2), where the parameter α depends on the signal frequency f [33]; different frequency ranges warrant different expressions for α, as given in equation (3). With f in kHz and d in km, the transmitting power consumption P_t is obtained from equation (4). Equations (5)-(7) give the transmitting, receiving, and fusion energy consumption, respectively, where l is the size of the transmitted data, r is a constant that depends on the receiver, and r_0 is the energy consumed in compressing each packet. The total energy consumption of a node is expressed in equation (8):
$$E_{\mathrm{total}} = E_{tl} + E_{rl} + E_{fl} \tag{8}$$

Proposed VHARD-FS strategy
VHARD-FS deploys the sensor nodes in a cluster-based underwater environment; the sensor nodes in the monitoring area are organized into clusters, and each cluster is assigned a CH to carry out the routing process. The proposed VHARD-FS strategy comprises two main phases. The first phase is the discovery of potential neighboring CHs; in this phase, each CH node produces a control packet (Hello_Msg) to detect its potential neighboring CHs. Figure 2 shows the structure of Hello_Msg, which carries a unique ID for the source node, the residual energy, the type of node, the node depth, and a void-indicator that gives the number of nodes residing in the sender's transmission range. The second phase commences once a CH node has received data from its cluster members and consists of transmission route discovery based on the developed Node-PRI.
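The Hello_Msg fields listed above can be pictured as a small record type. The sketch below is only an illustration of that field list: the class and attribute names, the `NodeType` values, and the units are assumptions and are not taken from Figure 2.

```python
from dataclasses import dataclass
from enum import Enum


class NodeType(Enum):
    # Hypothetical labels for the "type of node" field.
    SINK = "sink"
    CLUSTER_HEAD = "cluster_head"
    CLUSTER_MEMBER = "cluster_member"


@dataclass
class HelloMsg:
    source_id: int          # unique ID of the source node
    residual_energy: float  # remaining energy of the sender (joules assumed)
    node_type: NodeType     # type of the sending node
    depth: float            # node depth (metres assumed)
    void_indicator: int     # number of nodes within the sender's transmission range


# Example control packet produced by a cluster head with three nodes in range.
example = HelloMsg(
    source_id=7,
    residual_energy=42.5,
    node_type=NodeType.CLUSTER_HEAD,
    depth=120.0,
    void_indicator=3,
)
```

Keeping the void-indicator inside the control packet lets a receiver judge how well connected a candidate forwarder is without any additional message exchange.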

Discovery of potential neighboring CHs
The sensor nodes in the VHARD-FS strategy maintain a neighbors table to avoid the energy cost of exchanging excessive control messages among network nodes. An algorithm for discovering the potential neighboring CHs is therefore required to find the most appropriate neighbor CH as the next forwarder node. In this way, VHARD-FS can ensure highly reliable data transfer from the CHs in the intensive load zone toward the sink node. The procedure for discovering the potential neighboring CHs begins with the sink node broadcasting a Hello_Msg with its depth set to 0.
For each CH receiving a Hello_Msg, if the value of Msg_broadcasting is false, the receiver node updates its neighbors table with all neighbors whose packets are received. The neighbors table holds four elements (node ID, residual energy, node depth, and void-indicator) for each neighbor node. The node also updates the Hello_Msg with its own parameters and finally propagates it to the other neighboring nodes within its communication range. Otherwise, that is, if the Hello_Msg has already been sent by the node, the node only updates its neighbors table. These steps are repeated until all CHs in the network are covered. Then, the node performance ranking index (Node-PRI) is calculated for each node in the neighbors table from three metrics, namely residual energy (E_r), node depth (d_th), and void-indicator (V_ind). These metrics represent a node's status in terms of its energy resources, the presence of a communication void, and its depth relative to the sink node. Formula (9) estimates the Node-PRI value for node i from these three node-related metrics.
In formula (9), E_r(i) is the residual energy of node i at a given instant, and E_ini(i) is the node's initial battery energy level. A node with more remaining energy receives a higher Node-PRI (E_r) component, which reduces the probability of energy exhaustion. d_th(c) is the depth of the current node, whereas d_th(i) is the depth of the potential neighbor node i. In addition, R_c is the transmission range of each node. A node that is shallower, and hence closer to the sink node, receives a higher Node-PRI (d_th) component, which leads to low energy consumption. V_ind(i) indicates the number of nodes within the communication range of node i at a given instant, and N is the total number of nodes. A node with a large number of neighbor nodes receives a higher Node-PRI (V_ind) component, which helps to handle the void-hole problem and minimize packet loss. α, β, and γ are weighting values in the range from 0 to 1 that normalize the relative importance of the three metrics. The node with the highest performance ranking index is the most suitable candidate for the next forwarder node among all the potential neighboring nodes. The procedure for discovering the potential neighbor CHs within the node's radio range is shown in Algorithm 1.
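As an illustration of how the three metrics could be combined, the sketch below assumes that the Node-PRI is a weighted sum of three normalized terms: E_r(i)/E_ini(i), (d_th(c) − d_th(i))/R_c, and V_ind(i)/N. This form is an assumption inferred from the symbol definitions above, not a reproduction of formula (9); the function name, the candidate values, and the weights 0.4, 0.3, and 0.3 are likewise hypothetical.

```python
def node_pri(e_res, e_ini, depth_current, depth_neighbor, comm_range,
             v_ind, n_total, alpha, beta, gamma):
    """Assumed weighted-sum form of the node performance ranking index."""
    energy_term = e_res / e_ini                                # more residual energy -> higher
    depth_term = (depth_current - depth_neighbor) / comm_range  # shallower neighbor -> higher
    void_term = v_ind / n_total                                # more neighbors -> higher
    return alpha * energy_term + beta * depth_term + gamma * void_term


# Rank two hypothetical neighbors of a current node at 200 m depth.
candidates = {
    "A": {"e_res": 8.0, "depth": 150.0, "v_ind": 5},
    "B": {"e_res": 3.0, "depth": 120.0, "v_ind": 2},
}
scores = {
    name: node_pri(c["e_res"], 10.0, 200.0, c["depth"], 100.0,
                   c["v_ind"], 400, alpha=0.4, beta=0.3, gamma=0.3)
    for name, c in candidates.items()
}
best = max(scores, key=scores.get)  # neighbor with the highest index
```

With these illustrative weights, neighbor A wins on residual energy even though B is shallower, which shows how the weights trade depth advancement against energy balance.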
Algorithm 1. Discovery of potential neighboring cluster heads (Phase I)
After this phase, each CH is aware of all its neighboring cluster heads and can use the information in its neighbors table to send data packets to the sink node.
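The Phase I procedure described above can be sketched as a flood that starts at the sink. The code below is a simplified, centralized simulation under stated assumptions: the `ClusterHead` class, the tuple layout of the neighbors table, and the up-front computation of the void-indicator are illustrative choices, and the sink is represented by the same class for brevity. It is not the listing of Algorithm 1.

```python
import math
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ClusterHead:
    node_id: int
    position: tuple                 # (x, y, depth) in metres
    residual_energy: float
    msg_broadcasting: bool = False  # True once this node has sent its Hello_Msg
    neighbors: dict = field(default_factory=dict)  # id -> (energy, depth, void_indicator)


def in_range(a, b, comm_range):
    return math.dist(a.position, b.position) <= comm_range


def discover_neighbors(sink, cluster_heads, comm_range):
    """Flood Hello_Msg from the sink and fill each CH's neighbors table."""
    nodes = [sink] + cluster_heads
    # Void-indicator: number of nodes inside each node's communication range.
    v_ind = {n.node_id: sum(1 for m in nodes if m is not n and in_range(n, m, comm_range))
             for n in nodes}
    sink.msg_broadcasting = True
    queue = deque([sink])
    while queue:
        sender = queue.popleft()
        for receiver in cluster_heads:
            if receiver is sender or not in_range(sender, receiver, comm_range):
                continue
            # Every received Hello_Msg updates the neighbors table.
            receiver.neighbors[sender.node_id] = (
                sender.residual_energy, sender.position[2], v_ind[sender.node_id])
            # Only a node that has not broadcast yet propagates the message.
            if not receiver.msg_broadcasting:
                receiver.msg_broadcasting = True
                queue.append(receiver)


sink = ClusterHead(0, (0.0, 0.0, 0.0), float("inf"))
ch1 = ClusterHead(1, (0.0, 0.0, 80.0), 10.0)
ch2 = ClusterHead(2, (0.0, 0.0, 160.0), 9.0)
ch3 = ClusterHead(3, (0.0, 0.0, 400.0), 8.0)  # out of everyone's range
discover_neighbors(sink, [ch1, ch2, ch3], comm_range=100.0)
```

In this toy topology, `ch3` never receives a Hello_Msg, so its neighbors table stays empty; such a node would have no candidate forwarder in the second phase.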

Transmission route discovery based on developed Node-PRI
In this process, a group of CHs is the input, and the routing path that transfers the sensed data is the output. The proposed VHARD-FS strategy forms the routing path using the Node-PRI metrics in the neighbors table, which was created during the first phase. Each CH alerts its neighbors about possible forwarder nodes within its radio range, so that all nodes can use the neighbors' information to send data packets to the sink. In this phase, the CH collects the data from its cluster member nodes, aggregates them, and forwards them to its potential next node. The current node selects the neighbor with the highest performance ranking index (Node-PRI) value as the most reliable next forwarder node. However, to even out energy dissipation and to handle the void-hole problem caused by node failure, VHARD-FS monitors the performance ranking index of the nodes in the routing path. When the values fall below set limits, a new routing path is formed. The routes are created by picking the most desirable neighbor at every step. The proposed procedure for this phase is described in Algorithm 2.
Algorithm 2. Transmission route discovery based on developed Node-PRI (Phase II)
1. The set of CHs initiates a transmission route discovery
2. For each node i ∈ CHs
3.   If node i cannot communicate directly with the sink node
4.     Node i calculates the Node-PRI value for each neighbor CH in its neighbors table
5.     Set its next forwarder node to the node ID with the highest performance ranking index value
6.     Start data packet transmission
7.   Else
8.     Start data packet transmission directly toward the sink node
9.   End if
10. End for
11. End
For route discovery or creation in the proposed strategy, each node needs to be aware of its neighbor nodes only; information about the whole route between the source and the sink node is not needed. The strategy therefore needs less memory, as the routing table at each node contains only information about the nodes within its communication range. Implementing the VHARD-FS strategy is a straightforward process because the algorithm does not require complex calculations. Figure 3 shows a visual representation of the proposed VHARD-FS.
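The per-node decision in Algorithm 2 can be sketched in a few lines of Python. The sketch assumes that the Node-PRI values of the neighbors have already been computed and are passed in as a dictionary; the `SINK` marker, the function names, and the `needs_new_route` threshold check (which mirrors the rule that a new path is formed when the index falls below a set limit) are hypothetical.

```python
from typing import Dict, Hashable, Optional, Tuple

SINK = "sink"  # hypothetical marker meaning "deliver directly to a sink"


def next_forwarder(can_reach_sink: bool,
                   neighbor_pri: Dict[Hashable, float]) -> Optional[Hashable]:
    """Pick the next hop for one CH from the Node-PRI values of its neighbors."""
    if can_reach_sink:
        return SINK   # transmit directly toward the sink
    if not neighbor_pri:
        return None   # no candidate forwarder: the CH is currently a void node
    return max(neighbor_pri, key=neighbor_pri.get)  # highest Node-PRI wins


def route_discovery(cluster_heads: Dict[Hashable, Tuple[bool, Dict[Hashable, float]]]):
    """Run the per-CH decision for every CH; only local neighbor data is used."""
    return {ch_id: next_forwarder(direct, pri)
            for ch_id, (direct, pri) in cluster_heads.items()}


def needs_new_route(current_pri: float, threshold: float) -> bool:
    """Trigger route re-formation when the forwarder's index drops below a limit."""
    return current_pri < threshold


routes = route_discovery({
    "A": (True, {}),                     # A hears a sink directly
    "B": (False, {"A": 0.7, "C": 0.4}),  # B chooses between A and C
    "C": (False, {}),                    # C has no neighbors at all
})
```

Because every decision uses only the local neighbors table, no node stores the full path to the sink, which matches the memory argument made above.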

Performance evaluation
In this section, the experimental setup, simulation results, and a discussion of the performance evaluation of VHARD-FS are presented. Several performance metrics were used to assess the efficiency of the proposed strategy. The performance of the VHARD-FS strategy is evaluated by comparing its experimental results with those of DBR [26] and DBMR [27], which were implemented in the same simulation platform to guarantee that all approaches were run under the same conditions and simulation parameters. Furthermore, VHARD-FS was tested and validated to assess its effectiveness in terms of energy efficiency, void-hole handling, and reliable data forwarding.

Simulation setup
The performance evaluation was carried out in the MATLAB simulator with a conventional multiple-sink model. Three sinks were deployed on the water's surface, and 400 sensor nodes were deployed randomly in a three-dimensional space of 500 m × 500 m × 500 m for 1,000 s of simulation. Each simulation was run five times, and the averaged readings were used to produce the graphical results. The main simulation parameters and their values are summarized in Table 2.
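The deployment described above can be reproduced in outline with a short script. The sketch below is written in Python rather than MATLAB and is only an approximation of the setup: the uniform distribution, the random placement of sinks on the surface, the seeding scheme, and all function names are assumptions.

```python
import random


def deploy_nodes(n_nodes=400, side=500.0, seed=None):
    """Place sensor nodes uniformly at random in a side x side x side volume."""
    rng = random.Random(seed)
    return [(rng.uniform(0, side), rng.uniform(0, side), rng.uniform(0, side))
            for _ in range(n_nodes)]


def deploy_sinks(n_sinks=3, side=500.0, seed=None):
    """Place sinks on the water surface (depth 0); random x, y is an assumption."""
    rng = random.Random(seed)
    return [(rng.uniform(0, side), rng.uniform(0, side), 0.0)
            for _ in range(n_sinks)]


def average_runs(run_once, n_runs=5):
    """Average a scalar metric over independent runs, one seed per run."""
    results = [run_once(seed) for seed in range(n_runs)]
    return sum(results) / len(results)


sensor_nodes = deploy_nodes(seed=1)
sinks = deploy_sinks(seed=1)

# Example metric: mean node depth of a fresh deployment, averaged over five runs.
mean_depth = average_runs(
    lambda seed: sum(z for _, _, z in deploy_nodes(seed=seed)) / 400)
```

Fixing one seed per run keeps each of the five repetitions reproducible while still averaging over different random topologies.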

Experimental results and discussion
In this work, the performance of VHARD-FS, DBR, and DBMR is evaluated in terms of the total energy consumption of the network, dead nodes (stability period), throughput, control packet overhead, data delivery delay, and average packet loss.
Figure 4 shows the simulation results for the network's total energy consumption per node under the VHARD-FS, DBR, and DBMR approaches. The total network energy consumption is obtained by accumulating the energy consumed by each node during the network lifetime. Figure 4 clearly shows that the DBR approach records a higher energy consumption than the proposed VHARD-FS. The highest energy consumption, in DBR, results from the transmission of more redundant packets. During the first 200 s, the energy consumption of VHARD-FS is close to that of DBR and DBMR because additional energy is needed in the initial phase for discovering and establishing links with potential neighbors. Over the rest of the simulation time, however, VHARD-FS consumes less energy; this superiority is attributed to the node performance ranking index (Node-PRI), which combines residual energy, node depth, and void-indicator to select energy-efficient and reliable paths.
Figure 5 shows the stability period of VHARD-FS, which effectively increases the network stability period by reducing the number of dead nodes and outperforms both DBR and DBMR at every time step of the simulation. The stability period of both DBR and DBMR suffers because they consider the depth metric only, which leads to the premature death of the low-depth nodes because of the high data forwarding rate on these nodes. The superiority of the VHARD-FS strategy in terms of the network stability period is attributed to its consideration of the node performance ranking index, which balances energy consumption and lowers the number of dead nodes in the network.
Figure 6 shows the network throughput, which is computed as the total number of packets received successfully at the sink node in a specific period. DBR and DBMR produced a high throughput during the first 400 s because these approaches generate a large amount of redundant data. By contrast, the VHARD-FS strategy reduces the number of data packets because it adopts a clustering mechanism, which results in a high aggregation rate at the CH nodes. As the simulation progresses, however, the throughput of VHARD-FS remains stable and is clearly better than that of DBR and DBMR, whose throughput declines over time because of the failure of some of their nodes and void-holes in the network.
Figure 7 shows the number of control packets required to establish and maintain the routing structure between the source nodes and the sinks. During the first 200 s, the numbers of control packets required by DBR, DBMR, and VHARD-FS were relatively similar. As the simulation time progresses, VHARD-FS requires fewer control packets: the control packets it needed to send the data from the source nodes toward the sinks were 35.22% and 23.51% fewer than those used by DBR and DBMR, respectively.
Figure 8 shows that VHARD-FS obtained a relatively low end-to-end data delivery delay across the simulation time. The delay covers the duration from the generation of a packet at a sensing node until the packet is delivered successfully at any of the sinks. The void areas in the routing paths of the DBR and DBMR approaches can dramatically increase the end-to-end delay in the network. The improved performance of VHARD-FS in terms of end-to-end data delivery delay is mainly attributed to the void-handling metric in next-hop selection, which covers all bridge nodes between the CHs and the sinks.
The average number of packets lost because of collisions and void-holes during packet transmission was also analyzed to compare VHARD-FS, DBR, and DBMR. As shown in Figure 9, VHARD-FS produced the lowest average packet loss rates during the network lifetime, ranging from 0.02% to 0.18%, whereas the other approaches produced average packet loss rates of 0.09-0.38%. This is because VHARD-FS avoids void-holes by continuously tracking the existing path from the source to the destination.

Conclusion
Although most of the existing cluster-based routing approaches developed for UWSNs can reduce the total network energy consumption, they have not considered the network isolation caused by the void-hole problem, that is, the isolation of the sink caused by the power exhaustion of the bridge nodes leading to it. This study has presented a novel VHARD-FS for forwarding data between the CHs and the sink node, with the aim of reducing network energy consumption and preventing node failures by addressing the void-hole issue. The VHARD-FS strategy is specifically designed to address the energy-hole problem by delivering data packets from CHs to the sink node using a forwarder node performance ranking index. According to our simulation analysis and results, the VHARD-FS strategy is a promising solution to the void-hole issue that also extends network stability. Further investigation to determine the optimum values of the selection parameters using optimization techniques could further improve the use of the performance ranking index for various objectives.