Topological indices for random spider trees

In this study, we characterize the structure of a class of random spider trees (RSTs) and study several of its topological indices, such as the degree-based Gini index, the degree-based Hoover index, the generalized Zagreb index, and other indices associated with these. We obtain the exact and asymptotic distributions of the number of leaves via probabilistic methods. Moreover, we relate this model to the class of RSTs that evolves in a preferential attachment manner.


Introduction
Initiated in 1736 by Euler and developed in the 19th century by the Englishmen A. Cayley and J.J. Sylvester, graph theory has become a very powerful practical and theoretical tool (Abbas et al., 2021a,b; Afzal Siddiqui et al., 2021; Ahmad et al., 2022; Alatawi et al., 2021; Imran et al., 2021; Nadeem et al., 2021; Raza et al., 2021, 2022; Zuo et al., 2021). A graph G is determined by two sets (V, E), the set of nodes and the set of edges. The nodes and edges are interpreted according to the problem to be modeled. Trees stand out as an important and widely studied family of graphs that, from its origin, has proven to have many applications in different areas. In mathematical chemistry, trees are used to characterize the molecular structure of chemical compounds; in this context, the nodes represent the atoms and the edges the chemical bonds (Kier and Hall, 1986). One relevant class of trees for chemical studies is the class of trees with a given number of pendants. A node is called pendant if it has degree 1. Ducoffe et al. (2018) proved that the trees with n pendants (n ≥ 3) that maximize the modified first Zagreb connection index must be spider trees or double stars. On the other hand, Shiu (2008) reported that spider trees are used to study hexagonal systems that model benzenoid molecules and unbranched catacondensed benzenoid molecules.
The structural information of a graph can be represented in different ways: matrices, polynomials, topological indices, etc. The topological indices quantify the structural information contained in the graph and are independent of the numbering of the nodes and edges; hence they are called topological. The theoretical and practical interest in topological indices has grown explosively since their introduction, and the many resulting papers have positioned them as a useful tool in practical problems of computer science (Gutman et al., 2018), physics (Estrada, 2010), ecology (Pineda-Pineda et al., 2020), and chemistry (Kashif et al., 2021; Rao et al., 2021; Reždepović and Furtula, 2020; Shao et al., 2022). In summary, the first research in this area appeared in the report by Wiener (1947), which gave rise to the now well-known Wiener index, used to analyze and correlate the physicochemical properties of alkanes. In 1971, Haruo Hosoya continued the research on topological indices by introducing the Hosoya Z index (Hosoya, 1971). The Zagreb index appeared for the first time in Gutman and Trinajstić (1972). Then Randić (1975) defined the Randić index, considered possibly the most studied and applied topological index at present, which gave way to generalizations such as the "molecular connectivity indices" introduced by Kier et al. (1975).
In the development of applications, it has become natural to regard random graphs as an appropriate and useful tool to analyze phenomena that evolve over time, since many important characteristics are difficult to capture using deterministic models. In this sense, it is important to mention that some works study topological indices on random graphs. For a more detailed treatment we refer interested readers to Aguilar-Sánchez et al. (2021), Kazemi (2021), Li et al. (2021a, 2021b), Martínez-Martínez et al. (2020), Pegu et al. (2021), and Zhang and Wang (2022). In particular, motivated by the substantial increase in interest in random tree models and considering the arguments put forward in the previous paragraphs, in this article we consider a class of spider trees that incorporates randomness, called random spider trees (RSTs), and we investigate several useful topological indices of this random class, including the degree-based Gini index, the degree-based Hoover index, the generalized Zagreb index, and other indices associated with these. Specifically, a central limit theorem is developed for the asymptotic distribution of the number of leaves in an RST.
Notation: $\mathbb{R}$ denotes the set of real numbers. The expected value and the variance of a random variable $X$ in $(\Omega, \mathcal{F}, P)$ are denoted as $E(X)$ and $V(X)$. On the other hand, $X \sim F$ means that the random variable $X$ has distribution function $F$, and $M_X$ denotes the moment generating function of the random variable $X$. $\mathrm{Bin}(n-1, p)$ represents a random variable with binomial distribution with parameters $n - 1 \ge 0$ and $p \in [0,1]$. $\mathrm{Ber}(p)$ represents a random variable with Bernoulli distribution with parameter $p \in [0,1]$. $N(\mu, \sigma^2)$ represents a random variable with normal distribution, where $\mu \in \mathbb{R}$ is the mean and $\sigma^2 > 0$ is the variance of the random variable. $\chi^2(k, \lambda)$ represents a random variable with noncentral chi-squared distribution, where $k > 0$ is the degrees of freedom and $\lambda > 0$ is the non-centrality parameter. For probabilistic convergence we use $\xrightarrow{P}$ to denote convergence in probability and $\xrightarrow{D}$ to denote convergence in distribution. For $r > 0$, $\xrightarrow{L^r}$ denotes convergence in $r$-mean. Given two real-valued functions $f(x)$ and $g(x) > 0$, if there exists a positive real number $N$ and a real number

RSTs
A spider tree is a tree with a centroid of degree at least 3. All the remaining nodes are classified into two categories: internal nodes of degree 2 and leaves of degree 1. Thus, except for the centroid, all the nodes in a spider tree have degree at most 2. The class of RSTs considered in this study evolves in the following way: at time 1, an RST starts with a seed graph containing a centroid and three leaves. At each subsequent time $n \ge 2$, either the centroid or a leaf recruits a new node: 1) The centroid is selected with probability $p$, $0 < p < 1$. 2) Each leaf is selected with probability $(1 - p)/L_{n-1}$, where $L_{n-1}$ is the number of leaves at time $n - 1$. Note that only the centroid and leaves are qualified for recruiting new nodes. If the centroid is selected, a new leaf is attached to it; if a leaf is selected, a new leaf is attached to the (selected) leaf, and the recruiter is converted to an internal node. Finally, at each stage $n$ the generated graph is a spider tree with $n + 3$ nodes.
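The growth rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `simulate_rst`, the node numbering (node 0 as centroid), and the `seed` parameter are assumptions introduced here.

```python
import random

def simulate_rst(n, p, seed=None):
    """Grow an RST up to time n; return (degree list, list of current leaves).

    Hypothetical helper: node 0 is the centroid, and the seed graph at
    time 1 has the three leaves 1, 2, 3.
    """
    rng = random.Random(seed)
    degree = [3, 1, 1, 1]        # centroid plus three leaves at time 1
    leaves = [1, 2, 3]           # qualified nodes other than the centroid
    for _ in range(2, n + 1):    # one recruitment at each time 2, ..., n
        new = len(degree)
        degree.append(1)         # the newcomer always enters as a leaf
        if rng.random() < p:
            # the centroid recruits: it gains a neighbour and a new leg starts
            degree[0] += 1
            leaves.append(new)
        else:
            # a uniformly chosen leaf recruits, i.e. probability (1 - p) / L each
            k = rng.randrange(len(leaves))
            degree[leaves[k]] += 1   # the recruiter becomes an internal node
            leaves[k] = new          # the newcomer is now the tip of that leg
    return degree, leaves
```

For any seed, the returned tree should have $n + 3$ nodes, a centroid whose degree equals the number of leaves, and all other degrees equal to 1 or 2.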

Leaves
In the following, $L_n$ denotes the number of leaves in an RST at time $n$, with $n \ge 1$. For $n \ge 2$, by the construction of the model it follows that $P(L_n = L_{n-1} + 1 \mid L_{n-1}) = p$. Proposition 1. For $n \ge 1$ and $0 < p < 1$, the following statements hold:
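Because the leaf count grows by one exactly when the centroid recruits, and this happens with the same probability $p$ at each of the $n - 1$ steps after time 1, one would expect $L_n$ to be distributed as $3 + \mathrm{Bin}(n-1, p)$. This is our reading of the recursion rather than a quotation of Proposition 1. The sketch below (function names are hypothetical) iterates the one-step recursion exactly and compares the result with the shifted binomial law.

```python
from math import comb

def leaf_pmf(n, p):
    """Exact pmf of L_n obtained by iterating the one-step recursion."""
    pmf = {3: 1.0}                       # L_1 = 3 with certainty
    for _ in range(2, n + 1):
        nxt = {}
        for leaves, prob in pmf.items():
            # the centroid recruits (probability p): one more leaf
            nxt[leaves + 1] = nxt.get(leaves + 1, 0.0) + prob * p
            # a leaf recruits (probability 1 - p): leaf count unchanged
            nxt[leaves] = nxt.get(leaves, 0.0) + prob * (1 - p)
        pmf = nxt
    return pmf

def shifted_binomial_pmf(n, p):
    """pmf of 3 + Bin(n - 1, p)."""
    m = n - 1
    return {3 + k: comb(m, k) * p**k * (1 - p) ** (m - k) for k in range(m + 1)}
```

Under this reading, $E(L_n) = 3 + (n-1)p$ and $V(L_n) = (n-1)p(1-p)$.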

A class of RSTs that evolves in a preferential attachment manner
In a very recent article, Ren et al. (2022), inspired by the seminal paper of Barabási and Albert (1999), introduced a class of RSTs that evolves in a preferential attachment manner as follows. At time 1, an RST starts with a seed graph containing a centroid of degree 3 and 3 leaves. At each subsequent point, the probability of a qualified node recruiting a newcomer is proportional to its degree. If the centroid is selected, a new leaf is attached to it; if a leaf is selected, a new leaf is attached to the (selected) leaf, and the recruiter is converted to an internal node. Consequently, for $n \ge 2$, $P(I_{v,n}) = \deg_{v,n-1} / \sum_{i \in Q_{n-1}} \deg_{i,n-1}$, where $v$ is a qualified node at time $n$, $I_{v,n}$ indicates the event that node $v$ is chosen as recruiter at time $n$, $\deg_{i,n-1}$ is the degree of a node $i$ at time $n - 1$, and $Q_{n-1}$ denotes the set of qualified nodes at time $n - 1$. Since the centroid has degree $L_{n-1}$ and each of the $L_{n-1}$ leaves has degree 1, the total degree of the qualified nodes is $2L_{n-1}$. Then, for $n \ge 2$, it follows that: 1) The probability that the centroid recruits a newcomer at time $n$ is $L_{n-1}/(2L_{n-1}) = 1/2$. 2) The probability that a given leaf recruits a newcomer at time $n$ is $1/(2L_{n-1})$. Therefore, we can conclude that the class of RSTs that evolves in a preferential attachment manner (preferential model) is the model presented in Section 2 with $p = 1/2$.
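The reduction to $p = 1/2$ can be checked with exact rational arithmetic. The following is a minimal sketch under the assumption, taken from the description above, that the qualified nodes are the centroid (whose degree equals the number of leaves) and the leaves (degree 1 each); the function name is hypothetical.

```python
from fractions import Fraction

def preferential_probabilities(num_leaves):
    """Recruiting probabilities under degree-proportional selection.

    Returns (P(centroid recruits), P(a given leaf recruits)).
    """
    # sum of degrees over qualified nodes: centroid degree plus one per leaf
    total_degree = num_leaves + num_leaves * 1
    p_centroid = Fraction(num_leaves, total_degree)
    p_each_leaf = Fraction(1, total_degree)
    return p_centroid, p_each_leaf
```

Whatever the current number of leaves, the first component is `Fraction(1, 2)`, so the leaf count never affects the centroid's chance of recruiting.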

Topological indices
The purpose of topological indices is to study the structural properties associated with a graph and its invariants using a certain numerical value. The idea of capturing the information in numerical form is to be able to compare graphs according to the property to be studied. Let G = (V, E); then many important topological indices (TI(G)) can be defined as follows: $TI(G) = \sum_{v \in V} h(\deg_v)^{\alpha}$, (1) where $h$ is a real-valued function, $\alpha \in \mathbb{R}$, and $\deg_v$ is the degree of a node $v$. In Section 3, we will study the indices that satisfy Eq. 1 in the model introduced in Section 2. At each stage, the generated tree has three types of nodes, namely the centroid, the leaves, and the internal nodes, whose degrees are $L_n$, 1, and 2, respectively.
Proposition 2. Let $TI_n$ be the value of the topological index at stage $n$. For each $n \ge 1$, we have $TI_n = h(L_n)^{\alpha} + h(1)^{\alpha} L_n + h(2)^{\alpha}(n + 2 - L_n)$. (2) Proof. Note that $I_n + L_n + 1 = n + 3$, where $I_n$ is the number of internal nodes in the tree at stage $n$; it follows that $TI_n = h(L_n)^{\alpha} + h(1)^{\alpha} L_n + h(2)^{\alpha} I_n = h(L_n)^{\alpha} + h(1)^{\alpha} L_n + h(2)^{\alpha}(n + 2 - L_n)$. By Eq. 2, we immediately get the mean and the var-
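The counting argument behind Proposition 2 can be illustrated numerically. The sketch below assumes that the index sums $h(\deg_v)^{\alpha}$ over all nodes, which is our reading of Eq. 1; the function names are hypothetical. It compares the node-by-node sum with the version that uses one term per node type.

```python
def degree_sequence(n, leaves):
    """Degrees of an RST at stage n with the given number of leaves."""
    internal = n + 2 - leaves            # from I_n + L_n + 1 = n + 3
    return [leaves] + [1] * leaves + [2] * internal

def index_by_sum(n, leaves, h, alpha):
    """Sum h(deg v) ** alpha over every node (assumed form of the index)."""
    return sum(h(d) ** alpha for d in degree_sequence(n, leaves))

def index_three_terms(n, leaves, h, alpha):
    """Same index written with one term per node type."""
    internal = n + 2 - leaves
    return h(leaves) ** alpha + leaves * h(1) ** alpha + internal * h(2) ** alpha
```

With $h(x) = x$, $n = 10$, and $L_n = 5$, both functions return 58 for $\alpha = 2$ and 186 for $\alpha = 3$; the latter agrees with $L_n^3 - 7L_n + 8(n + 2)$.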

Generalized Zagreb index
The Zagreb index was introduced by the chemists Gutman and Trinajstić (1972). Later, some of its general mathematical properties were pointed out and its relationship with other quantities of interest in chemical graph theory was shown (Gutman and Das, 2004). In fact, the Zagreb index and its variants have been used in studies of quantitative structure-property/activity relationships (QSPR/QSAR) (Devillers and Balaban, 1999; Khadikar et al., 2001; Sardana and Madan, 2002), while the overall Zagreb indices have shown potential applicability for deriving multilinear regression models. Nowadays, as an indicator of its importance, the ideas outlined in the initial paper are still explored by numerous scholars (An, 2022; Filipovski, 2021; Milovanović et al., 2021). At time $n \ge 1$, taking $h(x) = x$ and $\alpha \in \mathbb{R}$ in Eq. 1, we obtain the generalized Zagreb index ($Z_n^g$). According to Eq. 2, $Z_n^g = L_n^{\alpha} + (1 - 2^{\alpha}) L_n + 2^{\alpha}(n + 2)$. (3) Proof. We will get the proof via mathematical induction on $\alpha$. First, observe that point (2) in Proposition 1 may be simplified by defining a new variable $u$ in terms of $t$ and $p$. Thus, for the base, that is We assume that the statement holds for all $\alpha$, i.e., , which completes the proof. □ A special case of Proposition 4 has the following result, which is valid when $t = 0$ in Eq. 4.

Zagreb index
At time $n \ge 1$, taking $h(x) = x$ and $\alpha = 2$ in Eq. 1, we obtain the Zagreb index ($Z_n$). According to Eq. 3, we have $Z_n = L_n^2 - 3L_n + 4(n + 2)$. We can obtain the moments of $Z_n$ by Eq. 6. Clearly, for $n \ge 1$,  By the well-known normal approximation of the noncentral chi-squared distribution (Severo and Zelen, 1960), it is obtained that  (2) The proof is similar to that of Corollary 2.
We conduct a numerical experiment to verify point (1) of Proposition 5 with $k = 0$, developed in this section. Given a fixed $p \in (0,1)$, we independently generate 5,000 replications of RSTs after $n = 10{,}000$ evolutionary steps. For each simulated RST, its Zagreb index is computed, so the sample data consist of 5,000 Zagreb indices from independently simulated graphs. The histogram of the sample data with a normal approximation curve is given in Figure 1 for p = 0.3, 0.5, and 0.7, respectively. We further confirm the conclusion via the Shapiro-Wilk normality test, which yields p-values of 0.070, 0.365, and 0.469 for p = 0.3, 0.5, and 0.7, respectively.
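A scaled-down version of such an experiment can be sketched with the Python standard library alone. This is not the authors' code: the function names, the seed, and the much smaller sample sizes are assumptions made for illustration. The sketch uses the fact that the Zagreb index depends on the tree only through the number of leaves, and it compares the sample mean with the exact mean obtained when $L_n - 3$ is treated as $\mathrm{Bin}(n-1, p)$ (our reading of the model).

```python
import random
from statistics import mean

def zagreb_from_leaves(n, leaves):
    """Zagreb index of an RST at time n: the sum of squared degrees."""
    return leaves**2 + leaves + 4 * (n + 2 - leaves)

def sample_zagreb(n, p, replications, seed=0):
    """Zagreb indices of independently simulated RSTs."""
    rng = random.Random(seed)
    sample = []
    for _ in range(replications):
        # only centroid selections change the leaf count, so count them
        leaves = 3 + sum(rng.random() < p for _ in range(n - 1))
        sample.append(zagreb_from_leaves(n, leaves))
    return sample

def exact_mean_zagreb(n, p):
    """E(Z_n) when L_n - 3 is Bin(n - 1, p) (an assumption of this sketch)."""
    m = n - 1
    mean_l = 3 + m * p
    second_moment_l = m * p * (1 - p) + mean_l**2
    return second_moment_l - 3 * mean_l + 4 * (n + 2)
```

For example, `mean(sample_zagreb(200, 0.5, 2000))` should lie close to `exact_mean_zagreb(200, 0.5)`, which equals 11056.5. A normality test such as `scipy.stats.shapiro` could then be applied to the sample if SciPy is available.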

Gordon-Scantlebury index
Let $S_n$ denote the Gordon-Scantlebury index at time $n \ge 1$, which satisfies $Z_n = 2(S_n + E_n)$ (Nikolić et al., 2003), where $E_n$ is the number of edges at time $n$. The tree generated by the model at time $n$ has $n + 3$ nodes and $n + 2$ edges; thus $S_n = Z_n/2 - n - 2$. For $n \ge 1$, we get the following proposition: when n goes to infinity.

Platt index
Let $P_n$ denote the Platt index at time $n \ge 1$, which satisfies $P_n = 2S_n$ (Nikolić et al., 2003). Thus, we obtain the following proposition: 3) For all k ∈ , N 0,1
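The identities linking the Zagreb, Gordon-Scantlebury, and Platt indices can be checked on an explicit spider tree. The sketch below uses the usual combinatorial definitions, stated here as assumptions: the Gordon-Scantlebury index counts paths of length two, and the Platt index sums, over all edges, the number of edges adjacent to each edge. The helper names are hypothetical.

```python
from math import comb

def spider_edges(leg_lengths):
    """Edge list of a spider: node 0 is the centroid, each leg is a path."""
    edges, next_id = [], 1
    for length in leg_lengths:
        prev = 0
        for _ in range(length):
            edges.append((prev, next_id))
            prev, next_id = next_id, next_id + 1
    return edges

def degrees(edges):
    """Map each node to its degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def zagreb(edges):
    return sum(d * d for d in degrees(edges).values())

def gordon_scantlebury(edges):
    """Number of paths of length two: each node contributes C(deg, 2)."""
    return sum(comb(d, 2) for d in degrees(edges).values())

def platt(edges):
    """Sum over edges of the number of adjacent edges."""
    deg = degrees(edges)
    return sum(deg[u] + deg[v] - 2 for u, v in edges)
```

For the spider with legs of lengths 1, 2, 4, and 3 (so $n = 8$, with 11 nodes and 10 edges), this gives $Z_n = 44$, $S_n = 12 = Z_n/2 - n - 2$, and $P_n = 24 = 2S_n$.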

Forgotten index
At time $n \ge 1$, taking $h(x) = x$ and $\alpha = 3$ in Eq. 1, we obtain the forgotten index ($F_n$). According to Eq. 3, we have $F_n = L_n^3 - 7L_n + 8(n + 2)$.
According to Corollary 2, we obtain the following result:

Degree-based Gini index
Recently, a degree-based Gini index for general graphs was proposed by Domicolo and Mahmoud (2020). This index is a topological measure on a graph that captures its proximity to regular graphs. Ren et al. (2022) considered the degree-based Gini index introduced by Domicolo and Mahmoud (2020), with slight modifications. In this section, we study the degree-based Gini index as defined by Ren et al. (2022). By definition, the degree-based Gini index of a graph within the class of RSTs at time $n \ge 1$ is given by where $v^*$ is an arbitrary node of a randomly selected graph from the class of RSTs and $V_n$ denotes the node set at time $n$. We take $E(G_n)$ as the degree-based Gini index of the class. Due to the characteristics of the model, Proof. By Chebyshev's inequality (Gut, 2005), we have . Note that $f$ is strictly increasing; consequently, to obtain a more regular class we must choose smaller values of $p$, since Domicolo and Mahmoud (2020) showed that a smaller value of the degree-based Gini index suggests more regularity of a graph or a class of graphs. This makes sense here, since the centroid would then have a lower degree. Specifically, for $p \in (0, 1/2)$, the class of RSTs that evolves in a preferential attachment manner is relatively less regular than the class of RSTs studied in this work (Figure 2). b) Domicolo and Mahmoud (2020)  ( ) ≤ < , then the classes of uniform binary trees and binary search trees are relatively less regular than the class of RSTs studied in this work for p 0,1 10 4 ( ) ∈ − (Figure 2). c) Zhang and Wang (2022) concluded that the degree-based Gini index for the class of random caterpillars , we conclude that the class of RSTs is more regular than the class of random caterpillars of Zhang and Wang (2022) (Figure 2).

Degree-based Hoover index
Zhang and Wang (2022) proposed a degree-based Hoover index for graphs, analogous to the degree-based Gini index introduced by Domicolo and Mahmoud (2020), as a competing measure for assessing graph regularity. In our context, at time $n \ge 1$, the degree-based Hoover index of a graph within the class of RSTs ($H_n$) is defined as follows: where $V_n$ denotes the node set at time $n$. In a similar way, we take $E(H_n)$ as the degree-based Hoover index of the class. The same analysis applied in Section 3.2 is used in this section and we obtain the following results. when n goes to infinity.
Remark 2 a) In view of Proposition 9 we define $f_1(p)$ for $p \in (0,1)$; then $f_1$ is strictly increasing. Since Zhang and Wang (2022) showed that a value closer to 0 suggests that the graphs in the class tend to be more regular, by an argument similar to Remark 1a we get the same behavior as for the degree-based Gini index studied in Section 3.2. b) Moreover, it is observed that $f_1(p) < f(p)$ for $p \in (0,1)$. Then, the degree-based Hoover index of the class of RSTs presented in Section 2 is less than the degree-based Gini index of the same class when $n$ goes to infinity. c) Zhang and Wang (2022) (Section 3) concluded that the degree-based Hoover index for the class of random caterpillars is $1/2$ as $n \to \infty$. Since $f_1(p) < 1/2$ for $p \in (0,1)$, we conclude that the class of RSTs is more regular than the class of random caterpillars of Zhang and Wang (2022), in these cases via the degree-based Hoover index.
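The comparison between the two regularity measures can be illustrated on the degree sequence of a single RST. The sketch below uses the textbook Gini coefficient and Hoover index applied to node degrees; this normalization is an assumption made here and may differ from the definitions of Ren et al. (2022) and Zhang and Wang (2022). The function names are hypothetical.

```python
def rst_degrees(n, leaves):
    """Degrees of an RST at time n with the given number of leaves."""
    internal = n + 2 - leaves
    return [leaves] + [1] * leaves + [2] * internal

def gini(values):
    """Textbook Gini coefficient: sum_{i,j} |x_i - x_j| / (2 N sum x)."""
    xs = sorted(values)
    size, total = len(xs), sum(xs)
    # for ascending data, sum_{i<j} (x_j - x_i) = sum_i (2i - N - 1) x_i, i = 1..N
    pair_sum = sum((2 * i - size - 1) * x for i, x in enumerate(xs, start=1))
    return pair_sum / (size * total)

def hoover(values):
    """Textbook Hoover index: sum |x_i - mean| / (2 sum x)."""
    total = sum(values)
    avg = total / len(values)
    return sum(abs(x - avg) for x in values) / (2 * total)
```

For $n = 1000$ and $L_n = 500$, these definitions give a Gini value of about 0.373 and a Hoover value of about 0.249, so the Hoover value is the smaller of the two, in line with the ordering stated in Remark 2b.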

Conclusion
We investigated a class of RSTs in which the random variable of prime interest is the number of leaves as time proceeds. We calculated the moment generating function of the number of leaves and showed that the number of leaves follows a Gaussian law asymptotically. Next, we investigated several useful topological indices for this class, including the degree-based Gini index, the degree-based Hoover index, the generalized Zagreb index, and other indices associated with these. Moreover, Proposition 3 and Theorem 1 shown by Ren et al. (2022) are deduced from Proposition 1 of this work by taking $p = 1/2$. In a similar way, the results displayed in Sections 3.2.1 and 3.2.2 of Ren et al. (2022) are obtained as special cases of the results demonstrated in Sections 3.1 and 3.2, respectively. In particular, we conclude that the class of RSTs that evolves in a preferential attachment manner is relatively less regular than the class of RSTs studied in this work in some cases.