Matrix Analysis for Continuous-Time Markov Chains

Continuous-time Markov chains have transition matrices that vary continuously in time. Classical theory of nonnegative matrices, M-matrices and matrix exponentials is used in the literature to study their dynamics, probability distributions and other stochastic properties. For the benefit of Perron-Frobenius cognoscentes, this theory is surveyed and further adapted to study continuous-time Markov chains on finite state spaces.


Introduction
The concept and fundamental theory of continuous-time Markov chains are rarely encountered as applications in matrix theoretic textbooks, which is a motivating factor for this article. Similarly to classical Markov chains, the analysis involves stochastic matrices, M-matrices, Perron-Frobenius theory, generalized inverses and graph theory, along with enhancements that allow for time-dependence and non-stationary dynamics. For the benefit of linear algebraists who have not been exposed to this elegant area, we offer a survey of continuous-time Markov chain theory and some further developments, as well as directions and connections that point to possible future contributions.
Markov chains are indeed popular and well-established stochastic modeling tools, where it is often assumed that the transitions among states take place in fixed time intervals (steps) with transition probabilities that are also fixed in time. These assumptions are not always justified, as e.g., ecological systems rarely remain stable under dynamic environmental conditions. Relaxing them can therefore provide a powerful tool for studying fundamental problems of non-stationary dynamics, such as environmental effects of climatic changes.
The effort herein is specifically motivated by stochastic modeling, where the time inhomogeneity of the underlying Markov chain manifests itself in a continuous time-dependent transition matrix P(t) = [p_ij(t)], which is row-stochastic for all t >= 0 and comprises the transition probabilities p_ij(t) among a finite number of states. A specific instance of such an ecological model is proposed in [10] for forest patch dynamics among a finite number of biomass stages, where the functional form of the time-dependent probabilities is estimated via sampling and empirical data. When the probabilistic model above presumes a continuous version of the Markov property (memorylessness), the resulting stochastic process is (frequently but not exclusively) referred to as a continuous-time Markov chain (ctMC). Such chains have been extensively studied; see e.g., [5,13,14]. Certain aspects of the analysis of ctMC's on finite state spaces resemble the matrix theoretic approach used in discrete-time Markov chains. However, the Chapman-Kolmogorov conditions apply, giving rise to a semigroup structure and, in turn, to a matrix exponential dependence of the probability distributions of ctMC's on time. In summary, one can postulate the behavior of P(h) for small h > 0 (i.e., specify the derivatives of the entries of P(t) at t = 0) and then use Markov properties to determine P(t) for all t >= 0 as the solution to the initial value problem P'(t) = P(t)P'(0), P(0) = I, whose solution is indeed a matrix exponential function of P'(0).
The presentation proceeds as follows. We assume that the reader is familiar with the basic theory of (entrywise) nonnegative matrices, stochastic matrices, as well as Markov chains and their association to matrices. Our terminology and notation generally follow [3]. However, basic notation, background and concepts are reviewed in Section 2 and within the rest of the text, as needed. In Section 3, the classical continuous-time Markov chain model is set up based on the exposition in [14], along with the underlying assumptions, principles and main consequences. An entrywise study of the transition matrix of a ctMC is pursued in Section 4; new results include Theorem 4.3, Corollary 4.4 and Theorem 4.5 from [9]. The entrywise dynamics as functions of time are further examined in Section 5, where some new observations are contained [9]: Theorem 5.1, Corollary 5.2 and Theorem 5.3. In Section 6 a study of a ctMC via generalized matrix inverses is initiated [9]. We conclude with a summary and possible future work on the subject.

General notations, definitions and preliminaries
The following definitions, facts and concepts are used in the sequel.
For an n x n real (complex) matrix A in M_n(R) (A in M_n(C)), the spectrum of A is denoted by sigma(A) and its spectral radius by rho(A) = max{|lambda| : lambda in sigma(A)}.

Digraphs, primitivity and irreducibility
A walk from vertex u to vertex v in a digraph Gamma = (V, E) is a sequence of vertices u = i_1, i_2, . . . , i_k = v such that (i_j, i_{j+1}) is an edge in E for j = 1, 2, . . . , k - 1; its length is the number k - 1 of edges traversed. When the vertices are distinct, we refer to the walk as a path from u to v. A cycle of length k in Gamma is a closed walk i_1, i_2, . . . , i_k, i_1 in which the vertices i_1, i_2, . . . , i_k are distinct. We say that vertex u in V has access to vertex v in V if there exists a path in Gamma from u to v or u = v. If u has access to v and v has access to u, then u and v are access equivalent; an equivalence class under the relation of access equivalence is called an access class. We say access class W has access to vertex u (u has access to W) if there is a path from some vertex w in W to u (there is a path to u from some vertex w in W). The subdigraphs induced by each of the access classes of Gamma are the strongly connected components of Gamma. A strongly connected component of Gamma is trivial if it has no edges (i.e., it comprises a single vertex that does not have a loop). Gamma is called strongly connected if it has exactly one strongly connected component and that component is not trivial.
The transitive closure of a digraph Gamma = (V, E) is the digraph Gamma' = (V, E'), where a directed edge from i to j belongs to E' if and only if there is a path from i to j in Gamma.
A digraph Γ is called primitive if it is strongly connected and the greatest common divisor of the lengths of its cycles is 1. Equivalently, Γ is primitive if and only if there exists a positive integer k such that for all vertices i, j of Γ, there is a walk of length k from i to j.
For a nonnegative matrix A in M_n(R) and a positive integer k, the (i, j)-th entry of A^k is positive if and only if there exists a walk in Gamma(A) from i to j of length k.
The digraph Gamma(A) is primitive if and only if there exists m such that A^k > 0 for all k >= m. We refer to this condition as the Frobenius test for primitivity; see e.g., [3, Chapter 2], where this condition serves as the definition of a nonnegative primitive matrix. We say A is primitive if Gamma(A) is primitive.
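The Frobenius test lends itself to a direct computation on the 0/1 pattern of A. The following sketch (in Python; the function name is ours) uses Wielandt's classical bound on the exponent of a primitive matrix, namely (n - 1)^2 + 1, which is a standard fact not discussed above: it suffices to check that single power.

```python
import numpy as np

def _bmul(X, Y):
    # 0/1 "does a walk exist" product; re-threshold to avoid overflow
    return (X.astype(np.int64) @ Y.astype(np.int64)) > 0

def is_primitive(A):
    """Frobenius test for primitivity of a nonnegative matrix A.

    A >= 0 is primitive iff A^k > 0 for some k; by Wielandt's bound
    k = (n - 1)**2 + 1 always suffices, so one check of that power of
    the 0/1 pattern of A settles the question.
    """
    n = A.shape[0]
    power = np.eye(n, dtype=bool)
    base = A > 0
    k = (n - 1) ** 2 + 1
    while k:                    # boolean power by repeated squaring
        if k & 1:
            power = _bmul(power, base)
        base = _bmul(base, base)
        k >>= 1
    return bool(power.all())

# A 3-cycle is irreducible but not primitive (gcd of cycle lengths is 3);
# adding one loop makes the gcd equal to 1 and hence the digraph primitive.
C = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=float)
print(is_primitive(C))                          # False
print(is_primitive(C + np.diag([1.0, 0, 0])))   # True
```

The second call also illustrates the fact quoted below: an irreducible nonnegative matrix with positive trace is primitive.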
Next, recall that A in M_n(C) is called reducible if there exists a permutation matrix P such that

P A P^T = [ A_11  A_12 ]
          [  0    A_22 ],

where A_11 and A_22 are nonempty square matrices; otherwise, A is called irreducible. Associated with the access classes of Gamma(A) is a standard form for a reducible matrix A in M_n(C); it is obtained by repeated application of the reduction to block triangular form in the definition of matrix reducibility above. It is known as the Frobenius Normal Form of A: Given A in M_n(C), there exist a permutation matrix P and positive integer k <= n such that

P A P^T = [ A_11  A_12  ...  A_1k ]
          [  0    A_22  ...  A_2k ]
          [  :          ...    :  ]
          [  0     0    ...  A_kk ],

where A_11, A_22, . . . , A_kk are either irreducible matrices or 1 x 1 zero matrices. Note that A is irreducible if and only if k = 1. Finally, an irreducible matrix A in M_n(C) is called cyclic if there exist a permutation matrix P and positive integer h <= n such that

P A P^T = [  0    A_12   0    ...     0      ]
          [  0     0    A_23  ...     0      ]
          [  :                ...  A_{h-1,h} ]
          [ A_h1   0     0    ...     0      ],

where the zero diagonal blocks are square matrices. The integer h is called the index of cyclicity of A.
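Computationally, a Frobenius normal form can be obtained from the strongly connected components of Gamma(A). The sketch below (the function name is ours) labels the components, topologically orders them in the condensation digraph, and returns a permutation that puts the matrix into block upper triangular form with irreducible (or 1 x 1 zero) diagonal blocks.

```python
import numpy as np
from collections import deque
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def frobenius_normal_form(A):
    """Return a permutation p and diagonal-block sizes such that
    A[p][:, p] is block upper triangular with irreducible (or 1 x 1
    zero) diagonal blocks -- one realization of the Frobenius normal
    form of A."""
    ncomp, label = connected_components(csr_matrix((A != 0).astype(int)),
                                        connection='strong')
    # condensation digraph: edge c -> d if some a_ij != 0 crosses classes
    succ = [set() for _ in range(ncomp)]
    indeg = [0] * ncomp
    for i, j in zip(*np.nonzero(A)):
        c, d = label[i], label[j]
        if c != d and d not in succ[c]:
            succ[c].add(d)
            indeg[d] += 1
    q = deque(c for c in range(ncomp) if indeg[c] == 0)  # Kahn's algorithm
    topo = []
    while q:
        c = q.popleft()
        topo.append(c)
        for d in succ[c]:
            indeg[d] -= 1
            if indeg[d] == 0:
                q.append(d)
    p = [i for c in topo for i in np.flatnonzero(label == c)]
    return np.array(p), [int((label == c).sum()) for c in topo]

# Example: access classes {0,1} and {2,3}, with access only from {2,3}
# into {0,1}; the permuted matrix is block upper triangular.
A = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.2, 0.8, 0.0, 0.0],
              [0.1, 0.2, 0.3, 0.4],
              [0.0, 0.0, 0.5, 0.5]])
p, sizes = frobenius_normal_form(A)
F = A[np.ix_(p, p)]
print(sizes)                      # [2, 2]
print(np.allclose(F[2:, :2], 0))  # True: lower-left block is zero
```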

Perron-Frobenius and M-matrices
For general theory and background on the theory of nonnegative matrices (Perron-Frobenius theory) the reader is directed to [3] and [6]. The following variant of the Perron-Frobenius Theorem will specifically be quoted in the sequel.

Theorem 2.2 Let A >= 0 be a primitive matrix. Then rho(A) > 0 is a simple eigenvalue of A, it is the only eigenvalue of A of maximum modulus, and there exist entrywise positive right and left eigenvectors of A corresponding to rho(A).

We proceed with more relevant facts.

(b) Every nonnegative primitive matrix is irreducible. Also, if A >= 0 is irreducible and has positive trace, then A is primitive.
(c) Matrix T in M_n(C) is called semiconvergent whenever lim_{j -> infinity} T^j exists. It is known that T is semiconvergent if and only if rho(T) <= 1, any eigenvalue of T of modulus 1 is equal to 1, and the size of any Jordan block of T corresponding to 1 is 1 x 1.

(e) A singular M-matrix A in M_n(R) can be written as A = sI - T with T >= 0 and s = rho(T); we say A has "property c" if T/s is semiconvergent.

We will work with group inverses of M-matrices, for which [8] is a comprehensive reference.

Stochastic matrices and Markov chains
• Recall that a matrix P = [p_ij] in M_n(R) is (row) stochastic if p_ij >= 0 for all i, j = 1, 2, . . . , n and sum_{j=1}^n p_ij = 1 for all i = 1, 2, . . . , n. Equivalently, P in M_n(R) is stochastic if P >= 0 and Pe = e, where e denotes the all-ones column vector (in R^n).
• The entries of a stochastic matrix P = [p_ij] in M_n(R) are typically associated with the transition probabilities among the n states, s_1, s_2, . . . , s_n, of a Markov chain. The entry p_ij represents the probability that the process will next move to state s_j, given that it is currently in state s_i. The k-th distribution vector is the nonnegative vector pi^(k) = [pi_1^(k), pi_2^(k), . . . , pi_n^(k)]^T, where for each k = 0, 1, . . ., e^T pi^(k) = 1. The stochastic modeling assumptions dictate that the distribution vectors and the transition matrix P satisfy the iteration (pi^(k+1))^T = (pi^(k))^T P, k = 0, 1, . . .. The initial distribution is pi^(0). Each entry pi_i^(k) measures the probability that the process is in state s_i after k steps. If pi^(0) = pi^(k), k = 1, 2, . . ., we refer to pi^(0) as a stationary distribution of the Markov chain. Every Markov chain has a stationary distribution.
• A Markov chain with transition matrix P ∈ M n (R) and its states can be classified according to the digraph of P , Γ(P ) as follows.
The vertices of Gamma(P) represent and are labeled by the states s_1, s_2, . . . , s_n. We say s_i has access to s_j if there is a path from s_i to s_j. When s_i has access to s_j and s_j has access to s_i, we say these two states communicate. The Markov chain is called regular if and only if P is primitive, and periodic if and only if P is irreducible and cyclic.
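The distribution iteration above can be illustrated with a tiny two-state chain (the transition values are arbitrary):

```python
import numpy as np

# Iterating (pi^{(k+1)})^T = (pi^{(k)})^T P for a 2-state chain.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
pi = np.array([1.0, 0.0])       # initial distribution: start in state s_1
for _ in range(200):
    pi = pi @ P                 # one step of the chain
print(pi)                       # approaches the stationary distribution
```

For this P the stationary distribution is [0.8, 0.2], which solves pi^T P = pi^T, e^T pi = 1; since P here is primitive, the iteration converges to it from any initial distribution.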

The matrix exponential
The matrix exponential function plays a central role in linear differential systems and, as we shall see, in the analysis of continuous-time Markov chains. We review some of its basic properties below.
The matrix exponential of A in M_n(C) is denoted by e^A and defined by the power series e^A = sum_{k=0}^infinity A^k / k!. We will frequently consider the function e^{tA}, where t >= 0 will be viewed as a variable representing time. It is well-known that the initial value problem x'(t) = A x(t), x(0) = x_0, has a unique solution given by x(t) = e^{tA} x_0. Let A, B in M_n(C) and let s, t in C. Then the matrix exponential satisfies the following well-known properties: e^0 = I, e^{(s+t)A} = e^{sA} e^{tA}, and e^A is invertible with (e^A)^{-1} = e^{-A}. Moreover, if AB = BA, then B e^{At} = e^{At} B and e^{At} e^{Bt} = e^{(A+B)t} = e^{Bt} e^{At}.
Finally, if x is an eigenvector of A in M_n(C) corresponding to lambda in sigma(A), then it is well-known that for each t >= 0, x is an eigenvector of e^{tA} corresponding to e^{t lambda} in sigma(e^{tA}).
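These properties lend themselves to quick numerical spot checks (the data below is arbitrary):

```python
import numpy as np
from scipy.linalg import expm

# Semigroup property e^{(s+t)A} = e^{sA} e^{tA} on a random matrix.
rng = np.random.default_rng(0)
A = 0.5 * rng.standard_normal((4, 4))
s, t = 0.3, 1.1
assert np.allclose(expm((s + t) * A), expm(s * A) @ expm(t * A))

# IVP solution x(t) = e^{tA} x0: a forward-difference quotient of the
# claimed solution should approximate A x(t) to within O(h).
x0 = rng.standard_normal(4)
x = lambda tau: expm(tau * A) @ x0
h = 1e-6
print(np.max(np.abs((x(t + h) - x(t)) / h - A @ x(t))))  # O(h): tiny
```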
The following classical result (see [15]) is fundamental in the continuous-time Markov chain stochastic model to be subsequently developed: for B = [b_ij] in M_n(R), e^{tB} >= 0 for all t >= 0 if and only if all the off-diagonal entries of B are nonnegative.
One direction follows from the series expansion of e^{tB} after adding a suitable nonnegative multiple of I to B. For the converse, assume e^{tB} >= 0 for all t >= 0 and, by way of contradiction, that b_ij < 0 for some i != j. Then, since e^{tB} = I + tB + O(t^2) as t -> 0^+, the (i, j) entry of e^{tB} is dominated by the term t b_ij for all sufficiently small t > 0 and is thus negative, a contradiction to the assumption that e^{tB} >= 0 for all t >= 0.
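Both directions of this characterization can be observed numerically; the generator below is a hypothetical example with zero row sums and nonnegative off-diagonal entries.

```python
import numpy as np
from scipy.linalg import expm

# An essentially nonnegative B (nonnegative off-diagonal entries) with
# zero row sums: e^{tB} is nonnegative, in fact stochastic, for all t >= 0.
B = np.array([[-1.0,  0.7,  0.3],
              [ 0.4, -0.9,  0.5],
              [ 0.0,  0.2, -0.2]])
for t in (0.01, 1.0, 10.0):
    E = expm(t * B)
    assert (E >= 0).all() and np.allclose(E.sum(axis=1), 1.0)

# One negative off-diagonal entry already forces a negative entry of
# e^{tB} for small t, since e^{tB} = I + tB + O(t^2) entrywise.
Bbad = B.copy()
Bbad[0, 1] = -0.7
print(expm(0.001 * Bbad)[0, 1])   # negative (approximately 0.001 * b_01)
```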

The continuous-time Markov chain stochastic model
In this section, we follow closely the exposition in [14, Chapter VI, Section 6] (see also [4, Chapter VII, Section 2]) by considering a continuous-time Markov chain whose state at time t >= 0 is X(t) in {s_1, s_2, . . . , s_n}. The transition probabilities of this stochastic process from state to state comprise a time-dependent matrix P(t) = [p_ij(t)], where p_ij(t) = Pr(X(s + t) = s_j | X(s) = s_i); that is, the probability of transition from state s_i to state s_j in t units of time is independent of the initial time s (stationary).
Furthermore, we assume that the stochastic model satisfies the following conditions for all s, t >= 0 and all i, j in {1, 2, . . . , n}: p_ij(t) >= 0 and sum_{k=1}^n p_ik(t) = 1; p_ij(0) equals 1 if i = j and 0 otherwise; and p_ij(t + s) = sum_{k=1}^n p_ik(t) p_kj(s). In matrix terms, we thus have that for all s, t >= 0, P(t) satisfies

P(t) >= 0 and P(t)e = e, (3.1)
P(0) = I, (3.2)
P(t + s) = P(t)P(s), (3.3)

where e in R^n is the all-ones vector. The Chapman-Kolmogorov equation (3.3) is a natural assertion of the fundamental Markov property: given the value of X(t), the value of X(t + s) for s > 0 only depends on X(t) and not on the value of X(u) for any u < t.
Equation (3.3), along with (3.1) and (3.2), determines the dynamics of P(t) as follows. For 0 < h < t, we have P(t) = P(t - h)P(h). But P(h) is near the identity matrix I for sufficiently small h > 0, so P(h)^{-1} exists and also approaches the identity matrix I. Therefore,

lim_{h -> 0^+} P(t - h) = lim_{h -> 0^+} P(t) P(h)^{-1} = P(t),

that is, P(t) is continuous. In fact, P(t) is differentiable at t = 0 in the sense that the limit

B := P'(0) = lim_{h -> 0^+} (P(h) - I)/h

exists; it follows that P(t) satisfies the matrix differential equation

P'(t) = P(t)B, P(0) = I. (3.6)

The solution to (3.6) thus is

P(t) = e^{tB}, t >= 0. (3.7)

The dynamics of the continuous-time Markov chain as described above are therefore governed by the Kolmogorov equation (3.6) and its explicit solution in (3.7). Indeed, let z(t) = [z_j(t)] in R^n denote the state probability vector at time t, i.e.,

z_j(t) = Pr(X(t) = s_j), j = 1, 2, . . . , n. (3.8)

Then, by the law of total probability,

z^T(t) = z^T(0) P(t) = z^T(0) e^{tB}, t >= 0. (3.9)

Remark 3.1 We note here that the continuous-time Markov process described above also admits a sojourn description as follows. Starting in state s_i, the process sojourns there for a duration of time that is exponentially distributed with parameter -b_ii. Then the process jumps to state s_j (j != i) with probability -b_ij/b_ii. The sojourn time in state s_j is exponentially distributed with parameter -b_jj before it jumps to another state, etc. The sequence of states visited in this process forms a discrete-time Markov chain called the embedded jump chain.
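The sojourn description of Remark 3.1 suggests a direct Monte-Carlo check of (3.7): simulating exponential holding times and jumps of the embedded chain should reproduce the distribution given by the matrix exponential. A sketch, with a hypothetical generator B:

```python
import numpy as np
from scipy.linalg import expm

# Distribution of X(t) from P(t) = e^{tB} versus simulation via the
# sojourn description (exponential holding times + embedded jump chain).
B = np.array([[-2.0,  1.5,  0.5],
              [ 1.0, -1.0,  0.0],
              [ 0.5,  0.5, -1.0]])
t_end, n_runs = 2.0, 20000
rng = np.random.default_rng(42)

counts = np.zeros(3)
for _ in range(n_runs):
    state, t = 0, 0.0                          # start in state s_1
    while True:
        rate = -B[state, state]
        t += rng.exponential(1.0 / rate)       # sojourn ~ Exp(-b_ii)
        if t >= t_end:
            break                              # still in `state` at t_end
        jump_probs = np.delete(B[state], state) / rate   # -b_ij / b_ii
        state = np.delete(np.arange(3), state)[rng.choice(2, p=jump_probs)]
    counts[state] += 1

z_exact = expm(t_end * B)[0]                   # row of P(t_end) for s_1
print(counts / n_runs, z_exact)                # agree to Monte-Carlo error
```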

Remark 3.2
For a continuous-time Markov chain as described above, we adopt a classification and terminology for its states at any given time t ≥ 0 based on the digraph of P (t). In particular, we call the continuous-time Markov chain regular if P (t) is primitive for all t > 0.

The transition matrix of a continuous-time Markov chain
We proceed using the notation established in Section 3.
It is of interest to examine the time-dependent, as well as the asymptotic (as t -> infinity), behavior of z(t) and P(t) of a continuous-time Markov chain as functions of B = P'(0) (the matrix of infinitesimal transition rates). This task is accomplished by adapting the classical matrix-theoretic analysis of homogeneous Markov chains [3, Chapter 8].
We first observe the special nature of B in (3.7). Recall that e in R^n denotes the all-ones vector. Since P(t)e = e for all t >= 0 by (3.1), differentiating at t = 0 gives Be = 0, which implies that B is singular. Moreover, we know that P(h') = e^{h'B} >= 0 for all h' >= 0 and that P(h') approaches I + h'B as h' -> 0^+, and so, for all sufficiently small h' > 0, -h'B = I - A for some nonnegative matrix A = [a_ij], which must therefore be stochastic. Writing now h'B = A - I with A >= 0, since each row of B sums to 0 and -h' b_ii = 1 - a_ii >= 0, we have that for each i = 1, 2, . . . , n,

b_ii <= 0 and b_ii = - sum_{j != i} b_ij.

Also, for i != j, we have h' b_ij = a_ij in [0, 1]; that is, the off-diagonal entries of B are nonnegative.

The following result describes some properties of the digraph and the entries of P(t).

Lemma 4.2 Let B = P'(0) and let P(t) = e^{tB} be as in (3.7). Then the following hold:
(i) P(t) is a stochastic matrix for all t >= 0.
(ii) For each t > 0, Gamma(P(t)) is the transitive closure of Gamma(B).
(iii) P(t) is irreducible for some t > 0 if and only if B is irreducible.
(iv) If P(t) is irreducible for some t > 0, then P(t) is irreducible for all t > 0.
(v) P (t) > 0 (and hence P (t) is primitive) for some t > 0 if and only if B is irreducible.
(vi) If P (t) is primitive for some t > 0, then it is primitive for all t > 0.

Proof.
As in the proof of Lemma 4.1, we have that for some h' > 0, h'B = A - I, where A >= 0 is stochastic.
(i) For each t >= 0, setting tau = t/h', we have

P(t) = e^{tB} = e^{tau (A - I)} = e^{-tau} e^{tau A} = e^{-tau} sum_{k=0}^infinity (tau^k / k!) A^k. (4.11)

Thus, since Be = 0, we have P(t)e = e; and since the series in (4.11) has nonnegative terms, P(t) >= 0. Therefore P(t) is a stochastic matrix.
(ii) Since A >= 0, if there is a walk from i to j in Gamma(A), then there exists a positive integer k such that the (i, j) entry of A^k is positive. For each t > 0, by (4.11), P(t) is an infinite series of positive multiples of all the powers of A, and so it follows that Gamma(P(t)) is the transitive closure of Gamma(A), which in turn coincides with the transitive closure of Gamma(B).
(iii) By (ii), for every t > 0, Gamma(P(t)) is the transitive closure of Gamma(A), and a digraph is strongly connected if and only if its transitive closure is strongly connected. Hence, P(t) is irreducible for some t > 0 if and only if A, and thus B, is irreducible.
(iv) If P(t) is irreducible for some t > 0, then A and h'B = A - I are irreducible by (iii), in which case it follows by (4.11), as in the proof of (iii), that P(t) is irreducible for all t > 0.
(v) & (vi) B is irreducible if and only if there is a walk from any vertex i to any other vertex j in Gamma(A), or equivalently, if and only if for all i and j there exists k such that the (i, j) entry of A^k is positive. Thus, by (4.11), P(t) > 0 for some (and hence all) t > 0 is equivalent to B being irreducible.
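Lemma 4.2 (ii) is easy to witness numerically: the zero pattern of e^{tB} is the same for every t > 0 and matches the reachability (transitive closure) pattern of Gamma(B). The generator below is a hypothetical reducible example in which states s_1, s_2 have no access to s_3.

```python
import numpy as np
from scipy.linalg import expm

B = np.array([[-1.0,  1.0,  0.0],
              [ 0.5, -0.5,  0.0],
              [ 0.3,  0.2, -0.5]])
n = B.shape[0]
# transitive closure of Gamma(B) via boolean powers of (I + pattern)
R = np.linalg.matrix_power((B != 0).astype(int) + np.eye(n, dtype=int), n) > 0

for t in (0.1, 1.0, 7.0):
    # zero pattern of P(t) is constant in t and equals the closure R
    assert np.array_equal(expm(t * B) > 1e-13, R)
print(R.astype(int))   # rows 1,2 cannot reach state 3: column 3 is 0 there
```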
We will now proceed under the assumption that B is irreducible, in which case, by Lemma 4.2, P(t) is a primitive stochastic matrix for all t > 0. In particular, by Theorem 2.2, rho(P(t)) = 1 is a simple eigenvalue of P(t) and its sole eigenvalue lying on the unit circle; the corresponding (right) eigenspace is spanned by e.
Notice also that since P(t) = e^{tB}, where B = P'(0) is fixed, and since e^{tB} and B share left eigenvectors, the left eigenvectors of P(t) corresponding to the eigenvalue 1 coincide with the left eigenvectors of B corresponding to the eigenvalue 0.
As a consequence, because of the primitivity of P(t) for all t > 0, the left eigenspace of P(t) corresponding to rho(P(t)) = 1 is spanned by some constant, common (for all t > 0) vector pi in R^n. As the transpose of P(t) is also a primitive matrix, we can take pi to be entrywise positive and normalize it to satisfy pi^T e = 1.
Note that the above observations are intuitively in agreement with the notion of the embedded jump chain (see Remark 3.1), which suggests that under certain conditions (primitivity), there exists a unique stationary probability distribution pi in R^n such that

P^T(t) pi = pi = lim_{t -> infinity} z(t) = lim_{t -> infinity} P^T(t) z(0), (4.12)

where z(t) is the state probability vector introduced in (3.8) and (3.9). In conclusion, we have that pi in (4.12) is thus the unique solution to the system

B^T pi = 0, e^T pi = 1.

Moreover, under the assumption of irreducibility of B, and since then e^{tB} is primitive for all t > 0, we have that

lim_{t -> infinity} P(t) = e pi^T.

The properties and behavior of homogeneous Markov chains with fixed transition probabilities P in M_n(R) are intimately related to the singular M-matrix A = I - P; see [3, Chapter 6]. In keeping with this approach, we now consider the continuous-time model described above and define the time-dependent M-matrix function A(t) = I - P(t), t >= 0.
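The stationary vector is straightforward to compute as the normalized null vector of B^T; a sketch with a hypothetical irreducible generator, including a check of the limit of P(t):

```python
import numpy as np
from scipy.linalg import expm, null_space

B = np.array([[-2.0,  1.5,  0.5],
              [ 1.0, -1.0,  0.0],
              [ 0.5,  0.5, -1.0]])
v = null_space(B.T)            # one-dimensional since B is irreducible
pi = (v / v.sum()).ravel()     # normalize so that e^T pi = 1
assert (pi > 0).all()          # Perron vector is entrywise positive

P_big = expm(50.0 * B)         # proxy for t -> infinity
print(pi)                      # for this B: [4/13, 7/13, 2/13]
assert np.allclose(P_big, np.outer(np.ones(3), pi), atol=1e-10)
```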
We have seen in (4.12) that π = lim t→∞ P T (t) z(0). In what follows we directly connect π to the adjoint of I − P (t) and its trace, thus obtaining an explicit relation of π in terms of P (t) for every t > 0.
Recall that the (i, j) entry of the adjoint of a matrix X in M_n(R), denoted by adj(X), is the (j, i)-th cofactor of X, i.e.,

[adj(X)]_ij = (-1)^{i+j} det(X_ji),

where X_ji denotes the submatrix of X obtained when row j and column i are deleted. It is well-known that adj(X) X = X adj(X) = det(X) I.
Theorem 4.3 Let B = P'(0) in M_n(R) be irreducible, let P(t) = e^{tB} be as in (3.7), and for t > 0 let C(t) = adj(A(t)) = adj(I - P(t)). Then C(t) = delta(t) e pi^T for some delta(t) > 0.

Proof. The null space of A(t) is indeed spanned by e and the left null space of A(t) is spanned by pi. Thus the columns of C(t) are multiples of e and its rows are multiples of pi^T. It follows that C(t) = delta(t) e pi^T for some delta(t) in R. To prove that delta(t) is positive, note that as A(t) = I - P(t) is an M-matrix, its principal minors are nonnegative; in particular, the diagonal entries of C(t) are nonnegative. Moreover, since rank(A(t)) = n - 1, we have C(t) = adj(A(t)) != 0, and so delta(t) != 0. Since e and pi are positive vectors, it must then also be that delta(t) > 0.
Corollary 4.4 Under the assumptions and notation of Theorem 4.3, for each t > 0, every row of C(t), normalized so that its entries sum to 1, equals pi^T.

Proof. By Theorem 4.3, each row of C(t) is equal to a positive multiple of pi^T. Since pi is the unique normalized left null vector of B, each row of C(t), normalized to have sum equal to one, must equal pi^T.
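Corollary 4.4 can be observed directly by computing the adjugate entrywise via cofactors (the generator below is the same hypothetical irreducible example used earlier):

```python
import numpy as np
from scipy.linalg import expm

def adjugate(X):
    """Classical adjoint: adj(X)[i, j] = (-1)**(i + j) * det(X_{ji}),
    where X_{ji} is X with row j and column i deleted."""
    n = X.shape[0]
    C = np.empty_like(X)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(X, j, axis=0), i, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C

B = np.array([[-2.0,  1.5,  0.5],
              [ 1.0, -1.0,  0.0],
              [ 0.5,  0.5, -1.0]])
pi = np.array([4.0, 7.0, 2.0]) / 13.0   # solves B^T pi = 0, e^T pi = 1
for t in (0.5, 2.0):
    C = adjugate(np.eye(3) - expm(t * B))
    assert C[0, 0] > 0                           # delta(t) > 0
    assert np.allclose(C, np.tile(C[0], (3, 1))) # all rows are equal
    assert np.allclose(C[0] / C[0].sum(), pi)    # normalized row equals pi
print("adjugate rows recover pi")
```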
Remark 4.6 From Lemma 4.2, it is evident that when B is irreducible, P(t) has to be primitive for all t > 0, and so the continuous-time Markov chain must have a unique stationary distribution. Also, all the entries of P(t) = e^{tB} are then positive for all t > 0; that is, the probability of transition between any two states in any positive amount of time is positive. This means that, under the assumption of irreducibility of B, by Lemma 4.2 (v), the continuous-time Markov chain is regular.
Also note that from Lemma 4.2, when B = P'(0) is reducible, then P(t) = e^{tB} is reducible for all t >= 0. Without loss of generality, let B be symmetrically permuted into Frobenius normal form with irreducible diagonal blocks B_jj (j = 1, 2, . . . , k; k >= 2). By Lemma 4.2, for each t > 0, Gamma(P(t)) is the transitive closure of Gamma(B) and thus its Frobenius normal form has irreducible diagonal blocks equal to P_jj(t) = e^{t B_jj} (j = 1, 2, . . . , k). A study can then ensue that ties the k irreducible components of B to the limiting behavior of z(t) in (3.9). It is apparent that in this case the stationary distribution vector is, in general, no longer unique, as the Markov chain may contain transient states. The study of P(t) pursued in Section 5 applies to the case of reducible B, as no assumption of irreducibility is made there.

More on the dynamic behavior of P (t)
Recalling the stochastic process of a continuous-time Markov chain as described in (3.8) and (3.9), we will examine more closely the dynamic behavior of (the entries of) P (t), where P (t) is as described in (3.7). The results that follow offer a better understanding of monotonic and relative behavior of the transition probabilities as functions of t.
Given the dynamical system in (3.9), observe that

z'(t) = B^T z(t),

so the monotonicity of z(t) depends on B^T z(t) and, initially, on B^T z(0). As the entries of z(t) are nonnegative and add up to 1, at no time t >= 0 can all entries of B^T z(t) be positive. In particular, taking z(0) to be the l-th column of the identity, it follows that the l-th row of P(t) = e^{tB} initially behaves monotonically according to the sign pattern of z^T(0)B. The columns of P(t) also obey an orderly monotonic pattern, as explained in the following results.

Theorem 5.1 Let P(t) = [p_ij(t)] = e^{tB} be the transition matrix in (3.7), and let t > 0 and l in {1, 2, . . . , n}. If p_il(t) is a maximal entry of the l-th column of P(t), then (d/dt) p_il(t) <= 0; analogously, if p_il(t) is a minimal entry of the l-th column, then (d/dt) p_il(t) >= 0.

Proof. Since P'(t) = BP(t) (B commutes with e^{tB}), p_il(t) >= p_ml(t) >= 0 for every m, and b_im >= 0 for all m != i, we have

(d/dt) p_il(t) = sum_m b_im p_ml(t) <= b_ii p_il(t) + sum_{m != i} b_im p_il(t) = p_il(t) sum_m b_im = 0.

Analogously, for a minimal entry we have (d/dt) p_il(t) >= 0.
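Theorem 5.1 admits a quick numerical spot check, since the derivative P'(t) = B e^{tB} is available in closed form (the generator below is a hypothetical irreducible example):

```python
import numpy as np
from scipy.linalg import expm

# In each column of P(t) = e^{tB}, a maximal entry has nonpositive
# derivative and a minimal entry has nonnegative derivative.
B = np.array([[-2.0,  1.5,  0.5],
              [ 1.0, -1.0,  0.0],
              [ 0.5,  0.5, -1.0]])
for t in (0.2, 1.0, 3.0):
    P = expm(t * B)
    dP = B @ P                      # exact derivative P'(t)
    for l in range(3):
        i = np.argmax(P[:, l])      # maximal entry of column l
        j = np.argmin(P[:, l])      # minimal entry of column l
        assert dP[i, l] <= 1e-12 and dP[j, l] >= -1e-12
print("monotonicity of column extremes verified")
```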

Corollary 5.2
Let P(t) = [p_ij(t)] >= 0 be the transition matrix in (3.7). For any t > 0 and l in {1, 2, . . . , n}, let i, j in {1, 2, . . . , n} be such that p_il(t) = M(t) and p_jl(t) = m(t) are maximal and minimal entries, respectively, in column l of P(t), and let p_kl(t) be an arbitrary entry of column l of P(t). Then

b_kk (p_kl(t) - m(t)) <= (d/dt) p_kl(t) <= b_kk (p_kl(t) - M(t));

in particular, taking k = i and k = j recovers Theorem 5.1.

Proof. Let x(t) denote column l of P(t). As in the proof of Theorem 5.1, if a nonnegative vector has a maximal (minimal) k-th entry, then the k-th entry of its image under B is nonpositive (nonnegative). Let y(t) be the vector generated from x(t) by replacing x_k(t) by M(t), and let w(t) be the vector generated from x(t) by replacing x_k(t) by m(t). Then the k-th entry of y(t) is a maximal entry of y(t) and the k-th entry of w(t) is a minimal entry of w(t). Thus, using (d/dt) p_kl(t) = (Bx(t))_k, we have

(By(t))_k = (d/dt) p_kl(t) + b_kk (M(t) - p_kl(t)) <= 0 and (Bw(t))_k = (d/dt) p_kl(t) + b_kk (m(t) - p_kl(t)) >= 0.

Rearranging these two inequalities yields the claimed bounds. The following theorem is analogous to and implicit in the proof of convergence to a stationary distribution of a discrete Markov process in [1, Eq. (9), p. 268].

Theorem 5.3
Let P(t) = [p_ij(t)] = e^{tB} >= 0 be the transition matrix in (3.7). For any t > 0, let eps(t) be the smallest value among the entries of P(t). Let m_0 and M_0 be the minimum and maximum values among the entries of a fixed real vector x, respectively. Also, let m_1(t) = (P(t)x)_l and M_1(t) = (P(t)x)_s be the minimum and maximum values among the entries of P(t)x, respectively, for some l, s in {1, 2, . . . , n}. Then

M_1(t) - m_1(t) <= (1 - 2 eps(t)) (M_0 - m_0).

Proof. Assume that x_i = m_0 and x_j = M_0 are a minimal entry and a maximal entry of the fixed real vector x, respectively. Let y be the vector obtained from x by replacing all entries of x by M_0 except for the entry x_i. Then, x <= y and each entry of P(t)y satisfies

(P(t)y)_r = m_0 p_ri(t) + M_0 sum_{m != i} p_rm(t) = M_0 - p_ri(t)(M_0 - m_0) <= M_0 - eps(t)(M_0 - m_0),

since p_ri(t) >= eps(t), the smallest entry of P(t). Moreover, since x <= y, we have

M_1(t) = (P(t)x)_s <= (P(t)y)_s <= M_0 - eps(t)(M_0 - m_0). (1)

Negating x, we see that -m_0 is a maximal entry and -M_0 is a minimal entry among the entries of -x. As above, we now get

-m_1(t) = max_r (P(t)(-x))_r <= -m_0 - eps(t)(M_0 - m_0). (2)

From (1) and (2), we obtain M_1(t) - m_1(t) <= (M_0 - m_0) - 2 eps(t)(M_0 - m_0) = (1 - 2 eps(t))(M_0 - m_0).
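The contraction bound of Theorem 5.3 is easily checked numerically (generator and test vector below are arbitrary illustrative choices):

```python
import numpy as np
from scipy.linalg import expm

# Check: max(P(t)x) - min(P(t)x) <= (1 - 2*eps(t)) * (max(x) - min(x)),
# where eps(t) is the smallest entry of P(t).
B = np.array([[-2.0,  1.5,  0.5],
              [ 1.0, -1.0,  0.0],
              [ 0.5,  0.5, -1.0]])
x = np.array([3.0, -1.0, 0.5])
for t in (0.5, 1.0, 4.0):
    P = expm(t * B)
    eps = P.min()
    y = P @ x
    assert y.max() - y.min() <= (1 - 2 * eps) * (x.max() - x.min()) + 1e-12
print("contraction bound verified")
```

Since eps(t) > 0 for all t > 0 when B is irreducible, the bound also explains the convergence of the entries of P(t)x (and of the columns of P(t)) to a common value as t grows.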

Remark 5.4
We note that the results in this section hold for irreducible as well as reducible B. However, if B is irreducible, in which case lim_{t -> infinity} P(t) = e pi^T, the observations herein take on a special meaning. For example, at any t > 0,
• the entries in the l-th column of P(t) approach in value the l-th entry of pi in the fashion described in Theorem 5.1: maximal (minimal) elements in the column do not further increase (decrease);
• the entries in every row of P(t) tend toward the corresponding entries of pi^T as t -> infinity in a mixed monotonic fashion within [0, 1].

The group generalized inverse of A(t) = I − P (t)
Recall the notion of the group (generalized) inverse of A in M_n(R), which, if it exists, is the unique solution X to the matrix equations AXA = A, XAX = X and XA = AX. When the group inverse exists, it is denoted by A^#. A necessary and sufficient condition for A^# to exist is that rank(A^2) = rank(A); see [2] for details.
In this section, we continue to consider the continuous-time Markov chain P(t) = e^{tB} with irreducible B and its association with the M-matrix function A(t) = I - P(t), which is indeed a singular matrix for each t >= 0.
We will consider the matrices A(t)^# and L(t) = I - A(t)A(t)^#, where A(t)^# is the group inverse of A(t), whose existence is guaranteed by Theorem 6.1 below.
In the homogeneous case, these matrices contain critical information about a Markov chain, as well as provide computationally convenient methods to obtain this information (see [3,Chapter 8,Section 4]). We will pursue an analogous study for continuous-time Markov chains based on A(t).
For clarity, we will focus on the case where B is irreducible, so that, by Lemma 4.2, P(t) is primitive and rho(P(t)) = 1 is a simple eigenvalue of P(t) for each t > 0, with all other eigenvalues having modulus less than 1.
Recalling the notions of semiconvergence and "property c" from Section 2 (fact (e) following Theorem 2.2), we have the following basic observation, which is based on the fact that an M-matrix A has "property c" if and only if rank(A) = rank(A^2).

Theorem 6.1 Let B = P'(0) in M_n(R) be irreducible and let P(t) = e^{tB} be the transition matrix for a continuous-time Markov chain in (3.7). Then for each t >= 0, A(t) = I - P(t) is an M-matrix with "property c".
Proof. Since P(t) >= 0 and rho(P(t)) = 1, A(t) is a singular M-matrix for every fixed t >= 0. Now we only need to show that A(t) has "property c" for all t >= 0. When t = 0, P(0) = I and A(0) = 0, which is indeed an M-matrix with "property c". Let t > 0 be arbitrary but fixed. We need to show that rank(A(t)) = rank(A^2(t)). Since B is irreducible, rho(P(t)) = 1 is a simple eigenvalue of P(t) and all other eigenvalues have modulus less than 1. Therefore, invoking the Jordan canonical form of P(t), there exists a nonsingular matrix S(t) such that P(t) = S(t)J(t)S(t)^{-1}, where J(t) = diag(1, N(t)) and rho(N(t)) < 1. Then A(t) = I - P(t) = S(t)(I - J(t))S(t)^{-1}, which implies that rank(A(t)) = rank(I - J(t)). Since I - J(t) = diag(0, I - N(t)) and I - N(t) is nonsingular, we can see that rank(A(t)) = rank(I - J(t)) = rank((I - J(t))^2) = rank(A^2(t)). Therefore, A(t) = I - P(t) is an M-matrix with "property c". Remark 6.2 An analogue to Theorem 6.1 for reducible B can be based on the Frobenius normal form of B.
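Theorem 6.1 and the group inverse admit a direct numerical illustration. The closed form A(t)^# = (A(t) + e pi^T)^{-1} - e pi^T used below is Meyer's well-known formula for ergodic chains; it is an outside ingredient of this sketch, not derived in the text, and the generator B is a hypothetical irreducible example.

```python
import numpy as np
from scipy.linalg import expm

B = np.array([[-2.0,  1.5,  0.5],
              [ 1.0, -1.0,  0.0],
              [ 0.5,  0.5, -1.0]])
pi = np.array([4.0, 7.0, 2.0]) / 13.0     # stationary vector: B^T pi = 0
W = np.outer(np.ones(3), pi)              # the rank-one matrix e pi^T

t = 1.5
A = np.eye(3) - expm(t * B)
# "property c": rank A(t) = rank A(t)^2 (= n - 1 here)
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A @ A) == 2

Ash = np.linalg.inv(A + W) - W            # candidate group inverse A(t)^#
assert np.allclose(A @ Ash @ A, A)        # the three defining equations
assert np.allclose(Ash @ A @ Ash, Ash)
assert np.allclose(A @ Ash, Ash @ A)
# I - A(t) A(t)^# equals e pi^T (cf. Theorem 6.5)
assert np.allclose(np.eye(3) - A @ Ash, W)
print("group inverse of A(t) verified")
```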
The condition rank(A(t)) = rank(A^2(t)) in the proof of Theorem 6.1 is indeed equivalent to the existence of the group generalized inverse A(t)^#, which, in turn, is also characterized by its action on x in R^n as follows: writing x = u + v with u in the range of A(t) and v in the null space of A(t) (a decomposition that exists and is unique exactly when rank(A(t)) = rank(A^2(t))), A(t)^# x is the unique solution y in the range of A(t) of the equation A(t)y = u. Thus, as an immediate consequence of Theorem 6.1 we have the following corollary.

Corollary 6.3 Let B = P'(0) in M_n(R) be irreducible and P(t) = e^{tB} be the transition matrix for a continuous-time Markov chain as in (3.7). Then the group inverse A(t)^# of A(t) = I - P(t) exists for each t >= 0.

Theorem 6.4 Let B and P(t) be as in Corollary 6.3 and let A(t) = I - P(t). Then for each t >= 0, P(t) is semiconvergent and lim_{k -> infinity} P^k(t) = I - A(t)A(t)^#.

Proof. As in the proof of Theorem 6.1, for each t >= 0, there exists a nonsingular matrix S(t) such that P(t) = S(t)J(t)S(t)^{-1}, where J(t) = diag(1, N(t)) and rho(N(t)) < 1. Thus, it follows that

lim_{k -> infinity} P^k(t) = S(t) diag(1, 0) S(t)^{-1} = I - A(t)A(t)^#.

Next, we will establish the relationship between A(t) = I - P(t) and the left and right Perron vectors of P(t).
Theorem 6.5 Let B = P'(0) in M_n(R) be irreducible and P(t) = e^{tB} be the transition matrix for a continuous-time Markov chain as in (3.7). Let A(t) = I - P(t) and L(t) = I - A(t)A(t)^#. Then for all t > 0,

L(t) = e pi^T,

where pi is the stationary probability distribution vector associated with the stochastic model and e is the all-ones vector.
Proof. For each t > 0, we have

L(t)A(t) = A(t) - A(t)A(t)^# A(t) = 0, that is, L(t)P(t) = L(t).

Also, since a Markov chain is ergodic if and only if it has a unique positive stationary distribution vector pi (see Section 2 and Lemma 4.2 (iv), or [5, Theorem 21, p. 261]), each row of L(t) is a multiple of pi^T. By Theorem 6.4, L(t) is also a stochastic matrix, as the limit of the stochastic matrices P^k(t), which implies that L(t) = e pi^T. The following remarks provide context and exemplify the potential of the approach developed in this section.

Remark 6.6
• By Theorems 6.4 and 6.5, we confirm our earlier findings: when B = P'(0) in M_n(R) is irreducible and P(t) = e^{tB} is the transition matrix for a continuous-time Markov chain as in (3.7), then lim_{t -> infinity} P(t) = L(t) = e pi^T for every t > 0.
• We note that in Theorem 6.5, L(t) is a nonnegative, rank 1, stochastic matrix. For an arbitrary chain, L(t) is a nonnegative, rank r, stochastic matrix, where r is the number of ergodic classes associated with the chain.
• For a general chain, the entries of L(t) can be used to obtain the probabilities of eventual absorption into any one particular ergodic class, for each starting state. Moreover, if P(t) is the transition matrix for an absorbing chain, then we have the following: (a) If s_j is an absorbing state, then l_ij(t) equals the probability of eventual absorption of the process into the class [s_j] when it starts in state s_i.
(b) If s_i and s_j are transient states, then the (i, j) entry of A(t)^# equals the expected number of times the chain will be in s_j when it is initially in s_i.
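A small absorbing-chain illustration of item (a): in the hypothetical generator below, states s_1 and s_2 are absorbing, while the transient state s_3 jumps to each of them with rate 0.5, so its eventual-absorption probabilities are 0.5 each.

```python
import numpy as np
from scipy.linalg import expm

B = np.array([[0.0, 0.0,  0.0],     # s_1 absorbing (zero row)
              [0.0, 0.0,  0.0],     # s_2 absorbing
              [0.5, 0.5, -1.0]])    # s_3 transient, rate 0.5 to each
L = expm(200.0 * B)                  # numerically at the t -> infinity limit
print(L.round(6))                    # row 3: absorption probabilities
```

The limiting matrix has rank 2, matching the two ergodic classes {s_1} and {s_2}, and its third row reads [0.5, 0.5, 0].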

Summary, conclusions and future work
The stochastic process we identified as a continuous-time Markov chain (ctMC) is described in equations (3.6)-(3.9) and satisfies the following: • The transition probabilities P(t) of the ctMC are entirely determined by B = P'(0).
• ctMC possesses at least one stationary distribution.
• If B = P'(0) is irreducible, then (i) ctMC possesses a unique stationary distribution pi; (ii) pi is the unique nonnegative solution to B^T pi = 0, e^T pi = 1; (iii) ctMC is regular, i.e., at all times t > 0, the probability of the process transitioning to any state is positive.
• The entries of P(t) evolve in a particular balanced manner, in which the largest (resp., smallest) transition probabilities to any given state trend downward (resp., upward) at any given t > 0.
The particular way the mathematical model of a ctMC arises points to the need for a specific type of simulation and empirical data collection required to pursue real-life applications. Currently, inhomogeneous stochastic models in use focus on (1) making assumptions about the functional and parametric form of the entries of the transition matrix and (2) collecting or simulating time-sensitive data that inform the choice of the parameters. The model pursued herein indicates that an analysis can be pursued by estimating the rates of growth of the transition probabilities at an initial time, namely the matrix B.
Our intention in this article, which is based on [9], is to provide a primer for a deeper matrix-theoretic analysis of continuous-time Markov chains. As such, we foresee several possible lines of future research: (1) Discover and interpret more dynamic relations among the entries of P(t).

(3) When B is reducible, analyze the asymptotic behavior of the probability distributions in terms of the Frobenius normal form and the reduced graph of B [12].
(4) Extend and expand the use of group inverses of M-matrices to general continuous-time Markov chains, analogously to the homogeneous case.
(5) Study initial distributions z(0) in (3.9) that lead to specific monotonic behavior of the state probabilities; this is equivalent to studying the image of R^n_+ under B.