The model we consider next has the DAG of model 4-3b in , reproduced in Figure 3 for convenience.

Figure 3 The DAG of model 4-3b

Parameters for the model are:

1. ${\mathbf{p}}_{0}=P({X}_{0})\in {\mathrm{\Delta}}^{{n}_{0}-1}$, a stochastic vector giving the distribution for the ${n}_{0}$-state hidden variable ${X}_{0}$.

2. Stochastic matrices ${M}_{1}=P({X}_{1}|{X}_{0})$ of size ${n}_{0}\times {n}_{1}$; ${M}_{i}=P({X}_{i}|{X}_{0},{X}_{1})$ of size ${n}_{0}{n}_{1}\times {n}_{i}$ for $i=2,3$; and ${M}_{4}=P({X}_{4}|{X}_{0},{X}_{3})$ of size ${n}_{0}{n}_{3}\times {n}_{4}$.

Theorem. *Consider the model represented by the DAG of model 4-3b, where variables* ${X}_{i}$ *have* ${n}_{i}\ge 2$ *states, with* ${n}_{2},{n}_{4}\ge {n}_{0}$*. Then generic parameters of the model are identifiable up to label swapping, and an algebraic procedure for determination of the parameters from the joint probability distribution* $P({X}_{1},{X}_{2},{X}_{3},{X}_{4})$ *can be given*.

*More specifically, suppose* ${\mathbf{p}}_{0},{M}_{1},{M}_{3}$ *have no zero entries, the* ${n}_{0}\times {n}_{2}$ *and* ${n}_{0}\times {n}_{4}$ *matrices*
${M}_{2}^{i}=P({X}_{2}|{X}_{0},{X}_{1}=i),\quad 1\le i\le {n}_{1},\quad \text{and}\quad {M}_{4}^{j}=P({X}_{4}|{X}_{0},{X}_{3}=j),\quad 1\le j\le {n}_{3},$
*have rank* ${n}_{0}$, *and there exist some* $i,{i}^{\prime}$ *with* $1\le i<{i}^{\prime}\le {n}_{1}$ *such that for all* $1\le j<{j}^{\prime}\le {n}_{3}$, $1\le k<{k}^{\prime}\le {n}_{0}$ *the entries of* ${M}_{3}$ *satisfy inequality (7). Then from the resulting joint distribution the parameters can be found through determination of the roots of certain* ${n}_{0}$*th degree univariate polynomials and solving linear equations. The coefficients of these polynomials and linear systems are rational expressions in the entries of the joint distribution*.

*Proof*. Consider first the case ${n}_{0}={n}_{2}={n}_{4}=n$. With $P=P({X}_{1},{X}_{2},{X}_{3},{X}_{4})$ viewed as an ${n}_{1}\times n\times {n}_{3}\times n$ array, we work with $n\times n$ slices of *P*,
${P}_{i,j}=P({X}_{1}=i,{X}_{2},{X}_{3}=j,{X}_{4}),$

that is, we essentially condition on ${X}_{1},{X}_{3}$, though we omit the normalization.

Note that these slices can be expressed as
${P}_{i,j}=({M}_{2}^{i}{)}^{T}{D}_{i,j}{M}_{4}^{j},\qquad (4)$
where ${D}_{i,j}=\mathrm{diag}(P({X}_{0},{X}_{1}=i,{X}_{3}=j))$ is the diagonal matrix given in terms of parameters by
${D}_{i,j}(k,k)={\mathbf{p}}_{0}(k){M}_{1}(k,i){M}_{3}((k,i),j),$
and ${M}_{2}^{i}$ and ${M}_{4}^{j}$ are as in the statement of the theorem.
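The factorization in eq. (4) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the array layout (axes ordered as hidden state, parent state, child state), the 0-based indices, the sizes, and the helper name `slice_from_factors` are all choices made for this illustration rather than anything fixed by the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n1, n3 = 3, 2, 2   # n0 = n2 = n4 = n, as in the first case of the proof

# Random stochastic parameters (illustrative values only).
p0 = rng.dirichlet(np.ones(n))                  # p0[k]       = P(X0=k)
M1 = rng.dirichlet(np.ones(n1), size=n)         # M1[k, i]    = P(X1=i | X0=k)
M2 = rng.dirichlet(np.ones(n), size=(n, n1))    # M2[k, i, l] = P(X2=l | X0=k, X1=i)
M3 = rng.dirichlet(np.ones(n3), size=(n, n1))   # M3[k, i, j] = P(X3=j | X0=k, X1=i)
M4 = rng.dirichlet(np.ones(n), size=(n, n3))    # M4[k, j, m] = P(X4=m | X0=k, X3=j)

# Joint distribution of the observed variables: sum out the hidden X0.
# P[i, l, j, m] = P(X1=i, X2=l, X3=j, X4=m)
P = np.einsum('k,ki,kil,kij,kjm->iljm', p0, M1, M2, M3, M4)

def slice_from_factors(i, j):
    """Right-hand side of eq. (4): (M_2^i)^T D_{i,j} M_4^j."""
    D = np.diag(p0 * M1[:, i] * M3[:, i, j])
    return M2[:, i, :].T @ D @ M4[:, j, :]

# Largest entrywise gap between the slices P_{i,j} and the factorization.
max_err = max(
    np.abs(P[i, :, j, :] - slice_from_factors(i, j)).max()
    for i in range(n1) for j in range(n3)
)
print(max_err)
```

The printed gap should be at the level of floating-point rounding, since eq. (4) is an exact identity for any choice of parameters.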

Equation (4) implies for $1\le i,{i}^{\prime}\le {n}_{1}$ and $1\le j,{j}^{\prime}\le {n}_{3}$ that
${P}_{i,j}^{-1}{P}_{i,{j}^{\prime}}{P}_{{i}^{\prime},{j}^{\prime}}^{-1}{P}_{{i}^{\prime},j}=({M}_{4}^{j}{)}^{-1}{D}_{i,j}^{-1}{D}_{i,{j}^{\prime}}{D}_{{i}^{\prime},{j}^{\prime}}^{-1}{D}_{{i}^{\prime},j}{M}_{4}^{j},\qquad (5)$
and the hypotheses on the parameters imply the needed invertibility. This shows that the rows of ${M}_{4}^{j}$ are left eigenvectors of this product.
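The cancellation behind eq. (5) can be spelled out by substituting eq. (4) into each pair of adjacent factors. The worked step below is a sketch of that substitution; it uses only the symbols already defined above.

```latex
\begin{align*}
P_{i,j}^{-1} P_{i,j'}
  &= (M_4^{j})^{-1} D_{i,j}^{-1} (M_2^{i})^{-T} \, (M_2^{i})^{T} D_{i,j'} M_4^{j'}
   = (M_4^{j})^{-1} D_{i,j}^{-1} D_{i,j'} M_4^{j'}, \\
P_{i',j'}^{-1} P_{i',j}
  &= (M_4^{j'})^{-1} D_{i',j'}^{-1} (M_2^{i'})^{-T} \, (M_2^{i'})^{T} D_{i',j} M_4^{j}
   = (M_4^{j'})^{-1} D_{i',j'}^{-1} D_{i',j} M_4^{j}.
\end{align*}
```

Multiplying the first line by the second cancels $M_4^{j'}$ against its inverse and yields eq. (5). Left-multiplying eq. (5) by $M_4^{j}$ then shows that row $k$ of $M_4^{j}$ is a left eigenvector of the product, with eigenvalue equal to the $k$th diagonal entry of $D_{i,j}^{-1}D_{i,j'}D_{i',j'}^{-1}D_{i',j}$.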

In fact, if $i\ne {i}^{\prime}$, $j\ne {j}^{\prime}$, then the eigenvalues of this product are distinct for generic parameters. To see this, note the eigenvalues are
${M}_{3}((k,i),{j}^{\prime}){M}_{3}((k,{i}^{\prime}),j)/\left({M}_{3}((k,i),j){M}_{3}((k,{i}^{\prime}),{j}^{\prime})\right),\qquad (6)$
for $1\le k\le n$, so distinctness of eigenvalues is equivalent to
$\begin{array}{rl}& {M}_{3}((k,i),{j}^{\prime}){M}_{3}((k,{i}^{\prime}),j){M}_{3}(({k}^{\prime},i),j){M}_{3}(({k}^{\prime},{i}^{\prime}),{j}^{\prime})\\ & \quad \ne {M}_{3}((k,i),j){M}_{3}((k,{i}^{\prime}),{j}^{\prime}){M}_{3}(({k}^{\prime},i),{j}^{\prime}){M}_{3}(({k}^{\prime},{i}^{\prime}),j),\end{array}\qquad (7)$
for all $1\le k<{k}^{\prime}\le n$. Thus a generic choice of ${M}_{3}$ leads to distinct eigenvalues.
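For readers who want the intermediate step, the eigenvalue formula (6) follows from the entries of $D_{i,j}$ given after eq. (4). The symbol $\lambda_k$ is introduced here only for this illustration and denotes the $k$th diagonal entry of the diagonal product in eq. (5).

```latex
\begin{align*}
\lambda_k
 &= \frac{D_{i,j'}(k,k)\, D_{i',j}(k,k)}{D_{i,j}(k,k)\, D_{i',j'}(k,k)} \\
 &= \frac{\mathbf{p}_0(k)\, M_1(k,i)\, M_3((k,i),j') \cdot \mathbf{p}_0(k)\, M_1(k,i')\, M_3((k,i'),j)}
         {\mathbf{p}_0(k)\, M_1(k,i)\, M_3((k,i),j) \cdot \mathbf{p}_0(k)\, M_1(k,i')\, M_3((k,i'),j')} \\
 &= \frac{M_3((k,i),j')\, M_3((k,i'),j)}{M_3((k,i),j)\, M_3((k,i'),j')}.
\end{align*}
```

The factors involving $\mathbf{p}_0$ and $M_1$ cancel, which is why only $M_3$ appears in (6). Requiring $\lambda_k\ne\lambda_{k'}$ and cross-multiplying, which is permitted because $M_3$ has no zero entries, gives inequality (7).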

With distinct eigenvalues, the eigenvectors are determined up to scaling. But since each row of ${M}_{4}^{j}$ must sum to 1, the rows of ${M}_{4}^{j}$ are therefore determined by *P*.

The ordering of the rows of the ${M}_{4}^{j}$ has not yet been determined. To do this, first fix an arbitrary ordering of the rows of ${M}_{4}^{1}$, say, which imposes an arbitrary labeling of the states for ${X}_{0}$. Then using eq. (4), from ${P}_{i,1}({M}_{4}^{1}{)}^{-1}$ we can determine ${D}_{i,1}$ and ${M}_{2}^{i}$ with their rows ordered consistently with ${M}_{4}^{1}$. For $j\ge 1$, using eq. (4) again, from $({M}_{2}^{i}{)}^{-T}{P}_{i,j}$ we can determine ${D}_{i,j}$ and ${M}_{4}^{j}$ with a consistent row order. Thus ${M}_{2}$ and ${M}_{4}$ are determined.

To determine the remaining parameters, again appealing to eq. (4), we can recover the distribution $P({X}_{0},{X}_{1},{X}_{3})$ using
$({M}_{2}^{i}{)}^{-T}{P}_{i,j}({M}_{4}^{j}{)}^{-1}=\mathrm{diag}(P({X}_{0},{X}_{1}=i,{X}_{3}=j)).$
With ${X}_{0}$ no longer hidden, it is straightforward to determine the remaining parameters.
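The steps of the proof can be traced numerically on a small example. The sketch below assumes NumPy, uses 0-based indices, and takes all variables to be binary with hand-picked parameter values; those values, the array layout, and every variable name are assumptions of this illustration. It uses a general-purpose eigensolver in place of explicit polynomial root finding, so it illustrates the structure of the procedure rather than its purely algebraic form.

```python
import numpy as np

# Hypothetical parameters, all variables binary. Axis order: (X0, parent, child).
p0 = np.array([0.6, 0.4])                                              # P(X0)
M1 = np.array([[0.7, 0.3], [0.2, 0.8]])                                # M1[k, i]
M2 = np.array([[[0.9, 0.1], [0.7, 0.3]], [[0.2, 0.8], [0.4, 0.6]]])    # M2[k, i, l]
M3 = np.array([[[0.5, 0.5], [0.2, 0.8]], [[0.3, 0.7], [0.6, 0.4]]])    # M3[k, i, j]
M4 = np.array([[[0.8, 0.2], [0.6, 0.4]], [[0.3, 0.7], [0.1, 0.9]]])    # M4[k, j, m]
n0, n1, n3 = 2, 2, 2

# Observed joint distribution P[i, l, j, m]; from here on only P is used.
P = np.einsum('k,ki,kil,kij,kjm->iljm', p0, M1, M2, M3, M4)
inv = np.linalg.inv

# Step 1: the product in eq. (5) with (i, i', j, j') = (0, 1, 0, 1).
A = inv(P[0, :, 0, :]) @ P[0, :, 1, :] @ inv(P[1, :, 1, :]) @ P[1, :, 0, :]
eigvals, V = np.linalg.eig(A.T)           # columns of V are left eigenvectors of A
rows = np.real(V).T
M4_first = rows / rows.sum(axis=1, keepdims=True)   # rows of M_4^j for j = 0

# Step 2: P_{i,0} (M_4^0)^{-1} = (M_2^i)^T D_{i,0}; column k is D_{i,0}(k,k) times row k of M_2^i.
M2_hat = np.empty((n0, n1, P.shape[1]))
for i in range(n1):
    B = P[i, :, 0, :] @ inv(M4_first)
    M2_hat[:, i, :] = (B / B.sum(axis=0)).T

# Step 3: (M_2^i)^{-T} P_{i,j} = D_{i,j} M_4^j; row sums give D_{i,j}, scaled rows give M_4^j.
D_hat = np.empty((n1, n3, n0))            # D_hat[i, j, k] = P(X0=k, X1=i, X3=j)
M4_hat = np.empty((n0, n3, P.shape[3]))
for j in range(n3):
    for i in range(n1):
        C = inv(M2_hat[:, i, :].T) @ P[i, :, j, :]
        D_hat[i, j] = C.sum(axis=1)
        M4_hat[:, j, :] = C / C.sum(axis=1, keepdims=True)

# Step 4: with X0 no longer hidden, read off the remaining parameters.
p0_hat = D_hat.sum(axis=(0, 1))
joint01 = D_hat.sum(axis=1)               # joint01[i, k] = P(X0=k, X1=i)
M1_hat = (joint01 / p0_hat).T
M3_hat = np.transpose(D_hat / joint01[:, None, :], (2, 0, 1))

# Recovered labels of X0 are arbitrary; match them to the true ones for comparison.
perm = [int(np.argmin(np.abs(M4[:, 0, :] - r).sum(axis=1))) for r in M4_first]
print(sorted(np.real(eigvals)), np.allclose(M3_hat, M3[perm]))
```

For these parameter values, formula (6) gives the eigenvalues $0.25$ and $3.5$, so the distinctness condition (7) holds. The recovered parameters agree with the original ones only after applying the relabeling `perm`, which reflects identifiability up to label swapping.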

The general case of ${n}_{0}\le {n}_{2},{n}_{4}$ is handled by considering subarrays, just as in the proof of the preceding theorem. □
