Open Access Published by De Gruyter Open Access September 5, 2022

# Regularity of models associated with Markov jump processes

• Wissem Jedidi
From the journal Open Mathematics

## Abstract

We consider a jump Markov process $X = (X_t)_{t \ge 0}$, with values in a state space $(E, \mathcal{E})$. We suppose that the corresponding infinitesimal generator $\pi_\theta(x, dy)$, $x \in E$, hence the law $\mathbb{P}_{x,\theta}$ of $X$, depends on a parameter $\theta \in \Theta$. We prove that several models (filtered or not) associated with $X$ are linked by their regularity, according to a certain scheme. In particular, we show that the regularity of the model $(\pi_\theta(x, dy))_{\theta \in \Theta}$ is equivalent to the local regularity of $(\mathbb{P}_{x,\theta})_{\theta \in \Theta}$.

MSC 2010: 65C20; 62M20

## 1 Introduction and main results

Jump Markov processes have found application in Bayesian statistics, chemistry, economics, information theory, finance, physics, population dynamics, speech processing, signal processing, statistical mechanics, traffic modeling, thermodynamics, and many other fields [1]. Regularity plays a significant role in classical asymptotic statistics for parametric statistical models of jump Markov processes; see [2,3,4] for recent developments. Asymptotic normality or Bernstein-von Mises-type theorems impose several regularity conditions so that their results hold rigorously. In this article, we focus on the regularity conditions of several statistical models associated with a jump Markov process $X$ with values in an arbitrary state space $E$, endowed with a $\sigma$-field $\mathcal{E}$. Let $\Omega$ be the canonical space of piecewise constant functions $\omega : \mathbb{R}_+ \to E$, right continuous for the discrete topology. Let $X = (X_t)_{t \ge 0}$ be the canonical process, $(\mathcal{F}_t)_{t \ge 0}$ the canonical filtration, and $\mathcal{F} = \bigvee_{t \ge 0} \mathcal{F}_t$. Let $T_0 = 0$ and $(T_n)_{n \ge 0}$ be the sequence of the jump times of $X$, which increase almost surely to $\infty$. To each $\theta \in \Theta \subset \mathbb{R}^d$ and $x \in E$, we associate

$$\mu_\theta : E \to (0, \infty), \text{ an } \mathcal{E}\text{-measurable function}, \qquad Q_\theta(x, dy), \text{ a Markov kernel (also called a transition probability) on } E.$$

We assume that, under $\mathbb{P}_{x,\theta}$, the process $(X_t)_{t \ge 0}$ is Markovian, starts from $x \in E$, is non-exploding, and admits the infinitesimal generator

$$\pi_\theta(x, dy) = \mu_\theta(x)\, Q_\theta(x, dy).$$

The existence of the probabilities $\mathbb{P}_{x,\theta}$ is guaranteed, for instance, by the boundedness of the function $\mu_\theta$. The following facts clarify our focus on the different statistical models that will be presented later on:

• under $\mathbb{P}_{x,\theta}$, and conditionally on $\mathcal{F}_{T_{n-1}}$, the distribution of $T_n - T_{n-1}$ is exponential with parameter $\mu_\theta(X_{T_{n-1}})$;

• $Q_\theta(x, dy) = \mathbb{P}_{x,\theta}(X_{T_1} \in dy)$ is the transition probability of the embedded Markov chain $(X_{T_n})_{n \ge 0}$;

• $\bar{Q}_\theta^k(x, dy, dt)$, $k \in \mathbb{N}$, is the distribution of $(X_{T_k}, T_k)$ under $\mathbb{P}_{x,\theta}$;

• The associated sub-Markovian transition kernels $(P_t^\theta)_{t \ge 0}$ satisfy the backward Kolmogorov equations:

$$\frac{\partial}{\partial t} P_t^\theta(x, A) = \int_E \bigl( P_t^\theta(y, A) - P_t^\theta(x, A) \bigr)\, \pi_\theta(x, dy), \quad s, t \ge 0,\ x \in E,\ A \in \mathcal{E}. \tag{1}$$

The Markov process $X$ is simple if $\pi_\theta(x, dy)$ is a Markov kernel, i.e., for every $x \in E$, $\pi_\theta(x, dy)$ is a probability measure on $(E, \mathcal{E})$. In this case, the transition functions are also Markov and satisfy the Chapman-Kolmogorov equation

$$P_{s+t}^\theta(x, A) = \int_E P_t^\theta(y, A)\, P_s^\theta(x, dy), \quad x \in E,\ A \in \mathcal{E},$$

and

$$\mathbb{P}_{x,\theta}(X_{s+t} \in A \mid \mathcal{F}_t) = P_s^\theta(X_t, A), \quad \mathbb{P}_{x,\theta}\text{-almost surely};$$

see [5] for a fuller account.

• The multivariate point process associated with the process $(X_t)_{t \ge 0}$ is

$$\lambda(\omega, dt, dy) = \sum_{k \ge 1} \varepsilon_{(T_k, X_{T_k})}(dt, dy),$$

and its compensator, under $\mathbb{P}_{x,\theta}$, is

$$\nu_\theta(\omega, dt, dy) = \pi_\theta(X_{t-}, dy)\, \mathbf{1}_{\mathbb{R}_+}(t)\, dt;$$

see Höpfner et al. [6], for instance.

Note that in our study, we will use neither the transition functions nor the multivariate point process and its compensator. In fact, we aim to show that the regularities of the following statistical models are linked to one another, according to a certain scheme:

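$$\mathcal{P}_x = \bigl(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, (\mathbb{P}_{x,\theta})_{\theta \in \Theta}\bigr) = \text{the filtered model associated with } (X_t)_{t \ge 0}, \tag{2}$$

$$\mathbb{E}_x = \bigl(E, \mathcal{E}, (\pi_\theta(x, dy))_{\theta \in \Theta}\bigr) = \text{the model associated with the generator of } (X_t)_{t \ge 0}, \tag{3}$$

$$\mathbb{E}'_x = \bigl(E, \mathcal{E}, (Q_\theta(x, dy))_{\theta \in \Theta}\bigr) = \text{the model associated with the observation of } X_{T_1}, \tag{4}$$

$$\mathbb{E}_x^k = \bigl(E \times \mathbb{R}_+, \mathcal{E} \otimes \mathcal{B}_{\mathbb{R}_+}, (\bar{Q}_\theta^k(x, dy, dt))_{\theta \in \Theta}\bigr) = \text{the model associated with the observation of } (X_{T_k}, T_k). \tag{5}$$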
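The dynamics behind these models (exponential holding times with parameter $\mu_\theta$, then a move drawn from the embedded chain $Q_\theta$) can be illustrated by a minimal simulation sketch. It assumes a finite state space; the dictionary encoding of `mu` and `Q`, the function name, and the two-state example are hypothetical choices made here for illustration and do not come from the article.

```python
import random

def simulate_jump_process(x0, mu, Q, t_max, rng):
    """Simulate one path of a jump Markov process on a finite state space.

    mu: dict state -> jump rate mu_theta(x) > 0
    Q:  dict state -> dict next_state -> probability (embedded chain kernel)
    Returns [(T_0, X_0), (T_1, X_{T_1}), ...] with all jump times <= t_max.
    """
    t, x = 0.0, x0
    path = [(t, x)]
    while True:
        # Holding time in x: exponential with parameter mu(x).
        t += rng.expovariate(mu[x])
        if t > t_max:
            return path
        # Next state drawn from the embedded chain Q(x, .).
        states = list(Q[x].keys())
        weights = list(Q[x].values())
        x = rng.choices(states, weights=weights, k=1)[0]
        path.append((t, x))

# Hypothetical two-state example (values chosen for illustration only).
mu = {"a": 1.0, "b": 2.0}
Q = {"a": {"b": 1.0}, "b": {"a": 1.0}}
path = simulate_jump_process("a", mu, Q, t_max=5.0, rng=random.Random(0))
```

The pairs after the first entry of `path` are exactly observations of the type $(X_{T_k}, T_k)$ (stored as `(T_k, X_{T_k})`) appearing in model (5).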

The model $\mathbb{E}_x$ in (3) is not a proper statistical model since $(\pi_\theta(x, dy))_{\theta \in \Theta}$ is not a family of probability measures. Nevertheless, the notion of regularity extends to models associated with families of finite positive measures, as follows. Let $(R_\theta)_{\theta \in \Theta}$ be a family of finite positive measures on $(E, \mathcal{E})$. For $\theta, \xi \in \Theta$, we denote by $\Pi^{\theta,\xi}$ a measure that dominates $R_0$, $R_\theta$, and $R_\xi$, and by $z^{\theta,\xi}$, $z^{\xi,\theta}$, and $Z^{\theta,\xi}$ the Radon-Nikodym derivatives, respectively, of $R_\theta$ and $R_\xi$ with respect to $\Pi^{\theta,\xi}$ and of $R_\theta$ with respect to $R_\xi$. The Lebesgue decomposition of $R_\theta$ with respect to $R_\xi$ is given by the pair $(N^{\theta,\xi}, Z^{\theta,\xi})$,

$$N^{\theta,\xi} = \{ u : z^{\xi,\theta}(u) = 0 \}, \qquad Z^{\theta,\xi} = \begin{cases} \dfrac{z^{\theta,\xi}}{z^{\xi,\theta}}, & \text{outside } N^{\theta,\xi}, \\ 0, & \text{on } N^{\theta,\xi}. \end{cases}$$

We start by recalling the notion of “error functions” which was introduced in [6] as follows.

## Definition 1

A function $f : [0, \infty) \to [0, \infty)$ is called an error function if $\lim_{u \to 0} f(u) = 0$. More generally, an error function is any positive function $f : E \times (0, \infty) \to [0, \infty)$ such that

$$f(\cdot, u) \text{ is } \mathcal{E}\text{-measurable}, \ \forall u > 0, \quad \text{and} \quad \lim_{u \to 0} f(x, u) = 0, \ \forall x \in E.$$

## Definition 2

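(Regularity of non-filtered models). The model $(E, \mathcal{E}, (R_\theta)_{\theta \in \Theta})$ is regular at $\theta = 0$ if the random function

$$\Theta \to L^2(R_0), \qquad \theta \mapsto \sqrt{Z^{\theta,0}},$$

is differentiable at $\theta = 0$, i.e., there exist a random vector $V = (V^i)_{1 \le i \le d}$ and an error function $f : [0, \infty) \to [0, \infty)$ such that

$$R_\theta(N^{\theta,0}) + \Bigl\| \sqrt{Z^{\theta,0}} - 1 - \frac{1}{2} V \cdot \theta \Bigr\|_{L^2(R_0)}^2 \le |\theta|^2 f(|\theta|). \tag{6}$$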

Note that if the model $(E, \mathcal{E}, (R_\theta)_{\theta \in \Theta})$ is regular, then $V$ is necessarily $R_0$-square integrable. Furthermore, if $(R_\theta)_{\theta \in \Theta}$ is a family of probability measures, then $\mathrm{E}_{R_0}(V) = 0$. The Hellinger integral of order $\frac{1}{2}$ between the measures $R_\theta$ and $R_\xi$ is defined by

$$H^{\theta,\xi} \coloneqq \Pi^{\theta,\xi}\bigl( \sqrt{z^{\theta,\xi}\, z^{\xi,\theta}} \bigr)$$

and is independent of the dominating measure $\Pi^{\theta,\xi}$. The regularity of the model $(E, \mathcal{E}, (R_\theta)_{\theta \in \Theta})$ is equivalent to either of the following two assertions:

1. There exist an error function $f_1$ and a random vector $V$ (the same as before) such that

$$\Bigl\| \sqrt{z^{\theta,0}} - \sqrt{z^{0,\theta}} - \frac{1}{2} \sqrt{z^{0,\theta}}\, V \cdot \theta \Bigr\|_{L^2(\Pi^{\theta,0})} \le |\theta|\, f_1(|\theta|). \tag{7}$$

2. There exist an error function $f_2$ and a matrix $I = [I^{ij}]_{1 \le i, j \le d}$ such that

$$\Bigl| H^{0,0} + H^{\theta,\xi} - H^{0,\theta} - H^{0,\xi} - \frac{1}{4} \theta^{\top} I\, \xi \Bigr| \le |\theta|\, |\xi|\, f_2(|\theta| \vee |\xi|).$$

The matrix $I$ is positive definite and is called the Fisher information matrix of the model at $\theta = 0$. It is linked to the vector $V$ by

$$I^{ij} = R_0(V^i V^j),$$

cf. [6,7].
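The second characterization can be checked numerically on a toy example. The sketch below assumes a hypothetical one-parameter family of finite positive measures on a three-point space, $R_\theta(\{i\}) = w_i e^{\theta a_i}$ (the weights `w` and exponents `a` are invented for illustration). For this family $\sqrt{Z^{\theta,0}} = e^{\theta a/2}$, so $V = a$ and $I = R_0(V^2)$, and the second difference of Hellinger integrals divided by $\theta \xi$ should be close to $I/4$ for small parameters.

```python
import math

# Hypothetical family of finite positive measures on E = {0, 1, 2}:
# R_theta({i}) = w[i] * exp(theta * a[i]), with scalar theta (d = 1).
w = [0.5, 1.0, 2.0]
a = [-1.0, 0.3, 0.8]

def hellinger(theta, xi):
    """Hellinger integral of order 1/2 between R_theta and R_xi."""
    return sum(wi * math.exp(0.5 * (theta + xi) * ai) for wi, ai in zip(w, a))

# Here sqrt(Z^{theta,0}) = exp(theta * a / 2), so V = a and I = R_0(V^2).
fisher = sum(wi * ai * ai for wi, ai in zip(w, a))

def second_difference(theta, xi):
    """H^{0,0} + H^{theta,xi} - H^{0,theta} - H^{0,xi}."""
    return (hellinger(0.0, 0.0) + hellinger(theta, xi)
            - hellinger(0.0, theta) - hellinger(0.0, xi))

theta, xi = 1e-3, -2e-3
ratio = second_difference(theta, xi) / (theta * xi)  # close to fisher / 4
```

Note that $R_0$ is not a probability measure here, which is precisely the situation of the generator model (3).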

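Let $(\Omega, \mathcal{F})$ be a sample space endowed with a filtration $(\mathcal{F}_t)_{t \ge 0}$ and a family of probability measures $(\mathbb{P}_\theta)_{\theta \in \Theta}$ coinciding on $\mathcal{F}_0$. The regularity of the statistical filtered model

$$\bigl(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, (\mathbb{P}_\theta)_{\theta \in \Theta}\bigr) \tag{8}$$

mimics the one in Definition 2 and is expressed in terms of likelihood processes [7,8]. For a clear presentation, we need to introduce the likelihood process of $\mathbb{P}_\theta$ with respect to $\mathbb{P}_\xi$, $\theta, \xi \in \Theta$, defined in Jacod and Shiryaev's book [9] by

$$Z_t^{\theta,\xi} = \frac{d\mathbb{P}_\theta|_{\mathcal{F}_t}}{d\mathbb{P}_\xi|_{\mathcal{F}_t}}, \quad t \ge 0.$$

The process $Z^{\theta,\xi}$ is a positive $(\mathbb{P}_\xi, \mathcal{F}_t)$-supermartingale and is a martingale if

$$\mathbb{P}_\theta \overset{\mathrm{loc}}{\ll} \mathbb{P}_\xi \quad \bigl(\text{i.e., if } \mathbb{P}_\theta|_{\mathcal{F}_t} \ll \mathbb{P}_\xi|_{\mathcal{F}_t}, \ \forall t \ge 0\bigr).$$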

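For any probability measure $K^{\theta,\xi}$ locally dominating $\mathbb{P}_\theta$ and $\mathbb{P}_\xi$, the $(K^{\theta,\xi}, \mathcal{F}_t)$-martingales

$$z_t^{\theta,\xi} = \frac{d\mathbb{P}_\theta|_{\mathcal{F}_t}}{dK^{\theta,\xi}|_{\mathcal{F}_t}}, \qquad z_t^{\xi,\theta} = \frac{d\mathbb{P}_\xi|_{\mathcal{F}_t}}{dK^{\theta,\xi}|_{\mathcal{F}_t}}$$

and the stopping times

$$\tau^{\theta,\xi} = \inf\{ t \ge 0 \text{ s.t. } z_t^{\theta,\xi} = 0 \}, \qquad \tau^{\xi,\theta} = \inf\{ t \ge 0 \text{ s.t. } z_t^{\xi,\theta} = 0 \}$$

provide this version of $Z^{\theta,\xi}$:

$$Z_t^{\theta,\xi} = \begin{cases} \dfrac{z_t^{\theta,\xi}}{z_t^{\xi,\theta}}, & \text{if } t < \tau^{\theta,\xi} \wedge \tau^{\xi,\theta}, \\ 0, & \text{if } t \ge \tau^{\theta,\xi} \wedge \tau^{\xi,\theta}. \end{cases}$$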

As for non-filtered models, we have the following definition.

## Definition 3

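(Regularity of filtered models). Let $T$ be a stopping time relative to $(\mathcal{F}_t)_{t \ge 0}$. The model $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, (\mathbb{P}_\theta)_{\theta \in \Theta})$ is said to be regular (or differentiable) at time $T$ and at $\theta = 0$ if the model $(\Omega, \mathcal{F}_T, (\mathbb{P}_\theta)_{\theta \in \Theta})$ is regular in the sense of Definition 2. This means that there exist an $\mathcal{F}_T$-measurable, $\mathbb{P}_0$-square-integrable, and centered random vector $V_T = [V_T^i]_{1 \le i \le d}$ and two error functions $f_{1,T}$, $f_{2,T}$ such that

$$\mathrm{E}_{\mathbb{P}_0}\bigl[ 1 - Z_T^{\theta,0} \bigr] \le |\theta|^2 f_{1,T}(|\theta|) \tag{9}$$

and

$$\mathrm{E}_{\mathbb{P}_0}\Bigl[ \Bigl( \sqrt{Z_T^{\theta,0}} - 1 - \frac{1}{2} \theta \cdot V_T \Bigr)^2 \Bigr] \le |\theta|^2 f_{2,T}(|\theta|). \tag{10}$$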

As in Definition 2 and according to [7, point 3.12], the regularity of the model is equivalent to the existence of a positive definite $d \times d$ matrix $J_T$ and of an error function $f_T$ such that

$$\Bigl| H_T^{0,0} + H_T^{\theta,\xi} - H_T^{0,\theta} - H_T^{0,\xi} - \frac{1}{4} \theta^{\top} J_T\, \xi \Bigr| \le |\theta|\, |\xi|\, f_T(|\theta| \vee |\xi|),$$

where

$$H_T^{\theta,\xi} = \mathrm{E}_{K^{\theta,\xi}}\Bigl[ \sqrt{z_T^{\theta,\xi}\, z_T^{\xi,\theta}} \Bigr] = \mathrm{E}_{\mathbb{P}_0}\Bigl[ \sqrt{Z_T^{\theta,\xi}\, Z_T^{\xi,\theta}}\; \mathbf{1}_{(T < \tau^\theta \wedge \tau^\xi)} \Bigr]$$

is the Hellinger integral of order $\frac{1}{2}$ at time $T$, which is independent of the choice of the dominating probability measure. The Fisher information matrix of the model is then

$$J_T = \bigl[ \mathrm{E}_{\mathbb{P}_0}[ V_T^i V_T^j ] \bigr]_{1 \le i, j \le d}.$$

It is worth noting that regularity at a time $T$ implies regularity at any stopping time $S \le T$. In particular, if the regularity holds along a sequence $S_p$, $p \in \mathbb{N}$, increasing to infinity, then there exists a local martingale $(V_t)_{t \ge 0}$, locally square-integrable and null at zero, such that if $T \le S_p$ for some $p$, then $V_T$ is a version of the random variable in (10). Moreover, if (9) and (10) are satisfied for all $t \ge 0$, then $(V_t)_{t \ge 0}$ is a square-integrable martingale, null at 0, cf. [7, Corollary 3.16].

We are now able to introduce the notion of local regularity, which is less restrictive than the preceding one.

## Definition 4

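(Local regularity of filtered models). A sequence $(S_p)_{p \in \mathbb{N}}$ of stopping times is called a localizing sequence if it is $\mathbb{P}_0$-almost surely increasing to $\infty$. A localizing family is a family of pairs $(S_p, S_{n,p})_{p \in \mathbb{N},\, n \ge 1}$, where $(S_p)_{p \in \mathbb{N}}$ is a localizing sequence and $(S_{n,p})_{n \ge 1}$ is a sequence of stopping times satisfying

$$S_{n,p} \le S_p \quad \text{and} \quad \lim_{n \to \infty} \mathbb{P}_0(S_{n,p} < S_p) = 0. \tag{11}$$

The model (8) is said to be locally regular (or locally differentiable) at $\theta = 0$ if there exists a right-continuous process with left limits $(V_t)_{t \ge 0}$, with values in $\mathbb{R}^d$, such that, for all $(\theta_n, \theta)$ satisfying

$$\lim_{n \to \infty} \theta_n = 0 \quad \text{and} \quad \lim_{n \to \infty} \frac{\theta_n}{|\theta_n|} = \theta, \tag{12}$$

there exists a localizing family $(S_p, S_{n,p})_{p \in \mathbb{N},\, n \ge 1}$ satisfying

$$\lim_{n \to \infty} \frac{\mathrm{E}_{\mathbb{P}_0}\bigl[ 1 - Z_{S_{n,p}}^{\theta_n,0} \bigr]}{|\theta_n|^2} = 0, \quad \forall p \in \mathbb{N}, \tag{13}$$

and

$$\frac{\sqrt{Z_{t \wedge S_{n,p}}^{\theta_n,0}} - 1}{|\theta_n|} \xrightarrow{L^2(\mathbb{P}_0)} \frac{1}{2} \theta \cdot V_{t \wedge S_p}, \quad \text{as } n \to +\infty, \ \forall p \in \mathbb{N}, \ \forall t \ge 0. \tag{14}$$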

Note that if the model is regular along a localizing sequence, then it is necessarily locally regular. By [7, Theorem 4.6], the process $(V_t)_{t \ge 0}$ is a locally square-integrable $(\mathbb{P}_0, \mathcal{F}_t)$-local martingale, and the Fisher information process $(I_t)_{t \ge 0}$ at $\theta = 0$ is defined as the predictable quadratic covariation of $(V_t)_{t \ge 0}$:

$$I_t \coloneqq [I_t^{ij}]_{1 \le i, j \le d} = \bigl[ \langle V^i, V^j \rangle_t \bigr]_{1 \le i, j \le d}.$$

Local regularity does not guarantee the integrability of $I$; however, it is the minimal condition we require to obtain the property of local asymptotic normality (LAN) for statistical models. In this case, the Fisher information quantities provide the lower bound of the variance of any estimator of the unknown parameters intervening in the models, see [10,11,12] for instance.

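According to [7, Theorem 6.2], local regularity is equivalent to the following two conditions:

$$\frac{1}{|\theta|\, |\xi|} \operatorname{Var}\Bigl\{ h^{0,0} + h^{\theta,\xi} - h^{0,\theta} - h^{0,\xi} + \frac{1}{4} \theta^{\top} I\, \xi \Bigr\}_t \xrightarrow{\mathbb{P}_0} 0, \quad \text{as } \theta, \xi \to 0, \ \forall t \ge 0, \tag{15}$$

and, for all $t \ge 0$,

$$\frac{A_t^\theta}{|\theta|^2} \xrightarrow{\mathbb{P}_0} 0, \quad \text{as } \theta \to 0, \tag{16}$$

where

• $(\operatorname{Var}\{\cdot\}_t)_{t \ge 0}$ is the variation process of $\{\cdot\}$.

• $(h_t^{\theta,\xi})_{t \ge 0}$ is a version of the Hellinger process of order $\frac{1}{2}$ between $\mathbb{P}_\theta$ and $\mathbb{P}_\xi$, i.e., a predictable nondecreasing process, null at zero, such that

$$\sqrt{z^\theta z^\xi} + \bigl[ \sqrt{z^\theta z^\xi} \bigr]_{-} \cdot h^{\theta,\xi} \quad \text{is a } (K^{\theta,\xi}, \mathcal{F}_t)\text{-martingale}. \tag{17}$$

• $(A_t^\theta)_{t \ge 0}$ is the predictable nondecreasing process intervening in the Doob-Meyer decomposition of the supermartingale $(Z_t^\theta)_{t \ge 0}$. Since $\mathbb{P}_\theta$ and $\mathbb{P}_0$ coincide on $\mathcal{F}_0$, necessarily $Z_0^\theta = 1$, and there exists a $(\mathbb{P}_0, \mathcal{F}_t)$-local martingale $(M_t^\theta)_{t \ge 0}$ such that

$$Z^\theta = 1 + M^\theta - A^\theta.$$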

The results that we obtain complete those of Höpfner et al. [6], who proved that if $(\pi_\theta(x, dy))_{\theta \in \Theta}$ is regular and if the process $X$ satisfies a condition of positive recurrence (resp. null recurrence), then the model $(\mathbb{P}_{x,\theta})_{\theta \in \Theta}$ localized around the parameter $\theta = 0$ is locally asymptotically normal (resp. locally asymptotically mixed normal). The main result is as follows.

## Theorem 5

The filtered model $\mathcal{P}_x$ in (2) is locally regular for all $x \in E$ if, and only if, the generator model $\mathbb{E}_y$ in (3) is regular for all $y \in E$.

Models (2)–(5) are described in depth in Section 2. We also provide a full scheme linking them by their regularity, see Theorems 6–8 and 10. The proofs are given in Section 3.

## 2 Additional regularity properties

Our notations and the calculus of the Hellinger integrals and of the likelihood processes are borrowed from Höpfner [13] and Höpfner et al. [6]. For $x \in E$ and $\theta, \xi \in \Theta$, the following measures will be used in the sequel.

1. $\Pi_x^{\theta,\xi}(dy)$ is a measure dominating $\pi_\theta(x, dy)$, $\pi_\xi(x, dy)$, and $\pi_0(x, dy)$. Thus, $\Pi_x^{\theta,\xi}(dy)$ also dominates $Q_\theta(x, dy)$, $Q_\xi(x, dy)$, and $Q_0(x, dy)$;

2. $Q_x^{\theta,\xi}(dy, dt)$ is a transition probability on $E \times \mathbb{R}_+$ dominating $\bar{Q}_\theta(x, dy, dt)$, $\bar{Q}_\xi(x, dy, dt)$, and $\bar{Q}_0(x, dy, dt)$;

3. $K_x^{\theta,\xi}$ is a probability measure locally dominating $\mathbb{P}_{x,\theta}$, $\mathbb{P}_{x,\xi}$, and $\mathbb{P}_{x,0}$;

4. $\Pi_x^\theta(dy) = \Pi_x^{\theta,0}(dy)$, $Q_x^\theta(dy, dt) = Q_x^{\theta,0}(dy, dt)$, and $K_x^\theta = K_x^{\theta,0}$.

The Radon-Nikodym derivatives are denoted by

$$\begin{aligned}
\chi_{\theta,\xi}(x, \cdot) &= \frac{d\pi_\theta(x, \cdot)}{d\Pi_x^{\theta,\xi}(\cdot)}, & \rho_{\theta,\xi}(x, \cdot) &= \frac{dQ_\theta(x, \cdot)}{d\Pi_x^{\theta,\xi}(\cdot)}, & \rho_{\theta,\xi}^1(x, \cdot, \cdot) &= \frac{d\bar{Q}_\theta(x, \cdot, \cdot)}{dQ_x^{\theta,\xi}(\cdot, \cdot)}, \\
\chi_\theta(x, \cdot) &= \chi_{\theta,0}(x, \cdot), & \rho_\theta(x, \cdot) &= \rho_{\theta,0}(x, \cdot), & \rho_\theta^1(x, \cdot, \cdot) &= \rho_{\theta,0}^1(x, \cdot, \cdot), \\
\chi_0(x, \cdot) &= \chi_{0,\theta}(x, \cdot), & \rho_0(x, \cdot) &= \rho_{0,\theta}(x, \cdot), & \rho_0^1(x, \cdot, \cdot) &= \rho_{0,\theta}^1(x, \cdot, \cdot).
\end{aligned}$$

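If we choose

$$\Pi_x^{\theta,\xi}(dy) = \pi_\theta(x, dy) + \pi_\xi(x, dy) + \pi_0(x, dy),$$

and if $K_x^{\theta,\xi}$ is the probability under which the canonical process $(X_t)_{t \ge 0}$ starts from $x$ and has the infinitesimal generator $\Pi_x^{\theta,\xi}(dy)$, then we have

$$\mathbb{P}_{x,0} \overset{\mathrm{loc}}{\ll} K_x^{\theta,\xi}, \qquad \mathbb{P}_{x,\theta} \overset{\mathrm{loc}}{\ll} K_x^{\theta,\xi}, \qquad \text{and} \qquad \mathbb{P}_{x,\xi} \overset{\mathrm{loc}}{\ll} K_x^{\theta,\xi}.$$

With the convention $\prod_{j=1}^{0} = 1$, a version of the likelihood process of $\mathbb{P}_{x,\theta}$ with respect to $K_x^{\theta,\xi}$ and to $(\mathcal{F}_t)_{t \ge 0}$ is given in [6] by

$$z_t^{\theta,\xi} = \frac{d\mathbb{P}_{x,\theta}|_{\mathcal{F}_t}}{dK_x^{\theta,\xi}|_{\mathcal{F}_t}} = \Bigl\{ \prod_{j \ge 1 :\, T_j \le t} \chi_{\theta,\xi}(X_{T_{j-1}}, X_{T_j}) \Bigr\} \exp\Bigl( \int_0^t \int_E (1 - \chi_{\theta,\xi})(X_s, y)\, \Pi_{X_s}^{\theta,\xi}(dy)\, ds \Bigr).$$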

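With the notations

$$z_t^\theta \coloneqq z_t^{\theta,0}, \qquad z_t^0 \coloneqq z_t^{0,\theta}, \qquad \text{and} \qquad \tau^\theta \coloneqq \inf\{ t \ge 0 : z_t^\theta = 0 \},$$

a version of the likelihood process of $\mathbb{P}_{x,\theta}$ relative to $\mathbb{P}_{x,0}$ and $(\mathcal{F}_t)_{t \ge 0}$ is explicitly given by

$$Z_t^\theta = \frac{z_t^\theta}{z_t^0} = \begin{cases} \exp\Bigl( \displaystyle\int_0^t (\mu_0 - \mu_\theta)(X_s)\, ds \Bigr) \displaystyle\prod_{j \ge 1 :\, T_j \le t} \frac{\chi_\theta}{\chi_0}(X_{T_{j-1}}, X_{T_j}), & \text{if } t < \tau^0 \wedge \tau^\theta, \\ 0, & \text{if } t \ge \tau^0 \wedge \tau^\theta. \end{cases}$$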
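For a finite state space, the ratio $\chi_\theta / \chi_0$ reduces to a ratio of jump rates, and the formula for $Z_t^\theta$ can be evaluated along an observed path. The following is a minimal sketch under that assumption; the nested-dictionary encoding of the rates $\pi_\theta(x, \{y\})$, the function name, and the sample path are hypothetical and serve only as an illustration. It also assumes that every observed jump has positive rate under both parameters, so that $t < \tau^0 \wedge \tau^\theta$.

```python
import math

def log_likelihood_ratio(path, t, q_theta, q_0):
    """log Z_t^theta for a piecewise-constant path on a finite state space.

    path: [(T_0, X_0), (T_1, X_{T_1}), ...] with T_0 = 0 and increasing jump times
    q_theta, q_0: dict x -> dict y -> jump rate pi(x, {y}) (hypothetical encoding)
    Assumes every observed jump has positive rate under both parameters.
    """
    mu_theta = {x: sum(r.values()) for x, r in q_theta.items()}
    mu_0 = {x: sum(r.values()) for x, r in q_0.items()}
    total = 0.0
    for k, (t_k, x_k) in enumerate(path):
        if t_k > t:
            break
        # End of the sojourn in x_k: next jump time or t, whichever comes first.
        t_next = path[k + 1][0] if k + 1 < len(path) else t
        total += (mu_0[x_k] - mu_theta[x_k]) * (min(t_next, t) - t_k)
        # Jump contribution, counted only if the jump happens before t.
        if k + 1 < len(path) and t_next <= t:
            y = path[k + 1][1]
            total += math.log(q_theta[x_k][y] / q_0[x_k][y])
    return total

# Illustrative data (not from the article).
path = [(0.0, "a"), (0.4, "b"), (1.1, "a")]
q_0 = {"a": {"b": 1.0}, "b": {"a": 2.0}}
q_theta = {"a": {"b": 1.5}, "b": {"a": 2.0}}
value = log_likelihood_ratio(path, 2.0, q_theta, q_0)
```

In this example only the rate out of state `"a"` differs between the two parameters, so the result combines the time spent in `"a"` with the single observed jump from `"a"` to `"b"`.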

The Hellinger integral of order $\frac{1}{2}$ between $\pi_\theta(x, dy)$ and $\pi_\xi(x, dy)$ is then

$$H^{\theta,\xi}(x) = \int_E \sqrt{\chi_{\theta,\xi}(x, y)\, \chi_{\xi,\theta}(x, y)}\; \Pi_x^{\theta,\xi}(dy),$$

and the Hellinger integral of order $\frac{1}{2}$ at time $t$ between $\mathbb{P}_{x,\theta}$ and $\mathbb{P}_{x,\xi}$, relative to the filtration $(\mathcal{F}_t)_{t \ge 0}$, is expressed by

$$H_t^{\theta,\xi}(x) = \mathrm{E}_{K_x^{\theta,\xi}}\Bigl[ \sqrt{z_t^{\theta,\xi}\, z_t^{\xi,\theta}} \Bigr]. \tag{18}$$

We also consider the quantities

$$\bar{H}^{\theta,\xi}(x) = \frac{\mu_\theta(x) + \mu_\xi(x)}{2} - H^{\theta,\xi}(x),$$

which are used to define the Hellinger process $(h_t^{\theta,\xi})_{t \ge 0}$ of order $\frac{1}{2}$ between $\mathbb{P}_{x,\theta}$ and $\mathbb{P}_{x,\xi}$, relative to $(\mathcal{F}_t)_{t \ge 0}$. It is expressed by

$$h_t^{\theta,\xi} = \int_0^t \bar{H}^{\theta,\xi}(X_s)\, ds. \tag{19}$$
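As a small worked identity (not stated in the text, but following directly from the definitions above and from the fact that $Q_\theta(x, \cdot)$ is a probability measure, so that $\pi_\theta(x, E) = \mu_\theta(x)$), the quantity $\bar{H}^{\theta,\xi}(x)$ can be rewritten as half a squared $L^2$-distance. This makes its nonnegativity, and hence the monotonicity of the process in (19), visible:

```latex
\begin{align*}
\frac12 \int_E \Bigl(\sqrt{\chi_{\theta,\xi}(x,y)} - \sqrt{\chi_{\xi,\theta}(x,y)}\Bigr)^2 \Pi_x^{\theta,\xi}(dy)
 &= \frac12 \int_E \chi_{\theta,\xi}\, d\Pi_x^{\theta,\xi}
  + \frac12 \int_E \chi_{\xi,\theta}\, d\Pi_x^{\theta,\xi}
  - \int_E \sqrt{\chi_{\theta,\xi}\,\chi_{\xi,\theta}}\, d\Pi_x^{\theta,\xi} \\
 &= \frac{\pi_\theta(x,E) + \pi_\xi(x,E)}{2} - H^{\theta,\xi}(x) \\
 &= \frac{\mu_\theta(x) + \mu_\xi(x)}{2} - H^{\theta,\xi}(x)
  = \bar H^{\theta,\xi}(x) \;\ge\; 0 .
\end{align*}
```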

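Finally, we define the function

$$g(x, \theta, \xi) \coloneqq \frac{1}{|\theta|\, |\xi|} \Bigl[ \bar{H}^{0,\theta}(x) + \bar{H}^{0,\xi}(x) - \bar{H}^{\theta,\xi}(x) - \frac{1}{4} \theta^{\top} I(x)\, \xi \Bigr], \tag{20}$$

where $I(x)$ is the Fisher information matrix of the generator model $\mathbb{E}_x$ in (3) at $\theta = 0$, whenever it is regular. Consequently, $\bar{H}^{0,\theta}(x)$ is expressed by

$$\bar{H}^{0,\theta}(x) = \frac{1}{8} \theta^{\top} I(x)\, \theta + \frac{1}{2} |\theta|^2 g(x, \theta, \theta). \tag{21}$$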
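Equation (21) follows from (20) by taking $\xi = \theta$. The short computation, which uses only the fact that $\bar{H}^{\theta,\theta}(x) = 0$, can be sketched as:

```latex
\begin{align*}
\bar H^{\theta,\theta}(x)
 &= \mu_\theta(x) - H^{\theta,\theta}(x)
  = \mu_\theta(x) - \pi_\theta(x,E) = 0, \\
|\theta|^2\, g(x,\theta,\theta)
 &= 2\,\bar H^{0,\theta}(x) - \bar H^{\theta,\theta}(x) - \tfrac14\, \theta^{\top} I(x)\, \theta
  = 2\,\bar H^{0,\theta}(x) - \tfrac14\, \theta^{\top} I(x)\, \theta, \\
\bar H^{0,\theta}(x)
 &= \tfrac18\, \theta^{\top} I(x)\, \theta + \tfrac12\, |\theta|^2\, g(x,\theta,\theta).
\end{align*}
```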

Observe that the function $g$ in (20) is such that the function

$$f(x, u) \coloneqq \sup_{|\theta|, |\xi| \le u} |g(x, \theta, \xi)|, \quad x \in E,$$

is nondecreasing in $u$ and satisfies $|g(x, \theta, \xi)| \le f(x, |\theta| \vee |\xi|)$. Thus, $f$ is a natural candidate for an error function.

We can now state a first technical but intuitive result.

## Theorem 6

Let $x \in E$. Then the following assertions are equivalent.

1. $\mathbb{E}_x$ (the generator model (3)) is regular;

2. $\mathbb{E}'_x$ (the model (4) of $X_{T_1}$) is regular and $\mu_\cdot(x)$ is differentiable at $\theta = 0$;

3. $\mathbb{E}_x^1$ (the model (5) with $k = 1$) is regular;

4. $\mathcal{P}_x$ (the filtered model (2)) is regular at time $T_1$.

In the three following theorems, we complete our results by studying the regularity of the filtered model $\mathcal{P}_x$ at fixed times $t > 0$, or at the jump times $T_k$, $k \in \mathbb{N}$. In this direction, we obtain only partial results, which rely on some additional integrability conditions.

## Theorem 7

Let $t > 0$. For all $x \in E$, assume the following.

Condition $A_t(x)$. There exist $u_t > 0$, an error function $f_1$, and a measurable function $f_2 : E \to [0, \infty)$ satisfying the following:

$$\bar{H}^{\theta,\xi}(x) \le |\theta - \xi|^2 f_2(x), \quad \text{if } |\theta|, |\xi| \le u_t,$$

and

$$\int_0^t \mathrm{E}_{K_x^\theta}\bigl[ f_1(X_s, u_t)^2 \bigr]\, ds < +\infty, \qquad \int_0^t \mathrm{E}_{K_x^{\theta,\xi}}\bigl[ f_2(X_s)^2 \bigr]\, ds < +\infty.$$

Then, the filtered model $\mathcal{P}_y$ in (2) is regular at time $t$, for all $y \in E$.

## Theorem 8

For all $x \in E$, assume the following. The generator model $\mathbb{E}_x$ in (3) is regular, and

Condition $B(x)$. The error function $f$ in (7), associated with the model $\mathbb{E}_x^1$, satisfies the following: there exists $r > 0$ such that

$$Q_x^\theta[f(\cdot, r)](x) = \int_{E \times \mathbb{R}_+} f(y, r)\, Q_x^\theta(x, dy, dt) < +\infty, \quad \text{if } |\theta| \le r.$$

Then, the model $\mathbb{E}_y^k$ in (5) is regular for all $y \in E$ and all $k \in \mathbb{N}$.

## Remark 9

1. The control in the first integral in condition $A_t(x)$ is exactly the required condition for the generator model $\mathbb{E}_x$ in (3) to be regular. The finiteness of the second integral ensures the integrability conditions needed in the proof of Theorem 7.

2. Equivalently, we could replace the error function in condition $B(x)$ by the one in (6). The integrability condition becomes

$$\bar{Q}_0[f(\cdot, r)](x) = \int_{E \times \mathbb{R}_+} f(y, r)\, \bar{Q}_0(x, dy, dt) < +\infty,$$

and the only difference is that we would have to check two inequalities instead of one.

We conclude with our last result.

## Theorem 10

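1. Let $x \in E$. If the filtered model $\mathcal{P}_x$ in (2) is regular at a time $t > 0$, then the generator model $\mathbb{E}_x$ in (3) is regular.

2. Furthermore, if $\mathcal{P}_x$ is regular at a time $t > 0$ for all $x \in E$, then $\mathcal{P}_y$ is locally regular for all $y \in E$.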

## 3 Proofs of the theorems

We will sometimes use the notion of isomorphism between two statistical models. Referring to Strasser's book [14], we say that two models $\mathbb{G} = (A, \mathcal{A}, (P_\theta)_{\theta \in \Theta})$ and $\mathcal{H} = (B, \mathcal{B}, (Q_\theta)_{\theta \in \Theta})$ are isomorphic if each is a randomization of the other. To illustrate this notion, assume for instance that $\mathbb{G}$ and $\mathcal{H}$ are, respectively, dominated by $P$ and $Q$. Then, the model $\mathcal{H}$ is randomized from $\mathbb{G}$ if there exists a Markovian operator $M : L(A, \mathcal{A}, P_\theta) \to L(B, \mathcal{B}, Q_\theta)$ such that

$$\frac{dQ_\theta}{dQ} = M\, \frac{dP_\theta}{dP}, \quad \forall \theta \in \Theta.$$

The models $\mathbb{G}$ and $\mathcal{H}$ are randomizations of each other if they are mutually exhaustive, which is always the case in our study each time an isomorphism holds, cf. [14, Lemma 23.5 and Theorem 24.11]. When computing expectations, these isomorphisms allow us to handle, at our convenience, either of the two likelihoods of the models $\mathbb{G}$ and $\mathcal{H}$. This is justified by the fact that they have the same law under the respective probability quotient.

## Proof of Theorem 6

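$(1) \Rightarrow (2)$: (a) By (7), the regularity of the generator model $\mathbb{E}_x$ at $\theta = 0$ is equivalent to the existence of a random vector $V(x, \cdot) \in L^2(\pi_0(x, dy))$ and of an error function $f_\chi$ such that

$$h(x, \theta) \coloneqq \int_E \Bigl( \sqrt{\chi_\theta(x, y)} - \sqrt{\chi_0(x, y)} - \frac{1}{2} \sqrt{\chi_0(x, y)}\; \theta \cdot V(x, y) \Bigr)^2 \Pi_x^\theta(dy) \le |\theta|^2 f_\chi(x, |\theta|). \tag{22}$$

The latter implies

$$\int_E \Bigl( \sqrt{\chi_\theta(x, y)} - \sqrt{\chi_0(x, y)} \Bigr)^2 \Pi_x^\theta(dy) \le |\theta|^2 \Bigl[ 2 f_\chi(x, |\theta|) + \frac{1}{2} \int_E |V(x, z)|^2\, \pi_0(x, dz) \Bigr] = |\theta|\, \tilde{f}_\chi(x, |\theta|), \quad \text{and } \tilde{f}_\chi \text{ is an error function}. \tag{23}$$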
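The step from (22) to (23) rests on the elementary inequality $(a + b)^2 \le 2a^2 + 2b^2$, together with $\chi_0(x, y)\, \Pi_x^\theta(dy) = \pi_0(x, dy)$ and the Cauchy-Schwarz inequality. Spelled out, with the abbreviations $a$ and $b$ introduced here only for this computation:

```latex
\begin{align*}
a &= \sqrt{\chi_\theta} - \sqrt{\chi_0} - \tfrac12 \sqrt{\chi_0}\; \theta\cdot V,
\qquad
b = \tfrac12 \sqrt{\chi_0}\; \theta\cdot V, \\
\int_E \bigl(\sqrt{\chi_\theta} - \sqrt{\chi_0}\bigr)^2 \, d\Pi_x^\theta
 &= \int_E (a+b)^2 \, d\Pi_x^\theta
 \;\le\; 2\int_E a^2 \, d\Pi_x^\theta + 2\int_E b^2 \, d\Pi_x^\theta \\
 &\le 2\,|\theta|^2 f_\chi(x,|\theta|)
   + \tfrac12 \int_E \bigl(\theta\cdot V(x,z)\bigr)^2 \, \pi_0(x,dz) \\
 &\le |\theta|^2 \Bigl[\, 2 f_\chi(x,|\theta|)
   + \tfrac12 \int_E |V(x,z)|^2 \, \pi_0(x,dz) \Bigr].
\end{align*}
```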

(b) The implication "$\mathbb{E}_x$ is regular at $\theta = 0$ $\Rightarrow$ $\mu_\cdot(x)$ is differentiable at $\theta = 0$" is shown in [6], using the fact that the differentiability of $\sqrt{\chi_\theta(x, \cdot)}$ in $L^2$ implies the differentiability of $\chi_\theta(x, \cdot)$ in $L^1$. Furthermore, the derivative at $\theta = 0$ of $\mu_\cdot(x)$ is

$$\int_E V(x, z)\, \pi_0(x, dz) = \mu_0(x) \int_E V(x, z)\, Q_0(x, dz),$$

which gives

$$\sqrt{\frac{\mu_0(x)}{\mu_\theta(x)}} = 1 - \frac{1}{2}\, \theta \cdot \int_E V(x, z)\, Q_0(x, dz) + |\theta|\, F_\mu(x, \theta), \tag{24}$$

where

$$f_\mu(x, u) \coloneqq \sup_{|\theta| \le u} |F_\mu(x, \theta)| \quad \text{is an error function}.$$

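(c) Let us define

$$h'(x, \theta) \coloneqq \int_E \Bigl( \sqrt{\rho_\theta(x, y)} - \sqrt{\rho_0(x, y)} - \frac{1}{2} \sqrt{\rho_0(x, y)}\; \theta \cdot V'(x, y) \Bigr)^2 \Pi_x^\theta(dy),$$

where the function

$$V'(x, y) \coloneqq V(x, y) - \int_E V(x, z)\, Q_0(x, dz) \in L^2(Q_0(x, dy)) \tag{25}$$

satisfies

$$\int_E V'(x, y)\, Q_0(x, dy) = 0.$$

Then, we can write

$$h'(x, \theta) = \int_E \Bigl( \sqrt{\frac{\chi_\theta(x, y)}{\mu_\theta(x)}} - \sqrt{\frac{\chi_0(x, y)}{\mu_0(x)}} - \frac{1}{2} \sqrt{\frac{\chi_0(x, y)}{\mu_0(x)}}\; \theta \cdot V'(x, y) \Bigr)^2 \Pi_x^\theta(dy),$$

and use (24) and (25) to obtain

$$\begin{aligned}
h'(x, \theta) &= \frac{1}{\mu_0(x)} \int_E \Bigl( \sqrt{\chi_\theta(x, y)} - \sqrt{\chi_0(x, y)} - \frac{1}{2} \sqrt{\chi_0(x, y)}\; \theta \cdot V(x, y) \\
&\qquad\qquad - \frac{1}{2}\, \theta \cdot \int_E V(x, z)\, Q_0(x, dz)\, \bigl\{ \sqrt{\chi_\theta(x, y)} - \sqrt{\chi_0(x, y)} \bigr\} + |\theta|\, F_\mu(x, \theta) \sqrt{\chi_\theta(x, y)} \Bigr)^2 \Pi_x^\theta(dy) \\
&\le \frac{3}{\mu_0(x)} \int_E \Bigl[ \Bigl( \sqrt{\chi_\theta(x, y)} - \sqrt{\chi_0(x, y)} - \frac{1}{2} \sqrt{\chi_0(x, y)}\; \theta \cdot V(x, y) \Bigr)^2 \\
&\qquad\qquad + \frac{1}{4} \Bigl| \theta \cdot \int_E V(x, z)\, Q_0(x, dz) \Bigr|^2 \times \Bigl( \sqrt{\chi_\theta(x, y)} - \sqrt{\chi_0(x, y)} \Bigr)^2 + |\theta|^2 \bigl( f_\mu(x, |\theta|) \bigr)^2 \chi_\theta(x, y) \Bigr] \Pi_x^\theta(dy).
\end{aligned}$$

Finally, according to (22) and (23), we have

$$h'(x, \theta) \le \frac{3}{\mu_0(x)} \Bigl[ |\theta|^2 f_\chi(x, |\theta|) + \frac{1}{4} \Bigl| \theta \cdot \int_E V(x, z)\, Q_0(x, dz) \Bigr|^2 |\theta|\, \tilde{f}_\chi(x, |\theta|) + \mu_\theta(x)\, |\theta|^2 \bigl( f_\mu(x, |\theta|) \bigr)^2 \Bigr] \le |\theta|^2 f_\rho(x, |\theta|),$$

where $f_\rho$ is an error function.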

$(2) \Rightarrow (1)$: (a) Under the condition of differentiability of $\mu_\cdot(x)$ at 0, with derivative $\mu'_0(x)$, we obtain

$$\sqrt{\mu_\theta(x)} = \sqrt{\mu_0(x)} \Bigl[ 1 + \frac{1}{2}\, \theta \cdot \frac{\mu'_0(x)}{\mu_0(x)} + |\theta|\, F_\mu(x, \theta) \Bigr], \tag{26}$$

where

$$f_\mu(x, u) = \sup_{|\theta| \le u} |F_\mu(x, \theta)| \quad \text{is an error function}.$$

(b) According to (7), the regularity of the model $\mathbb{E}'_x$ in (4), at $\theta = 0$, is equivalent to the existence of a centered vector $V'(x, \cdot) \in L^2(Q_0(x, dy))$ and of an error function $f_\rho$ such that

$$h'(x, \theta) = \int_E \Bigl( \sqrt{\rho_\theta(x, y)} - \sqrt{\rho_0(x, y)} - \frac{1}{2} \sqrt{\rho_0(x, y)}\; \theta \cdot V'(x, y) \Bigr)^2 \Pi_x^\theta(dy) \le |\theta|^2 f_\rho(x, |\theta|).$$

The vector $V(x, \cdot)$ is defined by

$$V(x, y) \coloneqq V'(x, y) + \frac{\mu'_0(x)}{\mu_0(x)},$$

which belongs to $L^2(\pi_0(x, dy))$ and satisfies

$$\mu'_0(x) = \int_E V(x, z)\, \pi_0(x, dz).$$

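(c) Let us define

$$h(x, \theta) = \int_E \Bigl( \sqrt{\chi_\theta(x, y)} - \sqrt{\chi_0(x, y)} - \frac{1}{2} \sqrt{\chi_0(x, y)}\; \theta \cdot V(x, y) \Bigr)^2 \Pi_x^\theta(dy). \tag{27}$$

By (26), we have

$$\begin{aligned}
h(x, \theta) &= \int_E \Bigl( \Bigl[ 1 + \frac{1}{2}\, \theta \cdot \int_E V(x, z)\, Q_0(x, dz) + |\theta|\, F_\mu(x, \theta) \Bigr] \sqrt{\mu_0(x)\, \rho_\theta(x, y)} \\
&\qquad\qquad - \sqrt{\mu_0(x)\, \rho_0(x, y)} - \frac{1}{2} \sqrt{\mu_0(x)\, \rho_0(x, y)}\; \theta \cdot V(x, y) \Bigr)^2 \Pi_x^\theta(dy) \\
&= \mu_0(x) \int_E \Bigl( \sqrt{\rho_\theta(x, y)} - \sqrt{\rho_0(x, y)} - \frac{1}{2} \sqrt{\rho_0(x, y)}\; \theta \cdot V'(x, y) \\
&\qquad\qquad + \frac{1}{2}\, \theta \cdot \int_E V(x, z)\, Q_0(x, dz)\, \bigl\{ \sqrt{\rho_\theta(x, y)} - \sqrt{\rho_0(x, y)} \bigr\} + |\theta|\, F_\mu(x, \theta) \sqrt{\rho_\theta(x, y)} \Bigr)^2 \Pi_x^\theta(dy).
\end{aligned}$$

With the same arguments as in $(1) \Rightarrow (2)$ (c), we retrieve

$$h(x, \theta) \le 3 \mu_0(x) \Bigl[ |\theta|^2 f_\rho(x, |\theta|) + \frac{1}{4} \Bigl| \theta \cdot \int_E V(x, z)\, Q_0(x, dz) \Bigr|^2 \times \Bigl( 2 |\theta|^2 f_\rho(x, |\theta|) + \frac{1}{2} \int_E |\theta \cdot V'(x, z)|^2\, Q_0(x, dz) \Bigr) + |\theta|^2 \mu_\theta(x)\, f_\mu(x, |\theta|) \Bigr].$$

Consequently, there exists an error function $f_\chi$ such that

$$h(x, \theta) \le |\theta|^2 f_\chi(x, |\theta|).$$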

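$(2) \Rightarrow (3)$: Let $x \in E$. Since

$$\bar{Q}_\theta(x, dy, dt) = Q_\theta(x, dy) \otimes \mu_\theta(x)\, e^{-\mu_\theta(x) t}\, \mathbf{1}_{\mathbb{R}_+}(t)\, dt$$

is the tensor product of two probability measures, the model $\mathbb{E}_x^1$ is statistically isomorphic to $\mathbb{E}'_x \times \mathbb{E}''_x$, where $\mathbb{E}'_x$ is the model (4) and

$$\mathbb{E}''_x = \bigl( \mathbb{R}_+, \mathcal{B}_{\mathbb{R}_+}, ( \mu_\theta(x)\, e^{-\mu_\theta(x) t}\, \mathbf{1}_{\mathbb{R}_+}(t)\, dt )_{\theta \in \Theta} \bigr). \tag{28}$$

The differentiability of $\mu_\cdot(x)$ at $\theta = 0$ is equivalent to the differentiability of the exponential model $\mathbb{E}''_x$. The assertion is then a consequence of [15, Corollary I.7.1] in Ibragimov and Has'minskii's book.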
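For the exponential model (28) the Hellinger integral has a closed form, which gives a quick numerical sanity check of the link between Hellinger integrals and Fisher information. The sketch below assumes a hypothetical scalar parametrization $\mu_\theta = \mu_0 e^{\theta}$ (chosen here only for illustration), for which $\mu'_0 / \mu_0 = 1$ and the Fisher information of the exponential model at $\theta = 0$ equals $(\mu'_0/\mu_0)^2 = 1$; the deficit $1 - H^{0,\theta}$ should then behave like $\theta^2 / 8$.

```python
import math

def hellinger_exponential(a, b):
    """Hellinger integral of order 1/2 between Exp(a) and Exp(b), rates a, b > 0.

    Closed form of the integral over t >= 0 of sqrt(a*exp(-a*t) * b*exp(-b*t)).
    """
    return 2.0 * math.sqrt(a * b) / (a + b)

# Hypothetical rate function mu_theta = mu_0 * exp(theta), scalar theta.
mu_0 = 3.0

def mu(theta):
    return mu_0 * math.exp(theta)

# For this choice mu'_0 / mu_0 = 1, so the Fisher information at 0 is 1.
theta = 1e-2
deficit = 1.0 - hellinger_exponential(mu(0.0), mu(theta))
ratio = deficit / theta**2  # expected to be close to 1/8
```

With this parametrization the Hellinger integral equals $1/\cosh(\theta/2)$, which does not depend on $\mu_0$.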

$(3) \Rightarrow (2)$: As in the preceding implication, observe that $\mathbb{E}_x^1$ is statistically isomorphic to $\mathbb{E}'_x \times \mathbb{E}''_x$, and the result becomes a simple consequence of [15, Theorem I.7.2].

$(3) \Leftrightarrow (4)$: This equivalence is deduced from the fact that $\bar{Q}_\theta(x, dy, dt)$ is the distribution of $(X_{T_1}, T_1)$, so that $\bar{Q}_\theta(x, dy, dt)$ is identified with $\mathbb{P}_{x,\theta}$ restricted to the $\sigma$-field $\mathcal{F}_{T_1}$. Thus, $\mathbb{E}_x^1$ and $(\Omega, \mathcal{F}_{T_1}, (\mathbb{P}_{x,\theta})_{\theta \in \Theta})$ are statistically isomorphic. □

## Proof of Theorem 5

(1) For the necessity condition, we will check (15) and (16), as was done for Markov chains in [7]. For fixed $x \in E$, we choose as dominating probability $K_x^{\theta,\xi}$ the one under which the process $(X_t)_{t \ge 0}$ has the generator

$$\Pi_x^{\theta,\xi}(dy) = \pi_\theta(x, dy) + \pi_\xi(x, dy) + \pi_0(x, dy).$$

(1)(a) Using the function $g$ in (20), we have

$$h_t^{0,\theta} + h_t^{0,\xi} - h_t^{\theta,\xi} - \frac{1}{4} \int_0^t \theta^{\top} I(X_s)\, \xi\, ds = |\theta|\, |\xi| \int_0^t g(X_s, \theta, \xi)\, ds.$$

Then, the convergence (15) holds if condition $A_t(y)$ is true for all $y \in E$, which, by Remark 9, is equivalent to the regularity of the generator model $\mathbb{E}_y$ for all $y \in E$, and the latter holds by assumption.

(1)(b) The Doob-Meyer decomposition of the supermartingale $Z^\theta$ asserts that

$$Z^\theta = 1 + M^\theta - A^\theta,$$

where $M^\theta$ is a local martingale and $A^\theta$ is a predictable nondecreasing process. Since the jump times of the process $(X_t)_{t \ge 0}$ are totally inaccessible, $Z^\theta$ is quasi-left-continuous and $A^\theta$ necessarily has continuous paths, cf. [16, Theorem 14]. From the decomposition of the additive functional $\log Z^\theta$, on the event $(t < \tau^0 \wedge \tau^\theta)$, into a local martingale $N^\theta$ and a process with finite variation $B^\theta$ (see [5, p. 40]), we may write $Z^\theta$ in the form

$$Z_t^\theta = \prod_{i \ge 1,\, T_i \le t} \frac{\chi_\theta}{\chi_0}(X_{T_{i-1}}, X_{T_i}) \exp\Bigl( \int_0^t (\mu_0 - \mu_\theta)(X_s)\, ds \Bigr) = e^{N_t^\theta + B_t^\theta},$$

where

$$N_t^\theta = \sum_{s \le t,\, X_{s-} \ne X_s} \log \frac{\chi_\theta}{\chi_0}(X_{s-}, X_s) - \int_0^t \int_E \log \frac{\chi_\theta}{\chi_0}(X_s, y)\, \pi_0(X_s, dy)\, ds,$$

$$B_t^\theta = \int_0^t \int_E \Bigl[ 1 - \frac{\mu_\theta}{\mu_0}(X_s) + \log \frac{\chi_\theta}{\chi_0}(X_s, y) \Bigr] \pi_0(X_s, dy)\, ds.$$
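As a consistency check on the two processes above (a sketch using only $\pi_0(x, E) = \mu_0(x)$), the compensating integrals of the logarithm cancel in the sum $N^\theta + B^\theta$, which recovers the exponent of $Z^\theta$:

```latex
\begin{align*}
N_t^\theta + B_t^\theta
 &= \sum_{s \le t,\ X_{s-} \ne X_s} \log\frac{\chi_\theta}{\chi_0}(X_{s-}, X_s)
  + \int_0^t\!\!\int_E \Bigl(1 - \frac{\mu_\theta}{\mu_0}(X_s)\Bigr)\, \pi_0(X_s, dy)\, ds \\
 &= \sum_{i \ge 1,\ T_i \le t} \log\frac{\chi_\theta}{\chi_0}(X_{T_{i-1}}, X_{T_i})
  + \int_0^t (\mu_0 - \mu_\theta)(X_s)\, ds
  \;=\; \log Z_t^\theta
  \quad \text{on } (t < \tau^0 \wedge \tau^\theta).
\end{align*}
```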

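Applying Itô's formula to the semimartingale $Z^\theta$, we obtain

$$\begin{aligned}
Z_t^\theta &= 1 + \int_0^t Z_{s-}^\theta\, dN_s^\theta + \int_0^t Z_{s-}^\theta\, dB_s^\theta + \sum_{s \le t} Z_{s-}^\theta \bigl( e^{\Delta N_s^\theta} - 1 - \Delta N_s^\theta \bigr) \\
&= 1 + \int_0^t Z_{s-}^\theta\, dN_s^\theta + \int_0^t Z_{s-}^\theta\, dB_s^\theta + \sum_{i \ge 1,\, T_i \le t} Z_{T_i-}^\theta \Bigl[ \frac{\chi_\theta}{\chi_0}(X_{T_{i-1}}, X_{T_i}) - 1 - \log \frac{\chi_\theta}{\chi_0}(X_{T_{i-1}}, X_{T_i}) \Bigr] \\
&= 1 + M_t^\theta - A_t^\theta,
\end{aligned}$$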

where

M t θ = 0 t Z s θ d N s θ + 0 t E Z s θ χ θ χ 0 ( X s , y ) 1 log χ θ χ 0 ( X s , y ) ( μ ν 0 ) ( , d s , d y ) , A t θ = 0 t Z s θ d B s θ 0 t E Z s θ