Metric Entropy of Nonautonomous Dynamical Systems

We introduce the notion of metric entropy for a nonautonomous dynamical system given by a sequence of probability spaces and a sequence of measure-preserving maps between these spaces. This notion generalizes the classical concept of metric entropy established by Kolmogorov and Sinai, and is related via a variational inequality to the topological entropy of nonautonomous systems as defined by Kolyada, Misiurewicz and Snoha. Moreover, it shares several properties with the classical notion of metric entropy. In particular, invariance with respect to appropriately defined isomorphisms, a power rule, and a Rokhlin-type inequality are proved.


Introduction
In the theory of dynamical systems, entropy is an invariant which measures the exponential complexity of the orbit structure of a system. Undoubtedly, the most important notions of entropy are metric entropy for measure-theoretic dynamical systems, sometimes also called Kolmogorov-Sinai entropy after its inventors, and topological entropy for topological systems (cf. Kolmogorov [12], Sinai [25] and Adler et al. [1]). There exists a huge variety of modifications and generalizations of these two basic notions. However, most of these only apply to systems which are governed by time-invariant dynamical laws, so-called autonomous dynamical systems. In the literature, one basically finds two exceptions. In the theory of random dynamical systems, which are nonautonomous dynamical systems described by measurable skew-products, both notions of entropy, metric and topological, have been defined and extensively studied (see, e.g., [3,7,17,18,27]). In particular, the classical variational principle which relates the two notions of entropy to each other has been adapted to their random versions by Bogenschütz [3]. The second exception is the quantity introduced in Kolyada and Snoha [13], the topological entropy of a nonautonomous system given as a discrete-time deterministic process on a compact topological space. The theory founded in [13] has been further developed in [9,10,14,20,22,26,28,29] by several authors. In some of these articles, the definition of entropy has been generalized, in particular to continuous-time systems, to systems with noncompact state space, systems with time-dependent state space, and to local processes. Besides that, there have been other independent approaches (see, e.g., [21,24]), which essentially lead to the same notion.
Both of the nonautonomous versions of entropy, random and deterministic, are intimately related to each other but nevertheless, one cannot draw direct conclusions from the well-developed random theory to the deterministic one except for generic statements (saying that something holds for almost every deterministic system in a large class of such systems parametrized by a random parameter).
The reason why the deterministic nonautonomous theory of entropy is still rather poorly developed lies, in particular, in the fact that the notion of metric entropy (together with a variational principle) has not yet been successfully established in that theory. To the best of my knowledge, the only approach in this direction can be found in Zhu et al. [28]. This work shows that one of the obstacles in establishing a reasonable notion of metric entropy which allows for a variational principle lies in the proof of the power rule which relates the entropies of the time-t maps (the powers of the system) to that of the time-one map. The aim of this paper is to introduce the notion of metric entropy for nonautonomous measure-theoretic dynamical systems together with a formalism which allows for a power rule and at least the easier part of the variational principle.
We briefly describe the contents of the paper. In Section 2, we recall the notion of topological entropy for a nonautonomous dynamical system as defined in [14] by Kolyada, Misiurewicz and Snoha. This notion of entropy generalizes the one in [13] by replacing the state space X (a compact metric space) by a whole sequence X n of such spaces. The process is then given by a sequence of continuous maps f n : X n → X n+1 . As in the classical theory, three equivalent characterizations of entropy are available, via open covers, via spanning sets, or via separated sets. However, one crucial point here is that in the open cover definition sequences of open covers for the spaces X n with Lebesgue numbers bounded away from zero have to be considered. In order to prove the power rule for this entropy, the additional assumption that the sequence f n be uniformly equicontinuous is necessary.
In Section 3, the metric entropy is defined. Here the system is given by a sequence f n : X n → X n+1 of measurable maps between probability spaces (X n , µ n ) such that the sequence µ n of measures is preserved in the sense that f n µ n = µ n+1 . The metric entropy with respect to a sequence of finite measurable partitions of the spaces X n can be defined in the usual way (with the obvious modifications), and has similar properties as in the autonomous case. Similarly as in the topological situation (the definition of entropy via sequences of covers), one does not get a reasonable quantity by considering all sequences of partitions. One problem is that information about the initial state can be generated merely due to the fact that the partitions in such a sequence become finer very rapidly. Hence, we have to restrict the class of admissible sequences of partitions, which is done in an axiomatic way by requiring some of the properties that are satisfied in the topological setting by the class of all sequences of open covers with Lebesgue numbers bounded away from zero. This leads to the notion of an admissible class which enjoys some nice and natural properties. For instance, in the case of an autonomous measure-preserving system, one can consider the smallest admissible class which contains all constant sequences of partitions, which leads to the classical notion of metric entropy. Several properties of the classical metric entropy carry over to its nonautonomous generalization. In particular, we can establish invariance under appropriately defined isomorphisms, an analogue of the Rokhlin inequality, and a power rule.
In Section 4, we prove for equicontinuous systems the inequality between metric and topological entropy which establishes one part of the variational principle. We adapt the arguments of Misiurewicz's elegant proof from [19] by defining an appropriate admissible class of sequences of partitions which is designed in such a way that Misiurewicz's arguments can be applied to its members. This class depends on the given invariant sequence of measures. In general, it might be very small, so that our variational inequality would not give any meaningful information. For this reason we establish different stability conditions for invariant sequences of measures which guarantee that the associated Misiurewicz class contains sequences of arbitrarily fine partitions. These stability conditions capture the intuitive idea that the initial measure $\mu_1$ should not be deformed too much by pushing it forward by the maps $f_1^n = f_n \circ \cdots \circ f_1$, so that such sequences become an appropriate nonautonomous substitute for invariant measures in the autonomous theory. In particular, we show that the expanding systems studied in Ott, Stenlund, and Young [23] satisfy such a stability condition.

Notation
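By a nonautonomous dynamical system (NDS for short) we understand a deterministic process $(X_{1,\infty}, f_{1,\infty})$, where $X_{1,\infty} = \{X_n\}_{n\geq 1}$ is a sequence of sets and $f_{1,\infty} = \{f_n\}_{n\geq 1}$ a sequence of maps $f_n : X_n \to X_{n+1}$. For all integers $k, n \in \mathbb{N}$ we write
$$f_k^0 := \mathrm{id}_{X_k}, \qquad f_k^n := f_{k+n-1} \circ \cdots \circ f_{k+1} \circ f_k, \qquad f_k^{-n} := (f_k^n)^{-1}.$$
The last notation will only be applied to sets. We do not assume that the maps $f_n$ are invertible. The trajectory of a point $x \in X_1$ is the sequence $\{f_1^n(x)\}_{n\geq 0}$.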
By $f_{k,\infty}$ we denote the sequence $\{f_k, f_{k+1}, f_{k+2}, \ldots\}$, which defines a NDS on $X_{k,\infty} := \{X_k, X_{k+1}, X_{k+2}, \ldots\}$.

We consider two categories of systems, metric and topological. In a metric system, the sets $X_n$ are probability spaces and the maps $f_n$ are measure-preserving. That is, each $X_n$ is endowed with a σ-algebra $\mathcal{A}_n$ and a probability measure $\mu_n$ such that the maps $f_n$ are measurable and $f_n\mu_n = \mu_{n+1}$ for all $n \geq 1$, where $f_n\mu_n$ denotes the push-forward $(f_n\mu_n)(A) = \mu_n(f_n^{-1}(A))$ for all $A \in \mathcal{A}_{n+1}$. In this case, we call $\mu_{1,\infty} = \{\mu_n\}_{n\geq 1}$ an $f_{1,\infty}$-invariant sequence. In a topological system, each $X_n$ is a compact metric space and the maps $f_n$ are continuous.
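The invariance condition $f_n\mu_n = \mu_{n+1}$ can be made concrete on finite state spaces, where a measure is just a table of point masses. The following is a minimal sketch under that assumption; the helper names `pushforward` and `is_invariant_sequence` and the toy maps are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def pushforward(f, mu):
    """Return the push-forward f(mu), i.e. (f mu)(A) = mu(f^{-1}(A)), of a finite measure.

    mu is a dict mapping points to their masses; f is a Python callable.
    """
    nu = {}
    for x, mass in mu.items():
        y = f(x)
        nu[y] = nu.get(y, Fraction(0)) + mass
    return nu

def is_invariant_sequence(maps, measures):
    """Check f_n mu_n = mu_{n+1} for the finitely many maps supplied."""
    return all(pushforward(f, mu) == nu
               for f, mu, nu in zip(maps, measures, measures[1:]))

# Toy example (hypothetical): X_1 = {0, 1, 2, 3}, X_2 = X_3 = {0, 1}.
mu1 = {x: Fraction(1, 4) for x in range(4)}
f1 = lambda x: x % 2          # two-to-one map X_1 -> X_2
f2 = lambda x: 1 - x          # swap on {0, 1}
mu2 = pushforward(f1, mu1)    # uniform measure on {0, 1}
mu3 = pushforward(f2, mu2)
```

Note that neither map needs to be invertible; only the compatibility of consecutive measures is checked.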
If $X$ is a compact topological space and $\mathcal{U}$ an open cover of $X$, we denote by $N(\mathcal{U})$ the minimal cardinality of a finite subcover. If $\mathcal{U}_1, \ldots, \mathcal{U}_n$ are open covers of $X$, we write $\bigvee_{i=1}^n \mathcal{U}_i$ for their join, i.e., the open cover consisting of all the intersections $U_1 \cap \cdots \cap U_n$ with $U_i \in \mathcal{U}_i$. In a metric space $(X, \varrho)$, we denote the open ball centered at $x$ with radius $\varepsilon$ by $B(x, \varepsilon)$ or $B(x, \varepsilon; \varrho)$. We write $\mathrm{dist}(x, A)$ for the distance from a point $x$ to a nonempty set $A$, i.e., $\mathrm{dist}(x, A) = \inf_{a \in A} \varrho(x, a)$. The closure, the interior, and the boundary of a set $A$ are denoted by $\mathrm{cl}\, A$, $\mathrm{int}\, A$ and $\partial A$, respectively.
Recall that the Lebesgue number of an open cover U of a compact metric space X is defined as the maximal ε > 0 such that every ε-ball in X is contained in one of the members of U.

Topological Entropy
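In this subsection, we recall the notion of entropy for a topological NDS $(X_{1,\infty}, f_{1,\infty})$, as defined in Kolyada et al. [14]. As in the classical autonomous theory, three equivalent definitions are available. We denote the metric of $X_k$ by $\varrho_k$ and define on each of the spaces $X_k$ a class of Bowen-metrics by
$$\varrho_{k,n}(x, y) := \max_{0 \leq i \leq n-1} \varrho_{k+i}\big(f_k^i(x), f_k^i(y)\big), \qquad n \geq 1.$$
It is easy to see that $\varrho_{k,n}$ is a metric on $X_k$ which is topologically equivalent to $\varrho_k$. In order to define the topological entropy of $f_{1,\infty}$, we only use the metrics $\varrho_{1,n}$. A subset $E \subset X_1$ is called $(n, \varepsilon)$-separated if $\varrho_{1,n}(x, y) > \varepsilon$ for any two distinct points $x, y \in E$, and a set $F \subset X_1$ $(n, \varepsilon)$-spans $X_1$ if for every $x \in X_1$ there is $y \in F$ with $\varrho_{1,n}(x, y) \leq \varepsilon$. We let $r_{\mathrm{sep}}(n, \varepsilon, f_{1,\infty})$ denote the maximal cardinality of an $(n, \varepsilon)$-separated subset of $X_1$ and $r_{\mathrm{span}}(n, \varepsilon, f_{1,\infty})$ the minimal cardinality of a set which $(n, \varepsilon)$-spans $X_1$, and we define
$$h_{\mathrm{sep}}(f_{1,\infty}) := \lim_{\varepsilon \searrow 0} \limsup_{n \to \infty} \frac{1}{n} \log r_{\mathrm{sep}}(n, \varepsilon, f_{1,\infty}), \qquad h_{\mathrm{span}}(f_{1,\infty}) := \lim_{\varepsilon \searrow 0} \limsup_{n \to \infty} \frac{1}{n} \log r_{\mathrm{span}}(n, \varepsilon, f_{1,\infty}).$$
The corresponding limits in $\varepsilon$ exist, since the quantities $r_{\mathrm{sep}}(n, \varepsilon, f_{1,\infty})$ and $r_{\mathrm{span}}(n, \varepsilon, f_{1,\infty})$ are monotone (non-increasing) with respect to $\varepsilon$, and this property carries over to their exponential growth rates. Hence, the limits can also be replaced by the corresponding suprema over all $\varepsilon > 0$. With the same arguments as in the autonomous case, one shows that the numbers $h_{\mathrm{sep}}(f_{1,\infty})$ and $h_{\mathrm{span}}(f_{1,\infty})$ actually coincide. We call their common value the topological entropy of $f_{1,\infty}$.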
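The Bowen-metrics and separated sets described above can be explored numerically. The sketch below assumes that all spaces $X_n$ are the circle $\mathbb{R}/\mathbb{Z}$ with its usual metric and that the Bowen distance takes the maximum over the first $n$ points of the two trajectories; the names `bowen_dist` and `greedy_separated` and the alternating toy system are hypothetical. A greedy selection only produces some separated set, so its size is a lower bound for $r_{\mathrm{sep}}$ on the chosen grid, not the true maximal cardinality.

```python
def circle_dist(x, y):
    """Distance on the circle R/Z."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def bowen_dist(maps, x, y, n, dist=circle_dist):
    """Bowen distance along the first n points of the trajectories of x and y.

    Assumed convention: the maximum runs over i = 0, ..., n-1, and the same
    metric `dist` is used on every space X_i.
    """
    best = dist(x, y)
    for f in maps[:n - 1]:
        x, y = f(x), f(y)
        best = max(best, dist(x, y))
    return best

def greedy_separated(maps, points, n, eps):
    """Greedily pick an (n, eps)-separated subset of `points`.

    The result is separated but not necessarily of maximal cardinality.
    """
    chosen = []
    for p in points:
        if all(bowen_dist(maps, p, q, n) > eps for q in chosen):
            chosen.append(p)
    return chosen

doubling = lambda x: (2.0 * x) % 1.0
identity = lambda x: x
maps = [doubling, identity] * 4          # hypothetical NDS: f_1 = doubling, f_2 = id, ...
grid = [i / 256 for i in range(256)]
sizes = [len(greedy_separated(maps, grid, n, 0.1)) for n in (1, 3, 5)]
```

The list `sizes` gives grid-based lower bounds for the separated-set counts at three time horizons; a growth-rate estimate would require finer grids and longer horizons.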
The definition of topological entropy via open covers has to be modified slightly in order to fit the nonautonomous case. Consider a sequence $\mathcal{U}_{1,\infty} = \{\mathcal{U}_n\}$ such that $\mathcal{U}_n$ is an open cover of $X_n$ for each $n \geq 1$. The entropy of $f_{1,\infty}$ with respect to the sequence $\mathcal{U}_{1,\infty}$ is then defined as
$$h_{\mathrm{cov}}(f_{1,\infty}; \mathcal{U}_{1,\infty}) := \limsup_{n \to \infty} \frac{1}{n} \log N\Big(\bigvee_{i=0}^{n-1} f_1^{-i}\mathcal{U}_{i+1}\Big).$$
In contrast to the autonomous case, the upper limit cannot be replaced by a limit (see [13] for a counterexample). In order to define the topological entropy of $f_{1,\infty}$ one should not take the supremum of $h_{\mathrm{cov}}(f_{1,\infty}; \mathcal{U}_{1,\infty})$ over all sequences of open covers. The problem is that the value of $h_{\mathrm{cov}}(f_{1,\infty}; \mathcal{U}_{1,\infty})$ might become arbitrarily large merely because the maximal diameters of the open sets in the covers $\mathcal{U}_n$ converge to zero exponentially as $n \to \infty$. In this case, information about the initial state can be obtained due to finer and finer measurements even if the system has very regular dynamics. To exclude this, we restrict ourselves to sequences of open covers with Lebesgue numbers bounded away from zero. We denote the family of all these sequences by $\mathcal{L}(X_{1,\infty})$ and define
$$h_{\mathrm{cov}}(f_{1,\infty}) := \sup_{\mathcal{U}_{1,\infty} \in \mathcal{L}(X_{1,\infty})} h_{\mathrm{cov}}(f_{1,\infty}; \mathcal{U}_{1,\infty}).$$
We leave the easy proof that this number coincides with the topological entropy as defined above to the reader. In the rest of the paper, we write h top (f 1,∞ ) for the common value of h sep (f 1,∞ ), h span (f 1,∞ ) and h cov (f 1,∞ ).

Remark:
Note that the value of h top (f 1,∞ ) heavily depends on the metrics ̺ k in contrast to the classical autonomous situation. However, in many relevant examples, as, e.g., systems defined by time-dependent differential equations, all of these metrics come from a single metric on a possibly compact space. So in this case the dependence on the metrics disappears due to a canonical choice.
The topological entropy of an autonomous system given by a map $f$ satisfies the power rule $h_{\mathrm{top}}(f^k) = k \cdot h_{\mathrm{top}}(f)$ for all $k \geq 1$. In order to formulate an analogue of this property for NDSs, we have to introduce for every $k \geq 1$ the $k$-th power system of the NDS $(X_{1,\infty}, f_{1,\infty})$. This is the system $(X^{[k]}_{1,\infty}, f^{[k]}_{1,\infty})$, where $X^{[k]}_n := X_{(n-1)k+1}$ and $f^{[k]}_n := f^k_{(n-1)k+1} = f_{nk} \circ \cdots \circ f_{(n-1)k+1}$. In case that the spaces $X_n$ coincide, the following result can be found in [13, Lem. 4.2]. Since the proof for the general case works analogously, we omit it.
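The index bookkeeping behind the $k$-th power system can be sketched in a few lines. The code assumes the convention that the $n$-th map of the power system is the composition $f_{nk} \circ \cdots \circ f_{(n-1)k+1}$; the function names and the integer toy maps are hypothetical.

```python
from functools import reduce

def compose(maps):
    """Return the composition f_m o ... o f_1 of maps = [f_1, ..., f_m]."""
    return lambda x: reduce(lambda acc, f: f(acc), maps, x)

def power_system(maps, k):
    """Group a finite list [f_1, f_2, ...] into k-fold compositions.

    Assumed convention: the n-th map of the power system is
    f_{nk} o ... o f_{(n-1)k+1}; trailing maps that do not fill a
    complete block of length k are dropped.
    """
    blocks = [maps[i:i + k] for i in range(0, len(maps) - k + 1, k)]
    return [compose(block) for block in blocks]

# Hypothetical maps on the integers: f_n(x) = 2x + n.
maps = [lambda x, n=n: 2 * x + n for n in range(1, 7)]
powered = power_system(maps, 3)   # two maps: f_3 o f_2 o f_1 and f_6 o f_5 o f_4
```

Running $n$ steps of the power system reproduces $nk$ steps of the original system, which is the observation underlying the power rule.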

Proposition:
For every $k \geq 1$ it holds that
$$h_{\mathrm{top}}(f^{[k]}_{1,\infty}) \leq k \cdot h_{\mathrm{top}}(f_{1,\infty}).$$

In general, the converse inequality in the above proposition fails to hold (see [13] for a counterexample). However, if we assume that the family $\{f_n\}$ is equicontinuous, equality does hold. Equicontinuity in this context means uniform equicontinuity, i.e., for every $\varepsilon > 0$ there exists $\delta > 0$ such that $\varrho_n(x, y) < \delta$ for any $x, y \in X_n$, $n \in \mathbb{N}$, implies $\varrho_{n+1}(f_n(x), f_n(y)) < \varepsilon$. In [13, Lem. 4.4] this is proved for the case when the spaces $X_n$ all coincide, by using the definition via separated sets. Here we present a different proof using the definition via sequences of open covers, since we want to carry over the arguments later to the proof of the power rule for metric entropy.

Lemma:
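Let $\mathcal{U}_{1,\infty} \in \mathcal{L}(X_{1,\infty})$ and assume that $f_{1,\infty}$ is equicontinuous. Then for each $m \geq 1$ the sequence $\mathcal{V}_{1,\infty}$, defined by
$$\mathcal{V}_n := \bigvee_{i=0}^{m-1} f_n^{-i}\mathcal{U}_{n+i},$$
is also an element of $\mathcal{L}(X_{1,\infty})$.

Proof: Let $\varepsilon > 0$ be a common lower bound for the Lebesgue numbers of the covers $\mathcal{U}_n$. Then, for each $n \geq 1$, $\varepsilon$ is also a lower bound for the Lebesgue number of $\mathcal{V}_n$ with respect to the Bowen-metric $\varrho_{n,m}$. This is proved as follows: Let $x \in X_n$ and assume that $\varrho_{n,m}(x, y) < \varepsilon$. Then $f_n^i(y)$ is contained in the ball $B(f_n^i(x), \varepsilon; \varrho_{n+i})$ for $i = 0, 1, \ldots, m-1$. Since $\varepsilon$ is a lower bound of the Lebesgue number of $\mathcal{U}_{n+i}$ for all $i$, we find sets $U_i \in \mathcal{U}_{n+i}$ such that $B(f_n^i(x), \varepsilon; \varrho_{n+i}) \subset U_i$ for $i = 0, 1, \ldots, m-1$, which implies that
$$B(x, \varepsilon; \varrho_{n,m}) \subset \bigcap_{i=0}^{m-1} f_n^{-i}(U_i) \in \mathcal{V}_n.$$
It is easy to see that from equicontinuity of $f_{1,\infty}$ it follows that also the family $\{f_n^i : n \geq 1,\ i = 0, 1, \ldots, m-1\}$ is equicontinuous. Hence, we can find $\delta > 0$ such that $\varrho_n(x, y) < \delta$ implies $\varrho_{n+i}(f_n^i(x), f_n^i(y)) < \varepsilon$ for all $n \geq 1$ and $i = 0, 1, \ldots, m-1$. Therefore, every Bowen-ball $B(x, \varepsilon; \varrho_{n,m})$ contains the $\delta$-ball $B(x, \delta; \varrho_n)$, which shows that $\delta$ is a lower bound for the Lebesgue numbers of the covers $\mathcal{V}_n$.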

Lemma:
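Let $\{a_n\}_{n\geq 1}$ be a monotonically increasing sequence of real numbers. Then for every $k \geq 1$ it holds that
$$\limsup_{n\to\infty} \frac{a_n}{n} = \limsup_{n\to\infty} \frac{a_{nk}}{nk}.$$
Proof: It suffices to prove the inequality "≤". To this end, consider an arbitrary sequence $\{n_l\}_{l\geq 1}$ of positive integers converging to ∞. For every $l \geq 1$ there is an $m_l \in \mathbb{N}_0$ with $m_l k \leq n_l \leq (m_l + 1)k$, and $m_l \to \infty$. This implies
$$\frac{1}{n_l}\, a_{n_l} \leq \frac{1}{m_l k}\, a_{(m_l+1)k}.$$
It follows that
$$\limsup_{l\to\infty} \frac{a_{n_l}}{n_l} \leq \limsup_{l\to\infty} \frac{(m_l+1)k}{m_l k} \cdot \frac{a_{(m_l+1)k}}{(m_l+1)k} = \limsup_{l\to\infty} \frac{a_{(m_l+1)k}}{(m_l+1)k}.$$
Hence, we conclude that
$$\limsup_{l\to\infty} \frac{a_{n_l}}{n_l} \leq \limsup_{n\to\infty} \frac{a_{nk}}{nk},$$
which yields the desired inequality.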
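The key estimate in this proof can be checked numerically. The sketch below uses a hypothetical nonnegative test sequence, the largest power of 2 not exceeding $n$, for which the quotient $a_n/n$ oscillates between roughly 1/2 and 1, so the upper limit is not a limit. Exact rational arithmetic avoids rounding issues; the finite window only illustrates the statement and is no substitute for the proof.

```python
from fractions import Fraction

def a(n):
    """Monotonically increasing, nonnegative test sequence: largest power of 2 not exceeding n."""
    return 1 << (n.bit_length() - 1)

def sandwich_holds(a, k, n):
    """Check a_n / n <= a_{(m+1)k} / (m k) for m = floor(n / k) >= 1."""
    m = n // k
    return Fraction(a(n), n) <= Fraction(a((m + 1) * k), m * k)

k = 3
checks = [sandwich_holds(a, k, n) for n in range(k, 2000)]

# Largest values of a_n / n over a late window, along all n and along multiples of k.
window = range(1000, 2000)
sup_all = max(Fraction(a(n), n) for n in window)
sup_multiples = max(Fraction(a(n), n) for n in window if n % k == 0)
```

On this window the two suprema are already close, and they approach each other as the window moves to infinity.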
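1,∞ as follows: Then we find To obtain the last equality we used Lemma 2.4. By Lemma 2.3, Since this holds for every $\mathcal{U}_{1,\infty} \in \mathcal{L}(X_{1,\infty})$, the desired inequality follows.

(i) Topological entropy for uniformly continuous maps on noncompact metric spaces (cf. Bowen [4]): Consider a uniformly continuous map $f : X \to X$ on a metric space $X$. The topological entropy of $f$ is defined by
$$h_{\mathrm{top}}(f) := \sup_{K} \lim_{\varepsilon \searrow 0} \limsup_{n\to\infty} \frac{1}{n} \log r_{\mathrm{span}}(n, \varepsilon, K),$$
where the supremum runs over all compact sets $K \subset X$ and $r_{\mathrm{span}}(n, \varepsilon, K)$ is the minimal cardinality of a set which $(n, \varepsilon)$-spans $K$. Alternatively, one can take maximal $(n, \varepsilon)$-separated subsets of $K$. If we define for each compact set $K \subset X$
$$h_{\mathrm{top}}(f, K) := \lim_{\varepsilon \searrow 0} \limsup_{n\to\infty} \frac{1}{n} \log r_{\mathrm{span}}(n, \varepsilon, K),$$
we see that $h_{\mathrm{top}}(f)$ can be written as $h_{\mathrm{top}}(f) = \sup_K h_{\mathrm{top}}(f, K)$.

(ii) Topological sequence entropy (cf. Goodman [8]): Here the sequence $X_{1,\infty}$ is constant and the sequence $f_n$ is of the form $f_n = f^{k_n}$, where $f : X \to X$ is a given continuous map and $(k_n)_{n\geq 1}$ an increasing sequence of integers.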
(iii) Topological entropy of random dynamical systems (cf. Bogenschütz [3]): Consider a probability space $(\Omega, \mathcal{F}, P)$ with an ergodic invertible transformation $\vartheta$ on $\Omega$, and a measurable space $(X, \mathcal{B})$. A mapping $\varphi : \mathbb{Z} \times \Omega \times X \to X$ such that $(\omega, x) \mapsto \varphi(n, \omega, x)$ is $\mathcal{F} \otimes \mathcal{B}$-measurable for all $n \in \mathbb{Z}$ and $\varphi(n + m, \omega, x) = \varphi(n, \vartheta^m\omega, \varphi(m, \omega, x))$ for all $n, m \in \mathbb{Z}$ and $(\omega, x) \in \Omega \times X$ is called a random dynamical system on $X$ over $\vartheta$. If $X$ is a compact metric space, $\mathcal{B}$ is the Borel σ-algebra of $X$, and the maps $\varphi(n, \omega, \cdot)$ are homeomorphisms, one speaks of a topological random dynamical system. If $\mathcal{U}$ is an open cover of $X$, one defines for every $\omega \in \Omega$ the number
$$\lim_{n\to\infty} \frac{1}{n} \log N\Big(\bigvee_{i=0}^{n-1} \varphi(i, \omega, \cdot)^{-1}\mathcal{U}\Big). \tag{2}$$
From Kingman's subadditive ergodic theorem it follows that this number exists for almost every $\omega \in \Omega$ and is constant almost everywhere. Then one can take this constant value (for each $\mathcal{U}$) and define the topological entropy of the random dynamical system by taking the supremum over all open covers $\mathcal{U}$. If we fix one $\omega \in \Omega$ and consider the number (2), replacing the limit by a lim sup, and then take the supremum over all $\mathcal{U}$, we obtain the topological entropy of the NDS $(X_{1,\infty}, f_{1,\infty})$ given by $X_n := X$, $f_n := \varphi(1, \vartheta^{n-1}\omega, \cdot)$.

Remark:
It is an interesting fact that not only is Bowen's notion of topological entropy for uniformly continuous maps a special case of the topological entropy for NDSs, but for an equicontinuous NDS $(X_{1,\infty}, f_{1,\infty})$ the converse statement is also true: $h_{\mathrm{top}}(f_{1,\infty})$ can be regarded as the topological entropy of a uniformly continuous map, restricted to a compact noninvariant set. To see this, let $X$ be the disjoint sum of the spaces $X_n$, i.e., $X := \bigsqcup_{n\geq 1} X_n$. Then a uniformly continuous map $f : X \to X$ is given by putting $f$ equal to $f_n$ on $X_n$, and we have $h_{\mathrm{top}}(f_{1,\infty}) = h_{\mathrm{top}}(f, X_1)$. This observation in particular allows one to conclude the power rule from the corresponding power rule for Bowen's entropy. Taking the supremum of $h_{\mathrm{top}}(f, K)$ over all compact subsets $K$ of $X$ gives the quantity called the asymptotical topological entropy of $f_{1,\infty}$ in [13], defined by $\lim_{n\to\infty} h_{\mathrm{top}}(f_{n,\infty})$.

Metric Entropy
In this section, we introduce the metric entropy of a NDS.

The Entropy with Respect to a Sequence of Partitions
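Recall that the entropy of a finite measurable partition $\mathcal{P} = \{P_1, \ldots, P_k\}$ of a probability space $(X, \mathcal{A}, \mu)$ is defined by
$$H_\mu(\mathcal{P}) := -\sum_{i=1}^{k} \mu(P_i) \log \mu(P_i),$$
where $0 \cdot \log 0 := 0$, and satisfies $0 \leq H_\mu(\mathcal{P}) \leq \log k$. The equality $H_\mu(\mathcal{P}) = \log k$ holds iff all members of $\mathcal{P}$ have the same measure.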
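As a small numerical companion to this definition, the following sketch computes the entropy of a partition from the list of measures of its members. The function name `partition_entropy` and the example values are hypothetical.

```python
import math

def partition_entropy(masses):
    """Entropy H = -sum p log p of a finite partition, given the measures of its members.

    Members of measure zero contribute nothing (convention 0 * log 0 = 0).
    """
    if abs(sum(masses) - 1.0) > 1e-9:
        raise ValueError("the measures of the partition members must sum to 1")
    return -sum(p * math.log(p) for p in masses if p > 0)

uniform = partition_entropy([0.25] * 4)                 # all members have equal measure
skewed = partition_entropy([0.7, 0.1, 0.1, 0.1, 0.0])   # five members, one of measure zero
```

The uniform partition attains the upper bound $\log k$, while the skewed one stays strictly below it.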
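If $\mathcal{P}$ and $\mathcal{Q}$ are two measurable partitions of $X$, the joint partition $\mathcal{P} \vee \mathcal{Q} := \{P \cap Q : P \in \mathcal{P},\ Q \in \mathcal{Q}\}$ is again a measurable partition of $X$.

Now consider a metric NDS $(X_{1,\infty}, f_{1,\infty}, \mu_{1,\infty})$, where $\mu_{1,\infty}$ denotes the sequence of probability measures with $f_n\mu_n = \mu_{n+1}$. Let $\mathcal{P}_{1,\infty} = \{\mathcal{P}_n\}$ be a sequence such that $\mathcal{P}_n$ is a finite measurable partition of $X_n$ for every $n \geq 1$, and define
$$h(f_{1,\infty}; \mathcal{P}_{1,\infty}) := \limsup_{n\to\infty} \frac{1}{n} H_{\mu_1}\Big(\bigvee_{i=0}^{n-1} f_1^{-i}\mathcal{P}_{i+1}\Big).$$
We call this number the metric entropy of $f_{1,\infty}$ with respect to $\mathcal{P}_{1,\infty}$. Note that in the autonomous case this definition reduces to the usual definition of metric entropy with respect to a partition. In this case, the lim sup is in fact a limit, which follows from a subadditivity argument. However, in the general case considered here, subadditivity does not necessarily hold. (In [13], one finds a counterexample for the topological case, which can be modified to serve as a counterexample in the metric case, since this system preserves the Lebesgue measure.) For an autonomous system given by a map $f$ with an invariant measure $\mu$ and a partition $\mathcal{P}$, we also use the common notations $h_\mu(f; \mathcal{P})$ and $h_\mu(f)$.

Several well-known properties of the entropy with respect to a partition carry over to its nonautonomous generalization. In order to formulate these properties, we have to introduce some notation. We say that a sequence $\mathcal{P}_{1,\infty}$ of measurable partitions is finer than another such sequence $\mathcal{Q}_{1,\infty}$ if $\mathcal{P}_n$ is finer than $\mathcal{Q}_n$ for every $n \geq 1$ (i.e., every element of $\mathcal{P}_n$ is contained in an element of $\mathcal{Q}_n$). In this case, we write $\mathcal{P}_{1,\infty} \succeq \mathcal{Q}_{1,\infty}$. If $\mathcal{P}_{1,\infty}$ and $\mathcal{Q}_{1,\infty}$ are two sequences of measurable partitions, we define their join by $\mathcal{P}_{1,\infty} \vee \mathcal{Q}_{1,\infty} := \{\mathcal{P}_n \vee \mathcal{Q}_n\}_{n\geq 1}$.

Finally, recall the definition of conditional entropy for partitions of a probability space $(X, \mathcal{A}, \mu)$. If $\mathcal{P}$ and $\mathcal{Q}$ are two partitions of $X$, the conditional entropy of $\mathcal{P}$ given $\mathcal{Q}$ is
$$H_\mu(\mathcal{P}|\mathcal{Q}) := -\sum_{Q \in \mathcal{Q}} \mu(Q) \sum_{P \in \mathcal{P}} \mu(P|Q) \log \mu(P|Q), \qquad \mu(P|Q) := \frac{\mu(P \cap Q)}{\mu(Q)},$$
where the outer sum runs over the members $Q$ of positive measure. Some well-known properties of the conditional entropy are summarized in the following proposition (cf., e.g., Katok and Hasselblatt [11]).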
3.1 Proposition: Let P, Q and R be partitions of X.
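The join and the conditional entropy can be experimented with on a finite probability space. In the sketch below a partition is encoded by a list of labels, one per point, so that two points lie in the same member iff they carry the same label; this encoding, the function names, and the eight-point example are assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels, weights):
    """Entropy of the partition whose members are the level sets of `labels`."""
    mass = Counter()
    for lab, w in zip(labels, weights):
        mass[lab] += w
    return -sum(p * math.log(p) for p in mass.values() if p > 0)

def join(labels_p, labels_q):
    """Join P v Q: two points share a member iff they share one in both P and Q."""
    return list(zip(labels_p, labels_q))

def conditional_entropy(labels_p, labels_q, weights):
    """H(P | Q) = -sum over Q of mu(Q) * sum over P of mu(P|Q) log mu(P|Q)."""
    mass_q = Counter()
    mass_pq = Counter()
    for p, q, w in zip(labels_p, labels_q, weights):
        mass_q[q] += w
        mass_pq[(p, q)] += w
    return -sum(m * math.log(m / mass_q[q]) for (p, q), m in mass_pq.items() if m > 0)

# Uniform measure on {0, ..., 7} with three partitions.
weights = [1 / 8] * 8
P = [x % 2 for x in range(8)]    # even / odd
Q = [x // 4 for x in range(8)]   # lower half / upper half, independent of P
R = [x // 2 for x in range(8)]   # four blocks, finer than Q
```

With these choices one can confirm the standard identity $H(\mathcal{P} \vee \mathcal{Q}) = H(\mathcal{Q}) + H(\mathcal{P}|\mathcal{Q})$, and that conditioning on a finer partition leaves no residual entropy.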
Now we can prove a list of elementary properties of h(f 1,∞ ; P 1,∞ ) most of which are straightforward generalizations of the corresponding properties of classical metric entropy.
3.2 Proposition: Let P 1,∞ and Q 1,∞ be two sequences of finite measurable partitions for X 1,∞ . Then the following assertions hold: Proof: The properties (i)-(iii) follow very easily from the properties of the entropy of a partition. Property (iv) is a consequence of Lemma 2.4, since the partitions become finer with increasing n, and hence the sequence This implies which concludes the proof of (v). Next, let us prove (vi): From Proposition 3.1 (ii) it follows that For the last term in this expression we further obtain Now we use Proposition 3.1 (iii) to see that this sum can be estimated by Using the same arguments again, for this expression we find Using . Going on inductively, we end up with the estimate Hence, we obtain which finishes the proof of (vi). Finally, we prove (vii): For any k ∈ N we find Using the elementary property of the entropy of partitions that H(A) ≥ H(B) whenever A is finer than B, the converse inequality is proved by This implies (vii) and finishes the proof of the proposition.

Remark:
Note that the equality in item (vii) of the preceding proposition reveals an essential difference between metric and topological entropy of NDSs, since in the topological setting only the inequality holds. A counterexample for the equality is given by a sequence f 1,∞ on the unit interval such that f 1 is constant and all other f n are equal to the standard tent map. In this case, clearly h top (f 1,∞ ) = 0, but h top (f k,∞ ) = log 2 for all k ≥ 2 (see also [13] for a counterexample with h top (f k,∞ ) < h top (f k+1,∞ ) for all k). Therefore, the notion of asymptotical topological entropy, as defined in [13], has no meaningful analogue for metric systems.
From item (vii) of the preceding proposition we can conclude a result similar to [13, Thm. A], which asserts that the topological entropy of autonomous systems is commutative in the sense that $h_{\mathrm{top}}(f \circ g) = h_{\mathrm{top}}(g \circ f)$.

Corollary:
Consider two probability spaces $(X, \mu)$ and $(Y, \nu)$ and measurable maps $f : X \to Y$, $g : Y \to X$ such that $f\mu = \nu$ and $g\nu = \mu$. Then $\mu$ is an invariant measure for $g \circ f$, $\nu$ is an invariant measure for $f \circ g$, and it holds that $h_\mu(g \circ f) = h_\nu(f \circ g)$.

Using Proposition 3.2 (iv), we find Similarly, we obtain $2h(f_{2,\infty}; \mathcal{P}_{2,\infty}) = h_\nu(f \circ g; \mathcal{Q})$. Hence, from (4) we conclude Since we can choose $\mathcal{Q}$ freely, this implies $h_\nu(f \circ g) \leq h_\mu(g \circ f)$. Starting with a partition $\mathcal{P}$ of $X$ and putting $\mathcal{Q} := g^{-1}\mathcal{P}$, we get the converse inequality.
3.5 Remark: In Balibrea, Jiménez López, and Cánovas [2] one finds proofs for the commutativity of metric and topological entropy which are not based on entropy notions for nonautonomous systems. These commutativity properties were first found in Dana and Montrucchio [6]. Later, Kolyada and Snoha [13] rediscovered the commutativity of topological entropy.
We finish this subsection with an example which shows that the entropy h(f 1,∞ ; P 1,∞ ) can be arbitrarily large even for a very trivial system.
From this example one sees that by taking appropriate sequences of partitions, one obtains arbitrarily large values for the entropy of the identity. Here we have the same problem as we had in defining the topological entropy via sequences of open covers. If the resolution becomes finer at exponential speed, one obtains a gain in information which is not due to the dynamics of the system. Hence, in the definition of the metric entropy of f 1,∞ , we have to exclude such sequences.
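One hypothetical instance of this phenomenon, not necessarily the example referred to above, uses the identity on a finite uniform space as a stand-in for a non-atomic space. The $n$-th partition sorts points by their $n$-th base-$k$ digit, so every partition has only $k$ members, yet the join of the first $n$ partitions has $k^n$ members of equal measure. The function names and parameters below are illustrative assumptions, and the finite model is only meaningful for $n \leq N$.

```python
import math
from collections import Counter

def joint_entropy(label_fns, points):
    """Entropy of the join of the partitions given by `label_fns` under the uniform measure."""
    counts = Counter(tuple(fn(x) for fn in label_fns) for x in points)
    total = len(points)
    return -sum(c / total * math.log(c / total) for c in counts.values())

def digit_partition(k, n):
    """Partition of {0, ..., k**N - 1} by the n-th base-k digit (n = 1 is the lowest digit)."""
    return lambda x: (x // k ** (n - 1)) % k

k, N = 3, 6
points = range(k ** N)                       # finite stand-in for a non-atomic space
partitions = [digit_partition(k, n) for n in range(1, N + 1)]
# For the identity, pulling back a partition changes nothing, so the n-th join is P_1 v ... v P_n.
rates = [joint_entropy(partitions[:n], points) / n for n in range(1, N + 1)]
```

Every entry of `rates` equals $\log k$ up to rounding, so the entropy with respect to this sequence is $\log k$ although the dynamics are trivial; increasing $k$ makes the value as large as desired.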

Admissible Classes and Metric Entropy of Nonautonomous Systems
To define the entropy of the system $(X_{1,\infty}, f_{1,\infty}, \mu_{1,\infty})$, we have to choose a sufficiently nice subclass $\mathcal{E}$ from the class of all sequences $\mathcal{P}_{1,\infty}$. Then the entropy can be defined in the usual way by taking the supremum over all $\mathcal{P}_{1,\infty} \in \mathcal{E}$. In view of the definition of topological entropy in terms of sequences of open covers and Example 3.6 it is clear that taking all sequences of partitions is too permissive. Since there is no direct analogue of Lebesgue numbers for measurable partitions, we introduce suitable classes of sequences of partitions by axioms which reflect some properties of the family $\mathcal{L}(X_{1,\infty})$ defined in Section 2.

Definition:
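We call a nonempty class $\mathcal{E}$ of sequences of finite measurable partitions for $X_{1,\infty}$ admissible (for $f_{1,\infty}$) if it satisfies the following axioms:
(A) For every sequence $\mathcal{P}_{1,\infty} \in \mathcal{E}$ there is a bound $N \geq 1$ on $\#\mathcal{P}_n$, i.e., $\#\mathcal{P}_n \leq N$ for all $n \geq 1$.
(B) $\mathcal{E}$ is closed with respect to coarsening. That is, if $\mathcal{P}_{1,\infty} \in \mathcal{E}$ and $\mathcal{Q}_{1,\infty}$ is a sequence of finite measurable partitions for $X_{1,\infty}$ which is coarser than $\mathcal{P}_{1,\infty}$, then $\mathcal{Q}_{1,\infty} \in \mathcal{E}$.
(C) $\mathcal{E}$ is closed with respect to successive refinements via the action of $f_{1,\infty}$. That is, if $\mathcal{P}_{1,\infty} \in \mathcal{E}$, then for every $m \geq 1$ also $\mathcal{P}^m_{1,\infty}(f_{1,\infty}) \in \mathcal{E}$.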
From Axiom (A) it follows that the upper bound in Proposition 3.2 (i) is always finite. Moreover, by adding sets of measure zero, we can assume that #P n is constant for every element of E. Axiom (B) says that with every sequence P 1,∞ ∈ E, also the sequences which are coarser than P 1,∞ are contained in E. Axiom (C) will be essential for proving the power rule for metric entropy. It reflects the property of sequences of open covers stated in Lemma 2.3.

Definition:
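If $\mathcal{E}$ is an admissible class, we define the metric entropy of $f_{1,\infty}$ with respect to $\mathcal{E}$ by
$$h_{\mathcal{E}}(f_{1,\infty}) := \sup_{\mathcal{P}_{1,\infty} \in \mathcal{E}} h(f_{1,\infty}; \mathcal{P}_{1,\infty}).$$

3.9 Proposition: Given a metric NDS $(X_{1,\infty}, f_{1,\infty})$, let $\mathcal{E}$ be the class of all sequences of partitions for $X_{1,\infty}$ which satisfy Axiom (A). Then $\mathcal{E}$ is an admissible class. $\mathcal{E}$ is maximal, i.e., it cannot be extended to a larger admissible class. Therefore, we denote this class by $\mathcal{E}_{\max}$ or $\mathcal{E}_{\max}(X_{1,\infty})$.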
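Proof: It is obvious that $\mathcal{E}$ cannot be enlarged without violating Axiom (A). Hence, it suffices to prove that $\mathcal{E}$ satisfies Axioms (B) and (C). If $\mathcal{P}_{1,\infty} \in \mathcal{E}$ and $\mathcal{Q}_{1,\infty}$ is a sequence of partitions which is coarser than $\mathcal{P}_{1,\infty}$, it follows that $\#\mathcal{Q}_n \leq \#\mathcal{P}_n$ for all $n \geq 1$, which implies $\mathcal{Q}_{1,\infty} \in \mathcal{E}$. Now consider for some $\mathcal{P}_{1,\infty} \in \mathcal{E}$ and $m \geq 1$ the sequence $\mathcal{P}^m_{1,\infty}(f_{1,\infty})$. If $N$ is a bound on $\#\mathcal{P}_n$, we have
$$\#\Big(\bigvee_{i=0}^{m-1} f_n^{-i}\mathcal{P}_{n+i}\Big) \leq \prod_{i=0}^{m-1} \#\mathcal{P}_{n+i} \leq N^m \quad \text{for all } n \geq 1.$$
This implies that $\mathcal{E}$ satisfies Axiom (C).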
The following example shows that $\mathcal{E}_{\max}$ is in general not a useful admissible class.

Example:
We show that h Emax (f 1,∞ ) = ∞ whenever the maps f i are bimeasurable and the spaces (X n , µ n ) are non-atomic. Indeed, for every k ≥ 1 we find a sequence P 1,∞ of partitions with #P n ≡ k such that h(f 1,∞ ; P 1,∞ ) = log k, which is constructed as follows. On X 1 take a partition P 1 consisting of k sets with equal measure 1/k. Then Q 2 := f 1 P 1 is a partition of X 2 into k sets of equal measure. Partition each element Q i of Q 2 into k sets Q i1 , . . . , Q ik of equal measure 1/k 2 . Then define a new partition P 2 of X 2 consisting of the sets Also P 2 is a partition of X 2 into k sets of equal measure 1/k, and P 2 , Q 2 are independent. This implies Inductively, one can proceed this construction. For i from 1 to some fixed n, assume that P i is a partition of X i into k sets of equal measure such that R n := P 1 ∨ f −1 1 P 2 ∨ . . . ∨ f −(n−1) 1 P n consists of k n sets of equal measure. Then consider the partition Q n+1 := f n 1 R n of X n+1 . Let R n = {R 1 , . . . , R k n } and partition each R i into k sets of equal measure 1/k n+1 , say = log k n + log k = (n + 1) log k, which implies h(f 1,∞ ; P 1,∞ ) = log k for the sequence P 1,∞ = {P n } obtained by this construction.
As this example shows, we have to consider smaller admissible classes. These are provided by the following proposition whose simple proof will be omitted.
3.11 Proposition: Arbitrary unions and nonempty intersections of admissible classes are again admissible classes. In particular, for every nonempty subset F ⊂ E max there exists a smallest admissible class E(F ) which satisfies F ⊂ E(F ) ⊂ E max (defined as the intersection of all admissible classes containing F ). We also call E(F ) the admissible class generated by F .
We also have to show that the metric entropy of a NDS indeed generalizes the usual notion of metric entropy for autonomous systems. To this end, we use the following result.

Proposition:
Let F be a nonempty subset of E max . Then is an admissible class with F ⊂ H(F ) ⊂ E max . Consequently, E(F ) ⊂ H(F ) and it holds that
The preceding proposition shows not only that there exists a multitude of admissible classes, but also that the metric entropy of f 1,∞ can be equal to any of the numbers h(f 1,∞ ; P 1,∞ ) by taking the one-point set F := {P 1,∞ } as a generator for an admissible class. The next corollary immediately follows.

Corollary:
Assume that the sequences X 1,∞ , f 1,∞ , µ 1,∞ are constant, i.e., we have an autonomous system (X, f, µ). Let F be the set of all constant sequences of finite measurable partitions of X.

Invariance, Rokhlin Inequality, and Restrictions
In order to be a reasonable quantity, the metric entropy of a system $f_{1,\infty}$ should be an invariant with respect to isomorphisms. By an isomorphism between sequences $(X_{1,\infty}, \mu_{1,\infty})$ and $(Y_{1,\infty}, \nu_{1,\infty})$ of probability spaces we understand a sequence $\pi_{1,\infty} = \{\pi_n\}$ of bi-measurable maps $\pi_n : X_n \to Y_n$ with $\pi_n\mu_n = \nu_n$. Such a sequence is an isomorphism between the systems $f_{1,\infty}$ on $X_{1,\infty}$ and $g_{1,\infty}$ on $Y_{1,\infty}$ if additionally for each $n \geq 1$ the diagram formed by $f_n$, $g_n$, $\pi_n$ and $\pi_{n+1}$ commutes, i.e., $\pi_{n+1} \circ f_n = g_n \circ \pi_n$. In this case we also say that the systems $f_{1,\infty}$ and $g_{1,\infty}$ are conjugate.
If the maps π n are only measurable but not necessarily measurably invertible, we say that the systems f 1,∞ and g 1,∞ are semiconjugate. The sequence π 1,∞ is then called a conjugacy or a semiconjugacy from f 1,∞ to g 1,∞ , respectively.
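Given two admissible classes $\mathcal{E}$ and $\mathcal{F}$ for $X_{1,\infty}$ and $Y_{1,\infty}$, resp., we also define the notions of $\mathcal{E}$-$\mathcal{F}$-isomorphisms and $\mathcal{E}$-$\mathcal{F}$-(semi)conjugacies via the condition that $\pi_{1,\infty}$ respects $\mathcal{E}$ and $\mathcal{F}$ in the sense that $\mathcal{Q}_{1,\infty} \in \mathcal{F}$ implies $\{\pi_n^{-1}\mathcal{Q}_n\}_{n\geq 1} \in \mathcal{E}$. In the case of an isomorphism or a conjugacy, the implication in the other direction must hold as well.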
For autonomous systems, Proposition 3.2 (vi) can be used to show that the entropy depends continuously on the partition, where the set of partitions is endowed with the Rokhlin metric, given by d R (P, Q) = H µ (P|Q)+H µ (Q|P). The nonautonomous analogue of this result is formulated in the next proposition.

Proposition: For two sequences
Then d R is a metric on E max and the function P 1, Proof: The proof that d R is a metric easily follows from the properties of conditional entropy stated in Proposition 3.1. From Proposition 3.2 (vi) we conclude the nonautonomous Rokhlin inequality which finishes the proof.
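The autonomous Rokhlin metric $d_R(\mathcal{P}, \mathcal{Q}) = H_\mu(\mathcal{P}|\mathcal{Q}) + H_\mu(\mathcal{Q}|\mathcal{P})$ mentioned before the proposition can be evaluated directly on a finite probability space. The sketch below encodes partitions as label lists, one label per point; this encoding, the function names, and the eight-point example are assumptions made for illustration, and the code covers single partitions only, not sequences.

```python
import math
from collections import Counter

def conditional_entropy(labels_p, labels_q, weights):
    """H(P | Q) on a finite space; members are the level sets of the label lists."""
    mass_q = Counter()
    mass_pq = Counter()
    for p, q, w in zip(labels_p, labels_q, weights):
        mass_q[q] += w
        mass_pq[(p, q)] += w
    return -sum(m * math.log(m / mass_q[q]) for (p, q), m in mass_pq.items() if m > 0)

def rokhlin_distance(labels_p, labels_q, weights):
    """Rokhlin distance d_R(P, Q) = H(P | Q) + H(Q | P)."""
    return (conditional_entropy(labels_p, labels_q, weights)
            + conditional_entropy(labels_q, labels_p, weights))

# Uniform measure on {0, ..., 7} with three partitions.
weights = [1 / 8] * 8
P = [x % 2 for x in range(8)]
Q = [x // 4 for x in range(8)]
R = [x // 2 for x in range(8)]
P_relabelled = ["odd" if x % 2 else "even" for x in range(8)]

d_PQ = rokhlin_distance(P, Q, weights)
d_QR = rokhlin_distance(Q, R, weights)
d_PR = rokhlin_distance(P, R, weights)
```

The distance depends only on the partitions, not on the labels chosen for their members, which is why `P` and `P_relabelled` are at distance zero.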
Given a metric NDS $(X_{1,\infty}, f_{1,\infty}, \mu_{1,\infty})$, assume that we can decompose each of the spaces $X_n$ as a disjoint union $X_n = Y_n \,\dot\cup\, Z_n$ such that $f_n(Y_n) \subset Y_{n+1}$, $f_n(Z_n) \subset Z_{n+1}$, and $\mu_n(Y_n) \equiv c$ for a constant $0 < c \leq 1$. Then let us consider the restrictions of $f_{1,\infty}$ to the sequences $Y_{1,\infty} := \{Y_n\}$ and $Z_{1,\infty} := \{Z_n\}$, resp., i.e., the systems defined by the maps $g_n := f_n|_{Y_n} : Y_n \to Y_{n+1}$ and $h_n := f_n|_{Z_n} : Z_n \to Z_{n+1}$. If we consider the probability measure $\nu_n(A) := \mu_n(A)/c$ on $Y_n$, it follows that $(Y_{1,\infty}, g_{1,\infty}, \nu_{1,\infty})$ is also a metric system. If $c < 1$, we can define a corresponding invariant sequence of probability measures for the system $(Z_{1,\infty}, h_{1,\infty})$ as well.
Proof: It is clear that E| Y1,∞ satisfies Axiom (A). Let Q 1,∞ ∈ E 1,∞ | Y1,∞ . Then there exists P 1,∞ ∈ E such that the elements of each Q n are the intersections of the elements of P n with Y n . Now assume that R 1,∞ is a sequence of partitions for Y 1,∞ which is coarser than Q 1,∞ . Then the elements of each R n are unions of elements of Q n . Taking corresponding unions of elements of P n for each n, one constructs a sequence S 1,∞ ∈ E coarser than P 1,∞ such that {Y n } ∨ S 1,∞ = R 1,∞ , which proves that E| Y1,∞ satisfies Axiom (B). Finally, if Q n ≡ {Y n } ∨ P n for some P 1,∞ ∈ E, then for all k, m ≥ 1 it holds that which implies that E| Y1,∞ satisfies Axiom (C). To prove the inequality of entropies, consider Q 1,∞ ∈ E| Y1,∞ and the corresponding P 1,∞ ∈ E with Q n ≡ {Y n } ∨ P n . Then The last summand gives and thus can be omitted in the computation of h(g 1,∞ ; Q 1,∞ ). We obtain If we consider the sequence P 1,∞ of partitions P n := {P ∩Y n : P ∈ P n }∪{P ∩Z n : P ∈ P n }, we see that By the assumption on E it follows that P 1,∞ ∈ E and hence the assertion follows. In the case c = 1, the measures µ n (Z n ) are all zero, and hence equality holds in (7). Since P 1,∞ is finer than P 1,∞ , we have which finishes the proof.

Remark:
For a topological NDS given by a sequence of homeomorphisms, endowed with an invariant sequence of Borel probability measures, the above proposition can be applied to the decomposition $Y_n := \operatorname{supp} \mu_n$, $Z_n := X_n \setminus \operatorname{supp} \mu_n$, where $\operatorname{supp} \mu_n = \{x \in X_n \mid \forall \varepsilon > 0 : \mu_n(B(x, \varepsilon)) > 0\}$ is the support of the measure $\mu_n$.

The Power Rule for Metric Entropy
Given a metric NDS $(X_{1,\infty}, f_{1,\infty})$ and $k \in \mathbb{N}$, we define the $k$-th power system $(X^{[k]}_{1,\infty}, f^{[k]}_{1,\infty})$ in exactly the same way as we did for topological systems. It is easy to see that this system is a metric system as well.
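If $\mathcal{E}$ is an admissible class for $(X_{1,\infty}, f_{1,\infty})$, we denote by $\mathcal{E}^{[k]}$ the class of all sequences of partitions for $X^{[k]}_{1,\infty}$ which are defined by restricting the sequences in $\mathcal{E}$ to the spaces in $X^{[k]}_{1,\infty}$. If a sequence $\mathcal{Q}_{1,\infty}$ of partitions for $X^{[k]}_{1,\infty}$ is coarser than the restriction of some $\mathcal{P}_{1,\infty} \in \mathcal{E}$ (i.e., $\mathcal{Q}_n \preceq \mathcal{P}_{(n-1)k+1}$ for all $n \ge 1$), we can extend $\mathcal{Q}_{1,\infty}$ to a sequence $\mathcal{R}_{1,\infty}$ of partitions for $X_{1,\infty}$ which is coarser than $\mathcal{P}_{1,\infty}$. This can be done in a trivial way by putting
$$\mathcal{R}_n := \begin{cases} \mathcal{P}_n & \text{if } n - 1 \text{ is not a multiple of } k, \\ \mathcal{Q}_{1+(n-1)/k} & \text{if } n - 1 \text{ is a multiple of } k. \end{cases}$$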
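For illustration, the extension rule can be written out for $k = 2$. This is only a sketch; it assumes the convention $X^{[k]}_n = X_{(n-1)k+1}$ suggested by the index $(n-1)k+1$ above, so that $\mathcal{Q}_n$ is a partition of $X_{2n-1}$:

```latex
% Worked instance for k = 2 (assumption: X^{[2]}_n = X_{2n-1}).
% Q_n is a partition of X_{2n-1} that is coarser than P_{2n-1}.
\[
  (\mathcal{R}_1, \mathcal{R}_2, \mathcal{R}_3, \mathcal{R}_4, \mathcal{R}_5, \dots)
  = (\mathcal{Q}_1, \mathcal{P}_2, \mathcal{Q}_2, \mathcal{P}_4, \mathcal{Q}_3, \dots)
\]
% Restricting R back to the odd indices recovers Q:
\[
  \mathcal{R}_{2(n-1)+1} = \mathcal{Q}_{1 + (2(n-1))/2} = \mathcal{Q}_n
  \quad \text{for all } n \ge 1 .
\]
```

Every entry of the extended sequence is either an entry of $\mathcal{P}_{1,\infty}$ itself or a coarsening of one, so the extended sequence is coarser than $\mathcal{P}_{1,\infty}$ entry by entry.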

Proposition: If E is an admissible class for
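Then it follows that $\mathcal{R}_n = \mathcal{P}_n \preceq \mathcal{P}_n$ in the first case, and $\mathcal{R}_n = \mathcal{Q}_{1+(n-1)/k} \preceq \mathcal{P}_n$ in the second one. Since $\mathcal{E}$ satisfies Axiom (B), we know that $\mathcal{R}_{1,\infty} \in \mathcal{E}$, which implies that $\mathcal{Q}_{1,\infty} \in \mathcal{E}^{[k]}$. To show that $\mathcal{E}^{[k]}$ satisfies Axiom (C), let $\mathcal{P}_{1,\infty} \in \mathcal{E}$ and $m \ge 1$. We have to show that the sequence $\mathcal{Q}_{1,\infty}$ defined by is an element of $\mathcal{E}^{[k]}$. To this end, first note that The sequence $\mathcal{R}_{1,\infty}$ can be extended to an element $\mathcal{S}_{1,\infty}$ of $\mathcal{E}$, which is given by Indeed, $\mathcal{S}_{1,\infty} \in \mathcal{E}$, since $\mathcal{E}$ satisfies Axiom (C). Hence, $\mathcal{R}_{1,\infty} \in \mathcal{E}^{[k]}$ and since $\mathcal{E}^{[k]}$ satisfies Axiom (B), this implies $\mathcal{Q}_{1,\infty} \in \mathcal{E}^{[k]}$.

Now let us prove the formula for the entropies. Let $\mathcal{P}_{1,\infty} \in \mathcal{E}$. We define a sequence $\mathcal{Q}_{1,\infty}$ of finite measurable partitions for $X_{1,\infty}$ as follows: This follows by combining the facts that $\mathcal{P}_{1,\infty} \in \mathcal{E}$ and $\mathcal{E}$ satisfies Axiom (C). We find that To obtain the last equality we used Proposition 3.2 (iv). Now consider also the sequence $\mathcal{P}^{[k]}_{1,\infty}$. It is obvious that $\mathcal{Q}_{1,\infty}$ is finer than $\mathcal{P}^{[k]}_{1,\infty}$. Hence, using Proposition 3.2 (iii), we find Taking the supremum over all $\mathcal{P}_{1,\infty}$ on the left-hand side and over all $\mathcal{P}_{1,\infty}$ on the right-hand side, the inequality follows. The converse inequality follows from which holds for every $\mathcal{P}_{1,\infty} \in \mathcal{E}$.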

Relation to Topological Entropy
In order to prove a variational inequality, we consider a topological NDS $(X_{1,\infty}, f_{1,\infty})$ with an $f_{1,\infty}$-invariant sequence $\mu_{1,\infty}$ of Borel probability measures. When speaking of measurable partitions in this context, we mean "exact" partitions and not partitions in the sense of measure theory, where different elements of the partition may have a nonempty overlap of measure zero. We will frequently use the property of inner regularity of Borel measures, i.e., $\mu(A) = \sup\{\mu(K) : K \subset A \text{ compact}\}$ for any Borel subset $A$ of a compact metric space.

The Misiurewicz Class
In this subsection, we introduce a special admissible class which we will use to prove the variational inequality. This class is constructed in such a way that the arguments of Misiurewicz's proof of the variational principle apply directly to its elements. Therefore, we call it the Misiurewicz class.
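We define the Misiurewicz class $\mathcal{E}_M \subset \mathcal{E}_{\max}$ as follows. A sequence $\mathcal{P}_{1,\infty} \in \mathcal{E}_{\max}$, $\mathcal{P}_n = \{P_{n,1}, \dots, P_{n,k_n}\}$, is an element of $\mathcal{E}_M$ iff for every $\varepsilon > 0$ there exist $\delta > 0$ and compact sets $C_{n,i} \subset P_{n,i}$ ($n \ge 1$, $1 \le i \le k_n$) such that for every $n \ge 1$ the following two hypotheses are satisfied:
(a) $\mu_n(P_{n,i} \setminus C_{n,i}) \le \varepsilon$ for $1 \le i \le k_n$.
(b) $\varrho_n(x, y) \ge \delta$ for all $x \in C_{n,i}$ and $y \in C_{n,j}$ with $i \ne j$.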
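A simple example may help to see how $\varepsilon$ and $\delta$ interact. The following sketch is not taken from the source; it assumes $X_n = [0,1]$ with the Euclidean metric, $\mu_n$ equal to Lebesgue measure for every $n$, and it reads hypothesis (b) as a uniform separation condition between the compact sets:

```latex
% Illustrative example (not from the source): X_n = [0,1], \mu_n = Lebesgue measure.
% Assumption: hypothesis (b) asks for \varrho_n(x,y) \ge \delta whenever x and y
% lie in different sets C_{n,i}, C_{n,j}.
\[
  P_{n,1} = [0, \tfrac12), \qquad P_{n,2} = [\tfrac12, 1].
\]
% For 0 < \varepsilon < 1/2 choose
\[
  C_{n,1} = [0, \tfrac12 - \varepsilon], \qquad C_{n,2} = [\tfrac12, 1],
  \qquad \delta = \varepsilon .
\]
% Then (a) and the assumed (b) hold for every n:
\[
  \mu_n(P_{n,1} \setminus C_{n,1}) = \varepsilon, \qquad
  \mu_n(P_{n,2} \setminus C_{n,2}) = 0, \qquad
  \min\{|x - y| : x \in C_{n,1},\ y \in C_{n,2}\} = \varepsilon = \delta .
\]
```

For $\varepsilon \ge 1/2$ the sets chosen for any smaller value of $\varepsilon$ still satisfy both conditions. The point of the definition is that $\delta$ may shrink with $\varepsilon$ but must not depend on $n$.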

Proposition:
If f 1,∞ is equicontinuous, then E M is an admissible class.
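Proof: First note that $\mathcal{E}_M$ is nonempty, since it contains the trivial sequence defined by $\mathcal{P}_n := \{X_n\}$ for all $n \ge 1$. To show that $\mathcal{E}_M$ satisfies Axiom (B), assume that $\mathcal{P}_{1,\infty} = \{\mathcal{P}_n\} \in \mathcal{E}_M$, $\mathcal{P}_n = \{P_{n,1}, \dots, P_{n,k_n}\}$, and let $\mathcal{Q}_{1,\infty}$ be a sequence which is coarser than $\mathcal{P}_{1,\infty}$. Let $\mathcal{Q}_n$ be given by $\mathcal{Q}_n = \{Q_{n,1}, \dots, Q_{n,l_n}\}$.
Then every element of $\mathcal{Q}_n$ must be a disjoint union of elements of $\mathcal{P}_n$: $Q_{n,i} = \bigcup_{\alpha=1}^{N_{n,i}} P_{n,j_\alpha}$.
Since $\mathcal{P}_{1,\infty} \in \mathcal{E}_M$, we can choose compact sets $C_{n,i} \subset P_{n,i}$ and $\delta > 0$, depending on a given $\tilde\varepsilon = \varepsilon / (\max_{n \ge 1} \#\mathcal{P}_n)$, such that (a) and (b) hold for $\mathcal{P}_{1,\infty}$. Define $D_{n,i} := \bigcup_{\alpha=1}^{N_{n,i}} C_{n,j_\alpha}$. It is clear that $D_{n,i}$ is a compact subset of $Q_{n,i}$. Moreover, it holds that
$$\mu_n(Q_{n,i} \setminus D_{n,i}) = \sum_{\alpha=1}^{N_{n,i}} \mu_n(P_{n,j_\alpha} \setminus C_{n,j_\alpha}) \le N_{n,i}\, \tilde\varepsilon \le \varepsilon,$$
since each $C_{n,j_\alpha}$ is disjoint from all $C_{n,j_\beta}$ with $\beta \ne \alpha$. Hence, $\mathcal{Q}_{1,\infty} \in \mathcal{E}_M$. To show that Axiom (C) holds, let $\mathcal{P}_{1,\infty} = \{\mathcal{P}_n\} \in \mathcal{E}_M$, $\mathcal{P}_n = \{P_{n,1}, \dots, P_{n,k_n}\}$, and $m \ge 1$.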
Consider the sequence $\mathcal{P}^m(f_{1,\infty})$ and the sets $D_{r,(j_0,\dots,j_{m-1})} := \bigcap_{i=0}^{m-1} f_r^{-i}(C_{r+i,j_i})$. These sets are obviously compact subsets of $X_r$ and each element of $\mathcal{P}^m_r(f_{1,\infty})$ contains exactly one such set. We have Finally, in order to show that (b) holds for $\mathcal{P}^m(f_{1,\infty})$, we need the assumption of equicontinuity for $f_{1,\infty}$, which yields a number $\rho > 0$ such that $\varrho_r(x, y) < \rho$ implies $\varrho_{r+i}(f_r^i(x), f_r^i(y)) < \delta$ for all $r \ge 1$ and $i = 0, 1, \dots, m-1$ (cf. the proof of Lemma 2.3). Now consider two sets $D_{r,(j_0,\dots,j_{m-1})}$ and $D_{r,(l_0,\dots,l_{m-1})}$. These sets are disjoint iff there is an index $\alpha \in \{0, 1, \dots, m-1\}$ such that $j_\alpha \ne l_\alpha$. For $x$ in the first set and $y$ in the second, this implies $\varrho_{r+\alpha}(f_r^\alpha(x), f_r^\alpha(y)) \ge \delta$, and hence $\varrho_r(x, y) \ge \rho$. Thus, we have found that for every $r \ge 1$ it holds that

In [13, Thm. B] it is shown that an equiconjugacy preserves the topological entropy of a topological NDS. An equiconjugacy between systems $f_{1,\infty}$ and $g_{1,\infty}$ is an equicontinuous sequence $\pi_{1,\infty} = \{\pi_n\}$ of homeomorphisms such that $\{\pi_n^{-1}\}$ is also equicontinuous and $\pi_{n+1} \circ f_n = g_n \circ \pi_n$. The following proposition shows that an equiconjugacy also preserves the Misiurewicz class and hence the associated metric entropy.

The Variational Inequality
Now we are in a position to prove the general variational inequality following the lines of Misiurewicz's proof [19].

Theorem:
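For an equicontinuous topological NDS $(X_{1,\infty}, f_{1,\infty})$ with an invariant sequence $\mu_{1,\infty}$ it holds that
$$h_{\mathcal{E}_M}(f_{1,\infty}) \le h_{\mathrm{top}}(f_{1,\infty}).$$

Proof: Let $\mathcal{P}_{1,\infty} \in \mathcal{E}_M$. We may assume that each $\mathcal{P}_n$ has the same number $k$ of elements, $\mathcal{P}_n = \{P_{n,1}, \dots, P_{n,k}\}$. By definition of the Misiurewicz class, we find compact sets $Q_{n,i} \subset P_{n,i}$ (for all $n$, $i$) and $\delta > 0$ such that
$$\mu_n(P_{n,i} \setminus Q_{n,i}) \le \frac{1}{k \log k}, \quad i = 1, \dots, k, \ n \ge 1,$$
$$\varrho_n(x, y) \ge \delta \quad \text{for all } x \in Q_{n,i},\ y \in Q_{n,j},\ i \ne j,\ n \ge 1. \tag{9}$$
By setting $Q_{n,0} := X_n \setminus \bigcup_{i=1}^{k} Q_{n,i}$ we can define another sequence $\mathcal{Q}_{1,\infty}$ of measurable partitions $\mathcal{Q}_n := \{Q_{n,0}, Q_{n,1}, \dots, Q_{n,k}\}$. As in Misiurewicz's proof one finds $H_{\mu_n}(\mathcal{P}_n \mid \mathcal{Q}_n) \le 1$, which by Proposition 3.2 (vi) leads to the inequality
$$h(f_{1,\infty}; \mathcal{P}_{1,\infty}) \le h(f_{1,\infty}; \mathcal{Q}_{1,\infty}) + 1. \tag{10}$$
Define a sequence $\mathcal{U}_{1,\infty}$ of open covers $\mathcal{U}_n$ of $X_n$ by $\mathcal{U}_n := \{Q_{n,0} \cup Q_{n,1}, \dots, Q_{n,0} \cup Q_{n,k}\}$.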
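The conditional entropy bound quoted from Misiurewicz's proof can be verified in a few lines. The following is a sketch of the standard computation, assuming $k \ge 2$ and using the convention that terms with $\mu_n(Q_{n,j}) = 0$ are omitted from the sum:

```latex
% Sketch of the estimate H_{\mu_n}(\mathcal{P}_n | \mathcal{Q}_n) \le 1 (assumes k \ge 2).
\begin{align*}
H_{\mu_n}(\mathcal{P}_n \mid \mathcal{Q}_n)
  &= \sum_{j=0}^{k} \mu_n(Q_{n,j})\, H_{\mu_n(\cdot \mid Q_{n,j})}(\mathcal{P}_n) \\
  &= \mu_n(Q_{n,0})\, H_{\mu_n(\cdot \mid Q_{n,0})}(\mathcal{P}_n)
     && \text{since } Q_{n,j} \subset P_{n,j} \text{ for } j \ge 1 \\
  &\le \mu_n(Q_{n,0}) \log k
     && \text{since } \mathcal{P}_n \text{ has } k \text{ elements} \\
  &= \Bigl( \sum_{i=1}^{k} \mu_n(P_{n,i} \setminus Q_{n,i}) \Bigr) \log k
   \le k \cdot \frac{1}{k \log k} \cdot \log k = 1 .
\end{align*}
```

The bound is uniform in $n$, which is what allows it to be passed through to the entropies of the two sequences of partitions.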
To see that the sets $Q_{n,0} \cup Q_{n,i}$ are open, consider their complements $Q_{n,1} \cup \dots \cup Q_{n,i-1} \cup Q_{n,i+1} \cup \dots \cup Q_{n,k}$, which are finite unions of compact sets and hence closed. For a fixed $m \ge 1$, let $E_m \subset X_1$ be a maximal $(m, \delta)$-separated set. From (9) it follows that each $(\delta/2)$-ball in $X_n$ intersects at most two elements of $\mathcal{Q}_n$ for any $n \ge 1$. Hence, we can associate to each $x \in E_m$ at most $2^m$ different

Consequently, we obtain
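Using (10), we therefore have
$$h(f_{1,\infty}; \mathcal{P}_{1,\infty}) \le h_{\mathrm{top}}(f_{1,\infty}) + \log 2 + 1.$$
Taking the supremum over all $\mathcal{P}_{1,\infty} \in \mathcal{E}_M$, we find
$$h_{\mathcal{E}_M}(f_{1,\infty}) \le h_{\mathrm{top}}(f_{1,\infty}) + \log 2 + 1.$$
That the constant term $\log 2 + 1$ can be omitted in this estimate now follows from a careful application of the power rules for topological and metric entropy.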
Inspecting the definition of the Misiurewicz class, one sees that for every $k \ge 1$ the admissible class E 1,∞ . Therefore, the arguments that we have applied to the system $(X_{1,\infty}, f_{1,\infty})$ can equally be applied to all of the power systems $(X^{[k]}_{1,\infty}, f^{[k]}_{1,\infty})$, $k \ge 1$. Hence, using the power rules (Proposition 2.2 and Proposition 3.18), we obtain Since this holds for every $k \ge 1$, sending $k$ to infinity gives the result.
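The way the power rules remove the additive constant can be sketched as follows. This is a hedged reconstruction rather than the author's displayed formula: the symbols $h_{\mathrm{top}}$ and $\mathcal{E}_M^{[k]}$ are notational assumptions, and the sketch assumes that the estimate with constant $\log 2 + 1$ applies to the class $\mathcal{E}_M^{[k]}$ on each power system, that the metric power rule reads $h_{\mathcal{E}^{[k]}}(f^{[k]}_{1,\infty}) = k\, h_{\mathcal{E}}(f_{1,\infty})$, and that the topological power rule gives at least $h_{\mathrm{top}}(f^{[k]}_{1,\infty}) \le k\, h_{\mathrm{top}}(f_{1,\infty})$.

```latex
% Sketch under the assumptions stated in the lead-in paragraph.
\begin{align*}
k\, h_{\mathcal{E}_M}(f_{1,\infty})
  &= h_{\mathcal{E}_M^{[k]}}\bigl(f^{[k]}_{1,\infty}\bigr)
   \le h_{\mathrm{top}}\bigl(f^{[k]}_{1,\infty}\bigr) + \log 2 + 1
   \le k\, h_{\mathrm{top}}(f_{1,\infty}) + \log 2 + 1, \\
h_{\mathcal{E}_M}(f_{1,\infty})
  &\le h_{\mathrm{top}}(f_{1,\infty}) + \frac{\log 2 + 1}{k}
  \quad \text{for every } k \ge 1 .
\end{align*}
```

Since the left-hand side of the last inequality does not depend on $k$, letting $k \to \infty$ removes the constant.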
An interesting corollary of Theorem 4.3 is the following generalized variational principle for autonomous systems.

Corollary:
For a topological autonomous system $(X, f)$ it holds that
$$\sup_{\mu_{1,\infty}} h_{\mathcal{E}_M}(f) = h_{\mathrm{top}}(f),$$
where the supremum is taken over all sequences $\mu_{1,\infty}$ with $f\mu_n \equiv \mu_{n+1}$.
Proof: The inequality "≤" holds by Theorem 4.3. The converse inequality follows from the classical variational principle, if we consider only the constant sequences µ 1,∞ , i.e., the invariant measures of f , and assure ourselves that the associated Misiurewicz classes contain all constant sequences.

Corollary:
Let $f_{1,\infty}$ be an equicontinuous sequence of (not necessarily strictly) monotone maps $f_n : X \to X$, where $X$ is either a compact interval or a circle. Then for every $f_{1,\infty}$-invariant sequence $\mu_{1,\infty}$ it holds that $h_{\mathcal{E}_M}(f_{1,\infty}) = 0$.
Proof: This follows from [13, Thm. D], which asserts that the corresponding topological entropy is zero.

Large Misiurewicz Classes
Up to now, we only know that the Misiurewicz class $\mathcal{E}_M$ contains the trivial sequence of partitions. If it contained no further sequences, Theorem 4.3 would not give any valuable information on the metric or topological entropy. The aim of this subsection is to find conditions on invariant sequences of measures which give rise to a large Misiurewicz class. The simplest case is a system $(X_{1,\infty}, f_{1,\infty}, \mu_{1,\infty})$ where both $X_{1,\infty}$ and $\mu_{1,\infty}$ are constant, say $X_n \equiv X$ and $\mu_n \equiv \mu$. Then any finite measurable partition $\mathcal{P}$ of $X$ gives rise to a constant sequence $\mathcal{P}_n \equiv \mathcal{P}$ of partitions which is obviously contained in $\mathcal{E}_M$. The following proposition slightly generalizes this situation.

Proof: We first show that every Borel set $A \subset X$ can be approximated by compact subsets uniformly for all $\mu_n$. The strong topology is characterized by Let $C$ be the strong closure of $\mu_{1,\infty}$, and let $A \subset X$ be a Borel set and $\varepsilon > 0$. For each $\mu \in C$ there exists a compact set $B_\mu \subset A$ such that $\mu(A \setminus B_\mu) \le \varepsilon/2$. Now take a neighborhood $U_\mu$ of $\mu$ in $C$ such that $|\nu(A \setminus B_\mu) - \mu(A \setminus B_\mu)| \le \varepsilon/2$ for all $\nu \in U_\mu$. Then for every $\nu \in U_\mu$ we have
$$\nu(A \setminus B_\mu) \le |\nu(A \setminus B_\mu) - \mu(A \setminus B_\mu)| + \mu(A \setminus B_\mu) \le \varepsilon.$$
We can cover the compact set $C$ by finitely many such neighborhoods, say $U_{\mu_1}, \dots, U_{\mu_r}$. Then $B := \bigcup_{i=1}^{r} B_{\mu_i}$ is a compact subset of $A$ which satisfies $\nu(A \setminus B) \le \varepsilon$ for all $\nu \in C$, so in particular for all $\nu = \mu_n$. Now let $\mathcal{P} = \{P_1, \dots, P_k\}$ be a finite measurable partition of the state space $X$. Then for any given $\varepsilon > 0$ we find compact sets $C_i \subset P_i$ such that $\mu_n(P_i \setminus C_i) \le \varepsilon$ for all $n \ge 1$ and $i = 1, \dots, k$. Moreover, since the sets $C_i$ are compact and pairwise disjoint,
$$\delta := \min_{i \ne j} \min\{\varrho(x, y) : x \in C_i,\ y \in C_j\} > 0.$$
This implies that the constant sequence $\mathcal{P}_n \equiv \mathcal{P}$ is an element of $\mathcal{E}_M$.

Example:
Consider a system which is given by a periodic sequence Let $\mu_1$ be an $f_1^N$-invariant probability measure on $X$ (which exists by the theorem of Krylov-Bogolyubov). Define Then $\mu_{1,\infty}$ is an $f_{1,\infty}$-invariant sequence, which follows from Clearly, $\{\mu_1, \dots, \mu_N\}$ is compact.
The assumption that the closure of {µ n } should be compact still seems to be very restrictive. The next result provides another condition.

Lemma:
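Let $(X, \varrho)$ be a compact metric space with a Borel probability measure $\mu$. Let $A \subset X$ be a Borel set with $\mu(\partial A) = 0$. Then $A$ can be approximated by compact subsets with zero boundaries, i.e.,
$$\mu(A) = \sup\{\mu(K) : K \subset A \text{ compact},\ \mu(\partial K) = 0\}.$$

Proof: We can assume without loss of generality that $\partial A \ne \emptyset$, since otherwise $A$ is closed and hence compact itself. For every $\varepsilon > 0$ define the set
$$K_\varepsilon := \{x \in A : \operatorname{dist}(x, \partial A) \ge \varepsilon\}.$$
We claim that each $K_\varepsilon$ is a closed subset of $X$ and hence compact. To this end, consider a sequence $x_n \in K_\varepsilon$ with $x_n \to x \in X$. By continuity of $\operatorname{dist}(\cdot, \partial A)$, it follows that $\operatorname{dist}(x, \partial A) \ge \varepsilon$ and $x \in \operatorname{cl} A$. Assume to the contrary that $x \in \partial A$. Then $\varepsilon \le \operatorname{dist}(x, \partial A) = 0$, a contradiction. Hence, $x \in K_\varepsilon$.

We further claim that $\mu(K_\varepsilon) \to \mu(A)$ for $\varepsilon \to 0$. To show this, take an arbitrary strictly decreasing sequence $\varepsilon_n \to 0$. Then $K_{\varepsilon_n} \subset K_{\varepsilon_{n+1}}$ for all $n \ge 1$. Hence, by continuity of the measure $\mu$ and the assumption that $\mu(\partial A) = 0$, it follows that
$$\mu(A) = \mu(A \setminus \partial A) = \mu\Bigl(\bigcup_{n \ge 1} K_{\varepsilon_n}\Bigr) = \lim_{n \to \infty} \mu(K_{\varepsilon_n}).$$
To conclude the proof, it suffices to show that one can choose the sequence $\varepsilon_n$ such that $\mu(\partial K_{\varepsilon_n}) = 0$. To this end, we first show that for $\delta_1 < \delta_2$ the boundaries of $K_{\delta_1}$ and $K_{\delta_2}$ are disjoint. Assume to the contrary that there exists $x \in \partial K_{\delta_1} \cap \partial K_{\delta_2}$. Then, by continuity of the dist-function, $\operatorname{dist}(x, \partial A) \ge \delta_1$ and $\operatorname{dist}(x, \partial A) \ge \delta_2$. However, if one of these inequalities were strict, the point $x$ would be contained in the interior of the corresponding set. Hence, $\operatorname{dist}(x, \partial A) = \delta_1 < \delta_2 = \operatorname{dist}(x, \partial A)$, a contradiction.

Now, we can construct the desired sequence $\varepsilon_n \to 0$ as follows. Fix $n \in \mathbb{N}$ and assume to the contrary that $\mu(\partial K_\varepsilon) > 0$ for all $\varepsilon \in (1/(n+1), 1/n)$. Define the sets $I_m := \{\varepsilon \in (1/(n+1), 1/n) : \mu(\partial K_\varepsilon) \ge 1/m\}$. Then $(1/(n+1), 1/n) = \bigcup_{m \in \mathbb{N}} I_m$ and hence one of the sets $I_m$, say $I_{m_0}$, must be uncountable. However, since the boundaries of the $K_\varepsilon$ are disjoint, this would imply that the set $\bigcup_{\varepsilon \in I_{m_0}} \partial K_\varepsilon$ has infinite measure, which is impossible for a probability measure. Hence, we can take $\varepsilon_n \in (1/(n+1), 1/n)$ with $\mu(\partial K_{\varepsilon_n}) = 0$.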
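The construction in the proof can be made concrete in one dimension. The following example is illustrative only and not part of the source; it assumes $X = [0,1]$ with Lebesgue measure and takes the approximating sets to be those points of $A$ whose distance to $\partial A$ is at least $\varepsilon$:

```latex
% Illustrative example (not from the source): X = [0,1], \mu = Lebesgue measure,
% A = (a,b) with 0 < a < b < 1, so \partial A = \{a, b\} and \mu(\partial A) = 0.
% Assumption: K_\varepsilon = \{x \in A : \mathrm{dist}(x, \partial A) \ge \varepsilon\}.
\[
  K_\varepsilon = [a + \varepsilon,\; b - \varepsilon]
  \qquad \text{for } 0 < \varepsilon \le \tfrac{b-a}{2},
\]
\[
  \mu(\partial K_\varepsilon) = \mu(\{a + \varepsilon,\, b - \varepsilon\}) = 0,
  \qquad
  \mu(K_\varepsilon) = (b - a) - 2\varepsilon \;\longrightarrow\; b - a = \mu(A)
  \quad (\varepsilon \to 0).
\]
```

For Lebesgue measure every $\varepsilon$ works. If $\mu$ had an atom at a point $p \in (a,b)$, the values $\varepsilon = p - a$ and $\varepsilon = b - p$ would have to be avoided, which is the situation the selection of $\varepsilon_n$ in the proof is designed to handle.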

Proposition:
Let $(X_{1,\infty}, f_{1,\infty})$ be an equicontinuous system such that $X_{1,\infty}$ is constant and let $\mu_{1,\infty} = \{\mu_n\}$ be an $f_{1,\infty}$-invariant sequence. Assume that the measures in the weak$^*$-closure of $\{\mu_n\}$ are pairwise equivalent. Then $\mathcal{E}_M$ contains all constant sequences of partitions whose members have zero boundaries (with respect to the measures $\mu_n$).
Since $\partial(P_i \setminus C_{\nu,i}) \subset \partial P_i \cup \partial C_{\nu,i}$ and hence $\nu(\partial(P_i \setminus C_{\nu,i})) = 0$, the Portmanteau theorem yields a weak$^*$-neighborhood $U_\nu \subset C$ of $\nu$ such that for every $\mu \in U_\nu$ it holds that $|\nu(P_i \setminus C_{\nu,i}) - \mu(P_i \setminus C_{\nu,i})| \le \varepsilon/2$. Therefore, $\mu(P_i \setminus C_{\nu,i}) \le \varepsilon$ for all $\mu \in U_\nu$. Since $C$ is weakly$^*$-compact, we can cover $C$ with finitely many of these neighborhoods, say $U_{\nu_1}, \dots, U_{\nu_r}$. Then $C_i := \bigcup_{j=1}^{r} C_{\nu_j,i}$ is a compact subset of $P_i$ for $1 \le i \le k$ and for every $\mu \in C$ it holds that $\mu(P_i \setminus C_i) \le \varepsilon$, in particular for all $\mu = \mu_n$. This implies that the constant sequence $\mathcal{P}_n \equiv \mathcal{P}$ is in $\mathcal{E}_M$.

Remark:
Note that every compact metric space admits finite measurable partitions into sets with arbitrarily small diameters and zero boundaries (cf. [11, Lem. 4.5.1]).

Example:
An example of systems with invariant sequences satisfying the assumption of Proposition 4.9 can be found in [23]: Let $M$ be a compact connected Riemannian manifold. By $d(\cdot, \cdot)$ denote the Riemannian distance and by $m$ the Riemannian volume measure. For simplicity, we will assume that $m(M) = 1$, so $m$ is a probability measure. For constants $\lambda > 1$ and $\Gamma > 0$ consider the set $E(\lambda, \Gamma) := \{ f \in C^2(M, M) : f \text{ expanding with factor } \lambda,\ \ldots \}$, where "expanding with factor $\lambda$" means that $|Df_x(v)| \ge \lambda |v|$ holds for all $x \in M$ and all tangent vectors $v \in T_x M$. We will consider an NDS $f_{1,\infty} = \{f_n\}$ on $M$ with $f_n \in E(\lambda, \Gamma)$ for fixed $\lambda > 1$ and $\Gamma > 0$. It is clear that such a system is equicontinuous. We define For any expanding map $f : M \to M$ we write
$$P_f(\varphi)(x) := \sum_{y \in f^{-1}(x)} \frac{\varphi(y)}{|\det Df(y)|}$$
for the Perron-Frobenius operator associated with $f$ acting on densities $\varphi \in D$.
Note that this makes sense, since the expanding maps are covering maps, and hence the sets f −1 (x) are finite, all having the same number of elements.
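The role of the Perron-Frobenius operator is to track densities under push-forward. The following sketch records the change-of-variables identity behind this; it assumes the standard form of the operator as a sum over the finitely many preimages, weighted by the inverse Jacobian determinant, and the test function $g$ is introduced here only for illustration:

```latex
% Sketch, assuming the standard form of the Perron-Frobenius operator,
%   P_f(\varphi)(x) = \sum_{y \in f^{-1}(x)} \varphi(y) / |\det Df(y)|,
% for an expanding (hence covering) C^2 map f and a density \varphi.
% For every bounded measurable test function g : M \to \mathbb{R}:
\[
  \int_M (g \circ f)\, \varphi \, dm
  = \int_M g(x) \sum_{y \in f^{-1}(x)} \frac{\varphi(y)}{|\det Df(y)|}\, dm(x)
  = \int_M g \cdot P_f(\varphi)\, dm .
\]
% Taking g = 1_A shows that the image of \varphi\,dm under f has density P_f(\varphi):
\[
  (\varphi\, dm)\bigl(f^{-1}(A)\bigr) = \int_A P_f(\varphi)\, dm .
\]
```

Applied to a composition of the maps $f_n$, this identity expresses each measure of an invariant sequence started at $\varphi\, dm$ as a density with respect to $m$.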
Now let $\varphi \in D$. We claim that the $f_{1,\infty}$-invariant sequence defined by $\mu_1 := \varphi\, dm$ and $\mu_n := f_1^{n-1} \mu_1$ for all $n \ge 2$ has the property that the elements of the weak$^*$-closure of $\{\mu_n\}_{n \in \mathbb{N}}$ are pairwise equivalent. To show this, let $L > 0$ be chosen such that $\varphi \in D_L$ and note that $\mu_{n+1} = P_{f_1^n}(\varphi)\, dm$ for all $n$. By [23, Prop. 2.3], there exist $L^* > 0$ and $\tau \ge 1$ such that $P_{f_1^n}(\varphi) \in D_{L^*}$ for all $n \ge \tau$. Hence, we may assume that $P_{f_1^n}(\varphi) \in D_{L^*}$ for all $n$. We will first show that the densities in $D_{L^*}$ are uniformly bounded away from zero and infinity and that they are equicontinuous. Assume to the contrary that there are $\varphi_n \in D_{L^*}$ and $x_n \in M$ such that $\varphi_n(x_n) \ge n$. Without loss of generality, we may assume that $\varphi_n(x_n) = \max_{x \in M} \varphi_n(x)$. Choosing $\delta \in (0, \varepsilon]$ with $L\delta < 1$, we obtain Since $m(B(x_n, \delta))$ is bounded away from zero, this is a contradiction. Hence, the functions in $D_{L^*}$ are uniformly bounded by some constant $K$. This immediately implies equicontinuity, since for $x, y \in M$ with $d(x, y) < \varepsilon$ we have
$$|\varphi(x) - \varphi(y)| = \varphi(y) \left| \frac{\varphi(x)}{\varphi(y)} - 1 \right| \le K L\, d(x, y).$$
To show that the $\varphi \in D_{L^*}$ are uniformly bounded away from zero, assume to the contrary that there exist $\varphi_n \in D_{L^*}$ and $x_n \in M$ such that $\varphi_n(x_n) \to 0$. By compactness, we may assume that $x_n \to x$. Then $|\varphi_n(x) - \varphi_n(x_n)| \le K L\, d(x, x_n) \to 0$, which implies $\varphi_n(x) \to 0$.
cannot be broken apart into two invariant subsets of positive measure, one can use the same definition for a metric NDS on a single probability space. However, this definition is probably too strict. It seems more likely that for different purposes different analogues of ergodicity of varying strength will fit.
• A natural next step in developing the entropy theory for nonautonomous systems is to study to what extent the variational inequality (Theorem 4.3) can be extended to a full variational principle. Another interesting question is under which conditions there exist reasonably small generating sets for the Misiurewicz class.
• The classical Pesin formula and Margulis-Ruelle inequality relate the metric entropy of a diffeomorphism to its Lyapunov exponents, given by the Multiplicative Ergodic Theorem. It is an interesting and probably very far-reaching question to what extent such results can be transferred to the nonautonomous case.
• The notion of metric entropy in this paper also generalizes the metric sequence entropy introduced in Kushnirenko [16]. It might be an interesting topic for future research to look for generalizations of the known results about metric sequence entropy.