BY 4.0 license Open Access Published by De Gruyter December 3, 2021

Gradient estimates for an orthotropic nonlinear diffusion equation

  • Pierre Bousquet, Lorenzo Brasco, Chiara Leone and Anna Verde

Abstract

We consider a quasilinear degenerate parabolic equation driven by the orthotropic p-Laplacian. We prove that local weak solutions are locally Lipschitz continuous in the spatial variable, uniformly in time.

MSC 2010: 35K65; 35B65; 35K92

1 Introduction

1.1 Aim of the paper

Let $\Omega \subset \mathbb{R}^N$ be an open bounded set and $I \subset \mathbb{R}$ an open bounded interval. We study the gradient regularity of local weak solutions to the following parabolic equation:

$$ u_t = \sum_{i=1}^N \left( |u_{x_i}|^{p-2}\, u_{x_i} \right)_{x_i} \quad \text{in } I \times \Omega. \tag{1.1} $$

Evolution equations of this type have been studied since the 1960s, especially by the Soviet school, see for example the paper [27] by Vishik. Equation (1.1) also explicitly appears in the monographs [21], [23, Example 4.A, Chapter III] and [29, Example 30.8], among others.

In this paper, we will focus on the case $p \ge 2$. We first observe that (1.1) looks quite similar to the more familiar one

$$ u_t = \Delta_p u \quad \text{in } I \times \Omega, \tag{1.2} $$

which involves the $p$-Laplace operator

$$ \Delta_p u = \sum_{i=1}^N \left( |\nabla u|^{p-2}\, u_{x_i} \right)_{x_i}. $$

Indeed, both parabolic equations are particular instances of equations of the type

$$ u_t = \operatorname{div} \nabla F(\nabla u) $$

with $F : \mathbb{R}^N \to \mathbb{R}$ a convex function which satisfies the structural conditions

$$ \langle \nabla F(z), z \rangle \ge \frac{1}{C}\, |z|^p \quad \text{and} \quad |\nabla F(z)| \le C\, |z|^{p-1} \quad \text{for every } z \in \mathbb{R}^N. $$

Then the basic regularity theory equally applies to both (1.1) and (1.2). The standard reference in the field is DiBenedetto’s monograph [13], where one can find boundedness results for the solution $u$ (see [13, Chapter V]), Hölder continuity estimates for $u$ (see [13, Chapter III]), as well as the Harnack inequality for positive solutions (see [13, Chapter VI]). At a technical level, there is no distinction to be made between (1.1) and (1.2).

In contrast, when coming to the regularity of $\nabla u$ (i.e. boundedness and continuity), the situation becomes considerably more complicated. Let us start from (1.2). DiBenedetto and Friedman [14] have proved that the gradients of solutions to this equation are bounded. This is the starting point to obtain the continuity of the gradients for any

$$ p > \frac{2N}{N+2}. $$

We refer again to DiBenedetto’s book for a comprehensive collection of results on the subject, notably to [13, Chapter VIII]. Since then, there has been a growing literature concerning the regularity for nonlinear, possibly degenerate or singular, parabolic equations (or systems), the main model of which is given by the evolutionary $p$-Laplacian equation (1.2). Without any attempt at completeness, we just mention some classical references [15, 12, 28, 8, 9], up to the most recent contributions on the subject, given by [2, 19, 18], among others.

However, none of these results apply to our equation (1.1). Indeed, all of them rely on the fact that the loss of ellipticity of the operator $\operatorname{div} \nabla F$ is restricted to a single point, since the Hessian $D^2 F$ behaves as in the model case (1.2)

$$ \langle D^2 F(z)\, \xi, \xi \rangle \ge \frac{1}{C}\, |z|^{p-2}\, |\xi|^2, $$

where the elliptic character is lost only for $z = 0$. Such a property dramatically breaks down for our equation (1.1). Indeed, in this case, the function $F$ has the following orthotropic structure:

$$ F(z) = \frac{1}{p} \sum_{i=1}^N |z_i|^p \quad \text{for every } z \in \mathbb{R}^N. \tag{1.3} $$

The Hessian matrix of $F$ now degenerates on an unbounded set, namely the set of those $z \in \mathbb{R}^N$ such that at least one component $z_i$ is $0$. As a consequence, the aforementioned references do not provide any regularity results for the gradients of the solutions.
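The contrast between the two Hessians can be checked numerically. The following Python sketch is only an illustration: the function names are ours (not from the paper), and the formulas are the standard second derivatives of $\frac1p\sum_i |z_i|^p$ and $\frac1p |z|^p$. It evaluates both quadratic forms at a point $z \neq 0$ with one vanishing component.

```python
def orthotropic_hessian_form(z, xi, p):
    """<D^2F(z) xi, xi> for F(z) = (1/p) * sum_i |z_i|^p (the Hessian is diagonal)."""
    return (p - 1) * sum(abs(zi) ** (p - 2) * x * x for zi, x in zip(z, xi))


def p_laplacian_hessian_form(z, xi, p):
    """<D^2H(z) xi, xi> for H(z) = (1/p) * |z|^p, valid for z != 0."""
    norm2 = sum(zi * zi for zi in z)           # |z|^2
    dot = sum(zi * x for zi, x in zip(z, xi))  # <z, xi>
    xi2 = sum(x * x for x in xi)               # |xi|^2
    return norm2 ** ((p - 2) / 2) * xi2 + (p - 2) * norm2 ** ((p - 4) / 2) * dot ** 2


# z is far from the origin, but its first component vanishes.
z, xi, p = (0.0, 5.0), (1.0, 0.0), 4
print(orthotropic_hessian_form(z, xi, p))   # ellipticity is lost in the direction e_1
print(p_laplacian_hessian_form(z, xi, p))   # the isotropic model stays elliptic here
```

In this example the orthotropic form vanishes although $|z| = 5$, while the form attached to the $p$-Laplacian is strictly positive.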

The main goal of the present paper is to prove the $L^\infty$ bound on $\nabla u$ for our equation (1.1), thus extending the result by DiBenedetto and Friedman to this more degenerate setting. In order to do this, we will need to adapt to the parabolic setting the machinery that we developed in [3, 4, 5, 6] and [7], for degenerate equations with orthotropic structure. Indeed, the operator

$$ \sum_{i=1}^N \left( |u_{x_i}|^{p-2}\, u_{x_i} \right)_{x_i}, $$

that we called orthotropic p-Laplacian, is the prominent example of this kind of equations. We also refer to [11] for an approach to this operator, based on viscosity techniques.

1.2 Main result

In this paper, we establish the following regularity result which can be seen as the parabolic counterpart of our previous result [5, Theorem 1.1] for the elliptic case. In the statement below, the notation $\nabla u$ refers to the spatial variables, i.e. $\nabla u = (u_{x_1}, \dots, u_{x_N})$.

Main Theorem.

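Let $p > 2$ and let $u \in L^p_{\mathrm{loc}}(I; W^{1,p}_{\mathrm{loc}}(\Omega))$ be a local weak solution of (1.1). Then $\nabla u \in L^\infty_{\mathrm{loc}}(I \times \Omega)$. More precisely, for every parabolic cube

$$ \mathcal{Q}_{\tau,R}(t_0,x_0) := (t_0 - \tau, t_0) \times (x_0 - R, x_0 + R)^N \Subset I \times \Omega, $$

and every $0 < \sigma < 1$, we have

$$ \|\nabla u\|_{L^\infty(\mathcal{Q}_{\sigma\tau,\sigma R}(t_0,x_0))} \le C\, \frac{1}{(1-\sigma)^{\frac{N+2}{2}}} \left( \frac{\tau}{R^2} \right)^{\frac{1}{2}} \left( \fint_{\mathcal{Q}_{\tau,R}(t_0,x_0)} |\nabla u|^p \, dt\, dx \right)^{\frac{1}{2}} + C \left( (1-\sigma)\, \frac{R^2}{\tau} \right)^{\frac{1}{p-2}} \tag{1.4} $$

for a constant $C = C(N,p) > 0$.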

Remark 1.1 (Scalings).

We observe that equation (1.1) is invariant with respect to the “horizontal” and “vertical” scale changes

$$ u_{\lambda,\mu}(t,x) = \mu\, u(\mu^{p-2} \lambda^p\, t, \lambda\, x) $$

for every $\lambda, \mu > 0$. Then it is easily seen that the a priori estimate (1.4) is invariant with respect to these scale changes. We point out that such an estimate is the exact analogue of that for the evolutionary $p$-Laplacian, see [13, Theorem 5.1, Chapter VIII]. Occasionally, in the paper we will work with anisotropic parabolic cubes of the type

$$ Q_R(t_0,x_0) = (t_0 - R^p, t_0) \times (x_0 - R, x_0 + R)^N. $$

The choice of cubes of this type could be loosely justified by a dimensional analysis of the equation. Indeed, by considering the quantity $u$ as dimensionless and using the family of scalings

$$ (t,x) \mapsto (\lambda^p\, t, \lambda\, x) \quad \text{for every } \lambda > 0, $$

we get the relation

$$ \text{time} \sim (\text{length})^p. $$

However, as is well known, estimates on cubes of the type $Q_R$ are too restrictive when looking at $C^{0,\alpha}$ estimates for $\nabla u$. Indeed, in light of the so-called intrinsic geometry, it is much more important to work with local estimates on cubes $\mathcal{Q}_{\tau,R}$, where the time scale $\tau$ is adapted to the solution itself: roughly speaking, we can take

$$ \tau \sim \frac{R^2}{|\nabla u|^{p-2}}. $$

This explains the importance of having (1.4) with two independent scales $R$ and $\tau$. We refer to [13, Chapter VIII] for a description of the method of intrinsic scalings, where these heuristics are clarified.
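The scale invariance of (1.4) claimed in this remark can be verified by plain arithmetic. In the Python sketch below, `rhs_estimate` is a hypothetical helper of ours that evaluates the right-hand side of (1.4) from the mean value of $|\nabla u|^p$ on the cube; the constant $C$ is set to $1$ as a placeholder, since its value is not specified. If $u_{\lambda,\mu}$ lives on $\mathcal{Q}_{\tau,R}$, then $u$ lives on the cube with time scale $\mu^{p-2}\lambda^p \tau$ and radius $\lambda R$, and $\nabla u_{\lambda,\mu}$ equals $\mu\lambda\,\nabla u$ at the rescaled point.

```python
import math


def rhs_estimate(tau, R, mean_energy, sigma, N, p, C=1.0):
    """Right-hand side of (1.4); mean_energy is the average of |grad u|^p on the cube.

    C = 1.0 is a placeholder: the actual constant C(N, p) is not known explicitly.
    """
    first = C / (1 - sigma) ** ((N + 2) / 2) * math.sqrt(tau / R**2) * math.sqrt(mean_energy)
    second = C * ((1 - sigma) * R**2 / tau) ** (1 / (p - 2))
    return first + second


# Sample data (arbitrary values chosen for illustration).
tau, R, E, sigma, N, p = 0.7, 1.3, 2.5, 0.5, 3, 4.0
lam, mu = 1.7, 0.6

# Estimate written for u_{lam,mu} on the cube (tau, R): its mean energy is (lam*mu)^p * E.
for_rescaled = rhs_estimate(tau, R, (lam * mu) ** p * E, sigma, N, p)
# Estimate written for u on the image cube, multiplied by the gradient factor lam*mu.
for_original = lam * mu * rhs_estimate(mu ** (p - 2) * lam**p * tau, lam * R, E, sigma, N, p)

print(math.isclose(for_rescaled, for_original, rel_tol=1e-12))
```

Both terms of (1.4) scale by the same factor $\mu\lambda$ as the left-hand side, which is exactly the invariance stated above.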

Remark 1.2 (Case $1 < p \le 2$).

When $p = 2$, the orthotropic parabolic equation (1.1) boils down to the standard heat equation, for which solutions are well known to be smooth. For this reason, in our statement we restrict our attention to the case $p > 2$. However, we point out that by making the choice $\tau = R^2$ and taking the limit as $p$ goes to $2$ in (1.4), we formally end up with the classical gradient estimate for solutions of the heat equation

$$ \|\nabla u\|_{L^\infty(\mathcal{Q}_{\sigma R^2,\sigma R}(t_0,x_0))} \le \frac{C}{(1-\sigma)^{\frac{N+2}{2}}} \left( \fint_{\mathcal{Q}_{R^2,R}(t_0,x_0)} |\nabla u|^2 \, dt\, dx \right)^{\frac{1}{2}}. $$

In light of the previous remark, in the case $p = 2$ the relation

$$ \text{time} \sim (\text{length})^2 $$

is now the natural one.

As for the singular case $p < 2$, this is somehow simpler than its degenerate counterpart. In this case, the local Lipschitz regularity of solutions to (1.1) can be directly inferred from [22, Theorem 1], under the restriction

$$ p > \frac{2N}{N+2}. $$

Indeed, the result of [22] covers (among others) the case of parabolic equations of the form

$$ u_t = \operatorname{div} \nabla F(\nabla u), $$

under the following assumptions on the convex function $F$:

$$ |\nabla F(z)| \le C\, |z|^{p-1} \quad \text{and} \quad \langle D^2 F(z)\, \xi, \xi \rangle \ge \frac{1}{C}\, |z|^{p-2}\, |\xi|^2 \quad \text{for every } \xi \in \mathbb{R}^N,\ z \in \mathbb{R}^N \setminus \{0\}. $$

It is not difficult to see that the orthotropic function (1.3) for $p < 2$ matches both requirements. Indeed, observe that

$$ \langle D^2 F(z)\, \xi, \xi \rangle = (p-1) \sum_{i=1}^N |z_i|^{p-2}\, |\xi_i|^2 \ge (p-1)\, |z|^{p-2}\, |\xi|^2, $$

thanks to the fact that $p - 2 < 0$ and $|z_i| \le |z|$ for every $i$. Thus in the subquadratic case, the orthotropic structure helps more than it hurts, in a sense.
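The last inequality can be sanity-checked on random data. The Python sketch below uses hypothetical helper names of ours and samples vectors $z$ with all components away from zero, so that the negative powers are finite. It also shows that the same comparison fails for $p > 2$, which is the degenerate situation studied in this paper.

```python
import random


def orthotropic_form(z, xi, p):
    """(p-1) * sum_i |z_i|^(p-2) * xi_i^2, the Hessian form of (1.3)."""
    return (p - 1) * sum(abs(zi) ** (p - 2) * x * x for zi, x in zip(z, xi))


def isotropic_lower_bound(z, xi, p):
    """(p-1) * |z|^(p-2) * |xi|^2."""
    norm_z = sum(zi * zi for zi in z) ** 0.5
    return (p - 1) * norm_z ** (p - 2) * sum(x * x for x in xi)


def inequality_holds(trials=1000, N=3, p=1.5, seed=0):
    """Random check of orthotropic_form >= isotropic_lower_bound for 1 < p < 2."""
    rng = random.Random(seed)
    for _ in range(trials):
        z = [rng.uniform(0.1, 2.0) * rng.choice((-1, 1)) for _ in range(N)]
        xi = [rng.uniform(-1.0, 1.0) for _ in range(N)]
        # small relative tolerance to absorb floating-point rounding
        if orthotropic_form(z, xi, p) < isotropic_lower_bound(z, xi, p) * (1 - 1e-12):
            return False
    return True


print(inequality_holds())
# For p > 2 the comparison breaks down as soon as one component of z vanishes:
print(orthotropic_form((0.0, 1.0), (1.0, 0.0), 4), isotropic_lower_bound((0.0, 1.0), (1.0, 0.0), 4))
```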

Remark 1.3 (Anisotropic diffusion).

We conclude this part by observing that, more generally, one could consider the following parabolic equation:

$$ u_t = \sum_{i=1}^N \left( |u_{x_i}|^{p_i-2}\, u_{x_i} \right)_{x_i} \quad \text{in } I \times \Omega, $$

which still has an orthotropic structure. Now we have a whole set of exponents $1 < p_1 \le p_2 \le \dots \le p_N$, one for each coordinate direction. We cite the paper [24], where some global Lipschitz regularity results are proven for solutions of the relevant Cauchy–Dirichlet problem, under appropriate regularity assumptions on the data. We point out that in light of their global nature, for $p_1 = \dots = p_N = p > 2$ such results are not comparable to ours. We also refer to [10] for a sophisticated Harnack inequality for positive local weak solutions, as well as for some further references on the problem. Finally, the very recent paper [16] contains a thorough study of the Cauchy problem in the case $p_i < 2$, together with some regularity results.

However, as for the counterpart of our Main Theorem for local solutions of this equation, this is still an open problem, to the best of our knowledge.

1.3 Technical aspects of the proof

The core of the proof of the Main Theorem is an a priori Lipschitz estimate for smooth solutions of the orthotropic parabolic equation, see Proposition 4.1 below. More precisely, we introduce the regularized problem

$$ (u^\varepsilon)_t = \operatorname{div} \nabla F_\varepsilon(\nabla u^\varepsilon), $$

where $F_\varepsilon$ is a smooth uniformly convex approximation of the orthotropic function (1.3). By the classical regularity theory, the maps $u^\varepsilon$ are regular enough to justify all the calculations below. The goal is to establish a local uniform Lipschitz estimate on $u^\varepsilon$, which does not depend on the regularization parameter $\varepsilon$. Finally, we let $\varepsilon$ go to $0$ and prove that the family $u^\varepsilon$ converges to the original solution $u$. This allows us to obtain the Lipschitz estimate for $u$ itself.

In the subsequent part of this subsection, we emphasize the main difficulties to get such a Lipschitz estimate on $u^\varepsilon$. In order to simplify the presentation, we drop the index $\varepsilon$ both for $F_\varepsilon$ and $u^\varepsilon$. The strategy is apparently quite classical: we rely on a Moser iterative scheme of reverse Hölder’s inequalities, resulting from the interplay between Caccioppoli estimates and the Sobolev embeddings.

To be more specific, we first differentiate the equation with respect to a spatial variable $x_j$, so as to get the equation solved by the $j$-th component of the gradient. This is given by

$$ \iint_{I \times \Omega} (u_{x_j})_t\, \varphi \, dt\, dx + \iint_{I \times \Omega} \langle (\nabla F(\nabla u))_{x_j}, \nabla \varphi \rangle \, dt\, dx = 0 \quad \text{for every } \varphi \in C_0^\infty(I \times \Omega). \tag{1.5} $$

More generally, the composition of the component $u_{x_j}$ with a non-negative convex function $h$ is a subsolution of this equation. Accordingly, the map $h(u_{x_j})$ satisfies the Caccioppoli inequality which is naturally attached to (1.5) (see Lemma 3.1 below). If $I = (T_0, T_1)$ and $\tau \in (T_0, T_1)$, this reads as follows:

$$
\begin{aligned}
&\chi(\tau) \int_{\{\tau\} \times \Omega} h^2(u_{x_j})\, \eta^2 \, dx + \iint_{(T_0,\tau) \times \Omega} \langle D^2 F(\nabla u)\, \nabla h(u_{x_j}), \nabla h(u_{x_j}) \rangle\, \chi\, \eta^2 \, dt\, dx \\
&\qquad \lesssim \iint_{(T_0,\tau) \times \Omega} \chi'\, \eta^2\, h^2(u_{x_j}) \, dt\, dx + \iint_{(T_0,\tau) \times \Omega} \langle D^2 F(\nabla u)\, \nabla \eta, \nabla \eta \rangle\, h^2(u_{x_j})\, \chi \, dt\, dx.
\end{aligned} \tag{1.6}
$$

Here, the maps $\chi \in C_0^\infty(I)$ and $\eta \in C_0^\infty(\Omega)$ are non-negative cut-off functions in the time and space variables, respectively. We have used the following expedient notation: given $f \in L^1(I \times \Omega)$,

$$ \int_{\{\tau\} \times \Omega} f \, dx := \int_\Omega f(\tau, x) \, dx \quad \text{for a.e. } \tau \in I. $$

When $F$ is a uniformly elliptic integrand, in the sense that

$$ \frac{1}{C}\, |\xi|^2 \le \langle D^2 F(\nabla u)\, \xi, \xi \rangle \le C\, |\xi|^2 \quad \text{for every } \xi \in \mathbb{R}^N, $$

one can easily obtain from (1.6) a crucial “unnatural” feature of the subsolution $h(u_{x_j})$; that is, a sort of reverse Poincaré inequality where the Sobolev norm of $h(u_{x_j})$ is controlled by the $L^2$ norm of the subsolution itself. In conjunction with the Sobolev inequality, this is the cornerstone which eventually leads to the classical version of the Moser iterative scheme. It should be noticed that this strategy still works even in the degenerate case, provided the Hessian behaves like

$$ \frac{1}{C}\, |\nabla u|^{p-2}\, |\xi|^2 \le \langle D^2 F(\nabla u)\, \xi, \xi \rangle \le C\, |\nabla u|^{p-2}\, |\xi|^2 \quad \text{for every } \xi \in \mathbb{R}^N, $$

as for the evolutionary $p$-Laplace equation (1.2). It is sufficient to use the “absorption of degeneracy” trick, where the degenerate weight $|\nabla u|^{p-2}$ is recombined with the subsolution $h(u_{x_j})$ by means of simple algebraic manipulations. This still permits us to infer from (1.6) a control on the Sobolev norm of a suitable convex function of $u_{x_j}$. This is nowadays a standard technique in the field; for the elliptic case, it goes back to the pioneering works by Ural’tseva [26] and Uhlenbeck [25].

As we explained above, due to the severe degeneracy of $D^2 F$ in our orthotropic situation, it is not possible to follow the same path. In order to rely on such an absorption trick, we have to go through a tour de force and to introduce a new family of weird Caccioppoli inequalities (see Lemma 3.2 below). These are the parabolic counterparts of a corresponding estimate introduced in the elliptic setting in [3] and then fruitfully exploited in [5].

The crucial idea is to mix together the components of the gradient with respect to two orthogonal directions. This compensates for the lack of ellipticity of $D^2 F$ and allows us to rely on the Sobolev embeddings in the iterative scheme. We do not detail these Caccioppoli-type estimates here, but instead explain the main additional difficulties with respect to the elliptic framework.

Let us come back for one instant to the standard Caccioppoli inequality (1.6). It follows from (1.5) by taking $\varphi = (h\, h')(u_{x_j})\, \chi\, \eta^2$. In particular, the parabolic term is given by

$$ \iint_{I \times \Omega} (u_{x_j})_t\, \varphi \, dt\, dx = \frac{1}{2} \iint_{I \times \Omega} \left( h^2(u_{x_j}) \right)_t \chi\, \eta^2 \, dt\, dx. $$

Then an integration by parts yields

$$ \iint_{I \times \Omega} (u_{x_j})_t\, \varphi \, dt\, dx = -\frac{1}{2} \iint_{I \times \Omega} h^2(u_{x_j})\, \chi'\, \eta^2 \, dt\, dx. $$

The latter yields the “time slice” term on the left-hand side of (1.6). This way, the time derivative is transferred to $\chi$ and one can handle the factor $h^2(u_{x_j})$ as in the elliptic framework.

For the weird Caccioppoli inequalities, the test function is now $\varphi = u_{x_j}\, \Phi'(u_{x_j}^2)\, \Psi(u_{x_k}^2)\, \chi\, \eta^2$, for some $1 \le j, k \le N$. The corresponding parabolic term becomes

$$ \iint_{I \times \Omega} (u_{x_j})_t\, \varphi \, dt\, dx = \frac{1}{2} \iint_{I \times \Omega} \left( \Phi(u_{x_j}^2) \right)_t \Psi(u_{x_k}^2)\, \chi\, \eta^2 \, dt\, dx. $$

In contrast to the previous situation, we cannot perform an integration by parts to get rid of the time derivative on $\Phi(u_{x_j}^2)$, since it would affect the factor $\Psi(u_{x_k}^2)$. In order to overcome this difficulty, which does not arise in the elliptic setting, we need a new approach, aimed at “symmetrizing” the above quantity containing $u_{x_j}$ and $u_{x_k}$.

Basically, we merge together two weird Caccioppoli inequalities, where the spatial variables $x_j$ and $x_k$ play symmetric roles. More specifically, we insert into (1.5) the test functions

$$ \varphi = u_{x_j}\, \Phi'(u_{x_j}^2)\, \Psi(u_{x_k}^2)\, \chi\, \eta^2, \qquad \widetilde{\varphi} = u_{x_k}\, \Psi'(u_{x_k}^2)\, \Phi(u_{x_j}^2)\, \chi\, \eta^2, $$

and then add the two resulting inequalities. The parabolic term is now replaced by the following quantity:

$$ \frac{1}{2} \iint_{I \times \Omega} \left( \Phi(u_{x_j}^2)\, \Psi(u_{x_k}^2) \right)_t \chi\, \eta^2 \, dt\, dx. $$

This allows us to integrate by parts and transfer the time derivative onto the test function. It turns out that by a suitable adaptation of the arguments that we used in the elliptic case, one can incorporate this new term in the iterative Moser scheme. This finally leads to the desired local $L^\infty$ estimate on $\nabla u$.
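The symmetrization step is nothing but the product rule in time, and it can be checked pointwise on toy data. In the Python sketch below, every choice is an assumption made for illustration only: the functions `a` and `b` stand in for $t \mapsto u_{x_j}(t,x)$ and $t \mapsto u_{x_k}(t,x)$ at a frozen point $x$, the cut-offs $\chi\,\eta^2$ are omitted, and $\Phi(s) = s^2$, $\Psi(s) = 1 + s$ are sample convex non-decreasing functions.

```python
import math


def Phi(s):
    return s * s


def dPhi(s):
    return 2.0 * s


def Psi(s):
    return 1.0 + s


def dPsi(s):
    return 1.0


def a(t):   # stand-in for u_{x_j}(t, x) at a fixed x
    return math.sin(t)


def da(t):
    return math.cos(t)


def b(t):   # stand-in for u_{x_k}(t, x) at a fixed x
    return math.cos(2.0 * t)


def db(t):
    return -2.0 * math.sin(2.0 * t)


def symmetrized_parabolic_term(t):
    """(u_{x_j})_t * phi + (u_{x_k})_t * phi_tilde, cut-off functions omitted."""
    phi = a(t) * dPhi(a(t) ** 2) * Psi(b(t) ** 2)
    phi_tilde = b(t) * dPsi(b(t) ** 2) * Phi(a(t) ** 2)
    return da(t) * phi + db(t) * phi_tilde


def half_time_derivative(t, h=1e-5):
    """(1/2) d/dt [Phi(a^2) Psi(b^2)], approximated by a central difference."""
    def product(s):
        return Phi(a(s) ** 2) * Psi(b(s) ** 2)
    return 0.5 * (product(t + h) - product(t - h)) / (2.0 * h)


t = 0.3
print(abs(symmetrized_parabolic_term(t) - half_time_derivative(t)) < 1e-7)
```

Each of the two summands alone is only a partial time derivative of the product; it is their sum that is an exact time derivative and can therefore be integrated by parts.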

1.4 Plan of the paper

The paper is organized as follows: after collecting the basic terminology and some preliminaries on Steklov averages in Section 2, we present in Section 3 the proofs of the new Caccioppoli inequalities in the parabolic setting. We detail the iterative Moser scheme in Section 4 and finally establish the Main Theorem in Section 5, by transferring to the original solution $u$ the a priori estimates obtained on the approximating solutions $u^\varepsilon$.

2 Preliminaries

2.1 Local solutions

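Let $\Omega \subset \mathbb{R}^N$ be an open bounded set and $I \subset \mathbb{R}$ an open bounded interval. Fix $p > 2$ and take $\mathcal{A} : \mathbb{R}^N \to \mathbb{R}^N$ a continuous function such that

$$ \langle \mathcal{A}(z) - \mathcal{A}(w), z - w \rangle \ge 0 \quad \text{for every } z, w \in \mathbb{R}^N $$

and

$$ \langle \mathcal{A}(z), z \rangle \ge \frac{1}{C}\, |z|^p \quad \text{and} \quad |\mathcal{A}(z)| \le C\, |z|^{p-1} \quad \text{for every } z \in \mathbb{R}^N. $$

We say that $u \in L^p_{\mathrm{loc}}(I; W^{1,p}_{\mathrm{loc}}(\Omega))$ is a local weak solution of the quasilinear diffusion equation

$$ u_t = \operatorname{div} \mathcal{A}(\nabla u) \quad \text{in } I \times \Omega \tag{2.1} $$

if for every $\varphi \in C_0^\infty(I \times \Omega)$ we have

$$ -\iint_{I \times \Omega} u\, \varphi_t \, dt\, dx + \iint_{I \times \Omega} \langle \mathcal{A}(\nabla u), \nabla \varphi \rangle \, dt\, dx = 0. $$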

2.2 Steklov averages

Throughout the paper, we denote by $T_0 < T_1$ the endpoints of the time interval $I$. Let $v \in L^1_{\mathrm{loc}}(I \times \Omega)$. For every $0 < \sigma < T_1 - T_0$, we define its so-called

  1. forward Steklov average

    $$ v_\sigma^+(t,x) = \fint_t^{t+\sigma} v(\tau, x) \, d\tau \quad \text{for every } (t,x) \in (T_0, T_1 - \sigma) \times \Omega, $$

  2. backward Steklov average

    $$ v_\sigma^-(t,x) = \fint_{t-\sigma}^{t} v(\tau, x) \, d\tau \quad \text{for every } (t,x) \in (T_0 + \sigma, T_1) \times \Omega. $$

We shall use some standard properties of the Steklov averages. Let $0 < \sigma < T_1 - T_0$ and $\psi \in L^\infty(I \times \Omega)$ such that $\psi$ is compactly supported in $(T_0, T_1 - \sigma) \times \Omega$. We extend $\psi$ by $0$ on $(\mathbb{R} \setminus I) \times \Omega$, so that $\psi_\sigma^-$ is well defined on $I \times \Omega$ and compactly supported therein. By the Fubini theorem, we have

$$ \iint_{(T_0, T_1 - \sigma) \times \Omega} v_\sigma^+\, \psi \, dt\, dx = \iint_{I \times \Omega} v\, \psi_\sigma^- \, dt\, dx. \tag{2.2} $$

Moreover, if $v \in L^q_{\mathrm{loc}}(I \times \Omega)$ for some $1 \le q < \infty$, then $v_\sigma^+$ converges to $v$ in $L^q_{\mathrm{loc}}(I \times \Omega)$, as $\sigma$ goes to $0$, see e.g. [13, Chapter I, Lemma 3.2].

Finally, we can derive from (2.2) the following regularity properties of the Steklov averages:

Lemma 2.1.

Let $v \in L^1_{\mathrm{loc}}(I \times \Omega)$. Then for every $0 < \sigma < T_1 - T_0$,

  1. the map $v_\sigma^+$ belongs to $W^{1,1}_{\mathrm{loc}}((T_0, T_1 - \sigma); L^1_{\mathrm{loc}}(\Omega))$ and

    $$ (v_\sigma^+)_t(t,x) = \frac{v(t+\sigma, x) - v(t,x)}{\sigma} \quad \text{for a.e. } (t,x) \in (T_0, T_1 - \sigma) \times \Omega, \tag{2.3} $$

  2. if one further assumes that $v \in L^1_{\mathrm{loc}}(I; W^{1,1}_{\mathrm{loc}}(\Omega))$, then $\nabla (v_\sigma^+) \in L^1_{\mathrm{loc}}((T_0, T_1 - \sigma) \times \Omega)$ and

    $$ \nabla (v_\sigma^+) = (\nabla v)_\sigma^+. \tag{2.4} $$

Proof.

Fix $0 < \sigma < T_1 - T_0$. Let $\psi \in C_0^\infty((T_0, T_1 - \sigma) \times \Omega)$. Then by (2.2),

$$ \iint_{(T_0, T_1 - \sigma) \times \Omega} v_\sigma^+\, \psi_t \, dt\, dx = \iint_{I \times \Omega} v\, (\psi_t)_\sigma^- \, dt\, dx = \iint_{I \times \Omega} v(t,x)\, \frac{\psi(t,x) - \psi(t - \sigma, x)}{\sigma} \, dt\, dx. $$

By an obvious change of variables, this yields

$$ \iint_{(T_0, T_1 - \sigma) \times \Omega} v_\sigma^+\, \psi_t \, dt\, dx = -\iint_{(T_0, T_1 - \sigma) \times \Omega} \frac{v(t + \sigma, x) - v(t,x)}{\sigma}\, \psi(t,x) \, dt\, dx, $$

which gives the desired identity (2.3).

In order to prove (2.4), we rely again on (2.2), this time tested with $\psi_{x_j}$ in place of $\psi$, for some $1 \le j \le N$,

$$ \iint_{(T_0, T_1 - \sigma) \times \Omega} v_\sigma^+\, \psi_{x_j} \, dt\, dx = \iint_{I \times \Omega} v\, (\psi_{x_j})_\sigma^- \, dt\, dx = \iint_{I \times \Omega} v\, (\psi_\sigma^-)_{x_j} \, dt\, dx. $$

In the last equality, we have differentiated under the integral sign the smooth function $\psi$. Hence, by integrating by parts the last integral and using (2.2) again, one gets

$$ \iint_{(T_0, T_1 - \sigma) \times \Omega} v_\sigma^+\, \psi_{x_j} \, dt\, dx = -\iint_{(T_0, T_1 - \sigma) \times \Omega} (v_{x_j})_\sigma^+\, \psi \, dt\, dx, $$

from which (2.4) follows. ∎
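Identity (2.3) and the convergence of $v_\sigma^+$ to $v$ can be illustrated on an explicit function of time. The Python sketch below is a toy check under our own assumptions: we take $v(t) = \sin t$, independent of $x$, and compute the forward Steklov average exactly from the antiderivative $V(t) = -\cos t$; the helper names are hypothetical.

```python
import math


def v(t):
    return math.sin(t)


def V(t):
    """An antiderivative of v."""
    return -math.cos(t)


def forward_steklov(t, sigma):
    """Forward Steklov average of v: the mean value of v over (t, t + sigma)."""
    return (V(t + sigma) - V(t)) / sigma


def steklov_time_derivative(t, sigma, h=1e-5):
    """Central-difference approximation of the t-derivative of the Steklov average."""
    return (forward_steklov(t + h, sigma) - forward_steklov(t - h, sigma)) / (2.0 * h)


t, sigma = 0.4, 0.3
difference_quotient = (v(t + sigma) - v(t)) / sigma   # right-hand side of (2.3)
print(abs(steklov_time_derivative(t, sigma) - difference_quotient) < 1e-7)

# As sigma goes to 0, the average approaches v(t) itself.
print(abs(forward_steklov(t, 1e-3) - v(t)) < 1e-3)
```

The first check reflects the fact that the Steklov average gains one time derivative, which is what makes it an admissible substitute for $u$ in the weak formulation.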

3 Energy estimates for a regularized equation

3.1 An approximating equation

We denote by

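$$ G(\xi) = \frac{1}{p}\, (1 + |\xi|^2)^{\frac{p}{2}} \quad \text{for every } \xi \in \mathbb{R}^N, $$

and for every $\varepsilon \in (0,1)$, we consider the convex function

$$ F_\varepsilon(\xi) = \frac{1}{p} \sum_{i=1}^N |\xi_i|^p + \varepsilon\, G(\xi) \quad \text{for every } \xi \in \mathbb{R}^N. \tag{3.1} $$

We consider a local weak solution $u^\varepsilon \in L^p_{\mathrm{loc}}(I; W^{1,p}_{\mathrm{loc}}(\Omega))$ of equation (2.1) with the choice

$$ \mathcal{A}(z) = \nabla F_\varepsilon(z). $$

This means that $u^\varepsilon$ satisfies

$$ -\iint_{I \times \Omega} u^\varepsilon\, \varphi_t \, dt\, dx + \iint_{I \times \Omega} \langle \nabla F_\varepsilon(\nabla u^\varepsilon), \nabla \varphi \rangle \, dt\, dx = 0 \tag{3.2} $$

for every $\varphi \in C_0^\infty(I \times \Omega)$. Observe that the map $F_\varepsilon$ belongs to $C^2(\mathbb{R}^N)$ and satisfies

$$ \varepsilon\, (1 + |\xi|^2)^{\frac{p-2}{2}}\, |\zeta|^2 \le \langle D^2 F_\varepsilon(\xi)\, \zeta, \zeta \rangle \le (1 + \varepsilon)\, (p-1)\, (1 + |\xi|^2)^{\frac{p-2}{2}}\, |\zeta|^2 \quad \text{for every } \xi, \zeta \in \mathbb{R}^N. $$

Hence, one can rely on the classical regularity theory for quasilinear parabolic equations, see e.g. [13, Theorem 5.1, Chapter VIII] and [1, Lemma 3.1], to get

$$ \nabla u^\varepsilon \in L^\infty_{\mathrm{loc}}(I \times \Omega) \quad \text{and} \quad u^\varepsilon \in L^2_{\mathrm{loc}}(I; W^{2,2}_{\mathrm{loc}}(\Omega)). \tag{3.3} $$

In the following computations, we drop the index $\varepsilon$ both for $u$ and $F$.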
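The two-sided bound on $D^2 F_\varepsilon$ can be tested on random vectors. The Python sketch below is an illustration with hypothetical helper names; it uses the explicit Hessian of $G$, namely $(1+|\xi|^2)^{\frac{p-2}{2}}\,\mathrm{Id} + (p-2)\,(1+|\xi|^2)^{\frac{p-4}{2}}\,\xi \otimes \xi$, which we computed by hand from the definition of $G$.

```python
import random


def hessian_form_F_eps(xi, zeta, p, eps):
    """Quadratic form <D^2 F_eps(xi) zeta, zeta> for F_eps as in (3.1)."""
    orthotropic = (p - 1) * sum(abs(a) ** (p - 2) * b * b for a, b in zip(xi, zeta))
    s = 1.0 + sum(a * a for a in xi)            # 1 + |xi|^2
    dot = sum(a * b for a, b in zip(xi, zeta))  # <xi, zeta>
    zeta2 = sum(b * b for b in zeta)            # |zeta|^2
    hess_G = s ** ((p - 2) / 2) * zeta2 + (p - 2) * s ** ((p - 4) / 2) * dot ** 2
    return orthotropic + eps * hess_G


def bounds_hold(p=3.5, eps=0.1, N=3, trials=2000, seed=0):
    """Random check of the lower and upper ellipticity bounds stated after (3.2)."""
    rng = random.Random(seed)
    for _ in range(trials):
        xi = [rng.uniform(-3.0, 3.0) for _ in range(N)]
        zeta = [rng.uniform(-1.0, 1.0) for _ in range(N)]
        weight = (1.0 + sum(a * a for a in xi)) ** ((p - 2) / 2) * sum(b * b for b in zeta)
        value = hessian_form_F_eps(xi, zeta, p, eps)
        lower = eps * weight
        upper = (1 + eps) * (p - 1) * weight
        tol = 1e-12 * upper   # tolerance for floating-point rounding
        if not (lower - tol <= value <= upper + tol):
            return False
    return True


print(bounds_hold())
```

Note that the lower bound degenerates as $\varepsilon$ goes to $0$, which is why the estimates of the following sections must not depend on $\varepsilon$.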

3.2 An equation for the spatial gradient

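In order to establish a Lipschitz bound on our solution $u$, we need to differentiate (3.2) with respect to the spatial variables $x_j$, $1 \le j \le N$.

Fix $0 < \sigma < T_1 - T_0$. Let $\psi \in C_0^\infty((T_0, T_1 - \sigma) \times \Omega)$. As already observed, the backward Steklov average

$$ \varphi(t,x) = \psi_\sigma^-(t,x) \quad \text{for } (t,x) \in I \times \Omega $$

is compactly supported in $I \times \Omega$. We can thus insert it into (3.2):

$$ -\iint_{I \times \Omega} u\, (\psi_\sigma^-)_t \, dt\, dx + \iint_{I \times \Omega} \langle \nabla F(\nabla u), \nabla \psi_\sigma^- \rangle \, dt\, dx = 0. $$

Since $(\psi_\sigma^-)_t = (\psi_t)_\sigma^-$, equation (2.2) implies that

$$ -\iint_{I \times \Omega} u\, (\psi_\sigma^-)_t \, dt\, dx = -\iint_{(T_0, T_1 - \sigma) \times \Omega} u_\sigma^+\, \psi_t \, dt\, dx = \iint_{(T_0, T_1 - \sigma) \times \Omega} (u_\sigma^+)_t\, \psi \, dt\, dx. $$

One thus gets

$$ \iint_{(T_0, T_1 - \sigma) \times \Omega} (u_\sigma^+)_t\, \psi \, dt\, dx + \iint_{I \times \Omega} \langle \nabla F(\nabla u), \nabla \psi_\sigma^- \rangle \, dt\, dx = 0 \tag{3.4} $$

for every $\psi \in C_0^\infty((T_0, T_1 - \sigma) \times \Omega)$.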

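Let $j \in \{1, \dots, N\}$ and $\varphi \in C_0^\infty((T_0, T_1 - \sigma) \times \Omega)$. The map

$$ (u_\sigma^+)_t(t,x) = \frac{u(t + \sigma, x) - u(t,x)}{\sigma} $$

belongs to $L^p_{\mathrm{loc}}((T_0, T_1 - \sigma); W^{1,p}_{\mathrm{loc}}(\Omega))$ and $((u_\sigma^+)_t)_{x_j} = ((u_{x_j})_\sigma^+)_t$. By differentiation under the integral sign, one also has

$$ \nabla \left( (\varphi_{x_j})_\sigma^- \right) = \left( \nabla (\varphi_\sigma^-) \right)_{x_j}. $$

We insert $\psi = \varphi_{x_j}$ in equation (3.4). An integration by parts in the spatial variable leads to

$$ \iint_{(T_0, T_1 - \sigma) \times \Omega} ((u_{x_j})_\sigma^+)_t\, \varphi \, dt\, dx + \iint_{I \times \Omega} \langle (\nabla F(\nabla u))_{x_j}, \nabla \varphi_\sigma^- \rangle \, dt\, dx = 0. $$

Finally, using (2.2) in the second term, one gets

$$ \iint_{(T_0, T_1 - \sigma) \times \Omega} ((u_{x_j})_\sigma^+)_t\, \varphi \, dt\, dx + \iint_{(T_0, T_1 - \sigma) \times \Omega} \langle ((\nabla F(\nabla u))_{x_j})_\sigma^+, \nabla \varphi \rangle \, dt\, dx = 0. \tag{3.5} $$

We observe that, since $F \in C^2(\mathbb{R}^N)$ and $\nabla u \in L^\infty_{\mathrm{loc}}(I \times \Omega) \cap L^2_{\mathrm{loc}}(I; W^{1,2}_{\mathrm{loc}}(\Omega))$, one has

$$ \nabla F(\nabla u) \in L^2_{\mathrm{loc}}(I; W^{1,2}_{\mathrm{loc}}(\Omega)). $$

We can thus appeal to a density argument to get that (3.5) remains true for every $\varphi \in L^2(I; W^{1,2}(\Omega))$, with compact support in $(T_0, T_1 - \sigma) \times \Omega$.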

3.3 Caccioppoli-type inequalities

As explained in the introduction, the first technical tool in the proof of the Lipschitz bound of $u$ is the following Caccioppoli inequality which provides a $W^{1,2}$ estimate on $h(u_{x_j})$, where $h$ is any smooth convex function.

Lemma 3.1 (Standard Caccioppoli inequality).

Let $\eta \in C_0^\infty(\Omega)$ and $\chi \in C_0^\infty((T_0, T_1])$ be two non-negative functions, with $\chi$ non-decreasing. Let $h : \mathbb{R} \to \mathbb{R}$ be a $C^1$ convex non-negative function. Then, for almost every $\tau \in I$ and every $j = 1, \dots, N$, we have

$$
\begin{aligned}
&\chi(\tau) \int_{\{\tau\} \times \Omega} h^2(u_{x_j})\, \eta^2 \, dx + \iint_{(T_0,\tau) \times \Omega} \langle D^2 F(\nabla u)\, \nabla h(u_{x_j}), \nabla h(u_{x_j}) \rangle\, \chi\, \eta^2 \, dt\, dx \\
&\qquad \le \iint_{(T_0,\tau) \times \Omega} \chi'\, \eta^2\, h^2(u_{x_j}) \, dt\, dx + 4 \iint_{(T_0,\tau) \times \Omega} \langle D^2 F(\nabla u)\, \nabla \eta, \nabla \eta \rangle\, h^2(u_{x_j})\, \chi \, dt\, dx.
\end{aligned} \tag{3.6}
$$

Proof.

We first assume that $h$ is a $C^2$ convex non-negative function. Let $\zeta \in C_0^\infty(I)$ and $\eta \in C_0^\infty(\Omega)$. There exists $0 < \sigma_1 < \frac{1}{2}(T_1 - T_0)$ such that $\zeta$ is compactly supported in $(T_0 + \sigma_1, T_1 - \sigma_1)$. Given $0 < \sigma < \sigma_1$, Lemma 2.1 and (3.3) imply that $(u_{x_j})_\sigma^+ \in W^{1,1}_{\mathrm{loc}}((T_0, T_1 - \sigma); L^1_{\mathrm{loc}}(\Omega)) \cap L^\infty_{\mathrm{loc}}((T_0, T_1 - \sigma) \times \Omega)$. Hence, the map $h^2((u_{x_j})_\sigma^+)$ belongs to $W^{1,1}_{\mathrm{loc}}((T_0, T_1 - \sigma); L^1_{\mathrm{loc}}(\Omega))$ and we have

$$ \frac{1}{2} \left( h^2((u_{x_j})_\sigma^+) \right)_t = (h\, h')((u_{x_j})_\sigma^+)\, ((u_{x_j})_\sigma^+)_t. \tag{3.7} $$

We insert into (3.5) the test function

$$ \varphi = (h\, h')((u_{x_j})_\sigma^+)\, \zeta\, \eta^2, $$

which has compact support in $(T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega$ and belongs to $L^\infty((T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega) \cap L^2((T_0 + \sigma_1, T_1 - \sigma_1); W^{1,2}(\Omega))$. By (3.7),

$$ ((u_{x_j})_\sigma^+)_t\, \varphi = \frac{1}{2} \left( h^2((u_{x_j})_\sigma^+) \right)_t \zeta\, \eta^2. $$

We use the above identity to infer

$$ \frac{1}{2} \iint_{(T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega} \left( h^2((u_{x_j})_\sigma^+) \right)_t \zeta\, \eta^2 \, dt\, dx + \iint_{(T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega} \langle ((\nabla F(\nabla u))_{x_j})_\sigma^+, \nabla \varphi \rangle \, dt\, dx = 0. $$

We then perform an integration by parts with respect to the time variable in the first term

$$ -\frac{1}{2} \iint_{(T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega} h^2((u_{x_j})_\sigma^+)\, \zeta'\, \eta^2 \, dt\, dx + \iint_{(T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega} \langle ((\nabla F(\nabla u))_{x_j})_\sigma^+, \nabla \varphi \rangle \, dt\, dx = 0. $$

We now want to take the limit as $\sigma$ goes to $0$. Let $\Omega_1 \Subset \Omega$ such that $\eta$ is compactly supported in $\Omega_1$. Since $u_{x_j} \in L^2_{\mathrm{loc}}(I \times \Omega)$, we have

$$ \lim_{\sigma \to 0^+} \left\| (u_{x_j})_\sigma^+ - u_{x_j} \right\|_{L^2((T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega_1)} = 0. $$

Moreover, we know that $u_{x_j} \in L^\infty_{\mathrm{loc}}(I \times \Omega)$ which guarantees that there exists $C_1 > 0$ such that for every $\sigma \in (0, \sigma_1)$,

$$ |(u_{x_j})_\sigma^+| \le C_1 \quad \text{a.e. on } (T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega_1. $$

It then follows from the Dominated Convergence Theorem that

$$ \lim_{\sigma \to 0^+} -\frac{1}{2} \iint_{(T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega} h^2((u_{x_j})_\sigma^+)\, \zeta'\, \eta^2 \, dt\, dx = -\frac{1}{2} \iint_{I \times \Omega} h^2(u_{x_j})\, \zeta'\, \eta^2 \, dt\, dx. \tag{3.8} $$

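Next, by recalling the choice of $\varphi$ above, we have

$$ \nabla \varphi = (h\, h')'((u_{x_j})_\sigma^+)\, \nabla ((u_{x_j})_\sigma^+)\, \zeta\, \eta^2 + (h\, h')((u_{x_j})_\sigma^+)\, \zeta\, \nabla (\eta^2). $$

By Lemma 2.1, we know that

$$ \nabla ((u_{x_j})_\sigma^+) = (\nabla u_{x_j})_\sigma^+. $$

This implies that $\nabla ((u_{x_j})_\sigma^+)$ converges to $\nabla u_{x_j}$ in $L^2((T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega_1)$. Hence, a similar argument to the one leading to (3.8) implies that

$$ \lim_{\sigma \to 0^+} \left\| \nabla \varphi - \nabla \left( (h\, h')(u_{x_j})\, \zeta\, \eta^2 \right) \right\|_{L^2((T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega)} = 0. $$

Finally, by using that $(\nabla F(\nabla u))_{x_j} \in L^2_{\mathrm{loc}}(I \times \Omega)$, we can infer that

$$ \lim_{\sigma \to 0^+} \left\| ((\nabla F(\nabla u))_{x_j})_\sigma^+ - (\nabla F(\nabla u))_{x_j} \right\|_{L^2((T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega_1)} = 0. $$

It follows that

$$ \lim_{\sigma \to 0^+} \iint_{(T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega} \langle ((\nabla F(\nabla u))_{x_j})_\sigma^+, \nabla \varphi \rangle \, dt\, dx = \iint_{I \times \Omega} \langle (\nabla F(\nabla u))_{x_j}, \nabla \left( (h\, h')(u_{x_j})\, \zeta\, \eta^2 \right) \rangle \, dt\, dx. $$

Up to now, we have thus proved

$$ -\frac{1}{2} \iint_{I \times \Omega} h^2(u_{x_j})\, \zeta'\, \eta^2 \, dt\, dx + \iint_{I \times \Omega} \langle (\nabla F(\nabla u))_{x_j}, \nabla \left( (h\, h')(u_{x_j})\, \zeta\, \eta^2 \right) \rangle \, dt\, dx = 0. \tag{3.9} $$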

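We now choose $\zeta$ as follows. Let $\chi \in C_0^\infty((T_0, T_1])$ be as in the statement. Given $\tau \in I$ and $\delta > 0$ such that $T_0 < \tau < \tau + \delta < T_1$, we define

$$ \widetilde{\chi}_\delta(t) := \begin{cases} 1 & \text{if } t \le \tau, \\ 1 - \dfrac{t - \tau}{\delta} & \text{if } \tau < t < \tau + \delta, \\ 0 & \text{if } t \ge \tau + \delta. \end{cases} $$

We then insert

$$ \zeta(t) = \widetilde{\chi}_\delta(t)\, \chi(t) \tag{3.10} $$

into (3.9). Then, for almost every $\tau \in I$, we can let $\delta$ go to $0$ and obtain

$$ \frac{\chi(\tau)}{2} \int_{\{\tau\} \times \Omega} h^2(u_{x_j})\, \eta^2 \, dx + \iint_{(T_0,\tau) \times \Omega} \langle (\nabla F(\nabla u))_{x_j}, \nabla \left( (h\, h')(u_{x_j})\, \chi\, \eta^2 \right) \rangle \, dt\, dx = \frac{1}{2} \iint_{(T_0,\tau) \times \Omega} \chi'\, \eta^2\, h^2(u_{x_j}) \, dt\, dx. \tag{3.11} $$

Since $\chi$ does not depend on the spatial variable, we have

$$
\begin{aligned}
\langle (\nabla F(\nabla u))_{x_j}, \nabla \left( (h\, h')(u_{x_j})\, \chi\, \eta^2 \right) \rangle &= \langle D^2 F(\nabla u)\, \nabla h(u_{x_j}), \nabla h(u_{x_j}) \rangle\, \chi\, \eta^2 \\
&\quad + \langle D^2 F(\nabla u)\, \nabla u_{x_j}, \nabla u_{x_j} \rangle\, h''(u_{x_j})\, h(u_{x_j})\, \chi\, \eta^2 \\
&\quad + 2\, \langle D^2 F(\nabla u)\, \nabla u_{x_j}, \nabla \eta \rangle\, (h\, h')(u_{x_j})\, \chi\, \eta.
\end{aligned}
$$

Since the second term is non-negative, by dropping it, we get from (3.11)

$$
\begin{aligned}
&\frac{\chi(\tau)}{2} \int_{\{\tau\} \times \Omega} h^2(u_{x_j})\, \eta^2 \, dx + \iint_{(T_0,\tau) \times \Omega} \langle D^2 F(\nabla u)\, \nabla h(u_{x_j}), \nabla h(u_{x_j}) \rangle\, \chi\, \eta^2 \, dt\, dx \\
&\qquad \le \frac{1}{2} \iint_{(T_0,\tau) \times \Omega} \chi'\, \eta^2\, h^2(u_{x_j}) \, dt\, dx - 2 \iint_{(T_0,\tau) \times \Omega} \langle D^2 F(\nabla u)\, \nabla u_{x_j}, \nabla \eta \rangle\, (h\, h')(u_{x_j})\, \chi\, \eta \, dt\, dx.
\end{aligned}
$$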

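In order to estimate the last term, we use the Cauchy–Schwarz inequality

$$ |\langle D^2 F(\nabla u)\, \nabla u_{x_j}, \nabla \eta \rangle| \le \left( \langle D^2 F(\nabla u)\, \nabla u_{x_j}, \nabla u_{x_j} \rangle \right)^{\frac{1}{2}} \left( \langle D^2 F(\nabla u)\, \nabla \eta, \nabla \eta \rangle \right)^{\frac{1}{2}}. $$

A further application of Young’s inequality leads to

$$
\begin{aligned}
&\left| -2 \iint_{(T_0,\tau) \times \Omega} \langle D^2 F(\nabla u)\, \nabla u_{x_j}, \nabla \eta \rangle\, (h\, h')(u_{x_j})\, \chi\, \eta \, dt\, dx \right| \\
&\qquad \le \frac{1}{2} \iint_{(T_0,\tau) \times \Omega} \langle D^2 F(\nabla u)\, \nabla u_{x_j}, \nabla u_{x_j} \rangle\, h'(u_{x_j})^2\, \chi\, \eta^2 \, dt\, dx + 2 \iint_{(T_0,\tau) \times \Omega} \langle D^2 F(\nabla u)\, \nabla \eta, \nabla \eta \rangle\, h^2(u_{x_j})\, \chi \, dt\, dx.
\end{aligned}
$$

Since $\nabla h(u_{x_j}) = h'(u_{x_j})\, \nabla u_{x_j}$, the integral containing $\nabla u_{x_j}$ can be absorbed on the left-hand side, and multiplying the resulting inequality by $2$ gives (3.6). Let us finally observe that we can remove the $C^2$ assumption on the function $h$, by a standard approximation argument. ∎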

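We next establish the key tool for the proof of our main result, namely a Caccioppoli-type inequality, where two different partial derivatives $u_{x_j}$ and $u_{x_k}$ come into play.

Lemma 3.2 (Weird Caccioppoli inequality).

Let $\eta \in C_0^\infty(\Omega)$ and $\chi \in C_0^\infty((T_0, T_1])$ be two non-negative functions, with $\chi$ non-decreasing. Let $\Phi : \mathbb{R}^+ \to \mathbb{R}$ and $\Psi : \mathbb{R}^+ \to \mathbb{R}$ be two $C^1$ non-decreasing and non-negative convex functions. Then, for almost every $\tau \in I$, every $k, j = 1, \dots, N$ and every $\theta \in [0,1]$, we have

$$
\begin{aligned}
&\chi(\tau) \int_{\{\tau\} \times \Omega} \Phi(u_{x_j}^2)\, \Psi(u_{x_k}^2)\, \eta^2 \, dx + \iint_{(T_0,\tau) \times \Omega} \langle D^2 F(\nabla u)\, \nabla u_{x_j}, \nabla u_{x_j} \rangle\, \Phi'(u_{x_j}^2)\, \Psi(u_{x_k}^2)\, \chi\, \eta^2 \, dt\, dx \\
&\qquad \le \iint_{(T_0,\tau) \times \Omega} \chi'\, \eta^2\, \Phi(u_{x_j}^2)\, \Psi(u_{x_k}^2) \, dt\, dx \\
&\qquad \quad + 4 \iint_{(T_0,\tau) \times \Omega} \langle D^2 F(\nabla u)\, \nabla \eta, \nabla \eta \rangle \left( u_{x_j}^2\, \Phi'(u_{x_j}^2)\, \Psi(u_{x_k}^2) + u_{x_k}^2\, \Psi'(u_{x_k}^2)\, \Phi(u_{x_j}^2) \right) \chi \, dt\, dx \\
&\qquad \quad + 8 \left( \iint_{(T_0,\tau) \times \Omega} \langle D^2 F(\nabla u)\, \nabla u_{x_j}, \nabla u_{x_j} \rangle\, u_{x_j}^2\, \Phi'(u_{x_j}^2)^2\, \Psi'(u_{x_k}^2)^\theta\, \chi\, \eta^2 \, dt\, dx \right)^{\frac{1}{2}} \\
&\qquad \qquad \times \left( \frac{1}{4} \iint_{(T_0,\tau) \times \Omega} \chi'\, \eta^2\, |u_{x_k}|^{2\theta}\, \Psi(u_{x_k}^2)^{2-\theta} \, dt\, dx \right. \\
&\qquad \qquad \qquad \left. + \iint_{(T_0,\tau) \times \Omega} \langle D^2 F(\nabla u)\, \nabla \eta, \nabla \eta \rangle\, |u_{x_k}|^{2\theta}\, \Psi(u_{x_k}^2)^{2-\theta}\, \chi \, dt\, dx \right)^{\frac{1}{2}}.
\end{aligned}
$$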

Proof.

It is convenient to divide the proof into two steps.

Step 1: An identity involving $u_{x_j}$ and $u_{x_k}$. We first assume that $\Phi$ and $\Psi$ are two $C^2$ non-decreasing and non-negative convex functions. We fix $k, j \in \{1, \dots, N\}$. Given $0 < \sigma_1 < \frac{1}{2}(T_1 - T_0)$ and $\zeta \in C_0^\infty((T_0 + \sigma_1, T_1 - \sigma_1))$, we consider (3.5) with the index $j$ and for every $0 < \sigma < \sigma_1$, we insert the test function

$$ \varphi = (u_\sigma^+)_{x_j}\, \Phi'\left( ((u_\sigma^+)_{x_j})^2 \right) \Psi\left( ((u_\sigma^+)_{x_k})^2 \right) \zeta\, \eta^2. $$

Symmetrically, we consider (3.5) with the index $k$ and insert the test function

$$ \widetilde{\varphi} = (u_\sigma^+)_{x_k}\, \Psi'\left( ((u_\sigma^+)_{x_k})^2 \right) \Phi\left( ((u_\sigma^+)_{x_j})^2 \right) \zeta\, \eta^2. $$

The functions $\varphi$ and $\widetilde{\varphi}$ are compactly supported in $(T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega$ and belong to $L^\infty((T_0 + \sigma_1, T_1 - \sigma_1) \times \Omega) \cap L^2((T_0 + \sigma_1, T_1 - \sigma_1); W^{1,2}(\Omega))$. Thus they are admissible test functions. We observe that

$$ ((u_\sigma^+)_{x_j})_t\, \varphi = \frac{1}{2} \left( \Phi\left( ((u_\sigma^+)_{x_j})^2 \right) \right)_t \Psi\left( ((u_\sigma^+)_{x_k})^2 \right) \zeta\, \eta^2, $$

and similarly

( ( u σ + ) x k ) t φ ~ = 1 2 ( Ψ ( ( ( u σ + ) x k ) 2 ) )