# Refinement of the Jensen integral inequality

*Open Mathematics (formerly Central European Journal of Mathematics), Open Access*

## Abstract

In this paper we give a refinement of Jensen’s integral inequality and its generalization for linear functionals. We also present some applications in Information Theory.




Keywords: Convex functions; Jensen’s inequality; f-divergences

## 1 Introduction

Let *C* be a convex subset of the linear space *X* and let *f* be a convex function on *C*. If **p** = (*p*_{1}, ..., *p*_{*n*}) is a probability sequence and **x** = (*x*_{1}, ..., *x*_{*n*}) ∈ *C*^{*n*}, then the inequality
$$f\left({\displaystyle \sum _{i=1}^{n}{p}_{i}{x}_{i}}\right)\le {\displaystyle \sum _{i=1}^{n}{p}_{i}f({x}_{i})}$$(1)
is well known in the literature as Jensen’s inequality.

The Lebesgue integral version of the Jensen inequality is given below:

**Theorem 1.1:** *Let* (Ω, Λ, *μ*) *be a measure space with* 0 < *μ*(Ω) < ∞ *and let ϕ* : *I* → ℝ *be a convex function defined on an open interval I in* ℝ. *If f* : Ω → *I is such that f*, *ϕ* ∘ *f* ∈ *L*(Ω, Λ, *μ*), *then*
$$\varphi \left(\frac{1}{\mu (\text{\Omega})}{\displaystyle \underset{\mathrm{\Omega}}{\int}f}d\mu \right)\le \frac{1}{\mu (\text{\Omega})}{\displaystyle \underset{\mathrm{\Omega}}{\int}\varphi}(f)d\mu .$$(2)
*In case ϕ is strictly convex on I, one has equality in* (2) *if and only if f is constant almost everywhere on* Ω.

The Jensen inequality for convex functions plays a crucial role in the Theory of Inequalities, since other inequalities, such as the arithmetic mean-geometric mean inequality, the Hölder and Minkowski inequalities, and the Ky Fan inequality, can be obtained as particular cases of it. There is an extensive literature devoted to Jensen’s inequality concerning different generalizations, refinements, counterparts and converse results; see, for example, [1–9].

In this paper we give a refinement of Jensen’s integral inequality and its generalization for linear functionals. We also present some applications in Information Theory, for example for the Kullback-Leibler, total variation and Karl Pearson *χ*^{2}-divergences.
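As a quick numerical illustration of Theorem 1.1 (an illustration we add, not part of the original text; the choices ϕ = exp, f = sin, Ω = [0, π] and the grid size are ours), the following sketch checks inequality (2) with a midpoint-rule approximation of both integrals:

```python
import math

# Midpoint-rule check of Jensen's inequality (2):
#   phi((1/mu(Omega)) * int_Omega f dmu) <= (1/mu(Omega)) * int_Omega phi(f) dmu
# for phi = exp and f = sin on Omega = [0, pi] with Lebesgue measure.
N = 10_000
ts = [(k + 0.5) * math.pi / N for k in range(N)]   # midpoints of [0, pi]
f_vals = [math.sin(t) for t in ts]

avg_f = sum(f_vals) / N                            # (1/mu(Omega)) * int f dmu
lhs = math.exp(avg_f)                              # phi of the average of f
rhs = sum(math.exp(v) for v in f_vals) / N         # average of phi(f)

assert lhs < rhs   # strict, since f is not constant and phi is strictly convex
```

Here lhs ≈ e^{2/π} while rhs is the average of e^{sin t}, and the inequality holds with a comfortable margin.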

## 2 Main results

Let (Ω, Λ, *μ*) be a measure space with 0 < *μ*(Ω) < ∞ and let *L*(Ω, Λ, *μ*) = {*f* : Ω → ℝ : *f* is *μ*-measurable and ∫_{Ω} |*f*(*t*)|*dμ*(*t*) < ∞} be the corresponding Lebesgue space. Consider the set 𝔖 = {*ω* ∈ Λ : *μ*(*ω*) ≠ 0 and *μ*(*ω*̅) = *μ*(Ω \ *ω*) ≠ 0} and let *ϕ* : (*a*, *b*) → ℝ be a convex function defined on an open interval (*a*, *b*). If *f* : Ω → (*a*, *b*) is such that *f*, *ϕ* ∘ *f* ∈ *L*(Ω, Λ, *μ*), then for any set *ω* ∈ 𝔖 define the functional
$$\u03dc(\varphi ,f;\omega )=\frac{\mu (\omega )}{\mu (\text{\Omega})}\varphi \left(\frac{1}{\mu (\omega )}{\displaystyle \underset{\omega}{\int}f}d\mu \right)+\frac{\mu (\overline{\omega})}{\mu (\text{\Omega})}\varphi \left(\frac{1}{\mu (\overline{\omega})}{\displaystyle \underset{\overline{\omega}}{\int}f}d\mu \right).$$(3)

We give the following refinement of Jensen’s inequality.

**Theorem 2.1:** *Let* (Ω, Λ, *μ*) *be a measure space with* 0 < *μ*(Ω) < ∞ *and let ϕ* : (*a*, *b*) → ℝ *be a convex function defined on an open interval* (*a*, *b*). *If f* : Ω → (*a*, *b*) *is such that f*, *ϕ* ∘ *f* ∈ *L*(Ω, Λ, *μ*), *then for any set* *ω* ∈ 𝔖 *we have*
$$\varphi \left(\frac{1}{\mu (\text{\Omega})}{\displaystyle \underset{\mathrm{\Omega}}{\int}f}d\mu \right)\le \u03dc(\varphi ,f;\omega )\le \frac{1}{\mu (\text{\Omega})}{\displaystyle \underset{\mathrm{\Omega}}{\int}\varphi}(f)d\mu .$$(4)

**Proof:** For any *ω* ∈ 𝔖 we have
$$\varphi \left(\frac{1}{\mu (\text{\Omega})}{\displaystyle \underset{\mathrm{\Omega}}{\int}f}d\mu \right)=\varphi \left[\frac{\mu (\omega )}{\mu (\text{\Omega})}\left(\frac{1}{\mu (\omega )}{\displaystyle \underset{\mathrm{\omega}}{\int}f}d\mu \right)+\frac{\mu (\overline{\omega})}{\mu (\text{\Omega})}\left(\frac{1}{\mu (\overline{\omega})}{\displaystyle \underset{\overline{\omega}}{\int}f}d\mu \right)\right].$$
Therefore, by the convexity of the function *ϕ*, we get
$$\varphi \left(\frac{1}{\mu (\text{\Omega})}{\displaystyle \underset{\mathrm{\Omega}}{\int}f}d\mu \right)\le \frac{\mu (\omega )}{\mu (\text{\Omega})}\varphi \left(\frac{1}{\mu (\omega )}{\displaystyle \underset{\mathrm{\omega}}{\int}f}d\mu \right)+\frac{\mu (\overline{\omega})}{\mu (\text{\Omega})}\varphi \left(\frac{1}{\mu (\overline{\omega})}{\displaystyle \underset{\overline{\omega}}{\int}f}d\mu \right)=\u03dc(\varphi ,f;\omega ).$$(5)
Also, for any *ω* ∈ 𝔖, by the Jensen inequality we have
$$\begin{array}{rl}\u03dc(\varphi ,f;\omega )& =\frac{\mu (\omega )}{\mu (\mathrm{\Omega})}\varphi \left(\frac{1}{\mu (\omega )}\underset{\omega}{\int}fd\mu \right)+\frac{\mu (\overline{\omega})}{\mu (\mathrm{\Omega})}\varphi \left(\frac{1}{\mu (\overline{\omega})}\underset{\overline{\omega}}{\int}fd\mu \right)\\ & \le \frac{1}{\mu (\mathrm{\Omega})}\underset{\omega}{\int}\varphi (f)d\mu +\frac{1}{\mu (\mathrm{\Omega})}\underset{\overline{\omega}}{\int}\varphi (f)d\mu \\ & =\frac{1}{\mu (\mathrm{\Omega})}\underset{\mathrm{\Omega}}{\int}\varphi (f)d\mu .\end{array}$$(6)
From (5) and (6) we obtain (4). □
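To make the refinement (4) concrete, here is a small numerical sketch (our illustration; the choices ϕ = exp, f = sin, Ω = [0, π], the sets ω = [0, c] and the grid are ours): the functional Ϝ(ϕ, f; ω) defined in (3) is sandwiched between the two sides of Jensen’s inequality for every cut point c.

```python
import math

phi, f = math.exp, math.sin
A, B, N = 0.0, math.pi, 20_000          # Omega = [A, B]; N is the grid size

def mean_on(a, b, g):
    """Midpoint-rule approximation of (1/(b-a)) * int_a^b g(t) dt."""
    return sum(g(a + (k + 0.5) * (b - a) / N) for k in range(N)) / N

jensen_lhs = phi(mean_on(A, B, f))               # phi(average of f over Omega)
jensen_rhs = mean_on(A, B, lambda t: phi(f(t)))  # average of phi(f) over Omega

def F(c):
    """The functional (3) for omega = [A, c]: a convex combination of the
    two conditional averages."""
    w = (c - A) / (B - A)                        # mu(omega)/mu(Omega)
    return w * phi(mean_on(A, c, f)) + (1 - w) * phi(mean_on(c, B, f))

for c in (0.5, 1.0, 1.5, 2.0, 3.0):
    assert jensen_lhs - 1e-9 <= F(c) <= jensen_rhs + 1e-9   # inequality (4)
```

The tolerances only absorb the midpoint-rule error; the sandwich itself is what Theorem 2.1 guarantees.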

**Remark 2.2:** *We observe that the inequality* (4) *can be written in an equivalent form as*
$$\underset{\omega \in \mathfrak{S}}{\mathrm{inf}}\u03dc(\varphi ,f;\omega )\ge \varphi \left(\frac{1}{\mu (\text{\Omega})}{\displaystyle \underset{\mathrm{\Omega}}{\int}f}d\mu \right)$$*and*
$$\frac{1}{\mu (\text{\Omega})}{\displaystyle \underset{\mathrm{\Omega}}{\int}\varphi}(f)d\mu \ge \underset{\omega \in \mathfrak{S}}{\mathrm{sup}}\hspace{0.17em}\u03dc(\varphi ,f;\omega ).$$

**Remark 2.3:** *If* ∅, Ω ∈ 𝔖 *and we take ω* = ∅ *or ω* = Ω, *then* Ϝ(*ϕ*, *f*; *ω*) *is equal to the left hand side of* (2); *in this case* (5) *holds trivially*.

In particular, the Riemann integral version can be given as follows:

**Corollary 2.4:** *Let ϕ* : [*a*, *b*] → ℝ *be a convex function defined on the interval* [*a*, *b*]. *If f* : [*c*, *d*] → [*a*, *b*], *p* : [*c*, *d*] → ℝ^{+} *are such that f*, *fp and* (*ϕ* ∘ *f*) *p are all integrable on* [*c*, *d*], *then we have*
$$\begin{array}{l}\underset{x\in (c,d)}{\mathrm{inf}}\left[\frac{x-c}{d-c}\varphi \left(\frac{1}{x-c}{\displaystyle \underset{c}{\overset{x}{\int}}p}(t)f(t)dt\right)+\frac{d-x}{d-c}\varphi \left(\frac{1}{d-x}{\displaystyle \underset{x}{\overset{d}{\int}}p}(t)f(t)dt\right)\right]\ge \varphi \left(\frac{1}{d-c}{\displaystyle {\int}_{c}^{d}p}(t)f(t)dt\right),\\ \frac{1}{d-c}{\displaystyle \underset{c}{\overset{d}{\int}}p}(t)\varphi (f(t))dt\ge \underset{x\in [c,d]}{\mathrm{sup}}\left[\frac{x-c}{d-c}\varphi \left(\frac{1}{x-c}{\displaystyle \underset{c}{\overset{x}{\int}}p}(t)f(t)dt\right)+\frac{d-x}{d-c}\varphi \left(\frac{1}{d-x}{\displaystyle \underset{x}{\overset{d}{\int}}p}(t)f(t)dt\right)\right].\end{array}$$

As a simple consequence of Theorem 2.1 we can obtain a refinement of the Hermite-Hadamard inequality:

**Corollary 2.5:** *If ϕ* : [*a*, *b*] → ℝ *is a convex function defined on the interval* [*a*, *b*], *then for any* [*c*, *d*] ⊆ [*a*, *b*] *we have*
$$\begin{array}{c}\varphi \left(\frac{d+c}{2}\right)\le \underset{x\in [c,d]}{\mathrm{inf}}\left[\frac{x-c}{d-c}\varphi \left(\frac{x+c}{2}\right)+\frac{d-x}{d-c}\varphi \left(\frac{d+x}{2}\right)\right],\\ \frac{1}{d-c}{\displaystyle \underset{c}{\overset{d}{\int}}\varphi}(t)dt\ge \underset{x\in [c,d]}{\mathrm{sup}}\left[\frac{x-c}{d-c}\varphi \left(\frac{x+c}{2}\right)+\frac{d-x}{d-c}\varphi \left(\frac{d+x}{2}\right)\right].\end{array}$$
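For instance (a check we add for illustration; the choice ϕ(*t*) = *t*² on [*c*, *d*] = [0, 1] is ours), the bracketed expression in Corollary 2.5 simplifies to (1 + *x* - *x*²)/4, so both estimates can be verified on a grid:

```python
# Grid check of Corollary 2.5 for phi(t) = t**2 on [c, d] = [0, 1]:
# phi((d+c)/2) = 1/4 and (1/(d-c)) * int_c^d phi(t) dt = 1/3.
c, d = 0.0, 1.0
phi = lambda t: t * t

def bracket(x):
    """The expression inside inf/sup in Corollary 2.5."""
    return ((x - c) / (d - c)) * phi((x + c) / 2) \
         + ((d - x) / (d - c)) * phi((d + x) / 2)

xs = [c + k * (d - c) / 1000 for k in range(1001)]
inf_val = min(bracket(x) for x in xs)   # attained at the endpoints, = 1/4
sup_val = max(bracket(x) for x in xs)   # attained at x = 1/2, = 5/16

assert phi((d + c) / 2) <= inf_val + 1e-12   # first estimate of the corollary
assert sup_val <= 1 / 3 + 1e-12              # second estimate; integral = 1/3
```

Both inequalities hold, with equality in the first one at the endpoints of the interval.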

## 3 Further generalization

Let *E* be a nonempty set, 𝔄 be an algebra of subsets of *E*, and *L* be a linear class of real-valued functions *f* : *E* → ℝ having the properties:

L1 : *f*, *g* ∈ *L* ⇒ (*αf* + *βg*) ∈ *L* for all *α*, *β* ∈ ℝ;

L2 : **1** ∈ *L*, i.e., if *f*(*t*) = 1 for all *t* ∈ *E*, then *f* ∈ *L*;

L3 : *f* ∈ *L*, *E*_{1} ∈ 𝔄 ⇒ *f* · *χ*_{*E*_{1}} ∈ *L*,

where *χ*_{*E*_{1}} is the indicator function of *E*_{1}. It follows from L2 and L3 that *χ*_{*E*_{1}} ∈ *L* for every *E*_{1} ∈ 𝔄.

A positive isotonic linear functional *A* : *L* → ℝ is a functional satisfying the following properties:

A1 : *A*(*αf* + *βg*) = *αA*(*f*) + *βA*(*g*) for *f*, *g* ∈ *L*, *α*, *β* ∈ ℝ;

A2 : *f* ∈ *L*, *f*(*t*) ≥ 0 on *E* ⇒ *A*(*f*) ≥ 0.

For a fixed positive isotonic linear functional *A* with *A*(**1**) = 1 and for every *E*_{1} ∈ 𝔄 such that *A*(*χ*_{*E*_{1}}) > 0, define the functional *A*_{*E*_{1}} by ${A}_{{E}_{1}}(f)=\frac{A(f.{\chi}_{{E}_{1}})}{A({\chi}_{{E}_{1}})}$ for all *f* ∈ *L*. Furthermore, we observe that
$$A({\chi}_{{E}_{1}})+A({\chi}_{E\mathrm{\setminus}{E}_{1}})=1,$$
$$A(f)=A(f.{\chi}_{{E}_{1}})+A(f.{\chi}_{E\mathrm{\setminus}{E}_{1}}).$$(7)
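A minimal finite-dimensional sketch (ours, not from the paper; the set *E*, the weights and the set *E*₁ below are arbitrary choices) may help: taking *E* = {0, ..., 4}, *L* the space of all real functions on *E*, and *A* the weighted sum below, the axioms A1, A2 and the identities preceding and including (7) can be checked directly.

```python
# Finite-dimensional model of a positive isotonic linear functional:
# A(f) = sum_t w[t] * f(t) with positive weights summing to 1, so A(1) = 1.
E = range(5)
w = [0.1, 0.3, 0.2, 0.25, 0.15]                      # our arbitrary weights

def A(f):
    return sum(w[t] * f(t) for t in E)

chi = lambda S: (lambda t: 1.0 if t in S else 0.0)   # indicator function of S
E1 = {1, 3}                                          # our arbitrary E_1
Ec = set(E) - E1                                     # E \ E_1
f = lambda t: t * t

# A(chi_{E1}) + A(chi_{E\E1}) = 1   and   A(f) = A(f.chi_{E1}) + A(f.chi_{E\E1})
assert abs(A(chi(E1)) + A(chi(Ec)) - 1.0) < 1e-12
assert abs(A(f) - (A(lambda t: f(t) * chi(E1)(t))
                   + A(lambda t: f(t) * chi(Ec)(t)))) < 1e-12
```

The two assertions are exactly the displayed identities, with *A*_{*E*₁}(*f*) then being the *w*-weighted conditional average of *f* over *E*₁.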

Jessen (see [10, p. 47]) gave the following generalization of Jensen’s inequality for convex functions.

**Theorem 3.1:** *Let L satisfy L*_{1} *and L*_{2} *on a nonempty set E*, *and assume that ϕ* : [*a*, *b*] → ℝ *is a continuous convex function. If A is a positive isotonic linear functional with A*(**1**) = 1, *then for all f* ∈ *L such that ϕ*(*f*) ∈ *L we have A*(*f*) ∈ [*a*, *b*] *and*
$$\varphi (A(f))\le A(\varphi (f)).$$(8)

**Theorem 3.2:** *Under the above assumptions, if ϕ* : [*a*, *b*] → ℝ *is a continuous convex function, then*
$$\varphi (A(f))\le \overline{D}(A,f,\varphi ;{E}_{1})\le A(\varphi (f)),$$(9)
*where*
$$\overline{D}(A,f,\varphi ;{E}_{1})=A({\chi}_{{E}_{1}})\varphi \left(\frac{A(f.{\chi}_{{E}_{1}})}{A({\chi}_{{E}_{1}})}\right)+A({\chi}_{E\mathrm{\setminus}{E}_{1}})\varphi \left(\frac{A(f.{\chi}_{E\mathrm{\setminus}{E}_{1}})}{A({\chi}_{E\mathrm{\setminus}{E}_{1}})}\right)$$
*for every nonempty set E*_{1} ∈ 𝔄 *such that* 0 < *A*(*χ*_{*E*_{1}}) < 1.

**Proof:** Since
$$\overline{D}(A,f,\varphi ;{E}_{1})=A({\chi}_{{E}_{1}})\varphi \left(\frac{A(f.{\chi}_{{E}_{1}})}{A({\chi}_{{E}_{1}})}\right)+A({\chi}_{E\mathrm{\setminus}{E}_{1}})\varphi \left(\frac{A(f.{\chi}_{E\mathrm{\setminus}{E}_{1}})}{A({\chi}_{E\mathrm{\setminus}{E}_{1}})}\right)=A({\chi}_{{E}_{1}})\varphi \left({A}_{{E}_{1}}(f)\right)+A({\chi}_{E\mathrm{\setminus}{E}_{1}})\varphi \left({A}_{E\mathrm{\setminus}{E}_{1}}(f)\right),$$
using the inequality (8) (applied to the functionals *A*_{*E*_{1}} and *A*_{*E*\*E*_{1}}) we obtain
$$\overline{D}(A,f,\varphi ;{E}_{1})\le A(\varphi \left(f\right).{\chi}_{{E}_{1}})+A\left(\varphi \left(f\right).{\chi}_{E\mathrm{\setminus}{E}_{1}}\right)=A(\varphi \left(f\right)).$$(10)
This proves the second inequality in (9). The first inequality follows by using the definition of a convex function and the identity (7). □

## 4 Applications for Csiszár divergence measures

Let (Ω, Λ, *μ*) be a probability measure space. Consider the set of all density functions with respect to *μ*, *S* ≔ {*p* | *p* : Ω → ℝ, *p*(*s*) > 0, ∫_{Ω} *p*(*s*)*dμ*(*s*) = 1}.

Csiszár introduced the concept of *f*-divergence for a convex function *f* : (0, ∞) → (-∞, ∞) (cf. [11]; see also [12]) by
$${I}_{f}(q,p)={\displaystyle \underset{\mathrm{\Omega}}{\int}p}(s)f(\frac{q(s)}{p(s)})d\mu (s),\hspace{0.17em}\hspace{0.17em}p,q\in S.$$
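In the discrete case (Ω finite, *μ* the counting measure, *p* and *q* probability mass functions), *I*_{*f*} reduces to a finite sum. The following sketch (our illustration; the distributions and generators are arbitrary choices) evaluates it for two of the generators listed below:

```python
import math

# Discrete f-divergence: I_f(q, p) = sum_s p[s] * f(q[s] / p[s]).
p = [0.25, 0.25, 0.25, 0.25]
q = [0.4, 0.3, 0.2, 0.1]

def I_f(q, p, f):
    return sum(ps * f(qs / ps) for qs, ps in zip(q, p))

kl = I_f(q, p, lambda t: t * math.log(t))   # f(t) = t log t: Kullback-Leibler
tv = I_f(q, p, lambda t: abs(t - 1))        # f(t) = |t - 1|: total variation

assert kl >= 0.0                            # Theorem 4.1 below, with f(1) = 0
assert abs(tv - sum(abs(a - b) for a, b in zip(q, p))) < 1e-12
```

The second assertion verifies that the *f*-divergence with *f*(*t*) = |*t* - 1| indeed coincides with the total variation distance of item (i).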

By appropriately choosing the convex function *f*, various divergences can be derived. We list below some important *f*-divergences that play a significant role in Information Theory and Statistics.

(i) The class of *χ*-divergences: The *f*-divergences in this class are generated by the family of functions
$${f}_{\alpha}(u)=\mid u-1{\mid}^{\alpha},\hspace{0.17em}u\ge 0,\hspace{0.17em}\alpha \ge 1,$$
which give
$${I}_{{f}_{\alpha}}(q,p)={\displaystyle \underset{\mathrm{\Omega}}{\int}}{p}^{1-\alpha}(s)\mid q(s)-p(s){\mid}^{\alpha}d\mu (s).$$

For *α* = 1, it gives the total variation distance
$$V(q,p)={\displaystyle \underset{\mathrm{\Omega}}{\int}\mid}\text{\hspace{0.17em}}q(s)-p(s)\mid d\mu (s).$$

For *α* = 2, it gives the Karl Pearson *χ*^{2}-divergence,
$${I}_{{\chi}^{2}}(q,p)={\displaystyle \underset{\mathrm{\Omega}}{\int}\frac{{[q(s)-p(s)]}^{2}}{p(s)}}d\mu (s).$$

(ii) *α*-order Rényi entropy: For *α* > 1 let
$$f(t)={t}^{\alpha},\text{\hspace{0.17em}}t>0.$$

Then *I*_{*f*} gives the *α*-order entropy
$${D}_{\alpha}(q,p)={\displaystyle \underset{\mathrm{\Omega}}{\int}{q}^{\alpha}}(s)\hspace{0.17em}{p}^{1-\alpha}(s)\hspace{0.17em}d\mu (s).$$

(iii) Harmonic distance: Let $$f(t)=-\frac{2t}{1+t},t>0.$$

Then *I*_{*f*} gives the harmonic distance
$${D}_{H}(q,p)={\displaystyle \underset{\mathrm{\Omega}}{\int}\frac{2p(s)q(s)}{p(s)+q(s)}}d\mu (s).$$

(iv) Kullback-Leibler: Let $$f(t)=t\mathrm{log}t,t>0.$$

Then the *f*-divergence functional gives rise to the Kullback-Leibler distance [13]
$${D}_{KL}(q,p)={\displaystyle \underset{\mathrm{\Omega}}{\int}q}(s)\mathrm{log}(\frac{q(s)}{p(s)})d\mu (s).$$

One parametric generalization of the Kullback-Leibler [13] relative information was studied in a different way by Cressie and Read [14].

(v) Jeffreys divergence: Let $$f(t)=(t-1)\mathrm{log}t,\hspace{0.17em}t>0.$$

Then the *f*-divergence functional gives the Jeffreys divergence
$$J(q,p)={\displaystyle \underset{\mathrm{\Omega}}{\int}\left(q(s)-p(s)\right)}\mathrm{ln}\left(\frac{q(s)}{p(s)}\right)d\mu (s).$$

(vi) The Dichotomy class: This class is generated by the family of functions *g _{α}* : (0, ∞) → ℝ,
$${g}_{\alpha}(u)=\{\begin{array}{ll}u-1-\mathrm{log}\text{\hspace{0.17em}}u,\hfill & \alpha =0\hfill \\ \hfill & \hfill \\ \frac{1}{\alpha (1-\alpha )}[\alpha u+1-\alpha -{u}^{\alpha}],\hfill & \alpha \in \mathbb{R}\backslash \{0,1\};\hfill \\ \hfill & \hfill \\ 1-u+u\text{\hspace{0.17em}}\mathrm{log}\text{\hspace{0.17em}}u,\hfill & \alpha =1.\hfill \end{array}$$(11)

This class gives, for particular values of *α*, some important divergences. For instance, for $\alpha =\frac{1}{2}$ it provides a distance, namely, the Hellinger distance.

There are various other divergences, such as the Arimoto-type divergences, Matusita’s divergence and the Puri-Vincze divergences (cf. [15], [16]), used in various problems in Information Theory and Statistics. An application of Theorem 1.1 is the following result given by Csiszár and Körner (cf. [17]).

**Theorem 4.1:** *Let f* : [0, ∞) → ℝ *be a convex function and let p*, *q* ∈ *S*. *Then the following inequality is valid*:
$${I}_{f}(q,p)\ge f(1).$$(12)

**Theorem 4.2:** *Let f* : [0, ∞) → ℝ *be a convex function. Then for any p*, *q* ∈ *S and any ω* ∈ 𝔖 *we have*:
$${I}_{f}(q,p)\ge \mu (\omega )f\left(\frac{1}{\mu (\omega )}{\displaystyle \underset{\mathrm{\omega}}{\int}q}(s)d\mu (s)\right)+\mu (\overline{\omega})f\left(\frac{1}{\mu (\overline{\omega})}{\displaystyle \underset{\overline{\omega}}{\int}q}(s)d\mu (s)\right)\ge f(1).$$(13)

**Proof:** Applying Theorem 2.1 with *ϕ* = *f*, with the function *s* ↦ *q*(*s*)/*p*(*s*) in place of *f*, and with the measure *p*(*s*)*dμ*(*s*) in place of *dμ*(*s*), we deduce (13) (after this substitution, *μ*(*ω*) in (13) and in the propositions below is understood with respect to the new measure, that is, as ∫_{*ω*} *p*(*s*)*dμ*(*s*)). □
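The two inequalities in (13) are easy to check numerically in the discrete case. In the sketch below (our illustration; the distributions and the set ω are arbitrary, and we read *μ*(*ω*) as the *p*-measure of *ω*, in line with the substitution made in the proof), *f* is the Kullback-Leibler generator:

```python
import math

p = [0.25, 0.25, 0.25, 0.25]
q = [0.4, 0.3, 0.2, 0.1]
f = lambda t: t * math.log(t)        # Kullback-Leibler generator, f(1) = 0

# Discrete I_f(q, p): sum over outcomes of p * f(q/p).
I_f = sum(ps * f(qs / ps) for qs, ps in zip(q, p))

omega = {0, 1}                       # an arbitrary set of outcomes
P = sum(p[s] for s in omega)         # mu(omega), read as the p-measure of omega
Q = sum(q[s] for s in omega)         # int_omega q dmu

middle = P * f(Q / P) + (1 - P) * f((1 - Q) / (1 - P))   # middle term of (13)

assert I_f + 1e-12 >= middle >= f(1.0) - 1e-12           # inequality (13)
```

Here the middle term already improves on the trivial bound *f*(1) = 0, while staying below the full divergence.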

**Proposition 4.3:** *Let p*, *q* ∈ *S*, *then we have*
$$V(q,p)\ge 2\underset{\omega \in \mathfrak{S}}{\mathrm{sup}}\left|{\displaystyle \underset{\mathrm{\omega}}{\int}q}(s)d\mu (s)-\mu (\omega )\right|(\ge 0).$$(14)

**Proof:** By putting *f*(*x*) = |*x* - 1|, *x* ≥ 0, in Theorem 4.2 we get (14). □

**Proposition 4.4:** *For any p*, *q* ∈ *S*,
$${I}_{{\chi}^{2}}(q,p)\ge \underset{\omega \in \mathfrak{S}}{\mathrm{sup}}\left\{\frac{{({\displaystyle {\int}_{\omega}q}(s)d\mu (s)-\mu (\omega ))}^{2}}{\mu (\omega )(1-\mu (\omega ))}\right\}\ge 4\text{\hspace{0.17em}}\text{\hspace{0.17em}}\underset{\omega \in \mathfrak{S}}{\mathrm{sup}}\left\{{({\displaystyle \underset{\mathrm{\omega}}{\int}q}(s)d\mu (s)-\mu (\omega ))}^{2}\right\}(\ge 0).$$(15)

**Proof:** By making use of the function *f*(*t*) = (*t* - 1)^{2} in Theorem 4.2 we get
$$\begin{array}{r}{\displaystyle \underset{\mathrm{\Omega}}{\int}p}(s){\left(\frac{q(s)}{p(s)}-1\right)}^{2}d\mu (s)\ge \underset{\omega \in \mathfrak{S}}{\mathrm{sup}}\{\mu (\omega ){\left(\frac{1}{\mu (\omega )}{\displaystyle \underset{\mathrm{\omega}}{\int}q}(s)d\mu (s)-1\right)}^{2}\\ +\mu (\overline{\omega}){\left(\frac{1}{\mu (\overline{\omega})}{\displaystyle \underset{\overline{\omega}}{\int}q}(s)d\mu (s)-1\right)}^{2}\}(\ge 0),\end{array}$$
that is,
$${\displaystyle \underset{\mathrm{\Omega}}{\int}\frac{{\left(q(s)-p(s)\right)}^{2}}{p(s)}}d\mu (s)\ge \underset{\omega \in \mathfrak{S}}{\mathrm{sup}}\left\{\frac{{\left({\displaystyle {\int}_{\omega}q}(s)d\mu (s)-\mu (\omega )\right)}^{2}}{\mu (\omega )(1-\mu (\omega ))}\right\}(\ge 0).$$
Since by the arithmetic mean-geometric mean inequality we have
$$\mu (\omega )(1-\mu (\omega ))\le \frac{1}{4}{[\mu (\omega )+(1-\mu (\omega ))]}^{2}=\frac{1}{4},$$
it follows that
$$\frac{{({\int}_{\omega}q(s)d\mu (s)-\mu (\omega ))}^{2}}{\mu (\omega )(1-\mu (\omega ))}\ge 4{\left(\underset{\omega}{\int}q(s)d\mu (s)-\mu (\omega )\right)}^{2}(\ge 0),$$
which proves (15). □
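Proposition 4.4 can also be checked numerically in the discrete setting (our sketch; the distributions are arbitrary, and *μ*(*ω*) is again read as the *p*-measure of *ω*): we enumerate all nonempty proper subsets *ω* and compare the two suprema in (15) with the *χ*²-divergence.

```python
from itertools import chain, combinations

# Discrete check of (15): chi^2-divergence vs. the two suprema over subsets.
p = [0.25, 0.25, 0.25, 0.25]
q = [0.4, 0.3, 0.2, 0.1]
n = len(p)

chi2 = sum((qs - ps) ** 2 / ps for qs, ps in zip(q, p))

# All nonempty proper subsets of {0, ..., n-1}.
subsets = [set(s) for s in
           chain.from_iterable(combinations(range(n), r) for r in range(1, n))]

def PQ(w):
    return sum(p[i] for i in w), sum(q[i] for i in w)

sup1 = max((Q - P) ** 2 / (P * (1 - P)) for P, Q in map(PQ, subsets))
sup2 = 4 * max((Q - P) ** 2 for P, Q in map(PQ, subsets))

assert chi2 >= sup1 - 1e-12 >= sup2 - 1e-12   # the chain of inequalities (15)
```

The second supremum never exceeds the first because *P*(1 - *P*) ≤ 1/4 holds set by set, exactly as in the proof above.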

**Proposition 4.5:** *For any p*, *q* ∈ *S and any ω* ∈ 𝔖, *we have*:
$${D}_{KL}(q,p)\ge \mathrm{ln}\left[{\left(\frac{1-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{1-\mu (\omega )}\right)}^{1-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}.{\left(\frac{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{\mu (\omega )}\right)}^{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}\right](\ge 0).$$(16)

**Proof:** By putting *f*(*t*) = *t* ln *t* in Theorem 4.2 one can get the first inequality in (16). To prove the second inequality, we utilize the inequality between the geometric mean and the harmonic mean,
$${x}^{\alpha}{y}^{1-\alpha}\ge \frac{1}{\frac{\alpha}{x}+\frac{1-\alpha}{y}},\text{\hspace{1em}\hspace{1em}}x,y>0,\hspace{0.17em}\alpha \in [0,1].$$
Taking
$$x=\frac{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{\mu (\omega )},\hspace{0.17em}y=\frac{1-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{1-\mu (\omega )}\text{\hspace{0.17em}\hspace{0.17em}}\text{and}\text{\hspace{0.17em}}\alpha ={\displaystyle \underset{\mathrm{\omega}}{\int}q}(s)d\mu (s),$$
we obtain
$${\left(\frac{1-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{1-\mu (\omega )}\right)}^{1-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}.{\left(\frac{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{\mu (\omega )}\right)}^{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}\ge 1$$
for any *ω* ∈ 𝔖, which implies the second inequality in (16). □

**Proposition 4.6:** *For any p*, *q* ∈ *S*, *we have*:
$$\begin{array}{c}J(q,p)\ge \mathrm{ln}\left(\underset{\omega \in \mathfrak{S}}{\mathrm{sup}}\left\{{\left[\frac{(1-{\displaystyle {\int}_{\omega}q}(s)d\mu (s))\mu (\omega )}{(1-\mu (\omega )){\displaystyle {\int}_{\omega}q}(s)d\mu (s)}\right]}^{(\mu (\omega )-{\displaystyle {\int}_{\omega}q}(s)d\mu (s))}\right\}\right)\\ \ge \underset{\omega \in \mathfrak{S}}{\mathrm{sup}}\left(\frac{2{(\mu (\omega )-{\displaystyle {\int}_{\omega}q}(s)d\mu (s))}^{2}}{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)+\mu (\omega )-2{\displaystyle {\int}_{\omega}q}(s)d\mu (s)\mu (\omega )}\right)\ge 0.\end{array}$$(17)

**Proof:** By putting *f*(*x*) = (*x* - 1) ln *x*, *x* > 0, in Theorem 4.2 we have
$$\begin{array}{r}{\displaystyle \underset{\mathrm{\Omega}}{\int}p}(s)\left(\frac{q(s)}{p(s)}-1\right)\mathrm{ln}\left(\frac{q(s)}{p(s)}\right)d\mu (s)\ge \underset{\omega \in \mathfrak{S}}{\mathrm{sup}}(\mu (\omega )\left(\frac{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{\mu (\omega )}-1\right)\mathrm{ln}\left(\frac{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{\mu (\omega )}\right)\\ +\mu (\overline{\omega})\left(\frac{{\displaystyle {\int}_{\overline{\omega}}q}(s)d\mu (s)}{\mu (\overline{\omega})}-1\right)\mathrm{ln}\left(\frac{{\displaystyle {\int}_{\overline{\omega}}q}(s)d\mu (s)}{\mu (\overline{\omega})}\right))\\ =\underset{\omega \in \mathfrak{S}}{\mathrm{sup}}(\left({\displaystyle \underset{\mathrm{\omega}}{\int}q}(s)d\mu (s)-\mu (\omega )\right)\mathrm{ln}\left(\frac{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{\mu (\omega )}\right)\\ +\left({\displaystyle \underset{\overline{\omega}}{\int}q}(s)d\mu (s)-\mu (\overline{\omega})\right)\mathrm{ln}\left(\frac{{\displaystyle {\int}_{\overline{\omega}}q}(s)d\mu (s)}{\mu (\overline{\omega})}\right)),\end{array}$$
that is,
$$\begin{array}{r}J(q,p)\ge \underset{\omega \in \mathfrak{S}}{\mathrm{sup}}(\left(\mu (\omega )-{\displaystyle \underset{\mathrm{\omega}}{\int}q}(s)d\mu (s)\right)\mathrm{ln}\left(\frac{1-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{1-\mu (\omega )}\right)\\ -\left(\mu (\omega )-{\displaystyle \underset{\mathrm{\omega}}{\int}q}(s)d\mu (s)\right)\mathrm{ln}\left(\frac{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{\mu (\omega )}\right)),\end{array}$$
proving the first inequality in (17). Utilizing the elementary inequality for positive numbers,
$$\frac{\mathrm{ln}b-\mathrm{ln}a}{b-a}\ge \frac{2}{a+b},\text{\hspace{1em}}a,b>0,$$
we have
$$\begin{array}{c}\left(\mu (\omega )-{\displaystyle \underset{\mathrm{\omega}}{\int}q}(s)d\mu (s)\right)\left[\mathrm{ln}\left(\frac{1-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{1-\mu (\omega )}\right)-\mathrm{ln}\left(\frac{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{\mu (\omega )}\right)\right]\\ =\left(\mu (\omega )-{\displaystyle \underset{\mathrm{\omega}}{\int}q}(s)d\mu (s)\right)\frac{\mathrm{ln}\left(\frac{1-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{1-\mu (\omega )}\right)-\mathrm{ln}\left(\frac{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{\mu (\omega )}\right)}{\frac{1-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{1-\mu (\omega )}-\frac{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{\mu (\omega )}}\\ \times \left[\frac{1-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{1-\mu (\omega )}-\frac{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{\mu (\omega )}\right]\\ =\frac{{\left(\mu (\omega )-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)\right)}^{2}}{\mu (\omega )(1-\mu (\omega ))}.\frac{\mathrm{ln}\left(\frac{1-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{1-\mu (\omega )}\right)-\mathrm{ln}\left(\frac{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{\mu (\omega )}\right)}{\frac{1-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{1-\mu (\omega )}-\frac{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{\mu (\omega )}}\\ \ge \frac{{\left(\mu (\omega )-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)\right)}^{2}}{\mu (\omega )(1-\mu (\omega ))}.\frac{2}{\frac{1-{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{1-\mu (\omega )}+\frac{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)}{\mu (\omega )}}\\ =\frac{2{(\mu (\omega )-{\displaystyle {\int}_{\omega}q}(s)d\mu (s))}^{2}}{{\displaystyle {\int}_{\omega}q}(s)d\mu (s)+\mu (\omega )-2{\displaystyle {\int}_{\omega}q}(s)d\mu (s)\mu (\omega )}\ge 0,\end{array}$$
for each *ω* ∈ 𝔖, giving the second inequality in (17). □

**Proposition 4.7:** *For any p*, *q* ∈ *S*, *we have*:
$${D}_{\alpha}(q,p)\ge \underset{\omega \in \mathfrak{S}}{\mathrm{sup}}\left[{(\mu (\omega ))}^{1-\alpha}{\left({\displaystyle \underset{\mathrm{\omega}}{\int}q}(s)d\mu (s)\right)}^{\alpha}+{(1-\mu (\omega ))}^{1-\alpha}{\left(1-{\displaystyle \underset{\mathrm{\omega}}{\int}q}(s)d\mu (s)\right)}^{\alpha}\right]\ge 1.$$(18)

**Proof:** By putting *f*(*x*) = *x*^{*α*} for *α* > 1, *x* > 0, in Theorem 4.2 we get the required inequalities. □
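As with the previous propositions, (18) is easy to check in the discrete case (our sketch; *α* = 2, the distributions, and the reading of *μ*(*ω*) as the *p*-measure of *ω* are our choices):

```python
from itertools import chain, combinations

# Discrete check of (18) for alpha = 2: D_alpha(q, p) = sum q^alpha * p^(1-alpha).
alpha = 2
p = [0.25, 0.25, 0.25, 0.25]
q = [0.4, 0.3, 0.2, 0.1]
n = len(p)

D_alpha = sum(qs ** alpha * ps ** (1 - alpha) for qs, ps in zip(q, p))

# All nonempty proper subsets of {0, ..., n-1}.
subsets = [set(s) for s in
           chain.from_iterable(combinations(range(n), r) for r in range(1, n))]

def middle(w):
    """The bracketed expression in (18) for a given subset omega."""
    P = sum(p[i] for i in w)
    Q = sum(q[i] for i in w)
    return P ** (1 - alpha) * Q ** alpha + (1 - P) ** (1 - alpha) * (1 - Q) ** alpha

sup_mid = max(middle(w) for w in subsets)

assert D_alpha >= sup_mid - 1e-12 >= 1 - 1e-12   # the chain of inequalities (18)
```

The supremum over subsets strictly improves on the trivial lower bound 1 whenever *q* ≠ *p*.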

## Acknowledgement

The authors express their sincere thanks to the referees for their careful reading of the manuscript and very helpful suggestions that improved the manuscript.

## References

- [1] Adil Khan M., Anwar M., Jakšetić J. and Pečarić J., *On some improvements of the Jensen inequality with some applications*, J. Inequal. Appl. **2009** (2009), Article ID 323615, 15 pages.
- [2] Adil Khan M., Khan G. A., Ali T., Batbold T. and Kilicman A., *Further refinement of Jensen’s type inequalities for the function defined on the rectangle*, Abstr. Appl. Anal. **2013** (2013), Article ID 214123, 1-8.
- [3] Adil Khan M., Khan G. A., Ali T. and Kilicman A., *On the refinement of Jensen’s inequality*, Appl. Math. Comput. **262** (1) (2015), 128-135.
- [4] Beesack P. R. and Pečarić J., *On Jessen’s inequality for convex functions*, J. Math. Anal. Appl. **110** (1985), 536-552.
- [5] Dragomir S. S., *A refinement of Jensen’s inequality with applications for f-divergence measures*, Taiwanese J. Math. **14** (1) (2010), 153-164.
- [6] Dragomir S. S., *A new refinement of Jensen’s inequality in linear spaces with applications*, Math. Comput. Modelling **52** (2010), 1497-1505.
- [7] Dragomir S. S., *Some refinements of Jensen’s inequality*, J. Math. Anal. Appl. **168** (2) (1992), 518-522.
- [8] Dragomir S. S., *A further improvement of Jensen’s inequality*, Tamkang J. Math. **25** (1) (1994), 29-36.
- [9] Mićić Hot J., Pečarić J. and Perić J., *Refined Jensen’s operator inequality with condition on spectra*, Oper. Matrices **7** (2) (2013), 293-308.
- [10] Pečarić J., Proschan F. and Tong Y. L., *Convex Functions, Partial Orderings and Statistical Applications*, Academic Press, New York, 1992.
- [11] Csiszár I., *Information measures: a critical survey*, Trans. 7th Prague Conf. on Info. Th., Volume B, Academia Prague, 1978, 73-86.
- [12] Pardo M. C. and Vajda I., *On asymptotic properties of information-theoretic divergences*, IEEE Trans. Inform. Theory **49** (3) (2003), 1860-1868.
- [13] Kullback S. and Leibler R. A., *On information and sufficiency*, Ann. Math. Statist. **22** (1951), 79-86.
- [14] Cressie N. and Read T. R. C., *Multinomial goodness-of-fit tests*, J. Roy. Statist. Soc. Ser. B **46** (1984), 440-464.
- [15] Kafka P., Österreicher F. and Vincze I., *On powers of f-divergences defining a distance*, Studia Sci. Math. Hungar. **26** (4) (1991), 415-422.
- [16] Liese F. and Vajda I., *Convex Statistical Distances*, Teubner-Texte zur Mathematik, 95, BSB B. G. Teubner Verlagsgesellschaft, Leipzig, 1987.
- [17] Csiszár I. and Körner J., *Information Theory: Coding Theorems for Discrete Memoryless Systems*, Academic Press, New York, 1981.

**Received**: 2015-09-07

**Accepted**: 2015-12-28

**Published Online**: 2016-04-23

**Published in Print**: 2016-01-01

**Citation Information:** Open Mathematics, ISSN (Online) 2391-5455, DOI: https://doi.org/10.1515/math-2016-0020.
