- #1
psie
- TL;DR Summary
- I am reading a proof of Jensen's inequality for conditional expectation in Le Gall's book Measure Theory, Probability and Stochastic Processes. I am a bit surprised that this inequality does not simply follow from the measure-theoretic form established earlier, but instead requires a new, somewhat technical proof. I have some questions about the proof.
Theorem. Let ##\varphi:\mathbb R\to\mathbb R_+## be a convex function and let ##X\in L^1##. Then $$E[\varphi(X)\mid\mathcal B]\geq\varphi(E[X\mid\mathcal B]).$$
Proof: Set $$\mathcal E_\varphi=\{(a,b)\in\mathbb R^2:\forall x\in\mathbb R,\varphi(x)\geq ax+b\}.$$ Then by convexity of ##\varphi##, $$\varphi \left(x\right)=\underbrace{\sup_{\left(a{,}b\right)\in \mathcal E_{\varphi }}\left(ax+b\right)}_{g(x)}=\underbrace{\sup_{\left(a{,}b\right)\in \mathcal E_{\varphi }\cap \mathbb Q^2}\left(ax+b\right)}_{h(x)}.$$ We can take advantage of the fact that ##\mathbb Q^2## is countable to disgard [I think it should be discard] a countable collection of sets of probability zero and to get that, a.s., \begin{align*} E[\varphi(X)\mid \mathcal B]&=E\left[\sup_{\left(a{,}b\right)\in \mathcal E_{\varphi }\cap \mathbb Q^2}\left(aX+b\right)\Bigm\vert \mathcal B\right] \\ &\geq \sup_{\left(a{,}b\right)\in \mathcal E_{\varphi }\cap \mathbb Q^2}E[aX+b\mid\mathcal B] \\ &=\varphi(E[X\mid\mathcal B]).\end{align*}
Questions:
1. I am a bit unsure why ##g(x)=h(x)##. Clearly ##g(x)\geq h(x)##, but why is ##g(x)\leq h(x)##? Here's my explanation, which is kind of lengthy, but maybe you have a better one.
If ##(a,b)\in\mathcal E_{\varphi}## is such that ##\varphi(x)>ax+b## for all ##x\in\mathbb R##, then pick a number ##q_x## in between. Let ##q_x=a'x+b'## where by denseness we choose ##(a',b')## sufficiently close to ##(a,b)## so that ##q_x## satisfies the inequality ##\varphi(x)>q_x>ax+b## for all ##x\in\mathbb R##. Then ##(a',b')\in\mathcal E_\varphi\cap\mathbb Q^2##, and since ##(a,b)## was arbitrary, this shows that ##g(x)\leq h(x)## when ##\varphi(x)>ax+b##. If ##(a,b)\in\mathcal E_{\varphi}## is such that ##\varphi(x)=ax+b##, then we approximate ##(a,b)## from below by rational pairs, and the supremum will give that ##g(x)=h(x)##. Does this make sense?
2. I do not understand what the author means by "We can take advantage of the fact that ##\mathbb Q^2## is countable to [discard] a countable collection of sets of probability zero...". Moreover, I am a bit unsure about the inequality in the proof. Is this simply an application of monotonicity, i.e. $$\sup(aX+b)\geq aX+b\implies E[\sup(aX+b)\mid\mathcal B]\geq E[aX+b\mid\mathcal B],$$ so that taking the supremum over ##(a,b)## on the right-hand side gives the desired inequality in the proof? If my reasoning is correct, I don't see why we need to consider ##\mathcal E_\varphi\cap\mathbb Q^2##.
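The one place I can see countability entering is the following tentative reading, which I would like checked. The null sets ##N_{a,b}## and ##N## are my own notation, not the book's, so this is a sketch rather than the author's argument. For each fixed ##(a,b)\in\mathcal E_\varphi\cap\mathbb Q^2## we have ##\varphi(X)\geq aX+b## pointwise, so monotonicity and linearity of conditional expectation give an inequality that holds only almost surely, i.e. outside a null set that depends on ##(a,b)##: \begin{align*} &E[\varphi(X)\mid\mathcal B]\geq aE[X\mid\mathcal B]+b \quad\text{on } N_{a,b}^c, \qquad P(N_{a,b})=0, \\ &N=\bigcup_{(a,b)\in\mathcal E_\varphi\cap\mathbb Q^2}N_{a,b}, \qquad P(N)\leq\sum_{(a,b)\in\mathcal E_\varphi\cap\mathbb Q^2}P(N_{a,b})=0, \\ &E[\varphi(X)\mid\mathcal B]\geq\sup_{(a,b)\in\mathcal E_\varphi\cap\mathbb Q^2}\left(aE[X\mid\mathcal B]+b\right)=\varphi(E[X\mid\mathcal B]) \quad\text{on } N^c.\end{align*}
If this is right, the restriction to ##\mathbb Q^2## would matter because a union of uncountably many null sets ##N_{a,b}## need not be null (or even measurable), so the supremum over all of ##\mathcal E_\varphi## could not be taken inside an almost sure inequality in this way.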