- #1
mathmari
Gold Member
MHB
- 5,049
- 7
Hey!
A function $f:\mathbb{R}^n\rightarrow \mathbb{R}$ is convex if for all $x,y\in \mathbb{R}^n$ the inequality $$f(tx+(1-t)y)\leq tf(x)+(1-t)f(y)$$ holds for all $t\in [0,1]$.
Show that a twice continuously differentiable funtion $f:\mathbb{R}^n\rightarrow \mathbb{R}$ is convex iff the Hessian matrix is positive semidefinite for all $x\in \mathbb{R}^n$. I have read on some sites the idea of the proof but I am not sure that I understood that correctly. (Worried)
We suppose that the Hessian matrix is positive semidefinite for all $x\in \mathbb{R}^n$. We define the function:
$$g(t) = f(t \mathbf{x} + (1-t) \mathbf{y})$$
Then we can compute the derivatives of $g$:
$$g'(t) = ( \mathbf{x} - \mathbf{y})^T \mathbf{\nabla}f(t \mathbf{x} + (1-t) \mathbf{y})$$
$$g''(t) = ( \mathbf{x} - \mathbf{y})^T \mathbf{\nabla^2}f(t \mathbf{x} + (1-t) \mathbf{y}) ( \mathbf{x} - \mathbf{y})$$
For the derivatives we use the chain rule, right? Do we use this rule as follows? $$g'(t) = \frac{\partial}{\partial{t}}f(t \mathbf{x} + (1-t) \mathbf{y})=\sum_i\frac{\partial}{\partial{f_i}}f(t \mathbf{x} + (1-t) \mathbf{y})\cdot \frac{\partial{f_i}}{\partial{t}}=\sum_i\frac{\partial}{\partial{f_1}}f(t \mathbf{x} + (1-t) \mathbf{y})\cdot (\mathbf{x}-\mathbf{y})$$
:unsure:
Since the Hessian is positive semidefinite, we have $g''(t) \ge 0$ for all $t$. I haven't really understood this one. How do we get that $g''(t) \ge 0$ ?
Then we use this with Taylor's theorem to prove that:
$$
\begin{aligned}
g(0) &\ge g(t) + g'(t)(-t)\\
g(1) &\ge g(t) + g'(t)(1-t)
\end{aligned}
$$
Then if $t \in [0,1]$, these can then be combined to give:
$$
g(t) \le tg(1) + (1-t)g(0)
$$
which is equivalent to the inequality we wanted to prove. :unsure:
A function $f:\mathbb{R}^n\rightarrow \mathbb{R}$ is convex if for all $x,y\in \mathbb{R}^n$ the inequality $$f(tx+(1-t)y)\leq tf(x)+(1-t)f(y)$$ holds for all $t\in [0,1]$.
Show that a twice continuously differentiable funtion $f:\mathbb{R}^n\rightarrow \mathbb{R}$ is convex iff the Hessian matrix is positive semidefinite for all $x\in \mathbb{R}^n$. I have read on some sites the idea of the proof but I am not sure that I understood that correctly. (Worried)
We suppose that the Hessian matrix is positive semidefinite for all $x\in \mathbb{R}^n$. We define the function:
$$g(t) = f(t \mathbf{x} + (1-t) \mathbf{y})$$
Then we can compute the derivatives of $g$:
$$g'(t) = ( \mathbf{x} - \mathbf{y})^T \mathbf{\nabla}f(t \mathbf{x} + (1-t) \mathbf{y})$$
$$g''(t) = ( \mathbf{x} - \mathbf{y})^T \mathbf{\nabla^2}f(t \mathbf{x} + (1-t) \mathbf{y}) ( \mathbf{x} - \mathbf{y})$$
For the derivatives we use the chain rule, right? Do we use this rule as follows? $$g'(t) = \frac{\partial}{\partial{t}}f(t \mathbf{x} + (1-t) \mathbf{y})=\sum_i\frac{\partial}{\partial{f_i}}f(t \mathbf{x} + (1-t) \mathbf{y})\cdot \frac{\partial{f_i}}{\partial{t}}=\sum_i\frac{\partial}{\partial{f_1}}f(t \mathbf{x} + (1-t) \mathbf{y})\cdot (\mathbf{x}-\mathbf{y})$$
:unsure:
Since the Hessian is positive semidefinite, we have $g''(t) \ge 0$ for all $t$. I haven't really understood this one. How do we get that $g''(t) \ge 0$ ?
Then we use this with Taylor's theorem to prove that:
$$
\begin{aligned}
g(0) &\ge g(t) + g'(t)(-t)\\
g(1) &\ge g(t) + g'(t)(1-t)
\end{aligned}
$$
Then if $t \in [0,1]$, these can then be combined to give:
$$
g(t) \le tg(1) + (1-t)g(0)
$$
which is equivalent to the inequality we wanted to prove. :unsure: