Deriving Extrema of Homogeneous Functions: A Chain Rule Comparison

  • MHB
  • Thread starter mathmari
  • Start date
  • Tags
    Form
In summary, we discussed the properties of a twice differentiable, homogeneous function of degree 2. We explored how to show that the function has its possible local extrema at its roots and how to show that it can be expressed in the form $f(x)=\frac{1}{2}x^T\cdot H_f(0)\cdot x$. We also discussed how to use the fact that the function is homogeneous of degree 2 to prove these statements. Finally, we took a closer look at the case where the function is only twice continuously differentiable and found that the same conclusions still hold.
  • #1
mathmari
Gold Member
MHB
Hey! :eek:

Let $f:\mathbb{R}^n\rightarrow \mathbb{R}$ be twice differentiable and homogeneous of degree $2$.

To show that the function has its possible local extrema at its roots, do we have to show that the function is equal to $0$ at every point where the first derivative, i.e. the gradient, is equal to $0$ ?

Also, how can we show that $f$ is of the form $f(x)=\frac{1}{2}x^T\cdot H_f(0)\cdot x$, where $H_f(0)$ is the Hessian matrix of $f$ at $0$ ? Could you give me a hint?

(Wondering)
 
Last edited by a moderator:
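For a quick sanity check of the setup, here is a minimal Python sketch (a hypothetical quadratic-form example, not from the thread) verifying the degree-2 homogeneity condition $f(t\mathbf x)=t^2f(\mathbf x)$:

```python
# Minimal sketch (hypothetical example): the quadratic form f(x) = x^T A x
# with a symmetric matrix A is homogeneous of degree 2, i.e. f(t*x) = t^2 f(x).
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
A = (A + A.T) / 2              # symmetrize; the Hessian of f is then 2*A

def f(x):
    return x @ A @ x           # f(x) = x^T A x

x = rng.standard_normal(3)
for t in (0.5, 2.0, 3.0):
    assert np.isclose(f(t * x), t**2 * f(x))   # degree-2 homogeneity
print("f(t*x) = t^2 f(x) holds for the sampled t and x")
```

As the thread goes on to show, every twice continuously differentiable function that is homogeneous of degree $2$ is in fact of this quadratic-form type.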
  • #2
Hey mathmari!

Suppose we expand f(x) as a Taylor series.
What will we get when we check the property of homogeneity? (Wondering)
 
  • #3
I like Serena said:
Suppose we expand f(x) as a Taylor series.
What will we get when we check the property of homogeneity? (Wondering)

We have that $$T_2(x)=f(a)+(x-a)^T\nabla f(a)+\frac{1}{2!}(x-a)^TH(a)(x-a)$$ For $a=0$ we get $$T_2(x)=f(0)+x^T\nabla f(0)+\frac{1}{2!}x^TH(0)x$$

Do we now substitute $tx$ for $x$ ? Or how can we use the fact that $f$ is homogeneous of degree $2$ ?
 
  • #4
mathmari said:
We have that $$T_2(x)=f(a)+(x-a)^T\nabla f(a)+\frac{1}{2!}(x-a)^TH(a)(x-a)$$ For $a=0$ we get $$T_2(x)=f(0)+x^T\nabla f(0)+\frac{1}{2!}x^TH(0)x$$

Do we now substitute $tx$ for $x$ ? Or how can we use the fact that $f$ is homogeneous of degree $2$ ?

Let's use the full expansion.
That is, we have:
$$f(x) = f(0) + Df(0)x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) x^3 + ...$$

Indeed, we now substitute $tx$ and use that $f(tx)=t^2f(x)$.
For the condition to hold, every coefficient in both expansions must be the same.
That is because a Taylor expansion is unique. (Thinking)
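To illustrate this coefficient-matching idea, here is a one-variable sympy sketch (a hypothetical illustration, assuming a cubic truncation of the Taylor series): imposing $p(tx)=t^2p(x)$ on a generic polynomial forces every coefficient except the quadratic one to vanish.

```python
# One-variable sketch of the coefficient-matching argument (hypothetical
# illustration): impose p(t*x) = t^2 * p(x) on a generic cubic polynomial
# and see which Taylor coefficients survive.
import sympy as sp

x, t = sp.symbols('x t')
a0, a1, a2, a3 = sp.symbols('a0 a1 a2 a3')
p = a0 + a1*x + a2*x**2 + a3*x**3        # truncated Taylor expansion at 0

# p(t*x) - t^2*p(x) must vanish identically in x and t
residual = sp.expand(p.subs(x, t*x) - t**2 * p)
equations = sp.Poly(residual, x, t).coeffs()   # coefficients of all monomials
print(sp.solve(equations, [a0, a1, a2, a3], dict=True))
# -> a0 = a1 = a3 = 0, while a2 (the quadratic coefficient) stays free
```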
 
  • #5
I like Serena said:
Let's use the full expansion.
That is, we have:
$$f(x) = f(0) + Df(0)x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) x^3 + ...$$

Indeed, we now substitute $tx$ and use that $f(tx)=t^2f(x)$.
For the condition to hold, every coefficient in both expansions must be the same.
That is because a Taylor expansion is unique. (Thinking)
\begin{align*}&f(tx) = f(0) + Df(0)tx + \frac 12 D^2f(0)t^2x^2 + \frac 1{3!} D^3f(0) t^3x^3 + ...\\ & \Rightarrow t^2f(x)=f(0) + Df(0)tx + \frac 12 D^2f(0)t^2x^2 + \frac 1{3!} D^3f(0) t^3x^3 + ...\\ &\Rightarrow f(x)=\frac{1}{t^2}f(0) + Df(0)\frac{x}{t} + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) tx^3 + ...\end{align*}

So we have that $$f(x) = f(0) + Df(0)x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) x^3 + ...$$ and $$f(x)=\frac{1}{t^2}f(0) + Df(0)\frac{x}{t} + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) tx^3 + ...$$ That means that $1=\frac{1}{t^2}$, or not? (Wondering)
 
  • #6
Good! (Happy)

mathmari said:
That means that $1=\frac{1}{t^2}$, or not?

Not quite.
The property must hold for any $t$, doesn't it?
It means that for instance $f(0)$ must be $0$... (Thinking)
 
  • #7
I like Serena said:
Not quite.
The property must hold for any $t$, doesn't it?
It means that for instance $f(0)$ must be $0$... (Thinking)

I got stuck right now. What do you mean? (Wondering)
 
  • #8
mathmari said:
I got stuck right now. What do you mean?

Isn't a homogeneous function $f$ of degree 2 such that $f(t\mathbf x)=t^2 f(\mathbf x)$ for all $t>0$ and all $\mathbf x$?
Or is it different? (Wondering)

Let's pick some $\mathbf x \ne \mathbf 0$ and compare $t=1$ with $t=2$.
Then we must have $\frac 1{1^2} f(0)=\frac 1{2^2} f(0)$.
This can only be true if $f(0)=0$, can't it? (Wondering)
 
  • #9
I like Serena said:
Isn't a homogeneous function $f$ of degree 2 such that $f(t\mathbf x)=t^2 f(\mathbf x)$ for all $t>0$ and all $\mathbf x$?
Or is it different? (Wondering)

Let's pick some $\mathbf x \ne \mathbf 0$ and compare $t=1$ with $t=2$.
Then we must have $\frac 1{1^2} f(0)=\frac 1{2^2} f(0)$.
This can only be true if $f(0)=0$, can't it? (Wondering)
Ah ok!

We have that $$f(x) = f(0) + Df(0)x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) x^3 + ...$$ and $$f(x)=\frac{1}{t^2}f(0) + \frac{Df(0)}{t}x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) tx^3 + ...$$

$\frac{1}{t^2}f(0) =f(0), \forall t$ holds only when $f(0)=0$.

$\frac{Df(0)}{t}=Df(0), \forall t$ holds only when $Df(0)=0$

$\frac 1{3!} D^3f(0) =\frac 1{3!} D^3f(0) t, \forall t$ holds only when $D^3f(0)=0$

Similarly, for any $k\geq 4$ we get $D^kf(0)=0$.

In that way we get that $f(x)=\frac 12 D^2f(0)x^2$. This is the same as $\frac{1}{2}x^T\cdot H_f(0)\cdot x$, or not? (Wondering)

And for the first question, that the function has its possible local extrema at its roots, do we get that from the fact that $f(0)=Df(0)=0$ ? (Wondering)
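To see this conclusion numerically, here is a small Python sketch (hypothetical quadratic example) that compares $f(x)$ with $\frac{1}{2}x^T H_f(0)x$, approximating the Hessian at $0$ by central finite differences:

```python
# Numerical check of f(x) = (1/2) x^T H_f(0) x for a concrete degree-2
# homogeneous function (hypothetical example: a quadratic form).
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, -1.0],
              [0.0, -1.0, 1.0]])        # symmetric

def f(x):
    return x @ A @ x                    # homogeneous of degree 2

def hessian_at_zero(f, n, h=1e-4):
    """Central finite-difference approximation of H_f(0)."""
    H = np.zeros((n, n))
    e = np.eye(n)
    for i in range(n):
        for j in range(n):
            H[i, j] = (f(h*e[i] + h*e[j]) - f(h*e[i] - h*e[j])
                       - f(-h*e[i] + h*e[j]) + f(-h*e[i] - h*e[j])) / (4*h*h)
    return H

H0 = hessian_at_zero(f, 3)
x = np.array([0.7, -1.2, 0.4])
print(f(x), 0.5 * x @ H0 @ x)           # the two values agree (up to rounding)
```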
 
  • #10
mathmari said:
In that way we get that $f(x)=\frac 12 D^2f(0)x^2$. This is the same as $\frac{1}{2}x^T\cdot H_f(0)\cdot x$, or not?

And for the first question, that the function has its possible local extrema at its roots, do we get that from the fact that $f(0)=Df(0)=0$ ?

Yes to both. (Nod)

So we have proven the statements if $f$ is infinitely differentiable.
However, I have just realized that this is not given.
We only have that $f$ is differentiable twice, and the second derivative does not even have to be continuous. (Worried)
 
  • #11
I like Serena said:
Yes to both. (Nod)

So we have proven the statements if $f$ is infinitely differentiable.
However, I have just realized that this is not given.
We only have that $f$ is differentiable twice, and the second derivative does not even have to be continuous. (Worried)
I just saw that I forgot the word "continuously" in my initial post, so $f$ is twice continuously differentiable. (Blush)

But having that $f(0)=Df(0)=0$ means that the function and the gradient are $0$ at the point $0$. Does this mean that the function can have its extrema at the same points as its roots? (Wondering)
 
Last edited by a moderator:
  • #12
Or do we have to do the same for a general point instead of $0$ ? (Wondering)
 
  • #13
I am stuck now. (Worried)
 
  • #14
Thanks to Euge I have a solution to your problem now. (Happy)

From the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$, take the derivative with respect to $t$ twice on both sides to obtain $$\sum\limits_{i,\,j} x^ix^j\frac{\partial^2 f}{\partial x^i \partial x^j}(t\mathbf{x}) = 2f(\mathbf{x})$$ where $\mathbf{x} = (x^1,\ldots, x^n)$. Since $f$ is twice continuously differentiable, taking limits as $t \to 0$ results in (after dividing by $2$) $$f(\mathbf{x}) = \frac{1}{2}\sum\limits_{i,\,j} x^i\frac{\partial^2 f}{\partial x^i \partial x^j}(\mathbf{0})\,x^j = \frac{1}{2}\mathbf{x}^TH_f(\mathbf{0})\,\mathbf{x}$$ as desired. (Nerd)

As for the other statement, take the first derivative with respect to $t$ on both sides of the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$ to get $\mathbf{x}\cdot Df(t\mathbf{x}) = 2tf(\mathbf{x})$. Evaluating at $t = 1$ yields $$f(\mathbf{x}) = \frac{1}{2}\mathbf{x}\cdot Df(\mathbf{x})$$ If $f$ has a critical point at $\mathbf{x} = \mathbf{c}$, then $Df(\mathbf{c}) = \mathbf{0}$; in light of the above equation, $f(\mathbf{c}) = \frac{1}{2}\mathbf{c}\cdot \mathbf{0} = 0$, that is, $\mathbf{c}$ is a zero of $f$. (Thinking)
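For what it's worth, here is a quick numerical check (hypothetical quadratic example, Python) of the Euler relation $\mathbf{x}\cdot Df(\mathbf{x}) = 2f(\mathbf{x})$ used in the second argument:

```python
# Check x . Df(x) = 2 f(x) for a degree-2 homogeneous function
# (hypothetical quadratic example), using a finite-difference gradient.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

def f(x):
    return x @ A @ x                        # homogeneous of degree 2

def grad(f, x, h=1e-6):
    e = np.eye(len(x))
    return np.array([(f(x + h*e[i]) - f(x - h*e[i])) / (2*h)
                     for i in range(len(x))])

x = np.array([1.5, -0.3])
print(x @ grad(f, x), 2 * f(x))             # both sides of Euler's relation

# Consequence: at a critical point c we have Df(c) = 0, hence f(c) = 0,
# i.e. every possible local extremum of f lies at a root of f.
```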
 
  • #15
I like Serena said:
From the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$, take the derivative with respect to $t$ twice on both sides to obtain $$\sum\limits_{i,\,j} x^ix^j\frac{\partial^2 f}{\partial x^i \partial x^j}(t\mathbf{x}) = 2f(\mathbf{x})$$ where $\mathbf{x} = (x^1,\ldots, x^n)$.

We have the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$. When we take the second derivative with respect to $t$ on the left side we get the following using the chain rule:
\begin{align*}\frac{d}{dt}f(t\mathbf{x})&=\frac{d}{dt}f(t(x_1, x_2, \ldots , x_n))=\frac{d}{dt}f(tx_1, tx_2, \ldots , tx_n)=\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot \frac{d(tx_i)}{dt}\\ & =\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot x_i\end{align*}
\begin{align*}\frac{d^2}{dt^2}f(t\mathbf{x})&=\frac{d}{dt}\left (\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )=\frac{d}{dt}\left (\sum_{i=1}^nf_{x_i}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )\\ & =\sum_{i=1}^nx_i\cdot \frac{d}{dt}f_{x_i}(t(x_1, x_2, \ldots , x_n)) = \sum_{i=1}^nx_i\cdot \sum_{j=1}^n\frac{\partial{f_{x_i}}}{\partial{x_j}}\cdot \frac{d(tx_j)}{dt}\\ & =\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial{f_{x_i}}}{\partial{x_j}}=\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial^2{f}}{\partial{x_i}\partial{x_j}}\end{align*}
Is everything correct? Could we improve something? (Wondering)
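Here is a symbolic check of this chain-rule computation in two variables (a sympy sketch; the test function $f(u,v)=\sin(u)e^{v}$ is a hypothetical choice, and the identity holds for any smooth $f$):

```python
# Symbolic check (sympy sketch): d^2/dt^2 f(t*x, t*y) equals
# sum_{i,j} x_i x_j (d^2 f / dx_i dx_j) evaluated at (t*x, t*y).
import sympy as sp

t, x, y, u, v = sp.symbols('t x y u v')
f = sp.sin(u) * sp.exp(v)                      # hypothetical smooth test function

lhs = sp.diff(f.subs({u: t*x, v: t*y}), t, 2)  # d^2/dt^2 f(t*x, t*y)

# second partials of f, evaluated at (t*x, t*y)
second = lambda a, b: sp.diff(f, a, b).subs({u: t*x, v: t*y})
rhs = x**2 * second(u, u) + 2*x*y * second(u, v) + y**2 * second(v, v)

print(sp.simplify(lhs - rhs))                  # 0
```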
 
  • #16
mathmari said:
We have the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$. When we take the second derivative with respect to $t$ on the left side we get the following using the chain rule:
\begin{align*}\frac{d}{dt}f(t\mathbf{x})&=\frac{d}{dt}f(t(x_1, x_2, \ldots , x_n))=\frac{d}{dt}f(tx_1, tx_2, \ldots , tx_n)=\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot \frac{d(tx_i)}{dt}\\ & =\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot x_i\end{align*}
\begin{align*}\frac{d^2}{dt^2}f(t\mathbf{x})&=\frac{d}{dt}\left (\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )=\frac{d}{dt}\left (\sum_{i=1}^nf_{x_i}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )\\ & =\sum_{i=1}^nx_i\cdot \frac{d}{dt}f_{x_i}(t(x_1, x_2, \ldots , x_n)) = \sum_{i=1}^nx_i\cdot \sum_{j=1}^n\frac{\partial{f_{x_i}}}{\partial{x_j}}\cdot \frac{d(tx_j)}{dt}\\ & =\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial{f_{x_i}}}{\partial{x_j}}=\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial^2{f}}{\partial{x_i}\partial{x_j}}\end{align*}
Is everything correct? Could we improve something? (Wondering)

All derivatives should be partial derivatives.
That is, we assume that the $x_i$ do not depend on $t$, which is what a partial derivative means.
Otherwise we would have for instance:
$$\frac{d(tx_i)}{dt} = x_i + t\,\frac{dx_i}{dt}$$

Everything else is correct. (Happy)
 
  • #17
I like Serena said:
All derivatives should be partial derivatives.
That is, we assume that the $x_i$ do not depend on $t$, which is what a partial derivative means.
Otherwise we would have for instance:
$$\frac{d(tx_i)}{dt} = x_i + t\,\frac{dx_i}{dt}$$

Everything else is correct. (Happy)

You mean the following:
\begin{align*}\frac{d}{dt}f(t\mathbf{x})&=\frac{d}{dt}f(t(x_1, x_2, \ldots , x_n))=\frac{d}{dt}f(tx_1, tx_2, \ldots , tx_n)=\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot \frac{\partial{(tx_i)}}{\partial{t}}\\ & =\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot x_i\end{align*}
\begin{align*}\frac{d^2}{dt^2}f(t\mathbf{x})&=\frac{d}{dt}\left (\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )=\frac{d}{dt}\left (\sum_{i=1}^nf_{x_i}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )\\ & =\sum_{i=1}^nx_i\cdot \frac{d}{dt}f_{x_i}(t(x_1, x_2, \ldots , x_n)) = \sum_{i=1}^nx_i\cdot \sum_{j=1}^n\frac{\partial{f_{x_i}}}{\partial{x_j}}\cdot \frac{\partial{(tx_j)}}{\partial{t}}\\ & =\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial{f_{x_i}}}{\partial{x_j}}=\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial^2{f}}{\partial{x_i}\partial{x_j}}\end{align*} or not? (Wondering)
 
  • #18
mathmari said:
You mean the following:
\begin{align*}\frac{d}{dt}f(t\mathbf{x})&=\frac{d}{dt}f(t(x_1, x_2, \ldots , x_n))=\frac{d}{dt}f(tx_1, tx_2, \ldots , tx_n)=\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot \frac{\partial{(tx_i)}}{\partial{t}}\\ & =\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot x_i\end{align*}
\begin{align*}\frac{d^2}{dt^2}f(t\mathbf{x})&=\frac{d}{dt}\left (\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )=\frac{d}{dt}\left (\sum_{i=1}^nf_{x_i}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )\\ & =\sum_{i=1}^nx_i\cdot \frac{d}{dt}f_{x_i}(t(x_1, x_2, \ldots , x_n)) = \sum_{i=1}^nx_i\cdot \sum_{j=1}^n\frac{\partial{f_{x_i}}}{\partial{x_j}}\cdot \frac{\partial{(tx_j)}}{\partial{t}}\\ & =\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial{f_{x_i}}}{\partial{x_j}}=\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial^2{f}}{\partial{x_i}\partial{x_j}}\end{align*} or not? (Wondering)

I meant everywhere. (Nerd)

Consider that the application of the chain rule for a regular derivative is:
$$\frac{d}{dt} f(tx,ty) = \frac{\partial f}{\partial x}\cdot\frac{d(tx)}{dt} + \frac{\partial f}{\partial y}\cdot\frac{d(ty)}{dt} = \frac{\partial f}{\partial x}\cdot\left(x+t\frac{dx}{dt}\right) + \frac{\partial f}{\partial y}\cdot\left(y+t\frac{dy}{dt}\right)$$
While the application of the chain rule for a partial derivative is:
$$\frac{\partial}{\partial t} f(tx,ty) = \frac{\partial f}{\partial x}\cdot\frac{\partial(tx)}{\partial t} + \frac{\partial f}{\partial y}\cdot\frac{\partial(ty)}{\partial t} = \frac{\partial f}{\partial x}\cdot x + \frac{\partial f}{\partial y}\cdot y$$

We want the latter, don't we? (Wondering)
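A tiny sympy sketch of this distinction (hypothetical one-variable example): if $x$ is allowed to depend on $t$, the total derivative of $tx$ picks up the extra term, while treating $x$ as independent of $t$ gives just $x$.

```python
# Total vs. partial derivative of t*x with respect to t (sympy sketch).
import sympy as sp

t = sp.symbols('t')

x_of_t = sp.Function('x')(t)        # x depends on t
print(sp.diff(t * x_of_t, t))       # x(t) + t*Derivative(x(t), t)

x = sp.symbols('x')                 # x treated as independent of t
print(sp.diff(t * x, t))            # x
```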
 
  • #19
I like Serena said:
I meant everywhere. (Nerd)

Consider that the application of the chain rule for a regular derivative is:
$$\frac{d}{dt} f(tx,ty) = \frac{\partial f}{\partial x}\cdot\frac{d(tx)}{dt} + \frac{\partial f}{\partial y}\cdot\frac{d(ty)}{dt} = \frac{\partial f}{\partial x}\cdot\left(x+t\frac{dx}{dt}\right) + \frac{\partial f}{\partial y}\cdot\left(y+t\frac{dy}{dt}\right)$$
While the application of the chain rule for a partial derivative is:
$$\frac{\partial}{\partial t} f(tx,ty) = \frac{\partial f}{\partial x}\cdot\frac{\partial(tx)}{\partial t} + \frac{\partial f}{\partial y}\cdot\frac{\partial(ty)}{\partial t} = \frac{\partial f}{\partial x}\cdot x + \frac{\partial f}{\partial y}\cdot y$$

We want the latter, don't we? (Wondering)

Ah ok! Thank you very much! (Smile)
 

FAQ: Deriving Extrema of Homogeneous Functions: A Chain Rule Comparison

What does it mean to "show that f is of that form"?

When you are asked to show that f is of a certain form, you need to demonstrate that the given function f can be written as a specific mathematical expression or equation. This typically involves manipulating the function algebraically until it matches the desired form.

How do I know what form f should be in?

The specific form that f should be in will depend on the context of the problem. Often, the form will be given to you in the instructions or you will need to use your knowledge of mathematical concepts to determine the appropriate form.

Can I use any method to show that f is of that form?

There are typically multiple methods that can be used to show that f is of that form. However, it is important to follow the instructions and use the method that is specified or most appropriate for the given problem.

What are some common mathematical forms that f can be written in?

Some common mathematical forms that f may need to be written in include polynomial form, exponential form, logarithmic form, trigonometric form, and power form. These forms may also have variations, such as standard form or vertex form for polynomials.

Are there any tips for showing that f is of that form?

Yes, a few tips can help when trying to show that f is of that form. First, make sure to carefully read the instructions and understand the desired form. Also, try to simplify the function as much as possible before attempting to transform it into the desired form. Additionally, practice and familiarity with mathematical concepts and techniques can make the process easier.
