Fermat's theorem (stationary points) in higher dimensions

In summary: the question is how to extend the one-dimensional proof of Fermat's theorem on stationary points to functions of several variables. Reducing to the one-dimensional case by restricting $f$ to a line is not too hard, but one must remember that the derivative $Df(a)$ is a row vector (a covector), which is where covariance and contravariance enter.
  • #2
ianchenmu said:
Look at this page, in particular the Proof section:

Fermat's theorem (stationary points) - Wikipedia, the free encyclopedia

How can Proof 2 be changed into a proof for higher dimensions, or can you give a proof of Fermat's theorem in higher dimensions?

In the case of higher dimensions you can substitute the derivative $\displaystyle f'(x)= \frac{d f(x)}{d x}$ with the gradient, defined as... $\displaystyle \nabla f(x_{1},\ x_{2},\ ...,\ x_{n}) = \frac{ \partial f}{\partial x_{1}}\ \overrightarrow {e}_{1} + \frac{ \partial f}{\partial x_{2}}\ \overrightarrow {e}_{2} + ... + \frac{ \partial f}{\partial x_{n}}\ \overrightarrow {e}_{n}$ (1)
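For instance (a small worked example added for illustration, not in the original post): for $\displaystyle f(x,y) = 4 - x^{2} - y^{2}$ the definition (1) gives

$\displaystyle \nabla f(x,y) = -2x\ \overrightarrow {e}_{1} - 2y\ \overrightarrow {e}_{2}$

which vanishes exactly at $(0,0)$, the point where $f$ attains its maximum.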

Kind regards

$\chi$ $\sigma$
 
  • #3
chisigma said:
In the case of higher dimensions you can substitute the derivative $\displaystyle f'(x)= \frac{d f(x)}{d x}$ with the gradient, defined as... $\displaystyle \nabla f(x_{1},\ x_{2},\ ...,\ x_{n}) = \frac{ \partial f}{\partial x_{1}}\ \overrightarrow {e}_{1} + \frac{ \partial f}{\partial x_{2}}\ \overrightarrow {e}_{2} + ... + \frac{ \partial f}{\partial x_{n}}\ \overrightarrow {e}_{n}$ (1)

Kind regards

$\chi$ $\sigma$

Thanks. But what about $f:\mathbb{R}^n\rightarrow \mathbb{R}$?
 
  • #4
chisigma said:
In the case of higher dimensions you can substitute the derivative $\displaystyle f'(x)= \frac{d f(x)}{d x}$ with the gradient, defined as... $\displaystyle \nabla f(x_{1},\ x_{2},\ ...,\ x_{n}) = \frac{ \partial f}{\partial x_{1}}\ \overrightarrow {e}_{1} + \frac{ \partial f}{\partial x_{2}}\ \overrightarrow {e}_{2} + ... + \frac{ \partial f}{\partial x_{n}}\ \overrightarrow {e}_{n}$ (1)

Kind regards

$\chi$ $\sigma$

But what then? What do $\frac{ \partial f}{\partial x_{1}},\frac{ \partial f}{\partial x_{2}},\dots,\frac{ \partial f}{\partial x_{n}}$ equal?

(I mean, is it that $\frac{ \partial f}{\partial x_{1}}(a)=0,\ \frac{ \partial f}{\partial x_{2}}(a)=0,\ \dots,\ \frac{ \partial f}{\partial x_{n}}(a)=0$, where $a$ is a local maximum? If so, why?)
 
  • #5
ianchenmu said:
But what then? What do $\frac{ \partial f}{\partial x_{1}},\frac{ \partial f}{\partial x_{2}},\dots,\frac{ \partial f}{\partial x_{n}}$ equal?

If you write $\displaystyle \overrightarrow {x}$ for a generic vector of dimension $n$ and $\displaystyle \overrightarrow {0}$ for the null vector of dimension $n$, then $\displaystyle \overrightarrow {x}_{0}$ is a relative maximum or minimum only if...

$\displaystyle \nabla f (\overrightarrow {x}_{0}) = \overrightarrow {0}$ (1)
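Note that condition (1) is necessary but not sufficient; a standard counterexample (added here for clarity) is the saddle

$\displaystyle f(x,y) = x^{2} - y^{2}, \qquad \nabla f(0,0) = \overrightarrow {0},$

yet $(0,0)$ is neither a maximum nor a minimum, since $f(x,0) > 0$ and $f(0,y) < 0$ for all small nonzero $x$ and $y$.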

Kind regards

$\chi$ $\sigma$
 
  • #6
chisigma said:
If you write $\displaystyle \overrightarrow {x}$ for a generic vector of dimension $n$ and $\displaystyle \overrightarrow {0}$ for the null vector of dimension $n$, then $\displaystyle \overrightarrow {x}_{0}$ is a relative maximum or minimum only if...

$\displaystyle \nabla f (\overrightarrow {x}_{0}) = \overrightarrow {0}$ (1)

Kind regards

$\chi$ $\sigma$
But this is what I need to prove. To clarify, I need to prove this:
Let $E\subset \mathbb{R}^n$ and $f:E\rightarrow\mathbb{R}$ be a continuous function. Prove that if $a$ is a local maximum point for $f$, then either $f$ is differentiable at $x=a$ with $Df(a)=0$ or $f$ is not differentiable at $a$.
 
  • #7
ianchenmu said:
But this is what I need to prove. To clarify, I need to prove this:
Let $E\subset \mathbb{R}^n$ and $f:E\rightarrow\mathbb{R}$ be a continuous function. Prove that if $a$ is a local maximum point for $f$, then either $f$ is differentiable at $x=a$ with $Df(a)=0$ or $f$ is not differentiable at $a$.
When I saw this problem, I thought it would be easy to tackle by reducing it to the one-dimensional case. In fact, let $b\in\mathbb{R}^n$. If $f$ has a local maximum at $a$, then the function $g:\mathbb{R}\to\mathbb{R}$ defined by $g(t) = f(a+tb)$ must have a local maximum at $t=0$. What we need to do here is to choose the vector $b$ suitably. Then, provided that $f$ is differentiable at $a$, we can use the fact that $g'(0) = 0$ to deduce that $Df(a) = 0.$
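To spell out the chain-rule step (a supplementary derivation, not in the original post): if $f$ is differentiable at $a$, then

$\displaystyle g'(0) = \lim_{t\to 0}\frac{f(a+tb)-f(a)}{t} = Df(a)\, b = \sum_{i=1}^{n}\frac{\partial f}{\partial x_{i}}(a)\, b_{i},$

and the one-dimensional Fermat theorem applied to $g$ gives $g'(0) = 0$ for the chosen direction $b$.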

But that turns out to be a bit tricky. The reason is that if $a\in\mathbb{R}^n$ then the derivative $Df(a)$ belongs to the dual space $(\mathbb{R}^n)^*$. In other words, if you think of $a$ as a column vector, then $Df(a)$ will be a row vector. So suppose we take $b=(Df(a))^{\text{T}}$, the transpose of $Df(a)$. According to the higher-dimensional chain rule, $g'(0) = Df(a)\, b = Df(a)\, (Df(a))^{\text{T}} = \bigl\|Df(a)\bigr\|^2.$ But since $t=0$ is a local maximum for $g$, it follows that $g'(0) = 0$, hence $\|Df(a)\|^2 = 0$ and therefore $Df(a) = 0.$
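As a quick numerical sanity check (my own sketch, not part of the thread; the function $f$ and the point $a$ below are arbitrary illustrative choices), one can verify the identity $g'(0)=\|Df(a)\|^2$ with finite differences:

```python
import numpy as np

# Check g'(0) = |Df(a)|^2 for g(t) = f(a + t*b) with b = (Df(a))^T.
# f and a are arbitrary illustrative choices, not from the thread.

def f(x):
    return np.sin(x[0]) * np.exp(-x[1] ** 2) + 0.5 * x[0] * x[1]

def grad(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

a = np.array([0.3, -0.7])
b = grad(f, a)                        # b = (Df(a))^T as a column vector

g = lambda t: f(a + t * b)            # restriction of f to a line through a
h = 1e-6
g_prime_0 = (g(h) - g(-h)) / (2 * h)  # finite-difference value of g'(0)

# The two printed values agree up to finite-difference error.
print(g_prime_0, np.dot(b, b))
```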

If you really want to get to grips with duality, and the reasons for distinguishing between row vectors and column vectors, then you will have to come to terms with covariance and contravariance.
 

FAQ: Fermat's theorem (stationary points) in higher dimensions

What is Fermat's theorem of stationary points in higher dimensions?

Fermat's theorem of stationary points in higher dimensions states that if a function of several variables is differentiable at an interior point where it attains a local minimum or maximum, then all of its partial derivatives vanish there; that is, the point is a stationary point. (Saddle points are also stationary points, so the condition is necessary but not sufficient for an extremum.) This theorem is an extension of the classic Fermat's theorem for functions of one variable.
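For reference, one common formal statement (wording added here, consistent with standard texts):

$\displaystyle \text{If } U\subseteq\mathbb{R}^{n} \text{ is open, } f:U\to\mathbb{R} \text{ is differentiable at } a\in U, \text{ and } f \text{ has a local extremum at } a, \text{ then } \nabla f(a)=\overrightarrow {0}.$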

How is Fermat's theorem of stationary points used in real-life applications?

Fermat's theorem of stationary points is used in various fields, such as physics, engineering, and economics, to optimize functions with multiple variables. It helps to find the minimum or maximum values of a function, which can be applied to real-world problems, such as finding the most efficient route for a delivery truck or the best design for a bridge.

What is the difference between a local and global extremum in Fermat's theorem of stationary points?

A local extremum is a point where the function has the highest or lowest value in a small neighborhood, while a global extremum is the highest or lowest value of the entire function. In other words, a local extremum is a point within a limited range, while a global extremum is the overall optimal solution.
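A concrete one-variable illustration (an example added here, not in the original FAQ): for

$\displaystyle f(x) = x^{3} - 3x, \qquad f'(x) = 3x^{2} - 3 = 0 \iff x = \pm 1,$

the point $x=-1$ is a local maximum and $x=1$ a local minimum, but neither is global, since $f(x)\to+\infty$ as $x\to+\infty$ and $f(x)\to-\infty$ as $x\to-\infty$.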

Can Fermat's theorem of stationary points be applied to functions with an infinite number of variables?

In its standard form, Fermat's theorem of stationary points is stated for functions of finitely many variables. The idea does, however, generalize to infinite-dimensional settings: in the calculus of variations, a functional with a local extremum has vanishing directional (Gâteaux) derivative in every admissible direction, which is what leads to the Euler-Lagrange equations. Tools such as the Hessian matrix also have infinite-dimensional analogues (the second variation), though the analysis is more delicate.

What is the relationship between Fermat's theorem of stationary points and the gradient descent algorithm?

The gradient descent algorithm is a numerical method for finding a (local) minimum of a function. It is closely tied to Fermat's theorem of stationary points: the theorem says that an interior local extremum of a differentiable function must have a vanishing gradient, and gradient descent searches for such a point by repeatedly stepping in the direction of the negative gradient until the gradient is approximately zero. In other words, the theorem characterizes the condition that the algorithm's limit points satisfy; both apply to functions of one variable as well as several.
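A minimal gradient-descent sketch (my own illustration; the quadratic objective, starting point, and step size are arbitrary choices):

```python
import numpy as np

# Minimize f(x) = (x - c)^T A (x - c) for a positive-definite A.
# Gradient descent steps against the gradient until it is nearly zero,
# i.e. until the iterate approximately satisfies Fermat's condition.

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])          # symmetric positive definite
c = np.array([1.0, -2.0])           # the unique minimizer of f

def grad_f(x):
    return 2.0 * A @ (x - c)        # gradient of the quadratic

x = np.zeros(2)                     # starting point
lr = 0.1                            # step size (learning rate)
for _ in range(200):
    g = grad_f(x)
    if np.linalg.norm(g) < 1e-10:   # stationary: Fermat's condition met
        break
    x = x - lr * g

print(x)                            # converges to approximately [1., -2.]
```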
