Stationary points classification using definiteness of the Lagrangian

In summary: the poster is using the method of Lagrange multipliers to find the extrema of a function subject to a constraint. Having identified several stationary points, they now want to classify them as maxima, minima, or saddle points, but are confused by the additional step required in constrained optimization and ask when checking the definiteness of the Hessian is enough. The responder explains that in this case the negative eigenvalue is irrelevant and the critical point is a minimum, because the eigenvector corresponding to the negative eigenvalue is parallel to the constraint gradient; only directions orthogonal to the gradient determine whether a constrained critical point is a minimum or maximum.
  • #1
fatpotato
Homework Statement
Find the max/min/saddle points of ##f(x,y) = x^4 - y^4## subject to the constraint ##g(x,y) = x^2+2y^2 -1 =0##
Use Lagrange multipliers method
Classify the stationary points (max/min/saddle) using the definiteness of the Hessian
Relevant Equations
Positive/Negative definite matrix
Hello,

I am using the Lagrange multipliers method to find the extrema of ##f(x,y)## subject to the constraint ##g(x,y)=0##, an ellipse.

So far, I have successfully identified several triplets ##(x^∗,y^∗,λ^∗)## such that each triplet is a stationary point for the Lagrangian: ##\nabla \mathscr{L} (x^∗,y^∗,λ^∗) = 0##
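These triplets can be recovered symbolically. Below is a small sketch (not part of the original post) that solves ##\nabla \mathscr{L} = 0## with SymPy, assuming the ellipse constraint ##x^2 + 2y^2 - 1 = 0## and the sign convention ##\mathscr{L} = f - \lambda g## that matches the multiplier ##\lambda^\ast = -\tfrac12## found below:

```python
import sympy as sp

x, y, lam = sp.symbols("x y lam", real=True)
f = x**4 - y**4
g = x**2 + 2*y**2 - 1            # the ellipse constraint
L = f - lam * g                  # convention matching lambda* = -1/2 below

# Stationary points of the Lagrangian: solve grad L = 0 in (x, y, lam)
sols = sp.solve([sp.diff(L, v) for v in (x, y, lam)], [x, y, lam], dict=True)
for s in sols:
    print(s)
```

This yields four triplets: ##(\pm 1, 0, 2)## and ##(0, \pm\tfrac{\sqrt{2}}{2}, -\tfrac12)##, the latter being the first triplet discussed below.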

Now, I want to classify my triplets as max/min/saddle points, using the positive/negative definiteness of the Hessian like I have been doing for unconstrained optimization, so I compute what I think is the Hessian of the Lagrangian:

$$H_{\mathscr{L}}(x,y,λ)= \begin{pmatrix} 12x^2 - 2\lambda & 0 \\ 0 & -12y^2 - 4\lambda \end{pmatrix}$$

Evaluating the Hessian for my first triplet ##(0,\pm \frac{\sqrt{2}}{2},−\frac{1}{2})## gives me:

$$H_{\mathscr{L}}(0,\pm \frac{\sqrt{2}}{2},−\frac{1}{2}) = \begin{pmatrix} 1 & 0 \\ 0 & - 4\end{pmatrix}$$

This matrix is diagonal, meaning that we immediately read its eigenvalues on the diagonal: ##\lambda_1 = 1 > 0## and ##\lambda_2 = -4 < 0##. A positive/negative definite matrix has only positive/negative eigenvalues, thus I conclude that this matrix is neither, due to its eigenvalues' opposite signs.
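As a quick numerical check of this reading (again, not in the original post), NumPy returns the same two eigenvalues for this diagonal matrix:

```python
import numpy as np

# Hessian of the Lagrangian evaluated at (0, ±sqrt(2)/2, -1/2)
H = np.array([[1.0, 0.0],
              [0.0, -4.0]])

# For a diagonal matrix the eigenvalues are just the diagonal entries;
# eigvalsh returns them in ascending order
eigvals = np.linalg.eigvalsh(H)
print(eigvals)   # [-4.  1.] -> mixed signs, so the matrix is indefinite
```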

When I was studying unconstrained optimization, I learned that an indefinite Hessian means a saddle point, so I would like to conclude that the points ##(0,\pm \frac{\sqrt{2}}{2})## are both saddle points of ##f##. However, the solution to this problem affirms that these points are minima, using the following argument:

[Attached image: Lagrange_Mult_Sol.PNG — excerpt from the provided solution]

Using the fact that ##\nabla g(0,\pm \frac{\sqrt{2}}{2}) = (0,\pm 2\sqrt{2})## and that ##w^T \nabla g = 0## if and only if ##w = (\alpha, 0)##, ##\alpha \in \mathbb{R}^{\ast}##

I thought that it was enough to check for the definiteness of the Hessian, and now I am really confused...

Here are my questions:
  1. When is it enough to check the definiteness of the Hessian to classify stationary points?
  2. Why is there this additional step in constrained optimization?
  3. What am I missing?
Thank you for your time.

 
  • #2
You are constrained to look at the behaviour of [itex]f[/itex] restricted to the one-dimensional ellipse [itex]g(x,y) = x^2 + 2y^2 - 1 = 0[/itex]. If [itex]f[/itex] increases or decreases as you move off this curve, that does not concern you.

To remain on the curve, your direction of travel must be orthogonal to [itex]\nabla g[/itex]. In this case, the eigenvector corresponding to the negative eigenvalue is parallel to [itex]\nabla g[/itex] and the eigenvector corresponding to the positive eigenvalue is orthogonal to it, so as far as you are concerned the negative eigenvalue is irrelevant: this critical point is a minimum.

In general, [itex]\nabla g[/itex] will not be an eigenvector of the Hessian. The text therefore defines a vector [itex]\mathbf{w}[/itex] orthogonal to [itex]\nabla g[/itex] and looks at [tex]
f(\mathbf{x} + \alpha\mathbf{w}) \approx f(\mathbf{x}) + \tfrac12\alpha^2 \mathbf{w}^T H
\mathbf{w}[/tex] (the first-order term vanishes because [itex]\nabla f = \lambda \nabla g[/itex] at the critical point and [itex]\mathbf{w}^T \nabla g = 0[/itex]) to determine whether a critical point of [itex]f[/itex] subject to this constraint is a minimum ([itex]\mathbf{w}^T H\mathbf{w} > 0[/itex]) or a maximum ([itex]\mathbf{w}^T H\mathbf{w} < 0[/itex]).
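This projected test can be sketched numerically for the thread's example (an editorial illustration, assuming the ellipse constraint and the Hessian of the Lagrangian computed earlier):

```python
import numpy as np

# Critical point (0, sqrt(2)/2) with multiplier lambda = -1/2
x0, y0 = 0.0, np.sqrt(2) / 2

# Hessian of the Lagrangian at this point (diagonal, as in the thread)
H = np.array([[1.0, 0.0],
              [0.0, -4.0]])

# Gradient of the constraint g = x^2 + 2y^2 - 1
grad_g = np.array([2 * x0, 4 * y0])

# Feasible direction w: orthogonal to grad_g (rotate by 90 degrees)
w = np.array([-grad_g[1], grad_g[0]])
w /= np.linalg.norm(w)

# Second-order behaviour of f along the constraint
curvature = w @ H @ w
print(curvature)   # positive, so the constrained critical point is a minimum
```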
 
  • #3
Thank you for your answer. There is a new point I do not understand:
pasmith said:
In this case, the eigenvector corresponding to the negative eigenvalue is parallel to [itex]\nabla g[/itex] and the eigenvector corresponding to the positive eigenvalue is orthogonal to it, so as far as you are concerned the negative eigenvalue is irrelevant: this critical point is a minimum.
How do you deduce that the eigenvector corresponding to the negative eigenvalue is parallel to the gradient? Same question for the eigenvector corresponding to the positive eigenvalue?
 
  • #4
fatpotato said:
Thank you for your answer. There is a new point I do not understand:

How do you deduce that the eigenvector corresponding to the negative eigenvalue is parallel to the gradient? Same question for the eigenvector corresponding to the positive eigenvalue?

By inspection. In this case the Hessian is diagonal, so we immediately see that the eigenvectors are (1,0) with eigenvalue [itex]H_{11} = 1[/itex] and (0,1) with eigenvalue [itex]H_{22} = -4[/itex]. [itex]\nabla g[/itex] is easily computed to be [itex](2x, 4y)[/itex], which at the critical point is [itex](0, \pm 2\sqrt{2})[/itex]. The direction orthogonal to it is therefore (1,0).
 
  • #5
Yes of course, we look for the kernel of ##H - \lambda_i I##: the vector ##(1,0)^T## is mapped to ##(0,0)^T## by ##H - \lambda_1 I## with ##\lambda_1 = 1##, so it is the associated eigenvector.

Sorry, I knew the concept, but could not deduce it myself. Now it is perfectly clear, thank you!
 

FAQ: Stationary points classification using definiteness of the Lagrangian

What is the Lagrangian and how is it used in stationary points classification?

In constrained optimization, the Lagrangian combines the objective function and the constraint, e.g. ##\mathscr{L}(x,y,\lambda) = f(x,y) - \lambda g(x,y)##. Stationary points of the Lagrangian are candidate constrained extrema of ##f##, and their nature (maximum, minimum, or saddle point) is analyzed through the definiteness of the Hessian of the Lagrangian, restricted to directions tangent to the constraint.

How is the definiteness of the Hessian matrix related to the nature of a stationary point?

The definiteness of the Hessian matrix is determined by the signs of its eigenvalues. If all eigenvalues are positive, the stationary point is a minimum; if all are negative, a maximum; if there are both positive and negative eigenvalues, a saddle point. Under an equality constraint, however, this test is applied only to directions tangent to the constraint (i.e., orthogonal to ##\nabla g##), so an indefinite Hessian can still correspond to a constrained minimum or maximum.
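These unconstrained rules can be captured in a few lines (an editorial sketch; `classify` is a hypothetical helper, not from the thread):

```python
import numpy as np

def classify(H, tol=1e-12):
    """Classify a symmetric Hessian by the signs of its eigenvalues."""
    ev = np.linalg.eigvalsh(H)
    if np.all(ev > tol):
        return "minimum"
    if np.all(ev < -tol):
        return "maximum"
    if np.any(ev > tol) and np.any(ev < -tol):
        return "saddle"
    return "inconclusive"   # a (near-)zero eigenvalue: the test fails

print(classify(np.diag([1.0, -4.0])))   # saddle (in the unconstrained sense)
```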

Can the definiteness of the Hessian matrix change at a stationary point?

At a given stationary point the Hessian is a fixed matrix, so its definiteness there is fixed. Different stationary points of the same function can, however, have Hessians of different definiteness, which is why each stationary point is classified separately.

Are there any other methods for classifying stationary points?

Yes. For functions of one variable there are the first- and second-derivative tests, and for equality-constrained problems the bordered Hessian test is a standard alternative to restricting the Hessian to the tangent space of the constraint.

Can the Lagrangian be used for systems with multiple variables?

Yes, the Lagrangian can be used for systems with multiple variables. In this case, the Hessian matrix will be a matrix of second-order partial derivatives and its definiteness will be determined by the eigenvalues of this matrix.
