Stationary points classification using definiteness of the Lagrangian

fatpotato · May 7, 2022

Hello,

I am using the method of Lagrange multipliers to find the extrema of ##f(x,y)## subject to the constraint ##g(x,y) = 0##, an ellipse.

So far, I have successfully identified several triplets ##(x^*, y^*, \lambda^*)## such that each triplet is a stationary point of the Lagrangian: ##\nabla \mathscr{L}(x^*, y^*, \lambda^*) = 0##
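A quick numerical check of that stationarity condition could look like the sketch below. Neither ##f## nor ##g## is written out in this thread, so the code assumes ##f(x,y) = x^4 - y^4##, ##g(x,y) = x^2 + 2y^2 - 1## and ##\mathscr{L} = f - \lambda g##. These are inferred from the Hessian entries quoted in the post and from the word "ellipse", and the function name is made up for illustration.

```python
import math


def lagrangian_gradient(x, y, lam):
    """Gradient of L = f - lam * g with respect to (x, y, lam).

    ASSUMPTION (not stated in the thread): f = x^4 - y^4 and
    g = x^2 + 2*y^2 - 1, inferred from the quoted Hessian.
    """
    dL_dx = 4 * x**3 - lam * 2 * x
    dL_dy = -4 * y**3 - lam * 4 * y
    dL_dlam = -(x**2 + 2 * y**2 - 1)
    return dL_dx, dL_dy, dL_dlam


# Evaluate the gradient at the two triplets (0, +-sqrt(2)/2, -1/2).
gradients = [
    lagrangian_gradient(0.0, sign * math.sqrt(2) / 2, -0.5)
    for sign in (+1, -1)
]
for grad in gradients:
    print(grad)  # every component should vanish up to rounding error
```

Under these assumptions all three components come out at rounding-error level, which is consistent with the triplets being stationary points of the Lagrangian.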

Now, I want to classify my triplets as max/min/saddle points, using the positive/negative definiteness of the Hessian like I have been doing for unconstrained optimization, so I compute what I think is the Hessian of the Lagrangian:

$$H_{\mathscr{L}}(x,y,\lambda)= \begin{pmatrix} 12x^2 - 2\lambda & 0 \\ 0 & -12y^2 - 4\lambda \end{pmatrix}$$

Evaluating the Hessian for my first triplet ##(0,\pm \frac{\sqrt{2}}{2},-\frac{1}{2})## gives me:

$$H_{\mathscr{L}}(0,\pm \frac{\sqrt{2}}{2},-\frac{1}{2}) = \begin{pmatrix} 1 & 0 \\ 0 & -4 \end{pmatrix}$$

This matrix is diagonal, meaning that we immediately read its eigenvalues on the diagonal: ##\lambda_1 = 1 > 0## and ##\lambda_2 = -4 < 0##. A positive/negative definite matrix has only positive/negative eigenvalues, thus I conclude that this matrix is neither, due to its eigenvalues' opposite signs.
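The same evaluation can be reproduced numerically. The sketch below uses only the Hessian formula quoted in the post. The helper names are hypothetical, and the closed-form eigenvalue formula is the standard one for a symmetric 2x2 matrix.

```python
import math


def lagrangian_hessian(x, y, lam):
    """Hessian as quoted in the post: diag(12x^2 - 2*lam, -12y^2 - 4*lam)."""
    return [[12 * x**2 - 2 * lam, 0.0],
            [0.0, -12 * y**2 - 4 * lam]]


def symmetric_2x2_eigenvalues(H):
    """Eigenvalues (smallest first) of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = H[0][0], H[0][1], H[1][1]
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius


H = lagrangian_hessian(0.0, math.sqrt(2) / 2, -0.5)
low, high = symmetric_2x2_eigenvalues(H)
print(H)          # approximately [[1, 0], [0, -4]]
print(low, high)  # approximately -4 and 1: opposite signs, so H is indefinite
```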

When I was studying unconstrained optimization, I learned that in this case we have a saddle point, so I would like to think that the points ##(0,\pm \frac{\sqrt{2}}{2})## are both saddle points of my function ##f##. However, the solution to this problem states that these points are minima, using the following argument:

Using the fact that ##\nabla g(x,y) = (0,\pm 2\sqrt{2})## at these points and that ##w^T \nabla g(x,y) = 0## if and only if ##w = (\alpha, 0), \alpha \in \mathbb{R}^{\ast}##, the solution checks the sign of ##w^T H_{\mathscr{L}} w## for these ##w## only.

I thought that it was enough to check for the definiteness of the Hessian, and now I am really confused...

Here are my questions:

When is it enough to check the definiteness of the Hessian to classify stationary points?
Why is there this additional step in constrained optimization?
What am I missing?

Thank you for your time.

Edit: PF destroyed my LaTeX formatting.

pasmith · May 25, 2022

You are constrained to look at the behaviour of [itex]f[/itex] restricted to the one-dimensional ellipse [itex]g(x,y) = x^2 + 2y^2 - 1 = 0[/itex]. If [itex]f[/itex] increases or decreases as you move off this curve, then that does not concern you.

To remain on the curve, your direction of travel must be orthogonal to [itex]\nabla g[/itex]. In this case, the eigenvector corresponding to the negative eigenvalue is parallel to [itex]\nabla g[/itex] and the eigenvector corresponding to the positive eigenvalue is orthogonal to it, so as far as you are concerned the negative eigenvalue is irrelevant: this critical point is a minimum.

In general, [itex]\nabla g[/itex] will not be an eigenvector of the Hessian [itex]H[/itex] of the Lagrangian. The text therefore defines a vector [itex]\mathbf{w}[/itex] orthogonal to [itex]\nabla g[/itex] and looks at [tex]\mathscr{L}(\mathbf{x} + \alpha\mathbf{w}, \lambda^*) \approx \mathscr{L}(\mathbf{x}, \lambda^*) + \tfrac12\alpha^2 \mathbf{w}^T H \mathbf{w}[/tex] to determine whether a critical point of [itex]f[/itex] subject to this constraint is a minimum ([itex]\mathbf{w}^T H\mathbf{w} > 0[/itex]) or a maximum ([itex]\mathbf{w}^T H\mathbf{w} < 0[/itex]).
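For a single constraint in the plane, this test can be sketched in a few lines. The function names below are hypothetical, and the code assumes that [itex]H[/itex] is the 2x2 Hessian of the Lagrangian in [itex](x,y)[/itex] and that [itex]\nabla g \neq 0[/itex] at the critical point. Only the direction of [itex]\nabla g[/itex] matters, so any nonzero multiple works.

```python
def quadratic_form(H, w):
    """Return w^T H w for a 2x2 matrix H and a 2-vector w."""
    return sum(w[i] * H[i][j] * w[j] for i in range(2) for j in range(2))


def classify_on_curve(H, grad_g, tol=1e-12):
    """Second-order test along a curve g(x, y) = 0 in the plane.

    H      : 2x2 Hessian of the Lagrangian in (x, y) at the critical point
    grad_g : gradient of g at the same point (assumed nonzero)
    """
    gx, gy = grad_g
    if gx == 0 and gy == 0:
        raise ValueError("grad_g must be nonzero")
    w = (-gy, gx)  # grad_g rotated by 90 degrees, so w is orthogonal to grad_g
    q = quadratic_form(H, w)
    if q > tol:
        return "minimum"
    if q < -tol:
        return "maximum"
    return "inconclusive"


H = [[1.0, 0.0], [0.0, -4.0]]            # Hessian quoted in the thread
print(classify_on_curve(H, (0.0, 1.0)))  # grad_g is a multiple of (0, 1) here
# prints "minimum"
```

With more variables or more constraints, the single vector [itex]\mathbf{w}[/itex] would be replaced by a basis of the subspace orthogonal to all constraint gradients, and the sign test by the definiteness of [itex]H[/itex] restricted to that subspace.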

fatpotato · Jun 8, 2022

Thank you for your answer. There is a new point I do not understand:

pasmith said:

In this case, the eigenvector corresponding to the negative eigenvalue is parallel to [itex]\nabla g[/itex] and the eigenvector corresponding to the positive eigenvalue is orthogonal to it, so as far as you are concerned the negative eigenvalue is irrelevant: this critical point is a minimum.

How do you deduce that the eigenvector corresponding to the negative eigenvalue is parallel to the gradient? Same question for the eigenvector corresponding to the positive eigenvalue?

pasmith · Jun 8, 2022

fatpotato said:

Thank you for your answer. There is a new point I do not understand:

How do you deduce that the eigenvector corresponding to the negative eigenvalue is parallel to the gradient? Same question for the eigenvector corresponding to the positive eigenvalue?

By inspection. In this case the Hessian is diagonal, so we immediately see that the eigenvectors are (1,0) with eigenvalue [itex]H_{11} = 1[/itex] and (0,1) with eigenvalue [itex]H_{22} = -4[/itex]. [itex]\nabla g[/itex] is easily computed to be [itex](2x,4y)[/itex] and at the critical point this is [itex](0, \pm 2\sqrt{2})[/itex]. The direction orthogonal to it is therefore (1,0).
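If you prefer to check this mechanically, a small sketch along these lines confirms the inspection. The helper names are made up for illustration, and only the direction of [itex]\nabla g[/itex] at the critical point, a multiple of (0,1), is used.

```python
def mat_vec(H, v):
    """Product of a 2x2 matrix H with a 2-vector v."""
    return (H[0][0] * v[0] + H[0][1] * v[1],
            H[1][0] * v[0] + H[1][1] * v[1])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def cross(u, v):
    """2D cross product; it is zero exactly when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]


H = [[1.0, 0.0], [0.0, -4.0]]   # Hessian at the critical point
grad_g_direction = (0.0, 1.0)   # direction of grad g at the critical point
eigenpairs = [(1.0, (1.0, 0.0)), (-4.0, (0.0, 1.0))]

report = []
for lam, v in eigenpairs:
    report.append({
        "eigenvalue": lam,
        "is_eigenvector": mat_vec(H, v) == (lam * v[0], lam * v[1]),
        "orthogonal_to_grad_g": dot(v, grad_g_direction) == 0,
        "parallel_to_grad_g": cross(v, grad_g_direction) == 0,
    })

for row in report:
    print(row)
```

The row for eigenvalue 1 reports orthogonality to [itex]\nabla g[/itex], and the row for eigenvalue -4 reports that its eigenvector is parallel to it.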

fatpotato · Jun 9, 2022

Yes of course, we look for the kernel of ##H - \lambda_i I##: for ##\lambda_1 = 1##, the vector ##(1,0)^T## is mapped to ##(0,0)^T## by ##H - \lambda_1 I##, so it is the associated eigenvector.
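Written out for both eigenvalues of the Hessian above, ##H = \operatorname{diag}(1, -4)##:

$$(H - \lambda_1 I)\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & -5 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \qquad (H - \lambda_2 I)\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 5 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

with ##\lambda_1 = 1## and ##\lambda_2 = -4##.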

Sorry, I knew the concept, but could not deduce it myself. Now it is perfectly clear, thank you!
