Bivariate Function Optimization

In summary, the conversation discusses finding the critical points of a function of two variables and classifying them using the Hessian matrix. The theorem mentioned states that if the Hessian is positive semidefinite for all (x,y), then any stationary point is a global minimizer. However, the example given does not satisfy this condition, and it has a stationary point, (0,0), that is not a global minimum. To find the global minimum, a simple lower bound is used to show that the minimum value -1 is attained at (1,1) and, by symmetry, at (-1,-1).
  • #1
Ocifer

Homework Statement


In general, I've been given a few functions of two variables, x and y. I have been asked to find all critical points by setting the gradient of the function equal to 0.

Further we are asked to classify these critical points using some given rules regarding the Hessian matrix.

Homework Equations



The Attempt at a Solution



For example, I'm given

[itex] f(x,y) = x^4 + y^4 − 4xy + 1 [/itex]

By solving the gradient equation, I got the critical points (0,0), (1,1), and (-1,-1).
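For concreteness, the gradient system works out to

[tex] \nabla f(x,y) = \left(4x^3 - 4y,\; 4y^3 - 4x\right) = (0,0) \;\Longrightarrow\; y = x^3,\; x = y^3 \;\Longrightarrow\; x^9 = x, [/tex]

so x ∈ {0, 1, -1} with y = x, giving the three critical points listed above.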

The Hessian matrix evaluated at (1,1) and at (-1,-1) is the same, and it is positive definite. From this I conclude that both (1,1) and (-1,-1) are local minima, with f(1,1) = f(-1,-1) = -1.
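For reference, the Hessian works out to

[tex] H(x,y) = \begin{pmatrix} 12x^2 & -4 \\ -4 & 12y^2 \end{pmatrix}, \qquad H(1,1) = H(-1,-1) = \begin{pmatrix} 12 & -4 \\ -4 & 12 \end{pmatrix}, [/tex]

whose leading principal minors, 12 and 144 - 16 = 128, are both positive, confirming positive definiteness at those two points.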

How can I show that this is also a global minimum?
 
  • #2
As a follow-up:

I've found a result in my notes stating that if the Hessian matrix is positive semidefinite for all (x,y) then any stationary point is a global minimizer. It's easy to show that the Hessian for this particular function is positive semidefinite for all (x,y) (I used principal minors technique).

From this I wish to conclude that (-1,-1) and (1,1) are also global minima, with function value -1 at these points.

My confusion is now about the point (0,0), which I also found to be a stationary point. By the theorem I used above, it should follow that (0,0) is also a global min, but f(0,0) = 1, which is greater than -1.

How can this be?

Here is the strict wording of the theorem I'm using:

"H(z) is positive semidefinite for all z ∈ S ⇒ [x is a global minimizer of f in S if and only if x is a stationary point of f ]"

Can someone explain this to me? (0,0) satisfies the gradient equations and so is a stationary point, but it's clearly not a global minimum? Am I misinterpreting the theorem?
 
  • #3
Ocifer said:
I've found a result in my notes stating that if the Hessian matrix is positive semidefinite for all (x,y) then any stationary point is a global minimizer.
The Hessian of your function is not positive semidefinite for all (x,y), is it? According to my calculations, it only has that property where |xy| ≥ 1/3.
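For the record, the relevant determinant condition is

[tex] \det H(x,y) = (12x^2)(12y^2) - (-4)^2 = 144\,x^2 y^2 - 16 \geq 0 \;\Longleftrightarrow\; |xy| \geq \tfrac{1}{3}, [/tex]

and at the origin det H(0,0) = -16 < 0, so the Hessian is indefinite there.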
 
  • #4
Ocifer said:
I've found a result in my notes stating that if the Hessian matrix is positive semidefinite for all (x,y) then any stationary point is a global minimizer. It's easy to show that the Hessian for this particular function is positive semidefinite for all (x,y) (I used principal minors technique). [...]

My confusion is now about the point (0,0), which I also found to be a stationary point. By the theorem I used above, it should follow that (0,0) is also a global min, but f(0,0) = 1, which is greater than -1. [...]

Here is the strict wording of the theorem I'm using:

"H(z) is positive semidefinite for all z ∈ S ⇒ [x is a global minimizer of f in S if and only if x is a stationary point of f ]"

Am I misinterpreting the theorem?

The theorem is not true if S has a boundary, because then the minimum can occur on the boundary and not be a stationary point. Example: minimize f(x) = x subject to x ≥ 0. The solution is at x = 0, where f'(0) = 1 > 0.

Also: the Hessian in your example is NOT positive semidefinite everywhere. The Hessian, H, has the form H = 4A, where
[tex] A = \begin{pmatrix} 3x^2 & -1 \\ -1 & 3y^2 \end{pmatrix} [/tex]
Here is a little theorem that you may find useful: in order that a symmetric n×n matrix A be positive semidefinite, it must be the case that: (1) all diagonal elements a_{ii} ≥ 0; and (2) if a_{ii} = 0 for some i, then a_{ij} = a_{ji} = 0 for all j (and that fixed i); that is, if a diagonal element is zero, all elements in the corresponding row and column must also be zero. (This is very easy to prove; just assume it does not hold, and show how to get a quadratic form x^T A x that can be > 0 for some x and < 0 for other x.)
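To spell out that proof sketch: if a_{ii} = 0 but a_{ij} ≠ 0 for some j, restrict the quadratic form to the i-th and j-th coordinates (set x_i = s, x_j = t, all other components 0):

[tex] x^T A x = 2 a_{ij}\, s t + a_{jj}\, t^2, [/tex]

which, for fixed t ≠ 0, is an affine function of s with nonzero slope 2 a_{ij} t, so it takes both positive and negative values; hence A cannot be positive semidefinite.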

So, along either the x-axis or the y-axis we have x = 0 or y = 0, so A is not psd (because it has a nonzero off-diagonal element in a row whose diagonal element is zero). In fact, in order for A to be psd we need
[tex] 9 x^2 y^2 - 1 \geq 0 \Longrightarrow |x|\,|y| \geq 1/3.[/tex] Therefore, A is psd in the four regions {x > 0, y > 0, xy ≥ 1/3}, {x < 0, y < 0, xy ≥ 1/3}, {x > 0, y < 0, xy ≤ -1/3} and {x < 0, y > 0, xy ≤ -1/3}. Outside these four regions, A is indefinite.

To check if you have the global minima, just look at a simple lower bound b(x,y) on f(x,y), namely: for all (x,y) we have
[tex] f(x,y) \geq x^4 + y^4 + 1 - 4|x|\,|y| \equiv b(x,y),[/tex]
so a global min of f is bounded from below by a global min of b. By symmetry, the four global minima of b are just reflections of the minimum of b in the first quadrant, where b = f. So we can just look at f in the first quadrant; its stationary points there are (0,0) and (1,1). On the part of the first quadrant where xy ≥ 1/3, the Hessian is psd, so f is convex there and its only stationary point in that set, (1,1), minimizes f on it with value -1. To show that (1,1) is the global min of f over the whole first quadrant, it remains to minimize f over the "indefinite" region R1 = {x ≥ 0, y ≥ 0, xy ≤ 1/3}; since this has inequality constraints you must use the so-called Karush-Kuhn-Tucker conditions. (These are like Lagrange multiplier conditions, but with modifications to handle inequalities.) I took the lazy way out and just used a package to solve the problem. The solution is x = y = 1/sqrt(3), giving f = -1/9 > -1. Note that this point is on the boundary xy = 1/3 of the region and has ∇f = (-8√3/9, -8√3/9), which is nonzero.
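For anyone who wants to reproduce that constrained solve, here is a minimal sketch (my own illustration; the choice of solver and starting point are assumptions, not necessarily what was used above) with scipy.optimize.minimize and an SLSQP inequality constraint:

[code]
# Minimal sketch: minimize f(x, y) = x^4 + y^4 - 4xy + 1 over
# R1 = {x >= 0, y >= 0, xy <= 1/3} using SciPy's SLSQP solver.
import numpy as np
from scipy.optimize import minimize

def f(v):
    x, y = v
    return x**4 + y**4 - 4 * x * y + 1

# SLSQP treats 'ineq' constraints as g(v) >= 0, so xy <= 1/3 becomes 1/3 - xy >= 0.
constraints = [{"type": "ineq", "fun": lambda v: 1.0 / 3.0 - v[0] * v[1]}]
bounds = [(0.0, None), (0.0, None)]  # x >= 0, y >= 0

res = minimize(f, x0=np.array([0.2, 0.2]), method="SLSQP",
               bounds=bounds, constraints=constraints)

print(res.x, res.fun)
# Expected (approximately): x = y = 1/sqrt(3) ~ 0.577 and f ~ -1/9 ~ -0.111,
# on the boundary xy = 1/3, matching the KKT solution quoted above.
[/code]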

It follows that the global minimum of f over the first quadrant is -1, attained at (1,1). Since b = f there and b takes the same values in every quadrant, b ≥ -1 everywhere; hence f ≥ b ≥ -1 everywhere, so (1,1) is a global minimizer of f (and, by symmetry, so is (-1,-1)).

RGV
 
  • #5
Thanks for your replies. I had made an arithmetic error when taking the 2nd derivatives, so my Hessian was off. Thank you for pointing out my error. It turns out the prof had poorly specified what he wanted with the question. He only cared about local results, but thank you for your feedback.
 

FAQ: Bivariate Function Optimization

1. What is bivariate function optimization?

Bivariate function optimization is the process of finding the values of two independent variables that give the maximum or minimum value of a function of those variables, varying both variables simultaneously.

2. Why is bivariate function optimization important?

Bivariate function optimization is important because it allows us to find the best combination of two variables that will result in the most favorable outcome for a given function. This can be applied in various fields such as economics, engineering, and data analysis.

3. What are the methods used in bivariate function optimization?

Commonly used methods in bivariate function optimization include gradient descent, Newton's method, and simulated annealing. These methods iteratively adjust the values of the variables until they converge on an optimal solution, as sketched below.
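As a small illustration (my own sketch, not part of the original discussion), here is plain gradient descent with a fixed step size applied to the thread's example f(x,y) = x^4 + y^4 - 4xy + 1:

[code]
# Minimal gradient-descent sketch on f(x, y) = x^4 + y^4 - 4xy + 1.
# Which of the two minima, (1, 1) or (-1, -1), it reaches depends on the start.
import numpy as np

def grad_f(v):
    x, y = v
    return np.array([4 * x**3 - 4 * y, 4 * y**3 - 4 * x])

v = np.array([0.5, 0.25])   # starting point (chosen arbitrarily)
step = 0.01                 # fixed step size, small enough to stay stable here

for _ in range(2000):
    v = v - step * grad_f(v)

print(v)  # converges to approximately (1, 1) from this starting point
[/code]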

4. What are the challenges in bivariate function optimization?

One of the main challenges in bivariate function optimization is finding the global optimal solution instead of getting stuck in a local optimum. This can be addressed by using multiple starting points and various optimization algorithms.

5. How is bivariate function optimization different from univariate function optimization?

Bivariate function optimization differs from univariate function optimization in that it involves optimizing two variables simultaneously instead of just one. This adds another dimension to the problem and often requires more complex methods to find the optimal solution.
