Linear Algebra - Minimize the Norm

In summary, the conversation discusses finding the vector in a two-dimensional subspace of R4 that minimizes the distance to a given vector. The solution involves finding an orthonormal basis for the subspace and calculating the orthogonal projection of the given vector onto it. The final solution is (3/2, 3/2, 11/5, 22/5), with a minimal distance of sqrt(1.3).
  • #1
steelphantom

Homework Statement


In R4, let U = span((1, 1, 0, 0), (1, 1, 1, 2)). Find u in U such that ||u - (1, 2, 3, 4)|| is as small as possible.

Homework Equations



The Attempt at a Solution


I came up with a vector u = (-.5, -.5, 0, 0) + (2, 2, 2, 4) = (1.5, 1.5, 2, 4). Then u - (1, 2, 3, 4) = (0.5, -0.5, -1, 0). Using the norm induced by the standard dot product, I get ||(0.5, -0.5, -1, 0)|| = sqrt(.25 + .25 + 1) = sqrt(1.5).

This was probably a bad approach, but I can't think of any other real way to do this. Is this the right answer at least? If not, how could I approach this in a better manner other than just randomly picking vectors? Thanks.
 
  • #2
Try playing around with orthogonal projections onto U.
 
  • #3
Duh... :rolleyes: Thanks. So now I want to find a unit vector which spans U, but how do I do that? Once I find the unit vector, I can find the orthogonal projection P(u), and then the minimal distance will be ||u - P(u)||.
 
  • #4
Since U is two dimensional you will not find a single vector spanning U. If you are talking about "the" normal vector to U, then I have to tell you that the orthogonal complement of U is two-dimensional as well, so there will be no unique normal vector (even after normalization.)

The rest is right, ||u-Pu|| will be the distance you are looking for.
 
  • #5
Ok, I did what you said, and ended up with P(u) = (3/2)(1, 1, 0, 0) + 2(1, 1, 1, 2), which was what I originally had, so I guess I was right. Thanks for your help!
 
  • #6
I was just writing up my work today, and I found that I must have messed up somewhere. Here's how I went about the problem:

P(u) = <u, e1>e1 + <u, e2>e2
||e1|| = sqrt(1^2 + 1^2) = sqrt(2) => e1 = (1/sqrt(2))*(1, 1, 0, 0).
||e2|| = sqrt(1 + 1 + 1 + 4) = sqrt(7) => e2 = (1/sqrt(7))*(1, 1, 1, 2).
P(u) = (1/2)[(1*1) + (1*2) + (0*3) + (0*4)]e1 + (1/7)[(1*1) + (1*2) + (1*3) + (1*4)]e2
= (3/2)*(1, 1, 0, 0) + 2*(1, 1, 1, 2) = (3.5, 3.5, 2, 4)

But when I calculate ||P(u) - (1, 2, 3, 4)|| I get sqrt(9.5), which is larger than my random guess of sqrt(1.5). Where did I go wrong?
 
  • #7
the coefficient of e2 in P(u) -- shouldn't it be 10/7 instead of 2 ..?

Oh no, I see you made a typo; it should read 2*4, so the coefficient is right.

As you may have noticed, (Pu-u) as you calculated is orthogonal to neither of the vectors spanning U, so you made a mistake in calculating Pu.

Why do you consider P(u) = <u, e1>e1 + <u, e2>e2 to be the right formula for the projection?
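
One quick way to test a candidate projection is to check whether the residual (candidate minus the given vector) is orthogonal to both vectors spanning U. The following is a minimal sketch in plain Python; the helper name `dot` and the variable names are assumptions made for illustration, not part of the original posts.

```python
def dot(x, y):
    """Standard dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))

b1 = (1, 1, 0, 0)            # first spanning vector of U
b2 = (1, 1, 1, 2)            # second spanning vector of U
v = (1, 2, 3, 4)             # the given vector
candidate = (3.5, 3.5, 2, 4) # the candidate from post #6

# Residual of the candidate; for a true orthogonal projection
# both dot products below would be zero.
residual = tuple(c - t for c, t in zip(candidate, v))

print(dot(residual, b1), dot(residual, b2))  # 4.0 3.0
```

Both values are nonzero, so this candidate cannot be the orthogonal projection onto U.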
 
  • #8
Pere Callahan said:
the coefficient of e2 in P(u) -- shouldn't it be 10/7 instead of 2 ..?

Oh no, I see you made a typo; it should read 2*4, so the coefficient is right.

As you may have noticed, (Pu-u) as you calculated is orthogonal to neither of the vectors spanning U, so you made a mistake in calculating Pu.

Why do you consider P(u) = <u, e1>e1 + <u, e2>e2 to be the right formula for the projection?

I have a formula that says Pv = <v, e1>e1 + ... + <v, em>em, where {e1, ..., em} is an orthonormal basis for U. Apparently {e1, e2} is not an orthonormal basis. Do I just need to find an orthonormal basis and then calculate P(u)?
 
  • #9
steelphantom said:
Do I just need to find an orthonormal basis and then calculate P(u)?

Sounds like a good idea. :smile:
 
  • #10
Ok, to be orthonormal, <v_i, v_j> = 0 for i != j. So I want <(a, a, 0, 0), (b, b, b, 2b)> = 0. This means ab + ab = 2ab = 0. But the only way this can happen is if a and/or b are zero. But if I solve <(a, a, 0, 0), (a, a, 0, 0)> = 1 I get a = 1/sqrt(2) and if I solve <(b, b, b, 2b), (b, b, b, 2b)> = 1 I get b = 1/sqrt(7).

This doesn't seem to make sense. What am I doing wrong? Thanks for your continued help.
 
  • #11
steelphantom said:
Ok, to be orthonormal, <v_i, v_j> = 0 for i != j. So I want <(a, a, 0, 0), (b, b, b, 2b)> = 0. This means ab + ab = 2ab = 0. But the only way this can happen is if a and/or b are zero. But if I solve <(a, a, 0, 0), (a, a, 0, 0)> = 1 I get a = 1/sqrt(2) and if I solve <(b, b, b, 2b), (b, b, b, 2b)> = 1 I get b = 1/sqrt(7).

This doesn't seem to make sense. What am I doing wrong? Thanks for your continued help.

What you were trying to do was taking v_1 to be a multiple of (1,1,0,0) and v_2 a multiple of (1,1,1,2). However, (1,1,0,0) and (1,1,1,2) are not orthogonal, so any multiples of them will not be orthogonal either. That's why the method didn't work.

Try taking v_1 to be equal to (1,1,0,0); that's fine. Then you need to find another vector in U which is orthogonal to this v_1. A general vector in U is a linear combination of (1,1,0,0) and (1,1,1,2), that is

[tex]
v_2 = a (1,1,0,0) + b (1,1,1,2)
[/tex]

Try to find a and b such that <v_1, v_2> = 0. Then you can normalize v_1 and v_2 and have an orthonormal basis.

In general you might want to look up Gram-Schmidt orthogonalization which is a method to find an orthogonal basis if you start from any given basis in a finite-dimensional Euclidean vector space. It's not necessary for this problem, though.
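
For reference, here is a minimal sketch of Gram-Schmidt orthogonalization applied to the two spanning vectors, using exact fractions from the Python standard library. The function name `gram_schmidt` and the helper `dot` are assumptions for illustration. The sketch returns an orthogonal basis without normalizing it; dividing each vector by its length would make it orthonormal.

```python
from fractions import Fraction

def dot(x, y):
    """Standard dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Return an orthogonal (not normalized) basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = [Fraction(c) for c in v]
        # Subtract the component of w along each basis vector found so far.
        for b in basis:
            coeff = dot(w, b) / dot(b, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        # Skip vectors that are linearly dependent on the earlier ones.
        if any(w):
            basis.append(w)
    return basis

u1, u2 = gram_schmidt([(1, 1, 0, 0), (1, 1, 1, 2)])
print([str(c) for c in u1])  # ['1', '1', '0', '0']
print([str(c) for c in u2])  # ['0', '0', '1', '2']
print(dot(u1, u2))           # 0
```

Any nonzero multiple of the second vector, such as (0, 0, -1, -2), is just as good a choice, since scaling does not affect orthogonality.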
 
  • #12
Ok, so I take u1 = (1, 1, 0, 0) and then take u2 = a(1, 1, 0, 0) + b(1, 1, 1, 2) = (a + b, a + b, b, 2b). Then <u1, u2> = a+b + a+b = 2(a+b). So the only way for this to be zero is if a = -b. So I take a = 1, b = -1 to get u2 = (0, 0, -1, -2). Then ||u1|| = sqrt(2) and ||u2|| = sqrt(5), so e1 = (1/sqrt(2))*u1, and e2 = (1/sqrt(5))*u2.

So now we have P(u) = 1/2[(1*1) + (1*2) + (0*3) + (0*4)](1, 1, 0, 0) + 1/5[(0*1) + (0*2) + (-1*3) + (-2*4)](0, 0, -1, -2) = (3/2)*(1, 1, 0, 0) - (11/5)*(0, 0, -1, -2) = (3/2, 3/2, 11/5, 22/5).

Then P(u) - u = (0.5, -0.5, -0.8, 0.4) and ||P(u) - u|| = sqrt((0.25) + (0.25) + (0.64) + (0.16)) = sqrt(1.3).

I think I've FINALLY got this problem solved. Please let me know if I did this right. Thanks a lot for your help, Pere! :smile:
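
The arithmetic in this post can be checked with exact fractions. The sketch below assumes the orthogonal basis u1 = (1, 1, 0, 0), u2 = (0, 0, -1, -2) from the post and uses the projection formula for an orthogonal basis, P(v) = sum of (<v, u_i> / <u_i, u_i>) u_i. The names `project` and `dot` are hypothetical helpers for illustration.

```python
from fractions import Fraction

def dot(x, y):
    """Standard dot product, computed exactly with fractions."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(x, y))

def project(v, orthogonal_basis):
    """Orthogonal projection of v onto the span of an orthogonal basis."""
    p = [Fraction(0)] * len(v)
    for b in orthogonal_basis:
        coeff = dot(v, b) / dot(b, b)
        p = [pi + coeff * bi for pi, bi in zip(p, b)]
    return p

v = (1, 2, 3, 4)
u1 = (1, 1, 0, 0)
u2 = (0, 0, -1, -2)

p = project(v, [u1, u2])
residual = [pi - vi for pi, vi in zip(p, v)]
dist_sq = dot(residual, residual)

print([str(c) for c in p])         # ['3/2', '3/2', '11/5', '22/5']
print([str(c) for c in residual])  # ['1/2', '-1/2', '-4/5', '2/5']
print(dist_sq)                     # 13/10
```

The squared distance is 13/10, so the minimal distance is sqrt(1.3), and the residual is orthogonal to both u1 and u2.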
 
  • #13
Seems right. So your first guess was pretty close :smile:
 

FAQ: Linear Algebra - Minimize the Norm

What is the purpose of minimizing the norm in linear algebra?

Minimizing the norm in linear algebra is a common technique used to find the minimum distance between a given vector and a set of vectors. It helps to identify the smallest possible error or difference between data points, which is useful in many applications such as regression analysis or data fitting.

How is the norm defined in linear algebra?

In linear algebra, the norm is a mathematical function that assigns a non-negative value to a vector, representing its length or magnitude. The most commonly used norm is the Euclidean norm, which calculates the square root of the sum of squared elements in the vector.
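
As a small illustration, the Euclidean norm can be computed directly from its definition. This is a sketch using only the Python standard library; the function name `euclidean_norm` is an assumption, and the sample vector is the difference (0.5, -0.5, -1, 0) from the first post of the thread.

```python
import math

def euclidean_norm(x):
    """Square root of the sum of squared components."""
    return math.sqrt(sum(c * c for c in x))

diff = (0.5, -0.5, -1.0, 0.0)
length = euclidean_norm(diff)  # equals sqrt(0.25 + 0.25 + 1) = sqrt(1.5)
print(length)
```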

Can minimizing the norm result in negative values?

No, minimizing the norm will never result in negative values. The norm is always non-negative, and minimizing it means finding the smallest possible non-negative value.

What is the relationship between minimizing the norm and solving linear systems of equations?

Minimizing a norm is how least squares handles linear systems that have no exact solution: for a system Ax = b, one looks for the x that minimizes ||Ax - b||. This is known as the least squares method and is often used in applications where there are more equations than unknowns, making the system overdetermined.
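
The problem from the thread can be read this way: take A to be the 4x2 matrix whose columns are (1, 1, 0, 0) and (1, 1, 1, 2), and b = (1, 2, 3, 4). The sketch below solves the normal equations (A^T A) c = A^T b for this 2x2 case with Cramer's rule and exact fractions. It is one possible approach rather than the method used in the thread, and all variable names are assumptions for illustration.

```python
from fractions import Fraction

def dot(x, y):
    """Standard dot product, computed exactly with fractions."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(x, y))

b1 = (1, 1, 0, 0)  # first column of A
b2 = (1, 1, 1, 2)  # second column of A
v = (1, 2, 3, 4)   # right-hand side

# Entries of A^T A (symmetric 2x2) and of A^T v.
g11, g12, g22 = dot(b1, b1), dot(b1, b2), dot(b2, b2)
r1, r2 = dot(b1, v), dot(b2, v)

# Solve the 2x2 normal equations with Cramer's rule.
det = g11 * g22 - g12 * g12
c1 = (r1 * g22 - g12 * r2) / det
c2 = (g11 * r2 - g12 * r1) / det

# Closest point in the span: c1 * b1 + c2 * b2.
u = [c1 * x + c2 * y for x, y in zip(b1, b2)]

print(c1, c2)               # -7/10 11/5
print([str(c) for c in u])  # ['3/2', '3/2', '11/5', '22/5']
```

The result agrees with the orthogonal projection of (1, 2, 3, 4) onto the span, which is what least squares computes geometrically.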

Can minimizing the norm be applied to matrices?

Yes, minimizing the norm can be applied to matrices as well. In this case, the matrix norm is used, which is a measure of the size of a matrix. Minimizing the matrix norm is useful in applications such as matrix approximation or data compression.
