Optimizing Bi-Linear Objective Function for Vector Fitting in Flat Space

  • #1
sunjin09
I have two matrices, A and D, with the same number of rows but different numbers of columns (A has many more columns than D). I want to find x and y such that ||Ax-Dy||_2 is minimized, i.e., I want to find the closest vectors in span{A} and span{D}. It seems like a simple problem, but I couldn't figure it out. Any suggestions? (The columns of A and D are linearly independent, so span{A} and span{D} intersect only in the zero vector.)
 
  • #2
First, let's try to state the problem clearly. Your statement about finding 'x' and 'y' isn't clear because it isn't clear whether "Ax" is supposed to represent a column vector or whether it represents the matrix "A" times a vector "x".

We could try it this way first:

I have two sets of n dimensional vectors A and D. Set A has greater cardinality than set D. The span of set A and the span of set D are vector spaces whose only intersection is the zero vector. How do I find vectors x and y such that x is in the span of A and y is in the span of D and the distance between x and y (i.e. [itex] || x- y||_2 [/itex]) is minimal?

The answer, of course, is to set both x and y equal to the zero vector. Assuming that's not what you want to do, how do we modify the statement of the problem to say what you want?
 
  • #3
Thank you for correcting the problem statement. Following your statement, what I want to minimize is the angle between x and y, i.e., maximize [itex]\frac{<x,y>}{||x||_2||y||_2}[/itex], and I don't want the trivial 0 solution. Where do I go from here?
 
  • #4
sunjin09 said:
maximize [itex]\frac{<x,y>}{||x||_2||y||_2}[/itex],

I don't know an easy way to do this. You may as well consider only unit vectors, so the problem becomes to maximize [itex] <\hat{x},\hat{y}> [/itex]. As far as I can see this problem falls under the heading of a "bilinear optimization problem" or, more generally, a "multilinear optimization problem".

My intuition is that if you have two vector subspaces that only intersect at the zero vector, then you should be able to find a set of vectors [itex] \{e_1,e_2,\dots,e_n, f_1,f_2,\dots,f_m\} [/itex] such that this set is a (non-orthogonal) basis for the parent n+m dimensional space, the [itex]e_i [/itex] are an orthonormal basis for the first subspace and the [itex] f_j [/itex] are an orthonormal basis for the second subspace.

If that intuition is correct then let [itex]\hat{x} = \sum_{i=1}^n \alpha_i e_i [/itex] and [itex] \hat{y} = \sum_{j=1}^m \beta_j f_j [/itex]. Let [itex] c_{i,j} = <e_i, f_j> [/itex].

The problem is to maximize the function [itex] \sum_{i=1}^n \sum_{j=1}^m c_{i,j} \alpha_i \beta_j [/itex] subject to the constraints [itex] \sum_{i=1}^n \alpha_i^2 = 1 [/itex] and [itex] \sum_{j=1}^m \beta_j^2 = 1 [/itex].

I wonder if there is a simpler formulation.
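
A minimal numerical sketch of this formulation, assuming NumPy and using a QR factorization to obtain the orthonormal bases. The function name `bilinear_objective` and the random test matrices are illustrative and do not come from the thread.

```python
import numpy as np

def bilinear_objective(A, D, alpha, beta):
    """Return (value, C) with value = sum_ij c_ij * alpha_i * beta_j.

    Assumes A and D each have linearly independent columns.
    """
    E, _ = np.linalg.qr(A)  # columns e_i: orthonormal basis of span(A)
    F, _ = np.linalg.qr(D)  # columns f_j: orthonormal basis of span(D)
    C = E.T @ F             # c_ij = <e_i, f_j>
    return float(alpha @ C @ beta), C

# Illustrative data (an assumption): 5-dimensional vectors, 3 columns in A, 2 in D.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
D = rng.standard_normal((5, 2))

# Unit coefficient vectors satisfy the two constraints.
alpha = rng.standard_normal(3)
alpha /= np.linalg.norm(alpha)
beta = rng.standard_normal(2)
beta /= np.linalg.norm(beta)

value, C = bilinear_objective(A, D, alpha, beta)
print(value)  # cosine of the angle between x_hat and y_hat
```

Because the bases are orthonormal, the returned value equals the inner product of the unit vectors [itex] \hat{x} [/itex] and [itex] \hat{y} [/itex], so it always lies between -1 and 1.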
 
  • #5
It seems that
[tex]
<x,y>=(Aa)'(Db)=a'A'Db=a'U'SVb=(Ua)'S(Vb)
[/tex]
where [itex] A'D=U'SV[/itex] is the SVD. Since both U and V are orthogonal, the minimum angle occurs at the largest singular value in S. Does that sound right?
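
A sketch of this idea in NumPy, under the assumption that the columns of A and D are first orthonormalized (here by QR). The function name `smallest_principal_angle` and the demo matrices are hypothetical. Note that `numpy.linalg.svd` returns factors with M = U diag(s) Vt, which differs from the U'SV convention written above.

```python
import numpy as np

def smallest_principal_angle(A, D):
    """Return (theta, x, y): the smallest angle between span(A) and span(D)
    and a pair of unit vectors x in span(A), y in span(D) attaining it."""
    Qa, _ = np.linalg.qr(A)  # orthonormal basis of span(A)
    Qd, _ = np.linalg.qr(D)  # orthonormal basis of span(D)
    # NumPy convention: Qa.T @ Qd = U @ diag(s) @ Vt, s sorted in descending order
    U, s, Vt = np.linalg.svd(Qa.T @ Qd)
    cos_theta = np.clip(s[0], 0.0, 1.0)  # guard against rounding slightly above 1
    x = Qa @ U[:, 0]   # unit vector in span(A)
    y = Qd @ Vt[0, :]  # unit vector in span(D)
    return np.arccos(cos_theta), x, y

# Illustrative check: the xy-plane versus the line through (1, 1, 1).
A_demo = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
D_demo = np.array([[1.0], [1.0], [1.0]])
theta, x, y = smallest_principal_angle(A_demo, D_demo)
print(np.degrees(theta), x, y)
```

The overall sign of x and y depends on the factorization, but the angle and the inner product of x and y do not.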
 
  • #7
  • #8
sunjin09 said:
Yes but never needed to implement the details. But my problem does not seem to formulate as an LP problem, does it? It's quadratic.

How is it quadratic?
 
  • #9
Number Nine said:
How is it quadratic?

The objective function is the dot product of two unknown vectors, x and y.
 
  • #10
sunjin09 said:
It seems that
[tex]
<x,y>=(Aa)'(Db)=a'A'Db=a'U'SVb=(Ua)'S(Vb)
[/tex]
where [itex] A'D=U'SV[/itex] is the SVD. Since both U and V are orthogonal, the minimum angle occurs at the largest singular value in S. Does that sound right?

What do you mean by "at" the largest singular value? Do you mean we set all the entries of vector [itex] a [/itex] equal to zero except for one of them?

If [itex] A = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} [/itex], [itex] D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} [/itex] then [itex] A'D = \begin{pmatrix} 1 & 1 \end{pmatrix} [/itex].

[itex] A'D [/itex] is equal to the same thing if [itex] A = \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} [/itex], [itex] D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} [/itex].
 
  • #11
Stephen Tashi said:
What do you mean by "at" the largest singular value? Do you mean we set all the entries of vector [itex] a [/itex] equal to zero except for one of them?

If [itex] A = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} [/itex], [itex] D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} [/itex] then [itex] A'D = \begin{pmatrix} 1 & 1 \end{pmatrix} [/itex].

[itex] A'D [/itex] is equal to the same thing if [itex] A = \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} [/itex], [itex] D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} [/itex].

Assuming A and D both have orthonormal columns, the coefficient vectors a and b are both unit vectors, so that x and y are also unit vectors. Maximizing <x,y> then corresponds to minimizing the angle between x and y. In order to maximize <x,y>=(Ua)'*S*(Vb), given that ||Ua||=1 and ||Vb||=1, I want to choose Ua and Vb to be 1 at the largest singular values and 0 elsewhere; I do not mean that a and b themselves are 1 at one place and 0 everywhere else. Is the logic correct?
 
  • #12
sunjin09 said:
I want to choose Ua and Vb to be 1 at the largest singular values and 0 elsewhere

Are you saying that vector 'a' will be chosen so that the vector Ua will be 1 at the jth component iff the largest singular value occurs in S at location S[j][j] and the vector Ua will be zero elsewhere?


In these two examples, do we have the same matrix for A'D but different answers for the minimum angle? (My 4-D intuition isn't good, so I'm not sure.)

Example 1: [itex] A = \begin{pmatrix} \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}}\\ \frac{2}{\sqrt{15}}\\ \frac{1}{\sqrt{15}} \end{pmatrix} ,\ D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} [/itex]

Example 2: [itex] A = \begin{pmatrix} \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}}\\ \frac{1}{\sqrt{6}}\\ \frac{1}{\sqrt{6}} \end{pmatrix} ,\ D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} [/itex]
 
  • #13
Stephen Tashi said:
Are you saying that vector 'a' will be chosen so that the vector Ua will be 1 at the jth component iff the largest singular value occurs in S at location S[j][j] and the vector Ua will be zero elsewhere?


In these two examples, do we have the same matrix for A'D but different answers for the minimum angle? (My 4-D intuition isn't good, so I'm not sure.)

Example 1: [itex] A = \begin{pmatrix} \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}}\\ \frac{2}{\sqrt{15}}\\ \frac{1}{\sqrt{15}} \end{pmatrix} ,\ D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} [/itex]

Example 2: [itex] A = \begin{pmatrix} \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}}\\ \frac{1}{\sqrt{6}}\\ \frac{1}{\sqrt{6}} \end{pmatrix} ,\ D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} [/itex]

When I solved these examples, A'*D was the same in both, so a and b are the same too: a=1 and b=[1/sqrt(2),1/sqrt(2)]. The vectors themselves differ, x1=A1≠x2=A2 (writing A1 and A2 for A in Examples 1 and 2). However, the angle is the same, since <x,y>=(Ua)'*S*(Vb) is totally determined by A'*D. Seemingly logical.
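
A quick numerical check of the two examples, as a sketch assuming NumPy. The helper name `best_pair` is hypothetical. Since A and D already have orthonormal columns here, the SVD is applied to A'D directly, and `numpy.linalg.svd` uses the convention M = U diag(s) Vt.

```python
import numpy as np

D = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
A1 = np.array([[1 / np.sqrt(3)], [1 / np.sqrt(3)], [2 / np.sqrt(15)], [1 / np.sqrt(15)]])
A2 = np.array([[1 / np.sqrt(3)], [1 / np.sqrt(3)], [1 / np.sqrt(6)], [1 / np.sqrt(6)]])

def best_pair(A, D):
    """Largest singular value of A'D and coefficient vectors a, b attaining it.

    Assumes the columns of A and of D are orthonormal.
    """
    U, s, Vt = np.linalg.svd(A.T @ D)
    return s[0], U[:, 0], Vt[0, :]

for name, A in (("Example 1", A1), ("Example 2", A2)):
    cos_theta, a, b = best_pair(A, D)
    x, y = A @ a, D @ b
    print(name, "A'D =", A.T @ D, "cos =", cos_theta, "<x,y> =", x @ y)
```

Both examples give the same cosine, sqrt(2/3) (an angle of about 35.26 degrees), even though the vectors x differ between the two examples. The signs of a and b may flip together depending on the SVD routine.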
 
  • #14
sunjin09 said:
When I solved these examples, A'*D was the same in both, so a and b are the same too: a=1 and b=[1/sqrt(2),1/sqrt(2)]. The vectors themselves differ, x1=A1≠x2=A2 (writing A1 and A2 for A in Examples 1 and 2). However, the angle is the same, since <x,y>=(Ua)'*S*(Vb) is totally determined by A'*D. Seemingly logical.

I see what you mean. [itex] A'D = \begin{pmatrix} \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \end{pmatrix} [/itex]

[itex] A'D = \begin{pmatrix} 1 \end{pmatrix} \begin{pmatrix}\frac{\sqrt{2}}{\sqrt{3}} & 0 \end{pmatrix} \begin{pmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} \end{pmatrix} [/itex]

You have shown that <x,y> is determined by A'D. This isn't a result I have seen before.

You haven't explained why the maximum possible <x,y> is equal to the largest singular value or why vectors a and b must exist that produce this value. (Are we maximizing <x,y> or maximizing the absolute value of <x,y>?)
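
One possible way to fill in this step, sketched under the assumption made earlier in the thread that A and D have orthonormal columns, so that a and b are unit vectors. The notation [itex] u = Ua [/itex], [itex] v = Vb [/itex] and [itex] \sigma_1 \ge \sigma_2 \ge \dots \ge 0 [/itex] for the diagonal entries of S is introduced here only for this sketch. Since U and V are orthogonal, u and v are unit vectors as well, and

[tex]
<x,y> = u'Sv = \sum_i \sigma_i u_i v_i \le \sigma_1 \sum_i |u_i|\,|v_i| \le \sigma_1 ||u||_2 ||v||_2 = \sigma_1
[/tex]

The last inequality is Cauchy-Schwarz. Equality holds when u and v both equal (1,0,...,0)', that is, for a = U'(1,0,...,0)' and b = V'(1,0,...,0)', which exist because U and V are invertible. Replacing b by -b flips the sign of <x,y>, so the maximum of <x,y> and the maximum of its absolute value are the same number.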
 
  • #15
sunjin09 said:
objective function is the dot product of two unknown vectors x and y.

All valid dot, inner, or scalar products (in this context) are bilinear. If you are trying to maximize only an inner product, especially in a flat space, you will always get a bilinear problem. I'm assuming the space is flat because you mentioned the dot product, which is usually associated with Cartesian space; if it's not, then please post your inner product definition.

Since you want to minimize ||Ax-Dy||_2, just minimize <Ax-Dy,Ax-Dy>, which is

<Ax-Dy,Ax-Dy> = <Ax,Ax> - 2<Ax,Dy> + <Dy,Dy>

Now if x and y are vectors, Ax is linear in the components of x and Dy is linear in the components of y, so the cross term <Ax,Dy> is bilinear in x and y, while <Ax,Ax> and <Dy,Dy> are quadratic in x and in y respectively.

Also, minimizing the square of the norm is equivalent to minimizing the norm itself, since the norm is always greater than or equal to zero and squaring is monotonically increasing on nonnegative values.
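
The expansion above can be checked numerically. This is only a sketch assuming NumPy; the function name `squared_residual_expanded` and the random test data are illustrative.

```python
import numpy as np

def squared_residual_expanded(A, D, x, y):
    """Compute <Ax,Ax> - 2<Ax,Dy> + <Dy,Dy> term by term."""
    Ax, Dy = A @ x, D @ y
    return Ax @ Ax - 2.0 * (Ax @ Dy) + Dy @ Dy

# Illustrative random data (an assumption, not from the thread).
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
D = rng.standard_normal((6, 2))
x = rng.standard_normal(3)
y = rng.standard_normal(2)

direct = np.linalg.norm(A @ x - D @ y) ** 2
expanded = squared_residual_expanded(A, D, x, y)
print(direct, expanded)  # the two values should agree up to rounding
```

Without a normalization constraint on x and y, this expression is minimized by x = 0, y = 0, which is the trivial solution mentioned earlier in the thread.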
 
