Can Least Squares be Applied to Matrices for Solving a Complex Equation?

In summary: the system s1*x1 + s2*x2 = I has more unknowns than equations, so by itself it admits infinitely many solutions; a unique answer requires an additional physical constraint, such as the symmetry discussed in the thread.
  • #1
Klandhee
Hi, as no unique analytical solution exists to my problem (as another poster pointed out), I have taken to solving it with a least-squares method. My equation is as follows:

(s1*x1 + s2*x2 - I).^2 = min.

Here s1 and s2 are shift matrices (both known), and I is a known matrix of size nxm. x1 and x2 are two unknown matrices, also of size nxm. The .^2 is my way (and MATLAB's) of expressing that every element of the matrix is squared (in the spirit of least squares). I could solve this iteratively, but the matrices are very large and it would take many iterations to converge, which makes that approach very computationally intensive. So I've resorted to trying to work this out as a least-squares problem. I understand least squares for a set of values, but I'm not sure how to extend it to matrices (if that's possible).

Again, if anyone has any guidance in this it would be greatly appreciated! Thanks very much!
 
  • #2
Could you provide a link to where a previous poster pointed out that your original problem is not solvable analytically?

Because your matrix I is nxm and x1 and x2 are nxm, s1 and s2 ought to be nxn. But the whole expression under the square is an nxm matrix, which cannot be squared in the usual matrix sense.
 
  • #3
Here is the link:

https://www.physicsforums.com/showthread.php?t=519406

I do not mean to square the matrix; I was using MATLAB's syntax. What I mean is, if
a = [1 2 3; 4 5 6], then a.^2 = [1 4 9; 16 25 36],

still preserving its size of 2x3. I put the square in to ensure that x1 and x2 do not come back filled with negative infinities.
 
  • #4
What you have is more unknowns than equations, so your problem does not have a unique solution: not just no analytic solution, but none by any method. Your two matrices x1 and x2 can be stacked together into one bigger matrix of size 2n x m. Then your equations look like:

[tex]
\left[\begin{array}{c|c}
S_{1} & S_{2}
\end{array}\right]_{n \times 2 n} \cdot \left[\begin{array}{c}
x_{1} \\
\hline
x_{2}
\end{array}\right]_{2n \times m} = \left[\begin{array}{c}
I
\end{array}\right]_{n \times m}
[/tex]
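
For what it's worth, here is a minimal MATLAB sketch of this block formulation; the sizes and shift matrices are placeholders (and I0 stands in for I, to avoid shadowing MATLAB's imaginary unit). Since the stacked system is underdetermined, pinv picks the minimum-norm solution out of the infinite family:

[code]
% Sketch: solve S1*X1 + S2*X2 = I0 in the least-squares / minimum-norm sense.
% Placeholder sizes and matrices; substitute the real S1, S2, I0.
n = 4; m = 3;
S1 = circshift(eye(n), 1);    % example shift matrix (rows shifted down)
S2 = circshift(eye(n), -1);   % example shift matrix (rows shifted up)
I0 = rand(n, m);              % known n-by-m data matrix

A  = [S1, S2];                % the n-by-2n block matrix [S1 | S2]
X  = pinv(A) * I0;            % minimum-norm least-squares solution (2n-by-m)
X1 = X(1:n, :);               % top block
X2 = X(n+1:end, :);           % bottom block

res = norm(S1*X1 + S2*X2 - I0, 'fro')   % Frobenius-norm residual
[/code]

Any particular solution plus anything in the null space of [S1 | S2] fits equally well, which is exactly the non-uniqueness described above.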
 
  • #5
Ah, I see. My linear algebra is very rusty and I'm really trying to wrap my head around this and solve my problem. I can think of one physical constraint: the sum of all the values in x1 plus the sum of all the values in x2 should equal the sum of the values in I. But that is not sufficient for a unique solution.

Hmm, I'll have to think about this more. I must not be writing down the mathematics of my system properly. The reason I know there must be a solution is that my problem is physical in nature. Think of multiple sources with a single detector, leading to overlapping data. I have to disentangle the data to process it... hmm!
 
  • #6
But, if I understood correctly, you are given a map A: V --> W between normed spaces; matrix least squares then gives you the _best_ solution x^ for Ax = b, in the sense that ||Ax^ - b|| <= ||Ax - b|| for all x in V. Specifically, x^ is chosen so that Ax^ is the orthogonal projection of b onto the subspace A(V) of W. As an example, take A: R^2 --> R^2 (both as vector spaces over R) with A(1,0) = (1,0) and A(0,1) = (0,0), extended linearly, and suppose you want to solve Av = (1,2). The vector (1,2) is not in the range of A (the range is a copy of R inside R^2, namely the x-axis), but least squares still gives you a vector v^ with ||Av^ - (1,2)|| <= ||Av - (1,2)|| for all v. Specifically, Av^ is the orthogonal projection of (1,2) onto the image {(c,0) : c in R}, namely (1,0).

The whole thing generalizes to statistical least squares, though there the problem is in a sense the reverse: instead of being given a subspace and finding the point in it closest to b, you are given a collection of points and want to find the subspace (a line, plane, etc.) that is closest to those points. (And then you also need to run further tests to see whether a linear relationship between the variables actually exists, e.g., by calculating r^2, or by doing inference on the regression under the null hypothesis r = 0, etc.)
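
A quick numerical check of that toy example in MATLAB, with pinv doing the least-squares work: A maps everything onto the x-axis, b = (1,2) lies outside the range, and the least-squares solution lands on the orthogonal projection (1,0):

[code]
A = [1 0; 0 0];       % A(1,0) = (1,0), A(0,1) = (0,0)
b = [1; 2];           % not in the range of A (the x-axis)
v = pinv(A) * b;      % least-squares solution: v = (1, 0)
p = A * v;            % orthogonal projection of b onto range(A): (1, 0)
res = norm(A*v - b)   % minimal residual, here 2
[/code]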
 
  • #7
So if I understand correctly, you think there does exist a matrix x^ which is the best solution. Unfortunately, I am still very unclear on how to construct this matrix by anything other than iterative (and too bulky) means.
 
  • #8
Sorry, Klandhee, I did not think it through carefully enough. Let me give it some more thought.
 
  • #9
OK, so here is the latest on this problem. I agree with the posters who said there are an infinite number of solutions in general, whereas the physical limits of the problem actually permit only one.

Say x1, x2, and I are each nxm matrices. Since x1 and x2 are both nxm, there are 2*n*m unknown values but only n*m equations. However, the physical limits require each matrix to be, excuse me if I use the wrong terminology, symmetric about x. That is to say, if the matrix is [a1,1 a2,1; a1,2 a2,2], then we have a1,1 = a1,2 and a2,1 = a2,2. This halves the number of unknowns, so we have n*m unknowns and n*m equations, making this potentially solvable, I believe. The question remains how to actually do this, which is what I am not sure about, so if anyone has any ideas (or sees an error in what I've said here) please let me know! Again, thanks for the assistance!
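
If that is the right reading, the reduction can be set up directly. Below is a MATLAB sketch under my interpretation of the constraint as mirror symmetry of the rows (row j equals row n+1-j, with n even); the duplication matrix M and the reduced unknowns Z are hypothetical helpers of mine, not anything established in the thread:

[code]
% Sketch: exploit mirror symmetry of the rows of X1 and X2.
% Assumes n is even; k = n/2 free rows per unknown matrix.
n = 4; m = 3; k = n/2;
S1 = circshift(eye(n), 1);       % placeholder shift matrices
S2 = circshift(eye(n), -1);
I0 = rand(n, m);                 % placeholder data

M  = [eye(k); flipud(eye(k))];   % n-by-k duplication: row j = row n+1-j
A  = [S1*M, S2*M];               % reduced n-by-n system for the free rows
Z  = pinv(A) * I0;               % unique if A is nonsingular
X1 = M * Z(1:k, :);              % rebuild the full symmetric matrices
X2 = M * Z(k+1:end, :);

res = norm(S1*X1 + S2*X2 - I0, 'fro')   % small iff the reduced system is consistent
[/code]

If A comes out singular, the symmetry alone still does not pin down a unique solution and a further physical constraint would be needed.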
 

Related to Can Least Squares be Applied to Matrices for Solving a Complex Equation?

1. What is the least squares method for matrices?

The least squares method for matrices is a mathematical technique for finding the best-fit line or plane for a set of data points. It works by minimizing the sum of the squared differences between the actual data points and the values predicted by the line or plane.

2. How is the least squares method used in scientific research?

The least squares method is commonly used in scientific research to analyze and interpret data. It can be applied in various fields such as physics, economics, engineering, and more. It allows researchers to identify patterns and relationships in their data and make predictions or draw conclusions based on the fitted line or plane.

3. What is the difference between ordinary least squares and weighted least squares?

Ordinary least squares (OLS) is a least squares method that treats all data points as equally important. Weighted least squares (WLS) is a modified version of OLS that assigns each data point a weight reflecting its reliability (typically the inverse of its variance). WLS is often used when there is heteroscedasticity in the data, meaning the variance of the errors is not constant.
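
For example, a minimal WLS fit in MATLAB (all numbers made up for illustration): scaling each row of the design matrix and the observations by the square root of its weight turns the problem into an ordinary least-squares solve:

[code]
X = [ones(5,1), (1:5)'];        % design matrix: intercept and slope
y = [1.1; 2.0; 2.9; 4.2; 5.1];  % observations (made-up data)
w = [1; 1; 1; 4; 4];            % larger weight = more trusted point

sw   = sqrt(w);
beta = (diag(sw) * X) \ (sw .* y)   % WLS estimate via row scaling
[/code]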

4. How do you calculate the least squares solution for a matrix?

The least squares solution for a matrix can be calculated using the normal equation: b = (X^T X)^-1 X^T y. Here X is the matrix of independent variables and y is the vector of dependent variables; the formula requires X^T X to be invertible.
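
As a small MATLAB illustration (made-up numbers), the normal equation and MATLAB's backslash operator return the same estimate; backslash solves via QR factorization, which is numerically better conditioned than forming X^T X:

[code]
X = [ones(4,1), (1:4)'];        % design matrix
y = [2.1; 3.9; 6.2; 7.8];       % observations (made-up data)

b_normal = (X' * X) \ (X' * y)  % normal-equation solution
b_qr     = X \ y                % same fit, computed via QR
[/code]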

5. What are the assumptions of the least squares method?

The least squares method assumes that the errors in the data are normally distributed, have a mean of zero, and are independent of one another. It also assumes that the independent variables are not perfectly correlated, i.e., that there is no perfect multicollinearity among them.
