Can you give me a least squares example?

In summary, the least squares method is a way to estimate the coefficients of a polynomial function by minimizing the sum of squared errors between the predicted values and the actual values. This can be done by setting up a matrix equation and solving it with various methods, such as Gauss elimination or matrix factorization techniques. The resulting solution is a best-fit estimate of the coefficients, not an exact solution.
  • #1
hkBattousai
Can you give me a "least squares" example?

Assume I have a function to estimate, like the one below:

[itex]f(x) = a_3x^3 + a_2x^2 + a_1x + a_0[/itex]

After several experiments I have obtained these (x, f(x)) pairs:
(x1, y1)
(x2, y2)
(x3, y3)
(x4, y4)
(x5, y5)
(x6, y6)

How can I estimate a0, a1, a2 and a3?

I searched on Google; there are lots of definitions of the theorem, but there are no examples. I will be glad if you guys spare your time to help me.
 
  • #2


Minimize the sum

[tex]\sum (y_i - f(x_i))^2[/tex]
 
  • #3


hkBattousai said:
Assume I have a function to estimate, like the one below:

[itex]f(x) = a_3x^3 + a_2x^2 + a_1x + a_0[/itex]

After several experiments I have obtained these (x, f(x)) pairs:
(x1, y1)
(x2, y2)
(x3, y3)
(x4, y4)
(x5, y5)
(x6, y6)

How can I estimate a0, a1, a2 and a3?

I searched on Google; there are lots of definitions of the theorem, but there are no examples. I will be glad if you guys spare your time to help me.


Can you please give me a matrix representation?

Experimental input vector:
X = [x1 x2 x3 x4 x5 x6]^T

Output vector of the experiment:
Y = [y1 y2 y3 y4 y5 y6]^T

Coefficients of the polynomial in f(x):
A = [a0 a1 a2 a3]^T
(Or, A = [a3 a2 a1 a0]^T, please specify which A matrix you choose.)


How can I find the vector A in terms of the X and Y experiment result vectors?
 
  • #4


To understand the idea there is no need for a matrix representation.

Let's say you want to do linear regression, y = ax + b. You have a set of pairs [itex](x_i, y_i)[/itex]. You look for a & b such that the sum

[tex]\sum (y_i - ax_i - b)^2[/tex]

has minimum value. Calculate the derivatives (d/da, d/db) of the sum, set them to zero, and solve for a & b - and you are done. This is high school math.
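
Carrying that out explicitly, setting both derivatives to zero gives two linear equations in a and b,

[tex]a\sum x_i^2 + b\sum x_i = \sum x_i y_i, \qquad a\sum x_i + nb = \sum y_i,[/tex]

whose solution is

[tex]a = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}, \qquad b = \bar{y} - a\bar{x},[/tex]

where [itex]\bar{x}[/itex] and [itex]\bar{y}[/itex] are the sample means.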

Your example - with a third degree polynomial - is not linear in x, so I don't think you can use a simple vector X for your purposes. But I could be wrong.
 
  • #5


^ Thank you for your answer.

I'm a grad student, and one of my courses covers this LMS topic. My textbook doesn't explain how the theorem is applied; it just gives the solution in an example. I need to learn how to implement this theorem by means of matrices. Internet sources give the formal definition of the theorem, but unfortunately there are no examples.

I will be happy if you could give me a starting point.
 
  • #6


Think of [itex]a_3x^3+ a_2x^2+ a_1x+ a_0[/itex] as the matrix product
[tex]\begin{bmatrix}x^3 & x^2 & x & 1 \end{bmatrix}\begin{bmatrix}a_3 \\ a_2 \\ a_1 \\ a_0 \end{bmatrix}[/tex]

Since you have 6 data points, you have that repeated 6 times - a matrix product with 6 rows:

[tex]\begin{bmatrix} x_1^3 & x_1^2 & x_1 & 1 \\ x_2^3 & x_2^2 & x_2 & 1 \\ x_3^3 & x_3^2 & x_3 & 1 \\ x_4^3 & x_4^2 & x_4 & 1 \\ x_5^3 & x_5^2 & x_5 & 1 \\ x_6^3 & x_6^2 & x_6 & 1\end{bmatrix}\begin{bmatrix}a_3 \\ a_2 \\ a_1 \\ a_0\end{bmatrix}= \begin{bmatrix}y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6\end{bmatrix}[/tex]

Writing that as Ax = y, where x, the vector of "a"s, is 4-dimensional and y lies in a six-dimensional space, Ax lies in a 4-dimensional subspace, and the equation has an exact solution only if y happens to be in that subspace. If it is not, then the "closest" we can get to y is the projection of y onto that subspace. In particular, that means that y - Ax must be orthogonal to that subspace: <Au, y - Ax> = 0 for all u in [itex]R^4[/itex]. Letting [itex]A^*[/itex] be the adjoint (transpose) of A, this gives [itex]<u, A^*(y - Ax)> = 0[/itex].

But now, since that inner product is in [itex]R^4[/itex] and u could be any vector in [itex]R^4[/itex], we must have [itex]A^*(y- Ax)= A^*y- A^*Ax= 0[/itex] or [itex]A^*Ax= A^*y[/itex]. If [itex]A^*A[/itex] has an inverse (which it typically does in problems like this), [itex]x= (A^*A)^{-1}A^*y[/itex] gives the coefficients for the "least squares" cubic approximation.
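
Here is a minimal numerical sketch of that last formula in Python/NumPy (the six data values are invented for illustration; only the shapes matter):

[code]
# Cubic least squares fit via the normal equations (A^T A) a = A^T y.
# The x and y values below are made up, just to have something to run.
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])   # six measured x values
y = np.array([-7.9, -1.1, 1.2, 0.8, 3.1, 14.2])  # six measured y values

# The 6x4 matrix of powers [x^3, x^2, x, 1], one row per data point.
A = np.column_stack([x**3, x**2, x, np.ones_like(x)])

# Solve (A^T A) a = A^T y for the coefficient vector a = [a3, a2, a1, a0]^T.
coeffs = np.linalg.solve(A.T @ A, A.T @ y)
print(coeffs)
[/code]

In practice, numpy.linalg.lstsq(A, y, rcond=None) computes the same coefficients more stably, without forming [itex]A^*A[/itex] explicitly.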
 
  • #7


Sorry for the late reply.

HallsofIvy said:
[tex]\begin{bmatrix} x_1^3 & x_1^2 & x_1 & 1 \\ x_2^3 & x_2^2 & x_2 & 1 \\ x_3^3 & x_3^2 & x_3 & 1 \\ x_4^3 & x_4^2 & x_4 & 1 \\ x_5^3 & x_5^2 & x_5 & 1 \\ x_6^3 & x_6^2 & x_6 & 1\end{bmatrix}\begin{bmatrix}a_3 \\ a_2 \\ a_1 \\ a_0\end{bmatrix}= \begin{bmatrix}y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6\end{bmatrix}[/tex]

This equation is XA = Y, isn't it?
Only the Y matrix is given. How can I form the X matrix here?

Thank you so much for your help.
 
  • #8


I'm thinking of the 6 by 4 matrix, made from the [itex]x_i[/itex], as "A", and the column matrix, made from the [itex]a_i[/itex], as "X".

You said, in your original post, that
After several experiments I have obtained these (x, f(x)) pairs:
(x1, y1)
(x2, y2)
(x3, y3)
(x4, y4)
(x5, y5)
(x6, y6)

so you have both [itex]x_i[/itex] and [itex]y_i[/itex]. If you were given only the y-values, with no corresponding x information, there would be no possible way to set up a formula.
 
  • #9


hkBattousai said:
How can I estimate a0, a1, a2 and a3?

I searched on Google; there are lots of definitions of the theorem, but there are no examples. I will be glad if you guys spare your time to help me.

Here is an explanation that might be useful.

http://www.personal.psu.edu/jhm/f90/lectures/lsq2.html

The final matrix equation is equivalent to the linear system [itex]A^TAx = A^Tb[/itex] (called the normal equations), which can be solved by Gauss elimination or via matrix factorization techniques (e.g. LU, Cholesky, QR, SVD).
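
For concreteness, here is a small sketch of two of those routes in Python/NumPy (the design matrix and data vector are random placeholders, not values from this thread):

[code]
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))   # placeholder 6x4 design matrix
b = rng.standard_normal(6)        # placeholder data vector

# Route 1: form the normal equations A^T A x = A^T b and solve them directly
# (np.linalg.solve uses Gauss elimination / LU; a Cholesky solve would also
# work, since A^T A is symmetric positive definite for full-rank A).
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Route 2: QR factorization, which avoids squaring the condition number.
Q, R = np.linalg.qr(A)            # A = QR with R upper triangular
x_qr = np.linalg.solve(R, Q.T @ b)

print(np.allclose(x_normal, x_qr))  # True, up to rounding
[/code]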
 
  • #10


I actually found the description on MathWorld rather good.
 
  • #11


I found the explanation of the method in a textbook, and I want to share it here. But since I'm not quite familiar with LaTeX, I will attach photos instead:
[attached photo of the textbook derivation: http://img704.imageshack.us/img704/6940/dscf4205.jpg]
I realized that this solution is the same as the one HallsofIvy offered; I wish I had understood what he meant earlier...

Q1) The equation is XA = Y, so why don't we just solve it by means of [itex]A = X^{-1}Y[/itex]? Why use [itex]A = (X^TX)^{-1}X^TY[/itex] instead?

Q2) Is this solution for A an "estimate" or the real A? It is obvious that the solution is just an estimate, but why? The solution of the equation in the picture (XA = Y) is straightforward; after which step do we say that the A vector is an estimate rather than the real A?

Q3) We use this method to estimate the test results as a polynomial. But do we have to use a polynomial only? I mean, can we estimate f(x) in terms of other kinds of functions? The picture below illustrates what I'm trying to ask:
[attached photo illustrating the question: http://img80.imageshack.us/img80/7320/dscf4204.jpg]
 
  • #12


hkBattousai said:
I found the explanation of the method in a textbook, and I want to share it here. But since I'm not quite familiar with LaTeX, I will attach photos instead:
[attached photo of the textbook derivation: http://img704.imageshack.us/img704/6940/dscf4205.jpg]
I realized that this solution is the same as the one HallsofIvy offered; I wish I had understood what he meant earlier...

Q1) The equation is XA = Y, so why don't we just solve it by means of [itex]A = X^{-1}Y[/itex]? Why use [itex]A = (X^TX)^{-1}X^TY[/itex] instead?
Because X, in general, doesn't have an inverse. Here you are trying to fit a cubic, with four coefficients, to six points, so you have a 6 by 4 matrix. That is not a square matrix and so does not have an inverse. You can always fit a line to two points, a quadratic to three points, and a cubic to four points exactly, because then you have the same number of coefficients as equations, and so have a square matrix that you can invert.
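
To see the shape problem concretely, here is a quick Python/NumPy sketch (the x values are invented; the point is only that the 6x4 matrix has no inverse while [itex]X^TX[/itex] does):

[code]
import numpy as np

x = np.linspace(0.0, 5.0, 6)                           # six sample points
X = np.column_stack([x**3, x**2, x, np.ones_like(x)])  # 6x4: not square

try:
    np.linalg.inv(X)      # a 6x4 matrix has no inverse
except np.linalg.LinAlgError as err:
    print("inv failed:", err)

XtX = X.T @ X             # 4x4, square and invertible for distinct x values
print(XtX.shape)          # (4, 4)
[/code]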

Q2) Is this solution for A an "estimate" or the real A? It is obvious that the solution is just an estimate, but why? The solution of the equation in the picture (XA = Y) is straightforward; after which step do we say that the A vector is an estimate rather than the real A?
What do you mean by a "real" A? In general there is NO cubic that actually passes through six given points. There is NO "real" A in that sense.

Q3) We use this method to estimate the test results as a polynomial. But do we have to use a polynomial only? I mean, can we estimate f(x) in terms of other kinds of functions? The picture below illustrates what I'm trying to ask:
[attached photo illustrating the question: http://img80.imageshack.us/img80/7320/dscf4204.jpg]
No, many other kinds of functions are commonly used; exponentials, sines, and cosines are typical choices. And yes, exactly the same formulas apply.
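
As a sketch of that point in Python/NumPy (invented data; the basis here is sin, cos, and a constant, chosen just for illustration):

[code]
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 6.0, 20)
# Made-up measurements from f(x) = 2 sin(x) - 0.5 cos(x) + 1, plus noise.
y = 2.0 * np.sin(x) - 0.5 * np.cos(x) + 1.0 + 0.05 * rng.standard_normal(20)

# Design matrix: columns are the chosen basis functions evaluated at each x.
X = np.column_stack([np.sin(x), np.cos(x), np.ones_like(x)])

# Exactly the same least squares formula as in the polynomial case.
c, *_ = np.linalg.lstsq(X, y, rcond=None)
print(c)   # approximately [2.0, -0.5, 1.0]
[/code]

The only thing that changes from the polynomial case is which functions fill the columns of the design matrix.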
 
  • #13


I'm sorry for the late reply.
I don't know why, but I didn't receive an email notification for your reply this time, though by default I receive email notifications of replies.

Anyway,
All of your answers are satisfactory to me.
Thank you so much for your help.
 
  • #14


Borek said:
To understand the idea there is no need for a matrix representation.

Let's say you want to do linear regression, y = ax + b. You have a set of pairs [itex](x_i, y_i)[/itex]. You look for a & b such that the sum

[tex]\sum (y_i - ax_i - b)^2[/tex]

has minimum value. Calculate the derivatives (d/da, d/db) of the sum, set them to zero, and solve for a & b - and you are done. This is high school math.

Your example - with a third degree polynomial - is not linear in x, so I don't think you can use a simple vector X for your purposes. But I could be wrong.

I'm trying to do a least squares fit on these points: (-10, 1), (-10, -1), (10, 1), (10, -1),
and I ended up with [itex]4 + 4a^2 + 400b^2 = \mathrm{minimum}[/itex].
This might sound basic to you guys, but what do I do now?
 
  • #15


joecampbell said:
I'm trying to do a least squares fit on these points: (-10, 1), (-10, -1), (10, 1), (10, -1),
and I ended up with [itex]4 + 4a^2 + 400b^2 = \mathrm{minimum}[/itex].
This might sound basic to you guys, but what do I do now?

Why don't you use the general formula in the picture in post #11?
You only have to modify the X matrix to include just the [itex]x^1[/itex] and [itex]x^0[/itex] columns.
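
For instance, here is a minimal sketch of that in Python/NumPy (using the four points quoted above):

[code]
import numpy as np

x = np.array([-10.0, -10.0, 10.0, 10.0])
y = np.array([1.0, -1.0, 1.0, -1.0])

# Design matrix with only the x^1 and x^0 columns, for y = a*x + b.
X = np.column_stack([x, np.ones_like(x)])

# Same normal-equation formula as before: solve (X^T X) [a, b]^T = X^T y.
a, b = np.linalg.solve(X.T @ X, X.T @ y)
print(a, b)   # both come out 0: by symmetry the best-fit line is y = 0
[/code]

The leftover sum of squared errors is then [itex]\sum y_i^2 = 4[/itex], which is the smallest it can get for this symmetric configuration.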
 

FAQ: Can you give me a least squares example?

1. What is the concept of least squares in statistics?

Least squares is a statistical method used to find the best-fit line or curve for a set of data points. It minimizes the sum of squared differences between the observed values and the predicted values, making it a useful tool for analyzing and predicting trends in data.

2. How is least squares used in linear regression?

In linear regression, least squares is used to find the line of best fit for a set of data points. The method calculates the slope and intercept of the line that minimizes the sum of squared differences between the observed values and the predicted values. This line is then used to make predictions about future data points.

3. Can you give an example of how least squares is used in real-world applications?

One example of using least squares in real-world applications is in finance, where it is used to analyze stock market trends and make predictions about future stock prices. By fitting a least squares line to historical stock prices, analysts can make informed decisions about which stocks to buy, sell, or hold.

4. What are the assumptions of least squares?

The main assumptions of least squares include that the data points are independent, the errors are normally distributed, and the errors have constant variance. Additionally, the model should be linear in the fitted parameters, and there should be no significant outliers in the data.

5. How do you calculate least squares?

To calculate a least squares fit, you first determine the slope and intercept of the line that minimizes the sum of squared differences between the observed values and the predicted values. This can be done with various methods, such as solving the normal equations or running a gradient descent algorithm. Once the slope and intercept are calculated, they can be used to make predictions about future data points.
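
As a minimal end-to-end illustration in Python/NumPy (the data points are invented):

[code]
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# polyfit with deg=1 performs exactly this least squares calculation,
# returning the slope and intercept of the best-fit line.
a, b = np.polyfit(x, y, deg=1)
print(a, b)          # slope close to 2, intercept close to 0

y_pred = a * x + b   # predictions at the observed x values
[/code]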
