Least Squares Derivation question

In summary: the thread works through the method of least squares for fitting the best straight line, focusing on the derivation of the formula for the slope m. Setting the partial derivatives of the sum of squared residuals S to zero gives one expression for m; substituting c = ȳ − m x̄ into it and solving for m again shows it is equivalent to the book's formula, which resolves the original confusion.
  • #1
Xyius
So I am learning how to use the method of least squares to find the best fit straight line and there is this one part in the derivation I do not fully understand and I was wondering if anyone could help me out.

So basically we start out with the residuals of the equation of a straight line..

[tex]y_i-mx_i-c[/tex]
And now we form the sum of the squares of these residuals, S, and look for the values of m and c that minimize it by setting the partial derivatives to zero.

[tex]S=\Sigma (y_i-mx_i-c)^2[/tex]
[tex]\frac{\partial S}{\partial m}=-2\Sigma x_i(y_i-mx_i-c)=0[/tex]
[tex]\frac{\partial S}{\partial c}=-2\Sigma (y_i-mx_i-c)=0[/tex]

Now from the second equation it is easy to see that
[tex]c=\overline{y}-m\overline{x}[/tex]
since
[tex]\overline{y}=\frac{1}{n}\Sigma y_i[/tex]
and
[tex]\overline{x}=\frac{1}{n}\Sigma x_i[/tex]

Solving the first equation for m we get..
[tex]m=\frac{\Sigma x_i y_i -c\Sigma x_i}{\Sigma x_i ^2}[/tex]

The part I do not understand is the book says that m is equal to..

[tex]m=\frac{\Sigma (x_i-\overline{x})y_i}{\Sigma (x_i-\overline{x})^2}[/tex]
I feel like I must be missing something simple due to the book's lack of explanation. Can anyone help me get this formula from the one I derived for m? Why did the [itex]x_i[/itex]'s turn into [itex](x_i-\overline{x})[/itex]'s? Did they set c=0 for some reason? Any help would be appreciated!
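For what it's worth, a quick numerical check (a minimal Python sketch with made-up data, nothing from the book) suggests the two expressions do give the same slope once c = ȳ − m x̄ is used, so they must be related algebraically:

[code]
import numpy as np

# Made-up data, purely for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

xbar, ybar = x.mean(), y.mean()

# Book's formula: m = sum((x_i - xbar) * y_i) / sum((x_i - xbar)^2)
m_book = np.sum((x - xbar) * y) / np.sum((x - xbar) ** 2)
c_book = ybar - m_book * xbar

# Formula derived above, m = (sum(x_i y_i) - c*sum(x_i)) / sum(x_i^2),
# with the optimal c = ybar - m*xbar plugged in
m_derived = (np.sum(x * y) - c_book * np.sum(x)) / np.sum(x ** 2)

# Reference: numpy's own least-squares straight-line fit
m_ref, c_ref = np.polyfit(x, y, 1)

print(m_book, m_derived, m_ref)   # all three slopes agree
[/code]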
 
  • #2
Xyius said:
[tex]m=\frac{\Sigma x_i y_i -c\Sigma x_i}{\Sigma x_i ^2}[/tex]

The part I do not understand is the book says that m is equal to..
[tex]m=\frac{\Sigma (x_i-\overline{x})y_i}{\Sigma (x_i-\overline{x})^2}[/tex]

In the first equation, if you replace [itex] c [/itex] by [itex] \overline{y} - m \overline{x} [/itex] then you get m's on both sides and you have to solve for m again. Maybe that's how to do it.
 
  • #3
Not quite right - it looks like they assumed a model with zero y-mean. Another way of looking at this is as a model, in x about the x-mean, through the origin. With this latter formulation you only have one parameter, the slope (m), to estimate:

[tex] y_i = m \, (x_i-\bar{x})[/tex]

Hence, setting the derivative of S(m) to zero yields

[tex] \sum y_i \, (x_i-\bar{x}) = m \cdot \sum (x_i-\bar{x})^2[/tex]

Solve for m to obtain the textbook's answer.
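If it helps to see it numerically, here is a minimal Python sketch (invented data) fitting the one-parameter, through-the-origin model in the centered variable; the slope it gives matches the ordinary two-parameter straight-line fit:

[code]
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

xc = x - x.mean()                       # centered x, i.e. x_i - xbar

# One-parameter model y_i = m * (x_i - xbar):
# setting dS/dm = 0 gives sum(y_i * xc_i) = m * sum(xc_i^2)
m = np.sum(y * xc) / np.sum(xc ** 2)

# Same slope as the usual two-parameter straight-line fit
m_ref, c_ref = np.polyfit(x, y, 1)
print(m, m_ref)
[/code]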
 
  • #4
[tex]m=\frac{\Sigma x_i y_i -c\Sigma x_i}{\Sigma x_i ^2}[/tex]
[tex] m = \frac{\Sigma x_i y_i - (\overline y - m \overline x) \Sigma x_i}{\Sigma x_i ^2}[/tex]
[tex] m = \frac{ \Sigma x_i y_i -\overline y \Sigma x_i + m \overline x \Sigma x_i}{\Sigma x_i^2} [/tex]
[tex] m \Sigma x_i^2 - m \overline x \Sigma x_i = \Sigma x_i y_i - \overline y \Sigma x_i [/tex]
[tex] m= \frac{ \Sigma x_i y_i - \overline y \Sigma x_i}{\Sigma x_i^2 - \overline x \Sigma x_i} [/tex]

[tex] = \frac{ \Sigma x_i y_i - \frac{\Sigma y_i}{N} \Sigma x_i}{\Sigma x_i^2 - (\frac{\Sigma x_i}{N}) \Sigma x_i } [/tex]

The numerator is equal to [itex] \Sigma x_i y_i - \Sigma y_i (\frac{\Sigma x_i}{N}) [/itex]
[tex] = \Sigma x_i y_i - \Sigma y_i \overline x [/tex]
[tex] = \Sigma (x_i - \overline x) y_i [/tex]

To deal with the denominator, consider an estimator for the sample variance:

[tex] \sigma^2 = \frac { \Sigma (x_i - \overline x)^2}{N} [/tex]
[tex] = \frac{ \Sigma ( x_i^2 - 2 x_i \overline x + \overline x \overline x)}{N} [/tex]
[tex] = \frac{ \Sigma x_i^2 - 2 \overline x \Sigma x_i + \Sigma \overline x \overline x }{N} [/tex]
[tex] = \frac {\Sigma x_i^2}{N} - 2 \overline x \frac{\Sigma x_i}{N} + \frac{ N \overline x \overline x}{N} [/tex]
[tex] = \frac {\Sigma x_i^2}{N} - 2 \overline x \overline x + \overline x \overline x [/tex]
[tex] = \frac{ \Sigma x_i^2}{N} - \overline x \overline x [/tex]

This establishes that
[tex] \frac {\Sigma (x_i - \overline x)^2}{N} = \frac{\Sigma x_i^2}{N} - \overline x \overline x [/tex]

So
[tex] \Sigma (x_i - \overline x)^2 = \Sigma x_i^2 - N \overline x \overline x [/tex]
[tex] = \Sigma x_i^2 - N ( \frac{\Sigma x_i}{N} \frac{\Sigma x_i}{N}) [/tex]
[tex] = \Sigma x_i^2 - \frac{\Sigma x_i \Sigma x_i}{N} = \Sigma x_i^2 - \overline x \Sigma x_i [/tex]

which is exactly the denominator above. Combining this with the numerator identity gives the book's formula, [itex] m = \frac{\Sigma (x_i - \overline x) y_i}{\Sigma (x_i - \overline x)^2} [/itex].
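As a quick numerical sanity check of the two identities used above (the data values here are made up purely for illustration):

[code]
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
xbar, ybar = x.mean(), y.mean()

# Numerator identity: sum(x_i y_i) - ybar*sum(x_i) == sum((x_i - xbar) * y_i)
lhs_num = np.sum(x * y) - ybar * np.sum(x)
rhs_num = np.sum((x - xbar) * y)

# Denominator identity: sum(x_i^2) - xbar*sum(x_i) == sum((x_i - xbar)^2)
lhs_den = np.sum(x ** 2) - xbar * np.sum(x)
rhs_den = np.sum((x - xbar) ** 2)

print(np.isclose(lhs_num, rhs_num), np.isclose(lhs_den, rhs_den))  # True True
print(lhs_num / lhs_den, rhs_num / rhs_den)                        # same slope m
[/code]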
 
  • #5
The only thing that confuses me, Stephen Tashi, is lines 4 and 5. Where do you get those relations, and how does m turn into the expression in line 5? Otherwise, everything after that is fine and I understand completely. :]
 
  • #6
[tex] m = \frac{ \Sigma x_i y_i -\overline y \Sigma x_i + m \overline x \Sigma x_i}{\Sigma x_i^2} [/tex]
Multiply both sides of the equation by [itex] \Sigma x_i^2 [/itex] and then subtract [itex] m\overline x \Sigma x_i [/itex] from both sides.
[tex] m \Sigma x_i^2 - m \overline x \Sigma x_i = \Sigma x_i y_i - \overline y \Sigma x_i [/tex]

[tex] m( \Sigma x_i^2 - \overline x \Sigma x_i) = \Sigma x_i y_i - \overline y \Sigma x_i [/tex]

Divide both sides by [itex] \Sigma x_i^2 - \overline x \Sigma x_i [/itex]

[tex] m= \frac{ \Sigma x_i y_i - \overline y \Sigma x_i}{\Sigma x_i^2 - \overline x \Sigma x_i} [/tex]
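If you want to check that algebra symbolically, here is a minimal sketch using sympy, treating the sums as opaque symbols (the symbol names are just placeholders I've chosen):

[code]
import sympy as sp

# Sxy = sum(x_i y_i), Sx = sum(x_i), Sxx = sum(x_i^2); xbar, ybar are the means
m, Sxy, Sx, Sxx, xbar, ybar = sp.symbols('m Sxy Sx Sxx xbar ybar')

# m = (Sxy - (ybar - m*xbar)*Sx) / Sxx, solved for m
eq = sp.Eq(m, (Sxy - (ybar - m * xbar) * Sx) / Sxx)
solution = sp.solve(eq, m)[0]

# Prints an expression equivalent to (Sxy - ybar*Sx) / (Sxx - xbar*Sx)
print(sp.simplify(solution))
[/code]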
 
  • #7
Thank you very much!
 

Related to Least Squares Derivation question

1. What is the purpose of a Least Squares Derivation?

The purpose of a least-squares derivation is to find the line of best fit for a set of data points. This line minimizes the sum of the squared vertical distances between each data point and the line, making it the best fit in the least-squares sense.

2. How is the Least Squares Derivation calculated?

The line of best fit is found by minimizing the sum of squared errors: take the partial derivatives of that sum with respect to the slope and the y-intercept, set them equal to zero, and solve the resulting pair of equations for the slope and y-intercept.
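As a concrete illustration, a minimal Python sketch (with invented data) of the resulting closed-form slope and y-intercept:

[code]
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

xbar, ybar = x.mean(), y.mean()
m = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)  # slope
c = ybar - m * xbar                                            # y-intercept

print(m, c)
[/code]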

3. What assumptions are made in a Least Squares Derivation?

The main assumptions made in a Least Squares Derivation are that the data points are independent, the errors are normally distributed, and the errors have equal variances. These assumptions help to ensure the accuracy and validity of the calculated line of best fit.

4. Can the Least Squares Derivation be used for non-linear relationships?

Not directly: the derivation above assumes a straight-line model. Other relationships need other models, such as polynomial regression (which is still solved by linear least squares, since the model is linear in its coefficients) or non-linear least squares for models like exponentials.

5. How does the Least Squares Derivation relate to the concept of correlation?

The least-squares fit is closely related to the concept of correlation, as both describe the linear relationship between two variables. Correlation measures the strength and direction of that relationship, while least squares produces the actual line of best fit.
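In fact, the least-squares slope equals the correlation coefficient multiplied by the ratio of the standard deviations of y and x; a minimal Python sketch (invented data) illustrating this:

[code]
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

r = np.corrcoef(x, y)[0, 1]        # Pearson correlation coefficient
m = r * y.std() / x.std()          # slope = r * (sigma_y / sigma_x)

m_ref = np.sum((x - x.mean()) * y) / np.sum((x - x.mean()) ** 2)
print(m, m_ref)                    # the two slopes agree
[/code]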
