Least Squares Method to fit a line to a set of data points

  • #1
MatinSAR
Homework Statement
Given a set of data points, we aim to fit a line to these points by minimizing the total error and finding the coefficients ##a_0## (intercept) and ##a_1## (slope).

Each of these data points has an associated error. Derive the expressions that give the errors (uncertainties) of ##a_0## and ##a_1##.
Relevant Equations
I'll mention the relevant equations later in my solution.
We are given a set of points ##(x_i , y_i)##. To fit a line of the form ##y=a_0+a_1x## to these points, we choose the coefficients that minimize the total squared error ##E##: $$E = \sum_{i=1}^n (y_i - a_0 - a_1x_i)^2$$ So we set ##\frac{\partial E}{\partial a_0} = 0## and ##\frac{\partial E}{\partial a_1} = 0## and solve the resulting system of two equations. Then we get:

[Two attached images with the resulting expressions for ##a_0## and ##a_1##; not reproduced here.]
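For reference, here is a sketch of the standard closed-form solution of the two equations ##\frac{\partial E}{\partial a_0} = 0## and ##\frac{\partial E}{\partial a_1} = 0##. It is written in the notation of this post and is not copied from the attachments; the shorthand ##\Delta## is introduced here for convenience, and all sums run over ##i = 1, \dots, n##.

$$\Delta = n\sum x_i^2 - \left(\sum x_i\right)^2$$
$$a_1 = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\Delta}, \qquad a_0 = \frac{\sum x_i^2 \sum y_i - \sum x_i \sum x_i y_i}{\Delta}$$

Both coefficients are linear in the ##y_i##, which is what makes propagating the errors of the ##y_i## through them tractable.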


My problem: I have no idea how to use the errors of the data points to find the uncertainties of ##a_0## and ##a_1##.
 
  • #4
I believe you need to make some assumptions about the residuals (IIRC, normality and homostadicity) in order to find the distributions of the slope and intercept. Under "reasonable" conditions, both estimates are normally distributed and converge to the true values.
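As a sketch of where those assumptions lead: assume the ##x_i## are exact and the ##y_i## have independent errors with one common standard deviation ##\sigma## (the equal-variance case). The symbols ##\sigma##, ##\bar x##, ##c_i## and ##\Delta = n\sum x_i^2 - (\sum x_i)^2## are introduced here for illustration and do not come from the original post. The slope is a linear combination of the ##y_i##:

$$a_1 = \sum_i c_i\, y_i, \qquad c_i = \frac{x_i - \bar x}{\sum_j (x_j - \bar x)^2}, \qquad \bar x = \frac{1}{n}\sum_i x_i$$

so ordinary error propagation for independent errors gives

$$\sigma_{a_1}^2 = \sigma^2 \sum_i c_i^2 = \frac{\sigma^2}{\sum_i (x_i - \bar x)^2} = \frac{n\,\sigma^2}{\Delta}$$

With ##a_0 = \bar y - a_1 \bar x = \sum_i \left(\frac{1}{n} - \bar x\, c_i\right) y_i## and ##\sum_i c_i = 0##, the same step gives

$$\sigma_{a_0}^2 = \sigma^2\left(\frac{1}{n} + \frac{\bar x^2}{\sum_i (x_i - \bar x)^2}\right) = \frac{\sigma^2 \sum_i x_i^2}{\Delta}$$

If ##\sigma## is not known in advance, a common choice is to estimate it from the scatter about the fitted line, ##s^2 = E_{\min}/(n-2)##, where ##E_{\min}## is the minimized total error and ##n-2## accounts for the two fitted coefficients. If each point has its own ##\sigma_i##, these expressions no longer apply as written and a weighted fit is needed.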
 
  • #5
WWGD said:
homostadicity
homoschedasticity ?
 
  • #6
haruspex said:
homoschedasticity ?
Something like that.
 
  • #7
According to this Wikipedia article,
In statistics, a sequence of random variables is homoscedastic (/ˌhoʊmoʊskəˈdæstɪk/) if all its random variables have the same finite variance; this is also known as homogeneity of variance. The complementary notion is called heteroscedasticity, also known as heterogeneity of variance. The spellings homoskedasticity and heteroskedasticity are also frequently used. “Skedasticity” comes from the Ancient Greek word “skedánnymi”, meaning “to scatter”.
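To make the equal-variance assumption concrete, here is a minimal Python sketch of a straight-line fit that also returns the uncertainties of ##a_0## and ##a_1##. It assumes exact ##x_i## and independent errors on the ##y_i## with one common standard deviation. The function name `fit_line_with_errors` and the optional `sigma` argument are hypothetical, chosen for this illustration; nothing here comes from the linked article.

```python
import math


def fit_line_with_errors(xs, ys, sigma=None):
    """Least-squares fit of y = a0 + a1*x with coefficient uncertainties.

    Assumes homoscedastic data: every y_i has the same standard deviation.
    If sigma is None, it is estimated from the residuals with n - 2
    degrees of freedom (an assumption of this sketch).
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")

    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # Common denominator of the closed-form solution.
    delta = n * sxx - sx * sx
    if delta == 0:
        raise ValueError("need at least two distinct x values")

    a1 = (n * sxy - sx * sy) / delta      # slope
    a0 = (sxx * sy - sx * sxy) / delta    # intercept

    if sigma is None:
        if n <= 2:
            raise ValueError("need more than two points to estimate sigma")
        rss = sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))
        sigma = math.sqrt(rss / (n - 2))

    # Error propagation for independent, equal-variance errors on y.
    err_a0 = sigma * math.sqrt(sxx / delta)
    err_a1 = sigma * math.sqrt(n / delta)
    return a0, a1, err_a0, err_a1


if __name__ == "__main__":
    a0, a1, err_a0, err_a1 = fit_line_with_errors(
        [0.0, 1.0, 2.0, 3.0], [1.1, 2.9, 5.2, 6.8]
    )
    print(f"a0 = {a0:.3f} +/- {err_a0:.3f}")
    print(f"a1 = {a1:.3f} +/- {err_a1:.3f}")
```

Passing a known `sigma` uses the stated measurement error directly; leaving it out falls back on the scatter of the residuals. For heteroscedastic data, where each point has its own error, this sketch would have to be replaced by a weighted fit.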
 
  • #8
MatinSAR said:
Homework Statement: Given a set of data points, we aim to fit a line to these points by minimizing the total error and finding the coefficients ##a_0## (intercept) and ##a_1## (slope).

Each of these data points has an associated error. Derive the expressions that give the errors (uncertainties) of ##a_0## and ##a_1##.
...
My problem: I have no idea how to use the errors of the data points to find the uncertainties of ##a_0## and ##a_1##.

Here are a few links in an old thread (with a link to an even older thread, etc. etc. -- sigh -- turtles all the way down).

I especially recommend Kirchner.

  • #9
BvU said:
Here are a few links in an old thread (with a link to an even older thread, etc. etc. -- sigh -- turtles all the way down).

I especially recommend Kirchner.

Thank you for providing the links. Of course, I'll eventually find a post that doesn't link to an older one.
 