Is There a Way to Fit Lines of Worst Fit Using Statistical Packages?

Beam me down · Sep 5, 2008

I recently completed an experiment for university-level physics and am currently analysing the data. The data have been transformed to fit a linear model and fit it very well, with the line of best fit lying within all error bars. I need to provide estimates of the gradient and intercept, as these have a physical interpretation. I was hoping to fit a line of worst fit: the worst line that still passes through all the error bars.

I can do this to some degree of accuracy by plotting the data by hand, but I was wondering if there was a more systematic approach using any statistical package?

I have access to SPSS, Excel and R, though I am not that familiar with the last of these. I suppose I could always download a trial version of another package.

So is there any way to fit lines of worst fit?

Thanks for any help you can provide.
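
One way to make "the worst line that still fits all error bars" systematic is sketched below in R. It assumes vertical error bars only, stored as half-widths in a hypothetical vector err, and uses made-up data. The steepest and shallowest straight lines that pass through every bar each touch the ends of two bars, so the sketch tries every line joining an end of one bar to an end of another and keeps those that stay inside all the bars. The function name extreme_lines is illustrative and not part of any package.

```r
# Hypothetical data: x, y and the error-bar half-widths err are made up.
x   <- c(1, 2, 3, 4, 5)
y   <- c(2.1, 3.9, 6.2, 7.8, 10.1)
err <- c(0.4, 0.4, 0.4, 0.4, 0.4)

# Return the shallowest and steepest lines that pass through every
# vertical error bar, or NULL if no straight line does.
extreme_lines <- function(x, y, err, tol = 1e-9) {
  lo <- y - err
  hi <- y + err
  slopes <- numeric(0)
  intercepts <- numeric(0)
  n <- length(x)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      if (x[i] == x[j]) next           # no finite slope between these bars
      for (yi in c(lo[i], hi[i])) {    # either end of bar i
        for (yj in c(lo[j], hi[j])) {  # either end of bar j
          b <- (yj - yi) / (x[j] - x[i])
          a <- yi - b * x[i]
          fitted_y <- a + b * x
          # keep the candidate only if it stays inside every bar
          if (all(fitted_y >= lo - tol) && all(fitted_y <= hi + tol)) {
            slopes <- c(slopes, b)
            intercepts <- c(intercepts, a)
          }
        }
      }
    }
  }
  if (length(slopes) == 0) return(NULL)
  k_min <- which.min(slopes)
  k_max <- which.max(slopes)
  rbind(shallowest = c(intercept = intercepts[k_min], slope = slopes[k_min]),
        steepest   = c(intercept = intercepts[k_max], slope = slopes[k_max]))
}

print(extreme_lines(x, y, err))
```

The spread between the two slopes (and between the two intercepts) gives a range rather than a statistical confidence interval, so it answers a different question from the regression-based intervals.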

rbeale98 · Sep 5, 2008

What are the error bars you refer to? Standard deviations? In linear regression there is no requirement that the estimated line pass through the error bars; the fact that all your measurements lie within one standard deviation of the line is purely by chance. The best fit is found by minimizing the sum of squared errors, and there is no such thing as a worst-fit line. Why can't you just use the slope and intercept of your best-fit line as your estimates? If you want, you can find confidence intervals (99%, 95%, etc.) for these estimates.
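
A minimal R sketch of that suggestion, using made-up numbers (the vectors x and y are hypothetical stand-ins for the transformed measurements): lm() fits the least-squares line and confint() returns confidence intervals for the intercept and the slope.

```r
# Hypothetical data standing in for the transformed measurements
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 11.8)

fit <- lm(y ~ x)                   # least-squares straight line
est <- coef(fit)                   # intercept and slope estimates
ci  <- confint(fit, level = 0.95)  # 95% confidence intervals for both

print(est)
print(ci)
```

Changing the level argument gives 99% or other intervals.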

jackdagg · Aug 12, 2009

Wow - lots of options here. If standard methods are not sufficient, there are many ways to approximate data fit so that you can consider complex transformation for testing and estimation. Currently, the R language might be most amenable to this sort of modeling. You might start with some spline methods. Jack Dagg

statdad · Aug 13, 2009

By "gradient" do you mean slope? If so, there are traditional confidence interval estimates for the slope (and for the intercept as well). You can also get R to draw a graph showing the fitted regression line together with the confidence bounds, the prediction bounds, or both for your data. At the prompt in R, type

help(predict)

for more details.
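
As a rough illustration of such a graph, the R sketch below uses made-up data (x and y are hypothetical) and asks predict() for both kinds of interval on a grid of x values; the colours and line types are arbitrary choices.

```r
# Hypothetical data standing in for the transformed measurements
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 11.8)

fit  <- lm(y ~ x)
grid <- data.frame(x = seq(min(x), max(x), length.out = 50))

# Each call returns a matrix with columns fit, lwr and upr
conf <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)
pred <- predict(fit, newdata = grid, interval = "prediction", level = 0.95)

plot(x, y)
abline(fit)                                                       # fitted line
matlines(grid$x, conf[, c("lwr", "upr")], lty = 2, col = "blue")  # confidence band
matlines(grid$x, pred[, c("lwr", "upr")], lty = 3, col = "red")   # prediction band
```

The confidence band describes uncertainty in the fitted line itself, while the wider prediction band describes where a new observation would be expected to fall.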

jackdagg · Aug 13, 2009

Excellent suggestion statdad. Sometimes it's easy to overlook established methods. I see a lot of students have an aha moment after they calculate the estimated confidence bands that increase in width as the data limits are approached. Cheers!

SW VandeCarr · Aug 14, 2009

There is something called the minimum likelihood estimate, or "badness of fit", based on the negative log likelihood. I don't know if this helps. See Sec. 6.4.1 of the following:

http://books.google.com/books?id=yU...esult&ct=result&resnum=5#v=onepage&q=&f=false

The Excel version of MLE is pretty good for most purposes, but most major software packages can do MLE including SPSS.
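
For a straight-line model, the negative log likelihood can be written down and minimized directly. The R sketch below assumes independent Gaussian errors with a common standard deviation, which this thread does not specify, and uses made-up data. Under that assumption the minimizing intercept and slope should agree with the least-squares fit from lm() to within the optimizer's tolerance.

```r
# Hypothetical data standing in for the transformed measurements
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 11.8)

# Negative log likelihood for y = a + b * x + Gaussian noise.
# par = c(a, b, log(sigma)); the log keeps sigma positive.
nll <- function(par) {
  mu    <- par[1] + par[2] * x
  sigma <- exp(par[3])
  -sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
}

# Arbitrary starting values; BFGS is one of optim()'s built-in methods
mle <- optim(c(0, 1, 0), nll, method = "BFGS", control = list(maxit = 1000))

print(mle$par[1:2])    # intercept and slope from maximum likelihood
print(coef(lm(y ~ x))) # least-squares estimates for comparison
```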

statdad · Aug 14, 2009

I'm not sure how the likelihood idea would apply, or really be useful, in a simple linear regression setting. I'm also leery of using Excel for any statistics work past simple means, but that's not the point of this post.

I should expand on my earlier comment: other software will also plot the prediction and/or confidence bands for a regression; it's not limited to R. And yes, jackdagg, it is amusing to see how students respond to those graphs: I've had some actually ask "Is that why extrapolation with these models is so risky?" (Thanks for the comments too.)

SW VandeCarr · Aug 14, 2009

I don't know what the OP's requirements were. I simply suggested it as another alternative given the OP's need for a "worst fit" estimate. MLE is robust regardless of the validity of the normal assumption, is used by many frequentist statisticians, and the section I cited allows for the calculation of probabilities and confidence intervals. Have you used Excel for MLE? I didn't mention it, but the citation refers to R. I was simply suggesting other possibilities since the OP said he wasn't familiar with R.

http://www.jstatsoft.org/v30/i07/paper

statdad · Aug 14, 2009

My response wasn't very organized and a couple of ideas were jumbled. My Excel comment was directed at the OP: I am not a fan of Excel for problems of this type or, as I said, for any reasonably advanced statistical work. I referenced R for the plots because I'm most familiar with it but, as noted, any software worth its salt will get that job done. I wasn't implying you suggested Excel, even if poor wording made it seem that way.

I do disagree with the "likelihood methods are robust" comment - but that's not the point of the OP's inquiry.
