Is There a Way to Fit Lines of Worst Fit Using Statistical Packages?

In summary, the original poster completed an experiment for university-level physics and is analyzing the data. The data have been transformed to fit a linear model and fit well, with the line of best fit lying within all error bars. The poster needs estimates of the gradient and intercept and was hoping to also find a line of worst fit, and asks how to do this with statistical packages such as SPSS, Excel, or R. Other participants suggest using traditional confidence interval estimates and plotting prediction and confidence bands. Another person suggests maximum likelihood estimates, while others mention the limitations of Excel for advanced statistical work. In conclusion, the thread offers several ways to estimate a worst fit line and to plot confidence and prediction bands.
  • #1
Beam me down
I recently completed an experiment for university level physics and am currently analyzing the data. The data has been transformed to fit a linear model and it fits this very well, with the line of best fit lying within all error bars. I need to provide estimates of the gradient and intercept, as these have a physical interpretation. I was hoping to do a line of worst fit, that is, the worst line that still fits all error bars.

I can do this to some degree of accuracy by plotting the data by hand, but I was wondering if there was a more systematic approach using any statistical package?

I have access to SPSS, Excel and R, though I am not that familiar with the latter. Though I suppose I could always download a trial version of other packages.

So is there any way to fit lines of worst fit?

Thanks for any help you can provide.
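One systematic way to read "the worst line that still fits all error bars" is as the steepest and shallowest straight lines that stay inside every bar. The sketch below is only an illustration under stated assumptions: the error bars are vertical only, the function name `extreme_lines` and the data are made up, and a brute-force search is used because an extreme line must touch the ends of two bars at different x values. It is not a method proposed by anyone in the thread.

```python
from itertools import combinations, product


def extreme_lines(x, y, err, tol=1e-9):
    """Return ((slope, intercept) of the shallowest line,
               (slope, intercept) of the steepest line)
    among lines passing within every vertical error bar, or None
    if no straight line fits all bars.

    Assumption: err[i] is the half-length of a vertical bar on y[i];
    uncertainties in x are ignored.
    """
    pts = list(zip(x, y, err))
    feasible = []
    # An extreme line touches the end of two bars at different x,
    # so try every pair of bars and every choice of bar ends.
    for (xi, yi, ei), (xj, yj, ej) in combinations(pts, 2):
        if xi == xj:
            continue  # two bars at the same x cannot fix a slope
        for si, sj in product((-1.0, 1.0), repeat=2):
            yi_end, yj_end = yi + si * ei, yj + sj * ej
            slope = (yj_end - yi_end) / (xj - xi)
            intercept = yi_end - slope * xi
            # Keep the candidate only if it stays inside every bar.
            if all(abs(yk - (intercept + slope * xk)) <= ek + tol
                   for xk, yk, ek in pts):
                feasible.append((slope, intercept))
    if not feasible:
        return None
    return min(feasible), max(feasible)


# Made-up illustrative data, not the poster's measurements.
x = [0.0, 1.0, 2.0]
y = [0.0, 1.0, 2.0]
err = [1.0, 1.0, 1.0]
shallowest, steepest = extreme_lines(x, y, err)
print("shallowest (slope, intercept):", shallowest)
print("steepest   (slope, intercept):", steepest)
```

The search is cubic in the number of points, which is fine for a typical lab data set. Half the difference between the two slopes (and likewise the intercepts) is one common graphical estimate of the uncertainty, though it is not a statistical confidence interval.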
 
  • #2
What are the error bars you refer to? Standard deviations? In linear regression, it is not a requirement that the estimated line pass through the error bars; the fact that all your measurements fall within one standard deviation of the line is completely by chance. The best fit is found by minimizing the sum of squared errors. There is no such thing as a worst fit line. Why can't you just use the slope and intercept of your best fit line as your estimates? If you want, you can find confidence intervals (99%, 95%, etc.) for these estimates.
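As a rough illustration of the least-squares estimates and their confidence intervals, here is a minimal sketch using only the Python standard library. The function name `linear_fit`, the data, and the hard-coded critical value are assumptions made for the example; the interval formula is the usual one for simple linear regression with independent, equal-variance normal errors.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns the estimates and their standard errors, assuming
    independent errors with equal variance.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    # Residual sum of squares and residual standard deviation (n - 2 df).
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))
    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": s / math.sqrt(sxx),
        "se_intercept": s * math.sqrt(1.0 / n + xbar ** 2 / sxx),
    }


# Made-up illustrative data, not the poster's measurements.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]
fit = linear_fit(x, y)

# Assumption: 95% interval with n - 2 = 3 degrees of freedom, for which
# the two-sided Student t critical value is about 3.182 (from tables).
t_crit = 3.182
for name in ("slope", "intercept"):
    est, se = fit[name], fit["se_" + name]
    print(f"{name}: {est:.3f}, 95% CI ({est - t_crit * se:.3f}, {est + t_crit * se:.3f})")
```

For a different number of points or confidence level, the critical value has to be looked up again for n - 2 degrees of freedom.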
 
  • #3
Wow - lots of options here. If standard methods are not sufficient, there are many ways to approximate the fit to the data, so you can consider complex transformations for testing and estimation. Currently, the R language might be most amenable to this sort of modeling. You might start with some spline methods. Jack Dagg
 
  • #4
By "gradient" do you mean slope? If so, there are traditional confidence interval estimates for the slope (and for the intercept as well). You can also get R to create a graph showing the fitted regression line together with the confidence bounds, the prediction bounds, or both, for your data. At the prompt in R, type

help(predict)

for more details.
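To show what those bounds look like numerically, here is a small sketch in Python rather than R, standard library only. The function name `band_half_widths`, the data, and the critical value are assumptions for illustration; the formulas are the standard ones for the confidence band (uncertainty in the fitted mean) and the prediction band (uncertainty in a new observation) in simple linear regression.

```python
import math


def band_half_widths(x, y, x0, t_crit):
    """Fitted value at x0 plus the half-widths of the confidence
    and prediction intervals there.

    t_crit is the Student t critical value for n - 2 degrees of
    freedom at the chosen confidence level (supplied by the caller).
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))
    # This term grows as x0 moves away from the mean of the x data,
    # which is why both bands widen towards the ends and beyond.
    h = 1.0 / n + (x0 - xbar) ** 2 / sxx
    fitted = intercept + slope * x0
    conf = t_crit * s * math.sqrt(h)         # band for the mean response
    pred = t_crit * s * math.sqrt(1.0 + h)   # band for a new observation
    return fitted, conf, pred


# Made-up illustrative data; 3.182 is the tabulated 95% two-sided
# t value for 3 degrees of freedom (an assumption tied to n = 5).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]
for x0 in (3.0, 5.0, 10.0):
    fitted, conf, pred = band_half_widths(x, y, x0, t_crit=3.182)
    print(f"x0={x0}: fit={fitted:.2f}, conf +/-{conf:.2f}, pred +/-{pred:.2f}")
```

Evaluating the function on a grid of x0 values and plotting fitted plus or minus each half-width gives the two bands; the prediction band is always the wider of the two.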
 
  • #5
Excellent suggestion, statdad. Sometimes it's easy to overlook established methods. I see a lot of students have an aha moment after they calculate the estimated confidence bands and see them widen as the limits of the data are approached. Cheers!
 
  • #6
  • #7
I'm not sure how the likelihood idea would apply, or really be useful, in a simple linear regression setting. I'm also leery of using Excel for any statistics work past simple means, but that's not the point of this post.

I should expand my earlier comment: other software will also plot the prediction and/or confidence bands for a regression; it's not limited to R. And yes, jackdagg, it is amusing to see how students respond to seeing those graphs: I've had some actually ask "Is that why extrapolation with these models is so risky?" (Thanks for the comments too.)
 
  • #8
statdad said:
I'm not sure how the likelihood idea would apply, or really be useful, in a simple linear regression setting. I'm also leery of using Excel for any statistics work past simple means, but that's not the point of this post.

I should expand my earlier comment: other software will also plot the prediction and/or confidence bands for a regression; it's not limited to R. And yes, jackdagg, it is amusing to see how students respond to seeing those graphs: I've had some actually ask "Is that why extrapolation with these models is so risky?" (Thanks for the comments too.)

I don't know what the OP's requirements were. I simply suggested it as another alternative given the OP's need for a "worst fit" estimate. MLE is robust regardless of the validity of the normal assumption, is used by many frequentist statisticians, and the section I cited allows for the calculation of probabilities and confidence intervals. Have you used Excel for MLE? I didn't mention it, but the citation refers to R. I was simply suggesting other possibilities since the OP said he wasn't familiar with R.

http://www.jstatsoft.org/v30/i07/paper
 
  • #9
SW VandeCarr said:
I don't know what the OP's requirements were. I simply suggested it as another alternative given the OP's need for a "worst fit" estimate. MLE is robust regardless of the validity of the normal assumption, is used by many frequentist statisticians, and the section I cited allows for the calculation of probabilities and confidence intervals. Have you used Excel for MLE? I didn't mention it, but the citation refers to R. I was simply suggesting other possibilities since the OP said he wasn't familiar with R.

http://www.jstatsoft.org/v30/i07/paper

My response wasn't very organized and a couple of ideas were jumbled. My Excel comment was directed at the OP - I am not a fan of Excel for problems of this type or, as I said, for any reasonably advanced stat work. I referenced R because I'm most familiar with it but, as noted, any software worth its salt will get that job done. I wasn't implying you suggested Excel, even if poor wording made it seem that way.

I do disagree with the "likelihood methods are robust" comment - but that's not the point of the OP's inquiry.
 

Related to Is There a Way to Fit Lines of Worst Fit Using Statistical Packages?

1. What is the purpose of fitting lines of worst fit?

In introductory physics labs, a line of worst fit is the steepest or shallowest straight line that still passes through all of the error bars. Its purpose is to give a rough estimate of the uncertainty in the gradient and intercept by comparison with the line of best fit; it is not a tool for identifying outliers.

2. How is the line of worst fit calculated?

The line of worst fit is usually found by drawing the steepest and the shallowest lines that still pass through every error bar; these typically run between opposite extremes of the error bars near the two ends of the data range. The difference between the gradient (or intercept) of such a line and that of the line of best fit is then taken as the uncertainty in that quantity.

3. Can the line of worst fit be used to make predictions?

No, the line of worst fit is not intended for making predictions. It marks the edge of the range of lines consistent with the error bars, so predictions should come from the line of best fit, with the worst fit lines indicating how uncertain those predictions are.

4. What does a large difference between the line of worst fit and the line of best fit indicate?

A large difference indicates that the gradient and intercept are poorly constrained, for example because the error bars are large compared with the range covered by the data. The estimates of both quantities then carry a correspondingly large uncertainty.

5. How is the line of worst fit useful in data analysis?

The line of worst fit is useful as a quick graphical estimate of the uncertainty in a fitted gradient and intercept. As the thread above notes, statistical packages offer a more systematic alternative: confidence intervals for the slope and intercept, and confidence and prediction bands around the fitted line.
