Nonlinear Least Squares or OLS for Nonlinear Models?

In summary, the discussion on "Nonlinear Least Squares or OLS for Nonlinear Models?" highlights the distinctions between Ordinary Least Squares (OLS) and Nonlinear Least Squares (NLS) methods in the context of fitting nonlinear models. OLS assumes linearity in parameters and is easier to compute, making it suitable for linear relationships. Conversely, NLS directly handles nonlinearity, allowing for more accurate parameter estimation in complex models. The choice between these methods depends on the nature of the data and the underlying model, with NLS typically preferred for true nonlinear relationships despite its computational complexity.
  • #1
fog37
TL;DR Summary
Difference between nonlinear least squares and ordinary least squares
Hello,

I understand that the method of ordinary least squares (OLS) is about finding the coefficients $\hat{\beta}_j$ that minimize the sum $\sum_i \big(y_i - f(x_i)\big)^2$, where $f(x)$ is the statistical model chosen to fit the data. Besides OLS, there are clearly other coefficient-estimation methods (MLE, etc.).

In general, OLS is fair game when the model is "linear with respect to the parameters" (linear regression, polynomial regression, etc.): any model that is a sum of several terms, each term being the product of an estimated coefficient and some function of the variables, $f(x) = \sum_j \beta_j \phi_j(x)$, where the $\phi_j(x)$ are like the basis functions. For example, $f(x) = \beta_0 + \beta_1 x + \beta_2 x^2$ is linear in the parameters, and the basis functions are the three functions $1$, $x$, and $x^2$.

Of course, the OLS approach is valid as long as specific assumptions on the residuals are met. Additionally, after taking the first derivatives with respect to the coefficients and setting them to zero, we arrive at nice analytical formulas for the coefficients.
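For instance, for the quadratic model above, this is how I picture the closed-form solution (a rough sketch with made-up data and coefficient values):

```python
import numpy as np

# Made-up data from a quadratic relationship plus noise (illustrative only)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 50)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=0.3, size=x.size)

# Design matrix built from the basis functions 1, x, x^2:
# the model is nonlinear in x but linear in the parameters
X = np.column_stack([np.ones_like(x), x, x**2])

# Setting the derivatives of the sum of squares to zero gives the
# normal equations (X'X) beta = X'y, solved here in closed form
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # estimates of beta_0, beta_1, beta_2
```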

That said, what is the issue with using OLS when $f(x)$ is a nonlinear model? I know that sometimes we "convert" a nonlinear model so that it assumes the form of a linear model. That strategy then allows us to use OLS on the new model based on the transformed variables... That is a useful hack.
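For example, with an exponential model $y = a e^{bx}$ (my own made-up example), taking logs gives $\ln y = \ln a + b x$, which is linear in the parameters $\ln a$ and $b$ and can be fit with OLS:

```python
import numpy as np

# Made-up data from y = a * exp(b * x) with multiplicative noise
rng = np.random.default_rng(1)
x = np.linspace(0.0, 2.0, 40)
y = 2.0 * np.exp(1.3 * x) * np.exp(rng.normal(scale=0.05, size=x.size))

# Linearize: ln(y) = ln(a) + b * x, then apply OLS to (x, ln y)
b_hat, ln_a_hat = np.polyfit(x, np.log(y), deg=1)  # [slope, intercept]
print(np.exp(ln_a_hat), b_hat)  # back-transform to recover a, b
```

One catch I am aware of: the transformation also transforms the error, so OLS on the logged data effectively assumes the noise is multiplicative on the original scale.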

But I have been reading about "nonlinear least squares". Isn't it the same approach as OLS, just for the case where the model is nonlinear, where we directly plug the nonlinear model $f(x)$ into the sum of squares? We may not end up with analytical estimators and may have to solve for the coefficients using some numerical method... But I don't see an issue with applying OLS to nonlinear models...
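For concreteness, this is what I imagine nonlinear least squares doing, e.g. with scipy.optimize.curve_fit (again, made-up data and starting values):

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    # Nonlinear in the parameter b, so no closed-form estimator exists
    return a * np.exp(b * x)

rng = np.random.default_rng(2)
x = np.linspace(0.0, 2.0, 40)
y = 2.0 * np.exp(1.3 * x) + rng.normal(scale=0.2, size=x.size)

# Minimizes the same sum of squared residuals, but iteratively,
# starting from the initial guess p0
popt, pcov = curve_fit(model, x, y, p0=[1.0, 1.0])
print(popt)  # numerical estimates of a and b
```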

Thank you.
 
  • #2
OLS minimizes the sum of squared errors of the actual samples versus the estimated values. If that is your goal, then that is the thing to do.
 
  • #3
FactChecker said:
OLS minimizes the sum of squared errors of the actual samples versus the estimated values. If that is your goal, then that is the thing to do.
That is the goal, but many resources I read state that OLS is only for linear models, and that puzzled me.... Is it because the estimates resulting from applying least squares to a nonlinear model are not as good as they would be for a linear model?
 
  • #4
I should probably not have called it OLS. If your goal is to minimize the sum-squared-errors, then do that, whether it requires OLS or a numerical technique.
These problems do not exist in a vacuum. You should have a reason for the model you propose and have something that you want to use the results for. That should determine what approach you can use. What you need to be aware of is that the statistical results like confidence intervals of the parameters may not be valid if certain assumptions are not met.
 
  • #5
FactChecker said:
I should probably not have called it OLS. If your goal is to minimize the sum-squared-errors, then do that, whether it requires OLS or a numerical technique.
These problems do not exist in a vacuum. You should have a reason for the model you propose and have something that you want to use the results for. That should determine what approach you can use. What you need to be aware of is that the statistical results like confidence intervals of the parameters may not be valid if certain assumptions are not met.
I see.

Inferential statistics is either about estimation, hypothesis testing, or both. Estimation is really just about coming up with a reasonably good (unbiased, consistent, low-variance) numerical estimate of the parameter.

Hypothesis testing focuses on a different task: it hypothesizes a value for the unknown population parameter and uses the limited sample data to check whether that hypothesis (H0) holds up. Confidence intervals, standard errors, and p-values result from hypothesis testing, not from estimation, correct?

If the required assumptions are not met by the chosen model, estimation may still work just fine...but confidence intervals, standard errors, p-values, etc. will not be reliable, statistically speaking.

For example, in linear regression, the response variable $Y$ does not have to be normally distributed for the model to be sound and to yield good estimates of the slope and intercept. The Gauss–Markov assumptions don't force $Y$ or the residuals to have a normal distribution at all.... But confidence intervals, standard errors, and p-values, the output of hypothesis testing, will not be good if $Y$ is not normal, which implies that the residuals will also not be normally distributed...

Am I thinking correctly here?
 
  • #6
fog37 said:
For example, in linear regression, the response variable $Y$ does not have to be normally distributed for the model to be sound and to yield good estimates of the slope and intercept. The Gauss–Markov assumptions don't force $Y$ or the residuals to have a normal distribution at all.... But confidence intervals, standard errors, and p-values, the output of hypothesis testing, will not be good if $Y$ is not normal, which implies that the residuals will also not be normally distributed...

Am I thinking correctly here?
When you talk about a normal distribution, you should be talking about the random term, $\epsilon$, not about $Y$. There can be many ways that random behavior influences $Y$. I have not seen you mention that yet. You need to pay special attention to how the random term enters into the equation. Without that, your model is incomplete.
Some example models are:

$$Y = \beta_0 + \beta_1 X + \epsilon$$

or

$$Y = \beta_0 e^{\beta_1 X} + \epsilon$$

or

$$Y = \beta_0 e^{\beta_1 X + \epsilon}$$
 
  • #7
I see. Your point is that the residuals can be normally distributed (and have equal variance) at each $x$ value... But that does not automatically imply that the observed response variable $Y$ also has normally distributed values....

However, I have always thought that if the error $\epsilon$ is normal, then $Y$ is also normally distributed...
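To check the first point, I tried a quick simulation (all numbers made up): the error is normal at every $x$, so $Y$ given $x$ is normal, but the pooled $Y$ values across all $x$ come out clearly skewed:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.uniform(0.0, 10.0, size=5000)
eps = rng.normal(scale=1.0, size=x.size)  # normal error at every x
y = 0.5 * x**2 + eps                      # Y given x is normal

# The errors pass a skewness test...
print(stats.skewtest(eps).pvalue)  # large p-value: consistent with normal

# ...but the marginal distribution of Y mixes many different means,
# so it is strongly skewed and clearly not normal
print(stats.skewtest(y).pvalue)    # tiny p-value: not normal
```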
 
  • #8
fog37 said:
I see. Your point is that the residuals can be normally distributed (and have equal variance) at each $x$ value... But that does not automatically imply that the observed response variable $Y$ also has normally distributed values....

However, I have always thought that if the error $\epsilon$ is normal, then $Y$ is also normally distributed...
IMO, we shouldn't talk about "residuals" and "error" as though they are simple normal random variables with a mean of 0. They are the errors of an estimated model versus the true model and can be changed by other errors in the estimated model.
Suppose we have an actual physical relationship $Y = a_0 + a_1 X + a_2 X^2 + \epsilon$, where $\epsilon$ is a normal variable with a mean of zero, and estimate it with a linear equation $\hat{Y} = \hat{b}_0 + \hat{b}_1 X$.
Then the errors or residuals are
$$Y - \hat{Y} = (a_0 - \hat{b}_0) + (a_1 - \hat{b}_1)X + a_2 X^2 + \epsilon,$$
which is different from the random term $\epsilon$. It includes a term that depends on $X$.
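Here is a quick illustration of that point (the quadratic relationship and all numbers are made up): fit a straight line to data generated from a quadratic, and the residuals still contain a systematic piece that depends on $X$.

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(-3.0, 3.0, 200)
y = 1.0 + 2.0 * x + 0.8 * x**2 + rng.normal(scale=0.5, size=x.size)

# Fit the misspecified linear model b0 + b1 * x
b1, b0 = np.polyfit(x, y, deg=1)
residuals = y - (b0 + b1 * x)

# The residuals are not just the normal error term: they still
# contain the missed a2 * x^2 piece, so they track x^2 closely
print(np.corrcoef(residuals, x**2)[0, 1])  # near 1, not 0
```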
 