Multiple linear regression + QQplots problem Includes pics

In summary, the thread discusses the assumption of normally distributed residuals in multiple linear regression and the use of QQ plots to check it. The poster's QQ plot shows departures from normality in the tails, which matters because least squares is sensitive to outliers and heavy-tailed errors. Robust regression is suggested as an alternative.
  • #1
emelie_earl
I want to do multiple linear regression, but one of the requirements is that the residuals be normally distributed. I can check that with QQ plots, but the QQ plot shows that about 95% of the data fall on the normal line while the other 5% are way off!

Can I still proceed, or do I have to find a way to transform the data?


[Attached image: 5.jpg]

[Attached image: 5_residuals.jpg]
 
  • #2
Your plots show serious non-normality in the error structure, with (as you've noted) problems in the tails, and since least squares is incredibly non-robust you're correct to be concerned.
1) Have you noticed any strange behavior in your estimates (coefficients with signs opposite what you might expect)?
2) Have you tried a robust regression? The MASS package in R provides several good options.
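
For readers who want to see the idea behind robust regression, here is a minimal sketch of one common approach: Huber M-estimation fitted by iteratively reweighted least squares. It is not code from the thread. The function names (huber_fit, weighted_fit, solve_linear_system, predict), the tuning constant k = 1.345, the scale estimate and the made-up data are all assumptions chosen for illustration. In R, the rlm function in MASS fits estimators of this general kind, though its defaults and algorithm details may differ from this sketch.

```python
from statistics import median


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def weighted_fit(rows, y, weights):
    """Weighted least squares; returns [intercept, b1, b2, ...]."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtwx = [[sum(w * d[i] * d[j] for d, w in zip(design, weights))
             for j in range(p)] for i in range(p)]
    xtwy = [sum(w * d[i] * yi for d, yi, w in zip(design, y, weights))
            for i in range(p)]
    return solve_linear_system(xtwx, xtwy)


def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def huber_fit(rows, y, k=1.345, max_iter=100, tol=1e-10):
    """Huber M-estimate via iteratively reweighted least squares (sketch)."""
    coef = weighted_fit(rows, y, [1.0] * len(y))  # start from ordinary least squares
    for _ in range(max_iter):
        residuals = [yi - predict(coef, r) for r, yi in zip(rows, y)]
        # Robust scale: median absolute residual, rescaled for normal errors.
        scale = median([abs(e) for e in residuals]) / 0.6745
        if scale == 0.0:
            break
        # Residuals within k * scale keep full weight; larger ones are downweighted.
        weights = [1.0 if abs(e) <= k * scale else k * scale / abs(e)
                   for e in residuals]
        new_coef = weighted_fit(rows, y, weights)
        change = max(abs(a - b) for a, b in zip(new_coef, coef))
        coef = new_coef
        if change < tol:
            break
    return coef


# Made-up data: y = 1 + 2*x1 - 3*x2 exactly, except for one gross outlier.
rows = [(x1, x2) for x1 in range(4) for x2 in range(3)]
true_coef = [1.0, 2.0, -3.0]
y = [predict(true_coef, r) for r in rows]
y[-1] += 30.0  # contaminate the last observation

ols_coef = weighted_fit(rows, y, [1.0] * len(y))
huber_coef = huber_fit(rows, y)
ols_error = sum(abs(a - b) for a, b in zip(ols_coef, true_coef))
huber_error = sum(abs(a - b) for a, b in zip(huber_coef, true_coef))
print("OLS coefficients:  ", ols_coef)
print("Huber coefficients:", huber_coef)
```

On this toy data the single outlier pulls the ordinary least squares coefficients well away from the generating values, while the Huber fit progressively downweights it. Real data with noise will not behave this cleanly.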
 
  • #3
statdad said:
Your plots show serious non-normality in the error structure, with (as you've noted) problems in the tails, and since least squares is incredibly non-robust you're correct to be concerned.
1) Have you noticed any strange behavior in your estimates (coefficients with signs opposite what you might expect)?
2) Have you tried a robust regression? The MASS package in R provides several good options.


Thank you!
I will try Robust regression.
 

FAQ: Multiple linear regression + QQplots problem Includes pics

1. What is multiple linear regression?

Multiple linear regression is a statistical method used to analyze the relationship between two or more independent variables and a single dependent variable. It is commonly used to predict the value of the dependent variable based on the values of the independent variables.
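
As a rough illustration of what fitting such a model involves, the sketch below fits y = b0 + b1*x1 + b2*x2 by ordinary least squares in plain Python. The data, the function names (fit_ols, solve_linear_system, predict) and the choice to solve the normal equations directly are assumptions made for this example; a statistics package would normally do this for you.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [intercept, b1, b2, ...] minimising the sum of squared residuals."""
    design = [[1.0] + list(r) for r in rows]  # prepend a column of ones
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


# Made-up data generated exactly from y = 1 + 2*x1 - 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (3, 5)]
y = [1 + 2 * x1 - 3 * x2 for x1, x2 in rows]

coef = fit_ols(rows, y)
residuals = [yi - predict(coef, r) for r, yi in zip(rows, y)]
print("coefficients:", coef)
```

The residuals computed on the last line are what a QQ plot would then be drawn from.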

2. How is multiple linear regression different from simple linear regression?

Simple linear regression involves only one independent variable, while multiple linear regression involves two or more. This allows the model to account for several influences at once and can give more accurate predictions when the additional variables carry relevant information.

3. What is the purpose of a QQ plot in multiple linear regression?

A QQ plot, or quantile-quantile plot, is used to assess the normality of the residuals in a multiple linear regression model. It plots the quantiles of the observed residuals against the corresponding quantiles of a theoretical normal distribution; systematic deviations from a straight line indicate non-normality in the residuals.
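
The coordinates behind a normal QQ plot can be computed by hand, which may help make the comparison concrete. The sketch below is one way to do it under stated assumptions: the residual values are invented, the function name qq_points is hypothetical, and the plotting positions (i - 0.5)/n are just one common convention (software packages often use slightly different ones).

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Return (theoretical quantile, standardised residual) pairs, sorted."""
    n = len(residuals)
    ordered = sorted(residuals)
    centre, spread = mean(ordered), stdev(ordered)
    standardised = [(r - centre) / spread for r in ordered]
    # Standard normal quantiles at plotting positions (i - 0.5) / n.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, standardised))


# Invented residuals with one value far out in the upper tail.
residuals = [-1.2, 0.3, 0.8, -0.5, 0.1, 1.9, -0.7, 0.4, -0.2, 6.5]
points = qq_points(residuals)
for theoretical, observed in points:
    print(f"{theoretical:7.3f}  {observed:7.3f}")
```

Plotting the second column against the first gives the QQ plot; residuals that follow a normal distribution would lie near the line observed = theoretical.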

4. How do I interpret the results of a QQ plot in multiple linear regression?

If the points on the QQ plot fall close to the straight line, the residuals are consistent with a normal distribution, which is the desirable outcome. If the points deviate systematically from the line, the residuals may not be normally distributed and further investigation is needed. Points that bend away from the line only at the ends, as in the thread above, indicate that the tails differ from those of a normal distribution.
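
To get a feel for what "close to the line" and "far from the line" look like numerically, the sketch below compares two artificial samples: one built from normal quantiles and one built from the quantiles of a heavy-tailed (Cauchy-shaped) distribution. The function name max_gap_from_line and the use of the largest vertical gap as a summary are assumptions for illustration only; this is not a formal normality test.

```python
import math
from statistics import NormalDist, mean, stdev


def max_gap_from_line(sample):
    """Largest vertical distance between the QQ points and the line y = x."""
    n = len(sample)
    ordered = sorted(sample)
    centre, spread = mean(ordered), stdev(ordered)
    gaps = []
    for i, value in enumerate(ordered, start=1):
        theoretical = NormalDist().inv_cdf((i - 0.5) / n)
        gaps.append(abs((value - centre) / spread - theoretical))
    return max(gaps)


n = 100
probs = [(i - 0.5) / n for i in range(1, n + 1)]

# Idealised samples: evenly spaced quantiles of each distribution.
normal_like = [NormalDist().inv_cdf(p) for p in probs]
heavy_tailed = [math.tan(math.pi * (p - 0.5)) for p in probs]  # Cauchy quantiles

normal_gap = max_gap_from_line(normal_like)
heavy_gap = max_gap_from_line(heavy_tailed)
print("normal-shaped sample, largest gap:", normal_gap)
print("heavy-tailed sample, largest gap: ", heavy_gap)
```

The normal-shaped sample stays very near the line, while the heavy-tailed one departs from it, most strongly at the extremes.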

5. Can QQ plots be used for other types of regression models?

Yes, QQ plots can be used to assess the normality of residuals in any regression model that assumes normally distributed errors, such as polynomial regression. For models that do not make that assumption, such as logistic regression, the raw residuals are not expected to be normal, so a normal QQ plot of them needs a different interpretation or a different type of residual.
