Bias in Linear Regression (y-intercept) vs Statistics

In summary, the term bias is used differently in simple regression for machine learning and in the statistical theory of estimators for population parameters. In the regression model Y = mx + b, the bias is the intercept b, while the bias of an estimator for a population parameter is defined as the difference between its expected value and the true value of the parameter. The two uses are only logically similar when X and Y are assumed to be proportional.
  • #1
WWGD
TL;DR Summary
Trying to reconcile two apparently (superficially) different usages of the term "bias"
Hi,
In simple regression for machine learning, a model

Y = mx + b

is said, AFAIK, to have bias equal to b. Is there a relation between this use of bias and the use of bias for estimators of population parameters, i.e., the bias of an estimator P^ for a population parameter P, defined as the difference E[P^] - P?

The two do not seem to coincide, as Y^ = m^x + b^ is an unbiased estimator of the population parameter Y. Can anyone explain the disparity?
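The distinction in the question can be checked numerically. The following is only a minimal sketch, assuming Gaussian noise, a fixed grid of x values, and made-up values m = 2 and b = 5 (none of these come from the thread). It compares the intercept b, which is the "bias" in the machine learning sense, with a Monte Carlo estimate of E[b^] - b, which is the bias of the least squares estimator in the statistical sense.

```python
import numpy as np

def intercept_estimator_bias(m=2.0, b=5.0, n=50, trials=2000, seed=0):
    """Monte Carlo estimate of E[b_hat] - b for the least squares intercept."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 10.0, n)                      # fixed design points (assumed)
    b_hats = np.empty(trials)
    for t in range(trials):
        y = m * x + b + rng.normal(0.0, 1.0, size=n)   # true line plus Gaussian noise
        m_hat, b_hat = np.polyfit(x, y, 1)             # degree-1 fit returns slope, intercept
        b_hats[t] = b_hat
    return b_hats.mean() - b

ml_bias = 5.0                                          # the intercept b: "bias" in the ML sense
stat_bias = intercept_estimator_bias(b=ml_bias)        # bias of the estimator b_hat
print(f"ML 'bias' term b       : {ml_bias}")
print(f"Estimator bias E[b^]-b : {stat_bias:.4f}")
```

Under these assumptions the second number should be close to zero while the first is not, which is the disparity being asked about: b is a parameter of the model, whereas E[b^] - b is a property of the procedure used to estimate it.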
 
  • #2
Words have more than one meaning. I have never seen bias used with the first meaning, so that appears to be a specialized field of study just “hijacking” terminology from other fields of study. It happens often. I am afraid there is not much justification needed or provided for that type of thing.
 
  • #3
I think that the two uses are only logically similar in the context of a model where X and Y are known or assumed to be proportional (Y = mx). In that case, b would be a bias due to something.
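One way to illustrate this point is a small simulation. It is a sketch under assumed values (slope 2, offset 3, Gaussian noise; all hypothetical). If theory says Y = mx, then the predictor mx misses Y by a constant amount on average, and fitting Y = mx + b recovers that systematic error as the intercept.

```python
import numpy as np

rng = np.random.default_rng(1)
m_true, offset = 2.0, 3.0                      # hypothetical slope and systematic offset
x = rng.uniform(0.0, 10.0, size=5000)
y = m_true * x + offset + rng.normal(0.0, 1.0, size=x.size)

# If the assumed relation is Y = m x, the predictor m*x is off by `offset` on average.
mean_error = np.mean(y - m_true * x)

# Fitting Y = m x + b picks up that systematic error as the intercept.
m_hat, b_hat = np.polyfit(x, y, 1)
print(f"mean of Y - m x : {mean_error:.3f}")
print(f"fitted intercept: {b_hat:.3f}")
```

In this setting both numbers should land near the assumed offset, so b plays the role of E[Y - mx], which has the same "expected value minus reference value" form as the bias of an estimator.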
 

FAQ: Bias in Linear Regression (y-intercept) vs Statistics

What is the difference between bias in linear regression and statistics?

Bias in linear regression can mean two things. In machine learning, it usually names the intercept term b of the model Y = mx + b. It can also refer to systematic error in the estimated coefficients, which can occur when the model is misspecified, for example when a relevant variable that is correlated with the included independent variables is left out. Bias in statistics refers to the tendency of an estimator to consistently overestimate or underestimate a population parameter, measured as the difference between the estimator's expected value and the true value.

How does bias in linear regression affect the accuracy of the model?

Bias in linear regression can lead to inaccurate predictions and incorrect conclusions about the relationship between the independent and dependent variables. It can also result in biased estimates of the model parameters, leading to incorrect interpretations of the data.

Can bias in linear regression be eliminated?

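Not always. Under the standard assumptions (a correctly specified linear model whose errors have zero mean given the independent variables), ordinary least squares estimates of the coefficients are unbiased. In practice, misspecification and omitted variables can introduce bias that is hard to rule out. Regularization does not remove bias: it deliberately accepts some bias in the coefficients in exchange for lower variance. Cross-validation helps choose among models by estimating prediction error, but it does not by itself remove bias. It is important to carefully assess and address potential sources of bias in order to improve the accuracy of the model.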

How is the y-intercept related to bias in linear regression?

The y-intercept, also known as the intercept or constant term, is the point where the regression line crosses the y-axis. It represents the predicted value of the dependent variable when all independent variables are equal to zero. In machine learning, this term b is itself called the bias of the model. Like any estimated coefficient, its estimate can also be biased in the statistical sense, which can lead to incorrect interpretations of the relationship between the variables.

How can I detect bias in linear regression?

There are several ways to detect bias caused by model misspecification in linear regression, such as examining the residuals, conducting hypothesis tests, and using diagnostic plots. Residuals are the differences between the actual and predicted values of the dependent variable; in a well-fitted model they should scatter around zero with no systematic pattern. Hypothesis tests, such as the F-test and t-test, assess whether coefficients differ significantly from zero, which can help indicate whether a relevant term has been left out. Diagnostic plots, such as the residual plot and QQ plot, can show visually whether there are patterns or outliers that may indicate a problem; the QQ plot in particular checks whether the residuals are approximately normal.
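As a rough illustration of the residual check, the sketch below fits a straight line to data that was generated with a curved term (the data, coefficients, and noise level are all made up for the example). It shows why the mean of the residuals is not informative on its own and why a leftover pattern is.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(-3.0, 3.0, 200)
# Hypothetical data with a curved term that a straight line cannot capture
y = 1.0 + 0.5 * x + 0.8 * x**2 + rng.normal(0.0, 0.5, size=x.size)

slope, intercept = np.polyfit(x, y, 1)         # misspecified straight-line fit
residuals = y - (slope * x + intercept)

# With an intercept in the model, least squares residuals average to zero
# by construction, so the mean alone cannot reveal misspecification.
print(f"mean residual: {residuals.mean():.2e}")

# A leftover pattern can: here the residuals track the omitted x**2 term.
pattern = np.corrcoef(residuals, x**2)[0, 1]
print(f"corr(residuals, x^2): {pattern:.2f}")
```

Plotting `residuals` against `x` would show the same thing visually as a U-shaped curve. The correlation with x**2 is used here only because the omitted term is known in this toy setup; with real data the pattern has to be looked for in the plots.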
