Bivariate regression, dummy variables and r^2

  • #1
Richard_R
Hi all.

I am trying to find out if it's possible to calculate an r^2 value (% of variation explained) when performing a linear bivariate regression using dummy variables.

Let me provide an example of what I'm working on. I'm trying to find if a correlation exists between the type of house people live in and the amount of electricity they use, so my house types are:

Detached, semi-detached, terraced and flats.

I have monitoring data from several hundred houses and have worked out the average for each house type - let's suppose in terms of average kWh per year we have:

Detached - 4000 kWh/yr
Semi-detached - 3600 kWh/yr
Terraced - 3300 kWh/yr
Flats - 3200 kWh/yr

I've done ANOVA on the groups and found that there is a statistically significant difference between at least two of them. My issue now is: how do I put these into a single equation that gives me some kind of r^2 value so that I can show how much variability is explained by the house type?

Is this even possible? How do I go about this?

Any advice greatly appreciated.
 
  • #2
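Yes, it is possible to calculate an r^2 value for a regression model that uses dummy variables. House type has four categories, so you code it with three 0/1 dummy variables (one fewer than the number of categories); the omitted category, say detached houses, is the reference. Then regress kWh/yr on the three dummies using the individual house readings, not the four group averages. The intercept is the mean for detached houses, and the coefficient for each dummy variable tells you the difference in mean kWh/yr between that house type and the reference category. The R-squared value from this model tells you how much of the variation in kWh/yr is explained by the house type. It is the same number as SS_between / SS_total from your ANOVA table (often called eta-squared), so you can also read it off the ANOVA you have already run. Strictly speaking, with three dummies this is a multiple regression rather than a bivariate one.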
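As a rough sketch of the dummy-variable regression described above, the following codes house type with three dummies (detached as the reference) and computes R-squared two ways. The thread only gives group averages, so the individual readings here are simulated around those averages; the sample size per group (`n_per_group`) and the within-group spread (`noise_sd`) are assumptions made purely for illustration, not values from the original data.

```python
import numpy as np

# Hypothetical data: individual readings are simulated around the group
# averages quoted in the question. n_per_group and noise_sd are assumed.
rng = np.random.default_rng(0)
group_means = {"detached": 4000, "semi": 3600, "terraced": 3300, "flat": 3200}
n_per_group = 50   # assumed number of houses per type
noise_sd = 800     # assumed within-group spread in kWh/yr

labels = np.repeat(list(group_means), n_per_group)
y = np.concatenate(
    [rng.normal(m, noise_sd, n_per_group) for m in group_means.values()]
)

# Design matrix: intercept plus k-1 = 3 dummies; "detached" is the reference.
levels = ["semi", "terraced", "flat"]
X = np.column_stack(
    [np.ones(len(y))] + [(labels == lv).astype(float) for lv in levels]
)

# Ordinary least squares fit.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta

# R-squared from the regression: 1 - SS_res / SS_tot.
ss_res = np.sum((y - fitted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# The same quantity from the ANOVA decomposition: SS_between / SS_total.
ss_between = sum(
    np.sum(labels == g) * (y[labels == g].mean() - y.mean()) ** 2
    for g in group_means
)
eta_squared = ss_between / ss_tot

print("intercept (detached mean):", beta[0])
print("dummy coefficients (vs detached):", dict(zip(levels, beta[1:])))
print("R-squared:", r_squared, "eta-squared:", eta_squared)
```

With real monitoring data you would replace the simulated `labels` and `y` with the recorded house type and annual kWh for each house; the two R-squared calculations should still agree.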
 

Related to Bivariate regression, dummy variables and r^2

1. What is bivariate regression?

Bivariate regression is a statistical method used to analyze the relationship between two variables. It helps to determine how one variable (known as the independent variable) affects or predicts the other variable (known as the dependent variable).

2. What are dummy variables in regression?

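Dummy variables are 0/1 indicator variables that represent the categories of a categorical variable: 1 if an observation belongs to the category, 0 otherwise. They are used in regression analysis to include qualitative information in the model, such as gender, race, or geographical location. A variable with k categories needs k-1 dummies when the model has an intercept; the omitted category serves as the reference.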

3. How do I interpret the coefficient of a dummy variable in regression?

The coefficient of a dummy variable in regression represents the difference in the mean of the dependent variable between the category coded 1 and the reference category coded 0, holding any other predictors constant. For example, if the coefficient for a dummy variable representing gender is 0.5, it means that the average value of the dependent variable for the gender coded 1 is 0.5 units higher than for the gender coded 0.
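To make the interpretation concrete, here is a minimal sketch with a single 0/1 dummy and made-up numbers (the values in `d` and `y` are hypothetical, chosen so the group means differ by 0.5). The fitted slope should equal the difference between the two group means, and the intercept should equal the mean of the group coded 0.

```python
import numpy as np

# Hypothetical data: d is the dummy (0 = reference group, 1 = other group).
d = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
y = np.array([10.0, 12.0, 11.0, 11.0, 11.5, 11.0, 12.0, 11.5])

# Straight-line least squares fit; polyfit returns [slope, intercept].
slope, intercept = np.polyfit(d, y, 1)

mean0 = y[d == 0].mean()  # mean of the reference group
mean1 = y[d == 1].mean()  # mean of the group coded 1

print("slope:", slope, "difference in means:", mean1 - mean0)
print("intercept:", intercept, "reference mean:", mean0)
```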

4. What is the significance of r^2 in regression?

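r^2, also known as the coefficient of determination, measures the proportion of variation in the dependent variable that is explained by the independent variable(s) in the regression model. For a least-squares fit with an intercept it ranges from 0 to 1, where 0 means the model explains none of the variation and 1 means it fits the data perfectly. In a regression with a single independent variable, it equals the square of the correlation coefficient between the two variables.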

5. Can r^2 be negative in regression?

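Not for an ordinary least-squares model that includes an intercept and is evaluated on the data used to fit it: there r^2 is always between 0 and 1. When r^2 is computed as 1 - SS_res/SS_tot, however, it can come out negative if the model has no intercept or is scored on new data. A negative value indicates that the model is worse at predicting the dependent variable than simply using its mean, i.e. using no independent variables at all.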
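Whether r^2 can go negative depends on the model and on how r^2 is computed. The sketch below uses made-up numbers and the definition 1 - SS_res/SS_tot (the helper name `r2` is hypothetical). It fits the same data twice: once with an intercept, and once forced through the origin, where the fit is poor enough that r^2 falls below zero.

```python
import numpy as np

def r2(y, fitted):
    """Coefficient of determination computed as 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

# Hypothetical data: y falls as x rises, with a large offset from zero.
x = np.array([4.0, 3.0, 2.0, 1.0])
y = np.array([10.2, 10.9, 12.1, 12.8])

# Model with an intercept: columns are [1, x].
X_with = np.column_stack([np.ones_like(x), x])
beta_with, *_ = np.linalg.lstsq(X_with, y, rcond=None)
r2_with_intercept = r2(y, X_with @ beta_with)

# Model forced through the origin: single column [x], no intercept.
X_without = x.reshape(-1, 1)
beta_without, *_ = np.linalg.lstsq(X_without, y, rcond=None)
r2_no_intercept = r2(y, X_without @ beta_without)

print("with intercept:", r2_with_intercept)
print("without intercept:", r2_no_intercept)
```

The no-intercept line has to pass through the origin, so it slopes upward while the data slope downward; its residuals are larger than those of a flat line at the mean, which is exactly what a negative value signals.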
