Tom McCurdy
A problem came up in class where we had the quantity sold in one column and the promotion level in another. The promotion level took values between 0 and 0.88, with a number of the values being exactly zero.
The problem was to test the following hypothesis:
The average quantity sold by Bob when there is a promotion (p > 0) is significantly higher than when there is no promotion.
My professor claimed the problem should be solved using regression. He used the promotion value as the independent variable and the quantity sold as the dependent variable. His plan was then to use the regression output in Excel to test whether or not the slope was equal to zero.
The biggest mistake I see right away is that the promotion values would first need to be collapsed into two discrete groups: 1) promotion and 2) no promotion. The second problem I see is: what arbitrary value do you then assign to group 1, the promotion group?
It's been a while since I took a statistics class, but from what I remember, when you do hypothesis testing for two sample means you set up:
[tex] H_0 : \mu_{promo} = \mu_{no-promo} [/tex]
[tex] H_1 : \mu_{promo} > \mu_{no-promo} [/tex]
The sample sizes are different:
the promo category had 201 samples
the no-promo category had 17 samples
Next you would need to decide whether the population variances could be considered equal.
If they were equal you would test the means one way (a pooled-variance test), and if they weren't you would have to test the means another way (an unequal-variance test), which was a substantial amount of work.
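For reference, here are the two statistics as I recall them from the standard textbook treatment. The subscripts 1 and 2 are just my labels for the promo and no-promo groups, and s denotes a sample standard deviation. With equal variances assumed:
[tex] t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \quad s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} [/tex]
which has n_1 + n_2 - 2 = 201 + 17 - 2 = 216 degrees of freedom. Without assuming equal variances:
[tex] t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} [/tex]
where the degrees of freedom have to be approximated (the Welch-Satterthwaite formula), which is the part that takes the extra work.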
Is my professor right? Can you just bypass all of this by putting the numbers into two categories, running a linear regression, and checking the p-value for the slope?
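To make the question concrete, here is a minimal sketch of the comparison I have in mind. The data are synthetic (made-up normal draws, not the class data), and the function names `pooled_t` and `slope_t` are my own. It computes the pooled-variance t statistic directly, then the slope t statistic from a simple regression of quantity on a 0/1 promotion indicator, and also on a 0/5 indicator to see whether the arbitrary value given to the promotion group matters.

```python
import math
import random


def pooled_t(a, b):
    """Equal-variance two-sample t statistic for mean(a) - mean(b)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((v - ma) ** 2 for v in a)
    ssb = sum((v - mb) ** 2 for v in b)
    sp2 = (ssa + ssb) / (na + nb - 2)  # pooled variance
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))


def slope_t(x, y):
    """t statistic for the slope in a simple least-squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return slope / std_err


# Synthetic stand-in data: only the group sizes (201 and 17) come from the problem.
random.seed(0)
promo = [random.gauss(55, 10) for _ in range(201)]
no_promo = [random.gauss(50, 10) for _ in range(17)]

quantity = promo + no_promo
indicator_01 = [1.0] * len(promo) + [0.0] * len(no_promo)
indicator_05 = [5.0] * len(promo) + [0.0] * len(no_promo)

t_pooled = pooled_t(promo, no_promo)
t_reg_01 = slope_t(indicator_01, quantity)
t_reg_05 = slope_t(indicator_05, quantity)

print("pooled two-sample t:      ", t_pooled)
print("regression t (0/1 coding):", t_reg_01)
print("regression t (0/5 coding):", t_reg_05)
```

The three statistics agree up to floating-point rounding: with a two-valued indicator, the slope is the difference in group means and its standard error reduces to the pooled formula, so the value assigned to the promotion group only rescales the slope. Two caveats as far as I can tell: this matches the equal-variance test only, not the unequal-variance one, and Excel's regression output reports a two-sided p-value, so it would need halving (when the slope is positive) for the one-sided alternative. Regressing on the raw promotion values between 0 and 0.88 instead tests for a linear trend in the promotion level, which is a different hypothesis.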