- #1
ezfzx
OK, so, I've forgotten more statistics than my students will ever know, and I'm not too proud to ask for help, because I'm just blanking out on this. I would appreciate it if someone could patiently follow along and let me know what I've got right or wrong please.
My understanding of the chi-sqr is this:
I have two data sets: "observed" (O) and "expected" (E).
In this case, I'm comparing "O" data, which is a subset of a much larger "E" population.
I'm going to make this assumption, which I will call the "null" hypothesis: If the "O" data was a clean unbiased sub-sample from "E", the distribution in the "O" data set should match the "E" distribution. Yes?
For example, if I'm grabbing a sample of people from the general population, my sample should have the same percentage of each age group, racial group, education level, etc. as the general population. If my sample DOES NOT have this distribution, there may be a bias somewhere.
So, I take the difference between each O & E item pair, square it, divide by the expected (E) value, Σ up all the results and we have the chi-sqr (χ²) value. Use this and the degrees of freedom (D) to get a p-value.
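To make that arithmetic concrete, here is a minimal sketch in Python. The counts and proportions are made up for illustration, and the function name `chi_square_statistic` is hypothetical. It assumes a plain goodness-of-fit setup where "E" is given as population proportions and gets scaled to the sample size, and where D is the number of categories minus one.

```python
def chi_square_statistic(observed, expected_proportions):
    """Goodness-of-fit chi-square for observed counts vs. population proportions."""
    n = sum(observed)
    # Assumption: expected counts must be on the same scale as the sample,
    # so the population proportions are multiplied by the sample size.
    expected = [p * n for p in expected_proportions]
    # Sum of (O - E)^2 / E over every category.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Assumption: no parameters are estimated from the data, so D = categories - 1.
    dof = len(observed) - 1
    return chi2, dof


# Hypothetical data: sample counts in four age groups, and the share of
# each group in the general population.
observed = [18, 22, 30, 30]
population = [0.25, 0.25, 0.25, 0.25]

chi2, dof = chi_square_statistic(observed, population)
print(f"chi-square = {chi2:.2f}, degrees of freedom = {dof}")
```

If "E" were left as raw population counts instead of being scaled to the sample size, the differences would be dominated by the size mismatch rather than by the shape of the distribution, which is why the sketch works from proportions.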
The alpha level (α) is the traditionally acceptable cutoff point in doctoral research, which we set to 0.05 (or 5%).
We can say this: "If the null hypothesis is true, the probability of randomly drawing a sample that produces a chi-sqr value of χ² or larger with D degrees of freedom is p-value."
We compare the p-value to the α.
IF p-value > α, THEN we can claim NO statistically significant difference between "O" and "E".
IF p-value < α, THEN we CAN claim a statistically significant difference between "O" and "E".
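The comparison above can be sketched as follows. This is only an illustration under stated assumptions: the helper names `chi2_sf` and `decide` are hypothetical, the p-value is taken as the upper-tail probability of the chi-square distribution, and it is computed with a simple power series so the example needs nothing beyond the standard library. The series is fine for modest χ² values; a statistics package would be the safer choice for real work.

```python
import math


def chi2_sf(x, dof):
    """Upper-tail probability P(X >= x) for a chi-square with dof degrees of freedom."""
    if x <= 0:
        return 1.0
    a, half_x = dof / 2.0, x / 2.0
    # Power series for the regularized lower incomplete gamma function P(a, x/2).
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= half_x / (a + n)
        total += term
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def decide(chi2, dof, alpha=0.05):
    """Return the p-value and the verdict of the p-vs-alpha comparison."""
    p = chi2_sf(chi2, dof)
    verdict = "significant difference" if p < alpha else "no significant difference"
    return p, verdict


# Hypothetical inputs: a chi-square of 4.32 with 3 degrees of freedom.
p_value, verdict = decide(4.32, 3)
print(f"p = {p_value:.3f}: {verdict}")
```

One caveat worth keeping in mind: "no significant difference" means the test did not find evidence against the null hypothesis, which is weaker than showing that "O" and "E" actually match.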
So, here's MY "simple" interpretation problem.
If I want to ensure that I'm getting a clean unbiased sub-sample from the general population, I should WANT to see a strong correlation between "O" and "E".
Does that mean I WANT p-value > α?
Or maybe I'm misusing the chi-square?
Seems to me I do want p-value > α, but I got kinda turned around in my own box, so a confirmation or correction would be GREAT.
The reason I'm questioning myself is that I'm seeing a distribution in "O" that looks fairly different than "E", but I'm getting a p-value around 0.96, like the math's sayin' "Don't worry, dude. It's cool." ... but it doesn't LOOK cool. So how do I explain to someone else that what looks "off" is really "cool" when I'm not convinced myself? My faith in math is shaken!
Please phrase all answers using small words and talk to me like I'm stupid, because I haven't had my coffee yet. :)