Comparing Proportions: Choosing the Right Statistical Test for Your Data

This thread discusses which statistical test to use for comparing the proportions of car colors in a school parking lot with the North American average. The suggested test is a chi-squared goodness-of-fit test, which compares observed frequencies with expected frequencies. The degrees of freedom equal the number of car colors minus one, and the expected frequencies should stay fractional rather than being rounded to whole numbers.
  • #1
Chas3down
Sorry, no subforum for statistics so I posted it here..

Homework Statement



So, if I have a given list of proportions

n = 64
.11
.14
.16
.14
.13
.16
.16

and I want to compare it to another group of percentages

n = North American Average
.18
.19
.20
.18
.14
.06
.05

What type of test would I use?
 
  • #2
Chas3down said:
So, if I have a given list of proportions and I want to compare it to another group of percentages, what type of test would I use?
[I condensed your question for clarity and to save space]

What kind of comparison are you looking for? I'm thinking it's just a two-sample z-test for proportions.
 
  • #3
Mandelbroth said:
[I condensed your question for clarity and to save space]

What kind of comparison are you looking for? I'm thinking it's just a two-sample z-test for proportions.

Just to see if there is a significant difference between the proportions or not.

I am comparing car colors from my school against the National Average. The first group (n=64) is 64 car colors I got from my school parking lot. The second group of proportions, is that of the North American Average.
 
  • #4
Try a goodness-of-fit chi-square test.
 
  • #5
Chas3down said:
Just to see if there is a significant difference between the proportions or not.

I am comparing car colors from my school against the National Average. The first group (n=64) is 64 car colors I got from my school parking lot. The second group of proportions, is that of the North American Average.

Validity, etc., depends on the nature of your data.

For instance, how do you condense information about car colors into a proportion? Typically, a proportion like 0.11 could be thought of as the number of 'yes' vs. 'no' answers to some type of question. How does a car's color fit into that type of scheme? The point is that if the numbers you show represent some type of highly 'massaged' figures, their distribution may be an artifact of your data-aggregation method and not reflective of reality. So: show us how you obtained those figures.
 
  • #6
Well, I collected 64 car color samples (Chose 64 so it is a large enough number so I could generalize for my entire student parking lot) then had for example 10 black cars and did 10/64 to get the proportion of students who drive black color cars to school on that day.
 
  • #7
Chas3down said:
Well, I collected 64 car color samples (Chose 64 so it is a large enough number so I could generalize for my entire student parking lot) then had for example 10 black cars and did 10/64 to get the proportion of students who drive black color cars to school on that day.
How formal do you want your test to be? If you really want a formal test, state between what wavelengths of light you consider to be blue, etc. There are other things to consider, but that might be better if you want a boolean ("yes/no") answer. For example, "red" can be very subjective. You could argue that all pink cars are red.

Mandelbroth said:
[I condensed your question for clarity and to save space]

What kind of comparison are you looking for? I'm thinking it's just a two-sample z-test for proportions.
Excuse my haste. You can use a two-sample z-test, but it would be weird since you are working with more than one proportion. Use a chi-squared based test for goodness of fit. That should work better.
 
  • #8
Mandelbroth said:
How formal do you want your test to be? If you really want a formal test, state between what wavelengths of light you consider to be blue, etc. There are other things to consider, but that might be better if you want a boolean ("yes/no") answer. For example, "red" can be very subjective. You could argue that all pink cars are red. Excuse my haste. You can use a two-sample z-test, but it would be weird since you are working with more than one proportion. Use a chi-squared based test for goodness of fit. That should work better.
It is pretty informal.

Okay, yeah, I was thinking of using a chi-squared test, thanks for your thoughts.
 
  • #9
hmm.. quick addition..

Would I do a chi squared GOF for proportions the same as I would for non-proportions?

just a summation of ((Observed proportion - Expected proportion)^2 / Expected Proportion) to get my chi squared value? And my degrees of freedom would just be number of cars i used to get my data -1 ?
 
  • #10
Not quite.

The summation should be of ((Observed frequency - Expected frequency)^2 / Expected frequency).
You converted the frequencies to proportions, but you really need the frequencies.

The degrees of freedom is the number of colors - 1.
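
A minimal sketch of that calculation in Python, using the proportions from post #1. The raw tallies are not given in the thread, so the observed counts here are an assumption: they are rebuilt as proportion × 64 and come out fractional. With the real data you would use the whole-number tallies instead. The critical value is the usual table entry for 6 degrees of freedom at the 5% level.

```python
# Proportions quoted in post #1 (already rounded to two decimals there).
observed_props = [0.11, 0.14, 0.16, 0.14, 0.13, 0.16, 0.16]
expected_props = [0.18, 0.19, 0.20, 0.18, 0.14, 0.06, 0.05]
n = 64  # cars sampled in the school lot

# Assumption: raw tallies are not in the thread, so observed counts are
# approximated as proportion * n. With real data, use the integer tallies.
observed = [p * n for p in observed_props]
expected = [p * n for p in expected_props]

# Sum of (observed frequency - expected frequency)^2 / expected frequency.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # number of colors minus one

# Upper 5% critical value of the chi-squared distribution with 6 degrees
# of freedom, taken from a standard table.
CRITICAL_0_05_DF6 = 12.592

print(f"chi2 = {chi2:.2f}, df = {df}")
if chi2 > CRITICAL_0_05_DF6:
    print("reject H0 at the 5% level")
else:
    print("fail to reject H0 at the 5% level")
```

Note that two of the expected counts (3.84 and 3.2) fall below 5, the common rule of thumb for the chi-squared approximation, so the result deserves some caution.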
 
  • #11
I like Serena said:
Not quite.

The summation should be of ((Observed frequency - Expected frequency)^2 / Expected frequency).
You converted the frequencies to proportions, but you really need the frequencies.

The degrees of freedom is the number of colors - 1.

Oh gotcha, so I should convert the average car color proportions to frequencies out of 64?
 
  • #12
Yep.
 
  • #13
I like Serena said:
Yep.

Alright thanks, really a big help.. one final question, I should round all my expected car colors to whole numbers correct? Because you can't have 5.3 cars..
 
  • #14
No. The expected cars should remain fractional.
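
A small sketch of why, using the North American proportions from post #1 and n = 64 (the variable names are just for illustration). The expected counts are long-run averages, not literal cars, and rounding them changes the totals.

```python
expected_props = [0.18, 0.19, 0.20, 0.18, 0.14, 0.06, 0.05]
n = 64  # cars sampled in the school lot

expected = [p * n for p in expected_props]  # fractional expected counts
rounded = [round(e) for e in expected]      # what rounding would give

print([f"{e:.2f}" for e in expected])
print(rounded)
# The fractional counts add up to n; the rounded ones do not.
print(f"{sum(expected):.2f}", sum(rounded))
```

The fractional counts add up to 64, while the rounded ones add up to 65, so rounding would shift every term of the chi-squared sum a little.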
 
  • #15
I like Serena said:
No. The expected cars should remain fractional.

Okay, only reason i thought it would be the other way was because my ti-84 would only accept whole numbers or else it would error out... time to do it by hand.

Really a big help, thanks a lot man.
 
  • #16
Huh? I'd expect your TI-84 to require whole numbers for the observed frequencies, which should indeed be whole, but not for the expected frequencies.
 
  • #17
/facepalm.. thanks put it in wrong lol.
 
  • #18
:smile:
 

FAQ: Comparing Proportions: Choosing the Right Statistical Test for Your Data

What is the purpose of comparing proportions?

The purpose of comparing proportions is to determine if there is a significant difference between two or more groups or populations in terms of a specific characteristic or outcome. This can help researchers make informed decisions and draw conclusions about their data.

How do you choose the right statistical test for comparing proportions?

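The choice of statistical test depends on the number of groups or categories being compared, whether the comparison is against a known reference distribution or against another sample, and the research question being addressed. Commonly used tests for comparing proportions include the chi-square test (goodness of fit or independence), the z-test for one or two proportions, and Fisher's exact test for small samples. The t-test compares means of continuous data and is not a test for proportions.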

What is the difference between a one-tailed and two-tailed test when comparing proportions?

In a one-tailed test, the researcher is only interested in determining if the proportion in one group is higher or lower than the proportion in another group. In a two-tailed test, the researcher is interested in determining if there is any difference in proportions between the two groups, regardless of direction.

Can you compare proportions using non-parametric tests?

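Yes. The chi-square test is itself usually classed as non-parametric, and exact tests such as Fisher's exact test or the exact binomial test can be used when the sample or the expected counts are too small for the chi-square approximation. Rank-based tests such as the Mann-Whitney U test or the Wilcoxon signed-rank test are meant for ordinal or continuous data rather than for proportions.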

What are some limitations of comparing proportions?

One limitation of comparing proportions is that it only allows for the analysis of one variable at a time. Additionally, the results may be affected by sample size and the choice of statistical test. It is important to carefully consider these factors when interpreting the results of a proportion comparison.
