Question about Chi-Square Test Regarding Normal Distribution

In summary, the thread discusses grouping raw data into a frequency table to obtain the observed frequency for each interval in a chi-square goodness-of-fit test. Different groupings can lead to different conclusions because the test is sensitive to bin size: the grouping changes both the level of detail and the degrees of freedom. The Shapiro-Wilk test is suggested as an alternative for checking normality, and the problem of low statistics (only 50 observations) is raised.
  • #1
songoku
TL;DR Summary
Let's say I have raw data on the heights of 50 students. I want to do a goodness-of-fit test to check whether a normal distribution is an appropriate model for the data at a certain significance level.
The first step is to group the data and make a table so I can get the observed frequency for each data interval. I tried two different groupings (something like 150 - 160, 160 - 170, etc., and the other 150 - 170, 170 - 190, etc.) and found that the conclusions of the hypothesis test differ: one grouping leads to not rejecting the null hypothesis and the other to rejecting it.

Is it possible for different groupings to result in different conclusions? Or must there be a mistake in my working?

Thanks
 
  • #3
You suffer from low statistics -- 50 events isn't much to confirm a distribution.
 
  • #4
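In practice, since the height of individuals obviously has finite variance, you can rely on the CLT with n = 50 to treat the sample mean as approximately normal. Note that the CLT says nothing about whether the individual heights themselves are normally distributed.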
 
  • #5
It certainly is possible to get different results. Your first grouping shows more detail than your second. It also has about twice as many bins and therefore more degrees of freedom, so the two test statistics are compared against different Chi-Squared distributions.
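To make the effect of the grouping concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the function name `chi_square_normal_gof`, the synthetic heights (not the original poster's data), and the two sets of cut points. The mean and standard deviation are estimated from the sample, so the degrees of freedom are taken as the number of bins minus 3.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def chi_square_normal_gof(data, cuts):
    """Chi-square statistic and degrees of freedom for a fitted normal.

    `cuts` are the interior bin boundaries; the two outer bins are
    open-ended so the expected counts add up to len(data).
    """
    n = len(data)
    # Normal model with parameters estimated from the sample itself.
    fitted = NormalDist(mean(data), stdev(data))
    edges = [-math.inf, *cuts, math.inf]
    stat = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        observed = sum(lo <= x < hi for x in data)
        expected = n * (fitted.cdf(hi) - fitted.cdf(lo))
        stat += (observed - expected) ** 2 / expected
    # Number of bins, minus 1, minus 2 for the estimated mean and std dev.
    dof = (len(edges) - 1) - 1 - 2
    return stat, dof


# Synthetic stand-in for the 50 heights (hypothetical, seeded for repeatability).
rng = random.Random(0)
heights = [rng.gauss(170, 8) for _ in range(50)]

# Two hypothetical groupings of the same data: 5 cm bins and 10 cm bins.
for cuts in ([155, 160, 165, 170, 175, 180, 185], [160, 170, 180]):
    stat, dof = chi_square_normal_gof(heights, cuts)
    print(f"{len(cuts) + 1} bins: chi2 = {stat:.2f}, dof = {dof}")
```

The same 50 values give two different statistics and two different reference distributions (5 and 1 degrees of freedom here), so the two groupings can land on opposite sides of the critical value. The sketch does not check the size of the expected counts, which matters with only 50 observations and narrow bins.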
 
  • #6
Thank you very much for the help and explanations, BWV, BvU, and FactChecker.
 

FAQ: Question about Chi-Square Test Regarding Normal Distribution

What is a Chi-Square Test?

A Chi-Square Test is a statistical test used to determine if there is a significant difference between the observed and expected frequencies of categorical data. It is often used to analyze data that is collected from different groups or categories.

How is a Chi-Square Test used in relation to Normal Distribution?

In a Chi-Square goodness-of-fit test, the data are grouped into intervals, and the observed count in each interval is compared with the count expected under a theoretical distribution, such as the Normal Distribution. This allows for the determination of whether the observed data follow a similar pattern to the expected distribution, or if there are significant differences between the two.
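Written out, the comparison takes the following form. The notation is an assumption introduced here, not taken from the thread: k intervals with lower and upper limits a_i and b_i, observed counts O_i, expected counts E_i, sample size n, sample mean x-bar, sample standard deviation s, and the standard normal CDF Phi.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad
E_i = n \left[ \Phi\!\left(\frac{b_i - \bar{x}}{s}\right)
             - \Phi\!\left(\frac{a_i - \bar{x}}{s}\right) \right]
```

When the mean and standard deviation are estimated from the data, the statistic is usually compared with a Chi-Square distribution with k - 1 - 2 = k - 3 degrees of freedom, which is why changing the number of intervals changes the reference distribution.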

What is the purpose of using a Chi-Square Test with Normal Distribution?

The purpose of using a Chi-Square Test with Normal Distribution is to determine whether the data are consistent with a normal distribution. This can be useful in deciding which statistical analysis is appropriate for the data, since many common methods assume normality.

What are the assumptions of using a Chi-Square Test with Normal Distribution?

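The assumptions of using a Chi-Square Test with Normal Distribution include: independent observations, a large enough sample size (a commonly quoted minimum is 20 observations), and expected frequencies of at least 5 in each category. Normality itself is not an assumption; it is the null hypothesis being tested. If the mean and standard deviation are estimated from the data, the degrees of freedom are reduced by one for each estimated parameter.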
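One common way to satisfy the expected-frequency condition is to pool neighbouring intervals. The sketch below is a minimal illustration of that idea, not a standard library routine: the function name `merge_small_bins` and the observed and expected counts are made up for the example.

```python
def merge_small_bins(observed, expected, minimum=5.0):
    """Pool adjacent bins until every expected count is at least `minimum`."""
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= minimum:
            # The running pool is large enough: close it off as one bin.
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0, 0.0
    # Anything left over at the upper end is folded into the last closed bin
    # (or kept as a single bin if no bin was ever closed).
    if acc_exp > 0 or acc_obs > 0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


# Made-up counts for 8 intervals and n = 50 (illustrative only).
observed = [1, 3, 9, 14, 13, 7, 2, 1]
expected = [1.5, 3.8, 8.1, 11.6, 11.6, 8.1, 3.8, 1.5]

merged_obs, merged_exp = merge_small_bins(observed, expected)
print(merged_obs)
print([round(e, 1) for e in merged_exp])
```

Pooling reduces the number of categories, so the degrees of freedom must be recomputed from the merged table rather than from the original one.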

How can I interpret the results of a Chi-Square Test with Normal Distribution?

The results of a Chi-Square Test with Normal Distribution will provide a test statistic and a p-value. The p-value is the probability of obtaining a test statistic at least as large as the one observed if the null hypothesis (that the data come from the assumed distribution) is true. If the p-value is less than the chosen significance level (usually 0.05), the null hypothesis is rejected and it can be concluded that there is a significant difference between the observed and expected frequencies. Otherwise the null hypothesis is not rejected, which does not prove that the data are normal.
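As a rough illustration of how a p-value follows from the statistic and the degrees of freedom, the sketch below computes the upper-tail Chi-Square probability with the standard library only, using the closed forms for 1 and 2 degrees of freedom and a two-step recurrence. The function name `chi2_sf` and the example statistic are assumptions for this example; in practice a library routine such as `scipy.stats.chi2.sf` would normally be used instead.

```python
import math


def chi2_sf(x, dof):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer dof."""
    if dof < 1:
        raise ValueError("dof must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for dof = 1 (odd case) and dof = 2 (even case).
    if dof % 2:
        k, tail = 1, math.erfc(math.sqrt(half))
    else:
        k, tail = 2, math.exp(-half)
    # Step up two degrees of freedom at a time until reaching dof.
    while k < dof:
        tail += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return tail


# Hypothetical result: statistic 6.2 from 6 bins with mean and std dev estimated.
statistic, dof, alpha = 6.2, 3, 0.05
p_value = chi2_sf(statistic, dof)
decision = "reject" if p_value < alpha else "do not reject"
print(f"p = {p_value:.3f}: {decision} the null hypothesis at alpha = {alpha}")
```

The same statistic would give a different p-value with a different number of degrees of freedom, which is another way of seeing why the choice of grouping affects the conclusion.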
