How to set up and interpret Chi^2 test results for my data?

  • Thread starter liquidFuzz
  • Tags
    Chi square
In summary, to set up and interpret Chi^2 test results for your data, first ensure your data is categorical and organized into a contingency table. Calculate the Chi^2 statistic by comparing observed frequencies to the frequencies expected under the null hypothesis, using the formula χ² = Σ((Observed - Expected)² / Expected). Determine the degrees of freedom: for a contingency table this is (rows - 1) × (columns - 1), and for a goodness-of-fit test it is the number of categories minus one, minus the number of parameters estimated from the data. Then consult a Chi^2 distribution table or use statistical software to find the p-value, and compare it with your chosen significance level. A low p-value (typically < 0.05) suggests a significant association between the variables, while a high p-value means the data give no evidence of an association. Interpret the results in the context of your research question.
  • #1
liquidFuzz
I have a curve fit of a nonlinear function (a growth model). As a sanity check I want to do a chi2 test, but I'm not really sure how to set it up properly. My data consists of sample points and the corresponding estimated points. In a chi2 test the variables are usually called observed and expected values. What do these correspond to in a chi2 test of a least-squares fit? In addition, if I get a really low chi2 test value, is that always a good thing, i.e., is there nothing I should worry about close to the origin or the like?

Thanks!
 
  • #2
The ##\chi^2## test statistic is defined as $$\sum_{i} \frac{\left(y_i-e_i\right)^2}{e_i},$$
where ##y_i## are the observed values and ##e_i## are the expected values (say, if you want to test whether your data come from a binomial distribution, ##\text{Bin}\left(n,p\right)##, then ##e_i=n\cdot p##). After computing the test statistic, you compare it to a table of critical values (according to the degrees of freedom you have). Alternatively, you can compute the p-value associated with the test statistic and compare it to the significance level ##\alpha##. I hope this answers your question in some way, as (tbh) I didn't understand it completely.
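As a rough illustration of the two steps above (compute the statistic, then compare it to a tabulated value), here is a minimal Python sketch. The die-roll counts and the function name `chi_square_statistic` are made up for the example; the critical value 11.070 is the standard table entry for 5 degrees of freedom at ##\alpha = 0.05##.

```python
# Pearson chi-square statistic for observed vs. expected counts.
# The die-roll counts below are invented illustration data.

def chi_square_statistic(observed, expected):
    """Return sum((y_i - e_i)^2 / e_i) over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((y - e) ** 2 / e for y, e in zip(observed, expected))

observed = [18, 22, 16, 25, 19, 20]   # hypothetical counts from 120 die rolls
expected = [20.0] * 6                 # fair die: 120 / 6 per face

stat = chi_square_statistic(observed, expected)
df = len(observed) - 1                # categories minus one (no fitted parameters)
CRITICAL_5PCT_DF5 = 11.070            # tabulated chi-square value, df = 5, alpha = 0.05

reject = stat > CRITICAL_5PCT_DF5
print(f"chi2 = {stat:.3f}, df = {df}, reject H0 at 5%: {reject}")
```

With these made-up counts the statistic is well below the critical value, so the fair-die hypothesis would not be rejected at the 5% level.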
 
  • #3
What are my observed values ##y_i## and expected values ##e_i## if I fitted the model with a least-squares method?
 
  • #4
The observed values are the given data, and the expected ones, if your model is, say, simple linear regression, are $$e_i=\hat{\beta}_0+\hat{\beta}_1x_i,$$
where ##\hat{\beta}_0## and ##\hat{\beta}_1## are the least-squares estimators.
The same applies analogously to any other kind of model, e.g. multiple regression.
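To make this concrete, here is a small Python sketch, assuming a simple linear model and invented data points: it computes the least-squares estimates, forms the expected values ##e_i##, and evaluates the sum from post #2. The helper name `fit_line` and the numbers are hypothetical.

```python
# Ordinary least squares for y = b0 + b1 * x, then the chi-square sum from post #2.
# The x/y values are invented for illustration.

def fit_line(xs, ys):
    """Return (b0, b1) minimising the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]       # observed values y_i

b0, b1 = fit_line(xs, ys)
expected = [b0 + b1 * x for x in xs]  # e_i = b0_hat + b1_hat * x_i

chi2 = sum((y - e) ** 2 / e for y, e in zip(ys, expected))
df = len(xs) - 2                      # data points minus two fitted parameters
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, chi2 = {chi2:.4f}, df = {df}")
```

Dividing by ##e_i## follows the formula as written in post #2. When each data point comes with a known measurement uncertainty ##\sigma_i##, curve-fit chi-square tests usually divide the squared residual by ##\sigma_i^2## instead, so check which form suits your data.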
 
  • #5
Thanks!
 
  • #6
I am only familiar with the Chi-squared goodness-of-fit test, which compares the histogram of the data with the expected theoretical frequency distribution. This seems to be a different test.
 
  • #7
liquidFuzz said:
If I get a really low chi2 test value, is that always a good thing, i.e., there's nothing I should worry about in close proximity to origin or such?
Look at a table of probabilities (P) for ChiSq per degree of freedom to see how reasonable a low value is. It gives the probability that your value of ChiSq per degree of freedom would be exceeded for the given number of degrees of freedom (number of data points minus the number of fitted parameters). If that probability is very large (close to 1), then the ChiSq is probably too low. One thing that can account for this is overestimating the uncertainties of the data.
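If no table is at hand, the exceedance probability can be computed directly. The sketch below is one way to do it with only the Python standard library, using the series expansion of the regularised incomplete gamma function; `chi2_sf` is a name chosen for this example, and the fit size and ChiSq values are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom.

    Uses the series for the regularised lower incomplete gamma function,
    P(a, z) = exp(-z) * z**a * sum_{n>=0} z**n / Gamma(a + n + 1),
    with a = df / 2 and z = x / 2. Adequate for modest x; not tuned for extremes.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    total = 0.0
    for n in range(500):
        term = math.exp(-z + (a + n) * math.log(z) - math.lgamma(a + n + 1))
        total += term
        if term < 1e-15 * total:
            break
    return max(0.0, 1.0 - total)

n_points, n_params = 12, 3            # hypothetical fit: 12 data points, 3 parameters
df = n_points - n_params
for chi2 in (1.2, 9.0, 25.0):         # made-up chi-square values
    p_exceed = chi2_sf(chi2, df)
    print(f"chi2 = {chi2:5.1f}, chi2/df = {chi2 / df:.2f}, P(exceed) = {p_exceed:.4f}")
```

A probability near 1 (the first case) flags a suspiciously low ChiSq, a probability near 0 (the last case) flags a poor fit, and values in between are unremarkable.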
 
  • #8
A question regarding zero entries in expected values.

Let's say I want to test whether a set of data can be considered normally distributed. How do I treat bins where the expected value is close to zero? Use fewer bins, or just reject the hypothesis outright?

Edit, in addition: if I instead test against the cumulative distribution, can I use that as a test?
 
  • #9
liquidFuzz said:
A question regarding zero entries in expected values.

Let's say I want to test whether a set of data can be considered normally distributed. How do I treat bins where the expected value is close to zero? Use fewer bins, or just reject the hypothesis outright?

Edit, in addition: if I instead test against the cumulative distribution, can I use that as a test?
If you include those bins in your test, does it change the results? You can combine some bins so that they add up to non-zero expected numbers. If your hypothesized distribution has many bins with an expected count of zero and your sample has results in those bins, then the hypothesis might be rightfully rejected. It is not unusual for the extreme tails of an actual distribution to differ from a normal distribution. You will have to use your judgement, based on the situation, about what to do in that case.
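Combining bins can be sketched as follows. This is only one possible merging scheme, written for illustration: it sweeps left to right and merges adjacent bins until each merged bin reaches a minimum expected count. The threshold of 5 is a common rule of thumb and an assumption here, not something fixed by the test; the function name and the counts are made up.

```python
def merge_small_bins(observed, expected, min_expected=5.0):
    """Merge adjacent bins left to right until every merged bin has expected >= min_expected.

    A leftover tail that is still too small is folded into the last merged bin.
    """
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0.0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0.0, 0.0
    if acc_exp > 0 or acc_obs > 0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp

# Hypothetical histogram of 100 samples against a bell-shaped expectation.
observed = [0, 2, 9, 24, 31, 22, 10, 1, 1]
expected = [0.3, 2.1, 9.0, 22.5, 32.2, 22.5, 9.0, 2.1, 0.3]

m_obs, m_exp = merge_small_bins(observed, expected)
chi2 = sum((o - e) ** 2 / e for o, e in zip(m_obs, m_exp))
print(f"{len(m_obs)} merged bins, chi2 = {chi2:.3f}")
```

Remember to count the degrees of freedom from the merged bins, not the original ones, and to subtract any distribution parameters (such as mean and standard deviation) estimated from the same data.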
 
  • #10
Thanks! I'll play around with merged bins and see if I get something useful out of it.

I was hoping for a clear yes or no... 🤪
 
  • #11
If you have a growth model, linearize it by taking differences or log returns. If you don't do this, the data won't be stationary and most statistical tests won't make sense. Look at geometric Brownian motion or ARIMA models for examples.
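A minimal Python sketch of the log-return transform, using an invented series that grows by about 10% per step: the raw values trend upward, while the log returns stay roughly constant.

```python
import math

def log_returns(series):
    """Return log(x[t+1] / x[t]) for consecutive, strictly positive values."""
    return [math.log(b / a) for a, b in zip(series, series[1:])]

# Hypothetical growth data: roughly 10% growth per step.
values = [100.0, 110.0, 121.0, 133.1, 146.41]
returns = log_returns(values)
print([round(r, 4) for r in returns])
```

Real data will scatter around some level rather than sit exactly on it; it is that scatter, not the raw trending values, that the subsequent tests are applied to.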
 
  • #12
liquidFuzz said:
I was hoping for a clear yes or no...
You poor soul! :-p
I've read about the Chi-squared test in the context of null hypothesis testing. It's interesting.

##\displaystyle \sum_i \frac{(y_i - e_i)^2}{e_i}## is the crux of it. Gracias @mathguy_1995
 
