Showing that a model is not a good fit

  • Thread starter indie452
  • Tags
    Fit Model
In summary: In parameter estimation, you estimate a parameter (such as the mean or variance) of a model from a data set. The statistic used to measure how well the model fits the data is usually the chi-squared statistic. The "confidence region" around the parameter estimate matters because it shows which values are consistent with the data at a chosen confidence level. The important thing to remember is that the confidence region is not a probability. It is a set of values around the parameter estimate that would be expected if the model were true.
  • #1
indie452
ok so i have some data (d) of star counts (N=181), and a model (m = b-Fo, where b=5 and Fo is a constant flux)

I have found the chi squared value = 216
I know that the number of degrees of freedom here is N-parameters = 181-1 = 180

my question is:
"show that the model is not a good fit to the data, and use an appropriate statistical table to estimate the confidence at which you can reject the hypothesis of a constant source flux"

All i can come up with so far is that for a good model we usually expect the chi squared to be approximately equal to the number of degrees of freedom, which is not the case here. From that one could infer that the model is not a good fit to the data.
Also I know that because the number of degrees of freedom is so large, the chi squared distribution will approach a gaussian, so we could use the one-tailed gaussian table.
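
The Gaussian approximation mentioned above can be sketched numerically. This is only a rough illustration, assuming the standard result that a chi-squared variable with k degrees of freedom has mean k and variance 2k; the variable names are made up for the example.

```python
import math

chi2_obs = 216.0   # chi squared value quoted in the post
dof = 180          # degrees of freedom quoted in the post

# For large dof, chi squared is approximately Gaussian
# with mean = dof and standard deviation = sqrt(2 * dof).
mean = float(dof)
sigma = math.sqrt(2.0 * dof)

# Number of standard deviations the observed value lies above the mean.
z = (chi2_obs - mean) / sigma

# One-tailed (upper tail) Gaussian probability of exceeding z.
p_one_tailed = 0.5 * math.erfc(z / math.sqrt(2.0))

print(f"z = {z:.3f}")
print(f"approximate one-tailed probability = {p_one_tailed:.4f}")
```

The chi squared distribution is still slightly skewed at 180 degrees of freedom, so this approximate tail probability will not match an exact chi squared table or calculator to the last digit.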

However, notes i have read talk about comparing the chi squared to some significance level, but i do not know how to calculate this.

any help on getting started and on understanding this, please?
 
  • #2
To determine whether a model is a "good fit" one has to decide what is meant by "good". And that means determining a "level of significance", typically a probability of .10 or .05. Here is a pretty easy to use chi-square "calculator": http://www.stat.tamu.edu/~west/applets/chisqdemo.html

Put in your degrees of freedom, then put in the level of significance you want (.10 or .05), and see if your value is too far to the right, that is, beyond the point that cuts off that much area in the right-hand tail.
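
The same lookup can be sketched in a few lines of Python without any statistics library. This is a minimal sketch, not the calculator's own method: it relies on the exact identity between the chi-squared upper tail and a Poisson sum, which only holds for an even number of degrees of freedom (180 is even). The function name `chi2_sf_even_dof` is made up for this example.

```python
import math

def chi2_sf_even_dof(x, dof):
    """Upper-tail probability P(X > x) for chi squared with EVEN dof.

    Exact identity for even dof:
    P(X > x) = exp(-x/2) * sum_{j=0}^{dof/2 - 1} (x/2)^j / j!
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this sketch only handles positive even dof")
    half = x / 2.0
    term = math.exp(-half)      # j = 0 term of the sum
    total = term
    for j in range(1, dof // 2):
        term *= half / j        # builds (x/2)^j / j! iteratively
        total += term
    return total

# Values from the thread: chi squared = 216 with 180 degrees of freedom.
p_value = chi2_sf_even_dof(216.0, 180)
print(f"P(chi2 > 216 | dof = 180) = {p_value:.4f}")

# Compare the tail area with the two significance levels suggested above.
for alpha in (0.10, 0.05):
    verdict = "reject" if p_value < alpha else "do not reject"
    print(f"alpha = {alpha:.2f}: {verdict} the constant-flux model")
```

If the printed tail area is smaller than the chosen level of significance, the observed value is "too far to the right" in the sense described above.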
 
  • #3
thanks for replying

so this is what i have got from your response:
so if the area i calculated, i.e. the prob of getting a chi squared of more than 216, is 0.0345, then this means that at a 5% significance level it is unlikely that we would get a result of more than 216.

but I am not quite sure how this shows it is a bad model, or how i go about finding the confidence at which i can reject the hypothesis of a constant source
 
  • #4
indie452 said:
how i go about finding the confidence at which i can reject the hypothesis of a constant source

As far as I can tell "confidence at which I can reject" is terminology that you have invented. If your course materials use that terminology, perhaps you can explain it to me using the language of probability.

In the ordinary scenario for hypothesis testing, once you establish a range of statistical values for which you will "accept" the null hypothesis, you can compute probabilities only if you assume the null hypothesis is true. The probabilities that you can compute are the probability of accepting the null hypothesis and the probability of (incorrectly) rejecting the null hypothesis.

Subjectively, if the observed statistic is outside the acceptance region and the probability of this happening by chance is "small" then the null hypothesis is "bad". However, you can't compute the probability that the null hypothesis is incorrect unless you use Bayesian statistics.

The term "confidence" is usually applied to the scenario of parameter estimation.
 
  • #5


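To show that the model is not a good fit to the data, we can calculate the reduced chi squared value by dividing the chi squared value by the number of degrees of freedom. In this case, the reduced chi squared value is 216/180 = 1.2. For a good fit we expect a value close to 1, with a typical scatter of about sqrt(2/180) ≈ 0.11 for 180 degrees of freedom, so 1.2 lies nearly two standard deviations above expectation. A reduced chi squared value noticeably greater than 1 suggests that the model is underfitting the data (there are additional factors not accounted for by the model) or that the measurement errors have been underestimated. It is values well below 1 that point to overfitting or overestimated errors.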

To estimate the confidence at which we can reject the hypothesis of a constant source flux, we can use the chi squared distribution table. The table provides the critical value of chi squared at a given level of significance (α) and degrees of freedom. In this case, we have 180 degrees of freedom and we want to test the hypothesis at a significance level of 0.05. From the table, the critical value for a one-tailed (upper tail) test is about 212.3. Since our chi squared value of 216 is greater than the critical value, we can reject the hypothesis of a constant source flux at the 5% significance level, which is what the question calls rejecting with 95% confidence. The tail probability of 0.0345 found earlier in the thread sharpens this to roughly 96.5%.
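
A tabulated critical value can be cross-checked numerically. The following is a rough sketch under stated assumptions: it uses the exact Poisson-sum form of the chi squared tail, which is valid only for even degrees of freedom, and finds the critical value by simple bisection. The helper names `chi2_sf_even_dof` and `chi2_critical_value` are invented for this illustration.

```python
import math

def chi2_sf_even_dof(x, dof):
    """Upper-tail probability P(X > x) for chi squared with EVEN dof."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this sketch only handles positive even dof")
    half = x / 2.0
    term = math.exp(-half)      # j = 0 term of the Poisson sum
    total = term
    for j in range(1, dof // 2):
        term *= half / j
        total += term
    return total

def chi2_critical_value(alpha, dof, tol=1e-6):
    """Value c with P(X > c) = alpha, found by bisection."""
    lo, hi = 0.0, float(dof)
    # Expand the upper bracket until its tail area drops below alpha.
    while chi2_sf_even_dof(hi, dof) > alpha:
        hi *= 2.0
    # The tail area decreases with x, so bisection converges to c.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi2_sf_even_dof(mid, dof) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

critical = chi2_critical_value(0.05, 180)
print(f"critical value at alpha = 0.05, dof = 180: {critical:.1f}")
print("reject constant flux" if 216.0 > critical else "do not reject")
```

Bisection is slow but hard to get wrong, which seems a reasonable trade for a one-off check against a printed table.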

In summary, the chi squared test and the reduced chi squared value both indicate that the model is not a good fit to the data. The chi squared distribution table can be used to estimate the confidence at which we can reject the hypothesis of a constant source flux.
 

FAQ: Showing that a model is not a good fit

1. How can you determine if a model is a good fit?

To determine if a model is a good fit, you can use various statistical measures such as the coefficient of determination (R-squared), root mean square error (RMSE), and chi-square test. These measures can show how well the model fits the data and if there are any significant differences between the predicted values and the actual values.
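
As a small illustration of two of these measures, the sketch below computes R-squared and RMSE in plain Python. The data values are hypothetical, invented only to show the arithmetic, and the function names are not from any particular library.

```python
import math

# Hypothetical toy values, made up purely for illustration.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

print(f"R-squared = {r_squared(observed, predicted):.3f}")
print(f"RMSE      = {rmse(observed, predicted):.3f}")
```

Unlike the chi-square test, neither measure uses the measurement uncertainties, so they describe the size of the residuals rather than whether those residuals are consistent with the expected noise.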

2. What does it mean if a model has a low R-squared value?

A low R-squared value means that the model does not explain a large portion of the variation in the data. This could indicate that the model is not a good fit for the data, and other models should be considered.

3. Can a model be a good fit for some data but not for others?

Yes, a model can be a good fit for some data but not for others. One common cause is overfitting, where a model fits the training data too closely and does not perform well on new data. It is important to test the model on different datasets to check that it generalizes.

4. What are some visual methods for evaluating the fit of a model?

Some visual methods for evaluating the fit of a model include scatter plots, residual plots, and Q-Q plots. These plots can show the relationship between the predicted values and the actual values, as well as any patterns or trends in the residuals.

5. How can you improve a model that is not a good fit?

There are several ways to improve a model that is not a good fit. One approach is to collect more data and retrain the model. Another option is to use a different algorithm or adjust the parameters of the current model. It is also important to consider the underlying assumptions of the model and make necessary adjustments to improve its fit.
