Histogram fitting: fit parameter errors not corresponding with optimizer results

In summary: The conversation discusses parameter errors from a least-squares fit that come out too small when they are extracted from the optimizer's covariance matrix. The possible causes raised are a poorly matched fit function, non-normally distributed data, and a small sample size. The advice given is to check the assumptions and methods used and to seek guidance from an expert in the field.
  • #1
alex-weej
Hi

I'm having some big problems with some data! I will try to keep this as simple as possible...

I have a random variable whose probability distribution I have a fit function for. With a large enough number of samples I can get good estimates of the fit function parameters from a least-squares optimizer (MINPACK, via scipy.optimize.leastsq, I believe). The optimizer gives me a covariance matrix, from which I extract approximate errors on the parameters (the square roots of the diagonal elements).
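A minimal sketch of the setup described above, under stated assumptions: the post does not give the fit function, so a Gaussian-shaped `model` stands in for it, the per-bin errors are assumed to be Poisson (square root of the bin count), and the names `model`, `residuals` and `fit_histogram` are hypothetical. Only `numpy.histogram` and `scipy.optimize.leastsq` are real library calls.

```python
import numpy as np
from scipy.optimize import leastsq


def model(x, amp, mu, sigma):
    # Hypothetical fit function: expected bin counts with a Gaussian shape.
    return amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


def residuals(params, x, counts, errors):
    # Dividing by the per-bin error makes sum(residuals**2) the chi-squared.
    return (counts - model(x, *params)) / errors


def fit_histogram(samples, bins=30, value_range=(-3.0, 3.0)):
    counts, edges = np.histogram(samples, bins=bins, range=value_range)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = counts > 0                    # empty bins would get zero error
    x = centers[keep]
    y = counts[keep].astype(float)
    err = np.sqrt(y)                     # assumed Poisson error per bin
    start = [y.max(), np.mean(samples), np.std(samples)]
    popt, cov_x, info, mesg, ier = leastsq(
        residuals, start, args=(x, y, err), full_output=True
    )
    perr = np.sqrt(np.diag(cov_x))       # square roots of the diagonal
    chi2 = float(np.sum(info["fvec"] ** 2))
    ndof = len(x) - len(popt)
    return popt, perr, chi2 / ndof


rng = np.random.default_rng(0)
popt, perr, reduced_chi2 = fit_histogram(rng.normal(0.0, 1.0, size=10_000))
print("estimates:", popt)
print("errors:   ", perr)
print("chi2/ndof:", reduced_chi2)
```

Returning the reduced chi-squared alongside the errors is a design choice for this sketch: a value far from 1 would suggest that the per-bin errors or the model are not what the fit assumes.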

The problem is that the errors obtained by this method are too small: if I fit a different set of data from the same distribution, I get a different estimate for the fit parameters, also with a very small error, and the two estimates do not overlap within their errors. As a test, I fit ~10,000 different data sets (with ~10,000 samples in each) and saw that the fitted parameters follow a nicely shaped Gaussian. By eye, its standard deviation is about 10 times larger than the error I get from the covariance matrix.
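The repeated-fit test can be sketched as follows. This is an illustration, not the poster's code: the Gaussian `model`, the Poisson per-bin errors and the helper `fit_once` are assumptions, and the numbers of data sets and samples are reduced from the ~10,000 quoted above so that it runs quickly. The quantity of interest is the ratio between the observed scatter of the estimates and the mean error quoted from the covariance matrix.

```python
import numpy as np
from scipy.optimize import leastsq


def model(x, amp, mu, sigma):
    # Hypothetical fit function: expected bin counts with a Gaussian shape.
    return amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


def fit_once(samples, bins=24, value_range=(-3.0, 3.0)):
    counts, edges = np.histogram(samples, bins=bins, range=value_range)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = counts > 0
    x, y = centers[keep], counts[keep].astype(float)
    err = np.sqrt(y)                     # assumed Poisson error per bin

    def residuals(params):
        return (y - model(x, *params)) / err

    start = [y.max(), np.mean(samples), np.std(samples)]
    popt, cov_x, info, mesg, ier = leastsq(residuals, start, full_output=True)
    return popt, np.sqrt(np.diag(cov_x))


rng = np.random.default_rng(1)
n_sets, n_samples = 200, 5_000           # reduced from the thread's ~10,000 each
estimates = np.empty((n_sets, 3))
quoted = np.empty((n_sets, 3))
for i in range(n_sets):
    estimates[i], quoted[i] = fit_once(rng.normal(0.0, 1.0, size=n_samples))

scatter = estimates.std(axis=0, ddof=1)  # spread of the fitted parameters
mean_quoted = quoted.mean(axis=0)        # typical covariance-matrix error
ratio = scatter / mean_quoted
print("scatter / quoted error (amp, mu, sigma):", ratio)
```

A ratio near 1 means the quoted errors describe the real spread; the post reports a ratio of roughly 10 for its own data and fit function.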

I have manually verified that the errors calculated from the covariance matrix correspond to a change of ~1 in the chi-squared of the fit.
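One way to carry out such a check, sketched under the same assumptions as before (a Gaussian-shaped `model` and Poisson per-bin errors, neither of which is confirmed by the post): shift one parameter by its quoted error, re-minimise over the remaining parameters, and compare the chi-squared with its minimum. The helper names are hypothetical.

```python
import numpy as np
from scipy.optimize import leastsq


def model(x, amp, mu, sigma):
    # Hypothetical fit function: expected bin counts with a Gaussian shape.
    return amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


rng = np.random.default_rng(2)
counts, edges = np.histogram(
    rng.normal(0.0, 1.0, size=10_000), bins=30, range=(-3.0, 3.0)
)
centers = 0.5 * (edges[:-1] + edges[1:])
keep = counts > 0
x, y = centers[keep], counts[keep].astype(float)
err = np.sqrt(y)                         # assumed Poisson error per bin


def chi2_residuals(params):
    return (y - model(x, *params)) / err


popt, cov_x, info, mesg, ier = leastsq(
    chi2_residuals, [y.max(), 0.0, 1.0], full_output=True
)
perr = np.sqrt(np.diag(cov_x))
chi2_min = np.sum(chi2_residuals(popt) ** 2)

# Shift mu by its quoted error and refit amp and sigma with mu held fixed.
mu_shifted = popt[1] + perr[1]


def residuals_fixed_mu(free):
    amp, sigma = free
    return chi2_residuals([amp, mu_shifted, sigma])


free_opt, _ = leastsq(residuals_fixed_mu, [popt[0], popt[2]])
delta_chi2 = np.sum(residuals_fixed_mu(free_opt) ** 2) - chi2_min
print("delta chi2 for a one-error shift in mu:", delta_chi2)
```

Refitting the other parameters matters when parameters are correlated: holding them fixed instead would give a chi-squared change larger than 1 for the same shift.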

Am I doing anything obviously wrong? Please save me!

Thank you

Alex
 
  • #2



Hi Alex,

It sounds like you are on the right track with using a least-squares optimizer and extracting the errors from the covariance matrix. However, a few things could be causing the small errors you are getting.

First, it's important to make sure that your fit function is appropriate for the data you are working with. The errors from the covariance matrix assume that the model is correct, so if the function does not describe the data well, they can come out too small. You may want to try fitting the data with a few different functions to see if you get similar results.

Another possibility is that the errors on your data points are not normally distributed, which least-squares fitting usually assumes. If they are not, the errors on the parameter estimates can again come out too small. You could try using a different optimizer or a different statistical method to fit your data.
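As one concrete example of a different statistical method (it is not proposed in the thread itself), histogram bin counts can be treated as Poisson-distributed and fitted by minimising a binned negative log-likelihood instead of a chi-squared. The sketch below assumes a Gaussian-shaped expectation and uses hypothetical names `expected_counts` and `poisson_nll`; `scipy.optimize.minimize` with the Nelder-Mead method is a real call, chosen here only because it needs no derivatives.

```python
import numpy as np
from scipy.optimize import minimize


def expected_counts(x, amp, mu, sigma):
    # Hypothetical model for the expected number of entries in each bin.
    return amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


def poisson_nll(params, x, counts):
    # Negative log-likelihood for independent Poisson bin counts, dropping
    # the log(n!) term, which does not depend on the parameters.
    amp, mu, sigma = params
    if amp <= 0 or sigma <= 0:
        return np.inf
    lam = expected_counts(x, amp, mu, sigma)
    if np.any(lam <= 0):
        return np.inf                    # guard against underflow before log
    return float(np.sum(lam - counts * np.log(lam)))


rng = np.random.default_rng(3)
samples = rng.normal(0.0, 1.0, size=10_000)
counts, edges = np.histogram(samples, bins=30, range=(-3.0, 3.0))
x = 0.5 * (edges[:-1] + edges[1:])

start = [counts.max(), np.mean(samples), np.std(samples)]
result = minimize(poisson_nll, start, args=(x, counts), method="Nelder-Mead")
amp_hat, mu_hat, sigma_hat = result.x
print("Poisson-likelihood estimates:", amp_hat, mu_hat, sigma_hat)
```

Nelder-Mead returns no covariance matrix, so parameter errors would have to come from elsewhere, for example from the spread of estimates over repeated data sets.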

It's also possible that the number of samples you are using is not large enough to accurately estimate the errors. Typically, the larger the sample size, the more accurate the estimates will be. You may want to try increasing the number of samples and see if that changes the results.

Overall, it's important to carefully consider the assumptions and methods you are using when fitting data and interpreting the results. It may also be helpful to consult with a statistician or other expert in your field for further guidance. Best of luck with your research!


 

FAQ: Histogram fitting: fit parameter errors not corresponding with optimizer results

What is histogram fitting?

Histogram fitting is a statistical method used to estimate the parameters of a probability distribution that best fit a given set of data points represented by a histogram. It involves finding the best-fitting curve or line that describes the data points and their distribution.

What is the purpose of fitting a histogram?

The purpose of fitting a histogram is to understand the underlying distribution of a dataset and to estimate the parameters that best describe it. This can help in making predictions and drawing conclusions about the data.

What are fit parameter errors?

Fit parameter errors are estimates of the uncertainties on the estimated parameters of a fitted curve or line. They indicate how precisely each parameter is determined by the data, not how well the fitted curve or line describes the data points.

Why do fit parameter errors not correspond with optimizer results?

In some cases, fit parameter errors may not correspond with optimizer results due to various factors such as the choice of optimizer algorithm, the complexity of the data, and the assumptions made during the fitting process. It is important to carefully consider these factors when interpreting the results.
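One fitting assumption that is easy to check, although the thread does not confirm it as the cause here: the SciPy documentation states that the `cov_x` matrix returned by `scipy.optimize.leastsq` must be multiplied by the variance of the residuals to give the covariance of the parameter estimates. The sketch below, again with an assumed Gaussian-shaped `model`, fits unweighted residuals and compares the errors taken from `cov_x` at face value with the scaled ones.

```python
import numpy as np
from scipy.optimize import leastsq


def model(x, amp, mu, sigma):
    # Hypothetical fit function: expected bin counts with a Gaussian shape.
    return amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


rng = np.random.default_rng(4)
counts, edges = np.histogram(
    rng.normal(0.0, 1.0, size=10_000), bins=30, range=(-3.0, 3.0)
)
x = 0.5 * (edges[:-1] + edges[1:])
y = counts.astype(float)


def unweighted_residuals(params):
    # No division by per-bin errors: sum(residuals**2) is not a chi-squared.
    return y - model(x, *params)


popt, cov_x, info, mesg, ier = leastsq(
    unweighted_residuals, [y.max(), 0.0, 1.0], full_output=True
)
ndof = len(y) - len(popt)
residual_variance = np.sum(info["fvec"] ** 2) / ndof

perr_raw = np.sqrt(np.diag(cov_x))                         # cov_x at face value
perr_scaled = np.sqrt(np.diag(cov_x) * residual_variance)  # scaled as documented
print("raw errors:   ", perr_raw)
print("scaled errors:", perr_scaled)
```

When the residuals are already divided by correct per-bin errors, the residual variance is close to 1 and the two sets of errors nearly agree; with unweighted residuals they can differ by a large factor.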

How can discrepancies between fit parameter errors and optimizer results be resolved?

To resolve discrepancies between fit parameter errors and optimizer results, it is important to carefully review the fitting process and consider using alternative optimization methods. It may also be helpful to consult with other experts in the field for their insights and recommendations.
