Correlation between parameters in a likelihood fit

In summary, the original poster is facing a conceptual problem with the correlation matrix between maximum likelihood estimators. They have evaluated the correlation both from the Hessian matrix and with a different approach based on profiling and scanning the likelihood function, but the two methods yield different results and they are unsure where their reasoning goes wrong. They are asking for help clarifying their ideas and are open to the suggestion of using a Bayesian approach. The replies also point out that, for a correlation between estimators to be well defined, the estimators must be treated as random variables.
  • #1
Aleolomorfo
TL;DR Summary
What is the right way to estimate the correlation between parameters estimated with a likelihood fit?
Hello community!

I am facing a conceptual problem with the correlation matrix between maximum likelihood estimators.

I estimate two parameters (named SigmaBin0 and qqzz_norm_0) from a multidimensional likelihood function; the total number of parameters is actually larger than the two I am focusing on now. I need to evaluate the correlation between these two parameters.
I know that the standard way to evaluate the correlation between two parameters from a likelihood fit is to start from the Hessian matrix (its inverse gives the covariance matrix of the estimators). With that method I get a correlation of -0.14.
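As an illustration of this Hessian-based recipe, here is a minimal sketch in Python. The negative log-likelihood below is a toy stand-in (the names and "true" values are invented for the example, not taken from the actual fit); any smooth NLL could be plugged in its place.

Code:
import numpy as np
from scipy.optimize import minimize

# Toy negative log-likelihood in two parameters (illustrative stand-ins for
# SigmaBin0 and qqzz_norm_0).
rho_true = -0.14
cov_true = np.array([[1.0, rho_true], [rho_true, 1.0]])
prec_true = np.linalg.inv(cov_true)

def nll(theta):
    d = theta - np.array([1.0, 2.0])           # arbitrary "true" values
    return 0.5 * d @ prec_true @ d

fit = minimize(nll, x0=np.zeros(2))            # maximum likelihood estimate

def hessian(f, x, eps=1e-4):
    """Central finite-difference Hessian of f at x."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            xpp = x.copy(); xpp[i] += eps; xpp[j] += eps
            xpm = x.copy(); xpm[i] += eps; xpm[j] -= eps
            xmp = x.copy(); xmp[i] -= eps; xmp[j] += eps
            xmm = x.copy(); xmm[i] -= eps; xmm[j] -= eps
            H[i, j] = (f(xpp) - f(xpm) - f(xmp) + f(xmm)) / (4 * eps**2)
    return H

# Invert the Hessian of the NLL at the minimum to get the covariance matrix
# of the estimators, then normalise it to a correlation.
cov = np.linalg.inv(hessian(nll, fit.x))
corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
print(corr)                                    # ~ -0.14 for this toy NLL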

Then I tried a different approach: studying the SigmaBin0 vs. qqzz_norm_0 values when one of the parameters is the parameter of interest (POI) and the other is profiled, and vice versa. That is, I scan the likelihood function along SigmaBin0 while recording the profiled qqzz_norm_0 values, and then I run another scan along qqzz_norm_0 while recording the profiled SigmaBin0 values. My expectation was to find the same trend in both cases, but what I find is the right-hand plot in the attachment. The vertical line is the former case, while the horizontal line is the latter. If I apply the definition of correlation (covariance divided by the product of the standard deviations) I get a correlation of -0.29.
On the other hand, if I consider the two trends independently and apply the definition of correlation again, I get a correlation of -1 (central and left plots in the attachment).
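For comparison, here is a toy illustration of such profile scans, assuming a simple Gaussian likelihood (the numbers rho, sx, sy are invented for the example). The profiled value of one parameter at a fixed value of the other is a straight line, and the two scan directions give lines with different slopes; points taken along a single line always have sample correlation of exactly plus or minus one.

Code:
import numpy as np

# Toy Gaussian likelihood in two parameters (x, y) with correlation rho.
rho, sx, sy = -0.14, 1.0, 2.0

xs = np.linspace(-2.0, 2.0, 50)
y_prof = rho * (sy / sx) * xs                  # argmax over y at each fixed x
ys = np.linspace(-2.0, 2.0, 50)
x_prof = rho * (sx / sy) * ys                  # argmax over x at each fixed y

# Each profiled trend on its own is a perfect straight line, so the sample
# correlation of points taken along a single trend is +/-1, regardless of rho:
print(np.corrcoef(xs, y_prof)[0, 1])           # -1.0 here (rho < 0)
print(np.corrcoef(x_prof, ys)[0, 1])           # -1.0 here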

In my mind all these approaches should be equivalent and give the same value of the correlation, but that is not the case, so there must be some bug in my reasoning. Can someone help me sort out my ideas, please?
 

Attachments

  • Screenshot 2021-02-26 at 11.16.34.png
  • #2
Have you thought about using a Bayesian approach? With a Bayesian approach you get a posterior sample from the joint distribution of all of your parameters, so you can calculate the correlation directly.
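A minimal sketch of that idea, assuming a toy two-parameter log-likelihood with a flat prior (so the posterior is proportional to the likelihood); the sampler and the numbers here are purely illustrative, not the actual analysis:

Code:
import numpy as np

# Minimal random-walk Metropolis sampler for a toy two-parameter posterior.
rho_true = -0.14
prec = np.linalg.inv(np.array([[1.0, rho_true], [rho_true, 1.0]]))

def log_post(theta):
    return -0.5 * theta @ prec @ theta

rng = np.random.default_rng(0)
theta = np.zeros(2)
samples = []
for _ in range(50_000):
    proposal = theta + 0.5 * rng.standard_normal(2)
    if np.log(rng.uniform()) < log_post(proposal) - log_post(theta):
        theta = proposal
    samples.append(theta)
samples = np.array(samples[5_000:])            # discard burn-in

# The parameter correlation is read off the posterior sample directly.
print(np.corrcoef(samples.T)[0, 1])            # ~ -0.14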

That said, I am not sure what you mean by:
Aleolomorfo said:
I scan the likelihood function along SigmaBin0 while "watching" the qqzz_norm_0 profiled values
so I cannot really assess what problems you are running into. Since this procedure is not clear to me, I am not surprised that it gives different results from the Hessian approach.
 
  • #3
Aleolomorfo said:
I mean, I scan the likelihood function along SigmaBin0 while "watching" the qqzz_norm_0 profiled values, and then I run another scan along qqzz_norm_0 while "watching" the SigmaBin0 profiled values.

It isn't clear to me what you mean by "scan" and "profile". These are not standard terms in statistics although they may be familiar to people in your particular field of study.

Aleolomorfo said:
I am facing a conceptual problem with the correlation matrix between maximum likelihood estimators.
For a "correlation between two estimators" to make sense, we have to define the estimators as random variables. We could imagine that you have multiple independent data sets ##D_1,D_2,D_3,...## and from each data set ##D_i## we get one pair of maxiumum likihood estimators ##(S_i, q_i)## (estimating different two parametrs). Then we can regard these pairs of values as random samples from two random variables. Is that what you are doing?
 
  • #4
Stephen Tashi said:
For a "correlation between two estimators" to make sense, we have to define the estimators as random variables.
I agree. That is another reason to use the Bayesian approach here. This information is very natural and easy to obtain in that approach.
 

FAQ: Correlation between parameters in a likelihood fit

What is a likelihood fit?

A likelihood fit is a statistical method used to determine the parameters of a model by comparing the observed data to the expected data from the model. It calculates the probability of obtaining the observed data given a set of model parameters, and the goal is to find the set of parameters that maximizes this probability.
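As a concrete sketch (an invented toy example, not any particular analysis), fitting the mean and width of a Gaussian model by minimising the negative log-likelihood of the data:

Code:
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)
data = rng.normal(loc=5.0, scale=1.5, size=500)

def nll(params):
    # Negative log-likelihood of the Gaussian model for the observed data.
    mu, sigma = params
    return -np.sum(norm.logpdf(data, loc=mu, scale=sigma))

fit = minimize(nll, x0=[0.0, 1.0], bounds=[(None, None), (1e-6, None)])
print(fit.x)                                   # close to (5.0, 1.5)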

How is correlation between parameters determined in a likelihood fit?

Correlation between parameters in a likelihood fit is determined from the covariance matrix, typically obtained by inverting the Hessian of the negative log-likelihood at its minimum. This matrix describes how the parameter estimates vary together, and the correlation between two parameters is their covariance divided by the product of their standard deviations.
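A small sketch of that normalisation step (the covariance values below are made up for illustration):

Code:
import numpy as np

# Turn a fitted covariance matrix into a correlation matrix:
# rho_ij = cov_ij / (sigma_i * sigma_j).
cov = np.array([[ 0.04,   -0.0112],
                [-0.0112,  0.16  ]])
sigma = np.sqrt(np.diag(cov))
corr = cov / np.outer(sigma, sigma)
print(corr)                                    # off-diagonal ~ -0.14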

Why is it important to consider correlation between parameters in a likelihood fit?

Considering correlation between parameters is important because it can affect the accuracy of the fit and the interpretation of the results. If two parameters are highly correlated, the data cannot constrain them independently: a change in one can be largely compensated by a change in the other. Ignoring this can lead to underestimated uncertainties and incorrect conclusions.

How can correlation between parameters be reduced in a likelihood fit?

Correlation between parameters can often be reduced by reparametrizing the model so that the parameters are constrained more independently by the data, by collecting data that break the degeneracy, or by fixing or constraining poorly determined parameters. It is also important to choose sensible initial values for the parameters and to check for convergence of the fit.

Can correlation between parameters be completely eliminated in a likelihood fit?

No, it is not possible to completely eliminate correlation between parameters in a likelihood fit. However, it can be minimized by following best practices and using appropriate statistical methods. It is also important to carefully analyze the results and consider the limitations of the model and the data.
