Derivative of Log Likelihood Function

In summary, I differentiated the log likelihood of a mixture-of-Gaussians model and found that the numerator term comes from differentiating the multivariate Gaussian density, while the denominator term comes from differentiating the log.
  • #1
NATURE.M
Looking through my notes, I can't seem to understand how to get from one step to the next. I have attached a screenshot of the two lines I'm confused about. Thanks.

BTW: the equations are for the log likelihood of a mixture-of-Gaussians model.

EDIT: To elaborate, I am particularly confused about how they get the numerator term ##\pi_{k} \mathcal{N}(x_{n}|\mu_{k}, \Sigma_{k})##. I can't see how differentiating produces that term. I understand how they obtain the denominator term from differentiating the log, but that's about all. To differentiate the multivariate Gaussian I would think the log function needs to be used to break up the internal terms, but I can't put this intuition together.
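
For concreteness, assuming the standard mixture-of-Gaussians log likelihood (a reconstruction of the screenshot's setup, so the notation below is an assumption):
$$\ln p(X) = \sum_{n=1}^{N} \ln\left[\sum_{j=1}^{K} \pi_j\, \mathcal{N}(x_n|\mu_j, \Sigma_j)\right]$$
the numerator arises because, when differentiating with respect to a parameter of component ##k## (say ##\mu_k##), the chain rule brings down the reciprocal of the inner sum (the denominator term), and of the inner sum only the ##j=k## term survives:
$$\frac{\partial \ln p(X)}{\partial \mu_k} = \sum_{n=1}^{N} \frac{\pi_k\, \frac{\partial}{\partial \mu_k}\mathcal{N}(x_n|\mu_k, \Sigma_k)}{\sum_{j} \pi_j\, \mathcal{N}(x_n|\mu_j, \Sigma_j)} = \sum_{n=1}^{N} \frac{\pi_k\, \mathcal{N}(x_n|\mu_k, \Sigma_k)}{\sum_{j} \pi_j\, \mathcal{N}(x_n|\mu_j, \Sigma_j)}\, \Sigma_k^{-1}(x_n - \mu_k),$$
where the last step uses ##\frac{\partial}{\partial \mu_k}\mathcal{N}(x_n|\mu_k, \Sigma_k) = \mathcal{N}(x_n|\mu_k, \Sigma_k)\, \Sigma_k^{-1}(x_n - \mu_k)##, since only the exponent depends on ##\mu_k##. The ##\pi_k\, \mathcal{N}(x_n|\mu_k, \Sigma_k)## numerator is simply that single surviving term.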
 

Attachments

  • Screen Shot 2015-11-15 at 5.37.12 PM.png
  • #2
I think it's because ##\Sigma_k## appears both inside and outside (as an inverse) the exponent in the density function ##\mathscr{N}##.
So
$$\frac{\partial}{\partial \Sigma_k}\mathscr{N}(\mu,\Sigma_k)=
\frac{\partial}{\partial \Sigma_k}\left[C\Sigma_k{}^{-1}\exp[f(\mu,\Sigma_k)]\right]$$
for known constant ##C## and function ##f##.
By the product rule, this is then equal to
$$C\exp[f(\mu,\Sigma_k)]\left[\frac{\partial}{\partial \Sigma_k}\Sigma_k{}^{-1}+\Sigma_k{}^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$

There will be some messy algebra involved.

You might find it easier to first work through the univariate case, differentiating wrt ##\sigma## and seeing if you can obtain an analogous expression. If that works out, it shouldn't be too hard to extend it to the multivar case.
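
A quick SymPy sketch of that univariate exercise (my own illustration, not from the thread; the variable names are assumptions, and it just confirms that the product rule yields one term from the prefactor and one from the exponent, as described above):
```python
import sympy as sp

x, mu = sp.symbols('x mu', real=True)
sigma = sp.symbols('sigma', positive=True)

# Univariate Gaussian density: sigma appears both in the 1/sigma prefactor
# and inside the exponent, mirroring Sigma_k in the multivariate case.
N = sp.exp(-(x - mu)**2 / (2 * sigma**2)) / (sp.sqrt(2 * sp.pi) * sigma)

# Differentiate wrt sigma; the product rule produces two terms.
dN = sp.diff(N, sigma)

# Factoring the density back out exposes the analogue of the bracketed
# expression in post #2: dN/dsigma = N * ((x - mu)**2/sigma**3 - 1/sigma).
print(sp.simplify(dN / N))  # ((x - mu)**2 - sigma**2)/sigma**3, up to ordering
```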
 
  • #3
andrewkirk said:
I think it's because ##\Sigma_k## appears both inside and outside (as an inverse) the exponent in the density function ##\mathscr{N}##.
So
$$\frac{\partial}{\partial \Sigma_k}\mathscr{N}(\mu,\Sigma_k)=
\frac{\partial}{\partial \Sigma_k}\left[C\Sigma_k{}^{-1}\exp[f(\mu,\Sigma_k)]\right]$$
for known constant ##C## and function ##f##.
By the product rule, this is then equal to
$$C\exp[f(\mu,\Sigma_k)]\left[\frac{\partial}{\partial \Sigma_k}\Sigma_k{}^{-1}+\Sigma_k{}^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$

There will be some messy algebra involved.

You might find it easier to first work through the univariate case, differentiating wrt ##\sigma## and seeing if you can obtain an analogous expression. If that works out, it shouldn't be too hard to extend it to the multivar case.
I don't understand how you got ##C\Sigma_{k}^{-1}##. In the multivariate Gaussian we have ##\frac{1}{|\Sigma_{k}|}##. How did you convert that determinant into an inverse? Maybe you meant the same thing but forgot the determinant sign?
 
  • #4
NATURE.M said:
I don't understand how you got ##C\Sigma_{k}^{-1}##. In the multivariate Gaussian we have ##\frac{1}{|\Sigma_{k}|}##. How did you convert that determinant into an inverse?
I didn't. What I wrote is only broadly indicative of the structure. I didn't look up the multivariate Gaussian formula. With your correction that line becomes:

$$C\exp[f(\mu,\Sigma_k)]\left[\frac{\partial}{\partial \Sigma_k}|\Sigma_k|^{-1}+|\Sigma_k|^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$
which is
$$C\exp[f(\mu,\Sigma_k)]\left[-|\Sigma_k|^{-2}\frac{\partial |\Sigma_k|}{\partial \Sigma_k}+|\Sigma_k|^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$

I think if you work through the univariate case first it'll become much clearer.
 
  • #5
andrewkirk said:
I didn't. What I wrote is only broadly indicative of the structure. I didn't look up the multivariate Gaussian formula. With your correction that line becomes:

$$C\exp[f(\mu,\Sigma_k)]\left[\frac{\partial}{\partial \Sigma_k}|\Sigma_k|^{-1}+|\Sigma_k|^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$
which is
$$C\exp[f(\mu,\Sigma_k)]\left[-|\Sigma_k|^{-2}\frac{\partial |\Sigma_k|}{\partial \Sigma_k}+|\Sigma_k|^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$

I think if you work through the univariate case first it'll become much clearer.

Okay, so rewriting with an exponent of -1/2 (as in the Gaussian) and repeating the operation, we would have:
$$C\exp[f(\mu,\Sigma_k)]\left[-\frac{1}{2}|\Sigma_k|^{-3/2}\frac{\partial |\Sigma_k|}{\partial \Sigma_k}+|\Sigma_k|^{-1/2}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$
So the problem becomes the extra ##|\Sigma_k|^{-1}## that gets left over after we factor out ##|\Sigma_k|^{-1/2}##. Any ideas?
 
  • #6
So I think I resolved my troubles using a few properties outlined in the Matrix Cookbook.
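
The post doesn't say which properties were used, but the standard Matrix Cookbook identities for this derivation are presumably these (stated for symmetric, invertible ##\Sigma_k##, ignoring symmetry correction terms):
$$\frac{\partial \ln|\Sigma_k|}{\partial \Sigma_k} = \Sigma_k^{-1}, \qquad \frac{\partial}{\partial \Sigma_k}\left[(x_n-\mu_k)^T \Sigma_k^{-1} (x_n-\mu_k)\right] = -\Sigma_k^{-1}(x_n-\mu_k)(x_n-\mu_k)^T\,\Sigma_k^{-1}.$$
Writing ##|\Sigma_k|^{-1/2} = \exp(-\frac{1}{2}\ln|\Sigma_k|)## moves the determinant into the exponent, so the whole density becomes ##C\exp[g(\mu_k,\Sigma_k)]## and a single chain-rule step replaces the product rule. The leftover ##|\Sigma_k|^{-1}## from post #5 also cancels directly via the first identity, since ##\frac{\partial|\Sigma_k|}{\partial\Sigma_k} = |\Sigma_k|\,\Sigma_k^{-1}##, leaving a clean factor of ##|\Sigma_k|^{-1/2}##.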
 

Related to Derivative of Log Likelihood Function

1. What is the purpose of taking the derivative of the log likelihood function?

The derivative of the log likelihood function allows us to find the maximum likelihood estimate (MLE) of a statistical model's parameters. This is important because the MLE identifies the parameter values under which the observed data are most probable, which in turn supports more accurate predictions and inferences.

2. How is the derivative of the log likelihood function calculated?

The derivative of the log likelihood function is calculated by taking the partial derivative of the function with respect to each parameter. Setting these partial derivatives equal to zero yields a system of equations whose solution gives the parameter values that maximize the log likelihood.
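
As a toy illustration (my own sketch, not part of the thread above): for a single univariate Gaussian, solving the zero-derivative equations by hand gives the sample mean and the biased sample variance, and a numerical optimizer driving the gradient to zero reproduces them:
```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=3.0, size=1000)

def neg_log_likelihood(params, x):
    """Negative log likelihood of N(mu, sigma^2); optimize log(sigma) so sigma > 0."""
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    return np.sum(0.5 * np.log(2 * np.pi) + np.log(sigma)
                  + (x - mu) ** 2 / (2 * sigma ** 2))

# Minimizing the negative log likelihood drives its gradient to zero.
result = minimize(neg_log_likelihood, x0=np.array([0.0, 0.0]), args=(data,))
mu_hat, sigma_hat = result.x[0], np.exp(result.x[1])

# Closed-form MLEs obtained by setting the partial derivatives to zero by hand.
print(mu_hat, data.mean())          # both ~= sample mean
print(sigma_hat, data.std(ddof=0))  # both ~= biased sample standard deviation
```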

3. What does it mean if the derivative of the log likelihood function is 0?

If the derivative of the log likelihood function is zero, the parameter values are at a stationary point of the log likelihood. For a well-behaved model this is typically the maximum likelihood estimate, meaning the observed data are most probable under these parameter values, though strictly one should check second-order conditions to confirm the stationary point is a maximum rather than a minimum or saddle point.

4. Can the derivative of the log likelihood function ever be negative?

Yes. The log likelihood is a function of the parameters, and it decreases as a parameter moves away from its optimal value, so its derivative is negative wherever the function is decreasing in that parameter. At the maximum likelihood estimate itself the derivative is zero, and elsewhere its sign indicates the direction in which the likelihood increases.

5. How does the derivative of the log likelihood function relate to statistical hypothesis testing?

The derivative of the log likelihood function enters hypothesis testing through the score test, where the gradient of the log likelihood evaluated under the null hypothesis measures the evidence against it. Log likelihoods are also central to likelihood ratio tests, where the difference between the maximized log likelihoods under the null and alternative hypotheses is compared to a critical value to make a decision about the hypotheses.
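
For reference, the likelihood ratio statistic mentioned above takes the standard form
$$\lambda = -2\left[\ell(\hat{\theta}_0) - \ell(\hat{\theta}_1)\right],$$
where ##\ell## denotes the maximized log likelihood under the null (##\hat{\theta}_0##) and alternative (##\hat{\theta}_1##) models; by Wilks' theorem, ##\lambda## is asymptotically ##\chi^2##-distributed with degrees of freedom equal to the difference in the number of free parameters.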
