MacKay Textbook Example: Laplace Approximation

In summary, the Laplace approximation is a technique used in Bayesian statistics to approximate the posterior distribution of a parameter. It works by finding the mode of the posterior and fitting a normal distribution there, with the variance set by the curvature (second derivative) of the log-posterior at that point. The purpose of this method is to provide a simple and efficient way to approximate complex posterior distributions and their normalizing constants. However, it assumes a smooth, single-peaked posterior and may not provide accurate results for multimodal, strongly skewed, or heavy-tailed distributions.
  • #1
Master1022
Homework Statement
Example 27.1 from David MacKay's inference book: A photon counter is pointed at a remote star for one minute, in order to infer the rate of photons arriving at the counter per minute, ## \lambda ##. Assuming the number of photons collected ## r ## has a Poisson distribution with mean ## \lambda ##,
[tex] P(r|\lambda) = \exp(-\lambda) \frac{\lambda ^ r}{r!} [/tex]
and assuming the prior ## P(\lambda) = \frac{1}{\lambda} ##, make a Laplace approximation to the posterior distribution: (b) over ## \log(\lambda) ##
Relevant Equations
Laplace Approximation Formula
Hi,

I was attempting example 27.1 question from the book: 'Information Theory, Inference, and Learning Algorithms'. It is about the Laplace approximation. I was confused about part (b) of the question and wanted to check my method if possible.

[EDIT]: The link to the book website (official) is here: HERE

I understand the process of the Laplace approximation as taking an unnormalized distribution ## f(x) ## from the integral of interest, ## Z = \int f(x) dx ##, and then:
1. Calculate the mode ## x_{mode} ## of ## f ##
2. Calculate ## \left| \frac{\partial^2 \log(f)}{\partial x^2} \right| ## evaluated at ## x = x_{mode} ##
3. Calculate ## Z \approx f(x_{mode}) \cdot \sqrt{\frac{2\pi}{\left| \frac{\partial^2 \log(f)}{\partial x^2} \right|}} ##
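
The three steps above can be sketched numerically. This is a minimal illustration rather than anything from the book: the function name `laplace_approx`, the finite-difference step `h`, and the Newton iteration used to locate the mode are all assumptions made for this example, and it presumes a smooth, unimodal ## f ## supplied through its logarithm.

```python
import math

def laplace_approx(log_f, x_start, h=1e-4, tol=1e-10, max_iter=100):
    """Return (mode, curvature, Z) for the unnormalized density exp(log_f(x)).

    Hypothetical helper: assumes log_f is smooth with a single maximum
    near x_start. Derivatives are central finite differences.
    """
    def d1(x):
        # First derivative of log f
        return (log_f(x + h) - log_f(x - h)) / (2 * h)

    def d2(x):
        # Second derivative of log f
        return (log_f(x + h) - 2 * log_f(x) + log_f(x - h)) / (h * h)

    # Step 1: find the mode with Newton's method on d(log f)/dx = 0
    x = x_start
    for _ in range(max_iter):
        step = d1(x) / d2(x)
        x -= step
        if abs(step) < tol:
            break

    # Step 2: magnitude of the curvature of log f at the mode
    c = abs(d2(x))

    # Step 3: Z is approximately f(mode) * sqrt(2 * pi / c)
    z = math.exp(log_f(x)) * math.sqrt(2 * math.pi / c)
    return x, c, z

# Usage: an unnormalized Gaussian with mean 1 and variance 4
mode, c, z = laplace_approx(lambda x: -(x - 1.0) ** 2 / 8.0, x_start=0.0)
print(mode, c, z)
```

For a Gaussian integrand the approximation is exact, so the returned Z should match ## \sqrt{2\pi \cdot 4} ## up to finite-difference error. For other integrands it is only as good as the Gaussian fit at the peak.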

Attempt:
At a high level, there are two stages to my attempt: 1) the variable substitutions, 2) the Laplace approximation

Stage 1)
I started by making the variable change: ## W = log(\lambda) ##, where I have taken log as the natural logarithm. Therefore, we can calculate the prior in terms of ## W ##: ## p(W) = |\frac{d\lambda}{dW}| p(\lambda) = \lambda \cdot \frac{1}{\lambda} = 1 ##.

Now we substitute variables into the expression for ## p(r|\lambda) ##:
[tex] p(r | W) = e^{-e^{W}} \cdot \frac{e^{Wr}}{r!} [/tex]

However, did I also need to include a factor of ## |\frac{d\lambda}{dW}| ## in the above expression? Also, I am imagining the 'integral' of interest for the original problem as:
[tex] \int p(r | \lambda) p(\lambda) d\lambda [/tex]
When I make the substitutions, do I need to include the extra factor obtained when changing ## d\lambda ## to ## \frac{d\lambda}{dW} dW = \lambda dW ##?

In my attempt, I did not include extra ## \lambda ## factors from the substitution into ## p(r | W) ## or ## d\lambda ## because I was unsure.

Stage 2)
So the posterior distribution is ## p(W | r) = \frac{p(r|W) p(W)}{p(r)} \propto p(r|W) p(W) ##. From step 1 of the Laplace approximation process, I then found the mode of this expression by taking natural logs of both sides, differentiating, and setting equal to zero (work omitted to avoid overcrowding the post). I got a modal value of: ## W_{mode} = \log(r) ## where log is the natural logarithm.

From step 2, I then calculated the second derivative and substituted in the modal value of ## W ## to get:
[tex] |\frac{\partial^2 (log(f))}{\partial W^2}| = r [/tex]
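
For reference, the omitted working can be reconstructed as follows, assuming the flat prior ## p(W) = 1 ## from Stage 1 so that ## f(W) = p(r|W) p(W) ##. This is a sketch of the derivation, not necessarily the exact route taken in the original attempt:

[tex] \log f(W) = -e^{W} + rW - \log(r!) [/tex]
[tex] \frac{\partial \log f}{\partial W} = -e^{W} + r = 0 \implies W_{mode} = \log(r) [/tex]
[tex] \frac{\partial^2 \log f}{\partial W^2} = -e^{W} \implies \left| \frac{\partial^2 \log f}{\partial W^2} \right|_{W = W_{mode}} = e^{\log(r)} = r [/tex]

The term ## -\log(r!) ## is constant in ## W ## and ## rW ## is linear in ## W ##, so neither contributes to the second derivative.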

Then I substituted into step 3 to get:
[tex] Z = f(W_{mode}) \cdot \sqrt{\frac{2\pi}{\left| \frac{\partial^2 \log(f)}{\partial W^2} \right|}} = \frac{e^{-r} \cdot r^{r}}{r!} \cdot \sqrt{\frac{2\pi}{r}} [/tex]
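
As a sanity check on this value of Z, the integral can also be done exactly: substituting ## \lambda = e^{W} ## back turns it into a Gamma integral, giving ## Z = \Gamma(r)/r! = 1/r ##. The short Python sketch below compares the two. The function names `laplace_Z` and `exact_Z` are hypothetical, and it assumes ## f(W) = e^{-e^{W}} e^{Wr}/r! ## with no additional Jacobian factor.

```python
import math

def laplace_Z(r):
    """Laplace estimate of Z for f(W) = exp(-e^W) * e^(W*r) / r!.

    Uses the mode W = log(r) and curvature r from the attempt above.
    """
    # log f at the mode: -r + r*log(r) - log(r!)
    log_f_mode = -r + r * math.log(r) - math.lgamma(r + 1)
    return math.exp(log_f_mode) * math.sqrt(2 * math.pi / r)

def exact_Z(r):
    """Exact integral: Gamma(r) / r!, which simplifies to 1 / r."""
    return math.exp(math.lgamma(r) - math.lgamma(r + 1))

# Compare the approximation with the exact value for a few counts
for r in (2, 10, 100):
    approx, exact = laplace_Z(r), exact_Z(r)
    print(r, approx, exact, approx / exact)
```

The ratio should move toward 1 as ## r ## grows, which is the same statement as Stirling's approximation for ## r! ## becoming more accurate.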

Have I attempted this problem correctly? Did I make mistakes during the variable substitution phase?

Any help would be greatly appreciated
 
  • #2

Hi,

Thank you for your question. You have a good understanding of the Laplace approximation process, and your result is correct. A few points are worth clarifying.

In stage 1, you correctly changed variables to W and transformed the prior. The Jacobian term ## |\frac{d\lambda}{dW}| = \lambda ## belongs to the prior, because the prior is a density over ## \lambda ##, and you have already included it there. The likelihood is a distribution over ## r ##, not over ## \lambda ## or ## W ##, so it does not pick up a Jacobian term. Written in terms of W, it is simply:

[tex] p(r | W) = e^{-e^{W}} \cdot \frac{e^{Wr}}{r!} [/tex]

Similarly, when you substitute for ## d\lambda ## in the integral of interest, the factor ## \lambda ## from ## d\lambda = \lambda dW ## cancels the prior ## \frac{1}{\lambda} ##, which is the same statement as ## p(W) = 1 ##:

[tex] \int p(r | \lambda) p(\lambda) d\lambda = \int e^{-e^{W}} \cdot \frac{e^{Wr}}{r!} \cdot \frac{1}{\lambda} \cdot \lambda dW = \int e^{-e^{W}} \cdot \frac{e^{Wr}}{r!} dW [/tex]

So the Jacobian term should be counted exactly once. Using ## p(W) = 1 ## and then multiplying by another factor of ## \lambda ## would count it twice and lead to incorrect results.

In stage 2, you correctly found the mode of the posterior distribution. Your second derivative is also correct. The term ## -\log(r!) ## does not depend on W and the term ## Wr ## is linear in W, so only ## -e^{W} ## contributes, and at ## W_{mode} = \log(r) ## this gives:

[tex] \left| \frac{\partial^2 \log(f)}{\partial W^2} \right| = e^{W_{mode}} = r [/tex]

Substituting this into step 3 gives the value of Z that you found. The corresponding Gaussian approximation to the posterior over W has mean ## \log(r) ## and variance ## \frac{1}{r} ##.

Overall, it seems like you have a good understanding of the Laplace approximation process. Just remember that the Jacobian term enters once, through the density being transformed, and that terms which are constant or linear in W drop out of the second derivative. I hope this helps! If you have any further questions, please let me know.
 

FAQ: MacKay Textbook Example: Laplace Approximation

What is the MacKay Textbook Example?

The MacKay Textbook Example refers to a specific problem presented in the textbook "Information Theory, Inference, and Learning Algorithms" by David MacKay. It is used to illustrate the concept of Laplace approximation, which is a method for approximating complex integrals.

What is Laplace Approximation?

Laplace approximation is a method for approximating complex integrals by fitting a Gaussian to the integrand at its peak. It involves finding the maximum of the logarithm of the integrand and using the second derivative at that point to set the width of the Gaussian, whose integral is known in closed form.

Why is Laplace Approximation useful?

Laplace approximation is useful because it allows for the approximation of complex integrals that cannot be solved analytically. It is also relatively simple to implement and can provide a good estimate of the integral.

What is the MacKay Textbook Example used for?

The MacKay Textbook Example is used to demonstrate how Laplace approximation can be applied to a specific problem. It is often used as a teaching tool to help students understand the concept and application of Laplace approximation.

Are there any limitations to Laplace Approximation?

Yes, there are limitations to Laplace approximation. It is only accurate when the integrand is well approximated by a Gaussian near its mode. It also assumes that the function being integrated is smooth and unimodal, which may not always be the case. Additionally, it may not work well for highly skewed or heavy-tailed distributions, and the result depends on the choice of variable, which is why the example asks for the approximation over ## \log(\lambda) ## specifically.
