Log-concave posterior density function adaptive rejection sampling

In summary: The derivative of the log of the posterior is not always non-increasing in $\theta$, which means that $g(\theta)$ is not generally log-concave; the transformed parameter $\phi = \theta/(1-\theta)$ is introduced to obtain a density suitable for adaptive rejection sampling.
  • #1
numbersense
Let $x_1,x_2,\ldots,x_n$ be independent, identically distributed binary observations with a Bernoulli($\theta$) distribution, so that $f(x_i | \theta) = \theta^{x_i} (1 - \theta)^{1-x_i}$. Suppose that the prior density for $\theta$ is
\begin{equation*}
f(\theta) \propto \theta^{-1} (1 - \theta)^{-1} \exp\left(- \frac{1}{2\nu} \left(\log \frac{\theta}{1-\theta}\right)^2\right)
\end{equation*}
Show that the posterior density for $\theta$ is not generally log-concave, but that the posterior density for $\displaystyle \phi = \frac{\theta}{1-\theta}$ is and hence can be sampled by adaptive rejection sampling. Write down the acceptance probability for a random walk Metropolis-Hastings algorithm where the proposal $\phi^*$ is generated from a normal distribution with mean $\phi^i$ and variance $\sigma^2$.
The likelihood function is
\begin{equation*}
f(\boldsymbol{x} | \theta) = \theta^{\sum x_i} (1- \theta)^{n - \sum x_i}
\end{equation*}
The posterior density function is
\begin{equation*}
g(\theta) = f(\theta | \boldsymbol{x}) \propto f(\boldsymbol{x} | \theta) f(\theta) = \theta^{(\sum x_i) - 1} (1-\theta)^{n-1-\sum x_i} \exp\left(-\frac{1}{2\nu} \left(\log\frac{\theta}{1-\theta}\right)^2\right)
\end{equation*}
\begin{equation*}
\ln g(\theta) = \left(\left(\sum x_i\right) - 1\right) \ln \theta + \left(n-1-\sum x_i\right) \ln (1-\theta) - \frac{1}{2\nu} \left(\log\frac{\theta}{1-\theta}\right)^2
\end{equation*}
\begin{align*}
\frac{d (\ln g)}{d\theta}(\theta) & = \left(\left(\sum x_i\right) - 1\right) \frac{1}{\theta} - \left(n-1-\sum x_i\right) \frac{1}{1-\theta} - \frac{1}{\nu} \log\left(\frac{\theta}{1-\theta}\right) \frac{1-\theta}{\theta} \frac{(1-\theta) + \theta}{(1-\theta)^2} \\
& = \left(\left(\sum x_i\right) - 1\right) \frac{1}{\theta} - \left(n-1-\sum x_i\right) \frac{1}{1-\theta} - \frac{1}{\nu} \log\left(\frac{\theta}{1-\theta}\right) \frac{1}{\theta (1- \theta)}
\end{align*}
For $g(\theta)$ to be log-concave, the derivative of the log of the posterior, i.e. $\displaystyle \frac{d(\ln g)}{d\theta}$, has to exist and be non-increasing in $\theta$. This derivative clearly exists on $(0,1)$; how can one show that it is not always non-increasing in $\theta$, and hence that $g(\theta)$ is not generally log-concave?
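One quick way to confirm this is numerical: pick data and a prior variance, evaluate $\frac{d(\ln g)}{d\theta}$ on a grid, and check whether it ever increases. A minimal sketch (the values of $n$, $\sum x_i$ and $\nu$ below are my own illustrative choices, picked so the failure is visible):

```python
import numpy as np

# Illustrative choices (not from the thread): all-zero data and a diffuse prior,
# i.e. n = 2 observations with sum(x_i) = 0 and nu = 100.
n, s, nu = 2, 0, 100.0

def dlog_g(theta):
    """d(ln g)/d(theta) as derived above."""
    logit = np.log(theta / (1 - theta))
    return (s - 1) / theta - (n - 1 - s) / (1 - theta) - logit / (nu * theta * (1 - theta))

theta = np.linspace(0.01, 0.99, 99)
d = dlog_g(theta)
rising = np.diff(d) > 0  # log-concavity would force d to be non-increasing
print("derivative increases on part of (0,1):", rising.any())
print("e.g. near theta =", theta[:-1][rising][:3])
```

For these values the derivative rises over part of $(0,1)$, so $g(\theta)$ is not log-concave there; with other choices (e.g. $0 < \sum x_i < n$ and small $\nu$) the same check can come back monotone decreasing, which is consistent with the posterior being "not generally" log-concave rather than never.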
 
  • #2
We can use the fact that the transformation $\phi = \frac{\theta}{1-\theta}$ is one-to-one, with inverse $\theta = \frac{\phi}{1+\phi}$ and Jacobian $\frac{d\theta}{d\phi} = \frac{1}{(1+\phi)^2}$. By the change-of-variables formula, and using $1 - \frac{\phi}{1+\phi} = \frac{1}{1+\phi}$ and $\log\frac{\theta}{1-\theta} = \log\phi$, the posterior density for $\phi$ is
\begin{align*}
g(\phi) & \propto \left(\frac{\phi}{1+\phi}\right)^{(\sum x_i) - 1} \left(\frac{1}{1+\phi}\right)^{n-1-\sum x_i} \exp\left(-\frac{1}{2\nu} \left(\log \phi\right)^2\right) \frac{1}{(1+\phi)^2} \\
& = \phi^{(\sum x_i) - 1} (1+\phi)^{-n} \exp\left(-\frac{1}{2\nu} \left(\log \phi\right)^2\right)
\end{align*}
Thus, we can write the log of the posterior as
\begin{equation*}
\ln g(\phi) = \left(\left(\sum x_i\right) - 1\right) \ln \phi - n \ln (1+\phi) - \frac{1}{2\nu} \left(\log \phi\right)^2
\end{equation*}
Taking the derivative with respect to $\phi$, we get
\begin{equation*}
\frac{d (\ln g)}{d\phi}(\phi) = \left(\left(\sum x_i\right) - 1\right) \frac{1}{\phi} - \frac{n}{1+\phi} - \frac{1}{\nu} \frac{\log \phi}{\phi}
\end{equation*}
For the Metropolis-Hastings part: the normal proposal $\phi^* \sim N(\phi^i, \sigma^2)$ is symmetric in its arguments, so the proposal densities cancel in the Metropolis-Hastings ratio and the acceptance probability reduces to
\begin{equation*}
\alpha(\phi^i, \phi^*) = \min\left\{1, \frac{g(\phi^*)}{g(\phi^i)}\right\},
\end{equation*}
where $g(\phi^*) = 0$ (automatic rejection) whenever the proposal falls outside the support, $\phi^* \le 0$.
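For concreteness, here is a minimal random-walk Metropolis-Hastings sketch in Python targeting the unnormalized $g(\phi)$ above (the data summaries, $\nu$, $\sigma$, starting value, and chain length are illustrative assumptions, not from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative values (not from the thread)
n, s, nu = 10, 3, 1.0      # sample size, sum of the x_i, prior variance
sigma = 0.5                # random-walk proposal standard deviation
n_iter = 10_000

def log_g(phi):
    """Unnormalized log posterior of phi as derived above; -inf off the support."""
    if phi <= 0:
        return -np.inf
    return (s - 1) * np.log(phi) - n * np.log1p(phi) - np.log(phi) ** 2 / (2 * nu)

phi = 1.0                  # arbitrary starting value
samples = np.empty(n_iter)
for i in range(n_iter):
    phi_star = rng.normal(phi, sigma)         # symmetric proposal N(phi, sigma^2)
    # Accept with probability min(1, g(phi*) / g(phi)), computed on the log scale.
    if np.log(rng.uniform()) < log_g(phi_star) - log_g(phi):
        phi = phi_star
    samples[i] = phi

print("posterior mean of phi ~", samples[2000:].mean())  # crude burn-in discard
```

Proposals with $\phi^* \le 0$ get $\log g(\phi^*) = -\infty$ and are rejected automatically, which matches the acceptance probability written above.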
 

FAQ: Log-concave posterior density function adaptive rejection sampling

What is a log-concave posterior density function?

A log-concave density is one whose logarithm is a concave function of its argument: on the interior of the support, the derivative of the log density is non-increasing (equivalently, the second derivative, where it exists, is non-positive). A log-concave posterior density is simply a posterior distribution, representing the probability of the parameters given observed data, whose density has this property. Log-concavity is common in Bayesian statistics; it holds, for example, for posteriors built from exponential-family likelihoods combined with log-concave priors.
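A standard worked example (not specific to this thread): the normal $N(\mu, \sigma^2)$ density is log-concave, since
\begin{equation*}
\log f(x) = -\frac{(x-\mu)^2}{2\sigma^2} + \text{const}, \qquad \frac{d^2}{dx^2} \log f(x) = -\frac{1}{\sigma^2} < 0.
\end{equation*}
By contrast, a heavy-tailed density such as the Student $t$ with $k$ degrees of freedom is not log-concave: its log density is convex in the tails, for $|x| > \sqrt{k}$.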

What is adaptive rejection sampling?

Adaptive rejection sampling (ARS) is a Monte Carlo method for generating independent random samples from a (possibly unnormalized) log-concave density. It constructs a piecewise-linear upper hull of the log density from tangents (or, in the derivative-free variant, secants) at a set of support points; exponentiating yields a piecewise-exponential envelope that is easy to sample from. Candidates drawn from the envelope are accepted or rejected by comparison with the true density, and rejected points are added to the set of support points, so the envelope adapts and tightens as sampling proceeds. Log-concavity is what guarantees that the tangent construction really is an upper bound.
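To make the construction concrete, here is a minimal Python sketch of the tangent version of the algorithm (in the spirit of Gilks and Wild's method, but without the squeeze test, without handling zero-slope tangents, and with no attention to numerical overflow; the function names and initialization are my own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def ars(h, hprime, x_init, lb, ub, n_samples):
    """Simplified tangent-based adaptive rejection sampling (no squeeze test).

    h, hprime : log density (unnormalized, must be concave) and its derivative
    x_init    : initial support points; if lb/ub are infinite, they must include
                a point with hprime > 0 and a point with hprime < 0
    lb, ub    : bounds of the support (may be -np.inf / np.inf)
    """
    xs = sorted(x_init)
    samples = []
    while len(samples) < n_samples:
        # Rebuilding everything each round is wasteful but keeps the sketch short.
        x = np.array(xs)
        hx = np.array([h(v) for v in xs])
        m = np.array([hprime(v) for v in xs])
        # Hull breakpoints: intersections of consecutive tangent lines.
        z = (hx[1:] - hx[:-1] - x[1:] * m[1:] + x[:-1] * m[:-1]) / (m[:-1] - m[1:])
        z = np.concatenate(([lb], z, [ub]))
        # Weight of each piece: integral of exp(tangent_j) over [z_j, z_{j+1}].
        w = np.empty(len(xs))
        for j in range(len(xs)):
            la = hx[j] + m[j] * (z[j] - x[j]) if np.isfinite(z[j]) else -np.inf
            lb_ = hx[j] + m[j] * (z[j + 1] - x[j]) if np.isfinite(z[j + 1]) else -np.inf
            w[j] = (np.exp(lb_) - np.exp(la)) / m[j]
        # Draw a candidate from the piecewise-exponential envelope by inverse CDF.
        j = rng.choice(len(xs), p=w / w.sum())
        a, b, u = z[j], z[j + 1], rng.uniform()
        if np.isfinite(a):
            xc = a + np.log1p(u * np.expm1(m[j] * (b - a))) / m[j]
        else:  # left-infinite piece, slope m[j] > 0
            xc = b + np.log(u) / m[j]
        # Accept against the true log density; otherwise refine the envelope.
        u_xc = hx[j] + m[j] * (xc - x[j])  # upper hull value at the candidate
        if np.log(rng.uniform()) <= h(xc) - u_xc:
            samples.append(xc)
        else:
            xs = sorted(xs + [xc])
    return np.array(samples)

# Usage: 1000 exact draws from a standard normal via its log density.
draws = ars(h=lambda v: -0.5 * v * v, hprime=lambda v: -v,
            x_init=[-1.0, 1.0], lb=-np.inf, ub=np.inf, n_samples=1000)
print(draws.mean(), draws.std())
```

The accepted draws are exact, independent samples from the target; the squeeze test omitted here only saves evaluations of `h`, it does not affect correctness.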

How is adaptive rejection sampling different from other sampling methods?

Unlike Markov chain Monte Carlo (MCMC) methods, adaptive rejection sampling produces independent draws from the exact target distribution, so there is no burn-in, autocorrelation, or convergence diagnostics to worry about, and no proposal distribution to tune. The original method of Gilks and Wild uses derivatives of the log density to build the tangent envelope, but a derivative-free variant exists that uses secants instead. The main restriction is that the target density must be log-concave; extensions such as adaptive rejection Metropolis sampling relax this requirement.

When is adaptive rejection sampling most useful?

Adaptive rejection sampling is most useful for one-dimensional densities that are log-concave but otherwise awkward to sample from, typically the full conditional distributions that arise inside a Gibbs sampler, where a fresh envelope must be built at each iteration and exact, independent draws are valuable. It is a univariate method: high-dimensional targets are handled only indirectly, by applying it one coordinate at a time within Gibbs sampling. When a density has a standard form with a known direct sampler, ARS is unnecessary overhead.

What are some potential drawbacks of adaptive rejection sampling?

One practical drawback of adaptive rejection sampling is that the user must supply initial support points; with unbounded support these must bracket the mode (a point where the log density has positive slope on the left and one with negative slope on the right), and a poor initial set makes the early envelope loose and the rejection rate high. More fundamentally, the method is only valid for log-concave densities: applied to a non-log-concave target, the "envelope" is no longer an upper bound and the resulting samples are biased. Finally, rebuilding the hull at every Gibbs iteration carries bookkeeping overhead, so simpler samplers can be faster when they apply.
