Rearranging a formula (Multivariate Gaussian function)

Pi-Bond · Dec 16, 2012

Homework Statement

See image. p(y|θ) is the likelihood function, which has to be rearranged into the form of equation (3). θ is a vector variable.

Homework Equations

None?

The Attempt at a Solution

I first expanded the exponent in the original function, equation (2).

[tex](b-A\theta)^T(b-A\theta)=b^Tb - b^TA\theta - \theta^T A^T b + \theta^T A^T A \theta[/tex]

Now suppose I can write the function equivalently as

[tex]C \exp\left( -\frac{1}{2} (\theta - \theta_0)^T L (\theta - \theta_0) \right)[/tex]

where C is the same constant that multiplies the exponential in equation (2). In this case, the exponents must be the same. So:

[tex]b^Tb - b^TA\theta - \theta^T A^T b + \theta^T A^T A \theta = \theta^TL\theta - \theta^TL\theta_0 - \theta_0^TL\theta + \theta_0^TL\theta_0 [/tex]

If this expression is true for all θ, then the coefficients of (...)θ, θ^T(...), θ^T(...)θ and the constants must all match.

So [itex]L = A^TA[/itex] and [itex]\theta_0 = L^{-1}A^Tb[/itex].
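
For reference, the coefficient matching can be written out term by term. This is only a sketch: it assumes L is taken to be symmetric and that A has full column rank, so that [itex]A^TA[/itex] is invertible (neither assumption is stated explicitly in the post).

[tex]\theta^T A^T A \theta = \theta^T L \theta \;\Rightarrow\; L = A^TA[/tex]
[tex]\theta^T A^T b = \theta^T L \theta_0 \;\Rightarrow\; A^Tb = L\theta_0 \;\Rightarrow\; \theta_0 = L^{-1}A^Tb[/tex]

The remaining linear term, [itex]b^TA\theta = \theta_0^TL\theta[/itex], is just the transpose of the same condition, so it gives nothing new.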

Now I don't know how to get [itex]\mathcal{L}_0[/itex] (equation (5)). I only have this condition left from my assumption about the constants above:

[tex]b^Tb=\theta_0^TL\theta_0[/tex]

There just don't seem to be enough terms in either exponent to produce the exponential part of [itex]\mathcal{L}_0[/itex]. I don't think I can add and subtract anything either...

Any ideas?

Ray Vickson · Dec 16, 2012

Try to show that
[tex] (b-A\theta)^T(b-A\theta) = (\theta - \theta_0)^T L (\theta - \theta_0) + K[/tex] for some matrix L and some constant K, and for ##\theta_0## as given in the question.

Pi-Bond · Dec 16, 2012

Ok.

[tex](b-A\theta)^T(b-A\theta)=b^Tb - b^TA\theta - \theta^T A^T b + \theta^T A^T A \theta[/tex]
[tex]= b^Tb - (L L^{-1}A^T b)^T\theta - \theta^T L( L^{-1}A^T b) + \theta^T L \theta[/tex]
[tex]= b^Tb - \theta_0^T L \theta - \theta^T L \theta_0 + \theta^T L \theta[/tex]
[tex]= b^Tb + (\theta-\theta_0)^T L \theta - \theta^T L \theta_0[/tex]
[tex]= b^Tb + (\theta-\theta_0)^T L \theta - (\theta-\theta_0)^T L \theta_0 -\theta_0^T L \theta_0 [/tex]
[tex]= b^Tb + (\theta-\theta_0)^T L (\theta-\theta_0)-\theta_0^T L \theta_0 [/tex]

I used the fact that [itex]L^T=L=A^TA[/itex].

On the basis of this it seems [itex]K = b^Tb - \theta_0^TL\theta_0[/itex].

I still can't see the origin of [itex]\mathcal{L}_0[/itex] here though...
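
One possible next step, sketched under the assumption that the exponential in [itex]\mathcal{L}_0[/itex] (equation (5), in the image that is not reproduced here) has an exponent proportional to [itex](b - A\theta_0)^T(b - A\theta_0)[/itex], as the later replies in the thread suggest. Since [itex]A^Tb = L\theta_0[/itex], the two cross terms each reduce to [itex]\theta_0^TL\theta_0[/itex]:

[tex]b^TA\theta_0 = \theta_0^TA^Tb = \theta_0^TL\theta_0[/tex]
[tex](b-A\theta_0)^T(b-A\theta_0) = b^Tb - b^TA\theta_0 - \theta_0^TA^Tb + \theta_0^TA^TA\theta_0 = b^Tb - \theta_0^TL\theta_0 = K[/tex]

If that reading of equation (5) is right, K is the squared residual at [itex]\theta_0[/itex], and it is the part of the exponent that does not depend on θ.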

Pi-Bond · Dec 20, 2012

Bump. I haven't been able to find any leads. Anyone have an idea?

pasmith · Dec 20, 2012

I'm somewhat confused by the fact that if [itex]L = A^TA[/itex] and [itex]\theta_0 = L^{-1}A^Tb[/itex] then [itex]L^{-1} = A^{-1}(A^T)^{-1}[/itex] so that
[tex]A\theta_0 = AL^{-1}A^Tb = A(A^{-1}(A^T)^{-1})A^Tb = b[/tex]
which means that [itex]b - A\theta_0 = b - b = 0[/itex]. So I'm at a loss to explain why the author has included the exponential in [itex]\mathcal{L}_0[/itex], when its value appears to be [itex]\exp(0) = 1[/itex].

On the other hand, I was able to show (using the additional fact that [itex]L[/itex] is symmetric so that [itex](L^T)^{-1} = (L^{-1})^T = L^{-1}[/itex]) that
[tex](\theta - \theta_0)^TL(\theta - \theta_0) = (A\theta - b)^T(A\theta - b)
= (b - A\theta)^T(b - A\theta)[/tex]
It's just a case of expanding the left hand side, substituting the definitions of [itex]L[/itex] and [itex]\theta_0[/itex] and simplifying.

pasmith · Dec 21, 2012

pasmith said:

I'm somewhat confused by the fact that if [itex]L = A^TA[/itex] and [itex]\theta_0 = L^{-1}A^Tb[/itex] then [itex]L^{-1} = A^{-1}(A^T)^{-1}[/itex]

Ignore that: A is not square, so cannot be invertible.
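
A quick numerical illustration of this point, as a sketch only. The 5×2 random matrix A and the vector b below are made-up test data, not values from the problem; the only assumption carried over is that A is tall with full column rank, so that [itex]A^TA[/itex] is invertible.

```python
import numpy as np

# Made-up test data: a tall (5 x 2) matrix A and a vector b.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 2))
b = rng.normal(size=5)

# L = A^T A and theta_0 = L^{-1} A^T b, computed with a linear solve
# rather than an explicit inverse.
L = A.T @ A
theta0 = np.linalg.solve(L, A.T @ b)

# The residual b - A theta_0 is generally nonzero for a tall A ...
residual = b - A @ theta0
residual_norm = float(np.linalg.norm(residual))

# ... but it is orthogonal to the columns of A (the normal equations).
normal_eq_gap = float(np.linalg.norm(A.T @ residual))

print("||b - A theta_0||       =", residual_norm)
print("||A^T (b - A theta_0)|| =", normal_eq_gap)
```

So for a non-square A the exponent built from [itex]b - A\theta_0[/itex] does not vanish in general, and the exponential is not simply [itex]\exp(0) = 1[/itex].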

The aim is to show that [itex](b - A\theta)^T(b - A\theta) = (b - A\theta_0)^T(b - A\theta_0) + (\theta - \theta_0)^TL(\theta - \theta_0)[/itex].

Now the left hand side is
[tex]b^Tb - b^TA\theta - \theta^TA^Tb + \theta^TA^TA\theta[/tex]

The right hand side is
[tex]\begin{aligned}
&b^Tb - b^TA\theta_0 - \theta_0^TA^Tb + \theta_0^TA^TA\theta_0 + \theta^TL\theta - \theta^TL\theta_0 - \theta_0^TL\theta + \theta_0^TL\theta_0\\
&= b^Tb - (b^TA\theta_0 + \theta_0^TA^Tb) + 2\theta_0^TL\theta_0 + \theta^TL\theta - \theta^TL\theta_0 - \theta_0^TL\theta \\
&= b^Tb - 2b^TAL^{-1}A^Tb + 2b^TAL^{-1}A^Tb + \theta^TA^TA\theta - \theta^TA^Tb - b^TA\theta \\
&= b^Tb -b^TA\theta - \theta^TA^Tb + \theta^TA^TA\theta
\end{aligned}[/tex]
as required.
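
The identity can also be spot-checked numerically. This is a minimal sketch with made-up random data (a 6×3 matrix A, a vector b and a test point θ, none of which come from the problem), again assuming A has full column rank.

```python
import numpy as np

# Made-up test data: tall A, vector b, and an arbitrary theta.
rng = np.random.default_rng(1)
A = rng.normal(size=(6, 3))
b = rng.normal(size=6)
theta = rng.normal(size=3)

L = A.T @ A
theta0 = np.linalg.solve(L, A.T @ b)   # theta_0 = L^{-1} A^T b


def quad(v):
    """Return v^T v as a Python float."""
    return float(v @ v)


# Left-hand side: (b - A theta)^T (b - A theta)
lhs = quad(b - A @ theta)

# Right-hand side: (b - A theta_0)^T (b - A theta_0)
#                  + (theta - theta_0)^T L (theta - theta_0)
d = theta - theta0
rhs = quad(b - A @ theta0) + float(d @ L @ d)

print("lhs =", lhs)
print("rhs =", rhs)
print("match:", bool(np.isclose(lhs, rhs)))
```

A single random test point does not prove anything, of course, but it is a cheap way to catch a sign error in the algebra.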

Pi-Bond · Dec 22, 2012

Edit: there seems to be a mistake between the second and third lines. You go from -(b^TAθ₀ + θ₀^TA^Tb) to -2b^T(...)+2b^T(...)

The plus sign should be minus, and I'm not sure of where the factor of 2 comes from. Anyway I will investigate your approach.
