Master1022
- Homework Statement
- Two classes ## C_1 ## and ## C_2 ## have equal priors. The likelihoods of ## x ## belonging to each class are given by 2D normal distributions with different means, but the same covariance: [tex] p(x|C_1) = N(\mu_x, \Sigma) \quad \text{and} \quad p(x|C_2) = N(\mu_y, \Sigma) [/tex]
where we know the relationship between ## \mu_x ## and ## \mu_y ##
Determine the shape of the discriminant curve.
- Relevant Equations
- [tex] g(x) = \ln \left( \frac{p(C_1 | x)}{p(C_2 | x)} \right) [/tex]
Hi,
I was working on the following problem:
Two classes ## C_1 ## and ## C_2 ## have equal priors. The likelihoods of ## x ## belonging to each class are given by 2D normal distributions with different means, but the same covariance: [tex] p(x|C_1) = N(\mu_x, \Sigma) \quad \text{and} \quad p(x|C_2) = N(\mu_y, \Sigma) [/tex]
where we know the relationship between ## \mu_x ## and ## \mu_y ##
Determine the shape of the discriminant curve.
In a previous part of the question, we are told that ## y = Ax + t ## and thus we know that ## \mu_y = A \mu_x + t ## and ## \Sigma_y = A \Sigma_x A^T ##
Attempt:
From some online lecture notes, I gather that the shape of this discriminant curve should be a hyperplane (a line, since we are in 2D), but I now want to verify this to be the case.
Using Bayes' theorem, we can define ## g(x) ## as follows:
[tex] g(x) = \ln \left( \frac{p(C_1 | x)}{p(C_2 | x)} \right) = \ln \left( \frac{p(x | C_1)}{p(x | C_2)} \right) + \ln \left( \frac{p(C_1)}{p(C_2)} \right) [/tex]
Since the priors are equal, the second term is 0, and we are left with the first term. Its components are the 2D normal densities:
[tex] p(x | C_i ) = \frac{1}{(2 \pi) |\Sigma_i|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu_i)^T \Sigma_i ^{-1} (x - \mu_i) \right) [/tex]
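As a quick numerical sanity check on this 2D density formula, I compared it against a library implementation (the mean and covariance below are arbitrary test values I picked, not from the problem; `scipy` is assumed to be available):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Arbitrary test values (not from the problem): a 2D mean and a valid covariance
mu = np.array([1.0, -0.5])
Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
x = np.array([0.2, 0.7])

# Manual 2D Gaussian density: (2*pi)^(d/2) = 2*pi when d = 2
diff = x - mu
Sigma_inv = np.linalg.inv(Sigma)
manual = np.exp(-0.5 * diff @ Sigma_inv @ diff) / (2 * np.pi * np.sqrt(np.linalg.det(Sigma)))

# Reference value from scipy
reference = multivariate_normal(mean=mu, cov=Sigma).pdf(x)
assert np.isclose(manual, reference)
```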
Given that ## \Sigma_x = \Sigma_y = \Sigma ##, the normalization constants cancel, so we can separate the logarithms and keep only the exponents (dropping the common factor of ## \frac{1}{2} ##, which doesn't affect the zero set):
[tex] g(x) = -(x - \mu_x)^T \Sigma^{-1} (x - \mu_x) + (x - \mu_y)^T \Sigma^{-1} (x - \mu_y) [/tex]
The discriminant curve should be where the classes are equiprobable, and thus ## g(x) = \ln \left( \frac{p(C_1 | x)}{p(C_2 | x)} \right) = 0 ##
[tex] 0 = -(x - \mu_x)^T \Sigma^{-1} (x - \mu_x) + (x - \mu_y)^T \Sigma^{-1} (x - \mu_y) [/tex]
Now I suppose there are two ways to proceed:
1. Algebra
2. There is a hint about transforming ## \Sigma ## to the identity matrix, but I am not sure (a) how to properly do that, and (b) how that can help us. How could I do this second method?
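My guess at what the hint means, sketched numerically: if ## \Sigma = L L^T ## is a Cholesky factorization, then the map ## x \mapsto L^{-1} x ## should turn the covariance into the identity. This is an assumption on my part, checked here with an arbitrary test covariance:

```python
import numpy as np

# Arbitrary positive-definite test covariance (not from the problem)
Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])

# Cholesky factor L, lower triangular, with Sigma = L @ L.T
L = np.linalg.cholesky(Sigma)
L_inv = np.linalg.inv(L)

# Under x' = L_inv @ x, the covariance transforms as L_inv @ Sigma @ L_inv.T,
# which should be the identity matrix
transformed = L_inv @ Sigma @ L_inv.T
assert np.allclose(transformed, np.eye(2))
```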
Since I don't quite see how that transformation of the covariance matrix helps, I will continue with the algebra:
[tex] 0 = - ( x^T \Sigma^{-1} x - x^T \Sigma^{-1} \mu_x - \mu_x ^T \Sigma^{-1} x + \mu_x ^T \Sigma^{-1} \mu_x) + ( x^T \Sigma^{-1} x - x^T \Sigma^{-1} \mu_y - \mu_y ^T \Sigma^{-1} x + \mu_y ^T \Sigma^{-1} \mu_y) [/tex]
[tex] 0 = 2 x^T \Sigma^{-1} \mu_x - \mu_x ^T \Sigma^{-1} \mu_x - 2 x^T \Sigma^{-1} \mu_y + \mu_y ^T \Sigma^{-1} \mu_y [/tex]
Using the symmetry of ## \Sigma^{-1} ##, the mean terms combine as ## \mu_x^T \Sigma^{-1} \mu_x - \mu_y^T \Sigma^{-1} \mu_y = (\mu_x + \mu_y)^T \Sigma^{-1} (\mu_x - \mu_y) ##, so:
[tex] 0 = 2 x^T \Sigma^{-1} (\mu_x - \mu_y) - (\mu_x + \mu_y)^T \Sigma^{-1} (\mu_x - \mu_y) [/tex]
We know that ## \mu_x - \mu_y = (I - A)\mu_x - t ##, but I am not really sure how to proceed from here. I see that ## \Sigma^{-1} (\mu_x - \mu_y) ## is common to both terms, but I am not sure how this brings us closer to seeing that this is a hyperplane.
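To double-check my algebra, I verified numerically that the quadratic terms really do cancel, so that ## g ## is affine in ## x ## (arbitrary test values for ## \Sigma ## and the means; `numpy` assumed):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary test values (not from the problem)
Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)
mu_x = np.array([1.0, 2.0])
mu_y = np.array([-1.0, 0.5])

def g(x):
    # g(x) up to the common factor of 1/2
    return (-(x - mu_x) @ Sigma_inv @ (x - mu_x)
            + (x - mu_y) @ Sigma_inv @ (x - mu_y))

# Candidate affine form g(x) = w @ x + b, with the quadratic terms cancelled
w = 2 * Sigma_inv @ (mu_x - mu_y)
b = -(mu_x + mu_y) @ Sigma_inv @ (mu_x - mu_y)

# Check agreement at several random points
for _ in range(5):
    x = rng.normal(size=2)
    assert np.isclose(g(x), w @ x + b)
```

If this is right, then ## g(x) = 0 ## is a single linear equation in ## x ##, which would indeed be a line in 2D.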
Any help would be greatly appreciated.