Joint probability from conditional probability?

  • #1
Demystifier
Hi,
I am a quantum physicist who needs some practical help from mathematicians. :smile:

The physical problem that I have can be reduced to the following mathematical problem:
Assume that we have two correlated variables a and b. Assume that we know all conditional probabilities
P(a|b), P(b|a)
for all possible values of the variables a and b.
What I want to know are the joint probabilities P(a,b); a priori, they are not given. I want to ask the following:
What is the best I can conclude about P(a,b) from knowledge of P(a|b), P(b|a)?
Are there special cases (except the trivial case in which a and b are independent) in which P(a,b) can be determined uniquely?
Any further suggestions?

Thank you in advance! :smile:
 
  • #2
Denoting random variables with capital letters and omitting the Prob{.} part:

A|B = AB/B
B|A = AB/A

where AB is the joint probability. Since you know A|B and B|A, you have 2 equations in 3 unknowns (AB, A, B); you need a 3rd equation; for example: A = f(B). Or, without the shorthand notation, Prob{A} = f(Prob{B}).
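For instance, with made-up numbers at a single pair of values: if

[tex]P(a|b) = 0.5, \qquad P(b|a) = 0.25,[/tex]

then

[tex]P(a,b) = 0.5\,P(b) = 0.25\,P(a),[/tex]

so the ratio P(a)/P(b) = 2 is pinned down, but P(a,b) itself is not determined until a third relation is supplied.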

See also: http://en.wikipedia.org/wiki/Copula_(statistics)
 
  • #3
Well, since

[tex]p(a,b) = p(b|a) p(a) = p(a|b) p(b)[/tex]
where
[tex]p(a) = \int p(a,b)db[/tex]
[tex]p(b) = \int p(a,b)da[/tex]

we have that

[tex]\frac{p(a)}{p(b)} = \frac{p(a|b)}{p(b|a)}[/tex]

[itex]p(a) / p(b)[/itex] is obviously the product of [itex]p(a)[/itex] and [itex]1/p(b)[/itex], so [itex]p(a|b) / p(b|a)[/itex] must also factor into an a-dependent part and a b-dependent part. Once that factorization is found, one can read off [itex]p(a)[/itex] and [itex]1/p(b)[/itex] from it. The factorization is unique only up to a multiplicative constant, but that constant is fixed by requiring [itex]p(a)[/itex] and [itex]p(b)[/itex] to each integrate to one.

If [itex]p(a|b) / p(b|a)[/itex] does not separate into one a-dependent and one b-dependent factor, then [itex]p(a|b)[/itex] and [itex]p(b|a)[/itex] are inconsistent: they cannot be conditional probability distributions of the same joint distribution.
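To make this concrete, here is a minimal numerical sketch of the idea in Python, for a finite, strictly positive joint distribution (the 3x4 joint is invented purely for illustration):

[code]
import numpy as np

# Hypothetical 3x4 joint p(a,b) (rows: values of a, columns: values of b),
# strictly positive so the ratio below is defined everywhere.
rng = np.random.default_rng(0)
p_joint = rng.random((3, 4))
p_joint /= p_joint.sum()

# The two "given" conditionals.
p_a_given_b = p_joint / p_joint.sum(axis=0, keepdims=True)  # columns sum to 1
p_b_given_a = p_joint / p_joint.sum(axis=1, keepdims=True)  # rows sum to 1

# ratio[i, j] = p(a_i|b_j) / p(b_j|a_i) = p(a_i) / p(b_j):
# every column is proportional to p(a), every inverted row to p(b).
ratio = p_a_given_b / p_b_given_a
p_a = ratio[:, 0] / ratio[:, 0].sum()  # normalize a column -> p(a)
p_b = 1.0 / ratio[0, :]
p_b /= p_b.sum()                       # normalize an inverted row -> p(b)

# Rebuild the joint both ways; both must agree with the original.
assert np.allclose(p_b_given_a * p_a[:, None], p_joint)
assert np.allclose(p_a_given_b * p_b[None, :], p_joint)
[/code]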

Good luck,

-Emanuel
 
  • #4
I don't think that will always work, winterfors. For one thing, what happens if one of the denominators is 0?

Consider the example where a is a Gaussian variable with mean zero and unit variance, and b is exactly equal to a. The marginal density of b is, naturally, also a unit Gaussian, and the joint density is degenerate (it's like a scalar Gaussian on the diagonal in the (a,b)-plane). The conditional density is then a point mass at [itex]a = b[/itex], [itex]P(a|b) = \delta(a - b)[/itex], and vice-versa. The ratio of the conditional densities, then, is 1 when a=b, and undefined otherwise. This is enough for us to see that a and b are actually the same variable, and so of course have the same marginal, but it doesn't give us any idea what said marginal is. I.e., the whole thing would work out exactly the same if a were given a different marginal distribution.
 
  • #5
quadraphonics said:
I don't think that will always work

You're absolutely right.

There are situations where [itex]p(a)/p(b)[/itex] is undefined because both [itex]p(a|b)[/itex] and [itex]p(b|a)[/itex] are zero. In that case, there is no way of deducing a joint distribution without additional information. The expression [itex]p(a|b)/p(b|a)[/itex] may also simply be too complicated to separate easily into an [itex]a[/itex]-dependent and a [itex]b[/itex]-dependent factor.

An even more common problem is that [itex]p(a|b)[/itex] and [itex]p(b|a)[/itex] may be derived from different sources, and it may in such cases be incorrect to view them as conditionals of the same joint distribution [itex]p(a,b)[/itex].

-Emanuel
 
  • #6
Isn't it also a problem if just one of the conditional distributions assigns zero probability to some region where the corresponding marginal has nonzero probability? I.e., you're trying to infer something about the marginal from a condition that eliminates all information about it. And, after all, divide by zero is undefined...

But I think the approach should be fine, in principle, if you add the restriction that all of the distributions in question are nonzero on the support of the pertinent random variables (which is probably the case that most people are interested in). It may still be impractical to actually work out the expressions for the distributions (and there may be no closed-form expression, as they will typically require normalization), but it should all be well-defined...

For exponential-family distributions, my intuition is that this should always work out nicely, because the ratio of exponentials becomes a difference of exponents: the separation is additive in functions of a and b instead of multiplicative. Also note that exponential-family distributions tend to fulfill the nonzero requirement up front, except for a few boundary cases (which should be degenerate anyway, if my intuition holds...)
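As a concrete check of that intuition (a made-up but representative example): take a and b jointly Gaussian with zero means, unit variances, and correlation [itex]\rho[/itex]. Then

[tex]\log p(a|b) = -\frac{(a-\rho b)^2}{2(1-\rho^2)} + \text{const}, \qquad \log p(b|a) = -\frac{(b-\rho a)^2}{2(1-\rho^2)} + \text{const},[/tex]

and in the difference the cross terms cancel:

[tex]\log\frac{p(a|b)}{p(b|a)} = \frac{(b-\rho a)^2 - (a-\rho b)^2}{2(1-\rho^2)} = \frac{b^2 - a^2}{2} = \log p(a) - \log p(b)[/tex]

(the normalization constants of the two conditionals are equal and drop out). The separation is additive, exactly as described, and exponentiating recovers the standard normal marginals.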
 
  • #7
"An even more common problem is that [tex] p(a \mid b) [/tex] and [tex] p(b \mid a) [/tex] may be derived from different sources, and it may in such cases be incorrect to view them as conditionals of the same joint distribution."

I'm not sure what you mean by this.
 
Last edited:
  • #8
quadraphonics said:
Isn't it also a problem if just one of the conditional distributions assigns zero probability to some region where the corresponding marginal has nonzero probability?

If just one of [tex]p(a|b)[/tex] and [tex]p(b|a)[/tex] is zero, one can just invert both sides of

[tex]\frac{p(a)}{p(b)} = \frac{p(a|b)}{p(b|a)}[/tex]

and get a well-defined equation.
 
  • #9
statdad said:
"An even more common problem is that [tex] p(a \mid b) [/tex] and [tex] p(b \mid a) [/tex] may be derived from different sources, and it may in such cases be incorrect to view them as conditionals of the same joint distribution."

I'm not sure what you mean by this.

Uhm, it gets a bit technical in terms of what information sources have been used to construct each of the two conditional probability distributions. The short answer is that if they come from completely different sources, one has to assume a marginal distribution for each of them separately; neither marginal can be derived from the other conditional distribution.

One could do like this: Let's call our two conditional distributions [tex] q(b | a) [/tex] and [tex] r(a | b) [/tex].

We can construct two separate joint distributions by using two non-informative priors [tex] q(a) [/tex] and [tex] r(b) [/tex] :

[tex] q(a,b) = q(b | a)q(a) [/tex]
[tex] r(a,b) = r(a | b)r(b) [/tex]

These can then be combined into a third, joint probability distribution

[tex] p(a,b) = K \frac{q(a,b) r(a,b)}{\mu(a)\mu(b)} [/tex] ,

where [itex] \mu(a)[/itex] and [itex] \mu(b)[/itex] are homogeneous probability densities and [tex] K[/tex] is a normalization constant

[tex] \frac{1}{K} = \int \int \frac{q(a,b) r(a,b)}{\mu(a)\mu(b)} da db [/tex]

I'm not sure this makes things any clearer for you; if you're interested in this kind of combining of probability distributions, you can have a look in the book "Inverse Problem Theory and Methods for Model Parameter Estimation" by Albert Tarantola, around pages 13 and 32.
It's available for download at

http://www.ipgp.jussieu.fr/~tarantola/Files/Professional/Books/index.html
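For what it's worth, here is a small discrete sketch of that combination rule in Python. All the inputs are invented; the priors and the homogeneous densities [itex]\mu[/itex] are taken uniform, and the double integral becomes a double sum:

[code]
import numpy as np

na, nb = 3, 4
rng = np.random.default_rng(1)

# Two conditionals from two hypothetical, independent sources.
q_b_given_a = rng.random((na, nb))
q_b_given_a /= q_b_given_a.sum(axis=1, keepdims=True)  # rows sum to 1
r_a_given_b = rng.random((na, nb))
r_a_given_b /= r_a_given_b.sum(axis=0, keepdims=True)  # columns sum to 1

q_a = np.full(na, 1.0 / na)   # non-informative prior q(a)
r_b = np.full(nb, 1.0 / nb)   # non-informative prior r(b)
mu_a = np.full(na, 1.0 / na)  # homogeneous density mu(a)
mu_b = np.full(nb, 1.0 / nb)  # homogeneous density mu(b)

q_joint = q_b_given_a * q_a[:, None]  # q(a,b) = q(b|a) q(a)
r_joint = r_a_given_b * r_b[None, :]  # r(a,b) = r(a|b) r(b)

# p(a,b) = K q(a,b) r(a,b) / (mu(a) mu(b)); normalizing fixes K.
p = q_joint * r_joint / (mu_a[:, None] * mu_b[None, :])
p /= p.sum()
print(p)
[/code]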

Cheers,

-Emanuel
 
  • #10
For a particular a or b, sure, but we need the expression to hold for all a and b in the support of the marginals in order for the approach to work, don't we? Perhaps it would still work to do the factorization in a piecewise manner and then stitch the results back together. Note that both conditionals are zero exactly where the joint is zero, and the joint can vanish even where both marginals are positive, so the stitching has to route around such regions using points where the ratio is defined. However, I'm not sure this is possible, since you wouldn't know how to normalize each of the pieces. But perhaps it can all be worked out... I will think on it a bit more...

Regardless of the prospects for a piecewise solution, a sufficient condition is that at least one of the conditionals never equals zero in regions where the marginal is positive. Then, you put that conditional in the denominator and proceed. If both conditionals have (disjoint) zero regions, then neither choice of denominator works for the entire support of the marginals.
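Here is one way the stitching might look in Python, on a made-up discrete joint with zeros (this sketch assumes every row shares at least one usable column with row 0; more generally one would need a connected overlap pattern):

[code]
import numpy as np

# Made-up discrete joint containing zeros (rows: a, columns: b).
p_joint = np.array([[0.2, 0.1, 0.0],
                    [0.0, 0.3, 0.1],
                    [0.1, 0.0, 0.2]])
p_a_given_b = p_joint / p_joint.sum(axis=0, keepdims=True)
p_b_given_a = p_joint / p_joint.sum(axis=1, keepdims=True)

ok = (p_a_given_b > 0) & (p_b_given_a > 0)  # where the ratio is defined
ratio = np.where(ok, p_a_given_b / np.where(ok, p_b_given_a, 1.0), np.nan)

# ratio[i, j] = p(a_i) / p(b_j) wherever defined.  Stitch p(a) together by
# anchoring each row to row 0 through some column both rows can use.
p_a = np.ones(3)
for i in range(1, 3):
    j = np.flatnonzero(ok[0] & ok[i])[0]
    p_a[i] = ratio[i, j] / ratio[0, j]  # = p(a_i) / p(a_0)
p_a /= p_a.sum()                        # normalization fixes the scale
print(p_a, p_joint.sum(axis=1))         # recovered vs. true marginal
[/code]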
 
  • #11
Thank you winterfors - I understand your calculations (although I've never seen [tex] \mu [/tex] used to represent a density rather than a measure - merely notation), and I am actually aware of their basis. I am guilty of one of two things:
* either taking away an incomplete understanding of the OP's question, or
* not noticing that you were referring to a more general situation than the one in the current discussion
 

FAQ: Joint probability from conditional probability?

1. What is joint probability?

Joint probability is the probability of two or more events occurring together, i.e., a measure of how likely it is that all of them happen simultaneously.

2. How is joint probability related to conditional probability?

Conditional probability is the probability of an event occurring given that another event has already occurred. The two are linked by P(B|A) = P(A and B) / P(A): a conditional probability is a joint probability rescaled by the probability of the conditioning event.

3. What is the formula for calculating joint probability from conditional probability?

The formula for joint probability from conditional probability is P(A and B) = P(A) x P(B|A), where P(A) is the probability of event A occurring and P(B|A) is the conditional probability of event B occurring given that event A has already occurred.
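For example, with made-up numbers: if P(A) = 0.5 and P(B|A) = 0.4, then P(A and B) = 0.5 × 0.4 = 0.2.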

4. How is joint probability represented visually?

Joint probability can be represented using a Venn diagram, where the overlapping area between two events represents the joint probability of those events occurring together.

5. How is joint probability used in real-world applications?

Joint probability is commonly used in fields such as statistics, data analysis, and machine learning to understand the relationship between two or more events and make predictions based on their likelihood of occurring together. It is also used in risk management and decision-making processes to calculate the probability of multiple events happening simultaneously.
