Variance of Y^p: How to Calculate Using Conditional Probabilities

  • Thread starter Gridvvk
  • Tags
    Variance
In summary, the thread discusses how to find the variance of Y^p for any p > 0, where Y = X (a standard uniform random variable) if a coin toss is heads and Y = 1 if it is tails. Suggestions included finding the conditional distribution of Y given the coin outcome C, using the joint distribution over all possibilities, and determining the probability distribution of Y directly. It was also noted that the calculation mixes integrals and sums.
  • #1
Gridvvk
X ~ standard uniform random variable
We toss a coin randomly and define
Y := { X if the coin toss is heads
...{ 1 if the coin toss is tails

Question wants the Var(Y^p) for any p > 0.

My work:
Var(Y^p) = E(Y^(2p)) - E(Y^p)^2

I'm not sure how to go about finding E(Y^p) and E(Y^(2p)). I thought of using moment generating functions, but the preferred method is supposed to utilize conditional probabilities. Any hint on how to compute E(Y^p) would help.

Thanks
 
  • #2
Hey Gridvvk.

If your coin is Bernoulli = C (for coin) then your Y random variable is defined by:

Y = C + (1-C)X where C = 0 corresponds to heads and C = 1 corresponds to tails. You could also switch the values around to give you:

Y = (1-C) + CX.

I would suggest that you find the conditional distribution of Y given C first. To do this you have:

P(Y=1|C=1) = 1, P(Y=y|C=0) = 1, P(C=0) = P(C=1) = 0.5. From this you obtain the joint distribution and get:

P(A=a|B=b) = P(A=a,B=b)/P(B=b) which implies P(A=a,B=b) = P(A=a|B=b)*P(B=b).

You then use the joint distribution over all possibilities to get the moments and thus the Var[Y^p].

Note that you will be mixing integrals and sums together.
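
One way to read the "mixing integrals and sums" remark, sketched here as an illustration rather than as the poster's exact intent: sum over the two coin outcomes and, in the heads branch, integrate against the uniform density. The density notation ##f_X(x) = 1## on [0,1] and the generic function ##g## are assumptions introduced for this sketch. With a fair coin and C = 1 for tails:

[tex]
E[g(Y)] = P(C=1)\,g(1) + P(C=0)\int_0^1 g(x)\,f_X(x)\,dx = \tfrac12\,g(1) + \tfrac12\int_0^1 g(x)\,dx
[/tex]

Taking ##g(y) = y^p## and ##g(y) = y^{2p}## gives the two moments needed for the variance.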
 
  • #3
chiro said:
P(Y=y|C=0) = 1
I'm not at all sure what you mean by that. Rather than work with P(Y|C), how about going straight to E(Y^p|C)?
 
  • #4
Besides the other suggestions, you could do it by determining the probability distribution of Y. For example, it is not hard to get ##F(y) = P\{Y \leq y\}## for y ≥ 0, and from that get ##F_p(z) = P\{ Y^p \leq z \}## for z ≥ 0. Since ##Z = Y^p \geq 0## you can get its expected value by using the more-or-less standard expression
[tex] EZ = \int_0^{\infty} P\{ Z > z \} \, dz[/tex]
Similarly, you can get ##E Y^{2p}##.
 
  • #5
haruspex said:
I'm not at all sure what you mean by that. Rather than work with P(Y|C), how about going straight to E(Y^p|C)?

It has a uniform distribution over [0,1] which is P(X=x) = 1 where x is in [0,1].
 
  • #6
Thanks for all the hints and quick replies. I saw them right away but didn't fully know how to proceed, so I thought I'd come back and look at it later, but I'm drawing a blank.

chiro said:
Hey Gridvvk.

If your coin is Bernoulli = C (for coin) then your Y random variable is defined by:

Y = C + (1-C)X where C = 0 corresponds to heads and C = 1 corresponds to tails. You could also switch the values around to give you:

Y = (1-C) + CX.

I would suggest that you find the conditional distribution of Y given C first. To do this you have:

P(Y=1|C=1) = 1, P(Y=y|C=0) = 1, P(C=0) = P(C=1) = 0.5. From this you obtain the joint distribution and get:

P(A=a|B=b) = P(A=a,B=b)/P(B=b) which implies P(A=a,B=b) = P(A=a|B=b)*P(B=b).

You then use the joint distribution over all possibilities to get the moments and thus the Var[Y^p].

Note that you will be mixing integrals and sums together.

I really liked the idea of letting Y = C + (1-C)X, where C = 0 corresponds to heads and C = 1 corresponds to tails. But I wasn't sure how exactly that translates into getting the joint distribution, since Y is defined piecewise as a mixed random variable.

Instead I tried finding E[Y] = 1/2 * E[X] + 1/2 * 1 = 1/2 * 1/2 + 1/2 * 1 = 3/4.
I thought E[Y^2] = 1/2 * E[X^2] + 1/2 * 1 = 1/2 * 1/3 + 1/2 * 1 = 2/3

But this seems to give E[Y^2] - E[Y]^2 < 0, which cannot happen, so I must be going about it the wrong way.


Ray Vickson said:
Besides the other suggestions, you could do it by determining the probability distribution of Y. For example, it is not hard to get ##F(y) = P\{Y \leq y\}## for y ≥ 0, and from that get ##F_p(z) = P\{ Y^p \leq z \}## for z ≥ 0. Since ##Z = Y^p \geq 0## you can get its expected value by using the more-or-less standard expression
[tex] EZ = \int_0^{\infty} P\{ Z > z \} \, dz[/tex]
Similarly, you can get ##E Y^{2p}##.

To determine the CDF don't I need the PDF of Y? Let's suppose I do get it, how can one go from ##F(y) = P\{Y \leq y\}## to ##F_p(z) = P\{ Y^p \leq z \}## ?
 
  • #7
chiro said:
It has a uniform distribution over [0,1] which is P(X=x) = 1 where x is in [0,1].
Only if you redefine P(X=x) to mean dP[X<x]/dx.
 
  • #8
This is just a conditional distribution where C = 1: what do you find wrong with this?
 
  • #9
Gridvvk said:
Instead I tried finding E[Y] = 1/2 * E[X] + 1/2 * 1 = 1/2 * 1/2 + 1/2 * 1 = 3/4.
I thought E[Y^2] = 1/2 * E[X^2] + 1/2 * 1 = 1/2 * 1/3 + 1/2 * 1 = 2/3

But this seems to give E[Y^2] - E[Y]^2 < 0, which cannot happen
I think you've checked E[Y^2] - E[Y] instead of E[Y^2] - E[Y]^2.
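
To make the mix-up concrete, here is a short Python sketch (not part of the original posts; the variable names are made up for illustration) that redoes the p = 1 arithmetic with exact fractions and shows both the quantity that came out negative and the actual variance.

```python
from fractions import Fraction

# Moments of Y for p = 1, using the fair-coin split from the quoted post:
# E[Y]   = 1/2 * E[X]   + 1/2 * 1, with E[X]   = 1/2 for a standard uniform X
# E[Y^2] = 1/2 * E[X^2] + 1/2 * 1, with E[X^2] = 1/3
half = Fraction(1, 2)
mean_y = half * Fraction(1, 2) + half * 1
second_moment_y = half * Fraction(1, 3) + half * 1

wrong_check = second_moment_y - mean_y       # E[Y^2] - E[Y], not a variance
variance_y = second_moment_y - mean_y ** 2   # E[Y^2] - E[Y]^2

print(wrong_check)  # -1/12
print(variance_y)   # 5/48
```

So the two moments are fine; only the final subtraction was off.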
 
  • #10
chiro said:
This is just a conditional distribution where C = 1: what do you find wrong with this?
X has a uniform distribution, so continuous. The probability that it takes any specific value is zero.
 
  • #11
Gridvvk said:
To determine the CDF don't I need the PDF of Y? Let's suppose I do get it, how can one go from ##F(y) = P\{Y \leq y\}## to ##F_p(z) = P\{ Y^p \leq z \}## ?

If I tell you that ##Y^p \leq z## what can you tell me about Y?
 
  • #12
haruspex said:
I think you've checked E[Y^2] - E[Y] instead of E[Y^2] - E[Y]^2.

Oh, that is true, but when I try to generalize the same method for E[Y^p] I experience some unexpected results. But then again, I didn't use any conditional probabilities this way.

Var[Y^p] = E[Y^(2p)] - E[Y^p]^2

E[Y^p] = (1/2) * E[X^p] + 1/2 * 1 = 1/2(1 / (p + 1)) + 1/2 = (p + 2) / (2p + 2)
where I used the power rule for integration to get E[X^p] = 1/(p + 1).

Similarly, E[Y^(2p)] = (1/2) * E[X^(2p)] + (1/2) * 1 = 1/2(1 / (2p + 1)) + 1/2 = (p + 1) / (2p + 1)

Var[Y^p] = (p + 1) / (2p + 1) - [(p + 2) / (2p + 2)]^2 = p^2(2p + 3) / [4(p + 1)^2(2p + 1)]
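
As a check on the algebra above, here is a small Python sketch (not from the thread; the function names are hypothetical) that recomputes the variance from the two moments with exact rational arithmetic and compares it with the closed form for a few values of p. It assumes a fair coin and a standard uniform X, as in the problem statement.

```python
from fractions import Fraction

def moment_of_y(k):
    """E[Y^k] = 1/2 * E[X^k] + 1/2 * 1, where E[X^k] = 1/(k + 1) for k > 0."""
    k = Fraction(k)
    return Fraction(1, 2) / (k + 1) + Fraction(1, 2)

def variance_from_moments(p):
    """Var[Y^p] = E[Y^(2p)] - E[Y^p]^2, computed directly from the moments."""
    p = Fraction(p)
    return moment_of_y(2 * p) - moment_of_y(p) ** 2

def variance_closed_form(p):
    """The simplified expression p^2 (2p + 3) / [4 (p + 1)^2 (2p + 1)]."""
    p = Fraction(p)
    return p ** 2 * (2 * p + 3) / (4 * (p + 1) ** 2 * (2 * p + 1))

# The two routes agree exactly for every rational p tried here.
for p in (Fraction(1, 2), 1, 2, 10):
    assert variance_from_moments(p) == variance_closed_form(p)

print(variance_closed_form(1))  # 5/48
print(variance_closed_form(2))  # 7/45
```

Evaluating `float(variance_closed_form(p))` for increasingly large p is a quick way to watch the value approach 1/4.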

However, the limit of Var[Y^p] as p tends to infinity is 1/4, and not 0 as I thought it would be by the law of large numbers.


Ray Vickson said:
If I tell you that ##Y^p \leq z## what can you tell me about Y?

I'm not entirely sure, would it be ##Y \leq z## as well?
 
  • #13
Gridvvk said:
However, the limit of Var[Y^p] as p tends to infinity is 1/4, and not 0 as I thought it would be by the law of large numbers.
In that limit, Y^p is equally likely 1 or 0. Sounds like a var of 1/4 to me.
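
A quick simulation makes this limit visible. The following is a minimal Monte Carlo sketch in Python, assuming a fair coin and a standard uniform X; the function names, sample size and seed are arbitrary choices for illustration and not something from the thread.

```python
import random

def simulate_variance(p, n_samples=100_000, seed=0):
    """Estimate Var[Y^p] by simulating the coin toss and the uniform draw."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_samples):
        heads = rng.random() < 0.5
        y = rng.random() if heads else 1.0  # Y = X on heads, Y = 1 on tails
        z = y ** p
        total += z
        total_sq += z * z
    mean = total / n_samples
    return total_sq / n_samples - mean * mean

def closed_form(p):
    """Var[Y^p] = p^2 (2p + 3) / [4 (p + 1)^2 (2p + 1)] from the derivation above."""
    return p ** 2 * (2 * p + 3) / (4 * (p + 1) ** 2 * (2 * p + 1))

for p in (1, 5, 50, 500):
    print(p, round(simulate_variance(p), 4), round(closed_form(p), 4))
```

For large p the heads branch contributes values that are almost always close to 0, while the tails branch always contributes exactly 1, so the estimate settles near the variance of a fair 0/1 variable.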
 
  • #14
haruspex said:
In that limit, Y^p is equally likely 1 or 0. Sounds like a var of 1/4 to me.

Yes that does make sense. Just to clarify, the reason this doesn't violate the law of large numbers is because the law doesn't necessarily say that any variance dies out?

Does that mean the method I utilized is correct, even though I did not rely on any conditional probabilities?
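
On the second question: the split E[Y^p] = (1/2) * E[X^p] + (1/2) * 1 is already a conditional calculation, since it weights E[Y^p | heads] and E[Y^p | tails] by the probability of each outcome. As a cross-check that is not part of the thread, the law of total variance gives the same result. This sketch assumes a fair coin, writes C for the coin outcome, and uses Var(X^p) = 1/(2p+1) - 1/(p+1)^2 on heads and zero conditional variance on tails:

[tex]
\begin{aligned}
\operatorname{Var}(Y^p) &= E[\operatorname{Var}(Y^p \mid C)] + \operatorname{Var}(E[Y^p \mid C]) \\
&= \tfrac12\left(\frac{1}{2p+1} - \frac{1}{(p+1)^2}\right) + \tfrac14\left(1 - \frac{1}{p+1}\right)^2 \\
&= \frac{p^2}{2(2p+1)(p+1)^2} + \frac{p^2}{4(p+1)^2} = \frac{p^2(2p+3)}{4(p+1)^2(2p+1)}
\end{aligned}
[/tex]

The second term uses the fact that E[Y^p | C] takes the values 1/(p+1) and 1 with probability 1/2 each, so its variance is a quarter of the squared gap between them.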
 
  • #15
Gridvvk said:
I'm not entirely sure, would it be ##Y \leq z## as well?

NO! Think about it. If ##Y \leq 1/2## do you honestly believe that ##Y^{10}## can be as large as 1/2, or does it have to be a lot less than 1/2? Conversely, if you know that ##Y^{10} \leq 1/2## do you really think that Y cannot be larger than 1/2?
 
  • #16
Ray Vickson said:
NO! Think about it. If ##Y \leq 1/2## do you honestly believe that ##Y^{10}## can be as large as 1/2, or does it have to be a lot less than 1/2? Conversely, if you know that ##Y^{10} \leq 1/2## do you really think that Y cannot be larger than 1/2?

I thought you were referring to a random variable, Y, so I wasn't sure whether that property held for higher moments. But yes, if you keep taking higher powers of a number in (0,1), you approach 0. Conversely, if Y^p is bounded by some z in (0,1), then the bigger p gets, the closer Y can be to 1.

I fail to see the connection this has with the problem though.
 
  • #17
Gridvvk said:
I thought you were referring to a random variable, Y, so I wasn't sure whether that property held for higher moments. But yes, if you keep taking higher powers of a number in (0,1), you approach 0. Conversely, if Y^p is bounded by some z in (0,1), then the bigger p gets, the closer Y can be to 1.

I fail to see the connection this has with the problem though.

For ##Z = Y^p## you can get a simple, explicit formula for ##P(Z \leq z)## and can use that to get explicit expressions for ##EZ## and ##\text{Var}\,Z##. In other words, if you know the probability distribution of a random variable you can use that to get the mean and variance, etc.
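
For readers following along, here is one way the distribution route can be carried out. It is a sketch assuming a fair coin and is not text from the post above. Since ##Y^p \leq z## is the same event as ##Y \leq z^{1/p}## for ##z \geq 0##, and Y has half its mass spread uniformly on [0,1] and half sitting at the point 1:

[tex]
P(Z \leq z) = P(Y \leq z^{1/p}) = \begin{cases} \tfrac12\, z^{1/p}, & 0 \leq z < 1 \\ 1, & z \geq 1 \end{cases}
[/tex]

[tex]
EZ = \int_0^{\infty} P\{Z > z\}\,dz = \int_0^1 \left(1 - \tfrac12\, z^{1/p}\right) dz = 1 - \frac{p}{2(p+1)} = \frac{p+2}{2(p+1)}
[/tex]

Replacing p by 2p in the same calculation gives ##E Z^2 = E Y^{2p} = (p+1)/(2p+1)##, and the variance follows from ##\text{Var}\,Z = EZ^2 - (EZ)^2##.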
 
  • #18
Ray Vickson said:
For ##Z = Y^p## you can get a simple, explicit formula for ##P(Z \leq z)## and can use that to get explicit expressions for ##EZ## and ##\text{Var}\,Z##. In other words, if you know the probability distribution of a random variable you can use that to get the mean and variance, etc.

Var[Z] = E[Z^2] - E[Z]^2

Doesn't this correspond directly to what I did, without resorting to the substitution ##Z = Y^p##?

Var[Y^p] = E[Y^(2p)] - E[Y^p]^2

E[Y^p] = (1/2) * E[X^p] + 1/2 * 1 = 1/2(1 / (p + 1)) + 1/2 = (p + 2) / (2p + 2)
Where I used the power-rule for integration to get E[X^p]

Similarly, E[Y^(2p)] = (1/2) * E[X^(2p)] + (1/2) * 1 = 1/2(1 / (2p + 1)) + 1/2 = (p + 1) / (2p + 1)

Var[Y^p] = (p + 1) / (2p + 1) - [(p + 2) / (2p + 2)]^2 = p^2(2p + 3) / [4(p + 1)^2(2p + 1)]

Unless my work/answer was wrong, or the point of letting Z = Y^p is to streamline the thinking and present it more clearly.
 
  • #19
Gridvvk said:
Var[Z] = E[Z^2] - E[Z]^2

Doesn't this correspond directly to what I did, without resorting to the substitution ##Z = Y^p##?

Unless my work/answer was wrong, or the point of letting Z = Y^p is to streamline the thinking and present it more clearly.

No problem. These are just two (slightly different) ways of doing the same problem. If you prefer one way, go for it.
 
  • #20
haruspex said:
X has a uniform distribution, so continuous. The probability that it takes any specific value is zero.

You are way too anal.

Stating P(X=x) = 1 for x in [0,1] is a very standard way of describing a probability density function.

You don't need to bring up measure theory: it's entirely un-necessary for this problem.
 
  • #21
chiro said:
Stating P(X=x) = 1 for x in [0,1] is a very standard way of describing a probability density function.
Never seen that done before. I'm used to forms like ##f_X(x)## for continuous pdfs. That's why I couldn't understand what you'd written.
 
  • #22
In the books I've used and seen, typically the way that the probability function is interpreted is based on the domain.

You are right in that if it's on a continuous space like the real line, the probability of a single value is zero (and the proofs are done in theoretical probability and measure theory). However, it is implicitly understood that the density function corresponds to P(X=x), which takes a value at some value x (possibly a real number).

The density function is denoted as P(X=x) regardless of measure and domain (again in the books I've used) but the probabilistic interpretation will depend on the nature of the random variable.

I can understand that in rigorous treatments, the above might be seen as "hand-wavy", but in your normal A-level books, this is done quite a lot.
 
  • #23
I have seen many books on probability and have never seen the notation you use in the context of continuous random variables, only for discrete random variables. Of course, there are dozens and dozens of probability books I have not seen, so maybe that notation appears in some of them. What are the titles/authors of the books you cite, so we can know to stay away from them? There are very good reasons to be concerned about that notation, as it is guaranteed to confuse students and cause some of them to misunderstand the material.

BTW: I think it is inappropriate of you to refer to another poster as "anal"; it borders on abuse.
 
  • #24
I apologize for the anal comment.

I should have clarified that most of my education comes from internal lecture notes as opposed to books (although I do use many books as supplements).

In any case you are right about the notation in the books, which leaves me to look at my notes for any instances of abused notation.

For the OP and any other readers, please disregard my post regarding equating probabilities in continuous random variables with the density function.
 

FAQ: Variance of Y^p: How to Calculate Using Conditional Probabilities

1. What is the definition of variance?

Variance is a measure of how spread out the values of a random variable are around its mean. It is the expected squared deviation from the mean: Var(Z) = E[(Z - E[Z])^2] = E[Z^2] - E[Z]^2.

2. What is conditional probability?

Conditional probability is the probability of an event occurring given that another event has occurred. It is calculated by dividing the probability of both events occurring by the probability of the conditioning event: P(A|B) = P(A and B) / P(B), provided P(B) > 0.

3. How is variance of Y^p calculated using conditional probabilities?

Condition on the coin toss. Given heads, Y^p = X^p, so E[Y^p | heads] = E[X^p] = 1/(p + 1); given tails, Y^p = 1. Weighting each case by its probability 1/2 gives E[Y^p] = (p + 2) / (2p + 2), and the same argument with 2p in place of p gives E[Y^(2p)] = (p + 1) / (2p + 1). Then Var(Y^p) = E[Y^(2p)] - E[Y^p]^2 = p^2(2p + 3) / [4(p + 1)^2(2p + 1)].

4. What are some real-world applications of calculating variance using conditional probabilities?

Conditional probabilities and variance calculations are commonly used in statistics, finance, and risk analysis. For example, they can be used to determine the likelihood of certain outcomes in a stock market investment, or to assess the risk of a medical treatment based on a patient's condition.

5. Are there any limitations to calculating variance using conditional probabilities?

The method requires knowing the probability of each conditioning case and the conditional moments within each case. In this thread both are known exactly (a fair coin and a standard uniform X), so the result is exact. When the case probabilities are estimated or the distribution is only assumed, the accuracy of the resulting variance depends on those inputs, so the result should be interpreted and validated with care.
