Prob with Univariate&Bivariate&Marginal Normal Distributions

  • #1
s3a

Homework Statement



Problem:
In an acid-base titration, a base or acid is gradually added to the other until they have completely neutralized each other.

Let X and Y denote the milliliters of acid and base needed for equivalence, respectively.

Assume X and Y have a bivariate normal distribution with ##σ_X## = 5 mL, ##σ_Y## = 2 mL, ##μ_X## = 120 mL, ##μ_Y## = 100 mL, and ρ = 0.6. Determine the following:

a) Covariance of X and Y

b) Marginal probability distribution of X

c) P(X < 116)

d) Conditional probability distribution of X given that Y = 102

e) P(X < 116 | Y = 102)

Solutions given by solutions manual:
a) ##ρ = cov(X,Y)/(σ_X σ_Y) = 0.6##, so ##cov(X,Y) = 0.6 × 2 × 5 = 6##

b) The marginal probability distribution of X is normal with mean ##μ_X## and standard deviation ##σ_X##.

c) ##P(X < 116) = P(X - 120 < -4) = P((X - 120)/5 < -0.8) = P(Z < -0.8) = 0.21##

d) The conditional probability distribution of X given Y = 102 is a (univariate) normal distribution with mean and variance

##μ_{X | Y = 102} = 120 + 0.6(5/2)(102 - 100) = 123##

##σ^2_{X | Y = 102} = 25(1 - 0.36) = 16##

e) ##P(X < 116|Y=102) = P(Z < (116-123)/4) = P(Z < -1.75) = 0.040##
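As a sanity check, the manual's numbers in a)–e) can be reproduced numerically. Below is a minimal Python sketch (not part of the original post) that builds Φ from the standard library's `math.erf`:

```python
import math

def Phi(z):
    """Standard normal CDF, built from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu_x, mu_y = 120.0, 100.0
sigma_x, sigma_y, rho = 5.0, 2.0, 0.6

# a) cov(X, Y) = rho * sigma_X * sigma_Y
cov_xy = rho * sigma_x * sigma_y                              # 6.0

# c) standardize: P(X < 116) = Phi((116 - 120)/5)
p_c = Phi((116.0 - mu_x) / sigma_x)                           # ≈ 0.2119

# d) conditional mean and variance of X given Y = 102
mu_cond = mu_x + rho * (sigma_x / sigma_y) * (102.0 - mu_y)   # 123.0
var_cond = sigma_x**2 * (1.0 - rho**2)                        # 16.0

# e) P(X < 116 | Y = 102) with the conditional parameters
p_e = Phi((116.0 - mu_cond) / math.sqrt(var_cond))            # ≈ 0.0401

print(cov_xy, p_c, mu_cond, var_cond, p_e)
```

Φ(-0.8) ≈ 0.2119 and Φ(-1.75) ≈ 0.0401, matching the rounded answers 0.21 and 0.040.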

Homework Equations


##μ_{X | Y = y} = μ_X + ρ (σ_X/σ_Y)(y – μ_Y)##

##σ^2_{Y | X = x} = σ_Y^2 (1 – ρ^2)##

Bivariate normal distribution:

##f_{XY}(x,y; σ_X, σ_Y, μ_X, μ_Y, ρ) = 1/[2πσ_Xσ_Y√(1 – ρ^2)] × exp( –1/[2(1 – ρ^2)] × ((x – μ_X)^2/σ_X^2 – 2ρ(x – μ_X)(y – μ_Y)/[σ_Xσ_Y] + (y – μ_Y)^2/σ_Y^2) )## for –∞ < x < ∞ and –∞ < y < ∞, with parameters σ_X > 0, σ_Y > 0, –∞ < μ_X < ∞, –∞ < μ_Y < ∞ and –1 < ρ < 1.

Univariate normal distribution:

##f(x) = 1/[σ√(2π)] e^{–(x – μ)^2 / [2σ^2]}## for –∞ < x < ∞

The Attempt at a Solution


There are a few things I want to address with this post. Here they are.:

1) Would the marginal probability distribution of part b) be given by ##∫^∞_{–∞} 1/[σ_X√(2π)] e^{–(x – μ_X)^2 / [2σ_X^2]} dx## ?

2) Would this be calculable using the integral from my question “1)” above?

3) Would the conditional probability distribution of X given that Y = 102 be given by ##∫^∞_{–∞} 1/[σ_{X | Y = 102}√(2π)] e^{–(x – μ_{X | Y = 102})^2 / [2σ_{X | Y = 102}^2]} dx## ?

4) Would this be calculable using the integral from my question “3)” above? Any input would be GREATLY appreciated!
 
  • #2
s3a said:

There are a few things I want to address with this post. Here they are.:

1) Would the marginal probability distribution of part b) be given by ##∫^∞_{–∞} 1/[σ_X√(2π)] e^{–(x – μ_X)^2 / [2σ_X^2]} dx## ?

2) Would this be calculable using the integral from my question “1)” above?

3) Would the conditional probability distribution of X given that Y = 102 be given by ##∫^∞_{–∞} 1/[σ_{X | Y = 102}√(2π)] e^{–(x – μ_{X | Y = 102})^2 / [2σ_{X | Y = 102}^2]} dx## ?

4) Would this be calculable using the integral from my question “3)” above? Any input would be GREATLY appreciated!

1) NO, obviously not. The marginal distribution of X must have x in it, but you have integrated over all x (so your final answer will not contain x anymore). You need to erase the integration sign.
2) No integration necessary for answering the question.
3) No. A conditional distribution of X must have an x in it, and you have integrated over all x (so your answer will not contain x anymore). It would be OK to just erase the integral sign.
4) Same answer as in (2).
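The distinction in 1) and 3) can be illustrated numerically: the density f_X is a genuine function of x (different inputs give different values), while its integral over all x collapses to the constant 1, with no x left in the answer. A stdlib-only Python sketch (the grid and step size are arbitrary choices):

```python
import math

mu_x, sigma_x = 120.0, 5.0

def f_X(x):
    """Marginal density of X: N(120, 25)."""
    return math.exp(-((x - mu_x) ** 2) / (2 * sigma_x**2)) / (sigma_x * math.sqrt(2 * math.pi))

# The density is a function of x: different inputs, different values.
print(f_X(118.0), f_X(123.0))

# Integrating it over (effectively) all x leaves no x behind -- just 1.
h = 0.01
xs = [80.0 + h * i for i in range(8001)]   # x = 80..160, i.e. mu_x +/- 8 sigma
total = sum(h * 0.5 * (f_X(a) + f_X(b)) for a, b in zip(xs, xs[1:]))
print(total)   # ≈ 1.0
```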
 
  • #3
To my knowledge, it is always the case that P(X = x) = 0 when dealing with continuous functions.

Why does the use of ##f(x) = 1/[σ_X√(2π)] e^{–(x – μ_X)^2 / [2σ_X^2]}## (for –∞ < x < ∞) for part b) and ##f(x) = 1/[σ_{X | Y = 102}√(2π)] e^{–(x – μ_{X | Y = 102})^2 / [2σ_{X | Y = 102}^2]}## (for –∞ < x < ∞) for part d) not contradict my statement above?

Also, the sheet of paper that my school gives, which has the table of the normal distribution's Z values and probabilities, states that ##Φ(z) = P(Z ≤ z) = ∫_{-∞}^z 1/√{2π} e^{-u^2/2} du##. How does this relate to the two f(x) functions I mentioned in the paragraph above?

P.S.
Sorry for the formatting screw-up of my initial post.
 
  • #4
s3a said:
To my knowledge, it is always the case that P(X = x) = 0 when dealing with continuous functions.

Why does the use of ##f(x) = 1/[σ_X√(2π)] e^{–(x – μ_X)^2 / [2σ_X^2]}## (for –∞ < x < ∞) for part b) and ##f(x) = 1/[σ_{X | Y = 102}√(2π)] e^{–(x – μ_{X | Y = 102})^2 / [2σ_{X | Y = 102}^2]}## (for –∞ < x < ∞) for part d) not contradict my statement above?

Also, the sheet of paper that my school gives, which has the table of the normal distribution's Z values and probabilities, states that ##Φ(z) = P(Z ≤ z) = ∫_{-∞}^z 1/√{2π} e^{-u^2/2} du##. How does this relate to the two f(x) functions I mentioned in the paragraph above?

P.S.
Sorry for the formatting screw-up of my initial post.

You need not worry about any contradictions regarding ##f_{X|Y}(x|Y=y)##, even though ##P(Y=y) = 0## for every ##y##. You can think of ##f_{X|Y}(x|Y=y)## as a limit of
[tex] f_{X|Y}(x|y < Y < y + \Delta y) = \frac{f_{XY}(x, y < Y < y+\Delta y)}{P(y < Y < y+\Delta y)}[/tex]
as ##\Delta y \to 0##.

I don't understand the rest of your question. Surely you know how to convert a probability such as ##P(X_{N(\mu, \sigma)} \leq x)## into the standard ##P(Z \leq z)##, where ##Z \sim N(0,1)##. You must have used it dozens of times already in your studies.
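The limiting construction above can also be checked by simulation: draw bivariate-normal pairs, keep only those whose Y lands in a narrow band around 102, and compare the empirical mean and standard deviation of X with the exact conditional values 123 and 4. A Monte Carlo sketch in Python (band width, sample count, and seed are arbitrary choices; the correlated pair is built from two independent standard normals):

```python
import math
import random

random.seed(0)
mu_x, mu_y = 120.0, 100.0
sigma_x, sigma_y, rho = 5.0, 2.0, 0.6

# Keep the X-values of pairs whose Y lands in a narrow band around 102.
band = []
for _ in range(500_000):
    z1 = random.gauss(0.0, 1.0)
    z2 = random.gauss(0.0, 1.0)
    y = mu_y + sigma_y * z1                                        # Y ~ N(100, 4)
    x = mu_x + sigma_x * (rho * z1 + math.sqrt(1 - rho**2) * z2)   # corr(X, Y) = 0.6
    if abs(y - 102.0) < 0.05:                                      # the "y < Y < y + Δy" band
        band.append(x)

n = len(band)
mean_x = sum(band) / n
std_x = math.sqrt(sum((v - mean_x) ** 2 for v in band) / n)
print(n, round(mean_x, 2), round(std_x, 2))   # mean ≈ 123, std ≈ 4
```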
 
  • #5
Unless I'm misunderstanding something right now, it seems we're miscommunicating.

I feel that I have some fundamental terminology confusions.

Basically,
1. Is f(x) called a (continuous) probability distribution function?

2. In the continuous case, is a probability distribution function the integral from -∞ to x of a probability density function (such that the probability distribution function is a cumulative distribution function)?

3. Is it the case that (i) f(x) = P(X = x), or is it the case that (ii) f(x) = P(X ≤ x) = P(X < x)? For what it's worth, I'm thinking that (ii) is correct.

4. Let's say I looked at a table of values for a normal distribution function, and I took some value of Z, z, and then used that to find X by using Z = (X - μ)/σ to get X = x = z * σ + μ, would computing f(x) = f(z * σ + μ) give me P(X ≤ x) = P(X < x)?

5. So, finding the marginal distribution function of X just means I should ignore all the Y-related stuff like σ_Y, whereas a conditional probability distribution of X is similar in the sense that it also requires the univariate normal probability distribution function instead of the bivariate one, except that I take the Y-related stuff into account by finding a single μ and σ given by the ##μ_{X|Y=y} = μ_X + ρ (σ_X/σ_Y)(y–μ_Y)## and ##σ^2_{X|Y=y} = σ^2_X(1–ρ^2)## formulas, right?
 
  • #6
s3a said:
is a probability distribution function the integral from -∞ to x of a probability density function (such that the probability distribution function is a cumulative distribution function)?
AFAIK, "probability distribution function" is not a defined thing. You can have a probability density function (PDF), and a corresponding cumulative distribution function (CDF). In the case that the first exists, it is the derivative of the second.
s3a said:
Is it the case that f(x) = P(X = x), or is it the case that [ii] f(x) = P(X ≤ x) = P(X < x)?
f(x) (lower case) is usually reserved for the density function, and F(x) for the CDF. F(x) = P(X ≤ x).
s3a said:
So, finding the marginal distribution function of X just means I should ignore all the Y-related stuff
Not ignore, exactly. It means you need to integrate over y (not over x).
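"Integrate over y" can be made concrete: at any fixed x, integrating the bivariate density over y reproduces the univariate N(120, 25) density at that x. A stdlib-only Python sketch (trapezoid rule; the grid limits and step are arbitrary choices):

```python
import math

mu_x, mu_y = 120.0, 100.0
sigma_x, sigma_y, rho = 5.0, 2.0, 0.6

def f_xy(x, y):
    """Bivariate normal density with the problem's parameters."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    q = (zx * zx - 2 * rho * zx * zy + zy * zy) / (2 * (1 - rho * rho))
    return math.exp(-q) / (2 * math.pi * sigma_x * sigma_y * math.sqrt(1 - rho * rho))

def f_x(x):
    """Univariate N(120, 25) density."""
    return math.exp(-((x - mu_x) ** 2) / (2 * sigma_x**2)) / (sigma_x * math.sqrt(2 * math.pi))

# Integrate the joint density over y at a fixed x (trapezoid rule, y = 80..120).
x0 = 116.0
h = 0.01
ys = [80.0 + h * i for i in range(4001)]
marginal_at_x0 = sum(h * 0.5 * (f_xy(x0, a) + f_xy(x0, b)) for a, b in zip(ys, ys[1:]))

print(marginal_at_x0, f_x(x0))   # the two values agree
```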
 
  • #7
Would the full answers to parts b and d involve an integral? (I say "full answers", because the answers given seem like sloppy shortcuts for those who already have a good understanding of the material.)
 
  • #8
s3a said:
Unless I'm misunderstanding something right now, it seems we're miscommunicating.

I feel that I have some fundamental terminology confusions.

Basically,
1. Is f(x) called a (continuous) probability distribution function?

2. In the continuous case, is a probability distribution function the integral from -∞ to x of a probability density function (such that the probability distribution function is a cumulative distribution function)?

3. Is it the case that (i) f(x) = P(X = x), or is it the case that (ii) f(x) = P(X ≤ x) = P(X < x)? For what it's worth, I'm thinking that (ii) is correct.

4. Let's say I looked at a table of values for a normal distribution function, and I took some value of Z, z, and then used that to find X by using Z = (X - μ)/σ to get X = x = z * σ + μ, would computing f(x) = f(z * σ + μ) give me P(X ≤ x) = P(X < x)?

5. So, finding the marginal distribution function of X just means I should ignore all the Y-related stuff like σ_Y, whereas a conditional probability distribution of X is similar in the sense that it also requires the univariate normal probability distribution function instead of the bivariate one, except that I take the Y-related stuff into account by finding a single μ and σ given by the ##μ_{X|Y=y} = μ_X + ρ (σ_X/σ_Y)(y–μ_Y)## and ##σ^2_{X|Y=y} = σ^2_X(1–ρ^2)## formulas, right?

For a continuous random variable ##X##, ##P(X=x) = 0## for all ##x##. However, there is a probability density ##f(x)## such that for small ##\Delta x > 0## we have
[tex] P(x < X < x + \Delta x) = f(x) \, \Delta x + O((\Delta x)^2),[/tex]
so ##P(x < X < x + \Delta x) \doteq f(x) \Delta x##, accurate to first order in the small quantity ##\Delta x##. The cumulative distribution function (nowadays sometimes called just the distribution function, dropping the word "cumulative") is ##F(x) = P( X \leq x)##. (However, some writers use ##X < x## instead; the difference is immaterial for purely continuous ##X##, but the distinction is important when the random variable is discrete, or mixed, partly continuous and partly discrete.)

(2): YES, for continuous ##X##

(3): Everything you write there is false for a continuous random variable (that is, if you have already used the symbol ##f(x)## to denote the density). It is customary to use a lower-case letter for the density and the corresponding upper case letter for the (cumulative) distribution. This is convention, not law, so you are allowed to violate it provided that you explain your notation first.

(4) No. If ##f(x)## means the density, then ##P(X \leq x) = \int_{-\infty}^x f(t) \, dt## is certainly not equal to ##f(x)## or anything like it. If ##\phi (z)## denotes density of the standard normal random variable ##Z \sim N(0,1)##---that is, ##\phi(z) = \exp (-z^2/2)/\sqrt{2 \pi}## --- then for ##X \sim N(\mu,\sigma)## we have ##X = \mu + \sigma Z##, ##f(x) = \frac{1}{\sigma} \phi((x - \mu)/\sigma)## and ##F(x) = \Phi((x-\mu)/\sigma)##, where ##\Phi(z) = P(Z \leq z)##. The function ##\Phi## is tabulated and is also available in many hand-held calculators and in spreadsheets, etc.

(5) Sort of right. The marginal distribution of ##X## integrates over all the other random variables that accompany ##X## in a multivariate distribution. In the case of a multivariate normal with parameters ##\mu_x, \mu_y, \sigma_x^2, \sigma_y^2, \sigma_{xy}##, the integration can be done explicitly, and when you do that you end up with the marginal density
[tex] f_X(x) = \frac{1}{\sigma_x \sqrt{2 \pi}} \exp\left( - \frac{(x - \mu_x)^2}{2 \sigma_x^2} \right) [/tex]
One way to say it is that you ignore ##\mu_y, \sigma_y^2, \sigma_{xy}##, i.e. anything with a "y" in it. However, the fact that you can ignore those things is a provably correct result, not just intuition.

The case of ##f(x|y)## is different: you must have some of the y-related quantities in the formula, and the correct expressions are the ones you wrote. It is important to remember that in this case "y" acts like a constant parameter in the distribution of ##X##, while ##x## can vary over all ##\mathbb{R}##.
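The density/CDF relation in (4) can also be verified numerically: a central-difference derivative of ##F(x) = \Phi((x-\mu)/\sigma)## matches ##f(x) = \phi((x-\mu)/\sigma)/\sigma##, while ##F(x)## itself is a very different number. A stdlib-only Python sketch using the conditional parameters from part d):

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 123.0, 4.0   # conditional parameters of X given Y = 102
x = 116.0

F = Phi((x - mu) / sigma)            # P(X <= 116 | Y = 102) ≈ 0.0401
f = phi((x - mu) / sigma) / sigma    # conditional density at x = 116

# Central difference: dF/dx should reproduce the density f(x).
h = 1e-6
dF = (Phi((x + h - mu) / sigma) - Phi((x - h - mu) / sigma)) / (2.0 * h)

print(F, f, dF)   # dF ≈ f, while F is a very different number
```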
 
  • #9
Bro, where did you find the solutions manual for this 6th edition (or whichever edition you have)? Please share.
 

FAQ: Prob with Univariate&Bivariate&Marginal Normal Distributions

1. What is the difference between univariate, bivariate, and marginal normal distributions?

A univariate normal distribution is the probability distribution of a single variable, while a bivariate normal distribution is the joint probability distribution of two variables. A marginal normal distribution is the distribution of one variable on its own, obtained from a multivariate normal distribution by integrating out the other variables.

2. How is the mean and standard deviation calculated for univariate, bivariate, and marginal normal distributions?

A univariate normal distribution is specified by its mean μ and standard deviation σ. A bivariate normal distribution is specified by the mean vector (μx, μy), the standard deviations σx and σy, and the correlation ρ (equivalently, by the mean vector and the covariance matrix). More generally, a multivariate normal distribution is specified by a mean vector (μ1, μ2, ..., μn) and a covariance matrix.

3. What is the significance of the normal distribution in statistics?

The normal distribution is one of the most commonly used probability distributions in statistics due to its many useful properties. It is a symmetrical distribution with a bell-shaped curve, and many natural phenomena in the world tend to follow this distribution. It is also important because it allows for the use of various statistical tests and methods, such as the t-test and ANOVA.

4. How do you identify if a dataset follows a normal distribution?

There are several methods to check whether a dataset is consistent with a normal distribution. One way is to inspect the data visually using a histogram or a normal probability plot. Another is to use statistical tests such as the Shapiro-Wilk test or the Kolmogorov-Smirnov test. If the p-value of such a test is greater than 0.05, we fail to reject the hypothesis of normality (which is weaker than confirming that the data are normal).

5. Can the normal distribution be used for non-numerical data?

No, the normal distribution is only applicable to numerical data. It assumes that the data is continuous and has a symmetric bell-shaped curve. Categorical or non-numerical data does not have these properties, and therefore, the normal distribution cannot be used to analyze such data.
