Predict Z-Score for Y Given X at 30th Percentile

  • MHB
  • Thread starter dlee
In summary, with correlation ρ = 0.7 between X and Y and X at the 30th percentile (z_X ≈ -0.52), the predicted z-score for Y is ρ · z_X ≈ -0.364.
  • #1
dlee
Consider two random variables X,Y whose correlation is ρ = 0.7 (and the joint PMF is football shaped). Predict the z-score for Y if you observe that X is at the 30th percentile (assuming X ~ N(4,4)).

The solution to this problem is -0.364, but I'm not sure how to approach this answer.
 
  • #2
Re: Correlation?

If we assume a bivariate normal distribution, we "expect" the relation:
$$y(x) = \text{sgn}(\rho) \frac {\sigma_Y}{\sigma_X} (x - \mu_X) + \mu_Y$$

With X at the 30th percentile, that means $z_X = \frac{x - \mu_X}{\sigma_X} = \text{invNorm}(0.30) = -0.524$.

In other words, the z-score for Y is
$$z_Y = \frac{y - \mu_Y}{\sigma_Y} = \text{sgn}(\rho) z_X = -0.524$$

I don't know how they got to -0.364.
 
  • #3
Re: Correlation?

I like Serena said:
If we assume a bivariate normal distribution, we "expect" the relation:
$$y(x) = \text{sgn}(\rho) \frac {\sigma_Y}{\sigma_X} (x - \mu_X) + \mu_Y$$

With X at the 30th percentile, that means $z_X = \frac{x - \mu_X}{\sigma_X} = \text{invNorm}(0.30) = -0.524$.

In other words, the z-score for Y is
$$z_Y = \frac{y - \mu_Y}{\sigma_Y} = \text{sgn}(\rho) z_X = -0.524$$

I don't know how they got to -0.364.

That can't be right.

You can, without loss of generality, assume \(\displaystyle \mu_X = \mu_Y = 0\), so we have a model:

$$y=\alpha x + \varepsilon$$

where $\varepsilon$ is zero-mean noise uncorrelated with $x$. Then $\displaystyle E(XY)=\alpha\; \sigma_X^2$, and $\rho=E(XY)/(\sigma_X \sigma_Y)=\alpha\; \sigma_X/\sigma_Y$

Hence: $$\alpha=\rho \frac{\sigma_Y}{\sigma_X}$$...
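
As a hedged aside, one way to reach this slope is to treat $\alpha$ as the least-squares coefficient for predicting $Y$ from $X$, with zero means assumed. The least-squares framing is an assumption made here for illustration and is not stated in the thread:

$$
\begin{aligned}
\frac{d}{d\alpha}\, E\big[(Y-\alpha X)^2\big] &= -2\,E(XY) + 2\,\alpha\, E(X^2) = 0 \\
\alpha &= \frac{E(XY)}{E(X^2)} = \frac{\rho\,\sigma_X\sigma_Y}{\sigma_X^2} = \rho\,\frac{\sigma_Y}{\sigma_X}
\end{aligned}
$$

Dividing the predicted value $\alpha x$ by $\sigma_Y$ then gives $z_Y = \rho\, z_X$ in standardized units.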

 
  • #4
Re: Correlation?

zzephod said:
That can't be right.

You can without loss of generality assume \(\displaystyle \mu_X = \mu_Y = 0\)

I didn't.
The problem asks for a z-score, meaning $\mu_X$ and $\mu_Y$ get eliminated (see my derivation).

so we have a model:

$$y=\alpha x + \varepsilon$$

where $\varepsilon$ is zero-mean noise uncorrelated with $x$. Then $\displaystyle E(XY)=\alpha\; \sigma_X^2$, and $\rho=E(XY)/(\sigma_X \sigma_Y)=\alpha\; \sigma_X/\sigma_Y$

Hence: $$\alpha=\rho \frac{\sigma_Y}{\sigma_X}$$...

Well... multiplying by 0.7 almost gives the requested result.
But that won't be right.
 
  • #5
Re: Correlation?

I like Serena said:
... Well... multiplying by 0.7 almost gives the requested result.
But that won't be right.

It will be if you use "nearest value" in inverse normal lookup in a table.
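
The "nearest value" remark can be checked with a short sketch. It compares the exact inverse normal with an emulated printed z-table; the 0.01 table step and the variable names are assumptions made here for illustration, not details given in the thread.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
rho = 0.7
p = 0.30                   # X is observed at the 30th percentile

# Exact inverse CDF, i.e. what a calculator's invNorm(0.30) returns.
z_exact = std_normal.inv_cdf(p)

# Emulate a printed z-table (assumed step of 0.01): choose the tabulated z
# whose cumulative probability is nearest to p.
grid = [k / 100 for k in range(-399, 400)]
z_table = min(grid, key=lambda z: abs(std_normal.cdf(z) - p))

print(f"exact: z_X = {z_exact:.4f}, rho * z_X = {rho * z_exact:.4f}")
print(f"table: z_X = {z_table:.2f}, rho * z_X = {rho * z_table:.4f}")
```

Under these assumptions the table lookup lands on z_X = -0.52, so 0.7 × (-0.52) = -0.364, while the exact inverse gives about -0.367.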

  • #6
Re: Correlation?

I like Serena said:
I didn't.
The problem asks for a z-score, meaning $\mu_X$ and $\mu_Y$ get eliminated (see my derivation).

Well, since you failed to set up a model with the correct correlation, it is relevant to point out an observation that simplifies setting the correlation without changing the answer.

  • #7
Re: Correlation?

zzephod said:
Well, since you failed to set up a model with the correct correlation, it is relevant to point out an observation that simplifies setting the correlation without changing the answer.


The model is a positive sloped football that could be anywhere.
The problem puts the heart at x=4 with a variance of 4.
The y coordinate of the heart and the slope can still be freely chosen.
Then, with the given correlation, the "width" of the football becomes fixed.

Either way, when talking about the z-score of y, all these choices become moot, since they are standardized.
The relationship between $E(z_Y|z_X)$ and $z_X$ is simply $E(z_Y|z_X) = z_X$, whichever model you pick.
This is a "standardized" football that is aligned on the line y=x with a width such that the correlation is satisfied.
 
  • #8
Re: Correlation?

I like Serena said:
The model is a positive sloped football that could be anywhere.
The problem puts the heart at x=4 with a variance of 4.
The y coordinate of the heart and the slope can still be freely chosen.
Then, with the given correlation, the "width" of the football becomes fixed.

Either way, when talking about the z-score of y, all these choices become moot, since they are standardized.
The relationship between $E(z_Y|z_X)$ and $z_X$ is simply $E(z_Y|z_X) = z_X$, whichever model you pick.
This is a "standardized" football that is aligned on the line y=x with a width such that the correlation is satisfied.

For bivariate normal random variables $X,\ Y$ (taking $\mu_X = \mu_Y = 0$):

$$E(Y|X)=\rho\; \frac{\sigma_Y}{\sigma_X}X$$

So, as $z_X,\ z_Y$ have the same correlation coefficient as $X$ and $Y$, we have:

$$E(z_Y|z_X) = \rho\; z_X$$

See: http://athenasc.com/Bivariate-Normal.pdf.

... And simulation confirms this.
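
The simulation itself is not shown in the thread, so what follows is only a sketch of how such a check might look. The symmetric construction from two independent normals, the location and scale chosen for Y, the sample size, and the bin width are all assumptions made here for illustration.

```python
import math
import random

random.seed(0)   # fixed seed so the run is repeatable
rho = 0.7
n = 200_000

# Symmetric construction from independent standard normals U, V:
# X0 = a*U + b*V, Y0 = b*U + a*V with a^2 + b^2 = 1 and 2ab = rho,
# so both have unit variance and their correlation is rho.
theta = 0.5 * math.asin(rho)
a, b = math.cos(theta), math.sin(theta)

# X ~ N(4, 4) read as variance 4; Y's mean and sd are hypothetical.
mu_x, sd_x = 4.0, 2.0
mu_y, sd_y = 10.0, 3.0

xs, ys = [], []
for _ in range(n):
    u, v = random.gauss(0, 1), random.gauss(0, 1)
    xs.append(mu_x + sd_x * (a * u + b * v))
    ys.append(mu_y + sd_y * (b * u + a * v))

def zscores(values):
    """Standardize with the sample mean and standard deviation."""
    m = sum(values) / len(values)
    s = math.sqrt(sum((t - m) ** 2 for t in values) / len(values))
    return [(t - m) / s for t in values]

zx, zy = zscores(xs), zscores(ys)

# Average z_Y over the samples whose z_X lies near the 30th percentile.
z_target = -0.524
near = [q for p, q in zip(zx, zy) if abs(p - z_target) < 0.1]
cond_mean = sum(near) / len(near)

print(f"mean z_Y near z_X = {z_target}: {cond_mean:.3f}")
print(f"rho * z_X = {rho * z_target:.3f}")
```

If the relation $E(z_Y|z_X) = \rho\; z_X$ holds, the first printed value should sit close to -0.367 rather than -0.524, whatever mean and standard deviation are chosen for Y.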


FAQ: Predict Z-Score for Y Given X at 30th Percentile

What is a Z-score?

A Z-score, also known as a standard score, is a statistical measure that indicates how many standard deviations a particular value is above or below the mean of a data set. It is used to compare values from different data sets and determine their relative position.

How is the Z-score calculated?

The Z-score is calculated by subtracting the mean of the data set from the given value and then dividing by the standard deviation. The formula is (x - μ) / σ, where x is the given value, μ is the mean, and σ is the standard deviation.

What does a Z-score of 0 mean?

A Z-score of 0 means that the given value is equal to the mean of the data set. For a symmetric distribution such as the normal, this places the value at the 50th percentile, the middle point of the data set.

How is the Z-score used to predict values at a specific percentile?

The Z-score can be used to predict values at a specific percentile by finding the corresponding Z-score on a standard normal distribution table and then using that Z-score to calculate the predicted value using the formula (Z * σ) + μ. This will give the value at the desired percentile.
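
As a small illustration of the formula (Z * σ) + μ, the sketch below finds the 30th-percentile value of X from the thread's problem. It assumes N(4, 4) means mean 4 and variance 4, so σ = 2; that reading of the notation is an assumption.

```python
from statistics import NormalDist

mu, sigma = 4.0, 2.0   # assumed reading of N(4, 4): mean 4, variance 4
p = 0.30               # desired percentile, as a probability

# Step 1: z-score for the percentile on the standard normal.
z = NormalDist().inv_cdf(p)

# Step 2: convert the z-score back to the original scale.
x = mu + z * sigma

# Cross-check: ask the N(mu, sigma) distribution for the percentile directly.
x_direct = NormalDist(mu=mu, sigma=sigma).inv_cdf(p)

print(f"z = {z:.4f}, x = {x:.4f}, direct = {x_direct:.4f}")
```

With these inputs the 30th percentile of X comes out near 2.95, roughly half a standard deviation below the mean of 4.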

What is the significance of predicting a Z-score for a specific percentile?

Predicting a Z-score for a specific percentile allows for comparison and interpretation of data. It can help identify outliers and determine the relative position of a particular value within a data set. It can also be used to make predictions and identify patterns in data.
