Determine the mean square error of a simple distribution

In summary, determining the mean square error (MSE) of a simple distribution involves calculating the average of the squares of the differences between the observed values and the expected or predicted values. This is done by taking the sum of the squared deviations from the mean, dividing by the number of observations, or by applying the formula MSE = (1/n) * Σ(actual - predicted)², where n is the number of data points. MSE is a key metric for assessing the accuracy of a model or distribution, with lower values indicating better predictive performance.
  • #1
psie
Homework Statement
Consider $$f_{X,Y}(x,y)=\begin{cases} c,&\text{for } x,y\geq0,\ x+y\leq 1,\\ 0,&\text{otherwise},\end{cases}$$ where ##c## is some constant to be determined. Determine ##E(Y\mid X=x)## and ##E(X\mid Y=y)##. Moreover, determine the expected quadratic prediction error ##E(Y-d(X))^2## for the best predictor ##d(X)## of ##Y## based on ##X##.
Relevant Equations
The best predictor is the conditional expectation, i.e. ##h(X)=E(Y\mid X)##.
What troubles me about this exercise is that I don't get the answer that the book gets regarding the expected quadratic prediction error.

##c## is determined by $$1=\int_0^1\int_0^{1-x} c\,dydx=c\int_0^1(1-x)\,dx=c\left[-\frac{(1-x)^2}{2}\right]_0^1=\frac{c}2,$$so ##c=2##. The marginal density of ##X## is $$f_X(x)=\int_0^{1-x}2\,dy=2(1-x),\quad 0<x<1.$$And the conditional one is $$f_{Y\mid X=x}(y)=\frac{f_{X,Y}(x,y)}{f_X(x)}=\frac2{2(1-x)}=\frac1{1-x},\quad 0<y<1-x.$$Finally, $$E(Y\mid X=x)=\int_0^{1-x}y\cdot\frac1{1-x}\,dy=\frac1{1-x}\left[\frac{y^2}{2}\right]_0^{1-x}=\frac{(1-x)^2}{2(1-x)}=\frac{1-x}{2}.$$ By symmetry, ##E(X\mid Y=y)=\frac{1-y}{2}##.
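As a sanity check (not from the book), the constants above can be verified numerically with a short standard-library script; the grid size and the test point ##x=0.3## are arbitrary choices:

```python
# Numeric check of c and E(Y | X = x), using only the standard library:
# a midpoint-rule integral over the triangle x, y >= 0, x + y <= 1.
def triangle_integral(f, n=500):
    """Midpoint rule over the triangle x, y >= 0, x + y <= 1."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if x + y <= 1.0:  # keep only cells whose midpoint is in the triangle
                total += f(x, y)
    return total * h * h

area = triangle_integral(lambda x, y: 1.0)  # ≈ 1/2, so c = 1/area ≈ 2
print(1.0 / area)

# E(Y | X = x) at x = 0.3: integrate y/(1 - x) over (0, 1 - x) by midpoint rule.
x0, n = 0.3, 10_000
h = (1.0 - x0) / n
e_y_given_x = sum(((j + 0.5) * h) / (1.0 - x0) for j in range(n)) * h
print(e_y_given_x)  # ≈ (1 - 0.3)/2 = 0.35
```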

I am confident everything is correct up to this point, as this is actually an example in the book, done exactly the same way. But the next part is omitted in the book, i.e. determining the expected quadratic prediction error ##E(Y-E(Y\mid X))^2##, where ##E(Y\mid X)=(1-X)/2##. We can simplify as follows \begin{align*}E(Y-E(Y\mid X))^2&=E(Y-(1-X)/2)^2 \\ &=E(Y^2+(1-X)^2/4-Y(1-X)) \\ &=E\left(Y^2+\frac14-\frac{X}{2}+\frac{X^2}{4}-Y+YX\right).\end{align*} Since ##X## and ##Y## have the same marginal distribution, we can replace ##Y## with ##X## everywhere except in the last term, ##YX##. So we have $$E\left(\frac{5X^2}{4}+\frac14-\frac{3X}{2}+YX\right)=\frac54E(X^2)+\frac14-\frac32E(X)+E(YX).$$ I used WolframAlpha to compute the three expectations on the right-hand side of this last equation:

##E(X)=\int_0^1 x\cdot 2(1-x)\,dx=\frac13##
##E(X^2)=\int_0^1 x^2\cdot 2(1-x)\,dx=\frac16##
##E(XY)=\int_0^1\int_0^{1-x} 2xy\,dy\,dx=\frac1{12}##

Therefore $$E(Y-(1-X)/2)^2 =\frac54\cdot\frac16+\frac14-\frac32\cdot\frac13+\frac1{12}=\frac1{24}.$$ The book gets ##\frac1{48}##.
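To double-check which of ##\frac1{24}## and ##\frac1{48}## is right, here is a quick Monte Carlo estimate of the prediction error (rejection sampling from the unit square; the seed and sample size are arbitrary):

```python
# Estimate E[(Y - (1 - X)/2)^2] for (X, Y) uniform on the triangle
# x, y >= 0, x + y <= 1, via rejection sampling from the unit square.
import random

random.seed(0)
n = 200_000
total = 0.0
for _ in range(n):
    while True:  # reject points outside the triangle
        x, y = random.random(), random.random()
        if x + y <= 1.0:
            break
    total += (y - (1.0 - x) / 2.0) ** 2

mse = total / n
print(mse)  # ≈ 1/24 ≈ 0.0417, not 1/48 ≈ 0.0208
```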
 
  • #2
I have tried the calculation a couple of different ways, and get the same answer as you. I suspect they may have forgotten (as I almost did) to include the constant c in the calculation.
 
  • #3
Your result is correct.
The conditional distribution of ##Y## given ##X## is a continuous uniform distribution, so you can check the result using the identity $$ E((Y-E(Y\mid X))^2) = E(\operatorname{Var}(Y\mid X)) $$ together with the formula for the variance of a continuous uniform distribution.
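Spelling that check out with the results from post #1: ##Y\mid X=x## is uniform on ##(0,1-x)##, so $$\operatorname{Var}(Y\mid X=x)=\frac{(1-x)^2}{12},\qquad E\big(\operatorname{Var}(Y\mid X)\big)=\int_0^1\frac{(1-x)^2}{12}\cdot 2(1-x)\,dx=\frac16\int_0^1(1-x)^3\,dx=\frac1{24},$$ which confirms ##\frac1{24}##.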
 
  • #4
I assume the "best predictor" is some estimator. But then what would "best" mean here, since there are estimators that may, e.g., have minimal variance, be consistent, etc.?
 
  • #5
WWGD said:
I assume the "best predictor" is some estimator. But then what would "best" mean here, since there are estimators that may, e.g., have minimal variance, be consistent, etc.?
Here we are talking about the predictor of ## Y ## based on ## X ## with the lowest mean squared error among all possible estimators of ## Y ## based on ## X ##.
 

FAQ: Determine the mean square error of a simple distribution

What is mean square error (MSE)?

Mean square error (MSE) is a measure of the average squared differences between predicted values and actual values. It quantifies the accuracy of a model by indicating how close the predictions are to the true outcomes. A lower MSE indicates a better fit of the model to the data.

How do you calculate mean square error for a simple distribution?

To calculate the mean square error for a simple distribution, follow these steps: first, find the mean of the distribution. Then, for each data point, calculate the squared difference between the data point and the mean. Finally, take the average of these squared differences to obtain the MSE.
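As a minimal illustration of those steps (the data here are made up), the MSE of a sample about its own mean can be computed like this:

```python
# MSE of a small sample about its own mean (illustrative data).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean = sum(data) / len(data)                        # step 1: the mean (5.0)
mse = sum((x - mean) ** 2 for x in data) / len(data)  # steps 2-3: average squared deviation
print(mse)  # 4.0
```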

What is the difference between MSE and variance?

The main difference between mean square error (MSE) and variance is that MSE measures the average squared deviation of predicted values from actual values, while variance measures the average squared deviation of a set of values from their mean. MSE is used for assessing model performance, whereas variance describes the spread of a dataset.

Can MSE be negative?

No, mean square error (MSE) cannot be negative. Since MSE is calculated as the average of squared differences, all squared values are non-negative. Therefore, the result of the MSE calculation will always be zero or positive.

What are the implications of a high MSE value?

A high mean square error (MSE) value indicates that the predictions made by a model are far from the actual values, suggesting poor model performance. This may imply that the model is not adequately capturing the underlying patterns in the data, which could lead to inaccurate predictions and unreliable conclusions.
