Which Estimator Minimizes Expected Square Error for E[g(X,Y)]?

In summary, the conversation discusses the problem of estimating the true value of a random variable Z, which is a function of two other random variables X and Y. Two potential estimators are considered: Z=E[g(X,Y)] and Z=g(E[X],E[Y]). The question of which estimator is "better" is brought up, but it is noted that the concept of "best" needs to be defined in statistical terms. Various approaches for finding the "best" estimator are discussed, such as Maximum Likelihood Estimation and finding the highest probability density. The importance of distinguishing between the mean of a distribution, the mean of a sample, and an estimator for the mean is emphasized.
  • #1
Apteronotus
Suppose X and Y are r.v.
Suppose also that we get N samples of a r.v. Z which depends on X and Y. That is Z=g(X,Y).

Which is a better estimate of the true value of Z?

[itex]Z=E[g(X,Y)][/itex]
or
[itex]Z=g(E[X],E[Y])[/itex]
 
  • #2
Apteronotus said:
Suppose X and Y are r.v.
Suppose also that we get N samples of a r.v. Z which depends on X and Y. That is Z=g(X,Y).

Which is a better estimate of the true value of Z?

[itex]Z=E[g(X,Y)][/itex]
or
[itex]Z=g(E[X],E[Y])[/itex]

Hey Apteronotus.

You can't actually estimate Z since Z is a random variable and not a parameter: you need to be careful about using estimation in this context.

Z is a random variable, so if you wanted to get the population mean of Z then you calculate E[Z] = E[g(X,Y)].

Remember that estimation concerns estimating something that is essentially fixed, like mu, sigma, or lambda; in statistical theory we then derive, either exactly or approximately, the distribution of that particular estimator.
 
  • #3
Hi Chiro,

Thank you for your reply.

The situation is that my z is in fact fixed. Its value depends on two other variables x and y.
I have a model/function which calculates the true value of z. That is
z=g(x,y)

Now, the problem is that I have added noise to my x & y variables:
X=x+noise
Y=y+noise

Using these noisy inputs, I get a noisy output Z=g(X,Y)
Since the noise has zero mean
E[X]=x, and
E[Y]=y

I was wondering whether g(E[X],E[Y]) or E[g(X,Y)] would bring me closer to the actual value z?
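For concreteness, here is a small Monte Carlo sketch in Python of this exact setup (the particular g, the true values x and y, and the Gaussian zero-mean noise level are arbitrary illustrative assumptions, not anything fixed in the thread):

[code]
import numpy as np

# Hypothetical setup: fixed true inputs and an arbitrary nonlinear g.
def g(x, y):
    return x**2 + x * y

x_true, y_true = 2.0, 3.0
z_true = g(x_true, y_true)               # the fixed quantity of interest

rng = np.random.default_rng(0)
sigma = 0.5                              # assumed noise standard deviation
N = 100_000
X = x_true + rng.normal(0.0, sigma, N)   # X = x + noise, zero-mean noise
Y = y_true + rng.normal(0.0, sigma, N)   # Y = y + noise, zero-mean noise

# Candidate 1: average g over the noisy samples (approximates E[g(X,Y)]).
cand1 = np.mean(g(X, Y))
# Candidate 2: plug the sample means into g (approximates g(E[X],E[Y])).
cand2 = g(np.mean(X), np.mean(Y))

print(f"true z             : {z_true:.4f}")
print(f"E[g(X,Y)] estimate : {cand1:.4f}")  # offset by +sigma^2 from the x^2 term
print(f"g(E[X],E[Y]) est.  : {cand2:.4f}")  # close to z once the noise averages out
[/code]

With a nonlinear g the first candidate settles on a value offset from z, while the second converges to z itself, which is the distinction the later posts pin down.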
 
  • #4
Apteronotus said:
I was wondering whether g(E[X],E[Y]) or E[g(X,Y)] would bring me closer to the actual value z?

I think that if you manage to state your question precisely, the answer will be E[g(X,Y)], but you haven't defined the meaning of "best" in your original post, and to say a random result is "closer" to something has no specific meaning. A random variable has no deterministic "closeness" to anything unless the "closeness" is defined in statistical terms, and there are different ways of doing that.

Perhaps you want to minimize the expected value of the square of the difference between an estimator of the mean value of g(x,y) and the actual mean value of g(x,y).

In estimation theory, there are "least squares" estimators, "maximum likelihood" estimators, "minimum variance" estimators, etc. Each is "best" according to a different criterion.
 
  • #5
Stephen. Thank you for taking the time.

I'm hoping that the meaning of "best" becomes apparent from my second post.

I guess I would define it as follows:
Which quantity is smaller

[itex]
\left\{E[g(X,Y)]-g(x,y)\right\}^2
[/itex]
or

[itex]
\left\{g(E[X],E[Y])-g(x,y)\right\}^2
[/itex]

where
[itex]
X=x+noise \qquad \mbox{and} \qquad Y=y+noise
[/itex]
 
  • #6
Clearly the second quantity [itex]\left\{g(E[X],E[Y])-g(x,y)\right\}^2=0[/itex], as

[itex]E[X]=E[x+noise]=x \qquad \mbox{and}\qquad E[Y]=E[y+noise]=y[/itex]

Since the first quantity, [itex]\left\{E[g(X,Y)]-g(x,y)\right\}^2\ge0[/itex], can only equal or exceed that, I guess to answer my question, the second one must be "Better".
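To see why the first quantity is strictly positive for a nonlinear g, here is a one-line worked example (added for concreteness; the quadratic g is an arbitrary choice). Take [itex]g(x)=x^2[/itex] and [itex]X = x + \epsilon[/itex] with [itex]E[\epsilon]=0[/itex] and [itex]Var(\epsilon)=\sigma^2[/itex]. Then

[itex]E[g(X)] = E[(x+\epsilon)^2] = x^2 + \sigma^2, \qquad g(E[X]) = x^2,[/itex]

so the first squared difference is [itex]\sigma^4 > 0[/itex] while the second is exactly zero.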
 
  • #7
Have you tried estimation schemes that take the value of Z at which its probability density is maximal (the way Maximum Likelihood Estimation finds point estimates for parameters)?

Another technique is to find a highest probability density (HPD) region: for a given probability value, take the region of Z values whose total probability equals that value and whose size is as small as possible. For example, if p = 0.1, the HPD corresponds to the region that captures probability 0.1 while the region itself is minimized (equivalently, where the density is highest); a sample-based sketch of this idea follows below.

So the above are two approaches: one is a point-estimate approach and the other is an interval/region approach.
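For a one-dimensional, unimodal Z, the HPD region reduces to the shortest interval holding the required probability mass. Below is a minimal Python sketch of that sample-based version (the gamma-distributed samples are just an arbitrary stand-in for draws of Z):

[code]
import numpy as np

def hpd_interval(samples, p=0.9):
    """Shortest interval containing a fraction p of the samples.

    For a unimodal distribution this approximates the HPD region:
    probability mass p, with the size of the region minimized.
    """
    s = np.sort(np.asarray(samples))
    n = len(s)
    k = max(int(np.ceil(p * n)), 2)      # number of points the interval must cover
    widths = s[k - 1:] - s[:n - k + 1]   # width of every candidate interval
    i = int(np.argmin(widths))           # the shortest candidate wins
    return s[i], s[i + k - 1]

# Illustration with a skewed distribution of Z values.
rng = np.random.default_rng(1)
z_samples = rng.gamma(shape=2.0, scale=1.0, size=10_000)
lo, hi = hpd_interval(z_samples, p=0.9)
print(f"90% HPD interval: [{lo:.3f}, {hi:.3f}]")
[/code]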
 
  • #8
Apteronotus said:
I guess I would define it as follows:
Which quantity is smaller

[itex]
\left\{E[g(X,Y)]-g(x,y)\right\}^2
[/itex]
or

[itex]
\left\{g(E[X],E[Y])-g(x,y)\right\}^2
[/itex]

You can only answer those questions if you know E(X), E(Y) and E(g(X,Y)). If you already know those quantities, what statistical problem are you trying to solve?

If lower case "x" and "y" denote random variables in those expressions, the expressions themselves take on random values, so you can't claim anything about which one of them is smaller.

In your next post, you seem to say E[X] = x. That would imply x is a constant. So what is "x"? Is it a constant or is it a random variable?

A typical scenario in statistics would be that we are trying to estimate E[g(X,Y)] from a sample. We define some function W of the sample data; this function is an "estimator". I think you want to ask which is the "best" estimator for E[g(X,Y)]. Is it [itex]W_1 = \frac{1}{N}\sum_{i=1}^{N} g(x_i,y_i)[/itex], the mean value of g taken over all data points [itex](x_i,y_i)[/itex]? Or is it [itex]W_2 = g(\bar{x},\bar{y})[/itex], where [itex]\bar{x}[/itex] and [itex]\bar{y}[/itex] are the means of the samples [itex]x_1,x_2,\dots[/itex] and [itex]y_1,y_2,\dots[/itex] respectively?

One way to define a "best" estimator [itex]W(x_1,\dots,x_N;\, y_1,\dots,y_N)[/itex] is to say that it minimizes the expected square error.

I.e. it minimizes [itex]E\left[\left(W(x_1,\dots,x_N;\, y_1,\dots,y_N) - E[g(X,Y)]\right)^2\right][/itex]

Note that you have to have two expectations in this expression. If you leave off the one on the left, the expression is a random quantity which varies with the data.

I think the language you are using in your thoughts is failing to distinguish among the following different concepts.

1) The mean of a distribution
2) The mean of sample that is drawn from that distribution
3) An estimator for the mean of the distribution

Similar distinctions hold for other statistical quantities, such as the standard deviation, the variance, the mode etc.
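A short simulation makes the [itex]W_1[/itex] versus [itex]W_2[/itex] comparison concrete. The Python sketch below (an illustration under assumed choices, a quadratic g and normal data, not anything prescribed in the thread) estimates the expected square error of each estimator of E[g(X,Y)] by repeating the experiment many times:

[code]
import numpy as np

# Assumed concrete model, purely for illustration.
def g(x, y):
    return x**2 + y

rng = np.random.default_rng(2)
mu_x, mu_y, sigma = 1.0, 2.0, 1.0
N = 50                        # sample size in each experiment
trials = 20_000               # repeated experiments to estimate the MSE

# For X ~ N(mu_x, sigma^2):  E[g(X,Y)] = E[X^2] + E[Y] = mu_x^2 + sigma^2 + mu_y
target = mu_x**2 + sigma**2 + mu_y

X = rng.normal(mu_x, sigma, (trials, N))
Y = rng.normal(mu_y, sigma, (trials, N))

W1 = g(X, Y).mean(axis=1)                # mean of g over the data points
W2 = g(X.mean(axis=1), Y.mean(axis=1))   # g of the sample means

print(f"MSE of W1: {np.mean((W1 - target) ** 2):.5f}")
print(f"MSE of W2: {np.mean((W2 - target) ** 2):.5f}")
[/code]

With a nonlinear g, [itex]W_1[/itex] is unbiased for E[g(X,Y)] while [itex]W_2[/itex] converges to g(E[X],E[Y]) instead, so [itex]W_1[/itex] typically wins under this criterion; for a linear g the two estimators coincide.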
 

FAQ: Which Estimator Minimizes Expected Square Error for E[g(X,Y)]?

1. What is the meaning of "g(E[X],E[Y])"?

"g(E[X],E[Y])" denotes the function g evaluated at the expected values (means) of the random variables X and Y. In simpler terms, you first average X and Y separately, then plug those two averages into g; the result is a single fixed number, not an average of the function itself.

2. How is "g(E[X],E[Y])" calculated?

To calculate "g(E[X],E[Y])", first determine the expected values of X and Y. Then substitute these values into the function g; the resulting value is the value of g at the means.

3. Can "g(E[X],E[Y])" be negative?

Yes, "g(E[X],E[Y])" can be negative. It is simply the value of g at the point (E[X],E[Y]), so it is negative whenever g takes a negative value there.

4. What is the significance of "E[g(X,Y)]"?

"E[g(X,Y)]" is the expected value of the random variable g(X,Y): the average of g over the joint distribution of X and Y. In general this is a different number from "g(E[X],E[Y])".

5. How is "E[g(X,Y)]" different from "g(E[X],E[Y])"?

"E[g(X,Y)]" averages the function over all outcomes of X and Y, while "g(E[X],E[Y])" evaluates the function once, at the average outcome. The two coincide when g is linear; in general they differ, and by Jensen's inequality E[g(X)] ≥ g(E[X]) whenever g is convex. Additionally, "E[g(X,Y)]" may fail to exist for heavy-tailed distributions, while "g(E[X],E[Y])" is defined whenever E[X] and E[Y] exist and g is defined at that point.
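A quick numeric check of this distinction (the correlated-normal model below is an arbitrary illustrative choice):

[code]
import numpy as np

# g(x, y) = x * y with correlated X and Y: the two quantities differ.
rng = np.random.default_rng(3)
X = rng.normal(0.0, 1.0, 1_000_000)
Y = X + rng.normal(0.0, 1.0, 1_000_000)   # Y depends on X, so E[XY] = 1

print(np.mean(X * Y))             # approximates E[g(X,Y)] = 1
print(np.mean(X) * np.mean(Y))    # approximates g(E[X],E[Y]) = 0 * 0 = 0
[/code]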
