I'm trying to figure out what is going on with the proof of the following lemma as presented by my prof. There seem to be some notational inconsistencies. I also found myself questioning my understanding of integration variables:
Lemma
Suppose X is a continuous random variable with probability density function f, where f(x) = 0 whenever x < 0. Then:
[tex] E[X] = \int_0^{\infty}{P(X>x)dx} [/tex]
Proof
[tex] \int_0^{\infty} P(X>x)\,dx = \int_0^{\infty}\left[\int_x^{\infty} f(y)\,dy\right]dx [/tex]
So he substituted an expression for the probability (the integrand) which is *itself* an integral of the density function. Now, before I continue, I'd like to make sure I understand the introduction of the dummy variable y. Basically, we want the probability in question, but x is unspecified. The lower limit of integration is a variable, so the resulting probability will be expressed as a function of x, which is what we want. The question arises: what are we integrating the density function with respect to? As I understand it, we are STILL integrating the density function with respect to the values (between x and infinity) that X can take on. So the independent variable represents the same quantity in principle. We are just denoting it by 'y' to distinguish it from the lower limit of those values, x. P is only a function of that lower limit, not of all of the values over which we integrated. They are gone because we integrated over them. Am I right?
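A minimal numerical sketch of the dummy-variable point, assuming (purely for illustration, it is not part of the lemma) an exponential density with rate `LAM = 2`. The names `f`, `tail_probability`, `upper` and `steps` are hypothetical. The loop variable `y` is summed over and disappears; the result depends only on the lower limit `x`.

```python
import math

LAM = 2.0  # assumed rate of an illustrative exponential density


def f(t):
    """Assumed density: LAM * exp(-LAM * t) for t >= 0, and 0 for t < 0."""
    return LAM * math.exp(-LAM * t) if t >= 0 else 0.0


def tail_probability(x, upper=40.0, steps=100_000):
    """Midpoint-rule approximation of the integral of f from x to 'upper'.

    'upper' stands in for infinity. The loop variable y is the dummy
    variable: it is summed over, so the returned value depends on x alone.
    """
    width = (upper - x) / steps
    total = 0.0
    for i in range(steps):
        y = x + (i + 0.5) * width
        total += f(y)
    return total * width


# Compare with the closed form P(X > x) = exp(-LAM * x) for this density.
for x in (0.0, 0.5, 1.0):
    print(x, tail_probability(x), math.exp(-LAM * x))
```

Renaming `y` to anything else inside `tail_probability` changes nothing about what it returns, which is the sense in which y is a dummy variable.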
Next, the prof noted that rather than integrating the density fcn. from x to infinity, we could integrate across the whole domain of the density function, provided we introduced something to sift out the undesirable values of f(y) for which y <= x. So he introduced an indicator variable:
[tex] \text{Let} \ \ I_{y>x} = \left\{\begin{array}{ll}1 & \text{if} \ \ y > x \\ 0 & \text{if} \ \ y \leq x \end{array}\right. [/tex]
With this indicator variable, the preceding integral becomes:
[tex] \int_0^{\infty}\int_0^{\infty} I_{y>x}\, f(y)\,dy\,dx = \int_0^{\infty} dy\, f(y) \int_0^{\infty} dx\, I_{y>x} [/tex]
I was going to ask a question about this step, but based on my previous thoughts, I think I now understand the separation of the two integrals w.r.t. x and y. y is independent of x. f(y) has some value, regardless of the value of x, your "marker point" for the integration of the density fcn.
That rightmost integral becomes 'y', because for a fixed y the integrand is nonzero only when x is less than y, so we have the integral from zero to y of dx, which becomes [y-0] = y. Substituting, the integral becomes:
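The claim that the inner integral collapses to y can be checked numerically. This is only a sketch: `indicator` and `inner_integral` are hypothetical names, and the finite `upper` stands in for infinity.

```python
def indicator(y, x):
    """I_{y>x}: 1 if y > x, otherwise 0."""
    return 1.0 if y > x else 0.0


def inner_integral(y, upper=10.0, steps=100_000):
    """Midpoint-rule approximation of the integral over x in [0, upper] of I_{y>x}.

    Any 'upper' larger than y gives the same answer up to the grid
    resolution, because the integrand is 0 once x >= y.
    """
    width = upper / steps
    return sum(indicator(y, (i + 0.5) * width) for i in range(steps)) * width


# Each result should be close to y itself.
for y in (0.25, 1.0, 3.7):
    print(y, inner_integral(y))
```

Raising `upper` (for example `inner_integral(1.0, upper=50.0)`) should leave the value essentially unchanged, which mirrors the step of replacing the upper limit ∞ by y.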
[tex] \int_0^{\infty} {yf(y)dy} = E[Y] [/tex]
[tex] \text{Q.E.D} [/tex]
What's up with that last line!? There is NO random variable called 'Y' anywhere in this lemma. We were trying to prove that it was E[X]! Furthermore, I have no problem believing that the integral in the last line *IS* E[X]. After all, f is the probability density function of X. It doesn't matter what symbol we use: y still represents the possible values that X can take on, i.e. y is an argument of f. I think it should say E[X], and that the prof just absentmindedly put E[Y] when he got to the end of his proof, because he saw "integral of f of y dy" on the board. However, I would like that, and all of my other inferences, confirmed independently.
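As a sanity check rather than a proof, both sides of the lemma can be compared numerically for one concrete case. The sketch below assumes an exponential density with rate `LAM = 2` (an illustrative choice, not part of the lemma), for which the mean of X is known to be 1/LAM. The helper names `survival` and `midpoint` are hypothetical.

```python
import math

LAM = 2.0  # assumed rate; the mean of this exponential density is 1 / LAM


def f(y):
    """Assumed density of X: LAM * exp(-LAM * y) for y >= 0, and 0 otherwise."""
    return LAM * math.exp(-LAM * y) if y >= 0 else 0.0


def survival(x):
    """P(X > x) in closed form for the assumed exponential density."""
    return math.exp(-LAM * x) if x >= 0 else 1.0


def midpoint(g, lower, upper, steps=200_000):
    """Midpoint-rule approximation of the integral of g over [lower, upper]."""
    width = (upper - lower) / steps
    return sum(g(lower + (i + 0.5) * width) for i in range(steps)) * width


# 40.0 stands in for infinity; the tail beyond it is negligible here.
lhs = midpoint(survival, 0.0, 40.0)            # integral of P(X > x) dx
rhs = midpoint(lambda y: y * f(y), 0.0, 40.0)  # integral of y f(y) dy
print(lhs, rhs, 1 / LAM)
```

In this case both integrals should agree with 1/LAM, the mean of X, even though the second one is written with the symbol y. That is consistent with reading the last line of the proof as E[X].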
Thanks.