Help with proof of sample variance

In summary, the conversation discusses the formulation of sample variance and its relationship to the population variance. The speaker questions why the population variance is used instead of the sample variance when calculating the expected value of the sample variance. However, they later realize that y_i represents a random draw from the population, making its variance the population variance.
  • #1
fadecomic
Hi,

I'm trying to prove to myself that the sample variance [tex]s^2=\frac{1}{n-1}\sum_{i=1}^{n}(y_i - \mu_y)^2[/tex], where [tex]\mu_y[/tex] is the sample mean, is an unbiased estimator of the population variance [tex]\sigma^2[/tex]. I proceed by taking the expected value of the sample variance, which goes smoothly until I get to [tex]\frac{n-1}{n}E[s^2] = E[y_i^2]-E[\mu_y^2][/tex]. Fine. Now, from the definition of variance: [tex]V[y_i]=E[y_i^2]-(E[y_i])^2[/tex]. Most of the sources I've checked insert the population variance here for [tex]V[y_i][/tex] (which is what you get if you substitute the above). That makes absolutely no sense to me. Shouldn't this be the sample variance, since [tex]y_i[/tex] is the sample? Calling it the population variance seems circular to me. Can someone explain why this variance is the population variance and not the sample variance?

Thanks.
 
  • #2
Oh, never mind. I figured it out. Above, [tex]y_i[/tex] is a random draw from the population, not the sample, so by definition, its variance is the population variance, not the sample variance.
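For anyone following along, here is a sketch of how the remaining steps could go once that point is accepted. It is not from the original posts. It assumes the draws [tex]y_i[/tex] are independent with a common population mean, written here as [tex]\mu[/tex] (a symbol introduced only for this sketch), and it keeps [tex]\mu_y[/tex] for the sample mean as in post #1. Independence is what gives [tex]V[\mu_y]=\sigma^2/n[/tex].

```latex
\begin{align*}
E[y_i^2] &= V[y_i] + (E[y_i])^2 = \sigma^2 + \mu^2, \\
E[\mu_y^2] &= V[\mu_y] + (E[\mu_y])^2 = \frac{\sigma^2}{n} + \mu^2, \\
\frac{n-1}{n}E[s^2] &= E[y_i^2] - E[\mu_y^2]
  = \sigma^2 - \frac{\sigma^2}{n} = \frac{n-1}{n}\sigma^2, \\
E[s^2] &= \sigma^2 .
\end{align*}
```

The [tex]\mu^2[/tex] terms cancel, and the leftover [tex]\sigma^2/n[/tex] is exactly what the [tex]n-1[/tex] divisor compensates for.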
 

FAQ: Help with proof of sample variance

What is sample variance?

Sample variance is a measure of the spread or variability of a set of data points in a sample. It is calculated by taking the sum of the squared differences between each data point and the sample mean, and then dividing by the number of data points minus one.

Why is sample variance important?

Sample variance is important because it gives us an idea of how spread out the data points are in a sample. This can help us understand the overall distribution of the data and make inferences about the population from which the sample was taken.

How is sample variance different from population variance?

Sample variance is calculated using only a subset of data points from a larger population. Population variance, on the other hand, is calculated using all of the data points in a population. Sample variance is an estimate of the population variance.
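One way to see the "estimate" relationship numerically is a small simulation. The sketch below is only illustrative: the function name `mean_of_variance_estimates`, the Gaussian population, and the parameter values are all assumptions chosen for the example, not something taken from the thread. It draws many small samples from a population with known variance and averages two estimators, one dividing by n - 1 and one dividing by n.

```python
import random


def mean_of_variance_estimates(n, trials, sigma, seed=0):
    """Average the (n - 1)-divisor and n-divisor variance estimates
    over many samples drawn from a Gaussian population (hypothetical setup)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_unbiased = 0.0
    total_biased = 0.0
    for _ in range(trials):
        # Each y is a random draw from the population, so its variance is sigma**2.
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        sum_sq = sum((y - sample_mean) ** 2 for y in sample)
        total_unbiased += sum_sq / (n - 1)  # sample variance s^2
        total_biased += sum_sq / n          # divisor n, biased low
    return total_unbiased / trials, total_biased / trials


# Population variance is sigma**2 = 4.0 in this example.
unbiased_avg, biased_avg = mean_of_variance_estimates(n=5, trials=20000, sigma=2.0)
print(unbiased_avg, biased_avg)
```

With these settings the first average should land near the population variance of 4, while the second should land near (n - 1)/n times that, about 3.2. The exact digits depend on the random draws.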

What is the formula for calculating sample variance?

The formula for calculating sample variance is:
[tex]s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2[/tex]
Where [tex]x_i[/tex] is each data point, [tex]\bar{x}[/tex] is the sample mean, and [tex]n[/tex] is the number of data points in the sample.
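As a quick illustration, the formula can be written out directly in Python. This is a minimal sketch: the helper name `sample_variance` and the example data are made up for the illustration, and the result is cross-checked against `statistics.variance` from the standard library, which also uses the n - 1 divisor.

```python
import statistics


def sample_variance(data):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("sample variance needs at least two data points")
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


# Hypothetical data: mean is 5, squared deviations sum to 32, so s^2 = 32/7.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s2 = sample_variance(data)
print(s2)
print(statistics.variance(data))  # standard-library value for comparison
```

Both printed values should agree up to floating-point rounding.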

How can I use sample variance in my research or experiments?

Sample variance can be used in a variety of ways in research and experiments. It can help to assess the reliability of data, compare groups or treatments, and make predictions about a larger population. It can also help flag outliers: data points lying many sample standard deviations from the mean may need further investigation.
