Estimator for variance when sampling without replacement

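In summary: For r draws without replacement from a finite population of size n, an unbiased estimator of the population variance [tex]\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2[/tex] is [tex]\frac{n-1}{n}s^2[/tex], where [tex]s^2[/tex] is the usual sample variance computed about the sample mean with divisor r - 1.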
  • #1
logarithmic
Does anyone know the formula for an unbiased estimator of the population variance [tex]\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2[/tex] when taking r samples without replacement from a finite population [tex]\{x_1, \dots, x_n\}[/tex] whose mean is [tex]\bar{x}[/tex]?

A Google search doesn't turn up anything useful other than the special case r = n, where the estimator is of course [tex]\frac{r-1}{r}s^2[/tex], with [tex]s^2 = \frac{1}{r-1}\sum_{i=1}^{r}(x_i - \bar{x})^2[/tex], which is of course the unbiased estimator when taking r samples with replacement.

I know that a (relatively) simple formula exists; I've seen it somewhere before, I just don't remember where.
 
  • #3
MaxManus said:

Not that one. That's the distribution for the number of black balls drawn without replacement from a box with black and white balls. It isn't an estimator for the population variance.
 
  • #4
logarithmic said:
Not that one. That's the distribution for the number of black balls drawn without replacement from a box with black and white balls. It isn't an estimator for the population variance.

But isn't that the distribution you described? And isn't the variance formula on the right table an estimator?
 
  • #5
MaxManus said:
But isn't that the distribution you described? And isn't the variance formula on the right table an estimator?
Not really. I'm looking for an estimator, that is a function of the samples: f(X_1, ..., X_r) which itself is a random variable, such that E(f(X_1, ..., X_r)) = true population variance.

That variance formula isn't a random variable (i.e. it can't be an estimator); it's the variance of a particular random variable, namely a count. But a hypergeometric random variable isn't appropriate for counting the samples, since the number of samples is assumed to be fixed at r.
 
  • #6
logarithmic said:
Does anyone know the formula for an unbiased estimator of the population variance [tex]\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2[/tex] when taking r samples without replacement from a finite population [tex]\{x_1, \dots, x_n\}[/tex] whose mean is [tex]\bar{x}[/tex]?

Hi logarithmic. With that definition of [itex]\bar{x}[/itex] (which is actually the definition of the population mean) then the unbiased estimator of the population variance is simply,

[tex]\frac{1}{r}\sum_{i=1}^{r}(x_i - \bar{x})^2[/tex]

However, I think you really meant for [itex]\bar{x}[/itex] to denote the sample mean of the r chosen samples rather than the population mean. In that case the unbiased estimator is,

[tex]\left(\frac{n-1}{n}\right) \, \frac {1}{r-1}\sum_{i=1}^{r}(x_i - \bar{x})^2[/tex]
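The unbiasedness claim can be checked exactly on a small example. The following is only a sketch, not part of the original post: it assumes simple random sampling (every r-subset equally likely), uses a made-up population, and the function names are hypothetical. It averages the estimator over every possible sample with exact rational arithmetic and compares the result with the population variance.

```python
from fractions import Fraction
from itertools import combinations


def population_variance(pop):
    """Population variance with divisor n (the target quantity in the thread)."""
    n = len(pop)
    mean = Fraction(sum(pop), n)
    return sum((Fraction(x) - mean) ** 2 for x in pop) / n


def wor_variance_estimate(sample, n):
    """(n-1)/n * s^2, with s^2 computed about the sample mean (divisor r-1)."""
    r = len(sample)
    mean = Fraction(sum(sample), r)
    s2 = sum((Fraction(x) - mean) ** 2 for x in sample) / (r - 1)
    return Fraction(n - 1, n) * s2


def expected_estimate(pop, r):
    """Average the estimator over every equally likely r-subset of the population."""
    n = len(pop)
    estimates = [wor_variance_estimate(s, n) for s in combinations(pop, r)]
    return sum(estimates) / len(estimates)


population = [2, 3, 5, 7, 11, 13]  # made-up integer values, purely illustrative
for r in range(2, len(population) + 1):
    # Exact equality holds because all arithmetic is done with Fractions.
    assert expected_estimate(population, r) == population_variance(population)
```

Enumerating unordered subsets is enough here because the estimator does not depend on the order in which the r values were drawn.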
 

FAQ: Estimator for variance when sampling without replacement

What is an estimator for variance when sampling without replacement?

An estimator for variance when sampling without replacement is a function of the sampled values whose expected value, taken over all equally likely samples drawn without replacement, equals the variance of the finite population. It accounts for the fact that, without replacement, the usual sample variance s² slightly overestimates the population variance defined with divisor N.

How is an estimator for variance calculated when sampling without replacement?

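Compute the usual sample variance about the sample mean and rescale it: σ̂² = ((N-1)/N) * ∑(x-x̄)²/(n-1), where N is the population size, n is the sample size, x ranges over the sampled data points, and x̄ is the sample mean. (The thread above writes n for the population size and r for the sample size.) The factor (N-n)/(N-1) is a different correction: it applies to the variance of the sample mean, not to the estimate of the population variance.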
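As a rough sketch of how the estimator from post #6 could be computed in practice, the snippet below rescales Python's standard-library sample variance by (N-1)/N. The function name and the data are hypothetical, and the population size is assumed to be known.

```python
import random
import statistics


def estimate_population_variance(sample, population_size):
    """Rescale the usual sample variance by (N - 1) / N.

    `sample` holds values drawn without replacement; `population_size` is N.
    """
    if len(sample) < 2:
        raise ValueError("need at least two sampled values")
    s2 = statistics.variance(sample)  # divisor len(sample) - 1, about the sample mean
    return (population_size - 1) / population_size * s2


population = [4.1, 5.0, 5.6, 6.2, 7.3, 8.8, 9.4, 10.0]  # hypothetical data
sample = random.sample(population, 4)  # picks 4 distinct positions, i.e. without replacement
estimate = estimate_population_variance(sample, len(population))
# A single estimate will usually differ from the true value; only its average matches.
print(estimate, statistics.pvariance(population))
```

`statistics.pvariance` uses divisor N, so it gives the quantity being estimated when the whole population is available.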

What are the assumptions made when using an estimator for variance when sampling without replacement?

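The main assumptions are that the population is finite with a known size N and that the sample is a simple random sample, so that every subset of n units is equally likely to be drawn. The observations are not independent when sampling without replacement, and no distributional assumption such as normality is needed for the estimator to be unbiased. The familiar rule that the sample be less than 10% of the population only indicates when the finite population correction is small enough to ignore; it is not a requirement.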

How accurate is an estimator for variance when sampling without replacement?

Under simple random sampling the estimator is exactly unbiased for any sample size from 2 up to N, so it is correct on average. How close a single estimate lands depends on the sample size relative to the population size: precision generally improves as a larger fraction of the population is sampled, and at n = N the estimate equals the population variance exactly.

What are some advantages of using an estimator for variance when sampling without replacement?

One advantage is that it is exactly unbiased for a finite population, which matters most when the sample is a sizeable fraction of a small population. When N is large the factor (N-1)/N is close to 1, so the estimator is practically the same as the usual sample variance s² and costs nothing extra to use.
