fog37
- TL;DR Summary
- Unbiasedness of estimates
Hello (again).
I have a basic question about standard error and unbiased estimators.
Let's say we have a population with a certain mean height and a corresponding variance. We can never know these two parameters; we can only estimate them. Certainly, the more accurate the estimates, the better.
To achieve that, we want our estimators to be unbiased, so that the expected value of the sample mean equals the actual population mean. This unbiasedness is rooted in the theoretical idea that if we took many, many samples, calculated their means, and built the sampling distribution of the means, the mean of those means would be the actual population mean. However, the sample means all differ from each other; some are close to the true mean and some are very far off. The standard error of the sample mean is essentially the standard deviation of that (approximately normal) sampling distribution, telling us how much the various sample means differ from each other. A quick simulation sketch below illustrates this.
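To make this concrete, here is a minimal simulation sketch in Python. It assumes a normal population with purely illustrative parameters ##\mu = 170## and ##\sigma = 10## (my choice, not anything canonical): it draws many samples of size ##n##, and checks that the mean of the sample means sits near ##\mu## while their standard deviation sits near ##\sigma/\sqrt{n}##.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 170.0, 10.0   # assumed "true" population parameters (illustrative)
n = 25                    # size of each sample
num_samples = 100_000     # number of repeated samples

# Draw num_samples samples of size n and compute each sample mean.
sample_means = rng.normal(mu, sigma, size=(num_samples, n)).mean(axis=1)

print("mean of sample means:", sample_means.mean())        # ~ mu (unbiasedness)
print("std of sample means: ", sample_means.std(ddof=1))   # ~ sigma / sqrt(n)
print("theoretical SE:      ", sigma / np.sqrt(n))
```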
That said, assuming the above is correct, we only work with a single sample of size ##n## and have a single sample mean whose value could still be very far from the actual mean, i.e. we could be way off! Isn't that a problem? The idea that ##E[\bar{X}] = \mu## (with ##\bar{X}## the sample mean and ##\mu## the true mean) seems very abstract. I know that, in statistics, we have no choice but to deal with uncertainty... I guess knowing that ##E[\bar{X}] = \mu## gives us a little more confidence that our sample statistic is a decent result? It is a better type of uncertainty than other kinds of uncertainty...
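(To be clear, I can verify the identity itself: for i.i.d. observations ##X_1, \dots, X_n## with ##E[X_i] = \mu##, linearity of expectation gives ##E[\bar{X}] = \frac{1}{n}\sum_{i=1}^n E[X_i] = \frac{1}{n} \cdot n\mu = \mu##. My discomfort is with the practical interpretation, not the algebra.)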
The same idea applies to ##95\%## confidence intervals: if we took a million samples and calculated a ##CI## from each, the true population mean would be captured inside ##95\%## of those intervals. That is an interesting result but, working with a single sample as we always do, it may well be possible that our constructed ##CI## does not contain the true population parameter!
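Again as a sanity check, here is a short simulation sketch of that coverage claim, under the same illustrative normal population as above and with ##\sigma## treated as known (so a simple z-interval applies rather than the usual t-interval with an estimated standard deviation):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 170.0, 10.0   # assumed population parameters (illustrative)
n = 25
num_samples = 100_000
z = 1.96                  # two-sided 95% critical value for the normal

samples = rng.normal(mu, sigma, size=(num_samples, n))
means = samples.mean(axis=1)
half_width = z * sigma / np.sqrt(n)   # known-sigma z-interval half-width

# Fraction of intervals [mean - hw, mean + hw] that capture the true mean.
covered = (means - half_width <= mu) & (mu <= means + half_width)
print("empirical coverage:", covered.mean())   # ~ 0.95
```

Any single interval either contains ##\mu## or it doesn't; the ##95\%## only describes the long-run fraction across repeated samples, which is exactly what puzzles me about the single-sample case.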
Thank you!