Standard Deviation Versus Sample Size & T-Distribution

In summary, the standard deviation of a t-distribution decreases as the degrees of freedom (and thus the sample size) increase, because more data tends to give a more accurate estimate of the true population standard deviation. The sample standard deviation underestimates the population standard deviation when the sum of squared deviations from the sample mean is divided by n. Dividing by (n-1) instead (Bessel's correction), or using the true population mean and dividing by n, makes the variance estimate unbiased, but its square root still underestimates the standard deviation on average. For the degrees of freedom of the t-distribution, the thread recommends using the n or (n-1) that was divided by. Finally, finding an unbiased estimator of a normal standard deviation is a challenging task.
  • #1
OpheliaM
I don't understand why the standard deviation of a t-distribution decreases as the degrees of freedom (and thus the sample size) increase, when the sample standard deviation underestimates the population standard deviation?
 
  • #2
OpheliaM said:
I don't understand why the standard deviation of a t-distribution decreases as the degrees of freedom (and thus the sample size) increase
More data tends to give a more accurate estimate of the true population standard deviation.
when the sample standard deviation underestimates the population standard deviation?
The sample standard deviation underestimates the population standard deviation if you use the sample mean and divide by n. If you use the true population mean and divide by n, or use the sample mean and divide by (n-1), that is not true. (CORRECTION: it is still underestimated. See @Number Nine's post below.) For the degrees of freedom of the t-distribution, you should use the n or (n-1) that you divided by.

PS. Just to be more clear: the sample mean should always be the sum of the sample divided by n. When I say "use the sample mean and divide by (n-1)", I mean that the sum of squares of deviations from the sample mean is divided by (n-1). That is Bessel's correction. (see https://en.wikipedia.org/wiki/Bessel's_correction )
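The difference between dividing by n and by (n-1) can be checked with a small simulation. The sketch below is only an illustration, not part of the original posts: the normal population, the true standard deviation of 2, the sample size of 5, and the number of trials are all assumed values chosen for the demo.

```python
import math
import random

random.seed(0)      # fixed seed so the run is repeatable

SIGMA = 2.0         # assumed true population SD (variance = 4)
N = 5               # assumed sample size
TRIALS = 20000      # assumed number of simulated samples

sum_var_n = 0.0     # running total of variance estimates, dividing by n
sum_var_n1 = 0.0    # running total of variance estimates, dividing by n-1
sum_sd_n1 = 0.0     # running total of sqrt of the (n-1) variance estimate

for _ in range(TRIALS):
    sample = [random.gauss(0.0, SIGMA) for _ in range(N)]
    mean = sum(sample) / N                      # sample mean: always divide by n
    ss = sum((x - mean) ** 2 for x in sample)   # sum of squared deviations
    sum_var_n += ss / N
    sum_var_n1 += ss / (N - 1)                  # Bessel's correction
    sum_sd_n1 += math.sqrt(ss / (N - 1))

avg_var_n = sum_var_n / TRIALS
avg_var_n1 = sum_var_n1 / TRIALS
avg_sd_n1 = sum_sd_n1 / TRIALS

print("true variance:            ", SIGMA ** 2)
print("avg variance (divide by n):  ", round(avg_var_n, 3))
print("avg variance (divide by n-1):", round(avg_var_n1, 3))
print("true SD:                  ", SIGMA)
print("avg SD (divide by n-1):      ", round(avg_sd_n1, 3))
```

Under these assumptions, the divide-by-n variance should average noticeably below 4, the divide-by-(n-1) variance should average close to 4, and the square root of the corrected variance should still average somewhat below 2.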
 
  • #3
FactChecker said:
More data tends to give a more accurate estimate of the true population standard deviation. The sample standard deviation underestimates the population standard deviation if you use the sample mean and divide by n. If you use the true population mean and divide by n, or use the sample mean and divide by (n-1), that is not true. For the degrees of freedom of the t-distribution, you should use the n or (n-1) that you divided by.

PS. Just to be more clear: the sample mean should always be the sum of the sample divided by n. When I say "use the sample mean and divide by (n-1)", I mean that the sum of squares of deviations from the sample mean is divided by (n-1). That is Bessel's correction. (see https://en.wikipedia.org/wiki/Bessel's_correction )

A minor point: the Bessel-corrected sample standard deviation (i.e. the square root of the sum of squared deviations from the sample mean, divided by n-1) is actually a biased estimate of the population standard deviation. This follows from Jensen's inequality, since the square root is a concave function. It's fairly difficult to find an unbiased estimator of a normal standard deviation, and the correction has no simple elementary form (it involves gamma functions) -- see https://en.wikipedia.org/wiki/Unbia...deviation#Results_for_the_normal_distribution
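For normally distributed data specifically, the expected value of the Bessel-corrected sample SD is c4(n) times the true SD, where c4(n) = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2). The following is a minimal sketch of that correction; the function names `c4` and `unbiased_sd` are hypothetical, and the result is only unbiased under the normality assumption.

```python
import math

def c4(n: int) -> float:
    """Factor with E[s] = c4(n) * sigma for normal data (s uses n-1)."""
    if n < 2:
        raise ValueError("need at least two observations")
    # Use log-gamma so large n does not overflow the gamma function.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)

def unbiased_sd(sample):
    """Bessel-corrected SD divided by c4(n); assumes normal data."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return s / c4(n)

for n in (2, 5, 10, 50):
    print(n, round(c4(n), 4))
```

The factor is always below 1 and approaches 1 as n grows, which matches the point that the underestimate matters most for small samples.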
 
  • #4
Number Nine said:
A minor point: the Bessel-corrected sample standard deviation (i.e. the square root of the sum of squared deviations from the sample mean, divided by n-1) is actually a biased estimate of the population standard deviation. This follows from Jensen's inequality, since the square root is a concave function. It's fairly difficult to find an unbiased estimator of a normal standard deviation, and the correction has no simple elementary form (it involves gamma functions) -- see https://en.wikipedia.org/wiki/Unbia...deviation#Results_for_the_normal_distribution
I stand corrected. Thanks. I will correct my prior post.
 

FAQ: Standard Deviation Versus Sample Size & T-Distribution

What is standard deviation and why is it important?

Standard deviation is a measure of how spread out a set of data is from its mean. It is important because it helps us understand the variability or dispersion of data within a population or sample.

How does sample size affect standard deviation?

Sample size does not change the population standard deviation itself; it changes how accurately a sample estimates it. The larger the sample, the less the sample standard deviation varies from one sample to the next, so it becomes a more reliable measure of dispersion.

What is the relationship between standard deviation and the t-distribution?

The t-distribution is a probability distribution that is used when the sample size is small and the population standard deviation is unknown. It is similar to the normal distribution, but accounts for the extra uncertainty that comes from estimating the standard deviation from a small sample. The sample standard deviation is used to calculate the t-statistic, which measures how far the sample mean lies from a hypothesized mean in units of the estimated standard error.

How does sample size affect the shape of the t-distribution?

As sample size increases, the degrees of freedom increase and the t-distribution becomes more similar to the standard normal distribution. The t-distribution is symmetric for every sample size; what changes is that its tails become thinner and its standard deviation shrinks toward 1, because the sample standard deviation becomes a more precise estimate of the population standard deviation.
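The shrinking spread can be made concrete: a t-distribution with df degrees of freedom has variance df / (df - 2) when df > 2, so its standard deviation is sqrt(df / (df - 2)). A short sketch (the function name `t_sd` is just an illustrative choice):

```python
import math

def t_sd(df: float) -> float:
    """Standard deviation of a Student t-distribution with df > 2."""
    if df <= 2:
        raise ValueError("the variance is not finite for df <= 2")
    return math.sqrt(df / (df - 2))

# The SD falls toward 1 (the standard normal SD) as df grows.
for df in (3, 5, 10, 30, 100):
    print(df, round(t_sd(df), 4))
```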

How does the t-distribution help with hypothesis testing?

In hypothesis testing, the t-distribution is used to calculate how probable it is to see a t-statistic at least as extreme as the observed one if the null hypothesis is true. By comparing the calculated t-statistic to a critical value from the t-distribution with the appropriate degrees of freedom, we can decide whether the sample mean is significantly different from the hypothesized population mean. This helps us make decisions about our hypotheses and draw conclusions based on our data.
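As an illustration, here is a minimal one-sample t-test sketch using only the Python standard library. The data values and the hypothesized mean of 5.0 are made up for the example, the helper name `one_sample_t` is hypothetical, and the critical value 2.776 is the usual two-sided 5% table value for 4 degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)            # divides by n-1 (Bessel's correction)
    t = (mean - mu0) / (s / math.sqrt(n))   # distance in estimated standard errors
    return t, n - 1

data = [5.1, 4.9, 5.6, 5.8, 5.3]            # made-up measurements
t_stat, df = one_sample_t(data, mu0=5.0)

T_CRIT = 2.776   # two-sided 5% critical value for df = 4, taken from a t table
reject = abs(t_stat) > T_CRIT

print("t =", round(t_stat, 4), "df =", df, "reject H0:", reject)
```

With these numbers the t-statistic is about 2.08, which is below the critical value, so the null hypothesis would not be rejected at the 5% level. Note that the degrees of freedom are n-1 because the standard deviation was computed with the (n-1) divisor.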
