How can I calculate the uncertainty of the mean with Gaussian error propagation?

ida10 · Nov 18, 2024

I thought that maybe it would be a good idea to apply Gaussian error propagation to the formula for the mean; this should give me the uncertainty of the average I calculate from the sample I have...
And additionally consider the standard deviation.
Can someone maybe give me a detailed way to handle something like this? The internet sources and books I looked at only cover samples where the individual measurements don't have uncertainties...

jbriggs444 · Nov 18, 2024

ida10 said:

Homework Statement: Hello,
I have measured the length of a tube 16 times.
Now each of the individual measurements has an uncertainty due to the measuring device.
I now want to calculate the mean and its uncertainty.
Relevant Equations: /

I thought that maybe it would be a good idea to apply Gaussian error propagation to the formula for the mean; this should give me the uncertainty of the average I calculate from the sample I have...
And additionally consider the standard deviation.
Can someone maybe give me a detailed way to handle something like this? The internet sources and books I looked at only cover samples where the individual measurements don't have uncertainties...

One can spend a semester learning about just the purely statistical aspect of this. Then there is also the experimental physics side including things like quantization error and systematic error. I will not touch at all on those in the text that follows.

On the statistical side, one would begin by making some assumptions about your measurement process.

Assumption: Each measurement is a random process. The result will have some probability distribution.

Assumption: The measurements are independent and identically distributed.

You do not know what the distribution is. You do not know what its mean is. All you have is your 16 measurements.

The distribution does have a mean. We call it the "population mean". You are trying to estimate this mean.
The distribution does have a standard deviation. We call it the "population standard deviation". You are trying to estimate this standard deviation.

If you calculate the mean of the measurements in your sample, that gives you the "sample mean". The sample mean is an unbiased estimator for the mean of the distribution.

The standard deviation is trickier. The obvious move is to compute the sample variance; the standard deviation is the square root of the variance. One computes the variance by adding up the squared difference of each measurement from the sample mean and dividing by the number of measurements:$$V = \frac{1}{n}\sum_i (x_i - x_\text{avg})^2$$
It turns out that this is not an unbiased estimator. The sample mean tends to be closer to the measured values than the population mean is, so the above formula tends to underestimate the variance, and with it the standard deviation. The remedy for this is to divide by ##n-1## instead of ##n##:$$\text{Standard Deviation Estimate} = \sigma = \sqrt{\frac{1}{n-1}\sum_i (x_i - x_\text{avg})^2}$$To be clear, I took a 400 level class in statistics about 47 years ago and I am regurgitating my understanding from memory. You might be better served by looking at the Wiki article on standard deviation.
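For anyone who wants to try this numerically, here is a minimal Python sketch of the two estimates described above. The function names and the 16 readings are made up for illustration; they are not the OP's data.

```python
import math

def sample_mean(xs):
    """Arithmetic mean of the measurements."""
    return sum(xs) / len(xs)

def sample_std(xs):
    """Standard deviation estimate with the n-1 adjustment."""
    n = len(xs)
    m = sample_mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))

# Hypothetical tube lengths in mm (illustrative values only)
lengths_mm = [37.6, 37.8, 37.7, 37.9, 37.5, 37.7, 37.8, 37.6,
              37.7, 37.7, 37.8, 37.6, 37.9, 37.7, 37.6, 37.8]

print("mean:", sample_mean(lengths_mm))
print("std (n-1):", sample_std(lengths_mm))
```

The standard library's `statistics.mean` and `statistics.stdev` compute the same two quantities; they are written out here only to make the ##n-1## divisor visible.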

haruspex · Nov 18, 2024

Just to add…

The terminology "population mean" etc. comes from considering distributions of very large but finite populations, like the entire populace of a country. In the context of the measurements you are making, it is the infinite number of measurements you could make.

I caution against looking up formulas and applying them without thinking a bit more about what is really going on. Why will your measurements produce varied results? Is there a systematic error? There is sure to be a quantisation error.
Consider using a ruler marked only in cm intervals. If the actual length is 3.77cm, it is likely all your measurements will record 4cm. That is a systematic error.
If the ruler is marked in mm intervals but is 2mm thick, parallax may produce a randomised quantisation error.

Interestingly, an added random error can produce a more accurate result. Without the parallax error, you might well take 16 measurements that all say 38mm. With it, you get a mix of 38mm and 37mm (and possibly some above and below that), but with enough samples you will get a mean closer to 37.7.
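The effect is easy to see in a small simulation. This is only a sketch under stated assumptions: the true length is 37.7mm, readings are rounded to the nearest mm mark, and the parallax is modelled as a uniform random offset of up to half a division either way. The function names and the uniform model are illustrative choices, not something measured.

```python
import random

def read_ruler(true_mm, jitter_mm, rng):
    """One reading, rounded to the nearest mm mark.

    jitter_mm is the half-width of a uniform random offset
    (an assumed stand-in for parallax)."""
    offset = rng.uniform(-jitter_mm, jitter_mm)
    return round(true_mm + offset)

def mean_reading(true_mm, jitter_mm, n, seed=0):
    """Mean of n simulated readings, using a seeded generator."""
    rng = random.Random(seed)
    readings = [read_ruler(true_mm, jitter_mm, rng) for _ in range(n)]
    return sum(readings) / n

# No random error: every one of the 16 readings is 38
print(mean_reading(37.7, 0.0, 16))

# Random error of +/- 0.5 mm and many readings: the mean drifts toward 37.7
print(mean_reading(37.7, 0.5, 100_000))
```

With only 16 readings the second mean will still scatter noticeably around 37.7, so the improvement shows up on average rather than in every run.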

WWGD · Nov 18, 2024

As I remember it, the uncertainty of "first-order" random variables such as the (sample) mean, variance, etc., is described through the sample variance, while propagation of errors is used for functions of said first-order variables.

jbriggs444 · Nov 18, 2024

WWGD said:

As I remember it, the uncertainty of "first-order" random variables such as the (sample) mean, variance, etc., is described through the sample variance, while propagation of errors is used for functions of said first-order variables.

I agree.

So if, for instance, you estimate the standard deviation in the measurement distribution based on the square root of the sample variance (with the n-1 adjustment) then you can estimate the standard deviation in the mean of 16 samples by dividing that figure by ##\sqrt{16}##.

If you make the assumption that the distribution of the sample mean is approximately normal, you can convert that standard deviation into an uncertainty range at a chosen confidence level.

Fair warning. This is ivory tower pontification for me. I do not do this for a living.
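The recipe in this post (standard deviation with the n-1 adjustment, divided by ##\sqrt{n}##, then converted to a range under a normal assumption) can be sketched as follows. The function name and the 95% default are assumptions for illustration. Note that with only 16 samples a Student t factor would give a somewhat wider range than the normal factor used here.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_with_interval(xs, confidence=0.95):
    """Return (mean, standard error of the mean, (low, high)).

    Assumes the sample mean is approximately normally distributed."""
    n = len(xs)
    m = mean(xs)
    s = stdev(xs)                 # uses the n-1 adjustment
    sem = s / math.sqrt(n)        # standard deviation of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided normal factor
    return m, sem, (m - z * sem, m + z * sem)

# Hypothetical readings in mm (illustrative values only)
readings_mm = [37.6, 37.8, 37.7, 37.9, 37.5, 37.7, 37.8, 37.6,
               37.7, 37.7, 37.8, 37.6, 37.9, 37.7, 37.6, 37.8]

m, sem, (low, high) = mean_with_interval(readings_mm)
print(f"mean = {m:.3f} mm, standard error = {sem:.3f} mm")
print(f"approx. 95% range: {low:.3f} to {high:.3f} mm")
```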

ida10 · Nov 18, 2024

Yes, I also know Gaussian error propagation for functions, but the formula for the mean is also just a function of the measurements... I think that if I just have the measured values (without uncertainty of the measurements), it makes complete sense to just use standard deviation and standard error, as it represents how far the measurement values spread from each other and therefore also provides a statistical value for other "measurement samples" on how the mean of them might spread... But in my case all measurements also have a certain uncertainty, and the standard deviation wouldn't take those into account... Therefore I thought that if I treat the mean as a function (which it is), maybe Gaussian error propagation would be an idea... but that's what I am not sure about. The problem I have is therefore how I should take the uncertainty of the measurements into account?

Reference: https://www.physicsforums.com/threads/uncertainty-of-mean.1067000/

WWGD · Nov 18, 2024

Do you have the explicit form of the function describing the mean?

jbriggs444 · Nov 18, 2024

ida10 said:

Yes, I also know Gaussian error propagation for functions, but the formula for the mean is also just a function of the measurements...

Yes. The formula for the mean (##\frac {\sum x_i}{n}##) is a function of the measurements (##x_i##)

I am not sure what you are getting at with that observation.

ida10 said:

I think that if I just have the measured values (without uncertainty of the measurements), it makes complete sense to just use standard deviation and standard error

What do you think "standard deviation" and "standard error" mean in this context? It is a serious question. If we are to understand one another clearly, we need to understand the words we each are using.

ida10 said:

as it [they?] represents how far the measurement values spread from each other and therefore also provides a statistical value for other "measurement samples" on how the mean of them might spread

This sounds like it might be correct. But I am not certain what it is saying.

ida10 said:

... but in my case all measurements also have a certain uncertainty, and the standard deviation wouldn't take those into account...

You seem to be speaking of "uncertainty" as if it is something separate from "standard deviation". To me, they are almost the same thing. "Standard deviation" is a way of characterizing a distribution (or, equivalently, a random variable). It lets you quantify the width of the distribution with a single number. "Uncertainty" to me talks about the likelihood that a measurement will be within some range of the correct answer. This is a different way of quantifying the width of the distribution.

Yes, the measurements have an unknown uncertainty. That uncertainty is a characteristic of some unknown distribution. That distribution likely has a standard deviation. Which is also unknown.

A useful idea is to estimate the standard deviation of the unknown distribution.
Another useful idea is to estimate the mean of the unknown distribution.

After reading this thread, do you know how to do either of those things using the 16 sample values?

ida10 · Nov 18, 2024

The mean is the sum of all measurements divided by the number of measurements... the mean is therefore a function of the measurements...
m(x_1, ..., x_n) = (sum x_i)/n
where x_i, i = 1, ..., n, are the measurements, and with those Gaussian error propagation would give
sqrt(sum deltax_i²)/n as the formula for the error, where deltax_i is the uncertainty of measurement x_i (the factor 1/n comes from the partial derivative of the mean with respect to each x_i)
(Sorry for the messily written formulas)

Yes, my measurements have uncertainty given by the measurement device...
Yes, I do know how to do those, but again I have not seen them done with the errors of the measurements taken into account...

Let's say I have measured something 16 times and I only use the value my measurement device tells me; then I can estimate the mean and standard deviation... I have seen lots of examples of that.

But I have 16 measurements that all themselves have an uncertainty, e.g. 5,0mm +/- 0,1mm... and the standard deviation only considers the 5,0mm, but what about my +/- 0,1mm...

(That's why my idea would have been error propagation, because in the end it is only a function)

Sorry if I just don't understand the answers completely, and also thank you for helping me... I try to find literature on this kind of problem, but all I find is without uncertainty of the measurements.

jbriggs444 · Nov 18, 2024

ida10 said:

Yes, my measurements have uncertainty given by the measurement device...

Ahhh, so we are not forced to estimate the standard deviation of an unknown measurement distribution. The manufacturer has characterized the distribution for us, by providing its uncertainty.

You should probably compare the standard deviation in your 16 sample data set (with the n-1 adjustment) against the assurances from the manufacturer to see what you can trust.

If your 16 measurements all match then the earlier and subsequent comments by @haruspex about quantization error become quite significant.

haruspex · Nov 18, 2024

ida10 said:

The problem I have is therefore how I should take the uncertainty of the measurements into account?

If you model the individual measurements as independent, Gaussian with a known variance and take the sample mean as the population mean, you can use the formula for the variance of their sum.
Since you have a reasonably large number of samples, the fact that the individual measurements are (probably) uniformly distributed is unimportant. Substituting a Gaussian distribution of the same mean and variance will be ok.
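As a sketch of what that formula gives for the mean: each partial derivative of the mean with respect to a measurement is ##1/n##, so the propagated uncertainty is ##\frac{1}{n}\sqrt{\sum_i \Delta x_i^2}##, which for equal uncertainties reduces to ##\Delta x/\sqrt{n}##. The function name and the ±0.1mm figure below are assumptions for illustration, and the result only holds if the quoted uncertainty is a standard deviation of independent random errors. A systematic offset shared by all readings would not shrink with ##n##.

```python
import math

def propagated_uncertainty_of_mean(deltas):
    """Gaussian error propagation through mean = sum(x_i) / n.

    deltas: per-measurement standard uncertainties, assumed independent."""
    n = len(deltas)
    return math.sqrt(sum(d ** 2 for d in deltas)) / n

# Hypothetical case: 16 readings, each quoted as +/- 0.1 mm
deltas_mm = [0.1] * 16
print(propagated_uncertainty_of_mean(deltas_mm))  # approximately 0.1 / sqrt(16) = 0.025
```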

More of an issue is any systematic error from the granularity of the measurements, as discussed in post #3. I have not tried it, but there might be a general approach using Maximum Likelihood Estimation. E.g. model as quantised Gaussian, i.e. lumped into intervals around the measurement granularity, then vary the mean and variance to maximise the likelihood of the observed data. At one extreme, if all the measurements are the same you deduce quite a low underlying variance but the uncertainty in the mean won't fall below that of the individual measurements. At the other, a wide scatter of measurements will imply a largish individual variance but a smaller uncertainty in the population mean.
Merely maximising the likelihood might be too crude to get a result, though. It may be necessary to bias it somehow.
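A rough sketch of the quantised-Gaussian idea, to make it concrete. Everything here is an assumption for illustration: the readings are invented, the granularity is 1mm, and a plain grid search stands in for a proper optimiser. Each reading contributes the probability that a Gaussian with the trial mean and standard deviation lands in the interval that rounds to that reading.

```python
import math
from statistics import NormalDist

def log_likelihood(readings, mu, sigma, step=1.0):
    """Log-likelihood of rounded readings under a Gaussian binned to `step`."""
    dist = NormalDist(mu, sigma)
    total = 0.0
    for r in readings:
        # Probability mass of the interval that rounds to reading r
        p = dist.cdf(r + step / 2) - dist.cdf(r - step / 2)
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total

def grid_mle(readings, mus, sigmas, step=1.0):
    """Return the (mu, sigma) pair on the grid with the highest likelihood."""
    candidates = ((m, s) for m in mus for s in sigmas)
    return max(candidates, key=lambda ms: log_likelihood(readings, ms[0], ms[1], step))

# Hypothetical readings in mm: four 37s, nine 38s, three 39s
readings_mm = [37] * 4 + [38] * 9 + [39] * 3

mus = [37.0 + 0.01 * k for k in range(201)]   # trial means 37.00 .. 39.00
sigmas = [0.05 * k for k in range(1, 41)]     # trial sigmas 0.05 .. 2.00
mu_hat, sigma_hat = grid_mle(readings_mm, mus, sigmas)
print(f"mu ~ {mu_hat:.2f} mm, sigma ~ {sigma_hat:.2f} mm")
```

As far as I can tell, when the readings fall on only one or two marks the likelihood has no interior maximum and sigma is pushed to the smallest value on the grid, which looks like the crudeness mentioned above.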

BvU · Nov 18, 2024

jbriggs444 said:

class in statistics about 47 years ago

For me it is 53 years ago. And one of the things I remember is that the relative uncertainty in the estimate of the standard deviation is about ##1/\sqrt n##, so 25% in this case.
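For comparison, a commonly quoted large-sample approximation for Gaussian data puts the relative uncertainty of the standard deviation estimate at ##1/\sqrt{2(n-1)}##, which is somewhat smaller than the ##1/\sqrt n## rule of thumb. A quick sketch, assuming Gaussian measurements (the function name is made up for illustration):

```python
import math

def rel_uncertainty_of_std(n):
    """Approximate relative standard error of the sample standard deviation,
    assuming Gaussian data and the n-1 estimator."""
    return 1.0 / math.sqrt(2 * (n - 1))

n = 16
print(1 / math.sqrt(n))            # rule of thumb: 0.25
print(rel_uncertainty_of_std(n))   # roughly 0.18
```

Either way, with 16 readings the spread itself is only known to within about a fifth to a quarter of its value.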

ida10 said:

each of the individual measurements has an uncertainty due to the measuring device

I would very much like to have that clarified (since they all concern the same tube afaik). Assigning different errors to individual measurements is a tricky business that has to be seriously justified.

Furthermore you have an internal error (from the measurements) and an external error from the estimate of the measurement error (I think only the non-systematic part of the latter).


haruspex · Nov 18, 2024

BvU said:

Assigning different errors to individual measurements is a tricky business

I do not read @ida10's statement as implying the individual measurements have different a priori error distributions (which I believe is called heteroscedasticity).

WWGD · Nov 18, 2024

haruspex said:

I do not read @ida10's statement as implying the individual measurements have different a priori error distributions (which I believe is called heteroscedasticity).

I guess the CLT doesn't kick in unless the samples are IID.

jbriggs444 · Nov 18, 2024

WWGD said:

I guess the CLT doesn't kick in unless the samples are IID.

A quick look at the Wiki for the CLT says that there is a version that does not require IID:

https://en.wikipedia.org/wiki/Central_limit_theorem said:

The central limit theorem has several variants. In its common form, the random variables must be independent and identically distributed (i.i.d.). This requirement can be weakened; convergence of the mean to the normal distribution also occurs for non-identical distributions or for non-independent observations if they comply with certain conditions.

WWGD · Nov 18, 2024

jbriggs444 said:

A quick look at the Wiki for the CLT says that there is a version that does not require IID:

Ok, should have been more precise: Not guaranteed to converge in distribution.
