Propagating Error in both numerator and denominator of ratio

In summary: the background subtraction introduces errors that are fully correlated across the N samples, so the naive SEM of the A/B values needs to be corrected for them.
  • #1
Roo2
Hello,

I have a set of N samples, each of which yields the measurable variables A and B. I am interested in computing the mean and standard error of the ratio A/B within the group. The catch is that I need to do background subtraction on both A and B, and the two different background values BGA and BGB are themselves the mean values of ~100 replicate measurements, which means they have their own error which must be propagated.

Based on a previous thread that I made a long time ago, I gather that since the same BG values are subtracted from each of the N measurements, the error is 100% correlated and should be added in quadrature to the calculated SEM of my set of (A/B) values. However, since I have a ratio of uncertainties (BGA/BGB), how do I treat this?

I know that the formula for propagating the error of a ratio is Δz = aΔb + bΔa. However, the values of a and b are specific to the given sample within the set, so I no longer have a single value that I can add in quadrature. What is the correct approach to handle the error propagation?
 
  • #2
You can extend the formula to see how the error propagates. If I understand your description correctly, you calculate
$$ R = \frac{A_{meas} - BG_A}{B_{meas}-BG_B}$$
where you have your N measurements of A and B but a single common background estimate for each. Now you can estimate how the mean and standard deviation of R vary if the A_{meas} and B_{meas} vary within their uncertainties, how they vary if BGA varies within its uncertainty, and so on. If those different contributions are independent, add their influence on R in quadrature.

This works if the uncertainties are not too large; in particular, the uncertainty in the denominator should be much smaller than the central value of the difference.
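
For concreteness, a minimal numerical sketch of that shift-and-recombine recipe (added for illustration; the function and the example numbers are not from the thread): shift each background estimate by one sigma, record how much R moves, and add the moves in quadrature.

Code (Python):
# Sketch of the shift-and-recombine idea from post #2, applied to the two
# background uncertainties only. Names and numbers are illustrative.

def ratio(a_meas, b_meas, bg_a, bg_b):
    """Background-subtracted ratio R = (A_meas - BG_A) / (B_meas - BG_B)."""
    return (a_meas - bg_a) / (b_meas - bg_b)

def background_sigma_on_r(a_meas, b_meas, bg_a, bg_b, sig_bg_a, sig_bg_b):
    """Combine the influence of each background uncertainty on R in quadrature."""
    r0 = ratio(a_meas, b_meas, bg_a, bg_b)
    d_a = ratio(a_meas, b_meas, bg_a + sig_bg_a, bg_b) - r0  # BG_A shifted by one sigma
    d_b = ratio(a_meas, b_meas, bg_a, bg_b + sig_bg_b) - r0  # BG_B shifted by one sigma
    return (d_a ** 2 + d_b ** 2) ** 0.5

# Illustrative call (not the thread's data):
print(background_sigma_on_r(3000.0, 0.10, 2460.0, 0.041, 39.0, 0.0006))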
 
  • #3
Your Δz = aΔb + bΔa is valid for uncorrelated errors in a and b only.

[edit] should be (Δz/z)² = (Δa/a)² + (Δb/b)²

(Idem: Δ(a-b)² = Δa² + Δb²)
Check out Kirchner page 5ff -- the inconvenient truth

So it's important to know if ΔBGA and ΔBGB are correlated or not ! Uncorrelated or 100% correlated makes a (potentially big) difference (and here uncorrelated is less favorable ...)

Numerical values are important here: if you have a signal hardly above background, you worry about this correlation; if it's just a small correction, you ignore the error in the background (and compensate a bit when drawing conclusions on how R depends on whatever was varied -- if anything was varied at all, which isn't clear to me from your post).

In conclusion: a bit more context and a bit more quantitative info are desirable for better assistance :smile:
 
  • #4
BvU said:
Your Δz = aΔb + bΔa is valid for uncorrelated errors in a and b only.

(Idem: Δ(a-b)² = Δa² + Δb²)
Check out Kirchner page 5ff -- the inconvenient truth

So it's important to know if ΔBGA and ΔBGB are correlated or not ! Uncorrelated or 100% correlated makes a (potentially big) difference (and here uncorrelated is less favorable ...)

Numerical values are important here: if you have a signal hardly above background, you worry about this correlation; if it's just a small correction, you ignore the error in the background (and compensate a bit when drawing conclusions on how R depends on whatever was varied -- if anything was varied at all, which isn't clear to me from your post).

In conclusion: a bit more context and a bit more quantitative info are desirable for better assistance :smile:

Thanks for your help!

Is there a statistical test to see if ΔBGA and ΔBGB are correlated, or is it decided by logical inference? The setup of the experiment is that I'm measuring the optical density and light emitted by bacterial cultures using two different instruments. The background measurements are taken using containers filled with unpopulated bacterial media, and virtually all of the signal (for both A and B) arises from the plastic container. Since the two containers are different, I would expect no correlation between the i'th BGA and the i'th BGB. On the other hand, I do expect correlation between A (light emitted) and B (optical density) of the cultures, because in general the higher the bacterial concentration, the higher the photon count. However, I am testing these measurements for a variety of growth conditions, some of which severely impair the production of the light-emitting element, so under some conditions high density cultures produce almost no light. The different growth conditions are grouped separately; e.g. all of the bacteria within each set are grown under the same conditions to the best of my ability to control them.

Quantitatively, the errors are pretty small compared to the means; the mean +/- Standard deviation for the two measurements are:

BGA = 2460 +/- 39 (N = 95)
BGB = 0.041 +/- 0.0006 (N = 96)
 
  • #5
Roo2 said:
Is there a statistical test to see if ΔBGA and ΔBGB are correlated, or is it decided by logical inference?
It follows from your method; there is no test that can measure correlation between two single values based on those values alone.
Where does the uncertainty estimate for the background measurements come from? If it is purely a statistical uncertainty, it is uncorrelated. Some calibration that might be common for both methods could introduce a correlation.
 
  • #6
mfb said:
You can extend the formula to see how the error propagates. If I understand your description correctly, you calculate
$$ R = \frac{A_{meas} - BG_A}{B_{meas}-BG_B}$$
where you have your N measurements of A and B but a single common background estimate for each. Now you can estimate how the mean and standard deviation of R vary if the A_{meas} and B_{meas} vary within their uncertainties, how they vary if BGA varies within its uncertainty, and so on. If those different contributions are independent, add their influence on R in quadrature.

This works if the uncertainties are not too large; in particular, the uncertainty in the denominator should be much smaller than the central value of the difference.

I'm concerned that the treatment proposed above doesn't keep the units in mind. I think it may be helpful for me to have a concrete example. Let's say we perform the analysis with the data below (generated randomly for the sake of this post):

A1: 3623 Fluorescence Units
B1: 0.074 Absorbance Units

A2: 5927 Fluorescence Units
B2: 0.129 Absorbance Units

A3: 4512 Fluorescence Units
B3: 0.096 Absorbance Units

Applying the background subtraction indicated above in post 4:
BGA = 2460 +/- 39 (N = 95)
BGB = 0.041 +/- 0.0006 (N = 96)

A1: 1163 +/- 39 FU
B1: 0.033 +/- 0.0006 AU

A2: 3467 +/- 39 FU
B2: 0.088 +/- 0.0006 AU

A3: 2052 +/- 39 FU
B3: 0.055 +/- 0.0006 AU

Computing the ratios for samples 1, 2, and 3:

R1: 35242.42 FU/AU

R2: 39397.73 FU/AU

R3: 37309.09 FU/AU

Mean(R) = 37316.41 FU/AU
StDev(R) = 2077.66 FU/AU
SEM(R) = 1199.54 FU/AU
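
For reference, a few lines of Python (a sketch added for checking, not part of the original post) reproduce these numbers:

Code (Python):
from statistics import mean, stdev

# Background-subtracted example values from above
A = [1163, 3467, 2052]        # FU
B = [0.033, 0.088, 0.055]     # AU

R = [a / b for a, b in zip(A, B)]     # ratios in FU/AU
m = mean(R)
sd = stdev(R)                         # sample standard deviation (N - 1)
sem = sd / len(R) ** 0.5

print(R)              # [35242.42..., 39397.72..., 37309.09...]
print(m, sd, sem)     # ~37316.41, ~2077.66, ~1199.54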

Adding the errors in quadrature, as I understand your post #2:
$$SEM_{new} = \sqrt{(1199.54 FU/AU)^2 + (39 FU)^2 + (.0006 AU)^2}$$
Dimensionally, this doesn't make much sense to me. Additionally, a huge relative error in the absorbance (such as 1 AU) counts less than a tiny relative error in fluorescence (such as the provided 39 FU).

One way I could see to correct the approach is to propagate the error into each of R1, R2, and R3, and then add all of those in quadrature:
$$SEM(R_{n}) = R_{n} * \sqrt{ (\frac {SEM(A_{n})} {A_{n}})^2 + (\frac {SEM(B_{n})} {B_{n}})^2} $$
Since the error in the measurement of An is 0, SEM(An) should just be the quadrature-added sqrt(0² + SEM(BGA)²) = SEM(BGA) = 39, and likewise SEM(Bn) = SEM(BGB) = 0.0006.

This produces errors which are in the appropriately scaled units of R. Using this formula:
$$SEM(R_{1}) = 35242.42 * \sqrt{ (\frac {39} {1163})^2 + (\frac {.0006} {.033})^2} = 1344.35$$
$$SEM(R_{2}) = 39397.73 * \sqrt{ (\frac {39} {3467})^2 + (\frac {.0006} {.088})^2} = 518.23$$
$$SEM(R_{3}) = 37309.09 * \sqrt{ (\frac {39} {2052})^2 + (\frac {.0006} {.055})^2} = 817.60 $$

Adding those errors in quadrature to the SEM of the set produces:
$$SEM(R) = \sqrt{1199.54^2 + 1344.35^2 + 518.23^2 + 817.6^2} = 2045.29$$
$$R = 37316.41 \pm 2045.29 $$
This seems like a more reasonable approach; however, I'm concerned that it triple-counts the identical background errors (39 for A and 0.0006 for B). Does this need to be somehow rectified?
 
  • #7
Must be very secretive work if you need to provide randomly generated example data ... :smile:

Using the same symbol for the raw observations (A1 = 3623, etc.) and the background-subtracted values (A1 = 1163, etc.) is confusing; try to avoid it.

What I am still missing is:
  • the errors in A and B
I take it A1 (3623) and B1 (0.074) are single observations. What are they, exactly (counts, other raw data, the outcome of a peak-integration procedure from some chromatographic machine, ...)? And what about the ##\sigma## from counting statistics, and about calibration factors (the uncertainties therein, and whether they -- or some of them -- are common to A and B)?

If BGA is the outcome of ~100 measurements and has an error of 1.6 %, I would expect a single measurement of the background to have an error of around 16% (##\sigma_m = {\sigma \over \sqrt N}##). Such a sigma can come from Poisson statistics when the actual count is around 36 (##\sigma = \sqrt x##). You report BGA = 2460, so perhaps the calibration factor from counts to FU (sorry, never heard of it...) is 68 and A1 is in fact only 53 counts. If that follows Poisson statistics, then A1 has a sigma of 7 * 68 and you can completely ignore ##\ \sigma_{\rm\,BGA}##, ##\ \sigma_{\rm\,BGB}## and ##\ {\rm cov(BGA, BGB)}##, and we are done. However, if I am on the wrong track, then please set us straight.
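
Putting that back-of-envelope into numbers (a sketch added for illustration; it assumes the quoted ±39 is a standard error of the mean and that the raw signals are Poisson-distributed counts, so the derived values are rough and differ slightly from the rounded ones above):

Code (Python):
# Back-of-envelope, assuming 39 is the standard error of the mean of N = 95
# background measurements and the raw data are Poisson counts.
BG_A, sem_BG_A, N = 2460.0, 39.0, 95

sigma_single = sem_BG_A * N ** 0.5   # sigma of a single background measurement, ~380 (~15%)
rel_err = sigma_single / BG_A        # ~0.15
counts_bg = 1.0 / rel_err ** 2       # Poisson: sigma/x = 1/sqrt(x)  ->  x ~ 40 counts
calib = BG_A / counts_bg             # FU per count, of order 60
A1 = 3623.0
counts_A1 = A1 / calib               # ~60 counts
sigma_A1 = counts_A1 ** 0.5 * calib  # Poisson sigma of A1, expressed in FU

print(sigma_single, counts_bg, calib, counts_A1, sigma_A1)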

I find it surprising BGA and BGB have the same relative error. Coincidence?

I read your post #6 up to the word 'Computing'. We need some clarification first. But if you run into dimensional discrepancies you can be certain you are doing something very wrong.
 
  • #8
Roo2 said:
Adding the errors in quadrature
Add the uncertainties on R that result from the uncertainty sources in quadrature.
If you increase or decrease BGA by 39, how does R change? Either find a formula or really repeat all calculations with a larger and smaller background. That is your uncertainty on R coming from BGA. That you can add in quadrature to other similarly determined uncertainties.

Roo2 said:
One way I could see to correct the approach is to propagate the error when measuring R1 , R2, and R3, and adding all of those in quadrature:
The formula you use there would only work if R were the product of all components. And you would have to apply it to the final R, not to each individual R, because otherwise you don't take the correlations into account.
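
To make that concrete with the example data from post #6 (a sketch of the suggested procedure, not mfb's own code): shift BGA by its 39, shift BGB by its 0.0006, see how the mean ratio moves in each case, and add those shifts in quadrature with the purely statistical SEM.

Code (Python):
from statistics import mean, stdev

# Raw example values from post #6 and the backgrounds from post #4.
A_raw = [3623, 5927, 4512]       # FU
B_raw = [0.074, 0.129, 0.096]    # AU
BG_A, sig_BG_A = 2460.0, 39.0
BG_B, sig_BG_B = 0.041, 0.0006

def mean_ratio(bg_a, bg_b):
    """Mean of (A - bg_a) / (B - bg_b) over the whole sample set."""
    return mean((a - bg_a) / (b - bg_b) for a, b in zip(A_raw, B_raw))

R0 = mean_ratio(BG_A, BG_B)

# Statistical scatter of the individual ratios (SEM of the three R values).
R_each = [(a - BG_A) / (b - BG_B) for a, b in zip(A_raw, B_raw)]
sem_stat = stdev(R_each) / len(R_each) ** 0.5

# Shift each background by one sigma; because the shift is common to every
# sample, it is applied to the whole set at once (the 100% correlated case).
d_bga = mean_ratio(BG_A + sig_BG_A, BG_B) - R0
d_bgb = mean_ratio(BG_A, BG_B + sig_BG_B) - R0

sem_total = (sem_stat ** 2 + d_bga ** 2 + d_bgb ** 2) ** 0.5
print(R0, sem_stat, d_bga, d_bgb, sem_total)

Each background then enters the combined uncertainty exactly once, which avoids the triple counting worried about in post #6.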
 
  • #9
Roo2 said:
I know that the formula for propagating the error of a ratio is Δz = aΔb + bΔa.

If z = a/b, how can the units work out in that formula?

If Δz is the dimensionless std_dev(a/b) / (a/b), then you could estimate it as the dimensionless std_dev(a)/a + std_dev(b)/b.
 
  • #10
Stephen Tashi said:
If z = a/b, how can the units work out in that formula?

If Δz is the dimensionless std_dev(a/b) / (a/b), then you could estimate it as the dimensionless std_dev(a)/a + std_dev(b)/b.

BvU said:
Your Δz = aΔb + bΔa is valid for uncorrelated errors in a and b only.

Thanks Stephen -- I overlooked this completely because I was concentrating on covariance!

So a small tutorial (saves any memorizing once you get it), simple calculus -- read Kirchner for a better version

if ##z=f(a,b)## then $$ dz = {\partial z \over \partial a} da + {\partial z \over \partial b} db $$ and $$
dz^2 = \left (\partial z \over \partial a\right )^2 da^2 + \left (\partial z \over \partial b\right )^2 db^2 +
2\left (\partial z \over \partial a\right )\left (\partial z \over \partial b\right )da\; db
$$ and analogously (Kirchner page 5) $$
Var(z) = \left (\partial z \over \partial a\right )^2 Var(a) + \left (\partial z \over \partial b\right )^2 Var(b) +
2\left (\partial z \over \partial a\right )\left (\partial z \over \partial b\right ) Cov(a,b)
$$ where ##\ Var(x) = \sigma_x^2 \ ## and ## \ Cov(a,b) \equiv \sigma_{ab}^2 = r_{ab}\sigma_a \sigma_b ##.

If the errors in a and b are uncorrelated ##\ r_{ab}=0 \ ## and for e.g. z=a/b we get $$
\sigma_z^2 = \left (1\over b \right )^2 \sigma_a^2 + \left (a\over b^2 \right )^2 \sigma_b^2 \quad \Rightarrow \\
\left (\sigma_z\over z\right)^2 = \left (\sigma_a\over a\right)^2 +\left (\sigma_b\over b\right)^2 $$
and all worries about dimensions vanish.

@stephen: we sum variances, so the ##\sigma##'s are added in quadrature.
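
As a quick sanity check of that result (a sketch added for illustration, not from the thread), a Monte Carlo draw with uncorrelated Gaussian errors reproduces the relative-errors-in-quadrature formula:

Code (Python):
import random

# Monte Carlo check of (sigma_z/z)^2 ~ (sigma_a/a)^2 + (sigma_b/b)^2
# for z = a/b with uncorrelated Gaussian a and b (illustrative numbers).
random.seed(0)
a0, sig_a = 1000.0, 10.0
b0, sig_b = 50.0, 1.0
n = 200_000

z = [random.gauss(a0, sig_a) / random.gauss(b0, sig_b) for _ in range(n)]
m = sum(z) / n
sd = (sum((x - m) ** 2 for x in z) / (n - 1)) ** 0.5

rel_mc = sd / m
rel_formula = ((sig_a / a0) ** 2 + (sig_b / b0) ** 2) ** 0.5
print(rel_mc, rel_formula)   # both come out close to 0.022

Making the two draws correlated instead would shift the Monte Carlo value away from the simple formula; that is the covariance term above at work.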
 

Related to Propagating Error in both numerator and denominator of ratio

What is propagating error in a ratio?

Propagating error in a ratio is a method used to calculate the uncertainty or error in a calculated ratio when the individual values that make up the ratio have their own uncertainties or errors.

Why is propagating error important?

Propagating error is important because it allows scientists to accurately determine the uncertainty or error in a calculated ratio, which is crucial in determining the reliability and significance of their results.

How is propagating error calculated?

Propagating error is calculated using the formula:

error in ratio = ratio * √[(error in numerator / numerator)² + (error in denominator / denominator)²]

This formula takes into account the uncertainties or errors in both the numerator and denominator of the ratio.

What are some common sources of error in a ratio?

Common sources of error in a ratio include measurement errors, rounding errors, and uncertainties in the values used to calculate the ratio. It is important to identify and minimize these sources of error for more accurate results.

Can propagating error be used for any type of ratio?

Yes, propagating error can be used for any type of ratio, whether it is a simple ratio of two values or a more complex ratio involving multiple values. As long as the individual values have their own uncertainties or errors, propagating error can be applied to calculate the uncertainty in the ratio.
