- #1
andrewr
Hi all,
I've been having a discussion about doing calculations on data which is supposedly Gaussian.
And (of course) there is a problem: once operations are performed on the measurements -- such as taking the ratio of one kind of measurement to another -- the result is often no longer Gaussian. In particular, I'd like to explore in this thread the problem of ratios of Gaussians.
Stephen Tashi made some excellent comments, and provided some links that I think describe the pathological nature of this distribution well -- but also see, especially, the paper by Marsaglia.
Background information
What I am going to present in this thread is an analysis of the properties of ratios of Gaussians (which, in the limiting case, ends in the very pathological Cauchy). I wish to study the mean (hopefully exact) and a quasi standard deviation (quasi because many of these distributions won't have a finite one...).
Based on symmetry arguments, I would say even the Cauchy distribution has a mean in a restricted sense -- taken as a symmetric (principal-value) limit, it is a definite number, 0 -- even though the defining integral does not converge absolutely. The Cauchy, and I think many other such distributions with μ very close to zero, *do* have means in this sense.
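To illustrate the distinction numerically (a minimal sketch of my own, assuming NumPy is available; the sample size and seed are arbitrary choices): for a standard Cauchy the sample *median* settles down as n grows, while running sample *means* keep wandering -- consistent with reading the mean only as a symmetric principal value:

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.standard_cauchy(100_000)

# The median of a standard Cauchy is well defined (it is 0),
# so the sample median converges as n grows.
sample_median = np.median(samples)

# The mean, by contrast, does not exist as an absolutely convergent
# integral; running means over larger and larger prefixes keep
# jumping around instead of settling toward a limit.
running_means = np.cumsum(samples) / np.arange(1, len(samples) + 1)

print("sample median:", sample_median)
print("running means at n = 1e3, 1e4, 1e5:",
      running_means[999], running_means[9_999], running_means[-1])
```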
For many ratios of Gaussians -- especially those where the numerator and denominator each have μ >> σ -- numerical experiments (i.e., sampling approximations) give me repeatable results for both the sample mean and the sample deviation. Occasionally I will get a catastrophic failure... and this happens much more often as the mean of the denominator approaches zero.
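As a concrete example of such a sampling experiment (a sketch of my own; the particular a, b, sample size, and seed are arbitrary choices, and SciPy's Dawson function is used only as a convenient way to evaluate the closed-form mean derived later in this post): with a denominator mean well away from zero, the sample mean of the ratio is stable and agrees with the principal-value mean:

```python
import numpy as np
from scipy.special import dawsn

def ratio_mean_theory(a, b):
    # Principal-value mean of N(a,1)/N(b,1); this is the erfi form
    # quoted in the post, rewritten via the Dawson function
    # F(x) = exp(-x^2) * integral_0^x exp(t^2) dt.
    return a * np.sqrt(2.0) * dawsn(b / np.sqrt(2.0))

rng = np.random.default_rng(1)
a, b, n = 1.0, 5.0, 200_000

num = rng.normal(a, 1.0, n)  # numerator ~ N(a, 1)
den = rng.normal(b, 1.0, n)  # denominator ~ N(b, 1)
sample_mean = np.mean(num / den)

# With b >> sigma the denominator almost never comes near zero,
# so the experiment is repeatable; as b shrinks toward 0, rare
# near-zero denominators produce the "catastrophic" failures.
print("sample mean:", sample_mean)
print("theory:     ", ratio_mean_theory(a, b))
```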
I'd like to derive formulas for the mean, for the probable *sample* deviation, and also for confidence intervals that a sample will avoid a "catastrophic" sub-sampling;
I'll have to explain some of this later on in the thread.
For now, I'd like to verify my derivation for the mean of a ratio of Gaussians. Attached to the bottom of this post is a graph showing (as the red line) what my derivation produced as a final result. Also on the graph are 6 locations where I ran numerical experiments and generally obtained results in agreement with the derivation. I don't think there is a question of whether the result is correct -- I'm confident it is; there is just more to the problem...
Notice, the graph has a numerator of N(1,0), i.e. the constant 1; but see the derivation itself to understand why this case is sufficient for calculating μ of N(a,1) / N(b,1): by independence, E[X/Y] = E[X]·E[1/Y], so the general mean is just a times the constant-numerator result.
The formula I came up with is (drumroll please!):
[tex]
\mu = a \sqrt { 2 } \int \limits _{ 0 } ^{ b \over \sqrt { 2 } } e ^{ t ^{ 2 } - { b ^{ 2 } \over 2 } } \, dt
[/tex]
Or, alternatively,
[tex]
\mu = a \sqrt { \pi \over 2 } e ^{ - { b ^{ 2 } \over 2 } } \times erfi \left ( { b \over \sqrt { 2 } } \right )[/tex]
where
[tex] erfi(x) = \sqrt { 4 \over \pi } \int \limits _{ 0 } ^{ x } e ^{ t ^{ 2 } } \, dt
[/tex]
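The two forms are easy to check against each other numerically (a sketch of my own, assuming SciPy is available): evaluate the definite integral by quadrature, the erfi expression directly, and the same quantity via the Dawson function F(x) = e^(-x²) ∫₀ˣ e^(t²) dt, which gives μ = a√2 · F(b/√2). All three should agree:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import erfi, dawsn

def mu_integral(a, b):
    # First form: mu = a*sqrt(2) * integral_0^{b/sqrt(2)} exp(t^2 - b^2/2) dt
    val, _ = quad(lambda t: np.exp(t**2 - b**2 / 2.0), 0.0, b / np.sqrt(2.0))
    return a * np.sqrt(2.0) * val

def mu_erfi(a, b):
    # Second form: mu = a*sqrt(pi/2) * exp(-b^2/2) * erfi(b/sqrt(2))
    return a * np.sqrt(np.pi / 2.0) * np.exp(-b**2 / 2.0) * erfi(b / np.sqrt(2.0))

def mu_dawson(a, b):
    # Same quantity via the Dawson function F(x) = exp(-x^2)*int_0^x exp(t^2) dt
    return a * np.sqrt(2.0) * dawsn(b / np.sqrt(2.0))

for b in (0.5, 1.0, 2.0, 4.0):
    print(b, mu_integral(1.0, b), mu_erfi(1.0, b), mu_dawson(1.0, b))
```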
I will give the derivation in the next post, which needs some cleaning up. I'd appreciate some pointers on how to improve the derivation's quality -- as that will undoubtedly help me work out (clearly) the issues about higher moments...
Thank you for your interest.