Poisson dist. with small numbers

In summary, the Poisson distribution describes discrete event counts (0, 1, 2, ...), and its standard deviation is sqrt(lambda), where lambda is the mean and need not be an integer. For a mean of 0.00107 that gives a standard deviation of about 0.03, which matches 43194 samples made up of roughly 46 ones and the rest zeros. If the data are instead continuous "time until event" measurements, the Poisson distribution is not the correct distribution to use, and the exponential distribution, whose mean and standard deviation are equal, is the natural candidate.
  • #1
penguindecay
Dear Physicists,
I have a Poisson dist. with a mean of 0.00107. I tried the usual SQRT(mean) for the standard deviation, but of course I got a sigma larger than my actual plot. Can someone point me to some text or the right theory?

Cheers

L
 
  • #2
The sqrt(N) does not work with uncertainty distributions that are not Gaussian, which means N less than say 100. The Poisson distribution is the correct distribution to use in your case, but it is limited to integer values: 0, 1, 2, etc. So in your case, about 99.9% of the time you get zero, and 0.107% of the time you get 1.

Bob S
 
  • #3
How many data samples do you have? That's a really low mean, only 1/1000 expected hits per sample. The Poisson distribution for that will be very close to just 99.9% for zero events, 0.1% for one event and close to zero probability for >1 events. Is that roughly what your data looks like (Yes/No)? If not, then give some more information about your data, as it's probably not Poisson(0.001).
 
  • #4
Bob S said:
The sqrt(N) does not work with uncertainty distributions that are not Gaussian, which means N less than say 100. The Poisson distribution is the correct distribution to use in your case, but it is limited to integer values: 0, 1, 2, etc. So in your case, about 99.9% of the time you get zero, and 0.107% of the time you get 1.

Bob S

Hi Bob, the Poisson distribution is discrete, but the parameter ([itex]\lambda[/itex]) is not limited to integer values.

Also, the standard deviation of the Poisson distribution is [itex]\sqrt{\lambda}[/itex].

In this case [itex]\lambda \simeq 0.001[/itex] so [itex]\sigma = \sqrt{0.001} \simeq 0.0316[/itex].

Even if you approximate the distribution with just two points, P(0)=0.999 and P(1)=0.001, you get a very good approximation to match both mean (lambda) and stdev (sigma). In this case you still get,
lambda = 0.001
sigma = sqrt( 0.999*(0-.001)^2 + 0.001*(1-.001)^2) = sqrt(0.000999) = 0.0316
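
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the two-point model (only 0 and 1 ever occur) is the approximation described in this post, not the full Poisson distribution.

```python
import math

lam = 0.001  # Poisson parameter used in the post above

# Exact Poisson standard deviation: sqrt(lambda)
sigma_exact = math.sqrt(lam)

# Two-point approximation: P(0) = 1 - lam, P(1) = lam
p0, p1 = 1.0 - lam, lam
mean_2pt = 0 * p0 + 1 * p1
var_2pt = p0 * (0 - mean_2pt) ** 2 + p1 * (1 - mean_2pt) ** 2
sigma_2pt = math.sqrt(var_2pt)

print(f"exact sigma     = {sigma_exact:.4f}")
print(f"two-point sigma = {sigma_2pt:.4f}")
```

Both values agree to about four decimal places, which is why dropping the probability of two or more events makes so little difference at this lambda.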

I suspect the OP might not have enough data points; he'll need many thousands of points before he's likely to get a good statistical representation of the system.
 
  • #5
Hi, just to clarify, I have a Poisson dist. that has its peak at about 0.0001 and a mean value of 0.00107; there are 43194 entries. Sorry I got some numbers mixed up at the start.

Basically I've been counting decay times of the same events. Thanks, I tried the usual root(mean) but the dist. only goes from 0 to 0.02, and root(0.001) gets me 0.03 as a sigma. Am I going about it the wrong way?
 
  • #6
penguindecay said:
the dist. only goes from 0 to 0.02
I don't know what you mean by this. The Poisson distribution is a discrete distribution, so it will go to some integer number, not to a fraction of an integer. With such a low probability I would assume that it would only go from 0 to 1, with a 2 showing up only if you took hundreds of thousands of samples, or maybe even more.

In any case, those numbers seem correct to me. If you have 43194 samples with a mean of .00107 that probably means that you have about 46 1's and the rest 0's. The standard deviation of that is indeed about 0.03.
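
As a rough check of those figures, here is a small Python sketch. It assumes, as this post does, that every non-zero sample is exactly 1; the variable names are illustrative only.

```python
import math

n = 43194        # number of samples quoted in the thread
mean = 0.00107   # sample mean quoted in the thread

# If the data contain only 0's and 1's, the number of 1's is n * mean
ones = round(n * mean)

# Standard deviation of 0/1 data with a fraction p of 1's
p = ones / n
sd = math.sqrt(p * (1 - p))

print(f"number of 1's = {ones}")
print(f"std deviation = {sd:.4f}")
```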
 
  • #7
penguindecay said:
Hi, just to clarify, I have a Poisson dist. that has its peak at about 0.0001 and a mean value of 0.00107; there are 43194 entries. Sorry I got some numbers mixed up at the start.

Basically I've been counting decay times of the same events. Thanks, I tried the usual root(mean) but the dist. only goes from 0 to 0.02, and root(0.001) gets me 0.03 as a sigma. Am I going about it the wrong way?

Ok so you're recording the "time until event", that's a continuous distribution so it's definitely not Poisson.

If you record the number of events in a given time interval (possible values are 0, 1, 2, 3 etc. events) then you get a Poisson distribution. In your case (time until event) you should be looking at the exponential distribution (which is of course a continuous distribution).

BTW, for an exponential distribution the mean and the standard deviation (sqrt of the variance) are equal. So it does seem compatible with your data.

See http://en.wikipedia.org/wiki/Exponential_distribution
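
The "mean equals standard deviation" property can be illustrated with a quick simulation. This is a sketch on synthetic data, not the OP's measurements: it assumes decay times drawn from an exponential distribution with the mean and sample count quoted in the thread, and uses only the Python standard library.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable
u = 0.00107     # mean decay time quoted in the thread
n = 43194       # number of entries quoted in the thread

# random.expovariate takes the rate, i.e. 1 / mean
samples = [random.expovariate(1.0 / u) for _ in range(n)]

mean = statistics.fmean(samples)
sd = statistics.pstdev(samples)
print(f"mean = {mean:.5f}, sd = {sd:.5f}, sd/mean = {sd / mean:.3f}")
```

Running the same two statistics on real data and finding a ratio near 1 would be consistent with, though not proof of, an exponential distribution.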
 
  • #8
Hi, thanks for all the information you have given me. I think I should explain my graph and data in detail. I have a plot running the decay time from 0 to 0.02 in intervals of 0.001. I'm binning into these intervals the events that fit into the bin width (of 0.001). I get (I assume) a Poisson dist. I assumed it's Poisson because I'm doing a counting experiment?

It is not an exp. dist., as the event in question has many interactions before losing energy. In short, the decay time I record is from the incident event to the last event that has an energy above a threshold. Thanks
 
  • #9
penguindecay said:
Hi, thanks for all the information you have given me. I think I should explain my graph and data in detail. I have a plot running the decay time from 0 to 0.02 in intervals of 0.001. I'm binning into these intervals the events that fit into the bin width (of 0.001). I get (I assume) a Poisson dist. I assumed it's Poisson because I'm doing a counting experiment?

It is not an exp. dist., as the event in question has many interactions before losing energy. In short, the decay time I record is from the incident event to the last event that has an energy above a threshold. Thanks

Are you sure it's not exponential? You said it has a peak in the first bin and decays from there; that's consistent with exponential.

What about the std-dev, have you calculated it from the data? Is it close to the mean? If so, that's also consistent with exponential.

Finally what does it look like, does it look exponential?
 
  • #10
Here's another interesting fact. If the distribution were exponential, then the expected number of samples greater than some time "t" would equal n exp(-t/u). (The expression n/u exp(-t/u) is the expected density of samples per unit time at "t", not the count above "t".)

Putting in your numbers, n=43194 and u=0.00107, the expected number of samples greater than "t" drops below one at t = u ln(n), which is about 0.0114, so 0.012 is the lowest "t bin" edge beyond which you'd expect less than one sample. In other words, it gives a likely approximate figure for the largest "t" value you would expect to see in that experiment, given the number of samples you took. Now what did you say before, your plot runs from 0 to 0.02? That range would comfortably contain exponential data with this mean, yes?
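
One hedged way to check this kind of estimate, assuming the data really are exponential with mean u: the expected number of samples above t is the sample count times the survival function, n exp(-t/u). The 1/u prefactor belongs to the density (samples per unit time), not to the count above t. The function and variable names below are illustrative.

```python
import math

n = 43194          # number of samples quoted in the thread
u = 0.00107        # mean decay time quoted in the thread
bin_width = 0.001  # histogram bin width quoted in the thread

def expected_above(t):
    """Expected number of exponential samples larger than t."""
    return n * math.exp(-t / u)

# Time at which the expected count above t equals exactly one
t_star = u * math.log(n)

# First bin edge at or beyond t_star
first_bin = math.ceil(t_star / bin_width) * bin_width

print(f"t_star    = {t_star:.4f}")
print(f"first bin = {first_bin:.3f}")
print(f"expected samples above first bin = {expected_above(first_bin):.2f}")
```

Under these assumptions the largest observed decay time should typically land near 0.011 to 0.012, well inside a plot range of 0 to 0.02.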
 
  • #11
Thanks, I'll ask my teacher about this. But the thing looks like a Poisson! Definitely not a decay slope like an exp. It's definitely a hill shape with equal areas on each side of the peak. Thanks again everyone!
 
  • #12
penguindecay said:
But the thing looks like a Poisson! Definitely not a decay slope like an exp. It's definitely a hill shape with equal areas on each side of the peak.

Ok but previously you said :
penguindecay said:
that has its peak at about 0.0001 and a mean value of 0.00107; there are 43194 entries.

That puts the peak in the very first bin! So could you please clarify which of those two statements is the correct one. I can't see how they can both be true.
 
  • #13
There should be no confusion about Poisson vs. exponential. If it is a continuous distribution then it cannot be Poisson; if it is a discrete distribution then it cannot be exponential.

You mentioned that you have 43194 data points. Are they integers (e.g. 0 and 1) or are they real numbers (e.g. 0.00015)?
 

FAQ: Poisson dist. with small numbers

What is the Poisson distribution with small numbers?

The Poisson distribution is a probability distribution that is used to model the occurrence of rare events. It is characterized by a single parameter, lambda, which represents the average number of events that occur in a given time interval or space. When the number of events is small, the Poisson distribution can be used to estimate the probability of observing a specific number of events.

How is the Poisson distribution different from other probability distributions?

The Poisson distribution differs from distributions such as the normal distribution in that it is discrete rather than continuous. This means that the values it can take on are non-negative whole numbers (0, 1, 2, ...), rather than a continuous range of values. Its mean and variance are both equal to lambda, so a single parameter fixes both its location and its spread. It is most often used to model counts of rare events.

What is the formula for calculating probabilities with the Poisson distribution?

The formula for calculating probabilities with the Poisson distribution is P(x) = (e^(-lambda) * lambda^x) / x!, where x is the number of events and lambda is the average number of events. This formula is used to calculate the probability of observing a specific number of events in a given time interval or space.
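
As a minimal sketch of that formula, the following Python function evaluates P(x) directly. The function name `poisson_pmf` is made up for illustration, and lambda = 0.00107 is the mean quoted in the thread above.

```python
import math

def poisson_pmf(x, lam):
    """Probability of observing exactly x events when the mean is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 0.00107  # mean quoted in the thread
for x in range(3):
    print(f"P({x}) = {poisson_pmf(x, lam):.3e}")
```

For a lambda this small, nearly all of the probability sits at x = 0, with P(1) close to lambda itself and P(2) smaller by roughly another factor of lambda/2.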

Can the Poisson distribution be used for large numbers?

The Poisson distribution is not restricted to small numbers. As lambda increases, the Poisson distribution approaches a normal distribution with mean lambda and standard deviation sqrt(lambda), so for large lambda the normal distribution is often used as a convenient approximation.

What is the importance of the Poisson distribution in scientific research?

The Poisson distribution is an important tool in scientific research because it allows researchers to estimate the probability of rare events. This can be useful in a variety of fields, such as epidemiology, genetics, and finance. It also allows researchers to model and analyze data that may not fit a normal distribution, making it a valuable tool in many scientific studies.
