Signal significances for low expected events

ChrisVer · Dec 24, 2016

Hi, I am a little confused about how to treat the following problem: suppose you expect 1.25 background (SM) events from one measurement. During one particular measurement, you observe 5 events. What's the probability that you discovered new physics?
The standard/basic way to go about it is to calculate the significance of your hypothesized new signal, that is [itex] s = \frac{S}{\sqrt{B}}= \frac{5-1.25}{\sqrt{1.25}}=3.35[/itex], which tells us that the observed count lies 3.35 sigma (standard deviations) above the expected background. In probabilities this translates to 99.958% signal exists/ 0.042% it's a statistical fluctuation, right?

My concern is that the probabilities in those cases are taken from the standard normal distribution. For such a small expected number of events, shouldn't the random variable (the observed events) be drawn from a Poisson distribution? Explicitly, wouldn't I have to calculate the probability [itex]P( x \ge 5 | \lambda =1.25 )\approx 0.9\%[/itex] (of it being a statistical fluctuation) and take the complement for a signal? This would be relatively bad for the significance, as the standard deviation of a Poisson-distributed variable is not equal to the square root of the number of events (for low expected events)...

Thanks.
Maybe this deserves to be moved to the statistics forum, but please do whatever you see fit.

Orodruin · Dec 24, 2016

ChrisVer said:

What's the probability that you discovered new physics?

Ooops! Be careful here! Are you using frequentist or Bayesian statistics? In Bayesian statistics it depends on your prior. Frequentist statistics does not answer this question. It answers the question "what is the probability this (or something more extreme) would happen by chance?" which is not the same question.

And yes, for a low number of events you need to use the Poisson distribution.
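
As a rough check of the numbers quoted in this thread, the sketch below compares the one-sided Gaussian tail for S/sqrt(B) with the exact Poisson tail P(x >= 5 | lambda = 1.25). The function and variable names are illustrative and not taken from any analysis framework; only the Python standard library is assumed.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_tail(n_obs: int, lam: float) -> float:
    """P(X >= n_obs) for X ~ Poisson(lam)."""
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(n_obs))

def gaussian_tail(z: float) -> float:
    """One-sided upper tail P(Z >= z) of the standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

background = 1.25  # expected background events (from the thread)
observed = 5       # observed events (from the thread)

# Naive significance S/sqrt(B) and its Gaussian tail probability
z_naive = (observed - background) / math.sqrt(background)
p_gauss = gaussian_tail(z_naive)

# Exact background-only tail probability from the Poisson distribution
p_poisson = poisson_tail(observed, background)

print(f"S/sqrt(B)         = {z_naive:.2f}")
print(f"Gaussian tail     = {p_gauss:.2e}")
print(f"Poisson P(x >= 5) = {p_poisson:.2e}")
```

The Poisson tail comes out at about 0.9%, far larger than the Gaussian tail of about 0.04%, so the normal approximation badly overstates how unlikely the fluctuation is at this event count.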

ChrisVer · Dec 24, 2016

Orodruin said:

In Bayesian statistics it depends on your prior.

I don't think I can use a prior for this? Except for saying that I expect (given the model) [itex]B+ \mu S[/itex], where [itex]\mu[/itex] is drawn from a uniform prior, and calculating things like the 95% quantile for [itex]\mu[/itex] to claim a discovery or not (via credibility levels)... seeing e.g. how the likelihood works:
[itex]L( x_{obs} | B + \mu S ) = \mathrm{Poi}(x_{obs} | B+\mu S) \, \mathrm{Uni}(\mu) [/itex]
Again, I don't think it's easy to determine the ranges for [itex]\mu[/itex] or its quantiles by simple calculations (I would have to feed those into a statistical program).
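
For what it's worth, the flat-prior case can be handled with a few lines of code. The sketch below is only an illustration under stated assumptions: since S is not specified in the thread, it works with the total signal yield s = mu*S (a flat prior in mu is flat in s for fixed S), restricts s >= 0, and uses the fact that the posterior CDF then reduces to a ratio of Poisson CDFs. All function names are hypothetical.

```python
import math

def poisson_cdf(n: int, lam: float) -> float:
    """P(X <= n) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(n + 1))

def posterior_cdf(s: float, n_obs: int, b: float) -> float:
    """P(signal <= s | n_obs) for a flat prior on the signal yield s >= 0.

    The posterior density is proportional to Poi(n_obs | b + s), and its
    integral can be written as a ratio of Poisson CDFs.
    """
    return 1.0 - poisson_cdf(n_obs, b + s) / poisson_cdf(n_obs, b)

def posterior_quantile(q: float, n_obs: int, b: float,
                       hi: float = 100.0, tol: float = 1e-10) -> float:
    """Solve posterior_cdf(s) = q by bisection on [0, hi]."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if posterior_cdf(mid, n_obs, b) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

background = 1.25  # expected background events (from the thread)
observed = 5       # observed events (from the thread)

s05 = posterior_quantile(0.05, observed, background)
s95 = posterior_quantile(0.95, observed, background)
print(f"5% quantile of signal yield:  {s05:.2f}")
print(f"95% quantile of signal yield: {s95:.2f}")
```

The quantiles depend on the choice of prior, which is exactly the point made above: a different prior on the signal would give different numbers.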

Orodruin said:

Frequentist statistics does not answer this question. It answers the question "what is the probability this (or something more extreme) would happen by chance?" which is not the same question.

I think this is easier to do by calculation? I.e. it leads to the Poisson probability I calculated (and partially to how I interpreted it, as a statistical fluctuation/happening by chance). Maybe I was wrong to move one step further and say that the complementary (99.1%) is the probability of having seen a signal.

Orodruin said:

And yes, for a low number of events you need to use the Poisson distribution.

Yup, so the significance (as I defined it) wouldn't be a good indicator in such cases?

Orodruin · Dec 24, 2016

I have these two comics posted on my office door:
Xkcd:

SMBC: http://www.smbc-comics.com/index.php?id=4127

mfb · Dec 24, 2016

ChrisVer said:

Maybe I was wrong to move one step further and say that the complementary (99.1%) is the probability of having seen a signal.

Correct. Stay away from such an interpretation: there is no meaningful way to define "the probability that you saw a signal". "The probability of a background fluctuation that large" is the only meaningful number.
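
If that background-fluctuation probability is to be quoted in "sigma", one common convention is to convert the p-value into the equivalent one-sided Gaussian quantile. The sketch below does this for the numbers in this thread; it is a minimal illustration using only the Python standard library, and the helper name is made up for the example.

```python
import math
from statistics import NormalDist

def poisson_tail(n_obs: int, lam: float) -> float:
    """P(X >= n_obs) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(n_obs))
    return 1.0 - below

# Probability that background alone (1.25 expected) gives 5 or more events
p_value = poisson_tail(5, 1.25)

# Equivalent one-sided Gaussian significance: the z with P(Z >= z) = p_value
z_equiv = NormalDist().inv_cdf(1.0 - p_value)

print(f"p-value = {p_value:.4f}")
print(f"equivalent one-sided significance = {z_equiv:.2f} sigma")
```

The resulting value is well below the 3.35 obtained from S/sqrt(B), and it is still only a statement about background fluctuations, not a probability that a signal exists.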
