BDT scores: Signal vs Background

  • Thread starter ChrisVer
In summary, the conversation discusses the use of BDTs to classify outcomes as "Signal" or "Background" and the interpretation of the BDT variable as a measure of how likely an outcome is to be signal. The possibility of using multiple categories with increasing S/B ratio is also mentioned as a way to improve sensitivity in certain measurements.
  • #1
ChrisVer
I have one question:
Often, in order to test whether an outcome should be considered as "Signal" (S) or "Background" (B), we use BDTs that we have trained on known outcomes...
I was wondering, though: the outcome of being S/B is binary, either it's signal or it's not. The only way I can then interpret the BDT variable is as "how likely it is to be signal or not", since a particular cut on the BDT corresponds to a particular background rejection and signal efficiency (ROC curves).
Is my interpretation correct? Would a "signal" object with BDTscore=0.89 be less likely to be signal than one with BDTscore=0.95? If not, is there a way to compare the two events? In other words, does placing a higher cut on the BDT select stronger/tighter signal events?
 
  • #2
ChrisVer said:
the outcome of being S/B is binary
It doesn't have to be. You can make multiple categories with increasing S/B ratio (increasing BDT score) and study them separately. LHCb does this for ##B_s \to \mu \mu## measurements, ATLAS and CMS do this for ##H \to \gamma \gamma## measurements for example. It improves your sensitivity compared to a single cut.
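As a rough illustration of why several categories can beat a single cut, here is a minimal sketch. The yields are hypothetical numbers invented for the example, and s/sqrt(b) is used as a simple stand-in for the significance (an assumption that only holds for large b and s much smaller than b); real analyses use a full likelihood fit.

```python
import math

# Hypothetical expected yields per BDT-score category (illustrative only),
# ordered from the lowest-score category to the highest-score category.
signal = [5.0, 10.0, 15.0, 20.0]
background = [2000.0, 500.0, 150.0, 40.0]


def simple_significance(s, b):
    """Approximate significance s / sqrt(b); assumes large b and s << b."""
    return s / math.sqrt(b)


def best_single_cut(signal, background):
    """Keep every category from index i upward, for the best choice of i."""
    best = 0.0
    for i in range(len(signal)):
        s = sum(signal[i:])
        b = sum(background[i:])
        best = max(best, simple_significance(s, b))
    return best


def combined_categories(signal, background):
    """Treat categories as independent measurements and add in quadrature."""
    total = sum(simple_significance(s, b) ** 2
                for s, b in zip(signal, background))
    return math.sqrt(total)


single = best_single_cut(signal, background)
combined = combined_categories(signal, background)
print(f"best single cut:     {single:.2f}")
print(f"combined categories: {combined:.2f}")
```

In this approximation the combination can never do worse than any single cut, because the low-purity categories still contribute a little information instead of being thrown away or diluting the pure ones.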
 
  • #3
mfb said:
It doesn't have to be. You can make multiple categories with increasing S/B ratio (increasing BDT score) and study them separately. LHCb does this for ##B_s \to \mu \mu## measurements, ATLAS and CMS do this for ##H \to \gamma \gamma## measurements for example. It improves your sensitivity compared to a single cut.
Sorry, I didn't mean the ratio between Signal and Background, but being Signal or being Background (I wrote S/B as shorthand for "S or B").
 
  • #4
I know, but what the BDT gives you is a ranking in terms of S/B ratio.
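The ranking statement can be checked on a toy sample. The sketch below is only an illustration: the scores are drawn from two Beta distributions chosen by assumption (they are not the output of a real BDT), and the S/B ratio is then computed in bins of the score.

```python
import random

random.seed(1)

# Toy scores in [0, 1]. The Beta shapes are an assumption made for
# illustration: signal peaks at high score, background at low score.
n_events = 20000
signal_scores = [random.betavariate(4, 2) for _ in range(n_events)]
background_scores = [random.betavariate(2, 4) for _ in range(n_events)]


def histogram(scores, n_bins):
    """Count scores in n_bins equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for x in scores:
        index = min(int(x * n_bins), n_bins - 1)  # put x == 1.0 in last bin
        counts[index] += 1
    return counts


n_bins = 5
s_counts = histogram(signal_scores, n_bins)
b_counts = histogram(background_scores, n_bins)

# S/B per score bin; guard against an empty background bin.
ratios = [s / b if b else float("inf") for s, b in zip(s_counts, b_counts)]

for i, ratio in enumerate(ratios):
    low, high = i / n_bins, (i + 1) / n_bins
    print(f"score in [{low:.1f}, {high:.1f}): S/B = {ratio:.3f}")
```

With these toy shapes the S/B ratio rises from bin to bin, so an event at 0.95 sits in a purer region than one at 0.89. The score orders events by purity, without by itself telling you the probability that a given event is signal.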
 

FAQ: BDT scores: Signal vs Background

What is a BDT score?

A BDT score is the numerical output of a Boosted Decision Tree (BDT), a machine learning algorithm trained on a dataset containing labeled signal and background events. Higher scores indicate more signal-like events. The score ranks events by how signal-like they are, but it is not automatically a calibrated probability of being signal.

How is a BDT score used to distinguish between signal and background events?

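A cut-off (threshold) is typically placed on the BDT score: events with a score above the cut-off are classified as signal and events with a score below it are classified as background. Each choice of cut-off corresponds to a particular signal efficiency and background rejection. Alternatively, events can be split into several score categories with increasing S/B ratio and studied separately, which can improve sensitivity compared to a single cut.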
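A minimal sketch of how a cut translates into signal efficiency and background rejection is given below. The score lists are made-up values for illustration, not the output of a trained BDT.

```python
# Hypothetical BDT scores for labeled test events (illustrative values only).
signal_scores = [0.95, 0.89, 0.72, 0.64, 0.40]
background_scores = [0.81, 0.55, 0.35, 0.20, 0.10]


def efficiency(scores, cut):
    """Fraction of events whose score is above the cut."""
    return sum(1 for x in scores if x > cut) / len(scores)


results = {}
for cut in (0.3, 0.6, 0.9):
    sig_eff = efficiency(signal_scores, cut)
    bkg_rej = 1.0 - efficiency(background_scores, cut)
    results[cut] = (sig_eff, bkg_rej)
    print(f"cut > {cut:.1f}: signal efficiency = {sig_eff:.2f}, "
          f"background rejection = {bkg_rej:.2f}")
```

Raising the cut trades signal efficiency for background rejection. Scanning the cut over all values traces out the ROC curve.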

Can BDT scores be used for different types of data?

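Yes, BDTs can be applied to any data that can be expressed as a fixed set of input features per event, most naturally numerical and categorical variables. Image data can be used only after being converted into such features (for example, flattened pixel values or derived quantities). The performance of the BDT algorithm may vary depending on the type of data.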

How are BDT scores evaluated for accuracy?

The separation power of BDT scores is typically evaluated with the Receiver Operating Characteristic (ROC) curve, which plots the true positive rate (signal events correctly classified as signal) against the false positive rate (background events incorrectly classified as signal) as the cut-off value is varied. The area under this curve (AUC) summarizes the performance over all cut-offs. Metrics such as the F1 score, which combines precision and recall, are instead evaluated at a single fixed cut-off.
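Both metrics can be computed by hand on a small sample. The sketch below uses made-up scores and relies on the fact that the AUC equals the probability that a randomly chosen signal event scores higher than a randomly chosen background event. The function names `auc` and `f1_at_cut` are hypothetical helpers written for this example.

```python
# Hypothetical BDT scores for labeled test events (illustrative values only).
signal_scores = [0.95, 0.89, 0.72, 0.64, 0.40]
background_scores = [0.81, 0.55, 0.35, 0.20, 0.10]


def auc(signal, background):
    """Area under the ROC curve via pairwise ranking (ties count as 1/2)."""
    wins = 0.0
    for s in signal:
        for b in background:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5
    return wins / (len(signal) * len(background))


def f1_at_cut(signal, background, cut):
    """F1 score when events with score > cut are classified as signal."""
    tp = sum(1 for s in signal if s > cut)      # signal kept
    fp = sum(1 for b in background if b > cut)  # background kept
    fn = len(signal) - tp                       # signal lost
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


area = auc(signal_scores, background_scores)
f1 = f1_at_cut(signal_scores, background_scores, cut=0.6)
print(f"AUC = {area:.2f}")
print(f"F1 at cut 0.6 = {f1:.2f}")
```

The AUC does not depend on any particular cut, while the F1 score changes with the chosen cut and with the relative number of signal and background events in the sample.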

Are BDT scores reliable?

The reliability of BDT scores depends on the quality of the training data and the performance of the BDT algorithm. It is important to carefully select and preprocess the training data to ensure accurate and unbiased results. Additionally, the performance of the BDT can be evaluated through cross-validation techniques and improved by tuning the algorithm's hyperparameters.
