Classifiers, threshold, and ROC curve

  • Thread starter fog37
In summary, classifiers are algorithms that categorize data into different classes based on features. The threshold is a specific value that determines the cutoff between classes in binary classification, influencing the sensitivity and specificity of the classifier. The Receiver Operating Characteristic (ROC) curve is a graphical representation that illustrates the trade-off between true positive rates and false positive rates across various threshold settings, helping to evaluate the performance of a classifier and select an optimal threshold for decision-making.
  • #1
fog37
Hello,

A classifier is an ML model that can classify between 2 or more classes. Some classifiers are called probabilistic in the sense that they output a probability score that is then compared against a threshold value (usually 0.5) to make the class decision. Other classifiers are not probabilistic... I guess they are called deterministic. We can always plot the ROC curve for a binary classifier. The ROC curve depends on the TPR, the FPR and the threshold values explored. The TPR and FPR vary for different threshold values...

Do all deterministic classifiers make their decision also based on some set threshold? If so, does it mean that we can plot the ROC curve for any classifier, probabilistic or not?

Thank you!
 
  • #2
I'm not aware of any probabilistic classifier. Usually you just compare the predicted value/class with the actual known value/class of the elements in the testing set, all, like you said, given a threshold, so that, e.g., a threshold of 0.6 will give us a particular confusion matrix. Can you give us examples of probabilistic classifiers?
 
  • #3
fog37 said:
Some classifiers are called probabilistic in the sense that they output a probability score that is then compared against a threshold value (usually 0.5) to make the class decision.
No, that is not what a probabilistic classifier does: https://en.wikipedia.org/wiki/Probabilistic_classification

fog37 said:
Do all deterministic classifiers make their decision also based on some set threshold?
No: first of all the term 'deterministic classifier' is not generally recognised, and secondly you should revise your understanding of this material and consider whether your question makes sense given the diversity of classification algorithms.

fog37 said:
If so, does it mean that we can plot the ROC curve for any classifier, probabilistic or not?
Once you have revised this material you should be able to see whether this question is relevant.
 
  • #4
Confusingly, k-NN is sometimes described as a predictor, sometimes as a classifier.
 
  • #5
WWGD said:
Confusingly, k-NN is sometimes described as a predictor, sometimes as a classifier.
Yes, in a field as diverse and dynamic as machine learning, categorisation and making generalisations in the way the OP is trying to do are IMHO a waste of time.
 
  • #6
pbuk said:
Yes, in a field as diverse and dynamic as machine learning, categorisation and making generalisations in the way the OP is trying to do are IMHO a waste of time.
The same goes for SVMs, which are also listed for both classification and regression.
 

FAQ: Classifiers, threshold, and ROC curve

What is a classifier in machine learning?

A classifier is a type of algorithm used in machine learning to categorize data into distinct classes or groups. It takes input data and predicts the class label based on the features of the input. Common examples of classifiers include decision trees, support vector machines, and neural networks.

What is the role of a threshold in classification?

The threshold in classification determines the cutoff point for deciding which class an instance belongs to based on the predicted probabilities. For example, if a classifier predicts a probability of 0.7 for class A and the threshold is set at 0.5, the instance will be classified as class A. Adjusting the threshold can help balance sensitivity and specificity in the classification results.
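
As a rough illustration of the trade-off described above, here is a minimal Python sketch. The labels and scores are made-up values, and the helper names `classify` and `sensitivity_specificity` are hypothetical, chosen only for this example.

```python
def classify(scores, threshold=0.5):
    """Predict 1 (positive) when the score meets the threshold, else 0."""
    return [1 if s >= threshold else 0 for s in scores]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) computed from confusion-matrix counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: true labels and the scores some classifier assigned.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

for threshold in (0.35, 0.5, 0.65):
    sens, spec = sensitivity_specificity(y_true, classify(scores, threshold))
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

On this toy data, lowering the threshold to 0.35 catches every positive at the cost of one false positive, while raising it to 0.65 removes the false positive but misses one positive.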

What is an ROC curve?

An ROC (Receiver Operating Characteristic) curve is a graphical representation that illustrates the performance of a binary classifier as the discrimination threshold is varied. It plots the true positive rate (sensitivity) against the false positive rate (1-specificity) at different threshold settings, allowing for the evaluation of the trade-offs between sensitivity and specificity.
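
The construction can be sketched in a few lines of Python. This is an illustrative sketch rather than a reference implementation: the function name `roc_points` and the data are hypothetical. It only assumes that the classifier produces a real-valued score per instance; the scores do not have to be calibrated probabilities, since only their ordering matters.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the threshold over the scores."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # A threshold above every score predicts nothing as positive: the (0, 0) corner.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical data: true labels and classifier scores.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

for fpr, tpr in roc_points(y_true, scores):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Each distinct score is used once as a threshold, so the curve runs from (0, 0) to (1, 1) as the threshold is lowered.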

How do you interpret an ROC curve?

An ROC curve is commonly summarized by the area under the curve (AUC). An AUC of 0.5 indicates no discrimination (equivalent to random guessing), while an AUC of 1.0 indicates perfect discrimination. The closer the ROC curve is to the top-left corner, the better the classifier's performance. Additionally, the shape of the curve can provide insights into the balance between true positive and false positive rates at various thresholds.
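
One way to make the AUC values concrete is the ranking view: the AUC equals the fraction of (positive, negative) pairs in which the positive instance receives the higher score, with ties counted as half. The sketch below uses that view; the function name `auc_by_ranking` and all data are hypothetical.

```python
def auc_by_ranking(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


y_true = [1, 1, 1, 0, 0, 0]

# Hypothetical score sets for three imagined classifiers.
perfect = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]      # every positive outranks every negative
imperfect = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]    # one positive scored below one negative
uninformative = [0.5] * 6                     # same score for everything

for name, s in [("perfect", perfect), ("imperfect", imperfect),
                ("uninformative", uninformative)]:
    print(f"{name}: AUC={auc_by_ranking(y_true, s):.3f}")
```

On this toy data the three score sets give 1.000, 0.889 and 0.500 respectively, matching the two reference points quoted above.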

How can you use the ROC curve to choose a threshold?

The ROC curve can help in selecting a threshold by identifying a point that provides a desirable balance between true positive rate and false positive rate for the problem at hand. This can be done visually, by selecting the point on the curve closest to the top-left corner, or numerically, by maximizing Youden's J statistic, which is the difference between the true positive rate (sensitivity) and the false positive rate.
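
Both selection rules can be sketched as follows. This is only an illustration under assumed data: the helper names are hypothetical, the labels and scores are made up, and neither rule accounts for unequal costs of false positives and false negatives.

```python
import math


def threshold_candidates(y_true, scores):
    """Return (threshold, FPR, TPR) for each distinct score used as a threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rows = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        rows.append((threshold, fp / n_neg, tp / n_pos))
    return rows


def best_by_youden(rows):
    """Pick the row maximizing Youden's J = TPR - FPR."""
    return max(rows, key=lambda r: r[2] - r[1])


def best_by_corner_distance(rows):
    """Pick the row whose (FPR, TPR) point is closest to the top-left corner (0, 1)."""
    return min(rows, key=lambda r: math.hypot(r[1], 1.0 - r[2]))


# Hypothetical data: 4 positives and 5 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.5, 0.3, 0.2, 0.1]

rows = threshold_candidates(y_true, scores)
print("Youden's J picks threshold", best_by_youden(rows)[0])
print("Corner distance picks threshold", best_by_corner_distance(rows)[0])
```

On this data both rules select 0.7, but they are different criteria and need not agree in general.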
