Statistical significance of a ML model...

  • Thread starter: fog37
  • Tags: Linear regression
AI Thread Summary
To assess the statistical significance of machine learning models such as decision trees, SVMs, and neural networks, traditional tests like t-tests and F-tests used in linear and logistic regression may not apply directly. Instead, the discussion highlights the importance of uncertainty quantification (UQ), a developing subfield focused on evaluating model reliability and significance. A suggested approach involves setting aside a portion of the input data for testing, ensuring that the model is not evaluated on the data it was trained on, which would invalidate the results. For binary classification models, a scoring system can be implemented, where correct predictions are scored as 1 and incorrect as 0, allowing for comparisons against other predictive methods or random guessing to determine significance.
fog37
TL;DR Summary
Determining if a ML model is statistically significant...
Hello,

How do we check if a ML model is statistically significant? For models like linear regression, logistic regression, etc. there are tests (t-tests, F-tests, etc.) that will tell us if the model, trained on some dataset, is statistically significant or not.

But in the case of ML models, like decision trees, SVM, or neural nets, how do we determine if the model is statistically significant? I have not seen any specific test to do that...

Thank you!
 
There is a whole subfield on this called uncertainty quantification (UQ). It is an area of active development.
 
fog37 said:
TL;DR Summary: Determining if a ML model is statistically significant...

But in the case of ML models, like decision trees, SVM, or neural nets, how do we determine if the model is statistically significant? I have not seen any specific test to do that...
The t-test will work with any predictive model. You're supposed to set aside part of the input data, keep it out of training, and use it only for testing later (because predicting the data the model was trained on is cheating). For a yes/no model, you can score 1 for each correct prediction and 0 for each wrong one, and then compare those scores against other ways of predicting the outcomes (or against random guessing).
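The scoring idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a definitive recipe: the held-out scores are simulated (hypothetical, not real results), the helper names `paired_t_statistic` and `binomial_p_value` are made up for this example, and the exact binomial test against a 50% chance rate is one reasonable reading of "compare against random guessing" for a balanced yes/no problem.

```python
import math
import random
from statistics import mean, stdev


def paired_t_statistic(model_scores, baseline_scores):
    """Paired t statistic for the per-example score difference (model - baseline).

    Both lists hold 0/1 scores for the SAME held-out examples, in the same order.
    """
    diffs = [m - b for m, b in zip(model_scores, baseline_scores)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t statistic is undefined")
    return mean(diffs) / (sd / math.sqrt(n))


def binomial_p_value(n_correct, n_total, chance=0.5):
    """One-sided exact p-value: probability of at least n_correct hits
    if every prediction were right with probability `chance`."""
    return sum(
        math.comb(n_total, k) * chance**k * (1 - chance) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )


# Hypothetical held-out set of 200 examples that the model never saw in training.
# The scores are simulated here; in practice they come from real predictions.
rng = random.Random(0)
n_test = 200
model_scores = [1 if rng.random() < 0.80 else 0 for _ in range(n_test)]  # assumed ~80% accurate
baseline_scores = [rng.randint(0, 1) for _ in range(n_test)]             # coin-flip guesser

t_stat = paired_t_statistic(model_scores, baseline_scores)
p_chance = binomial_p_value(sum(model_scores), n_test)

print(f"model accuracy on held-out data: {mean(model_scores):.3f}")
print(f"paired t statistic vs. baseline: {t_stat:.2f} (df = {n_test - 1})")
print(f"p-value vs. random guessing:     {p_chance:.3g}")
```

The t statistic is compared against a Student t distribution with n - 1 degrees of freedom; with a couple of hundred test examples, an absolute value above roughly 2 corresponds to about the 5% two-sided level. The test is only meaningful if the scored examples were truly held out of training.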
 