FallenApple
From what I understand, machine learning is incredibly good at making predictions from data in a very automated/algorithmic way.
But any inference that deals with causality is primarily a subject-matter concern, which relies mostly on judgment calls and intuition.
So basically, would a machine learning algorithm need human-level intelligence and intuition to be able to do proper causal analysis?
Here's an example where there might be issues.
Say an ML algorithm finds that low socioeconomic status (SES) is associated with diabetes, with a significant p-value. We know that diabetes is a biological phenomenon, so any possible causal connection (and this is a big if) between a non-biological variable such as low SES and diabetes must logically have intermediate steps in the causal chain. It is these unknown intermediate steps that should probably be investigated in follow-up studies. We know logically (or intuit from prior knowledge plus domain knowledge) that low SES could lead to higher stress or an unhealthy diet, which are biological factors. So a significant p-value for SES suggests that we should collect data on those missing variables and then redo the analysis with them in the model.
But there's no way a learning algorithm can make any of those connections, because those deductions are mostly intuition and logic, which are not statistical. Not to mention, how would ML handle confounders?
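To make the SES example concrete, here is a minimal simulation sketch of the "redo the analysis with the missing variable in the model" step. Everything in it is made up for illustration: the probabilities, the single `poor_diet` intermediate variable, and the helper names `simulate` and `risk_difference` are all assumptions, not real data or an established method. In this toy world, diabetes depends only on diet, and SES only affects diet.

```python
import random

def simulate(n=40000, seed=0):
    """Generate (low_ses, poor_diet, diabetes) rows from a hypothetical causal chain."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        low_ses = rng.random() < 0.5
        # Assumed mechanism: low SES raises the chance of an unhealthy diet.
        poor_diet = rng.random() < (0.7 if low_ses else 0.3)
        # Assumed mechanism: diabetes depends on diet only, never directly on SES.
        diabetes = rng.random() < (0.20 if poor_diet else 0.05)
        rows.append((low_ses, poor_diet, diabetes))
    return rows

def rate(rows):
    """Fraction of rows with diabetes."""
    return sum(d for _, _, d in rows) / len(rows)

def risk_difference(rows):
    """Diabetes rate in the low-SES group minus the rate in the high-SES group."""
    low = [r for r in rows if r[0]]
    high = [r for r in rows if not r[0]]
    return rate(low) - rate(high)

rows = simulate()

# Crude analysis: SES looks associated with diabetes.
crude = risk_difference(rows)

# "Redo the analysis" holding the intermediate variable fixed (stratify by diet).
within_diet = {
    diet: risk_difference([r for r in rows if r[1] == diet])
    for diet in (True, False)
}

print("crude SES risk difference:", round(crude, 3))
for diet, diff in within_diet.items():
    print("poor_diet =", diet, "-> SES risk difference:", round(diff, 3))
```

The crude difference should come out clearly positive, while the differences within each diet group should be close to zero. Note that the code would produce the same pattern if `poor_diet` were a confounder instead of an intermediate step. The numbers alone cannot tell which way the arrows point, so that part still has to come from subject-matter knowledge.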