I don't think they make naive mistakes either. A more realistic "mistake" is to deliberately undersample subsets where you think you already know the answer to some degree, so you can oversample the subset where you don't.
This trades uncertainty in one subsample for uncertainty in another, and it can improve the overall error.
The problems start when the assumptions about the "well-known" sample turn out to be incorrect. They are aggravated if the undersampling is severe enough to hide the discrepancy between what is expected and what is observed.
In the physical sciences we would say that one is reducing the statistical error at the cost of increased systematic error.
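
To put rough numbers on that trade-off, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: a population split into two strata A and B, the weights, spreads and sample budget, and the function names. It also takes the extreme case where the "well-known" stratum A is not sampled at all and an assumed mean is plugged in instead, which is cruder than a real survey design would be.

```python
import math

def proportional_design(w_a, s_a, s_b, n):
    """Sample both strata in proportion to their weights (no prior assumed).

    Returns (statistical error, systematic error) of the estimated overall mean.
    """
    w_b = 1.0 - w_a
    n_a = w_a * n
    n_b = w_b * n
    variance = w_a**2 * s_a**2 / n_a + w_b**2 * s_b**2 / n_b
    return math.sqrt(variance), 0.0

def trust_prior_design(w_a, s_b, n, mu_a_true, mu_a_assumed):
    """Spend the whole budget on stratum B and plug in an assumed mean for A.

    Returns (statistical error, systematic error) of the estimated overall mean.
    """
    w_b = 1.0 - w_a
    stat = math.sqrt(w_b**2 * s_b**2 / n)
    syst = abs(w_a * (mu_a_assumed - mu_a_true))
    return stat, syst

def total_error(stat, syst):
    """Combine the two contributions in quadrature."""
    return math.hypot(stat, syst)

# Hypothetical numbers: equal-weight strata, spread of 10 in each, 100 samples.
baseline = total_error(*proportional_design(0.5, 10.0, 10.0, 100))
prior_ok = total_error(*trust_prior_design(0.5, 10.0, 100, 50.0, 50.0))
prior_off = total_error(*trust_prior_design(0.5, 10.0, 100, 50.0, 53.0))

print(f"proportional sampling:       {baseline:.2f}")   # 1.00
print(f"trusted prior, prior right:  {prior_ok:.2f}")   # 0.50
print(f"trusted prior, prior off by 3: {prior_off:.2f}")  # 1.58
```

With these made-up numbers, trusting the prior halves the error when the prior is right, but a shift of 3 in stratum A leaves the result worse than plain proportional sampling. Since A was never sampled, nothing in the data would flag the shift.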