- #1
Jarvis323
- 1,243
- 986
I am wondering if this problem has a name, and what is the most efficient way to solve it. Say you have a normalized histogram ##h(P)## (representing a pdf estimated from a large population), with ##n## bins, you want to generate a sample of points ##S## from ##h(P)## of size ##k##, such that $$\sum_{i=0}^{n-1}|h(S)_i - h(P)_i|$$ is minimized. Note that since the histogram is discrete, it is enough to select a ##c(i)## for each bin, where $$\sum_{i=0}^{n-1}c(i) = k$$ and the solution is minimal.