How to Bin Data for Spectrum Fitting with Poisson Errors?

In summary, the author binned the data to obtain a better fit to the spectrum, using 10, 22, 29, and 35 as the bin centers.
  • #36
Remember the rate has the time in it already, so it works out.
I agree with everything @gleem has said. Just to reiterate: because you do in fact want the "interval weighted" rate values when you bin (I screwed this up because I didn't understand the relative energy widths), the binned rate is just
$$\frac{N_{\text{total}}}{t_{\text{total}}}.$$
Similarly, the propagated RMS error on the rate ends up being
$$\frac{\sqrt{N_{\text{total}}}}{t_{\text{total}}},$$
using the propagation-of-errors formula in #35. All self-consistent... and simple.
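For anyone following along in code, here is a minimal sketch of that calculation in Python with NumPy; the counts and times are illustrative placeholders, not the thread's actual data:

```python
import numpy as np

# Counts N_i and measurement times t_i for the points falling in one bin
# (illustrative values only).
counts = np.array([120, 98, 143])     # N_i, raw counts
times = np.array([10.0, 8.0, 12.0])   # t_i, live times in seconds

N_total = counts.sum()
t_total = times.sum()

rate = N_total / t_total                # binned rate N_total / t_total
rate_err = np.sqrt(N_total) / t_total   # Poisson error propagated through N/t

print(f"rate = {rate:.3f} +/- {rate_err:.3f} counts/s")
```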
 
  • #37
I don't know if it counts as cheating, but in a similar data analysis I built a spreadsheet that let me choose the bin size and then looked at the results. I was analyzing microarray data, and there, if you look at signal strength vs. the number of occurrences, the data has to be placed in bins. There are very few points with a very strong signal and very many with a very weak signal. Each signal is unique, but if you sort them into bins, you can see the difference between how often a signal response of 0-50 occurs, then how often 50-100, then 100-150, etc.

I would choose the bin size based on the data. It might be cheating, but sometimes a change in the bin size changed the "noise" in the data. I had a spreadsheet where I imported the data and changed the bin size in a single cell, which then recalculated and replotted the results.

You should be able to sort your data, then apply a set of columns that apply the bin-size test based on a separate cell. I don't think it matters whether the plot you generate is vs. the bin midpoint, but I would use that.
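As a rough Python equivalent of that spreadsheet (the data and bin width are illustrative; `bin_width` plays the role of the single cell you tweak):

```python
import numpy as np

# Illustrative signal strengths: many weak signals, few strong ones.
rng = np.random.default_rng(0)
signals = rng.exponential(scale=60.0, size=5000)

bin_width = 50.0  # change this one value and re-run, as with the spreadsheet cell
edges = np.arange(0.0, signals.max() + bin_width, bin_width)
occurrences, _ = np.histogram(signals, bins=edges)

# Tabulate occurrences against the bin midpoints.
midpoints = 0.5 * (edges[:-1] + edges[1:])
for m, n in zip(midpoints, occurrences):
    print(f"{m:8.1f}: {n}")
```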
 
  • #38
Adding counts and measurement times in a bin is the right approach; it is the best you can do with the given information. Making a bin includes the assumption that the value doesn't change much within the bin (otherwise the bin is too wide), and in that case you can just add counts and times. If you want to be fancy with the x-values, you can take the weighted average of the x-values going into that bin as the bin's x-value (with the measurement times as weights), but with bins as fine as in your example this shouldn't matter.
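A small sketch of that time-weighted bin center, assuming NumPy arrays `x` (wavenumbers) and `t` (measurement times) for the points falling in one bin; the values are made up:

```python
import numpy as np

# Wavenumbers and measurement times of the points in one bin (illustrative).
x = np.array([3504.1, 3504.3, 3504.6])
t = np.array([10.0, 8.0, 12.0])

# Time-weighted bin center; for narrow bins this is nearly
# identical to the simple midpoint of the bin.
x_bin = np.average(x, weights=t)
```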
The calculations of the uncertainties in the previous posts are good, too.

Bin the measurements after sorting them by increasing wavenumber, of course.

A direct one-dimensional likelihood fit to 10,000 or even 100,000 data points shouldn't be an issue, by the way, unless your degrees of freedom are really excessive.
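For concreteness, here is one way such a per-point Poisson likelihood fit might look in Python with SciPy; the Gaussian-peak-plus-background model, the parameter values, and the synthetic data are all assumptions for illustration, not the actual spectrum from this thread:

```python
import numpy as np
from scipy.optimize import minimize

# Each point i has counts N_i observed over time t_i at position x_i.
# model(x, ...) is the hypothesised rate: a Gaussian peak on a flat background.
def model(x, amp, mu, sigma, bkg):
    return bkg + amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def neg_log_likelihood(params, x, N, t):
    lam = model(x, *params) * t           # expected counts per point
    lam = np.clip(lam, 1e-12, None)       # guard against log(0)
    return np.sum(lam - N * np.log(lam))  # Poisson NLL up to a constant

# Synthetic data: ~10,000 points, equal measurement times.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 100.0, 10_000)
t = np.full_like(x, 5.0)
true_params = (8.0, 50.0, 3.0, 2.0)
N = rng.poisson(model(x, *true_params) * t)

res = minimize(neg_log_likelihood, x0=(5.0, 45.0, 5.0, 1.0),
               args=(x, N, t), method="Nelder-Mead")
print(res.x)  # fitted (amp, mu, sigma, bkg)
```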
 
