Advantage of having more measurements

In summary: in order to estimate c, I would need to know the functional form of the non-linearity. However, the actual form is very model dependent, so in our case we don't want to set constraints on a given model; we just want to set a constraint on any deviation from linearity, regardless of its actual form. Basically, I want to quantify how far the points are from being on a straight line.
  • #36
Twigg said:
I hear you, Dale. I'm going to go out on a limb and guess that this method was chosen because it has a simple graphical interpretation.

@kelly0303 I can think of a way to generate a similar measure for N isotope pairs, but it's definitely something I pulled out of a hat. So take this suggestion with a suitably sized grain of salt. It's also a fair bit of work, unfortunately

If you have N data points (isotope pairs), ##\vec{x}_i = (x_i, y_i)##, you can define the area of the triangle made by 3 adjacent points: $$A_i = \frac{1}{2} ||(\vec{x}_{i+1} - \vec{x}_i) \times (\vec{x}_{i+2} - \vec{x}_{i+1})||$$
(To see why this is the area, note that ##(\vec{x}_{i+1} - \vec{x}_i)## is one leg of the triangle, ##(\vec{x}_{i+2} - \vec{x}_{i+1})## is another leg, and the magnitude of their cross product equals the area of the parallelogram spanned by the two legs. Half the area of that parallelogram is the area of the triangle.)

This gives you N-2 individual triangle areas, which you can then average: $$\langle A \rangle = \frac{1}{N-2} \sum^{N-2}_{i=1} A_i$$

The tricky part about doing this is the error propagation, because the ##A_{i}##'s depend on shared ##\vec{x}_i##'s. Because of this, I would suggest using Mathematica to expand ##\langle A \rangle## as a first-order series in small displacements in ##\vec{x}_i##, call them ##\delta \vec{x}_i##. The coefficients in this series will be the coefficients in the error propagation for ##\langle A \rangle##. Does that make sense, or is that just really confusing?

I couldn't tell you how meaningful ##\langle A \rangle## is, but it will have the same scale as the area for 3 points and it will have a well-defined error-bar. I think so long as there isn't an inflection point in the nonlinear curve, it should be fine?

Edit: forgot a normalization in the definition of ##\langle A \rangle##
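For concreteness, here is a minimal numerical sketch of the method described in the quote above: the mean triangle area ##\langle A \rangle## over consecutive triples of points, with the uncertainty obtained by first-order (linearized) error propagation, done here with numerical derivatives rather than a symbolic Mathematica expansion. The data points and uncertainties are made up for illustration, and the coordinate errors are assumed independent.

```python
import numpy as np

def mean_triangle_area(pts):
    """Mean area of the triangles formed by consecutive triples of 2D points."""
    pts = np.asarray(pts, dtype=float)
    a = pts[1:-1] - pts[:-2]   # first leg of each triangle
    b = pts[2:] - pts[1:-1]    # second leg of each triangle
    cross = a[:, 0] * b[:, 1] - a[:, 1] * b[:, 0]  # z-component of the 2D cross product
    return 0.5 * np.abs(cross).mean()              # (1/(N-2)) * sum of triangle areas

def propagated_error(pts, sigmas, eps=1e-6):
    """First-order error propagation: finite-difference derivative of <A>
    with respect to every coordinate, assuming independent coordinate errors."""
    pts = np.asarray(pts, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    base = mean_triangle_area(pts)
    var = 0.0
    for i in range(pts.shape[0]):
        for j in range(2):
            shifted = pts.copy()
            shifted[i, j] += eps
            deriv = (mean_triangle_area(shifted) - base) / eps
            var += (deriv * sigmas[i, j]) ** 2
    return np.sqrt(var)

# Hypothetical isotope-pair data (x_i, y_i) and 1-sigma errors on each coordinate
pts = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
sig = [(0.05, 0.05)] * len(pts)

print(f"<A> = {mean_triangle_area(pts):.3f} +/- {propagated_error(pts, sig):.3f}")
```

Because the derivatives are taken on ##\langle A \rangle## itself, the shared dependence of neighbouring ##A_i## on the same ##\vec{x}_i## is handled automatically; the only independence assumption is on the individual coordinate errors.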
@Twigg @Dale Thank you for your replies! @Twigg I need to look a bit more into your idea (not 100% sure I understand it). I just found this paper, which does what I need in equations 9 and 10, in terms of generalizing the formula to more than 4 isotopes. However, just by looking at that formula, I still don't see how having more isotopes helps. It still looks to me like more isotopes would increase the propagated error, which makes no sense to me. Even looking at formula 12, which seems to be a rough approximation of the error on the parameter of interest (I still need to look at the derivation in more detail), and assuming the errors on the measured transitions are the same for all isotopes, the only improvement comes from ##\Delta A_j^{max}##, which is the maximum difference between 2 isotopes. In practice, measuring one more isotope (assuming we use even ones) would increase that number from something like 8 to 10, or 10 to 12, which is not much of an improvement, and it doesn't even have to do with the statistics (not to mention that the bound presented there is an upper bound, so the actual improvement would be even smaller). Am I missing something? I know for a fact that there is a big effort to measure isotope shifts for as many isotopes as possible (hence the new radioactive beam facilities), but at least for this particular problem that doesn't seem to help much.
 
  • #37
anuttarasammyak said:
Let us say we repeat the 2-transition experiment on the isotopes.
We get:
1st experiment data plot graph DP_1 and NL estimation NL_1
2nd experiment data plot graph DP_2 and NL estimation NL_2
3rd experiment data plot graph DP_3 and NL estimation NL_3
----
n th experiment data plot graph DP_n and NL estimation NL_n
----

Is this the right story we are dealing with?
No, this is not what I am asking for. My question involves only 2 transitions, so there is just one NL. In figure 1 of the paper you mentioned, each NL corresponds to a different pair of transitions.
 
  • #38
I added EDIT to #35 to make it clearer. Still no? Then what kind of experiment do you do to get a new vector to be incorporated?
 
  • #39
anuttarasammyak said:
I added EDIT to #35 to make it clearer. Still no? Then what kind of experiment do you do to get a new vector to be incorporated?
This is the kind of experiment I am talking about (not sure if this is what you meant). Also, we don't want to add a new vector to the data; we want to make the previous vectors longer.
 
  • #40
Twigg said:
I hear you, Dale. I'm going to go out on a limb and guess that this method was chosen because it has a simple graphical interpretation.

@kelly0303 I can think of a way to generate a similar measure for N isotope pairs, but it's definitely something I pulled out of a hat. So take this suggestion with a suitably sized grain of salt. It's also a fair bit of work, unfortunately

If you have N data points (isotope pairs), ##\vec{x}_i = (x_i, y_i)##, you can define the area of the triangle made by 3 adjacent points: $$A_i = \frac{1}{2} ||(\vec{x}_{i+1} - \vec{x}_i) \times (\vec{x}_{i+2} - \vec{x}_{i+1})||$$
(To see why this is the area, note that ##(\vec{x}_{i+1} - \vec{x}_i)## is one leg of the triangle, ##(\vec{x}_{i+2} - \vec{x}_{i+1})## is another leg, and the magnitude of their cross product equals the area of the parallelogram spanned by the two legs. Half the area of that parallelogram is the area of the triangle.)

This gives you N-2 individual triangle areas, which you can then average: $$\langle A \rangle = \frac{1}{N-2} \sum^{N-2}_{i=1} A_i$$

The tricky part about doing this is the error propagation, because the ##A_{i}##'s depend on shared ##\vec{x}_i##'s. Because of this, I would suggest using Mathematica to expand ##\langle A \rangle## as a first-order series in small displacements in ##\vec{x}_i##, call them ##\delta \vec{x}_i##. The coefficients in this series will be the coefficients in the error propagation for ##\langle A \rangle##. Does that make sense, or is that just really confusing?

I couldn't tell you how meaningful ##\langle A \rangle## is, but it will have the same scale as the area for 3 points and it will have a well-defined error-bar. I think so long as there isn't an inflection point in the nonlinear curve, it should be fine?

Edit: forgot a normalization in the definition of ##\langle A \rangle##
I see that in the papers I read, they seem to completely ignore the theoretical error. Even in the paper I mentioned, in equation 11, they claim that we just need to propagate the error on the isotope shift measurement. However, the theoretical parameters (##X_1##, ##X_2##, ##F_{12}##) are usually quite poorly calculated (their relative error is a few percent, which is huge compared to the relative error of the IS measurements). Why can we just ignore those errors when setting bounds?
 
  • #41
Thanks for sharing the papers as you find them, it's very helpful and I've learned a few things. I think the method I shared above only applies to having extra electronic transitions, not extra isotopes. Sorry about that, and thanks for bearing with us! (Edit: I was right the first time. Whoops. Still, the method outlined by Solaro et al seems more consistent than what I was proposing, because my method could give you a fake "linear" reading for a curve with an inflection point in the middle.)

I'm still wrapping my head around the 4-dimensional shenanigans of the Solaro et al paper. But I think what you said about the only improvement coming from ##\Delta A^{max}_j## actually makes sense. The way I see it, Solaro et al were trying to measure just a non-zero non-linearity, and they observed pure linearity. I know they say they got ##NL = \frac{V}{\sigma_V} = 1.26##, but I have no idea what kind of nightmarish probability distribution NL follows, and they say in the main text (in the paragraph under Fig. 2) that they got a regression fit consistent with linearity. (I can at least interpret the second part with my pea brain; no idea what the first part means.) The range ##\Delta A^{max}_j## represents the span of their experiment in unexplored parameter space. In other words, adding isotopes increases the scope of the experiment, but doesn't increase its sensitivity. The sensitivity only cares about the precision (in Hz) of the spectroscopic measurements. Does that make sense?

I would point out that the math one finds in the supplementary material can be a little typo-prone, so that's something to keep in mind. There's definitely a typo in Eqn 10, since the inside of the square root isn't dimensionally sound (I think they just forgot an exponent of 2).
 
  • #42
About them not counting theoretical uncertainty: this is just a thing precision measurement people do. I can vouch for that.

The reason is that they're just looking to make a measurement not consistent with 0. The error bars on coupling constants may change the non-zero value you eventually see, but they don't affect whether or not you see 0. Only your experimental sensitivity determines that. If they get a measurement that's just barely not consistent with zero, they'll get the grad students (sounds like that might be you!) to work around the clock and get another 100 hours of data to push the measurement one way or the other (push it towards being consistent with 0, or reduce the error bar enough to definitively say it's non-zero). This is why, when I was on a precision measurement, I would pray for a measurement of zero. Beyond Standard Model physics is cool and all, but I'll take my weekends to myself, thank you.

Also, just wanted to say about the Solaro et al paper, that's one heck of a setup they've got. That's one intense spectroscopy experiment!
 
  • #43
Twigg said:
Thanks for sharing the papers as you find them, it's very helpful and I've learned a few things. I think the method I shared above only applies to having extra electronic transitions, not extra isotopes. Sorry about that, and thanks for bearing with us!

I'm still wrapping my head around the 4-dimensional shenanigans of the Solaro et al paper. But I think what you said about the only improvement coming from ##\Delta A^{max}_j## actually makes sense. The way I see it, Solaro et al were trying to measure just a non-zero non-linearity, and they observed pure linearity. I know they say they got ##NL = \frac{V}{\sigma_V} = 1.26##, but I have no idea what kind of nightmarish probability distribution NL follows, and they say in the main text (in the paragraph under Fig. 2) that they got a regression fit consistent with linearity. (I can at least interpret the second part with my pea brain; no idea what the first part means.) The range ##\Delta A^{max}_j## represents the span of their experiment in unexplored parameter space. In other words, adding isotopes increases the scope of the experiment, but doesn't increase its sensitivity. The sensitivity only cares about the precision (in Hz) of the spectroscopic measurements. Does that make sense?

I would point out that the math one finds in the supplementary material can be a little typo-prone, so that's something to keep in mind. There's definitely a typo in Eqn 10, since the inside of the square root isn't dimensionally sound (I think they just forgot an exponent of 2).
So basically, from a statistics point of view, measuring more isotopes doesn't improve the bounds (up to that ##\Delta A^{max}_j## factor). What needs to be done is to actually reduce the uncertainty on each individual IS measurement. It still confuses me that adding more data doesn't help you. Intuitively I would imagine that observing linearity with 10 points would help you set much tighter bounds on new physics than with 3 points.

I am not totally sure I understand the theoretical uncertainty part. In this case, the new physics coupling constant, call it ##\alpha##, is of the form ##\frac{A}{B}##, with A depending only on experimental data and B depending on both theory and experiment, but for now let's assume it depends only on theory. Assume that we get ##A = 10 \pm 20## and from theory we have ##B = 100 \pm 10##. Whether A is consistent with zero or not has nothing to do with the theory, and from the example above we can set a 95% confidence limit on A of ##A < 50##. However, their limits are on ##\alpha##, whose central value is ##10/100 = 0.1##. But I don't understand how we can just ignore the theoretical errors in this case. It seems like the way they quote the bound would be ##\alpha < (10 + 20 + 20)/100 = 0.5##, where only the experimental error is considered. Why can we ignore the ##\pm 10## coming from the theory?
 
  • #44
kelly0303 said:
I am not totally sure I understand the theoretical uncertainty part.
Your analysis here is correct, but the purpose is different. The goal isn't to quote the total uncertainty on ##\alpha##, but to give the best estimate of the experiment's sensitivity to ##\alpha##. That estimate is itself uncertain due to theory error bars, as your analysis shows. I couldn't tell you exactly where this practice started, but I believe the reason precision measurement folks do this is to compare experiments and rate them.
Imagine people quoted the total uncertainty on ##\alpha##, and imagine a case where the theory uncertainty dominated over the experimental uncertainty. Every experiment would approximately have the same "sensitivity" with this convention. So instead, people ignore the theory uncertainty and quote only the experiment uncertainty. That way it's easier to "rank" experiments. Of course, in reality this number doesn't mean anything on its own because for all you know an experiment with a lower statistical sensitivity could be bloated with systematic uncertainty.
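As a quick illustration of the convention, using the numbers from the example above (##A = 10 \pm 20##, ##B = 100 \pm 10##), here is a sketch comparing the experiment-only bound on ##\alpha## with the bound from full error propagation; the ##\sim 2\sigma## limit and the independence of the errors are assumptions made for illustration.

```python
import numpy as np

A, sig_A = 10.0, 20.0    # experimental numerator (example above)
B, sig_B = 100.0, 10.0   # theory-only denominator (example above)

alpha = A / B
# Experiment-only convention: propagate only sigma_A
sig_exp = sig_A / B
# Full propagation: include the theory error on B as well (first order, independent errors)
sig_full = alpha * np.sqrt((sig_A / A) ** 2 + (sig_B / B) ** 2)

print(f"alpha = {alpha:.3f}")
print(f"experiment-only ~2-sigma limit: alpha < {alpha + 2 * sig_exp:.3f}")
print(f"full-propagation ~2-sigma limit: alpha < {alpha + 2 * sig_full:.3f}")
# With these particular numbers the two limits nearly coincide, because the relative
# error on A dominates; the convention matters once sigma_B/B becomes comparable.
```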
 
  • #45
kelly0303 said:
Intuitively I would imagine that observing linearity with 10 points would help you set much tighter bounds on new physics than with 3 points.
I think what the expression in Eqn 12 is saying is that what really matters is how far apart those 10 points are spaced. When measuring non-linearity, you want a large lever arm over which to see the change in slope. You can make a better measurement with 3 points very far apart than with 10 points bunched together.
 
  • #46
Twigg said:
Your analysis here is correct, but the purpose is different. The goal isn't to quote the total uncertainty on ##\alpha##, but to give the best estimate of the experiment's sensitivity to ##\alpha##. That estimate is itself uncertain due to theory error bars, as your analysis shows. I couldn't tell you exactly where this practice started, but I believe the reason precision measurement folks do this is to compare experiments and rate them.
Imagine people quoted the total uncertainty on ##\alpha##, and imagine a case where the theory uncertainty dominated over the experimental uncertainty. Every experiment would approximately have the same "sensitivity" with this convention. So instead, people ignore the theory uncertainty and quote only the experiment uncertainty. That way it's easier to "rank" experiments. Of course, in reality this number doesn't mean anything on its own because for all you know an experiment with a lower statistical sensitivity could be bloated with systematic uncertainty.
Thanks a lot! So basically it is something agreed upon, to make these exclusion plots while ignoring the theoretical part? However, now there is something else that confuses me. They talk in all these papers about non-linearities coming from the SM, and how they can significantly affect the sensitivity to new physics. As far as I understand, these can be calculated, but the errors are pretty big, so future experiments will try to measure more transitions in order to get rid of them using data, not theory. However, say that from the experiment we obtain a non-linearity of ##10 \pm 20##. If we assume no SM non-linearity, we would set a limit at ##<50##. Now, if from theory we have a predicted SM non-linearity of 5, the non-linearity due to new physics would be ##(10-5) \pm 20 = 5 \pm 20##, from which we get a bound of ##<45##. So it looks as if by including the SM non-linearity we get an even better bound. What am I doing wrong here? How does the SM non-linearity reduce the sensitivity to new physics?
 
  • #47
kelly0303 said:
So basically it is something agreed upon, to make these exclusion plots while ignoring the theoretical part?
Yes, but it's not specific to this problem. Here's a better explanation than what I gave before: say you measured a value of ##A## that's ##5\sigma## away from 0. The confidence in this measurement does not depend on theory whatsoever. Even if there's a lot of uncertainty on ##B## (and therefore ##\alpha##), you still proved the existence of novel physics.

The problem with the SM non-linearity is that it introduces a systematic uncertainty. When looking for minuscule effects, you always want to measure something that would be 0 in the SM. Under this condition, you can separate experimental and theoretical uncertainties. When you have to subtract a systematic shift from your measurement, you add uncertainty due to the correction, and thus the theoretical uncertainty on the SM non-linearity bleeds into your final error budget.

In your example, there would be theory error bars on the SM non-linearity of 5. The corrected non-linearity (##NL_{BSM} = NL_{observed} - NL_{SM}##) would have an error bar ##\sqrt{(20^2 + \sigma_{SM}^2)}## where ##\sigma_{SM}## is the uncertainty on the SM non-linearity of 5. Experiments often quickly outpace theoretical calculation in these projects.

That's why they try to cancel out the SM non-linearity by taking differential measurements. It's a classic rule of thumb for precision measurements to try and measure something with a baseline of 0 and to avoid non-zero correction factors like the plague.
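To put numbers on this, take the example above (observed non-linearity ##10 \pm 20##, predicted SM non-linearity 5) and assume, purely for illustration, a theory error of ##\sigma_{SM} = 10## on that prediction. Then
$$\sigma_{tot} = \sqrt{20^2 + 10^2} \approx 22.4, \qquad NL_{BSM} \lesssim (10 - 5) + 2\,\sigma_{tot} \approx 49.7,$$
which is essentially the same as the ##<50## bound obtained by ignoring the SM non-linearity altogether, and the bound only gets worse as ##\sigma_{SM}## grows.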
 
  • #48
Twigg said:
Yes, but it's not specific to this problem. Here's a better explanation than what I gave before: say you measured a value of ##A## that's ##5\sigma## away from 0. The confidence in this measurement does not depend on theory whatsoever. Even if there's a lot of uncertainty on ##B## (and therefore ##\alpha##), you still proved the existence of novel physics.

The problem with the SM non-linearity is that it introduces a systematic uncertainty. When looking for minuscule effects, you always want to measure something that would be 0 in the SM. Under this condition, you can separate experimental and theoretical uncertainties. When you have to subtract a systematic shift from your measurement, you add uncertainty due to the correction, and thus the theoretical uncertainty on the SM non-linearity bleeds into your final error budget.

In your example, there would be theory error bars on the SM non-linearity of 5. The corrected non-linearity (##NL_{BSM} = NL_{observed} - NL_{SM}##) would have an error bar ##\sqrt{(20^2 + \sigma_{SM}^2)}## where ##\sigma_{SM}## is the uncertainty on the SM non-linearity of 5. Experiments often quickly outpace theoretical calculation in these projects.

That's why they try to cancel out the SM non-linearity by taking differential measurements. It's a classic rule of thumb for precision measurements to try and measure something with a baseline of 0 and to avoid non-zero correction factors like the plague.
Oh I see, so basically we would need to measure one more transition in order to get rid of one SM non-linear effect. That extra transition would add some more error to the measurement, compared to just 2 transitions, but the extra error is usually much smaller than the error on the theoretical prediction of the SM non-linearity. Also, now we know that the obtained result reflects exclusively new physics (assuming there is just one SM non-linearity).
 
  • #49
Yep! I don't know the specifics, but it's undoubtedly something to that effect based on your description.

I wouldn't say it adds error, because taking two measurements also means you get to average down on the BSM non-linearity. For example, if you make two measurements that yield results like ##\alpha_1 = \alpha_{BSM} + \alpha_{SM}## and ##\alpha_2 = \alpha_{BSM} - \alpha_{SM}## (this is totally hypothetical), you'd get a ##\frac{1}{\sqrt{2}}## improvement in error over taking one measurement (because ##\alpha_{BSM} = \frac{1}{2} (\alpha_1 + \alpha_2)##, so ##\sigma_{BSM} = \frac{\sqrt{\sigma^2 + \sigma^2}}{2} = \frac{1}{\sqrt{2}}\sigma##). But that's the same as just taking two measurements without cancelling. If anything, the error per square root of the number of measurements is constant.
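A quick numerical check of that ##\frac{1}{\sqrt{2}}## factor, under the same hypothetical combination (two independent measurements with equal error ##\sigma##, combined as ##\alpha_{BSM} = \frac{1}{2}(\alpha_1 + \alpha_2)##); the true values are set to zero just to isolate the scatter:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n = 1.0, 100_000

# Two independent measurements with the same error bar sigma (true values taken as 0)
alpha1 = rng.normal(0.0, sigma, n)
alpha2 = rng.normal(0.0, sigma, n)

alpha_bsm = 0.5 * (alpha1 + alpha2)
print(np.std(alpha_bsm))   # ~ sigma / sqrt(2) ~ 0.707
```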
 
  • #50
Twigg said:
Yep! I don't know the specifics, but it's undoubtedly something to that effect based on your description.

I wouldn't say it adds error, because taking two measurements also means you get to average down on the BSM non-linearity. For example, if you make two measurements that yield results like ##\alpha_1 = \alpha_{BSM} + \alpha_{SM}## and ##\alpha_2 = \alpha_{BSM} - \alpha_{SM}## (this is totally hypothetical), you'd get a ##\frac{1}{\sqrt{2}}## improvement in error over taking one measurement (because ##\alpha_{BSM} = \frac{1}{2} (\alpha_1 + \alpha_2)##, so ##\sigma_{BSM} = \frac{\sqrt{\sigma^2 + \sigma^2}}{2} = \frac{1}{\sqrt{2}}\sigma##). But that's the same as just taking two measurements without cancelling. If anything, the error per square root of the number of measurements is constant.
One more question (sorry!). The actual expression for the new physics parameter ##\alpha## is of the form ##A/B##, as we said above, but in practice B contains both theory and experimental input. As you said, if we measure ##A\neq 0## at the ##5 \sigma## level, we know for sure we found new physics (assuming we have no SM effects). So when we calculate the error on ##\alpha## by error propagation, why do we need to propagate the error from the experimental part of B, too? If all that matters is whether A is consistent with zero or not, why do we bother with B at all? Also, why don't people make exclusion plots for A directly, without involving ##\alpha## at all?
 
  • #51
Ah, sorry this was my mistake. I forgot B had some experimental uncertainty in our discussion. To be honest, when ##B## depends on the measured quantities, I'm less confident in my claim. However, in equation 11 of the Solaro paper, it looks like "B" doesn't depend on any measured quantities. Am I missing something?
 
  • #52
Twigg said:
Ah, sorry this was my mistake. I forgot B had some experimental uncertainty in our discussion. To be honest, when ##B## depends on the measured quantities, I'm less confident in my claim. However, in equation 11 of the Solaro paper, it looks like "B" doesn't depend on any measured quantities. Am I missing something?
My bad! It seems like in general it can be factorized as ##A/(BC)##, where A and C are experimental parts and B is only theory. But my other question still remains: in this case, why do we care about ##BC## at all? What we measure in practice is A, and that is what tells us if we made a discovery or not, so it seems like the ##BC## term just adds a lot of complications without telling us anything about whether we found something or not.
 
  • #53
I think it's just cosmetic, and to make the result easier to communicate. From a measurement / statistics standpoint, the theoretical "B" coefficient does not matter. However, if you had two groups doing isotope shift measurements, one doing measurements in Ca and one doing measurements in radium or some fancy stuff, they'd need a quantity that was species-independent to compare results. That's where B comes in. Also, I think tying the experimental quantities to theory is just something that journals expect from authors.
 
  • #54
Twigg said:
I think it's just cosmetic, and to make the result easier to communicate. From a measurement / statistics standpoint, the theoretical "B" coefficient does not matter. However, if you had two groups doing isotope shift measurements, one doing measurements in Ca and one doing measurements in radium or some fancy stuff, they'd need a quantity that was species-independent to compare results. That's where B comes in. Also, I think tying the experimental quantities to theory is just something that journals expect from authors.
Sorry, my question was not very clear. I was actually more curious about the C term. When doing error propagation, we need to propagate the error from C too, as it is an experimental term. However, the discovery potential of the measurement is contained in A, in terms of the measured quantities. Why do we need to propagate the error from C, and not just treat it the way we treat B?
 
  • #55
It's not clear to me off the bat how to interpret the final result ##y_e y_n##, but I'm guessing it's significant and has something to do with the intermediary bosons; the beyond-Standard-Model theory aspect of this is way, way over my head. You might attract the attention of smarter folks than me by posting a new thread about the new physics behind this measurement. Sorry I couldn't be more help!
 
  • #56
Twigg said:
It's not clear to me off the bat how to interpret the final result ##y_e y_n##, but I'm guessing it's significant and has something to do with the intermediary bosons; the beyond-Standard-Model theory aspect of this is way, way over my head. You might attract the attention of smarter folks than me by posting a new thread about the new physics behind this measurement. Sorry I couldn't be more help!
You helped me a lot! I understood a lot about the approaches to this kind of experiment from your replies. Thank you so much!
 
  • #57
kelly0303 said:
Hello! I have some points in the plane, with errors on both x and y coordinates. The goal of the experiment is to check whether the points are consistent with a straight line, i.e. whether they can be described by a function of the form ##y = f(x)=a+bx## or whether there is some nonlinearity involved (e.g. ##y = f(x)=a+bx+cx^2##). Assume first we have only 3 points measured. In this case, the approach is to calculate the area of the triangle formed and the associated error, so we get something of the form ##A\pm dA##. If ##dA>A##, then we are consistent with linearity and we can set a constraint (to some given confidence level) on the magnitude of a possible non-linearity (e.g. ##c<c_0##). If we have 4 points, we can do something similar: for example, calculate the area of the triangle formed by the first 3 points (in order of the x coordinate), ##A_1\pm dA_1##, and the area of the last 3 points, ##A_2\pm dA_2##, then add them and do error propagation to get ##A\pm dA##, and proceed as above (in the case of this experiment we expect not to see a non-linearity, so we just aim for upper bounds). My question is, what is the advantage of having more points? Intuitively, I expect that the more points you have, the more information you gain and hence the better you can constrain the non-linearity. But it seems like the error gets bigger and bigger, simply because we have more points and error propagation (you can assume that the errors on x and y are the same, or at least very similar, for different measurements). So, assuming the points are actually on the line, for 3 points we get ##0\pm dA_3## and for, say, 10 points we get ##0\pm dA_{10}## with ##dA_{10}>dA_3##, so the upper bounds we can set on the non-linearity are better (smaller) in the case of 3 points. But intuitively that doesn't make sense. Can someone help me understand what I am doing wrong? Why is it better to have more points? Thank you!

This may already be covered; I haven't read the entire thread, sorry.
I'm a retired cartographer/analyst programmer who worked for the NSW (Australia) state mapping and geodetic survey authority. The advantage of more points is to determine precision. Precision is not accuracy; it is consistency. You can, for example, express precision as a standard deviation (for a normally distributed sample, roughly 99.7% falls within 3 standard deviations), and this is the usual way to express precision. I don't like the concept of basing straightness on the area of a triangle, because the further apart the points are, the larger the area, even though the precision is the same. I think you would be better off using linear regression, which is a statistical approach and will give you the mean line through the points as well as a correlation coefficient expressing how well the points fit that line. This can be done in Excel. In fact, you can even do curve regression in Excel, and it will give you an expression for the curve, which is really handy for trend and relationship analysis. Hope that makes sense.
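For what it's worth, here is a minimal sketch of that regression approach in Python rather than Excel: a straight-line fit with its correlation coefficient, plus a weighted quadratic fit whose ##x^2## coefficient (and its uncertainty) quantifies the non-linearity. The data points and errors are hypothetical, and note that, unlike the original problem, this simple fit only accounts for errors on y, not on both coordinates.

```python
import numpy as np
from scipy import stats

# Hypothetical data points (x, y) with equal 1-sigma uncertainties on y
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
sigma_y = np.full_like(y, 0.1)

# Straight-line fit: slope, intercept, and correlation coefficient r
res = stats.linregress(x, y)
print(f"linear fit: y = {res.intercept:.3f} + {res.slope:.3f} x, r = {res.rvalue:.4f}")

# Weighted quadratic fit: the x^2 coefficient c and its uncertainty bound the non-linearity
coeffs, cov = np.polyfit(x, y, deg=2, w=1.0 / sigma_y, cov=True)
c, dc = coeffs[0], np.sqrt(cov[0, 0])
print(f"non-linearity: c = {c:.4f} +/- {dc:.4f}")
```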
 
