Advantage of having more measurements

kelly0303 · May 15, 2021

Hello! I have some points in the plane, with errors on both the x and y coordinates. The goal of the experiment is to check whether the points are consistent with a straight line, i.e. whether they can be described by a function of the form ##y = f(x)=a+bx##, or whether there is some nonlinearity involved (e.g. ##y = f(x)=a+bx+cx^2##).

Assume first that we have only 3 points measured. In this case, the approach is to calculate the area of the triangle they form and the associated error, so we get something of the form ##A\pm dA##. If ##dA>A##, then we are consistent with linearity and we can set a constraint (at some given confidence level) on the magnitude of a possible non-linearity (e.g. ##c<c_0##). If we have 4 points, we can do something similar: for example, calculate the area of the triangle formed by the first 3 points (in order of the x coordinate), ##A_1\pm dA_1##, and the area of the triangle formed by the last 3 points, ##A_2\pm dA_2##, then add them and do error propagation to get ##A\pm dA##, and proceed as above (in this experiment we expect not to see a non-linearity, so we just aim for upper bounds).

My question is, what is the advantage of having more points? Intuitively, I expect that the more points you have, the more information you gain and hence the better you can constrain the non-linearity. But it seems like the error gets bigger and bigger, simply because we have more points and more error propagation (you can assume that the errors on x and y are the same, or at least very similar, for different measurements). So, assuming the points are actually on the line, for 3 points we get ##0\pm dA_3## and for, say, 10 points we get ##0\pm dA_{10}## with ##dA_{10}>dA_3##, so the upper bounds we can set on the non-linearity are better (smaller) in the case of 3 points. But intuitively that doesn't make sense. Can someone help me understand what I am doing wrong? Why is it better to have more points? Thank you!

anuttarasammyak · May 15, 2021

kelly0303 said:

My question is, what is the advantage of having more points?

I believe the more observations or trials we do, the more we learn about the physical system, including its intrinsic disturbances, noise, and probabilistic behavior.

kelly0303 · May 15, 2021

anuttarasammyak said:

I believe the more observations or trials we do, the more we learn about the physical system, including its intrinsic disturbances, noise, and probabilistic behavior.

Well yeah, this is what I believe intuitively, but I am not sure how to show it mathematically.

anuttarasammyak · May 15, 2021

The law of large numbers and the central limit theorem would be of interest to you.

Dale · May 15, 2021

I don’t understand the point of the areas. Why not just estimate c directly using least squares? Or even a Bayesian estimation?

kelly0303 · May 16, 2021

Dale said:

I don’t understand the point of the areas. Why not just estimate c directly using least squares? Or even a Bayesian estimation?

But in order to estimate c, I would need to know the functional form of the non-linearity. However, the actual form is very model dependent, so in our case we don't want to set constraints on a given model; we just want to set a constraint on any deviation from linearity, regardless of its actual form. Am I misunderstanding your point?

Basically I want to quantify how far the points are from being on a straight line. I decided to use this area as a quantifier, but I am totally open to suggestions for better ways to do it.

kelly0303 · May 16, 2021

anuttarasammyak said:

The law of large numbers and the central limit theorem would be of interest to you.

I know about these in general, I am just not sure how they apply to my particular case. For example, in general the error would go as ##1/\sqrt{N}##, where N is the number of measurements, but I don't see that in my expressions above explicitly, so I am probably doing something wrong.
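As a rough sketch of where a ##1/\sqrt{N}## could show up here (assuming, purely for illustration, ##k## quantities ##A_1,\dots,A_k## that are independent and each carry the same error ##dA##; triangles that share points are not independent, so this only indicates the scaling), summing and averaging behave differently under error propagation: $$\sigma\left(\sum_{i=1}^{k} A_i\right) = \sqrt{k}\,dA, \qquad \sigma\left(\frac{1}{k}\sum_{i=1}^{k} A_i\right) = \frac{dA}{\sqrt{k}}$$
Under that assumption the error of a sum grows with the number of terms, while the error of an average shrinks.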

anuttarasammyak · May 16, 2021

I do not know exactly what your system is, but I expect that a plot of many (x,y) data points shows a pattern of dense and sparse regions, and that it becomes clearer for larger N, as the experiment video linked below shows as an example. link https://www.hitachi.com/rd/research/materials/quantum/doubleslit/index.html

kelly0303 · May 16, 2021

anuttarasammyak said:

I do not know exactly what your system is, but I expect that a plot of many (x,y) data points shows a pattern of dense and sparse regions, and that it becomes clearer for larger N, as the experiment video linked below shows as an example. link https://www.hitachi.com/rd/research/materials/quantum/doubleslit/index.html

I don't have many points, tho. Here is a paper that might explain it better (the physics is involved, but the details are not important for my question), in figure S2. In the experiments so far, people used to measure 3 points and get something like in figure S2. What is usually done in the literature is to calculate the area of the triangle formed by these 3 points and the error associated with it (by propagating the error from each of the 3 points), and from there set a constraint on the non-linearity (so far all the areas are smaller than the uncertainties, so we were able to just set upper limits). My question is simply, if I am able to measure a 4th point on that plot, how would that help me (I am sure it would, as I would gain more data, but I am not sure mathematically how the error on the area is reduced by adding one more point)?
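A minimal sketch of the 3-point procedure described above, assuming independent Gaussian errors on every coordinate (the function name `triangle_area_with_error` and the sample numbers are illustrative and not taken from the paper):

```python
import math

def triangle_area_with_error(points, sigmas):
    """Signed triangle area and its first-order propagated error.

    points: [(x1, y1), (x2, y2), (x3, y3)]
    sigmas: [(sx1, sy1), (sx2, sy2), (sx3, sy3)], assumed independent.
    """
    (x1, y1), (x2, y2), (x3, y3) = points
    area = 0.5 * ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
    # Partial derivatives of the signed area with respect to each coordinate.
    d_dx = (0.5 * (y2 - y3), 0.5 * (y3 - y1), 0.5 * (y1 - y2))
    d_dy = (0.5 * (x3 - x2), 0.5 * (x1 - x3), 0.5 * (x2 - x1))
    variance = 0.0
    for (sx, sy), gx, gy in zip(sigmas, d_dx, d_dy):
        variance += (gx * sx) ** 2 + (gy * sy) ** 2
    return area, math.sqrt(variance)

# Hypothetical example: three collinear points, each coordinate known to 0.1.
area, d_area = triangle_area_with_error(
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
    [(0.1, 0.1)] * 3,
)
print(f"A = {area:.3f} +/- {d_area:.3f}")
```

The signed area is used so that the result can scatter symmetrically around zero; taking the absolute value afterwards gives the usual geometric area.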

jedishrfu · May 16, 2021

What you’re trying to do is what a linear regression does. It finds the best line through a set of points. If it looks to be a poor line after a lot of points then you must consider that there’s a different relationship.

Sometimes folks will apply a linear regression to the log values of x or y or both. This scheme can discover power-law relationships like ##y = x^2## because a log-log plot would show a straight line, ##\log(y) = 2 \log(x)##.
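A small sketch of the log-log idea, using NumPy's `polyfit` on made-up, noise-free data (the arrays are illustrative only):

```python
import numpy as np

# Hypothetical data following y = x**2 exactly.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = x ** 2

# Fit a straight line to (log x, log y); the slope estimates the exponent.
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
print(f"exponent ~ {slope:.3f}, log-prefactor ~ {intercept:.3f}")
```

With noisy data the fitted slope would only approximate the exponent, and the log transform also reshapes the error bars.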

Here’s more on linear regression:

https://en.wikipedia.org/wiki/Linear_regression

and this video

kelly0303 · May 16, 2021

jedishrfu said:

What you’re trying to do is what a linear regression does. It finds the best line through a set of points. If it looks to be a poor line after a lot of points then you must consider that there’s a different relationship.

Sometimes folks will apply a linear regression to the log values of x or y or both. This scheme can discover power-law relationships like ##y = x^2## because a log-log plot would show a straight line, ##\log(y) = 2 \log(x)##.

Here’s more on linear regression:

https://en.wikipedia.org/wiki/Linear_regression

and this video

I know what linear regression is, that is not what I am trying to do... as I said in the previous reply, the paper I linked to might explain better what I want to do, especially figure S2. There they measure 3 points, calculate the area of the triangle created by them and quantify the deviation from linearity based on the value of that area. I don't see how doing a linear regression to these 3 points would help me quantify that non-linearity.

anuttarasammyak · May 16, 2021

kelly0303 said:

There they measure 3 points, calculate the area of the triangle created by them and quantify the deviation from linearity based on the value of that area.

I observe in S2 that they set half the volume of the hexagonal solid whose axes are the three momentum vectors as NL, right? Do these three vectors come from the data of a single experiment? I would like to understand how you want to add data or vectors to it in your question.

kelly0303 · May 16, 2021

anuttarasammyak said:

I observe in S2 that they set half the volume of the hexagonal solid whose axes are the three momentum vectors as NL, right? Do these three vectors come from the data of a single experiment? I would like to understand how you want to add data or vectors to it in your question.

I am not sure what you mean. What hexagonal volume are you referring to?

anuttarasammyak · May 16, 2021

Equation (6) and its explanation by S2.

kelly0303 · May 16, 2021

anuttarasammyak said:

Equation (6) and its explanation by S2.

Equation (6) is just the area of that triangle in figure S2. In the experiment they measure the 6 points ##m\nu_i^{AA_j}## from the x and y-axis in figure S2, and from there they calculate the area created.

anuttarasammyak · May 16, 2021

Equation (6) seems to have dimension of volume p^3 in momentum space
[tex]|(A \times B)\cdot C|[/tex]
not area for me.

kelly0303 · May 16, 2021

anuttarasammyak said:

But equation (6) seems to have dimension of volume p^3 in momentum space
[tex](A \times B)\cdot C[/tex]

If you look just before equation (5), ##m_\mu## is just a constant, without units.

anuttarasammyak · May 16, 2021

I see. And the paper's statement "Equivalently, in our geometrical picture it is the volume of the parallelepiped defined by ##\overrightarrow{m\nu}_{1,2}## and ##\overrightarrow{m\mu}##." confirms my view.

Going back to your point, what would you like to do beyond this triplet of vectors? Make a quartet by incorporating another vector? Get a set of triplets from many experiments?

jim mcnamara · May 16, 2021

You mentioned area. Area==zero. That is how to test for collinearity of points:
https://www.geeksforgeeks.org/program-check-three-points-collinear/

You can also use the distance test, if that makes any difference to you.
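Both yes/no tests can be sketched in a few lines. The function names and the tolerance `tol` below are hypothetical choices, and with real measurement errors the tolerance would have to come from the propagated uncertainty instead of a fixed number:

```python
import math

def collinear_by_area(p, q, r, tol=1e-12):
    """True if twice the signed triangle area is zero within tol."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])
    return abs(cross) <= tol

def collinear_by_distance(p, q, r, tol=1e-12):
    """True if the two shorter pairwise distances add up to the longest."""
    d = sorted([math.dist(p, q), math.dist(q, r), math.dist(p, r)])
    return abs(d[0] + d[1] - d[2]) <= tol

print(collinear_by_area((0, 0), (1, 1), (2, 2)))      # points on y = x
print(collinear_by_distance((0, 0), (1, 1), (2, 3)))  # third point off the line
```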

Now we are on the same page I hope.

The above is the best way to test when you want a yes/no answer. Or use some kind of minimum-area test, if you are okay with a not "perfect" result. What you do in this case is up to you; this is arbitrary, you realize. Regression seems okay here, as others mentioned.

This is an example for "not perfect", which you already know:
https://cran.r-project.org/web/packages/olsrr/vignettes/regression_diagnostics.html

Tolerance test of multi-collinearity -- what you are asking about i.e., "more points":
https://www.statisticshowto.com/tolerance-level-statistics/

Dale · May 16, 2021

kelly0303 said:

But in order to estimate c, I would need to know the functional form of the non-linearity.

Not really. You can always do a series expansion and approximate your nonlinearity as a polynomial. You only need to know the functional form if you want to make accurate predictions. But if you only want to detect nonlinearity a polynomial is fine.

kelly0303 said:

Basically I want to quantify how far the points are from being on a straight line. I decided to use this area as a quantifier, but I am totally open to suggestions for better ways to do it.

I suggest least squares regression to a polynomial.

kelly0303 said:

My question is simply, if I am able to measure a 4th point on that plot, how would that help me (I am sure it would, as I would gain more data, but I am not sure mathematically how is the error on the area reduced by adding one more point)?

With one more point you could fit a third order polynomial.

Stephen Tashi · May 16, 2021

kelly0303 said:

in figure S2. In the experiments so far, people used to measure 3 points and get something like in figure S2.

On the page following that figure, the paper says:

Our procedure above applies to cases with enough experimental data. For systems lacking (sufficiently precise) measurements, we can still derive projections provided that an acceptable estimation of the ##F_{21}## constant is available from either theory calculation or hyperfine splitting data (whenever available).

So I think the three points in figure S2 are not necessarily 3 single measurements; instead, each of those points may be the mean value of many measurements.

kelly0303 · May 16, 2021

Dale said:

Not really. You can always do a series expansion and approximate your nonlinearity as a polynomial. You only need to know the functional form if you want to make accurate predictions. But if you only want to detect nonlinearity a polynomial is fine.

I suggest least squares regression to a polynomial.

With one more point you could fit a third order polynomial.

I am not sure I understand. I do want to set very accurate bounds on the non-linearity. Basically I want to describe my points by ##y=ax+b+g(x)##, with ##g(x) \ll ax,b##. From there I want to set constraints as tight as possible on ##g(x)##. If I use a polynomial, won't that influence how tight the constraints are? On a more practical aspect, in all the papers on this topic they use this area method, so I assume that if polynomials were to work they would have used them. But given that they use areas in the literature, I would still like to find out the answer to my question in the case of using areas to define non-linearity.

kelly0303 · May 16, 2021

jim mcnamara said:

You mentioned area. Area==zero. That is how to test for collinearity of points:
https://www.geeksforgeeks.org/program-check-three-points-collinear/

You can also use the distance test, if that makes any difference to you.

Now we are on the same page I hope.

The above is the best way to test when you want a yes/no answer. Or use some kind of minimum-area test, if you are okay with a not "perfect" result. What you do in this case is up to you; this is arbitrary, you realize. Regression seems okay here, as others mentioned.

This is an example for "not perfect", which you already know:
https://cran.r-project.org/web/packages/olsrr/vignettes/regression_diagnostics.html

Tolerance test of multi-collinearity -- what you are asking about i.e., "more points":
https://www.statisticshowto.com/tolerance-level-statistics/

I am not sure I understand what you mean. Of course area=0 for collinear points. But in practice they won't be on a straight line, as we have experimental errors. So the area will be of the form ##3 \pm 5##, which is not zero, but it is consistent with zero within the error. My question is, if I add one more point and calculate the area formed by these 4 points, what do I gain compared to the case of having only 3 points? Sending me links to statistics webpages doesn't help me. I know the basics, I just don't know how to apply them to my problem.

kelly0303 · May 16, 2021

Stephen Tashi said:

On the page following that figure, the paper says: So I think the three points in figure S2 are not necessarily 3 single measurements; instead, each of those points may be the mean value of many measurements.

Oh yes, the points in figure S2 are the results of many measurements. In the experiment one measures the x and y for a given point several times, then places it on that plot in S2. After measuring 3 such points we quantify the non-linearity by calculating that area. My question is: if I measure a 4th point, with the same uncertainty as the other 3 points, do I get anything in terms of better constraining the non-linearity than with the 3 points alone?

kelly0303 · May 16, 2021

anuttarasammyak said:

I see. And the paper's statement "Equivalently, in our geometrical picture it is the volume of the parallelepiped defined by ##\overrightarrow{m\nu}_{1,2}## and ##\overrightarrow{m\mu}##." confirms my view.

Going back to your point, what would you like to do beyond this triplet of vectors? Make a quartet by incorporating another vector? Get a set of triplets from many experiments?

I would like to add another point to figure S2. In terms of the mathematical description of the problem, the vectors will become 4D (now they are 3D).

kelly0303 · May 16, 2021

Dale said:

Not really. You can always do a series expansion and approximate your nonlinearity as a polynomial. You only need to know the functional form if you want to make accurate predictions. But if you only want to detect nonlinearity a polynomial is fine.

I suggest least squares regression to a polynomial.

With one more point you could fit a third order polynomial.

Just to clarify, you might be right that using this area method is not the best. But whether it is the best or not, if I want to do something in practice and compare it with the literature, I need to use the same method as the one in the literature, which is this area stuff. Even if I want to claim there is a better method, I still need to apply the old method to my problem to actually show that the new method does better by direct comparison. So for now let's assume that I have to use this area method, whether it is the best thing to do or not. So my question is, if I quantify the non-linearity using this method for more points, how does that help me set better constraints on the non-linearity than using only 3 points? Thank you!

Dale · May 16, 2021

kelly0303 said:

On a more practical aspect, in all the papers on this topic they use this area method, so I assume that if polynomials were to work they would have used them. But given that they use areas in the literature, I would still like to find out the answer to my question in the case of using areas to define non-linearity.

I did read the paper you posted earlier. The physics is far outside of my area of expertise. But from a statistical perspective the area approach makes no sense to me.

In general there is nothing particularly superior about modeling ##g(x)## as a piecewise linear function rather than as a polynomial with no zero or first order terms. The statistical methods for polynomials are very well studied and optimized over many decades, so personally I would prefer those. In my field we use the polynomial approach as the standard method to characterize non linearity.

However, I understand the value of using the same strategy that has been previously used in the relevant literature. Unfortunately, I don’t know that there is a good way to generalize this niche approach to additional points. With this approach there may be no advantage to additional points (a feature that should call into question the approach)

Stephen Tashi · May 16, 2021

kelly0303 said:

I would like to add another point to figure S2. In terms of the mathematical description of the problem, the vectors will become 4D (now they are 3D).

Figure S2 https://arxiv.org/pdf/1704.05068.pdf is a plot where each point corresponds to data for one isotope pair taken from 3 distinct isotopes? How do you propose to add another point to that situation?

kelly0303 · May 16, 2021

Dale said:

I did read the paper you posted earlier. The physics is far outside of my area of expertise. But from a statistical perspective the area approach makes no sense to me.

In general there is nothing particularly superior about modeling ##g(x)## as a piecewise linear function rather than as a polynomial with no zero or first order terms. The statistical methods for polynomials are very well studied and optimized over many decades, so personally I would prefer those. In my field we use the polynomial approach as the standard method to characterize non linearity.

However, I understand the value of using the same strategy that has been previously used in the relevant literature. Unfortunately, I don’t know that there is a good way to generalize this niche approach to additional points. With this approach there may be no advantage to additional points (a feature that should call into question the approach)

Thank you for this. So if I am to use the polynomial approach, could you please give me a bit more details (or point me towards some readings about that)?

kelly0303 · May 16, 2021

Stephen Tashi said:

Figure S2 https://arxiv.org/pdf/1704.05068.pdf is a plot where each point corresponds to data for one isotope pair taken from 3 distinct isotopes? How do you propose to add another point to that situation?

Yes, each point on x and y corresponds to a transition in a given isotope pair, so you measure 4 isotopes for that. My question was for the case in which you measure a 5th isotope. In that case on both axes you would add one more point, which would be ##m\nu_i^{AA_4}##. In principle, if you are able to measure the 2 transitions in more isotopes you can add as many points as you want (you keep the reference isotope the same all the time, usually the one with the smallest uncertainties, labeled just as ##A## in that plot).

Dale · May 16, 2021

kelly0303 said:

Thank you for this. So if I am to use the polynomial approach, could you please give me a bit more details (or point me towards some readings about that)?

Sure. The basic idea is that you model your data as ##y_i= b_0 x_i^0 + b_1 x_i^1 + b_2 x_i^2 + ... + \epsilon_i## where ##\epsilon_i \sim N(0,\sigma)## and the ##b_j## are the coefficients to be fit by least squares. Note that even though the ##x_i^j## terms are non-linear in ##x_i## for ##j\ge 2##, the fit is still an ordinary least squares linear fit because the model is linear in the ##b_j##. So any typical ordinary least squares package will be able to fit this model.

Many fit packages will also be able to test for significance of the ##b_j## and give you both an estimate and a confidence interval for each. And if you need the area then you can simply evaluate the area under the polynomial to whatever order you wish and subtract the area under the first order polynomial.
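A minimal sketch of such a fit with NumPy, under simplifying assumptions that go beyond the post: the same known Gaussian error `sigma_y` on every ##y_i##, negligible error on ##x_i##, and a polynomial truncated at second order. The function name `fit_quadratic` and the data are hypothetical.

```python
import numpy as np

def fit_quadratic(x, y, sigma_y):
    """Least-squares fit of y = b0 + b1*x + b2*x**2.

    Assumes equal, known errors sigma_y on y and no error on x.
    Returns the coefficients (b0, b1, b2) and their 1-sigma uncertainties.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.vander(x, 3, increasing=True)            # columns: 1, x, x^2
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)  # ordinary least squares
    cov = sigma_y ** 2 * np.linalg.inv(X.T @ X)     # covariance of the b_j
    return coeffs, np.sqrt(np.diag(cov))

# Hypothetical data lying exactly on the line y = 1 + 2x, with sigma_y = 1.
x3 = np.linspace(0.0, 1.0, 3)
x10 = np.linspace(0.0, 1.0, 10)
b3, err3 = fit_quadratic(x3, 1 + 2 * x3, sigma_y=1.0)
b10, err10 = fit_quadratic(x10, 1 + 2 * x10, sigma_y=1.0)
print("3 points:  b2 = %.3f +/- %.3f" % (b3[2], err3[2]))
print("10 points: b2 = %.3f +/- %.3f" % (b10[2], err10[2]))
```

Over the same x range, the uncertainty on ##b_2## comes out smaller for 10 points than for 3, which is the sense in which extra points tighten a bound on the quadratic term. Errors on x would need a different treatment (for example an errors-in-variables fit), which this sketch does not attempt.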

Twigg · May 16, 2021

Dale said:

But from a statistical perspective the area approach makes no sense to me.

I hear you, Dale. I'm going to go out on a limb and guess that this method was chosen because it has a simple graphical interpretation.

@kelly0303 I can think of a way to generate a similar measure for N isotope pairs, but it's definitely something I pulled out of a hat. So take this suggestion with a suitably sized grain of salt. It's also a fair bit of work, unfortunately.

If you have N data points (isotope pairs), ##\vec{x}_i = (x_i, y_i)##, you can define the area of the triangle made by 3 adjacent points: $$A_i = \frac{1}{2} ||(\vec{x}_{i+1} - \vec{x}_i) \times (\vec{x}_{i+2} - \vec{x}_{i+1})||$$
(To see why this is the area, note that ##(\vec{x}_{i+1} - \vec{x}_i)## is one side of the triangle, ##(\vec{x}_{i+2} - \vec{x}_{i+1})## is another side of the triangle, and their cross product has magnitude equal to the area of the parallelogram they span. So half the area of the parallelogram is the area of the triangle.)

This gives you N-2 individual triangle areas, which you can then average: $$\langle A \rangle = \frac{1}{N-2} \sum^{N-2}_{i=1} A_i$$

The tricky part about doing this is the error propagation, because the ##A_{i}##'s depend on shared ##\vec{x}_i##'s. Because of this, I would suggest using Mathematica to expand ##\langle A \rangle## as a first-order series in small displacements in ##\vec{x}_i##, call them ##\delta \vec{x}_i##. The coefficients in this series will be the coefficients in the error propagation for ##\langle A \rangle##. Does that make sense, or is that just really confusing?

I couldn't tell you how meaningful ##\langle A \rangle## is, but it will have the same scale as the area for 3 points and it will have a well-defined error-bar. I think so long as there isn't an inflection point in the nonlinear curve, it should be fine?

Edit: forgot a normalization in the definition of ##\langle A \rangle##
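The first-order expansion can also be done numerically. The sketch below swaps in two choices that are not in the post: finite differences in NumPy instead of a symbolic expansion in Mathematica, and signed triangle areas instead of the norm, so that the derivative is well defined when the points are collinear. Errors on all coordinates are assumed independent, and the function names are hypothetical.

```python
import numpy as np

def mean_signed_area(points):
    """Average signed area of the triangles made by adjacent triplets."""
    p = np.asarray(points, dtype=float)
    u = p[1:-1] - p[:-2]   # x_{i+1} - x_i
    v = p[2:] - p[1:-1]    # x_{i+2} - x_{i+1}
    areas = 0.5 * (u[:, 0] * v[:, 1] - u[:, 1] * v[:, 0])
    return areas.mean()

def propagated_error(points, sigmas, step=1e-6):
    """First-order error on the mean area, by central finite differences."""
    p = np.asarray(points, dtype=float)
    s = np.asarray(sigmas, dtype=float)
    variance = 0.0
    for idx in np.ndindex(p.shape):
        up, down = p.copy(), p.copy()
        up[idx] += step
        down[idx] -= step
        deriv = (mean_signed_area(up) - mean_signed_area(down)) / (2 * step)
        variance += (deriv * s[idx]) ** 2
    return float(np.sqrt(variance))

# Hypothetical collinear, equally spaced points, each coordinate known to 0.1.
pts4 = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
err3 = propagated_error(pts4[:3], [(0.1, 0.1)] * 3)
err4 = propagated_error(pts4, [(0.1, 0.1)] * 4)
print(f"3 points: {err3:.3f}, 4 points: {err4:.3f}")
```

For this particular example the propagated error is about 0.173 for the first three points and about 0.071 for all four, because the shared points make the two triangle areas anticorrelated. Other spacings or error models can behave differently.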

anuttarasammyak · May 16, 2021

kelly0303 said:

I would like to add another point to figure S2. In terms of the mathematical description of the problem, the vectors will become 4D (now they are 3D).

In the paper PHYSICAL REVIEW RESEARCH 2, 043444 (2020)
https://journals.aps.org/prresearch/pdf/10.1103/PhysRevResearch.2.043444
equation (12) and Fig. 1, on NL estimation in a multidimensional space, might be of help to you.

kelly0303 · May 16, 2021

anuttarasammyak said:

In the paper PHYSICAL REVIEW RESEARCH 2, 043444 (2020)
https://journals.aps.org/prresearch/pdf/10.1103/PhysRevResearch.2.043444
equation (12) and Fig. 1, on NL estimation in a multidimensional space, might be of help to you.

I came across that paper, but they are actually not doing what I am looking for. I am looking for the case in which you have just 2 transitions but more isotopes (so the system is overdetermined). In the paper you mentioned they have more than 2 transitions, but the system is not overdetermined.

anuttarasammyak · May 16, 2021

Let us say we repeat the 2-transition experiment on the isotopes.
We get:
1st experiment data plot graph DP_1 and NL estimation NL_1
2nd experiment data plot graph DP_2 and NL estimation NL_2
3rd experiment data plot graph DP_3 and NL estimation NL_3
----
n th experiment data plot graph DP_n and NL estimation NL_n
----

Is this the right story we are dealing with?

EDIT
To be clearer I add:
We keep our eyes on the specific level transition during the experiment. We do our best to repeat the experiments in the same manner and under the same conditions. The NL_n's are numerical values of the well-defined physical quantity NL, whose level transition is defined and shared by all the experiments. In principle we can make n as large as we wish.
