Parabolic curve fit to x,y data

In summary: I'm using the notation of the paper I referenced in my first edit.1. Construct a n x 6 matrix A of the form:x^2, xy, y^2, x, y, 1x^2, xy, y^2, x, y, 1x^2, xy, y^2, x, y, 1...x^2, xy, y^2, x, y, 1where x and y are the x and y coordinates of each data point.2. Construct a n x 1 vector b of the form
  • #1
jkristoff
3
0

Homework Statement



Given a series of measured data points (>1000) x,y find the best fit parabolic curve where the constant A (below) is given.

Homework Equations



General 2nd deg equation describes conic sections:
Ax2+Bxy+Cy2+Dx+Ey+F=0

for a parabola B^2=4AC, AND in this case B<>0 (ie: the parabola is rotated)


The Attempt at a Solution



I've looked at orthogonal distance regression algorithms like ODRPACK, but I don't know where to begin with the inputs. What makes this more challenging than simply doing a trendline in Excel is that the parabola is rotated (B<>0) and I need to be able to constrain the first term constant A to a known value. To minimize the orthogonal distances while knowing A, the best fit parabola needs to be rotated and translated.
Please help!
JK
 
Physics news on Phys.org
  • #2
Intriguing problem, I've been pondering it for a couple of days. Sorry to say I don’t necessarily have answers but here are some thoughts that might stimulate discussion.

- Not sure why orthogonal distance regression is needed. Ordinary least squares should be able to be used on a problem like this. Nothing wrong with using ODR but the motivation to do so isn't obvious.

- I tried ODRPACK once but couldn’t figure it out either. Didn’t try very hard though. Solving problems using first principles is much more fun and educational (if you have the interest and time) than using canned programs.

- This could be viewed as a linear least squares problem with a single nonlinear constraint (i.e. B^2 = 4AC)

- Developing the governing equations for least squares with equality constraints is actually quite straightforward (even for parametric equations).

- One possible approach (since A is fixed) is to say Bxy+Cy2+Dx+Ey+F = -Ax2 and just treat the set of linear equations as a black box least squares problem (i.e. you have a known matrix A with more rows than columns, a known vector b, and an unknown vector x) and solve for the coefficients, though it isn’t clear to me what that solution actually represents (i.e. what is really being minimized). Seems to me the constraint could also be included.

- To me the most fundamental approach is to represent the problem parametrically (i.e. x = f(t) and y=g(t)) but it isn’t clear (to me) how to represent a general conic parametrically and how the coefficients of the parametric equations relate to A,B,C,D,E,F.

- Another approach that might bear fruit (you alluded to it) is to take a normal parabola and apply a rotation and translation. You'd want to be careful to understand what the resulting least squares problem really represents.

- Might be helpful to know what kind of class this is for and how much effort is expected. Is this suppose to be a 1hr homework problem or a two week project?

-------------- EDIT ---------------------------------------------------------------

Did a little research.

http://research.microsoft.com/en-us/um/people/awf/ellipse/ellipse-pami.pdf

Turns out that given a general conic f = Ax2+Bxy+Cy2+Dx+Ey+F= 0, minimizing the equation [itex]\sum f^2[/itex] is considered minimizing the squared 'algebraic' distances from the points to the curve. By including the equality constraint B2-4AC = 0 via Lagrange Multiplier, the problem can be solved in a straightforward fashion as a nonlinear least squares problem subject to a single equality constraint.

I made up a problem by rotating a parabola (with 1000 pts), added random noise to the points, set A=1, and solved for B, C, D, E, and F. Convergence was achieved in maybe a dozen iterations. All done wtih Excel using matrix functions (not very sexy, but gets the job done most of the time)
 
Last edited:
  • #3
After a bit more digging there appear to be two common approaches to least squares fitting of general conics:

1. Minimizing the sum of 'algebraic' distances as described in the edit of my previous post (though I still don't understand what an algebraic distance really is). In this specific problem a nonlinear constraint B2-4AC=0 needs to be included to guarantee that the solution is a parabola. I tried this method on a made up problem and it worked fine.

2. Minimizing the sum of 'geometric' distances, which appear to me to be orthogonal distances. The technique is to fit an ordinary parabola to rotated (and potentially translated) data points. Orthogonal least squares makes sense because the orthogonal distances are preserved in the 'rotated' space, whereas vertical distances (i.e. ordinary least squares) in the normal space make little sense in the rotated space. The problem is nonlinear because the angle is unknown and needs to be determined. I also tried this method on a made up problem and it worked fine (though I still don't know how to analytically determine the coefficients of the general conic using the angle and coefficients of the normal parabola).

Though I couldn't find any references in internet searches, there is I believe a third valid option. Minimize the distances in the normal space that are neither vertical nor orthogonal but will result in vertical distances in the rotated space (essentially an ordinary least squares solution in the rotated space). I think this could be done by adding constraints. Food for thought.
 
Last edited:
  • #4
Thanks for your help! I am trying your first-reply suggestion using matrices in excel, but having trouble so far. How did you include the equality constraint?
Regarding the second-post approach#2, you can rotate a normal parabola (B=0) symmetrical about the y-axis centered at the origin using COT(2p)=(a-c)/b
where a,b and c are coefficients of the rotated parabola (0<p<PI).
Can you explain a bit further how you set up the made up problem for the orthogonal distance approach?
Thanks,
JK
 
  • #5
jkristoff said:
I am trying your first-reply suggestion using matrices in excel, but having trouble so far. How did you include the equality constraint?

The answer depends on how much you know about least squares. If you know how to derive the 'normal' equations associated with linear least squares, adding equality constraints is just an extension by using Langrange Multipliers to alter the equation being minimized. Your case is more complicated because the constraint is nonlinear. Thus, the solution is iterative by guessing at the initial values of the unknowns, solving a 'linearlized' set of equations (including the linearized constraint equations), updating the values of the unknowns, and iterating until desired convergence is achieved.

The attachments shows the linear and nonlinear situations.
 

Attachments

  • CLS.jpg
    CLS.jpg
    12.6 KB · Views: 859
  • NLCLS.jpg
    NLCLS.jpg
    20.7 KB · Views: 704
Last edited:
  • #6
jkristoff said:
Can you explain a bit further how you set up the made up problem for the orthogonal distance approach?

Orthogonal Least Squares is inherently nonlinear because the values of x that result in an orthogonal vector from the curve to the data point is unknown. This particular problem is more complicated in that you are first rotating the data points (call them x' and y') through an unknown angle t (thus t needs to be included as an unknown).

http://en.wikipedia.org/wiki/Rotation_matrix

x* = x'*cos(t) - y'*sin(t) ...data point
y* = x'*sin(t) + y'*cos(t) ...data point

In orthogonal least squares we minimize the sum of the squared distances from the data points to the curve:

min [itex] \sum [/itex] [ (delta y)2 + (delta x)2 ] = min [itex] \sum [/itex] [ f2 + g2 ]

where

f = ax2 + bx + c - y*
g = x - x*

Making substitutions, we get:

f = ax2 + bx + c - x'*sin(t) - y'*cos(t)
g = x - x'*cos(t) + y'*sin(t)

We next linearize f & g (calling the linearized functions f*and g*) by taking the first two terms of the Taylor series:

[tex]f* = f_0 + (\partial f / \partial a) \Delta a + (\partial f / \partial b) \Delta b + (\partial f / \partial c) \Delta c + (\partial f / \partial x) \Delta x + (\partial f / \partial t) \Delta t [/tex]

[tex]g* = g_0 + (\partial g / \partial x) \Delta x + (\partial g / \partial t) \Delta t [/tex]

resulting in the linearized least squares problem

[tex] min \text{ }F = \sum [ (f*)^2 + (g*)^2 ] [/tex]

To minimize F, take partial derivatives of F with respect to each variable (a,b,c,t,xi) and set equal to zero. The math is straightforward but tedius.

If none of this or the previous post is familiar to you, I'd wonder why such a homework problem would be given. Tools are needed to solve problems.
 
Last edited:
  • #7
Thank you very much for your help thus far. This is not to be considered a 1 hour homework problem. I'm involved in an independent study with our student group. We are trying the characterize shapes of liquids measured in rotating containers. The x,y data is generated from a non-contact laser displacement probe. All of what you've mentioned is familiar to me; just not VERY familiar. I'm the old-timer of the group and so the math is a bit rusty at the moment (my old calculus book sits at the bedside table). I've been charged to handle to task of the algorithm because I have the engineering education.
I hope you don't mind that this is not a 1 hour homework problem. The forum guidelines states that questions regarding homework OR independent study may be posted here.
For the moment, I need to catch up with you regarding your last two posts.
Thanks again,
JK
 

FAQ: Parabolic curve fit to x,y data

What is a parabolic curve fit?

A parabolic curve fit is a mathematical method used to find the best-fitting parabola for a set of data points. It involves finding the coefficients that minimize the error between the parabola and the data points.

Why is a parabolic curve fit useful?

A parabolic curve fit is useful because it can help identify trends and patterns in data that follow a parabolic shape. This can be helpful in analyzing and predicting future data points.

How is a parabolic curve fit different from other curve fitting methods?

A parabolic curve fit is specifically designed for data that follows a parabolic shape, whereas other curve fitting methods may be better suited for other types of data. Additionally, a parabolic curve fit involves finding the coefficients of a parabola, while other methods may use different equations or functions.

What factors can affect the accuracy of a parabolic curve fit?

The accuracy of a parabolic curve fit can be affected by the quality and quantity of the data points, as well as any outliers or errors in the data. It is also important to choose the appropriate type of parabolic curve (e.g. quadratic, cubic, etc.) for the data being analyzed.

How can a parabolic curve fit be used in real-world applications?

A parabolic curve fit can be used in a variety of real-world applications, such as predicting stock market trends, analyzing the trajectory of a projectile, and modeling the growth of natural phenomena. It can also be used in engineering and design to optimize the shape of objects for maximum efficiency.

Similar threads

Replies
7
Views
2K
Replies
9
Views
5K
Replies
4
Views
2K
Replies
59
Views
4K
Replies
4
Views
1K
Replies
1
Views
2K
Replies
9
Views
3K
Back
Top