Data Fitting with Bezier curve

In summary, the conversation discussed fitting a cubic Bezier curve to a given set of data points using parametric equations with 8 unknown coefficients determined by 4 control points. The difficulty lies in finding the right value of t for each data point; Newton-Raphson works well for this, after which the control points can be found by least squares. One approach for updating the control points treats the total square error as a function of the control points and applies a numerical gradient descent; another formulates a constrained least squares problem solved with Lagrange multipliers. The dead link mentioned may have referred to a file on Matlab Central.
  • #1
hotvette
Homework Helper
Does anyone know how to fit a cubic Bezier curve to a given set of data points? If so, I'd appreciate some coaching on the methodology.

Bezier curves have separate equations for x and y in a parametric variable t that varies from 0 to 1:

[itex]x = at^3 + bt^2 + ct + d[/itex]

[itex]y = et^3 + ft^2 + gt + h[/itex]

The 8 unknowns are functions of the 4 control points:

xc = (x0, x1, x2, x3)
yc = (y0, y1, y2, y3)

Thus, for a given set of values for the 4 control points, all 8 coefficients are determined and the x,y values of the Bezier curve can be calculated.

But, the equations are de-coupled in the sense that a,b,c,d depend ONLY on the x-values of the 4 control points (xc) and e,f,g,h depend ONLY on the y-values of the 4 control points (yc).
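For concreteness, here is a minimal Python sketch (my own illustration, not from the thread) of how the 8 coefficients follow from the control points by expanding the Bernstein form [itex]B(t) = (1-t)^3 P_0 + 3(1-t)^2 t P_1 + 3(1-t) t^2 P_2 + t^3 P_3[/itex]; the same expansion applies independently to x and y, which is exactly the de-coupling described above.

[code]
import numpy as np

def coeffs_from_controls(p0, p1, p2, p3):
    """Expand a cubic Bezier coordinate (Bernstein form) into power-basis
    coefficients a, b, c, d so that value(t) = a*t^3 + b*t^2 + c*t + d."""
    a = -p0 + 3*p1 - 3*p2 + p3
    b = 3*p0 - 6*p1 + 3*p2
    c = -3*p0 + 3*p1
    d = p0
    return a, b, c, d

def bezier_xy(xc, yc, t):
    """Evaluate the curve at parameter values t (scalar or array) given the
    control-point x- and y-coordinates xc = (x0..x3), yc = (y0..y3)."""
    t = np.asarray(t, dtype=float)
    ax, bx, cx, dx = coeffs_from_controls(*xc)
    ay, by, cy, dy = coeffs_from_controls(*yc)
    x = ((ax*t + bx)*t + cx)*t + dx
    y = ((ay*t + by)*t + cy)*t + dy
    return x, y
[/code]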

I've been trying (in vain so far) to set up the curve fitting as a least squares problem. Part of the problem is the difficulty of finding the right value of t for a given x of a data point being fitted, since x is cubic in t. One approach is to solve the cubic equation, but I've found that Newton-Raphson works quite well.
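As an illustration of that Newton-Raphson step (a sketch under my own assumptions, not the poster's actual code): given the x-coefficients a, b, c, d and a data abscissa x_i, iterate on x(t) - x_i = 0, assuming x(t) is monotone over the data range so the root in [0,1] is unique.

[code]
def t_from_x(a, b, c, d, xi, t0=0.5, tol=1e-10, max_iter=50):
    """Solve a*t^3 + b*t^2 + c*t + d = xi for t in [0, 1] by Newton-Raphson."""
    t = t0
    for _ in range(max_iter):
        f = ((a*t + b)*t + c)*t + d - xi     # residual x(t) - xi
        fp = (3*a*t + 2*b)*t + c             # derivative x'(t)
        if fp == 0:
            break
        t_new = min(max(t - f/fp, 0.0), 1.0) # keep the iterate in [0, 1]
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t
[/code]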

It turns out that solving for the y-values of the 4 control points (yc) is a straightforward linear least squares problem because the equations are linear in yc. Thus, for an initial guess of the x-values of the control points (xc) the values of t for each data point can be determined (e.g. by Newton-Raphson), and in a single least squares calculation, the y-values of the control points (yc) can be determined to minimize the square error in vertical distance between the data points and the calculated curve.
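A minimal sketch of that linear least squares step (my own illustration, assuming the t values for the data points have already been computed): the curve is linear in the control-point ordinates through the Bernstein basis, so yc follows from a single least squares solve.

[code]
import numpy as np

def fit_yc(ts, ys):
    """Given parameter values ts and data ordinates ys, solve for the
    y-coordinates of the 4 control points in the least squares sense."""
    ts = np.asarray(ts, dtype=float)
    # Bernstein basis matrix: row i is [B0(t_i), B1(t_i), B2(t_i), B3(t_i)]
    A = np.column_stack([(1 - ts)**3,
                         3*(1 - ts)**2 * ts,
                         3*(1 - ts) * ts**2,
                         ts**3])
    yc, *_ = np.linalg.lstsq(A, np.asarray(ys, dtype=float), rcond=None)
    return yc
[/code]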

The problem is how to get updated x-values of the control points (xc). :frown:

I've searched the internet extensively and, though I have found references to least squares solutions for Bezier curves, I've never found enough detail to figure out how to do it. :confused:

Any/all hints/tips would be appreciated. Perhaps an approach other than least squares is better. Thanks!
 
  • #2
Last edited by a moderator:
  • #3
That link you posted doesn't exist anymore. Does anyone know where to find that code now?
 
  • #4
Hello:

I'm not sure if you still have interest in this. I was looking into this problem and found this thread. Now that I've made up my mind on the subject, I thought I would share it here.

As stated, the problem is to find the t_i values for the input data points, so that the LS formula can be used to compute the control points.

Two alternatives for initial guesses:

1. Distribute them uniformly in [0,1] according to the number of data points N: 0, 1/N, 2/N, ...
2. Use chord-length parameterization, which seems to involve a polynomial fit up to the i-th point, then measuring the arc length; see Koichi Itoh, "A curve fitting algorithm for character fonts", Electronic Publishing, Vol. 6(3), 195–205 (September 1993). (A small sketch of both options follows this list.)
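Here is that sketch (my own illustration, not from Itoh's paper: the chord-length version below simply uses straight-line distances between consecutive data points, and the uniform version is normalized so the last point maps to t = 1, slightly different from the 0, 1/N, 2/N, ... spacing above):

[code]
import numpy as np

def uniform_params(n):
    """Option 1: n parameter values spread evenly over [0, 1]."""
    return np.arange(n) / max(n - 1, 1)

def chord_length_params(xs, ys):
    """Option 2 (simplified): cumulative chord length between consecutive
    data points, normalized to [0, 1]."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    seg = np.hypot(np.diff(xs), np.diff(ys))       # straight-line segment lengths
    cum = np.concatenate(([0.0], np.cumsum(seg)))  # cumulative length
    return cum / cum[-1]
[/code]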

Now on how to refine the t_i values.

In Mr Itoh's paper, the suggested strategy for updating t_i is to solve the 5th-degree equation in t_i (e.g., with Newton-Raphson):

[itex][B(t_i)-P_i]\cdot B'(t_i) = 0[/itex]

where B'(t) is the tangent of the Bezier curve B(t). The new t_i values are then used to recompute the LS control points, and the procedure is iterated.
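A hedged sketch of that refinement step (my own illustration; it reuses the coeffs_from_controls helper from the sketch in post #1): Newton-Raphson on g(t) = [B(t) - P_i] · B'(t), whose root is the foot point of P_i on the curve.

[code]
def refine_t(xc, yc, px, py, t0, tol=1e-10, max_iter=50):
    """Newton-Raphson for the foot-point equation [B(t) - P] . B'(t) = 0,
    starting from an initial parameter guess t0 for data point P = (px, py)."""
    ax, bx, cx, dx = coeffs_from_controls(*xc)   # power-basis coefficients of x(t)
    ay, by, cy, dy = coeffs_from_controls(*yc)   # power-basis coefficients of y(t)
    t = t0
    for _ in range(max_iter):
        # curve point and its first and second derivatives at t
        Bx, By = ((ax*t + bx)*t + cx)*t + dx, ((ay*t + by)*t + cy)*t + dy
        dBx, dBy = (3*ax*t + 2*bx)*t + cx, (3*ay*t + 2*by)*t + cy
        d2Bx, d2By = 6*ax*t + 2*bx, 6*ay*t + 2*by
        g = (Bx - px)*dBx + (By - py)*dBy                       # the quintic in t
        gp = dBx**2 + dBy**2 + (Bx - px)*d2Bx + (By - py)*d2By  # its derivative
        if gp == 0:
            break
        t_new = min(max(t - g/gp, 0.0), 1.0)
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t
[/code]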

Or, an alternative approach to obtain refined control points can be used.
This deviates somewhat from the original problem, but it results in a good fit nevertheless.

One can treat the total square error as a function of the control points, and use the control points from the initial guess (obtained with either 1. or 2. above) to start a numerical gradient descent. There exists more than one choice of gradient descent.
For example, a signed stochastic gradient approximation iteration reads:

- Let the current control points be C = (x0, x1, x2, x3, y0, y1, y2, y3).
- Perturb them: from a single random perturbation vector delta, generate the diametrical candidate vectors
Cp = C + mu*delta
Cm = C - mu*delta
where mu > 0 is a step-size parameter.
- Measure the square error for Cp and for Cm. For each data point:
* compute t_i, e.g. by solving the cubic directly or with Newton-Raphson (either coordinate of the data point can be used);
* evaluate B(t_i) and accumulate the total square error.
- Update the control points: if Cp gives the smaller square error, set C = Cp, otherwise set C = Cm, and iterate.
- The iterations continue until the maximum number of iterations is reached or the square error changes by less than a prescribed threshold.

You can also use more than two perturbations, e.g., generate 8 random perturbation vectors and do the plus-minus perturbations for each. Then you need to evaluate the cost function 16 times per iteration, but the convergence may take fewer iterations. (A compact sketch of the two-candidate variant follows.)
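Here is that sketch (my own illustration; sse is a caller-supplied, hypothetical function that returns the total square error of the curve defined by a control-point vector against the data):

[code]
import numpy as np

def signed_perturbation_fit(C0, sse, mu=0.05, max_iter=2000, tol=1e-12, seed=0):
    """Refine the control-point vector C = (x0..x3, y0..y3) by the signed
    perturbation scheme: try C + mu*delta and C - mu*delta, keep the better one.
    `sse` is a caller-supplied error function taking a control-point vector."""
    rng = np.random.default_rng(seed)
    C = np.asarray(C0, dtype=float)
    err = sse(C)
    for _ in range(max_iter):
        delta = rng.standard_normal(C.size)        # one random perturbation vector
        Cp, Cm = C + mu*delta, C - mu*delta        # diametrical candidates
        ep, em = sse(Cp), sse(Cm)
        C_new, err_new = (Cp, ep) if ep < em else (Cm, em)
        if abs(err - err_new) < tol:               # error change below threshold
            return C_new
        C, err = C_new, err_new
    return C
[/code]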

Alternative stochastic methods in the same spirit include the "random walk with direction exploitation" method; see, e.g., S. Rao, Optimization: Theory and Applications.

I've been using the stochastic signed gradient method with good results. The only problem was that I had to supply the starting point manually. From now on, I will use the control points from the LS estimate with uniform allocation as the starting point.

One last thing. I believe that the aforementioned dead link refers to a file on Matlab Central: "cubic Bezier least square fitting".

Thanks for the original post and good luck!



Eduardo
 
  • #5
It took a long time to figure out how to approach it, but I was able to solve the problem (with brief guidance from a statistics professor friend). Let's say that y = f(a,b,t) and x = g(c,d,t) and the data points are (xi,yi). The least squares problem can be stated as the following constrained problem:

min [itex]\sum(f-y_i)^2[/itex] s.t. [itex]g-x_i=0[/itex]

Using the Lagrange multiplier method (one constraint for each data point), the math is straightforward though tedious and results in a set of decoupled equations for the unknown coefficients (a,b,c,d in the above notation) and the t_i. Also, since the t_i are unknown, a nonlinear approach is needed (i.e. linearizing the equations before taking partial derivatives and setting them equal to zero). The Lagrange multipliers conveniently fall out and never have to be computed.
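For the record, here is a sketch of the setup this description suggests (my reading, not a verbatim derivation from the post): with one multiplier [itex]\lambda_i[/itex] per data point, the Lagrangian is

[tex]L = \sum_i \left(f(a,b,t_i) - y_i\right)^2 + \sum_i \lambda_i \left(g(c,d,t_i) - x_i\right)[/tex]

and the normal equations come from setting the partial derivatives of L with respect to the coefficients, the t_i, and the [itex]\lambda_i[/itex] to zero; after linearizing in the unknowns, the [itex]\lambda_i[/itex] can be eliminated from the system, which is presumably why they never need to be computed explicitly.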

I've since convinced myself that orthogonal least squares makes more sense when using parametric equations, since the curves can wrap around (e.g. a general conic such as a rotated ellipse).
 

FAQ: Data Fitting with Bezier curve

What is a Bezier curve?

A Bezier curve is a mathematical curve that is commonly used in computer graphics and engineering. It is defined by a set of control points, which determine the shape and direction of the curve.

How is data fitted with Bezier curves?

Data fitting with Bezier curves involves finding the best-fitting curve that passes through a set of data points. This is achieved by adjusting the control points of the curve to minimize the distance between the curve and the data points.

What are the advantages of using Bezier curves for data fitting?

Bezier curves offer a flexible and intuitive way to fit data as they can be adjusted to closely follow the data points. They also allow for smooth and continuous curves, which can be useful in representing real-world data.

Are there any limitations to using Bezier curves for data fitting?

One limitation of Bezier curves is that they are only able to fit data that can be represented by a single curve. In cases where data has multiple curves or sharp changes in direction, other methods may be more suitable.

How is the accuracy of data fitting with Bezier curves determined?

The accuracy of data fitting with Bezier curves can be measured by calculating the residual error, which is the difference between the actual data points and the points on the fitted curve. A lower residual error indicates a better fit.
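As a minimal worked illustration (not from the thread): if the fitted curve points corresponding to the data are [itex](\hat{x}_i, \hat{y}_i)[/itex], the residual sum of squares is [itex]\sum_i [(x_i-\hat{x}_i)^2 + (y_i-\hat{y}_i)^2][/itex], and a smaller value indicates a better fit.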
