The weighted least squares formula - Quick question

In summary, the weighted least squares formula is a statistical method used to minimize the weighted sum of squared errors in a regression model, assigning a weight to each data point based on its reliability. It is often used when there is heteroscedasticity, or unequal variance, in the data. The formula takes into account both the error and the weight of each data point, resulting in more accurate and precise parameter estimates.
  • #1
binbagsss

$$m = \frac{\sum_i w_i \sum_i w_i x_i y_i - \sum_i w_i x_i \sum_i w_i y_i}{\sum_i w_i \sum_i w_i x_i^2 - \left(\sum_i w_i x_i\right)^2}$$

In this formula, does wᵢ mean to include the weighting of all points, i.e. both the x and y points throughout?

Thanks a lot!
 
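For concreteness, here is a minimal numerical sketch of the slope formula above. The data and weights are hypothetical and NumPy is assumed; each wᵢ is attached to the whole datapoint (xᵢ, yᵢ), as the replies below confirm:

```python
import numpy as np

# Hypothetical data: n points and n weights; w[i] applies to the
# whole datapoint (x[i], y[i]), not to x and y separately.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.0])
w = np.array([1.0, 0.5, 2.0, 1.0])

# m = (Σw·Σwxy − Σwx·Σwy) / (Σw·Σwx² − (Σwx)²)
num = w.sum() * (w * x * y).sum() - (w * x).sum() * (w * y).sum()
den = w.sum() * (w * x**2).sum() - (w * x).sum() ** 2
m = num / den
print(m)
```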
  • #3


Thanks for the reply. OK, so when it's e.g. Ʃwᵢxᵢ, this means the product of each x value and its weighting, summed up.

So how would you include the y weightings?
 
  • #4


binbagsss said:
Thanks for the reply. OK, so when it's e.g. Ʃwᵢxᵢ, this means the product of each x value and its weighting, summed up.

So how would you include the y weightings?

I don't understand what it is that you don't understand. The formula you posted in the OP showed how to include weightings on both x and y, and so (in a different format) does the link I posted. They're not separate weightings. Each weight is associated with a datapoint, both x and y.
 
  • #5


haruspex said:
I don't understand what it is that you don't understand. The formula you posted in the OP showed how to include weightings on both x and y, and so (in a different format) does the link I posted. They're not separate weightings. Each weight is associated with a datapoint, both x and y.

Sorry, perhaps my original question was not clear enough. I mean: for terms such as Ʃwᵢxᵢ and Ʃwᵢyᵢ, is it just the corresponding weightings you include, so for the first term only weightings on the x data points, and for the second term only weightings on the y data points?

But then for the first term Ʃwᵢ, does this just mean summing the total weightings associated with both the x and y data points?

Thanks for your help.
 
  • #6


If there are n datapoints, there are n weights. Weight w₁ is used for both x₁ and y₁. Ʃwᵢ is the sum of the n weights.
 
  • #7


haruspex said:
If there are n datapoints, there are n weights. Weight w₁ is used for both x₁ and y₁. Ʃwᵢ is the sum of the n weights.

Such that x₁ and y₁ take the same weighting?

Basically, for my data all the x values have the same weighting, and all the y values have the same weighting, not equal to that of the x values (where the weighting for each set was deduced by approximating the uncertainty as half a division of the instrument used).
 
  • #8


binbagsss said:
Basically, for my data all the x values have the same weighting, and all the y values have the same weighting
I doubt weighting in that way will have any effect.
 
  • #9


What the weighting does is effectively multiply the number of occurrences of each sample point, including the ability to multiply by a non-integer: a weight factor of 1.5 means that a sample point is treated as 1.5 sample points in the formula, while a weight factor of 0.3 means that a sample point is treated as 0.3 sample points. The "total" number of weighted sample points is the sum of all the weighting factors.

If you want to use the same formula for non-weighted points, just set all the weighting factors to 1.0.
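As a quick check of both claims, here is a sketch with hypothetical data (NumPy assumed): setting every weight to 1.0 reproduces the ordinary least squares slope, and a weight of 2 on one point gives the same slope as listing that point twice with unit weight.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.0])

def wls_slope(x, y, w):
    # m = (Σw·Σwxy − Σwx·Σwy) / (Σw·Σwx² − (Σwx)²)
    num = w.sum() * (w * x * y).sum() - (w * x).sum() * (w * y).sum()
    den = w.sum() * (w * x**2).sum() - (w * x).sum() ** 2
    return num / den

# All weights 1.0: the weighted formula collapses to ordinary least squares.
print(wls_slope(x, y, np.ones_like(x)))
print(np.polyfit(x, y, 1)[0])  # OLS slope, for comparison; should match

# Weight 2 on the second point behaves like including that point twice.
w = np.array([1.0, 2.0, 1.0, 1.0])
x2 = np.array([1.0, 2.0, 2.0, 3.0, 4.0])
y2 = np.array([2.1, 3.9, 3.9, 6.2, 8.0])
print(wls_slope(x, y, w), wls_slope(x2, y2, np.ones_like(x2)))  # equal
```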
 
  • #10


rcgldr said:
What the weighting does is effectively multiply the number of occurrences of each sample point, including the ability to multiply by a non-integer: a weight factor of 1.5 means that a sample point is treated as 1.5 sample points in the formula, while a weight factor of 0.3 means that a sample point is treated as 0.3 sample points. The "total" number of weighted sample points is the sum of all the weighting factors.

If you want to use the same formula for non-weighted points, just set all the weighting factors to 1.0.
binbagsss is proposing to apply just two weights, one to all the x terms and another to all the y terms. I think this is based on the knowledge that one set of values is known more accurately than the other. It's not apparent what one would do with the Ʃwᵢ term in that case, but even if that could be sensibly handled, I believe such a weighting would have no effect, or if it had an effect, not a desirable one. Can you corroborate that?
 
  • #11


To be more clear: for this assignment we should deduce the gradient via both ordinary least squares regression and weighted least squares regression, and then the idea is to compare the values.

The trouble I'm having is with the weighted one and what to use as the weighting in this case, as both the x and y values of each data point have an associated error. I assume that the uncertainties of the x and y should somehow be combined, and then this combined uncertainty used to deduce the weighting. One way that I can think of to do this is via a functional approach; however, I'm not sure whether you are meant to make any assumptions about the function for a weighted least squares regression.

(I would just favour either the x or the y uncertainties, neglecting the other; however, the errors seem relatively comparable.)

Thanks again.
 
  • #12


If the x and y uncertainties are fixed instead of relative to the magnitude of x and y, you could base the weighting factors on the ratio of (magnitude of a data point) / (magnitude of uncertainty of that data point). If the uncertainties are relative to the magnitude of x and y, I don't see any point in using weighting factors.
 
  • #13


As I understand it now, there is a known error range in both x and y values, and these can be different at each datapoint. The larger the error range, the smaller the weight to be allocated. Since only one weight can be assigned to each datapoint, it must represent both the x and y uncertainties. A root-mean-square combination sounds reasonable.
rcgldr said:
If the x and y uncertainties are fixed instead of relative to the magnitude of x and y, you could base the weighting factors on the ratio of (magnitude of a data point) / (magnitude of uncertainty of that data point). If the uncertainties are relative to the magnitude of x and y, I don't see any point in using weighting factors.
I reach the opposite conclusion. The weights should be related to the absolute uncertainties, not relative uncertainties.
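One plausible reading of the two posts above, sketched under stated assumptions (the quadrature combination and the inverse-variance rule are my choices here, not something the thread fixes): combine each point's absolute x and y uncertainties in quadrature, then take the weight as the inverse square of the combined uncertainty. The numbers are made up for illustration.

```python
import numpy as np

# Hypothetical per-point uncertainties in x and y (absolute, not relative).
sigma_x = np.array([0.05, 0.05, 0.10, 0.05])
sigma_y = np.array([0.10, 0.20, 0.10, 0.10])

# Combine in quadrature into one uncertainty per datapoint;
# a larger error range then yields a smaller weight.
sigma = np.sqrt(sigma_x**2 + sigma_y**2)
w = 1.0 / sigma**2
print(w)
```

These weights would then be fed directly into the slope formula from the original post.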
 
  • #14


rcgldr said:
If the x and y uncertainties are fixed instead of relative to the magnitude of x and y, you could base the weighting factors on the ratio of (magnitude of a data point) / (magnitude of uncertainty of that data point).

haruspex said:
I reach the opposite conclusion. The weights should be related to the absolute uncertainties, not relative uncertainties.

The main thing is that the weights should be related to the inverse (1/x) of the uncertainties. Note that the ratios I stated were the inverses of the relative uncertainties. An error of ± .001 has more effect on a data value of .01 than it does on a data value of 100, but I'm not sure if or how this should be taken into account.
 
  • #15


rcgldr said:
An error of ± .001 has more effect on a data value of .01 than it does on a data value of 100, but I'm not sure if or how this should be taken into account.
Suppose the plan is to fit some curve to the data and then use this as the basis for predicting y values given x values in the future. If the cost function of an error in that prediction is related to the relative error (error/value), then you want the curve to be particularly accurate at small values. This means you want to give data in that part of the curve high credence and stop the curve becoming distorted by errors elsewhere in the data. This argues against basing the weighting on relative error.
 
  • #16


haruspex said:
As I understand it now, there is a known error range in both x and y values, and these can be different at each datapoint. The larger the error range, the smaller the weight to be allocated. Since only one weight can be assigned to each datapoint, it must represent both the x and y uncertainties. A root-mean-square combination sounds reasonable.

Yes, that's the problem. Thanks for the suggestion of the root-mean-square combination. I assume this would be preferable to a functional approach to error propagation, as it does not assume anything about the function?

Thanks again.
 

FAQ: The weighted least squares formula - Quick question

1. What is the weighted least squares formula?

The weighted least squares formula is a statistical method used to minimize the sum of squared errors in a regression model. It takes into account the variability or weight of each data point, giving more weight to data points with lower variability and less weight to those with higher variability.

2. When is the weighted least squares formula used?

The weighted least squares formula is commonly used in regression analysis when the data points have unequal variances or when there is heteroscedasticity present in the data. It is also used in cases where there is a large amount of noise in the data.

3. How is the weighted least squares formula different from ordinary least squares?

The ordinary least squares method assumes that all data points have equal variance, while weighted least squares takes into account the variability of each data point. This makes weighted least squares more robust and accurate in situations where there is unequal variance in the data.

4. What are the advantages of using the weighted least squares formula?

The weighted least squares formula is advantageous because it gives more weight to data points with lower variability, which leads to a more accurate regression model. It also accounts for heteroscedasticity, a common issue in real-world data, and provides more reliable estimates of the regression parameters.

5. How is the weight for each data point determined in the weighted least squares formula?

The weight for each data point is typically determined by the inverse of the variance of that data point. This means that data points with lower variability will have higher weights, while those with higher variability will have lower weights. In some cases, the weights can also be determined through expert judgment or by using a specific weighting function.
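A minimal illustration of the inverse-variance choice (hypothetical variances, NumPy assumed):

```python
import numpy as np

# Inverse-variance weighting: lower variance -> higher weight.
variances = np.array([0.01, 0.04, 0.01, 0.09])
weights = 1.0 / variances
print(weights)  # [100., 25., 100., ~11.1]
```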
