Weighting calculation to convert weather data from 6 stations into one

mdhastings · Jul 3, 2013

Stephen, last try
this pdf from Greene's about the interactions

Stephen Tashi · Jul 4, 2013

Ok, apparently "interaction terms" is standard terminology in the social sciences for "products of variables". It looks like the stuff on this slide show: http://www.google.com/url?sa=t&rct=...HhnYEo&usg=AFQjCNExqJvcPRmlDyn1Gpg9zt_okXJ2DA

I think the basic idea is this:

Suppose we have observed data of the form [itex] ( L[k], x[1][k], x[2][k], \ldots, x[n][k] ) [/itex], where k = 1, 2, ... indexes the samples, n is the number of independent variables, and L is the dependent variable in the model.

Suppose we have a model that predicts L as a sum of unknown constants [itex] C_i [/itex] times known functions [itex] f_i [/itex] of independent variables [itex] f_i [/itex].

[itex] L = \sum_{i=1}^{N_f} C_i f_i(x_1, x_2, \ldots, x_n) [/itex]

If we have enough data, then we can use linear regression to fit this model to the data even if the functions [itex] f_i(x_1, x_2, \ldots, x_n) [/itex] are non-linear, because we can compute the values of the functions on each vector of observed data and treat each [itex] f_i [/itex] as an independent variable (even though it depends on the [itex] x_i [/itex]).
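As a minimal sketch of that idea (not the company model): the basis functions, the synthetic data and all names below are made up for illustration. Each function is evaluated on every observation to form a column, and ordinary least squares then solves for one unknown constant per function, even though one of the functions is non-linear in the inputs.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_coefficients(basis, samples, loads):
    """Least-squares fit of one constant per basis function (normal equations)."""
    rows = [[f(x) for f in basis] for x in samples]  # each f_i becomes a column
    k = len(basis)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * y for r, y in zip(rows, loads)) for i in range(k)]
    return solve_linear_system(ata, atb)

# Hypothetical basis: a constant, temperature, and a temperature-times-sine product.
# Each sample x is (temperature, time angle).
basis = [
    lambda x: 1.0,
    lambda x: x[0],
    lambda x: x[0] * math.sin(x[1]),
]

# Synthetic data generated from known constants so the fit can be checked.
true_c = [5.0, 2.0, -3.0]
samples = [(10.0 + 0.5 * k, 0.3 * k) for k in range(40)]
loads = [sum(c * f(x) for c, f in zip(true_c, basis)) for x in samples]

fitted = fit_coefficients(basis, samples, loads)
```

The fit is linear in the unknown constants only; nothing here would work if an unknown sat inside one of the functions.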

Do you think the person who wrote the company model thought along these lines?

mdhastings · Jul 4, 2013

Stephen Tashi said:

Ok, apparently "interaction terms" is standard terminology in the social sciences for "products of variables". It looks like the stuff on this slide show: http://www.google.com/url?sa=t&rct=...HhnYEo&usg=AFQjCNExqJvcPRmlDyn1Gpg9zt_okXJ2DA

All good Stephen.
I made a correction in the next quote replacing f with x

Stephen Tashi said:

Suppose we have a model that predicts L as a sum of unknown constants [itex] C_i [/itex] times known functions [itex] f_i [/itex] of independent variables [itex] x_i [/itex].

[itex] L = \sum_{i=1}^{N_f} C_i f_i(x_1, x_2, \ldots, x_n) [/itex]

Stephen Tashi said:

If we have enough data, then we can use linear regression to fit this model to the data even if the functions [itex] f_i(x_1, x_2, \ldots, x_n) [/itex] are non-linear, because we can compute the values of the functions on each vector of observed data and treat each [itex] f_i [/itex] as an independent variable (even though it depends on the [itex] x_i [/itex]).

Do you think the person who wrote the company model thought along these lines?

This is why, when you asked whether this was linear at the start, I wanted clarity. There is no documentation available of how he thought, yet no doubt this is what he meant.

Going back to the weighting then, I am not a maths thinker but it seems to me that
[itex] L = \sum_{i=1}^{N_f} C_i f_i(x_1, x_2, \ldots, x_n) [/itex] should be modifiable. That is, drop some of the terms [itex] f_i(x_1, x_2, \ldots, x_n) [/itex] and restrict the weather coefficients to be positive (how?).

pbuk · Jul 4, 2013

mdhastings said:

Thanks MrAnchovy

No I'm saying the artificial weather station made from 6 stations weather goes into the initial equation - the Meteorology weather forecast replaces that data in the prediction function for the load forecast. What would your methodology be to find the weights?

Oh, that makes much more sense.

Ok, first I would identify the goal: which is more important, reducing the mean difference between the forecast and the outcome, (which would be the case if 10 errors of 1% had the same cost as 1 error of 10%) or reducing the frequency of large errors (if the cost of 10 errors of 1% is low compared to 1 error of 10%)? You should tailor your evaluation function accordingly - for example if you are interested in reducing the mean error don't use least squares because this places excessive weight on outliers (this would rule out linear regression of course).
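The trade-off can be checked with a few lines of arithmetic. This is only an illustration with made-up percentage errors, and the two helper functions are hypothetical names: ten errors of 1% and one error of 10% have the same mean absolute error, but the squared-error measure rates the single large error ten times worse.

```python
def mean_absolute_error(errors):
    """Average size of the errors, ignoring sign."""
    return sum(abs(e) for e in errors) / len(errors)

def mean_squared_error(errors):
    """Average squared error; large errors dominate this measure."""
    return sum(e * e for e in errors) / len(errors)

many_small = [1.0] * 10          # ten forecasts, each off by 1%
one_large = [10.0] + [0.0] * 9   # one forecast off by 10%, nine exact

mae_small = mean_absolute_error(many_small)
mae_large = mean_absolute_error(one_large)
mse_small = mean_squared_error(many_small)
mse_large = mean_squared_error(one_large)
```

Which of the two measures to minimise depends on the cost question posed above, not on the mathematics.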

Next I would get some understanding of the weather data. This can be done separately from the relatively expensive forecast computation, you just need to look at 6 time-series data sets.

I'd probably group the data for each weather station by day and by hour, and plot the difference between that station's measurement and the inter-quartile mean. See if there are any patterns. If there aren't, then altering the weights isn't going to make much difference, but if there are (say the temperature at station 1 is consistently above the mean between 0900 and 1200 on a business day), you can then investigate whether there is a corresponding increase in historical demand. If there is, then you will want your model to reflect this. This will also indicate whether one set of coefficients will be enough, or if you need different ones for different days/times of day.
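A rough sketch of the per-interval calculation, with made-up readings. The trimming rule is a simplification chosen for illustration (drop a whole number of values from each end); a strict inter-quartile mean would weight partial observations when the count is not a multiple of four.

```python
def interquartile_mean(values):
    """Mean of the middle values after trimming floor(n/4) from each end.

    Simplified: with 6 stations this drops the lowest and the highest reading.
    """
    ordered = sorted(values)
    trim = len(ordered) // 4
    kept = ordered[trim:len(ordered) - trim]
    return sum(kept) / len(kept)

def deviations_from_iqm(readings):
    """Difference between each station's reading and the trimmed mean."""
    centre = interquartile_mean(readings)
    return [r - centre for r in readings]

# Hypothetical temperatures (deg C) at six stations for one half-hour interval.
readings = [18.0, 19.0, 20.0, 21.0, 22.0, 30.0]
centre = interquartile_mean(readings)
deviations = deviations_from_iqm(readings)
```

Repeating this for every interval and averaging the deviations by station, weekday and hour gives the series to plot when looking for patterns.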

But finally there will be no substitute for optimising the coefficients by running the model for each historical data point on trial values.

I would still encourage replacing the absolute temperature and humidity at each station in the model with an average and differences from the average. This has the dual advantage of reducing correlation of parameters and adding resilience to the calculation: if a weather station goes down, its difference from the average can simply be omitted from the model.
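One way this could look in practice, as a sketch under assumptions: the station names and readings are invented, and treating an omitted difference as a zero contribution is one interpretation of "omitted from the model".

```python
def average_and_differences(readings):
    """Split station readings into an average and per-station differences.

    readings maps a station name to a temperature, or to None when the
    station is down. A missing station is left out of the average and its
    difference is set to 0.0, so it contributes nothing to the model.
    """
    available = {s: t for s, t in readings.items() if t is not None}
    average = sum(available.values()) / len(available)
    differences = {
        s: (t - average if t is not None else 0.0) for s, t in readings.items()
    }
    return average, differences

# Hypothetical interval in which station s3 is not reporting.
readings = {"s1": 20.0, "s2": 22.0, "s3": None, "s4": 18.0}
average, differences = average_and_differences(readings)
```

The model would then carry one coefficient for the average and one per station difference, rather than one per absolute reading.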

Stephen Tashi · Jul 4, 2013

Going back to the weighting then, I am not a maths thinker but it seems to me that
[itex] L = \sum_{i=1}^{N_f} C_i f_i(x_1, x_2, \ldots, x_n) [/itex] should be modifiable. That is, drop some of the terms [itex] f_i(x_1, x_2, \ldots, x_n) [/itex] and restrict the weather coefficients to be positive (how?).

From a mathematical point of view, there are many things that can be done, but from a practical point of view the question is whether you can do them. Are you a skilled and experienced programmer? This problem is obviously one that requires a programmer and it would help if that person was a competent mathematician. You've received several suggestions that a mathematician would understand and that a programmer could implement.

I don't have a good picture of the "office politics" side of this scenario. Logic would say that if a company relies heavily on a program and it needs to be fixed or replaced, they would hire an expert to do the work, perhaps a consultant (and not me, since I'm happily retired). Of course, I know that Logic isn't the primary consideration in management.

mdhastings · Jul 4, 2013

Stephen Tashi said:

From a mathematical point of view, there are many things that can be done, but from a practical point of view the question is whether you can do them. Are you a skilled and experienced programmer?

I am programming in R, but this can be a tricky language.

Stephen Tashi said:

This problem is obviously one that requires a programmer and it would help if that person was a competent mathematician. You've received several suggestions that a mathematician would understand and that a programmer could implement.

I don't have a good picture of the "office politics" side of this scenario. Logic would say that if a company relies heavily on a program and it needs to be fixed or replaced, they would hire an expert to do the work, perhaps a consultant (and not me, since I'm happily retired). Of course, I know that Logic isn't the primary consideration in management.

One more question please: in the modified equation I was talking about in my last post, how would I set it up to ensure positive coefficients on the weather variables?

All been good from you Stephen and I deeply appreciated your involvement. Thanks again

Stephen Tashi · Jul 4, 2013

mdhastings said:

In the modified equation I was talking about in my last post, how would I set it up to ensure positive coefficients on the weather variables?

I don't see any simple way to set up the linear regression to solve for a new set of weights on the weather stations (even one that need not sum to one) because the regression includes "interaction" terms. For example, if a variable like "temperature" appears inside a sin(...) function, it is represented by a weighted sum. The weights appear as unknowns inside the function sin(...) and they can't be factored out as constants in front of the sin(...). Hence a linear regression can't solve for them, because they don't appear as unknown constant coefficients of sin(...). Since they are inside the sin(...) function, you can't even run the regression until you have a known set of weights, because to evaluate sin(...) the weights must be known.

I think the problem must be solved as a nonlinear optimization problem with constraints. There are methods for doing this, such as "gradient following", "conjugate gradient" and "simulated annealing".

If there is no obvious connection with the current set of weights and the demographics of the city, why do you think changing the weights will better represent the demographics? Are you focusing on the weights of the weather measurements merely because they are the only undocumented constants in the program?

mdhastings · Jul 4, 2013

Stephen Tashi said:

I don't see any simple way to set up the linear regression to solve for a new set of weights on the weather stations (even one that need not sum to one) because the regression includes "interaction" terms. For example, if a variable like "temperature" appears inside a sin(...) function

I can help here [the above is not right]... recall that the interaction terms are products, so [itex] wt1 \cdot \sin(\omega t) \cdot dow1 \cdot \sin(\lambda t) [/itex] is a 4-way interaction term. wt1 would be a temperature (say 10C), the dow1 dummy term is 1 for Monday (otherwise 0), and the 2 sine terms are sequences ranging from -1 to 1, one over a day's 48 intervals (say -1) and one over a year's 17520 intervals (say 0.0154). The data for this interaction term is their product: 10 * -1 * 1 * 0.0154 = -0.154. This is the value for, say, the 8:30am interval on some day (a Monday) of the year. The 9:00am interval will have all terms slightly changed (except dow1, since it is still Monday). The next interaction term may include a dow2 (Tuesday), and hence that product is zeroed.
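The arithmetic of that example, written out as a small sketch. The function name and argument names are illustrative only; the numbers are the ones quoted in the post.

```python
def interaction_term(temperature, daily_sine, day_dummy, yearly_sine):
    """Value of a 4-way interaction term: the plain product of its factors."""
    return temperature * daily_sine * day_dummy * yearly_sine

# 8:30am on a Monday: temperature 10, daily sine -1, Monday dummy 1,
# yearly sine 0.0154.
monday_term = interaction_term(10.0, -1.0, 1, 0.0154)

# The same interval seen through a Tuesday dummy (0 on a Monday) is zeroed.
tuesday_term = interaction_term(10.0, -1.0, 0, 0.0154)
```

Each such product is one column of data in the regression, with its own coefficient to be fitted.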

Stephen Tashi said:

If there is no obvious connection with the current set of weights and the demographics of the city, why do you think changing the weights will better represent the demographics? Are you focusing on the weights of the weather measurements merely because they are the only undocumented constants in the program?

The weights to the stations were set many years ago to represent the demographics - since the city has grown then the weights should be changed.

Stephen Tashi · Jul 4, 2013

mdhastings said:

I can help here [the above is not right]... recall that the interaction terms are products, so [itex] wt1 \cdot \sin(\omega t) \cdot dow1 \cdot \sin(\lambda t) [/itex] is a 4-way interaction term. wt1 would be a temperature (say 10C), the dow1 dummy term is 1 for Monday (otherwise 0), and the 2 sine terms are sequences ranging from -1 to 1, one over a day's 48 intervals (say -1) and one over a year's 17520 intervals (say 0.0154). The data for this interaction term is their product: 10 * -1 * 1 * 0.0154 = -0.154. This is the value for, say, the 8:30am interval on some day (a Monday) of the year. The 9:00am interval will have all terms slightly changed (except dow1, since it is still Monday). The next interaction term may include a dow2 (Tuesday), and hence that product is zeroed.

Are you saying that all terms of the model are linear in the weather variables? No function like sin(...) has an argument that depends indirectly on the weather variables? There are no terms involving the product of two weather variables?
I don't understand how these two statements jibe.

No in the sense that the weather stations are not matched to areas of population and unfortunately this again is because of closed market and the necessary regulations.

The weights to the stations were set many years ago to represent the demographics - since the city has grown then the weights should be changed.

I don't know how changing the weights of weather measurements can represent a growth in the population.

mdhastings · Jul 5, 2013

Stephen Tashi said:

Are you saying that all terms of the model are linear in the weather variables? No function like sin(...) has an argument that depends indirectly on the weather variables? There are no terms involving the product of two weather variables?

Thanks Stephen. The way the program is set up shows some interactions with a weather variable (either temperature or dew point). The main focus of the program is on matching the day's load shape. For that, most terms are not weather interactions.

When I use the daily term sd1 [your sin(...)], I am simply stating it is a sine wave made up of 48 numbers, starting at 1, [itex]\sin(\pi/2)[/itex], and moving around to -1, [itex]\sin(3\pi/2)[/itex], before returning to 1, [itex]\sin(\pi/2)[/itex]. The same goes for the cos terms (e.g. cd1). The yearly term sy1 is longer, passing through 17520 points. As our data is years long, these terms just keep repeating through the dataset.
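A sketch of how such sequences could be generated, assuming one value per half-hour interval and a starting phase of pi/2 as described. The function name is hypothetical, and the actual program may index or phase its terms differently.

```python
import math

def periodic_sine(n_points, phase=math.pi / 2):
    """One full sine cycle sampled at n_points equally spaced positions."""
    return [math.sin(phase + 2 * math.pi * k / n_points) for k in range(n_points)]

sd1 = periodic_sine(48)       # daily term: 48 half-hour intervals
sy1 = periodic_sine(17520)    # yearly term: 48 intervals x 365 days

def daily_value(interval_index):
    """Daily term for any interval in a multi-year dataset (the cycle repeats)."""
    return sd1[interval_index % 48]
```

With this phase the daily sequence starts at 1, reaches -1 half-way through the day (index 24), and the next day starts again at 1.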

Stephen Tashi said:

I don't understand how these two statements jibe.

I don't know how changing the weights of weather measurements can represent a growth in the population.

I say this because if the city grows large it may spread outwards, towards another weather station that we might then need input from. We currently use 6 of the 9 weather stations. To give some indication, weather stations are up to 50 km apart, as this matches where the wires go in our market. There is also the development of the McMansion (a home sized to fill its land, without a garden so to speak) in our outer suburbs, all fitted with heating, air-conditioning etc.

Again I hope this helps

Stephen Tashi · Jul 5, 2013

Even if the only interactions are simple products of weather measurements, this prevents us from factoring out the weights from that product. You must treat finding the weights as a problem of nonlinear optimization subject to the constraint that the weights add to 1.
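A toy sketch of what such a constrained search could look like. Everything in it is an assumption made for illustration: three stations instead of six, synthetic data, a "load" that is just the product of a weighted temperature and a weighted dew point, a softmax reparameterisation to keep the weights positive and summing to 1, and a simple compass (pattern) search as the optimiser. It is not the company's method.

```python
import math

def softmax(z):
    """Map unconstrained numbers to positive weights that sum to 1."""
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def composite(weights, stations):
    """Weighted combination of station series into one artificial station."""
    n = len(stations[0])
    return [sum(w * s[k] for w, s in zip(weights, stations)) for k in range(n)]

def compass_search(objective, start, step=1.0, tol=1e-5, max_iter=2000):
    """Derivative-free descent: try +/- step on each coordinate, halve on failure."""
    point, best = list(start), objective(start)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = point[:]
                trial[i] += delta
                value = objective(trial)
                if value < best:  # only ever accept improvements
                    point, best, improved = trial, value, True
        if not improved:
            step *= 0.5
    return point, best

# Synthetic temperature and dew point series for 3 hypothetical stations.
temps = [[15 + j + 5 * math.sin(0.4 * k + j) for k in range(30)] for j in range(3)]
dews = [[8 + 3 * math.cos(0.25 * k + 2 * j) for k in range(30)] for j in range(3)]
true_w = [0.5, 0.3, 0.2]
loads = [t * d for t, d in zip(composite(true_w, temps), composite(true_w, dews))]

def loss(z):
    """Squared error of the toy model; the weights enter as a product."""
    w = softmax(z)
    pred = [t * d for t, d in zip(composite(w, temps), composite(w, dews))]
    return sum((p - y) ** 2 for p, y in zip(pred, loads))

start = [0.0, 0.0, 0.0]          # equal weights
initial_loss = loss(start)
best_z, final_loss = compass_search(loss, start)
weights = softmax(best_z)
```

In the real problem the objective would rerun the load model (and refit its regression coefficients) for each trial set of weights, which is far more expensive than this toy loss. Note also that the softmax trick can only approach a weight of zero, never reach it exactly.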

mdhastings · Feb 16, 2014

message to Stephen Tashi and others

Hi Stephen,

Could you please respond to my "Hello are you out there"?

I have more info available as I have found the report.

Stephen Tashi · Feb 17, 2014

I don't recall seeing your "Hello are you out there". Did you send me a private message?

mdhastings · Feb 17, 2014

Hi Stephen,
No other message - I put the title in the email. My apologies if that is frustrating.
I have hunted down how the original calculations were carried out. I wonder if you might still be interested, since I cannot yet determine the objective function. It is an optimisation (Nelder-Mead).

Stephen Tashi · Feb 17, 2014

Post what you found out. I'm interested in most math. At the moment, I'm rather busy with non-math activity, but I'll look at what you found.

mdhastings · Feb 17, 2014

Thanks again Stephen,

For reasons of confidentiality, I would prefer to limit access to the document to only you at this time. Could I email the document to you, please?
If other posters would like access I would prefer to do so on an individual basis.
