- #1
cosmicminer
I'm reviving this thread:
https://www.physicsforums.com/threads/unorthodox-probability-theory.471171/
(it says not open for further replies).
The last post:
I made some tests with this using real data and it works. Each of the two forecasters, weatherman and Indian, follows his own set of methods.
We can't tell to what extent those methods are the same.
The ultimate case of independence is this:
Weatherman goes to sleep every night and is visited by the good fairy Bruchilde. Bruchilde knows what is going to happen, but she tells the truth in the dream only with probability p, using a random number generator on her laptop. Weatherman, who is really Bruchilde's spokesperson, tells us his view when he wakes up, and naturally he is right a fraction p of the time.
Indian similarly goes to sleep every night and he is visited by good fairy Matilde, who also knows what is going to happen and she reveals the truth to the Indian with probability q, using another laptop.
In a situation such as this, if at the end of the proceedings the event A=Rain has occurred N times, it will have been scored by both forecasters about N*p*q times and missed by both about N*(1-p)*(1-q) times (plus or minus the random fluctuation).
But on any given day there are only three types of contest going on:
The AA v. BB, the AB v. BA or the BA v. AB.
So for the case of prediction AA v. prediction BB, it's PROB(A) = f(p,q) = p*q / (p*q + (1-p)*(1-q)).
While for the other cases the symmetric formulas apply.
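As a rough check of the fairy model, here is a minimal Python sketch. The function names `prob_a` and `simulate` are made up for illustration, and the simulation assumes that A and B are equally likely beforehand, which the formula f(p,q) implicitly requires. The disagreement case uses the symmetric formula p*(1-q) / (p*(1-q) + (1-p)*q).

```python
import random

def prob_a(p, q, weatherman_says_a, indian_says_a):
    """Probability of A given both predictions, under the fairy model.

    Assumes independent forecasters and equal prior chances for A and B.
    """
    # Chance of seeing each prediction if A is true, and if B is true.
    w_a, w_b = (p, 1 - p) if weatherman_says_a else (1 - p, p)
    i_a, i_b = (q, 1 - q) if indian_says_a else (1 - q, q)
    return w_a * i_a / (w_a * i_a + w_b * i_b)

def simulate(p, q, days=100_000, seed=0):
    """Fraction of 'both say A' days on which A really happened."""
    rng = random.Random(seed)
    both_say_a = 0
    both_say_a_and_a_happened = 0
    for _ in range(days):
        a_happens = rng.random() < 0.5          # assumption: equal priors
        weatherman_right = rng.random() < p     # Bruchilde's laptop
        indian_right = rng.random() < q         # Matilde's laptop
        weatherman_says_a = a_happens if weatherman_right else not a_happens
        indian_says_a = a_happens if indian_right else not a_happens
        if weatherman_says_a and indian_says_a:
            both_say_a += 1
            both_say_a_and_a_happened += a_happens
    return both_say_a_and_a_happened / both_say_a

print("formula   :", prob_a(0.8, 0.7, True, True))
print("simulated :", simulate(0.8, 0.7))
print("disagree  :", prob_a(0.8, 0.7, True, False))
```

With p = 0.8 and q = 0.7 the formula gives 0.56 / 0.62, and the simulated frequency should land close to that value.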
In the real world now where there are no fairies, the real problem has to be attacked.
This in my opinion ought to be done as follows:
Suppose it is true that p > q (weatherman somewhat superior).
Then we write q' = 0.5 + L*(q-0.5) = q(L)
L is a number between 0 and 1. For L = 0, q' = 0.5, while for L = 1, q' = q.
Then f(p,q) becomes f(p,q') = p*(0.5+L*(q-0.5)) / (p*(0.5+L*(q-0.5)) + (1-p)*(0.5-L*(q-0.5)))
If L = 0 then f = p (meaning the Indian doesn't count). If L = 1 then both count.
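The two limiting cases can be checked numerically. This is only a sketch; the names `shrink`, `combine` and `combine_shrunk` are illustrative and the values p = 0.8, q = 0.7 are arbitrary.

```python
def shrink(q, L):
    """q' = 0.5 + L*(q - 0.5): pull q towards 0.5 as L goes from 1 to 0."""
    return 0.5 + L * (q - 0.5)

def combine(p, q):
    """f(p,q) for the case where both forecasters predict the same event."""
    return p * q / (p * q + (1 - p) * (1 - q))

def combine_shrunk(p, q, L):
    """f(p,q') with the weaker forecaster's hit rate shrunk by L."""
    return combine(p, shrink(q, L))

p, q = 0.8, 0.7  # arbitrary example values with p > q
for L in (0.0, 0.5, 1.0):
    print(f"L = {L:.1f}  q' = {shrink(q, L):.3f}  f = {combine_shrunk(p, q, L):.4f}")
```

At L = 0 the result collapses to p, at L = 1 it equals f(p,q), and values in between interpolate smoothly.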
We write down the daily forecasts and the daily outcomes.
We measure the p and q values.
We let L = 1 and, for every record, we compute Y = log(probability that f assigned to the outcome that afterwards turned out to be correct), and we add up the quantities Y. Then we compute I = exp(-(sum of the Y's) / (number of records)).
Then we repeat with L = 0.99, 0.98 ... down to L = 0.
The value of L that makes I maximum is our best estimator.
This can be done with more than two participants in the contest also.
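The search over L can be sketched in Python as follows. Everything here is an assumption made for illustration: the record layout (weatherman's call, Indian's call, what happened), the helper names, and the synthetic data generator `make_records`, in which the Indian copies the weatherman with some probability so that the two are not independent. The sketch scores each L by the mean of the Y's and keeps the largest, which is the same as keeping the smallest exp(-mean Y) or the largest exp(+mean Y).

```python
import math
import random

def prob_a(p, q, w_says_a, i_says_a):
    """Fairy-model probability of A given both calls (equal priors assumed)."""
    w_a, w_b = (p, 1 - p) if w_says_a else (1 - p, p)
    i_a, i_b = (q, 1 - q) if i_says_a else (1 - q, q)
    return w_a * i_a / (w_a * i_a + w_b * i_b)

def hit_rates(records):
    """Measured p (weatherman) and q (Indian) from the records."""
    n = len(records)
    p = sum(w == a for w, _, a in records) / n
    q = sum(i == a for _, i, a in records) / n
    return p, q

def mean_log_score(records, p, q, L):
    """Mean of Y = log(probability given to the outcome that occurred)."""
    q_shrunk = 0.5 + L * (q - 0.5)
    total = 0.0
    for w_says_a, i_says_a, a_occurred in records:
        pa = prob_a(p, q_shrunk, w_says_a, i_says_a)
        total += math.log(pa if a_occurred else 1 - pa)
    return total / len(records)

def best_L(records, steps=100):
    """Try L = 1.00, 0.99, ..., 0.00 and keep the best-scoring value."""
    p, q = hit_rates(records)
    grid = [k / steps for k in range(steps, -1, -1)]
    return max(grid, key=lambda L: mean_log_score(records, p, q, L))

def make_records(n, p, q, copy_prob, seed):
    """Synthetic data (hypothetical): the Indian copies the weatherman
    with probability copy_prob, otherwise he forecasts on his own."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        a = rng.random() < 0.5
        w = a if rng.random() < p else not a
        if rng.random() < copy_prob:
            i = w
        else:
            i = a if rng.random() < q else not a
        records.append((w, i, a))
    return records

records = make_records(2000, 0.8, 0.7, 0.5, seed=0)
print("measured p, q:", hit_rates(records))
print("best L       :", best_L(records))
```

When the Indian always copies the weatherman, his call carries no extra information, and the search should settle on L = 0.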
Are there any better ideas?
It gives almost the same results as the logarithmic pool method (see here for example:
http://homepages.inf.ed.ac.uk/miles/papers/acl05a.pdf).
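For comparison, here is a minimal sketch of a weighted logarithmic opinion pool for a two-outcome event in its standard textbook form (each forecaster's probability raised to a weight, multiplied together and renormalised). It is not taken from the linked paper, and the names `log_pool` and `combine_shrunk` as well as the example numbers are illustrative only.

```python
def log_pool(probs_a, weights):
    """Weighted logarithmic opinion pool for a binary event A.

    probs_a : each forecaster's probability for A
    weights : one non-negative weight per forecaster
    """
    for_a = 1.0
    for_b = 1.0
    for pa, w in zip(probs_a, weights):
        for_a *= pa ** w
        for_b *= (1 - pa) ** w
    return for_a / (for_a + for_b)

def combine_shrunk(p, q, L):
    """f(p,q') with q' = 0.5 + L*(q - 0.5), for the case where both agree."""
    q_shrunk = 0.5 + L * (q - 0.5)
    return p * q_shrunk / (p * q_shrunk + (1 - p) * (1 - q_shrunk))

p, q = 0.8, 0.7  # arbitrary example values
for x in (0.0, 0.5, 1.0):
    print(f"weight/L = {x:.1f}  log pool = {log_pool([p, q], [1.0, x]):.4f}"
          f"  shrunk f = {combine_shrunk(p, q, x):.4f}")
```

With both weights equal to 1 the pool reproduces f(p,q), and with the second weight at 0 it returns p, so the weight plays a role similar to L. In between, the two curves are close but not identical.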
Furthermore, the method can be generalised to m events and n predictors in stepwise mode (take one Indian at a time).
My question is about the bolded text (the empirical importance functions). Can they be improved?