Mixture density neural network prediction bias

In summary, the conversation discusses the use of a mixture density network (MDN) for making predictions. The network has one hidden layer with 10 nodes and a single Gaussian component, and it is trained by minimizing the negative log-likelihood of the observed output under the predicted Gaussian. After training, histograms of the residuals and of the residuals scaled by the predicted standard deviation look reasonable. However, the inverse-variance weighted mean of the residuals shows a small residual bias. This may be due to random noise in the data and can potentially be reduced by adjusting the model complexity or pre-processing the data.
  • #1
Malamala
Hello! I am using a mixture density network (MDN) to make some predictions. My model is very simple, with only one hidden layer of 10 nodes (the details of the network shouldn't matter for my question, but I can provide more if needed). My MDN also has only one Gaussian component, which basically means that for each input it predicts the mean and standard deviation of a Gaussian from which to sample the output. During training I am minimizing the negative log-likelihood of the expected output under the predicted Gaussian, i.e. the per-sample loss:

$$\log(\sigma(x_{in})) + \frac{(y_{real}-\mu(x_{in}))^2}{2\sigma(x_{in})^2}$$

where ##\sigma(x_{in})## and ##\mu(x_{in})## are predicted by the network and are functions of the input. The network seems to train well, i.e. the loss goes down, and I am attaching below two histograms I obtained after training the network and trying it on new data. The first one is a histogram of ##\frac{dy}{\mu(x_{in})}##, where ##dy = y_{real}-\mu(x_{in})##. The second histogram shows ##\frac{dy}{\sigma(x_{in})}##. Based on these it seems like the network is doing pretty well (the data has Gaussian noise added to it). However, when I try to compute the mean and the error on the mean for ##dy##, I get:

$$\frac{\sum_i{\frac{dy_i}{\sigma_i^2}}}{\sum_i{1/\sigma_i^2}} = -0.000172 $$
and
$$\sqrt{\frac{1}{\sum_i{1/\sigma_i^2}}} = 0.000003$$
where the sum runs over all the data points I test the MDN on. This means that my predictions are biased by -0.000172. However, I am not sure why that is the case, as the MDN should easily notice this and shift all the ##\mu## predictions by that amount. I tried training several MDNs with lots of different parameters and I get the same result, i.e. the result is biased (not always by the same amount or in the same direction). Am I missing something or misinterpreting the results? Shouldn't the mean of my errors be consistent with zero, and shouldn't simply correcting the predictions for that offset (0.000172 in this case) solve the issue? Any insight would be really appreciated.

[Attachments: histograms of ##dy/\mu(x_{in})## and ##dy/\sigma(x_{in})##]
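For concreteness, here is a minimal sketch of the setup described above, written in PyTorch. The loss and the inverse-variance weighted mean follow the formulas in this post; the activation, optimizer, and toy data (a noisy sine) are assumptions for illustration only, not the original code.

```python
import torch
import torch.nn as nn

class SingleGaussianMDN(nn.Module):
    """One-component MDN: one hidden layer of 10 units, outputs mu and sigma."""
    def __init__(self, n_in=1, n_hidden=10):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(n_in, n_hidden), nn.Tanh())
        self.mu = nn.Linear(n_hidden, 1)         # predicted mean
        self.log_sigma = nn.Linear(n_hidden, 1)  # predicted log std (keeps sigma > 0)

    def forward(self, x):
        h = self.hidden(x)
        return self.mu(h), torch.exp(self.log_sigma(h))

def nll_loss(mu, sigma, y):
    # Negative log-likelihood of y under N(mu, sigma^2), up to a constant.
    return (torch.log(sigma) + (y - mu) ** 2 / (2 * sigma ** 2)).mean()

def weighted_bias(mu, sigma, y):
    # Inverse-variance weighted mean of the residuals dy = y - mu,
    # and the error on that mean, as in the post.
    dy = (y - mu).detach()
    w = 1.0 / sigma.detach() ** 2
    mean = (w * dy).sum() / w.sum()
    err = (1.0 / w.sum()).sqrt()
    return mean.item(), err.item()

# Toy data (assumption): a noisy sine, analogous to "data with Gaussian noise added".
x_train = torch.randn(1000, 1)
y_train = torch.sin(x_train) + 0.1 * torch.randn_like(x_train)

model = SingleGaussianMDN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    mu, sigma = model(x_train)
    nll_loss(mu, sigma, y_train).backward()
    opt.step()

# Evaluate the weighted mean residual and its error on fresh data.
x_test = torch.randn(5000, 1)
y_test = torch.sin(x_test) + 0.1 * torch.randn_like(x_test)
mu, sigma = model(x_test)
print(weighted_bias(mu, sigma, y_test))
```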
 
  • #2
It is possible that the bias you are seeing is due to random noise in your data. This noise can prevent the model from fitting the data perfectly and therefore lead to a slight bias in the predictions. If this is the case, you may be able to reduce the bias by increasing the complexity of the model or by pre-processing the data to remove some of the noise before training.
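As an illustration of the "just correct for the offset" idea raised in post #1, one possible (hypothetical, not from this thread) post-hoc fix is to estimate the inverse-variance weighted mean residual on a held-out calibration split and shift every ##\mu## prediction by it. This removes a constant offset but will not help if the bias varies with the input. A sketch with toy stand-in arrays:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for MDN outputs on a calibration split (in practice these
# would come from the trained network and held-out data).
sigma_cal = rng.uniform(0.05, 0.2, size=2000)  # predicted standard deviations
y_cal = rng.normal(0.0, sigma_cal)             # observed targets
mu_cal = np.full_like(y_cal, -0.0002)          # predicted means with a constant offset

# Inverse-variance weighted mean of the residuals dy = y - mu.
w = 1.0 / sigma_cal ** 2
offset = np.sum(w * (y_cal - mu_cal)) / np.sum(w)
print(f"estimated offset: {offset:.6f}")

# Shift new predictions by the estimated offset.
mu_test = np.full(1000, -0.0002)               # hypothetical test predictions
mu_test_corrected = mu_test + offset
```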
 
