NotASmurf
For the backpropagation error E(X, W) = 0.5(target - W^T X)^2, the paper I'm reading notes that the covariance matrix of the inputs is equal to the Hessian, and it uses that to develop its weight update rule V(k+1) = V(k) + D*V(k), a slightly modified (not relevant to my question) version of ordinary feedforward gradient descent with backpropagation. But it assumes the inputs have zero mean, just like essentially all other neural networks. Doesn't shifting the mean leave the covariance matrix unchanged? So the eigenvectors, the entropy (which depends on |cov(x)|), and the second derivatives shouldn't change either. It's not even a conjugate prior; it's not like I'm encoding some terrible prior beliefs if we view the network as a distribution mapper. Why does the mean being zero matter here, and ostensibly in all ANNs? Any help appreciated.
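To make concrete what I mean, here's a quick NumPy sketch (my own toy setup, not the paper's code) checking the claim that for the quadratic error above, the Hessian with respect to W is the second moment E[x x^T], which coincides with the covariance matrix when the inputs have zero mean:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 1000 samples of a 3-dimensional input, centered by hand
# so the sample mean is exactly zero (assumption: zero-mean inputs).
X = rng.normal(loc=2.0, scale=1.5, size=(1000, 3))
X -= X.mean(axis=0)

# For E(X, W) = 0.5 * (target - W^T x)^2 averaged over samples,
# the Hessian w.r.t. W is the second moment E[x x^T],
# independent of the target and of W itself.
H = X.T @ X / len(X)

# Sample covariance (np.cov subtracts the mean internally;
# bias=True divides by N to match the second-moment estimate).
C = np.cov(X, rowvar=False, bias=True)

# With zero-mean inputs, Hessian and covariance agree.
print(np.allclose(H, C))  # True
```

If the mean shift is not subtracted, `H` picks up an extra mean*mean^T term while `C` stays the same, which is exactly the discrepancy I'm asking about.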