1.11 Weight Initialization for Deep Networks

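w[l] = np.random.randn(n[l],n[l-1])*np.sqrt(1/n[l-1])  # Xavier initialization, commonly paired with tanh
w[l] = np.random.randn(n[l],n[l-1])*np.sqrt(2/n[l-1])  # He initialization, commonly paired with ReLU
w[l] = np.random.randn(n[l],n[l-1])*np.sqrt(2/(n[l-1] + n[l]))  # variant that averages fan-in and fan-out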
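The three scalings above can be wrapped in a small helper to make them runnable. What follows is a minimal sketch, not the course's own implementation. The function name `init_weights`, the `layer_dims` list, and the `scheme` labels (`"xavier"`, `"he"`, `"glorot"`) are hypothetical names chosen for illustration, and it uses `np.random.default_rng` in place of `np.random.randn` so that results are reproducible from a seed.

```python
import numpy as np


def init_weights(layer_dims, scheme="he", seed=0):
    """Return a dict {"W1": ..., "W2": ...} of scaled Gaussian weight matrices.

    layer_dims: layer sizes [n[0], n[1], ..., n[L]] (hypothetical argument name).
    scheme: one of "xavier", "he", "glorot" (labels assumed for this sketch).
    """
    rng = np.random.default_rng(seed)
    weights = {}
    for l in range(1, len(layer_dims)):
        n_prev, n_curr = layer_dims[l - 1], layer_dims[l]
        if scheme == "xavier":
            # Standard deviation sqrt(1 / n[l-1])
            scale = np.sqrt(1.0 / n_prev)
        elif scheme == "he":
            # Standard deviation sqrt(2 / n[l-1])
            scale = np.sqrt(2.0 / n_prev)
        elif scheme == "glorot":
            # Standard deviation sqrt(2 / (n[l-1] + n[l]))
            scale = np.sqrt(2.0 / (n_prev + n_curr))
        else:
            raise ValueError(f"unknown scheme: {scheme!r}")
        # Shape (n[l], n[l-1]) matches the w[l] convention used above.
        weights[f"W{l}"] = rng.standard_normal((n_curr, n_prev)) * scale
    return weights


# Usage: compare the empirical standard deviation with the target scale.
params = init_weights([1024, 512, 256], scheme="he")
print(params["W1"].shape, params["W1"].std(), np.sqrt(2.0 / 1024))
```

The printed standard deviation of `W1` should land close to the target `np.sqrt(2 / n[l-1])`, which is a quick sanity check that the scaling was applied to the right dimension.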