3.8 Gradient descent for neural networks

dZ^{[2]}=A^{[2]}-Y

dW^{[2]}=\frac{1}{m}dZ^{[2]}A^{[1]T}

db^{[2]}=\frac{1}{m}\text{np.sum}(dZ^{[2]},\text{axis}=1,\text{keepdims}=\text{True})

dZ^{[1]}=W^{[2]T}dZ^{[2]}\ast g'(Z^{[1]})

dW^{[1]}=\frac{1}{m}dZ^{[1]}X^T

db^{[1]}=\frac{1}{m}\text{np.sum}(dZ^{[1]},\text{axis}=1,\text{keepdims}=\text{True})
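
The six formulas above map almost line for line onto NumPy. The following is a minimal sketch rather than a reference implementation. It assumes a tanh hidden layer, so that g'(Z^{[1]}) = 1 - tanh^2(Z^{[1]}), and a sigmoid output unit trained with binary cross-entropy, which is the setting in which dZ^{[2]} = A^{[2]} - Y holds. The function names `forward` and `backward`, the layer sizes, the random data, and the learning rate `alpha` are all illustrative choices that do not come from the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    # Assumed architecture: tanh hidden layer, sigmoid output unit.
    Z1 = W1 @ X + b1
    A1 = np.tanh(Z1)
    Z2 = W2 @ A1 + b2
    A2 = sigmoid(Z2)
    return Z1, A1, A2

def backward(X, Y, W2, Z1, A1, A2):
    m = X.shape[1]                                   # number of training examples
    dZ2 = A2 - Y                                     # assumes sigmoid + cross-entropy
    dW2 = (dZ2 @ A1.T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m     # keepdims keeps shape (n2, 1)
    dZ1 = (W2.T @ dZ2) * (1.0 - np.tanh(Z1) ** 2)    # elementwise product with g'(Z1)
    dW1 = (dZ1 @ X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2

# Illustrative sizes and data (hypothetical, for demonstration only).
rng = np.random.default_rng(0)
n_x, n_h, m = 3, 4, 5
X = rng.standard_normal((n_x, m))
Y = (rng.random((1, m)) > 0.5).astype(float)
W1 = rng.standard_normal((n_h, n_x)) * 0.01
b1 = np.zeros((n_h, 1))
W2 = rng.standard_normal((1, n_h)) * 0.01
b2 = np.zeros((1, 1))

Z1, A1, A2 = forward(X, W1, b1, W2, b2)
dW1, db1, dW2, db2 = backward(X, Y, W2, Z1, A1, A2)

# One gradient-descent step with an arbitrary learning rate.
alpha = 0.1
W1 -= alpha * dW1
b1 -= alpha * db1
W2 -= alpha * dW2
b2 -= alpha * db2
```

Passing `keepdims=True` keeps db^{[1]} and db^{[2]} as column vectors of shape (n, 1) instead of rank-1 arrays, so each gradient has the same shape as the parameter it updates.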

