4.2 Forward and backward propagation

Forward propagation

$$z^{[l]}=W^{[l]}a^{[l-1]}+b^{[l]}$$
$$a^{[l]}=g^{[l]}(z^{[l]})$$

For m training examples, the vectorized form is:

$$Z^{[l]}=W^{[l]}A^{[l-1]}+b^{[l]}$$
$$A^{[l]}=g^{[l]}(Z^{[l]})$$
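
The vectorized forward step for a single layer can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `layer_forward`, the choice of ReLU for the activation g, and the layer sizes are all assumptions made for the example. Examples are stored as columns, so A_prev has shape (n_prev, m).

```python
import numpy as np

def relu(Z):
    # Assumed activation g for this sketch; any element-wise function works.
    return np.maximum(0, Z)

def layer_forward(A_prev, W, b, g=relu):
    """One vectorized forward step: Z = W A_prev + b, A = g(Z).

    A_prev: (n_prev, m), W: (n, n_prev), b: (n, 1).
    """
    Z = W @ A_prev + b   # b is broadcast across the m columns
    A = g(Z)
    return Z, A

# Hypothetical sizes: 3 input units, 4 units in this layer, m = 5 examples.
rng = np.random.default_rng(0)
A_prev = rng.standard_normal((3, 5))
W = rng.standard_normal((4, 3))
b = np.zeros((4, 1))

Z, A = layer_forward(A_prev, W, b)
print(Z.shape, A.shape)  # (4, 5) (4, 5)
```

Z is returned alongside A because the backward pass for the same layer needs it to evaluate the derivative of g.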

Backward propagation

$$dz^{[l]}=da^{[l]}\ast g^{[l]'}(z^{[l]})$$
$$dW^{[l]}=dz^{[l]}\cdot a^{[l-1]T}$$
$$db^{[l]}=dz^{[l]}$$
$$da^{[l-1]}=W^{[l]T}\cdot dz^{[l]}$$

Substituting the last equation (written for layer l+1) into the first gives:

$$dz^{[l]}=W^{[l+1]T}\cdot dz^{[l+1]}\ast g^{[l]'}(z^{[l]})$$

For m training examples, the vectorized form is:

$$dZ^{[l]}=dA^{[l]}\ast g^{[l]'}(Z^{[l]})$$
$$dW^{[l]}=\frac{1}{m}dZ^{[l]}\cdot A^{[l-1]T}$$
$$db^{[l]}=\frac{1}{m}np.sum(dZ^{[l]},axis=1,keepdims=True)$$
$$dA^{[l-1]}=W^{[l]T}\cdot dZ^{[l]}$$
$$dZ^{[l]}=W^{[l+1]T}\cdot dZ^{[l+1]}\ast g^{[l]'}(Z^{[l]})$$
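
The vectorized backward step for one layer might look like the sketch below. The function name `layer_backward`, the ReLU activation, the layer sizes, and the toy loss used to produce dA are all assumptions chosen for illustration. Note that the NumPy keyword is `keepdims`.

```python
import numpy as np

def relu_prime(Z):
    # Derivative of the assumed ReLU activation: 1 where Z > 0, else 0.
    return (Z > 0).astype(Z.dtype)

def layer_backward(dA, Z, A_prev, W, g_prime=relu_prime):
    """One vectorized backward step for layer l.

    dA, Z: (n, m), A_prev: (n_prev, m), W: (n, n_prev).
    """
    m = A_prev.shape[1]
    dZ = dA * g_prime(Z)                         # element-wise product
    dW = (dZ @ A_prev.T) / m                     # average over m examples
    db = np.sum(dZ, axis=1, keepdims=True) / m   # keeps shape (n, 1)
    dA_prev = W.T @ dZ                           # gradient passed to layer l-1
    return dA_prev, dW, db

# Hypothetical sizes: 3 units in the previous layer, 4 in this one, m = 5.
rng = np.random.default_rng(0)
A_prev = rng.standard_normal((3, 5))
W = rng.standard_normal((4, 3))
b = np.zeros((4, 1))
Z = W @ A_prev + b
A = np.maximum(0, Z)

# Toy per-example loss 0.5 * sum(a**2), assumed only so that dA = A.
dA = A
dA_prev, dW, db = layer_backward(dA, Z, A_prev, W)
print(dA_prev.shape, dW.shape, db.shape)  # (3, 5) (4, 3) (4, 1)
```

Each gradient has the same shape as the quantity it differentiates, which is a quick sanity check when wiring layers together.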
