DeepLearning.ai Deep Learning Course Notes
3.8 Gradient Descent for Neural Networks
Backpropagation for a two-layer network computes the gradient of the cost with respect to each parameter, vectorized over all $m$ training examples. Starting from the output layer:

$dZ^{[2]} = A^{[2]} - Y$
$dW^{[2]} = \frac{1}{m} dZ^{[2]} A^{[1]T}$
$db^{[2]} = \frac{1}{m} np.sum(dZ^{[2]}, axis=1, keepdims=True)$
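The output-layer gradients above translate directly into NumPy. A minimal sketch, using small random placeholder arrays (not real activations) just to show the shapes; note that the NumPy keyword is `keepdims`, not `keepdim`:

```python
import numpy as np

np.random.seed(0)
m = 5                                   # number of training examples
A1 = np.random.rand(4, m)               # hidden activations, shape (n_h, m) = (4, m)
A2 = np.random.rand(1, m)               # output activations, shape (1, m)
Y = np.random.randint(0, 2, (1, m))     # binary labels, shape (1, m)

dZ2 = A2 - Y                                        # dZ^[2] = A^[2] - Y
dW2 = (1 / m) * dZ2 @ A1.T                          # dW^[2] = (1/m) dZ^[2] A^[1]T
db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)  # keepdims=True keeps shape (1, 1)
```

`keepdims=True` prevents `np.sum` from collapsing `db2` into a rank-1 array, so it broadcasts correctly when later subtracted from `b^[2]`.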
$dZ^{[1]} = W^{[2]T} dZ^{[2]} \ast g'(Z^{[1]})$

where $\ast$ denotes the element-wise product and $g'$ is the derivative of the hidden layer's activation function.
$dW^{[1]} = \frac{1}{m} dZ^{[1]} X^{T}$
$db^{[1]} = \frac{1}{m} np.sum(dZ^{[1]}, axis=1, keepdims=True)$
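Putting all six equations together, the full backward pass can be sketched as one function. This is an illustrative sketch, assuming a tanh hidden layer (so $g'(Z^{[1]}) = 1 - A^{[1]2}$) and a sigmoid output; the names `backward_pass`, `W2`, `A1`, etc. are placeholders, not course-provided code:

```python
import numpy as np

def backward_pass(X, Y, W2, A1, A2):
    """Gradients for a 2-layer network with tanh hidden units and sigmoid output.

    X: (n_x, m) inputs; Y: (1, m) labels; W2: (1, n_h) output weights;
    A1: (n_h, m) hidden activations; A2: (1, m) output activations.
    """
    m = X.shape[1]
    # Output layer
    dZ2 = A2 - Y                                        # dZ^[2] = A^[2] - Y
    dW2 = (1 / m) * dZ2 @ A1.T                          # dW^[2] = (1/m) dZ^[2] A^[1]T
    db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)
    # Hidden layer: element-wise product with g'(Z^[1]) = 1 - tanh(Z^[1])^2 = 1 - A1^2
    dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)
    dW1 = (1 / m) * dZ1 @ X.T                           # dW^[1] = (1/m) dZ^[1] X^T
    db1 = (1 / m) * np.sum(dZ1, axis=1, keepdims=True)
    return dW1, db1, dW2, db2

# Usage on random placeholder data (n_x = 3, n_h = 4, m = 5):
np.random.seed(1)
X = np.random.rand(3, 5)
Y = np.random.randint(0, 2, (1, 5))
W2 = np.random.rand(1, 4)
A1 = np.tanh(np.random.rand(4, 5))
A2 = 1 / (1 + np.exp(-np.random.rand(1, 5)))
dW1, db1, dW2, db2 = backward_pass(X, Y, W2, A1, A2)
```

A gradient-descent step then updates each parameter in place, e.g. `W1 -= alpha * dW1` and `b1 -= alpha * db1` for a learning rate `alpha`.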