2.4 Logistic Regression Gradient Descent

For a single training example, the logistic regression loss function is given by:

$$z = w^T x + b$$

$$\hat{y} = a = \sigma(z)$$

$$L(a, y) = -\big(y \log(a) + (1-y) \log(1-a)\big)$$
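As an illustration, here is a minimal NumPy sketch of this forward pass and loss for one two-feature example; the variable names and values (`x`, `w`, `b`, `y`) are assumptions made for the example, not from the course code:

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) activation: sigma(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

# One training example with two features (illustrative values).
x = np.array([1.0, 2.0])    # features x1, x2
w = np.array([0.5, -0.3])   # weights w1, w2
b = 0.1                     # bias
y = 1.0                     # true label

# Forward pass: z = w^T x + b, then a = sigma(z) is the prediction y-hat.
z = np.dot(w, x) + b
a = sigmoid(z)

# Cross-entropy loss for this single example.
loss = -(y * np.log(a) + (1.0 - y) * np.log(1.0 - a))
```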

The backward pass follows from the chain rule; the key fact used below is that the sigmoid derivative satisfies $\sigma'(z) = \sigma(z)(1-\sigma(z)) = a(1-a)$:

$$da = \frac{\partial L}{\partial a} = -\frac{y}{a} + \frac{1-y}{1-a}$$

$$dz = \frac{\partial L}{\partial z} = \frac{\partial L}{\partial a} \cdot \frac{\partial a}{\partial z} = \left(-\frac{y}{a} + \frac{1-y}{1-a}\right) \cdot a(1-a) = a - y$$

$$dw_1 = \frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial z} \cdot \frac{\partial z}{\partial w_1} = x_1 \cdot dz = x_1(a-y)$$

$$dw_2 = \frac{\partial L}{\partial w_2} = \frac{\partial L}{\partial z} \cdot \frac{\partial z}{\partial w_2} = x_2 \cdot dz = x_2(a-y)$$

$$db = \frac{\partial L}{\partial b} = \frac{\partial L}{\partial z} \cdot \frac{\partial z}{\partial b} = 1 \cdot dz = a - y$$
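Continuing the sketch above (it reuses `x`, `a`, and `y` from the forward pass), the gradients reduce to the simplified forms just derived, so the code never needs the intermediate $da$ term:

```python
# Backward pass for the single example above.
# The chain rule collapses to dz = a - y, so the sigmoid
# derivative never has to be computed explicitly.
dz = a - y
dw = x * dz    # dw[0] = x1 * (a - y), dw[1] = x2 * (a - y)
db = dz        # since dz/db = 1
```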

The gradient descent update rules for this single example are then:

$$w_1 := w_1 - \alpha \, dw_1$$

$$w_2 := w_2 - \alpha \, dw_2$$

$$b := b - \alpha \, db$$
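Finally, a sketch of one update step, continuing from the gradients above; the learning rate value `alpha = 0.01` is hypothetical:

```python
# One gradient descent step.
alpha = 0.01         # learning rate (hypothetical value)
w = w - alpha * dw   # updates w1 and w2 in a single vectorized step
b = b - alpha * db
```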
