# 2.8 Adam Optimization Algorithm

Adam (Adaptive Moment Estimation) combines gradient descent with momentum and RMSprop. The algorithm proceeds as follows:

$$V\_{dW}=0,\ S\_{dW}=0,\ V\_{db}=0,\ S\_{db}=0$$

$$On\ iteration\ t:$$

$$
\ \ \ \ Compute\ dW,\ db
$$

$$
\ \ \ \ V\_{dW}=\beta\_1V\_{dW}+(1-\beta\_1)dW,\ V\_{db}=\beta\_1V\_{db}+(1-\beta\_1)db
$$

$$
\ \ \ \ S\_{dW}=\beta\_2S\_{dW}+(1-\beta\_2)dW^2,\ S\_{db}=\beta\_2S\_{db}+(1-\beta\_2)db^2
$$

$$
\ \ \ \ V\_{dW}^{corrected}=\frac{V\_{dW}}{1-\beta\_1^t},\ V\_{db}^{corrected}=\frac{V\_{db}}{1-\beta\_1^t}
$$

$$
\ \ \ \ S\_{dW}^{corrected}=\frac{S\_{dW}}{1-\beta\_2^t},\ S\_{db}^{corrected}=\frac{S\_{db}}{1-\beta\_2^t}
$$

$$
\ \ \ \ W:=W-\alpha\frac{V\_{dW}^{corrected}}{\sqrt{S\_{dW}^{corrected}}+\varepsilon},\ b:=b-\alpha\frac{V\_{db}^{corrected}}{\sqrt{S\_{db}^{corrected}}+\varepsilon}
$$
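The update rule above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `adam_step`, the toy quadratic objective, and the default `alpha=0.001` are assumptions made for the example and do not come from the text above.

```python
import numpy as np

def adam_step(param, grad, v, s, t,
              alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a parameter array.

    `t` is the 1-based iteration count. `alpha=0.001` is an assumed
    default chosen for this sketch, not a value from the notes.
    """
    # Momentum-style moving average of the gradient (V in the formulas).
    v = beta1 * v + (1 - beta1) * grad
    # RMSprop-style moving average of the squared gradient (S in the formulas).
    s = beta2 * s + (1 - beta2) * grad ** 2
    # Bias correction for the zero initialization of v and s.
    v_corrected = v / (1 - beta1 ** t)
    s_corrected = s / (1 - beta2 ** t)
    # Parameter update.
    param = param - alpha * v_corrected / (np.sqrt(s_corrected) + eps)
    return param, v, s

# Toy usage (hypothetical objective): minimize f(w) = sum(w^2), gradient 2w.
w = np.array([1.0, -2.0])
v = np.zeros_like(w)
s = np.zeros_like(w)
for t in range(1, 501):
    grad = 2 * w
    w, v, s = adam_step(w, grad, v, s, t, alpha=0.01)
```

The same function would be called once for `W` and once for `b` in each layer, with a separate `v` and `s` kept for every parameter array.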

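Adam has several hyperparameters: $$\alpha,\beta\_1,\beta\_2,\varepsilon$$. $$\beta\_1$$ is usually set to 0.9, $$\beta\_2$$ to 0.999, and $$\varepsilon$$ to $$10^{-8}$$. In practice $$\alpha$$ usually needs tuning, while $$\beta\_1$$ and $$\beta\_2$$ are often left at these defaults and adjusted only occasionally.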
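As a hand-worked check of the bias-correction step, consider the first iteration ($$t=1$$) with $$\beta\_1=0.9$$ and $$\beta\_2=0.999$$, starting from $$V\_{dW}=S\_{dW}=0$$:

$$
V\_{dW}=0.1\,dW,\ V\_{dW}^{corrected}=\frac{0.1\,dW}{1-0.9}=dW
$$

$$
S\_{dW}=0.001\,dW^2,\ S\_{dW}^{corrected}=\frac{0.001\,dW^2}{1-0.999}=dW^2
$$

$$
W:=W-\alpha\frac{dW}{|dW|+\varepsilon}\approx W-\alpha\,\mathrm{sign}(dW)
$$

Without the correction, the two averages would start at only 0.1 and 0.001 of the values they estimate. With it, each element of $$W$$ moves by roughly $$\alpha$$ on the first step regardless of the gradient's magnitude, assuming $$|dW|\gg\varepsilon$$.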

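Adam combines the strengths of gradient descent with momentum and RMSprop: the momentum term smooths the update direction, and the RMSprop term scales the step size for each parameter. Together these often speed up neural network training considerably.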

