# 2.5 Learning Word Embeddings

The embedding matrix $$E$$ can be learned by building a neural language model and training it with gradient descent. Suppose the training sample is:

**I want a glass of orange (juice).**

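The task is to predict the last word, "juice", from the first 6 words of the sentence. $$E$$ is unknown and has to be learned; each word $$w$$ is represented by its embedding vector $$e_w$$. The structure of the neural network is shown in the figure below: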

![](https://github.com/fengdu78/deeplearning_ai_books/raw/master/images/31347eca490e0ae8541140fb01c04d72.png)

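The input layer takes 6 embedding vectors of 300 dimensions each, for a total of 1800 inputs. The softmax layer outputs 10000 probabilities, one for each word in the vocabulary, and the correct output label is "juice". The parameters to be learned are $$E, W^{[1]}, b^{[1]}, W^{[2]}, b^{[2]}$$. Given enough training sentences, gradient descent optimizes these parameters iteratively and eventually yields the embedding matrix $$E$$.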
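The forward pass of the network described above can be sketched in plain Python. This is only a minimal sketch under assumptions that are not in the original notes: toy sizes (a 10-word vocabulary, 4-dimensional embeddings, and 5 hidden units in place of 10000, 300, and an unspecified hidden size), a tanh hidden activation, random initialization, `E` stored with one row per word, and hypothetical helper names such as `forward`. The gradient descent update itself is omitted.

```python
import math
import random

random.seed(0)

# Toy sizes (assumptions): the text uses a 10000-word vocabulary and 300-dim embeddings.
vocab = ["i", "want", "a", "glass", "of", "orange", "juice", "apple", "milk", "water"]
word_to_id = {w: i for i, w in enumerate(vocab)}
V, D, H, N_CONTEXT = len(vocab), 4, 5, 6


def rand_matrix(rows, cols):
    """Small random matrix stored as a list of rows."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


# Parameters to be learned: E, W1, b1, W2, b2.
E = rand_matrix(V, D)                # row w holds the embedding vector e_w (assumed layout)
W1 = rand_matrix(H, N_CONTEXT * D)   # hidden layer weights
b1 = [0.0] * H
W2 = rand_matrix(V, H)               # softmax layer weights
b2 = [0.0] * V


def matvec(W, x, b):
    """Compute W @ x + b for list-based matrices and vectors."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(context_words):
    """Return one probability per vocabulary word given the context words."""
    # Look up each context word's embedding and concatenate them into one input vector.
    x = [v for w in context_words for v in E[word_to_id[w]]]
    h = [math.tanh(v) for v in matvec(W1, x, b1)]  # tanh is an assumed activation
    return softmax(matvec(W2, h, b2))


context = ["i", "want", "a", "glass", "of", "orange"]
probs = forward(context)
# Cross-entropy loss against the correct label "juice".
loss = -math.log(probs[word_to_id["juice"]])
print(len(probs), round(sum(probs), 6), loss)
```

With the sizes from the text, the concatenated input `x` would have 6 × 300 = 1800 entries and `probs` would have 10000 entries.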

This approach works reasonably well: words with similar properties tend to end up with embedding vectors that are close to each other.

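To keep the size of the input layer fixed, the model can use only the 4 words preceding the word to be predicted. In this sentence, that means only "a glass of orange" is used as input. The window size 4 is a tunable hyperparameter.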
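The fixed-window idea can be illustrated with a short sketch that slides a window over a tokenized sentence and emits (input, label) pairs. The function name `window_pairs` and the whitespace tokenization are assumptions made for illustration only.

```python
def window_pairs(tokens, n=4):
    """Return (context, target) pairs where context is the n words before target."""
    pairs = []
    for i in range(n, len(tokens)):
        pairs.append((tokens[i - n:i], tokens[i]))
    return pairs


sentence = "I want a glass of orange juice".split()
pairs = window_pairs(sentence, n=4)
for context, target in pairs:
    print(context, "->", target)
```

Every pair has exactly `n` input words, so the input layer size stays at `n` times the embedding dimension regardless of sentence length.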

![](https://github.com/fengdu78/deeplearning_ai_books/raw/master/images/747e619260737ded586ae51b3b4f07d6.png)

The input is called the **context** and the output is called the **target**. For the sentence above:

* **context: a glass of orange**
* **target: juice**

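There are several ways to choose the context:

* **the n words before and/or after the target, with n tunable**
* **the 1 word immediately before the target**
* **a single word near the target (Skip-Gram)**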
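The context choices listed above can be sketched as small helper functions. All function names, the example sentence, the window sizes, and the uniform random sampling of the nearby word are assumptions made for illustration, not details given in the notes.

```python
import random


def two_sided_context(tokens, i, n=4):
    """Context = up to n words before and up to n words after the target at index i."""
    return tokens[max(0, i - n):i] + tokens[i + 1:i + 1 + n]


def previous_word_pairs(tokens):
    """Context = the single word immediately before the target."""
    return [(tokens[i - 1], tokens[i]) for i in range(1, len(tokens))]


def skip_gram_pairs(tokens, window=2, rng=None):
    """Context = one word sampled uniformly from within `window` positions of the target."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        candidates = [j for j in range(lo, hi) if j != i]
        if candidates:
            pairs.append((tokens[rng.choice(candidates)], target))
    return pairs


# Illustrative sentence (an assumption, not taken from the notes above).
sentence = "I want a glass of orange juice to go along with my cereal".split()
i_juice = sentence.index("juice")

both_sides = two_sided_context(sentence, i_juice, n=4)
prev_pairs = previous_word_pairs(sentence)
sg_pairs = skip_gram_pairs(sentence, window=2)

print(both_sides)
print(prev_pairs[i_juice - 1])
print(sg_pairs[:3])
```

Each strategy produces (context, target) training pairs that can be fed to the same kind of network; only the way the context is picked differs.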

![](https://github.com/fengdu78/deeplearning_ai_books/raw/master/images/638c103855ffeb25122259dd6b669850.png)

In practice, all of these context choices can produce a reasonably accurate embedding matrix $$E$$.

