# 1.1 计算机视觉（Computer vision）

图片分类，或图片识别：

[![](https://github.com/fengdu78/deeplearning_ai_books/raw/master/images/373615de4e30035c662958ce39115fb4.png)](https://github.com/fengdu78/deeplearning_ai_books/blob/master/images/373615de4e30035c662958ce39115fb4.png)

目标检测：

[![](https://github.com/fengdu78/deeplearning_ai_books/raw/master/images/f8ff84bc95636d9e37e35daef5149164.png)](https://github.com/fengdu78/deeplearning_ai_books/blob/master/images/f8ff84bc95636d9e37e35daef5149164.png)

神经网络实现图片风格迁移：

[![](https://github.com/fengdu78/deeplearning_ai_books/raw/master/images/bf57536975bce32f78c9e66a2360e8a1.png)](https://github.com/fengdu78/deeplearning_ai_books/blob/master/images/bf57536975bce32f78c9e66a2360e8a1.png)

使用传统神经网络处理机器视觉的一个主要问题是输入层维度很大。例如一张64x64x3的图片，神经网络输入层的维度为12288。如果图片尺寸较大，例如一张1000x1000x3的图片，神经网络输入层的维度将达到3百万，使得网络权重W非常庞大。这样会造成两个后果，一是神经网络结构复杂，数据量相对不够，容易出现过拟合；二是所需内存、计算量较大。解决这一问题的方法就是使用卷积神经网络（CNN）。

[![](https://github.com/fengdu78/deeplearning_ai_books/raw/master/images/f126bca19d15f113c0f0371fdf0833d8.png)](https://github.com/fengdu78/deeplearning_ai_books/blob/master/images/f126bca19d15f113c0f0371fdf0833d8.png)

[![](https://github.com/fengdu78/deeplearning_ai_books/raw/master/images/9dc51757210398f26ec96d13540beacb.png)](https://github.com/fengdu78/deeplearning_ai_books/blob/master/images/9dc51757210398f26ec96d13540beacb.png)
