3.3 Mini exercise: downloading beautiful images

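On the target page, every image sits inside a <ul> element whose class is img_list.

The address of each image is the src attribute of an <img> tag: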

<img src="http://image.nationalgeographic.com.cn/2017/1228/20171228030617696.jpg">
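The structure described above can be tried offline before touching the real site. The following is a minimal sketch, not the page's actual markup: `SAMPLE_HTML`, the second and third image URLs, and the helper name `extract_image_urls` are all made up for illustration. It uses Python's built-in `html.parser` backend so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment imitating the layout described above:
# images of interest live in <ul class="img_list">, others do not.
SAMPLE_HTML = """
<ul class="img_list">
  <li><a href="#"><img src="http://image.nationalgeographic.com.cn/2017/1228/20171228030617696.jpg"></a></li>
  <li><a href="#"><img src="http://example.com/images/second.jpg"></a></li>
</ul>
<ul class="other">
  <li><img src="http://example.com/images/ignored.jpg"></li>
</ul>
"""


def extract_image_urls(html):
    """Return the src of every <img> found inside a <ul class="img_list">."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for ul in soup.find_all("ul", {"class": "img_list"}):
        for img in ul.find_all("img"):
            src = img.get("src")  # .get() returns None instead of raising if src is missing
            if src:
                urls.append(src)
    return urls


image_urls = extract_image_urls(SAMPLE_HTML)
print(image_urls)
```

Only the two images inside the `img_list` list are returned; the one in the other `<ul>` is skipped.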

Downloading the images

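import os

import requests
from bs4 import BeautifulSoup

URL = "http://www.nationalgeographic.com.cn/animals/"

# Fetch the page and parse it
html = requests.get(URL).text
soup = BeautifulSoup(html, 'lxml')
# Collect every <ul class="img_list"> on the page
img_ul = soup.find_all('ul', {"class": "img_list"})

# Make sure the target folder exists, otherwise open() below fails
os.makedirs('./img', exist_ok=True)

for ul in img_ul:
    imgs = ul.find_all('img')
    for img in imgs:
        url = img['src']
        # stream=True defers downloading the body until we iterate over it
        r = requests.get(url, stream=True)
        # Use the last path segment of the URL as the file name
        image_name = url.split('/')[-1]
        with open('./img/%s' % image_name, 'wb') as f:
            # Write the image chunk by chunk instead of holding it all in memory
            for chunk in r.iter_content(chunk_size=128):
                f.write(chunk)
        print('Saved %s' % image_name)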

"""
Saved 20171227102206573.jpg
...
Saved 20171214020322682.jpg
"""

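The two file-handling steps in the script (deriving a file name from the URL and writing the body chunk by chunk) can be pulled out into small helpers and checked without any network access. This is only a sketch under assumptions: `filename_from_url` and `save_chunks` are hypothetical names, and the query string on the demo URL is invented to show a case where `url.split('/')[-1]` would keep unwanted characters.

```python
import os
import posixpath
import tempfile
from urllib.parse import urlsplit


def filename_from_url(url, default="image.jpg"):
    """Take the last path segment of a URL, ignoring any query string or fragment."""
    name = posixpath.basename(urlsplit(url).path)
    return name or default


def save_chunks(chunks, directory, filename):
    """Write an iterable of byte chunks to directory/filename and return the path."""
    os.makedirs(directory, exist_ok=True)  # create the folder if it is missing
    path = os.path.join(directory, filename)
    with open(path, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
    return path


# Offline demo: a list of byte strings stands in for a streamed response body.
demo_url = "http://image.nationalgeographic.com.cn/2017/1228/20171228030617696.jpg?size=large"
demo_dir = os.path.join(tempfile.mkdtemp(), "img")
saved_path = save_chunks([b"abc", b"", b"def"], demo_dir, filename_from_url(demo_url))

with open(saved_path, "rb") as f:
    saved_bytes = f.read()
print(saved_path, saved_bytes)
```

With a real download, the chunk iterable would be `r.iter_content(chunk_size=...)` from a `requests.get(url, stream=True)` response, as in the script above.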