Nine Common Ways to Download Files in Python
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">In Python we often need to download files: a crawler saving images from a page, fetching PDF or Word documents, audio, or video linked from a page, or pulling resources such as archives, videos, and GRIB2 files from a remote server. So which methods can we use to download a file?</p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Note: the examples in this article use the following download URL:</strong></p><strong style="color: blue;">http://cv.sycdn.kuwo.cn/99d1a17cc43457c58cd0049db033c348/650e4bbc/resource/n1/17/37/233599851.mp3, an MP3 song about 4 MB in size. Readers are welcome to discuss in the comments.</strong>
<h1 style="color: black; text-align: left; margin-bottom: 10px;">1. Using requests</h1>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Requests is one of the most widely used third-party libraries in Python: a powerful, easy-to-use HTTP library.</p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">We call requests.get to fetch the file content and then write it to disk in binary mode. url is the address of the file to download and localfile is the local file name to save to. timeout is in seconds; in Requests it limits how long to wait for the server to connect and to send data, not the total download time, so raise it for slow servers. Because r.content holds the whole response in memory, this method is not a good fit for very large files that take a long time to download. Sample code:</p>
<pre><code>import requests

url = "http://cv.sycdn.kuwo.cn/99d1a17cc43457c58cd0049db033c348/650e4bbc/resource/n1/17/37/233599851.mp3"
localfile = "./233599851.mp3"
# url: address of the file to download; localfile: local file name to save to; timeout: in seconds
r = requests.get(url, timeout=300)
r.raise_for_status()  # stop on an HTTP error instead of saving an error page
with open(localfile, "wb") as fp:
    fp.write(r.content)
</code></pre>
<h1 style="color: black; text-align: left; margin-bottom: 10px;">2. Using urllib</h1>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">urllib is part of the Python standard library, so nothing needs to be installed. It offers a simple download interface: call urllib.request.urlretrieve() to download a file. Sample code:</p>
<pre><code>import urllib.request

url = "http://cv.sycdn.kuwo.cn/99d1a17cc43457c58cd0049db033c348/650e4bbc/resource/n1/17/37/233599851.mp3"
localfile = "./233599851.mp3"
urllib.request.urlretrieve(url, localfile)
</code></pre>
<h1 style="color: black; text-align: left; margin-bottom: 10px;">3. Using urllib2</h1>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">urllib2 is an enhanced version of urllib, with more features and better error handling. <strong style="color: blue;">Note: urllib2 ships with Python 2.7 (no installation needed, just import it). In Python 3 its functionality moved to urllib.request and urllib.error. It is covered here for completeness.</strong> Sample code:</p>
<pre><code># Python 2.7 only
import urllib2

url = "http://cv.sycdn.kuwo.cn/99d1a17cc43457c58cd0049db033c348/650e4bbc/resource/n1/17/37/233599851.mp3"
localfile = "./233599851.mp3"
data = urllib2.urlopen(url).read()
with open(localfile, "wb") as video:
    video.write(data)
</code></pre>
<h1 style="color: black; text-align: left; margin-bottom: 10px;">4. Using urllib3</h1>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">urllib3 is a lightweight third-party library that improves on urllib, providing thread safety, HTTP connection pooling and reuse, file uploads, and more. Here we use urllib3 to fetch a link and store it in a file. Sample code:</p>
<pre><code>import urllib3

url = "http://cv.sycdn.kuwo.cn/99d1a17cc43457c58cd0049db033c348/650e4bbc/resource/n1/17/37/233599851.mp3"
localfile = "./233599851.mp3"
# Create an HTTP connection pool
http = urllib3.PoolManager()
r = http.request("GET", url)
print(r.status)  # HTTP status code; printing r.data would dump the whole binary file
with open(localfile, "wb") as mp3:
    mp3.write(r.data)
r.release_conn()  # finally, release the HTTP connection
</code></pre>
<h1 style="color: black; text-align: left; margin-bottom: 10px;">5. Using wget</h1>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">wget is a free, cross-platform tool for automatically downloading files from the network. It supports HTTP, HTTPS, and FTP, and can work through an HTTP proxy. Here we use the Python wget module to download a file from a URL: wget.download fetches the file at url straight into the local file localfile. The module must be installed with pip before use.</p>
<pre><code>pip install wget
</code></pre>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Sample code:</p>
<pre><code>import wget

url = "http://cv.sycdn.kuwo.cn/99d1a17cc43457c58cd0049db033c348/650e4bbc/resource/n1/17/37/233599851.mp3"
localfile = "./233599851.mp3"
wget.download(url, localfile)
</code></pre>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Besides the Python module, we can also call the wget command-line tool to download the file.</p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Here is an example:</p>
<pre><code>import os

url = "http://cv.sycdn.kuwo.cn/99d1a17cc43457c58cd0049db033c348/650e4bbc/resource/n1/17/37/233599851.mp3"
localfile = "./233599851.mp3"
# Requires the wget command-line tool to be installed and on the PATH
os.system(f"wget -O {localfile} {url}")
</code></pre>
<h1 style="color: black; text-align: left; margin-bottom: 10px;">6. Downloading a redirected file</h1>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Sometimes the remote server redirects the URL we request to a different source URL. requests.get handles this easily: set allow_redirects to True (already the default for GET requests) so that redirects are followed, then save the content returned after the redirect in binary mode. Sample code:</p>
<pre><code>import requests

url = "http://cv.sycdn.kuwo.cn/99d1a17cc43457c58cd0049db033c348/650e4bbc/resource/n1/17/37/233599851.mp3"
localfile = "./233599851.mp3"
r = requests.get(url, timeout=300, allow_redirects=True)
with open(localfile, "wb") as fp:
    fp.write(r.content)
</code></pre>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Next, we look at how to download large files.</p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Large files should be downloaded as a stream, which is more efficient and safer. The idea is to read the file from the network piece by piece, in order, and write each piece to the local file as soon as it has been read. This downloads large files reliably and avoids crashes caused by running out of memory.</p>
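<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The piece-by-piece idea described above can be illustrated with a minimal sketch. The helper name copy_in_chunks and the in-memory streams are assumptions made for illustration only; in a real download the source would be a network response and the destination a file opened in binary mode.</p>

```python
import io


def copy_in_chunks(source, dest, chunk_size=1024):
    """Copy a binary stream to another one chunk at a time; return bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)  # read at most chunk_size bytes
        if not chunk:                    # empty bytes means the source is exhausted
            break
        dest.write(chunk)                # write the chunk immediately
        total += len(chunk)
    return total


# In-memory streams stand in for the network response and the local file.
source = io.BytesIO(b"a" * 2500)
dest = io.BytesIO()
copied = copy_in_chunks(source, dest)
```

<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Only one chunk is held in memory at a time, so memory use stays roughly constant regardless of file size.</p>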
<h1 style="color: black; text-align: left; margin-bottom: 10px;">7. Downloading large files with requests.get</h1>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Use the get method of the requests library and set the stream parameter to True.</p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Sample code:</p>
<pre><code>import requests

url = "http://cv.sycdn.kuwo.cn/99d1a17cc43457c58cd0049db033c348/650e4bbc/resource/n1/17/37/233599851.mp3"
localfile = "./233599851.mp3"
r = requests.get(url, stream=True)
# The with block closes the file even if the download fails partway
with open(localfile, "wb") as fp:
    for chunk in r.iter_content(chunk_size=1024):
        if chunk:  # skip empty chunks
            fp.write(chunk)
</code></pre>
<h1 style="color: black; text-align: left; margin-bottom: 10px;">8. Downloading large files with urllib2</h1>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Note</strong> that urllib2 ships with Python 2.7, so this example only runs on Python 2.7. It is included for reference. Sample code:</p>
<pre><code># Python 2.7 only
import urllib2

url = "http://cv.sycdn.kuwo.cn/99d1a17cc43457c58cd0049db033c348/650e4bbc/resource/n1/17/37/233599851.mp3"
localfile = "./233599851.mp3"
r = urllib2.Request(url)
u = urllib2.urlopen(r)
with open(localfile, "wb") as f:
    while True:
        tmp = u.read(1024)  # read up to 1024 bytes at a time
        if not tmp:
            break
        f.write(tmp)
</code></pre>
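<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">For readers on Python 3, the same chunked download can be approximated with the standard library alone. This is a sketch rather than part of the original examples: the function name download_py3 and its default values are hypothetical. It relies on urllib.request.urlopen, the Python 3 counterpart of urllib2.urlopen, and on shutil.copyfileobj, which copies between file-like objects in blocks of the given size.</p>

```python
import shutil
import urllib.request


def download_py3(url, localfile, chunk_size=1024, timeout=300):
    """Stream the resource at url into localfile, chunk_size bytes at a time."""
    # urlopen returns a file-like response that also works as a context manager.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(localfile, "wb") as f:
            # copyfileobj reads and writes chunk_size bytes per step
            # until the response is exhausted.
            shutil.copyfileobj(response, f, chunk_size)
```

<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">With the variables from the example above, it would be called as download_py3(url, localfile).</p>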
<h1 style="color: black; text-align: left; margin-bottom: 10px;">9. Downloading large files with urllib3</h1>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Since urllib2 can download large files, urllib3 can of course do so as well. Use the request method of the HTTP connection pool and set the preload_content parameter to False, so that the body is read on demand instead of being loaded all at once. Sample code:</p>
<pre><code>import urllib3

url = "http://cv.sycdn.kuwo.cn/99d1a17cc43457c58cd0049db033c348/650e4bbc/resource/n1/17/37/233599851.mp3"
localfile = "./233599851.mp3"
# Create an HTTP connection pool
http = urllib3.PoolManager()
r = http.request("GET", url, preload_content=False)
chunk_size = 1024
with open(localfile, "wb") as out:
    while True:
        data = r.read(chunk_size)
        if not data:
            break
        out.write(data)
r.release_conn()  # finally, release the HTTP connection
</code></pre>
<h1 style="color: black; text-align: left; margin-bottom: 10px;">Summary</h1>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">With these nine common approaches, downloading files in Python is straightforward, and we have also covered how to handle large files. There are certainly other ways to download files that are not listed here. Of those covered, Requests is the most commonly used, because it offers many powerful and flexible features that make complex download programs easy to write; it is also what I usually use in my own projects. The wget library is a capable downloader and another very good option.</p>
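<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">As a wrap-up of the Requests-based approach, the following sketch combines the timeout, streaming, and error-checking ideas from this article into one reusable helper. The function names save_response and download are hypothetical and the default values are assumptions; treat it as a starting point rather than a definitive implementation.</p>

```python
def save_response(response, dest, chunk_size=8192):
    """Write a streamed Requests-style response to dest; return bytes written."""
    response.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
    written = 0
    with open(dest, "wb") as fp:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty chunks
                fp.write(chunk)
                written += len(chunk)
    return written


def download(url, dest, timeout=30):
    """Download url to dest with streaming; redirects are followed by default."""
    import requests  # third-party; imported here so save_response works without it

    # Using the response as a context manager releases the connection afterwards.
    with requests.get(url, stream=True, timeout=timeout) as r:
        return save_response(r, dest)
```

<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Keeping the file-writing step separate from the network call makes it possible to exercise the helper with a stand-in response object, without touching the network.</p>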