To download a large file in Python with requests, use one of the following:
- Use the iter_content() method
- Use the shutil.copyfileobj() method
Using the requests library to download a large file in Python
With the following streaming code, Python's memory usage stays bounded regardless of the size of the downloaded file:
import requests

def download_file(url):
    # Use the last path segment of the URL as the local file name
    local_filename = url.split('/')[-1]
    # NOTE the stream=True parameter below
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(local_filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=8192):
                # If you have chunk encoded response uncomment if
                # and set chunk_size parameter to None.
                #if chunk:
                f.write(chunk)
    return local_filename
Note that the number of bytes returned by each iter_content iteration is not guaranteed to equal chunk_size. When the response is compressed, the decoded chunk can be far bigger, and the size can differ from one iteration to the next.
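Because chunk lengths can vary, anything that tracks progress is safer counting len(chunk) than assuming chunk_size. Here is a minimal sketch of that idea. The helper name write_chunks is hypothetical, and the list of byte strings only stands in for r.iter_content(...):

```python
import io

def write_chunks(chunks, fileobj):
    """Write each chunk to fileobj and return the total number of bytes written."""
    total = 0
    for chunk in chunks:
        if not chunk:  # skip empty chunks
            continue
        fileobj.write(chunk)
        total += len(chunk)  # count the actual bytes, not the requested chunk_size
    return total

# Stand-in for r.iter_content(chunk_size=8192): chunks of uneven length.
fake_chunks = [b"a" * 8192, b"b" * 10000, b"", b"c" * 5]
buffer = io.BytesIO()
written = write_chunks(fake_chunks, buffer)
print(written)
```

In a real download, the same helper could be called as write_chunks(r.iter_content(chunk_size=8192), f).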
The chunk_size is crucial. By default it is 1 (1 byte), which means that a 1 MB download takes about 1 million iterations. So be careful to set it explicitly.
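The arithmetic behind that warning can be sketched as follows. The helper estimated_iterations is hypothetical, and it assumes every chunk is exactly chunk_size bytes, which, as the note above says, is not guaranteed, so treat the result as a rough estimate:

```python
def estimated_iterations(total_bytes, chunk_size):
    """Estimate loop iterations, assuming every chunk is exactly chunk_size bytes."""
    # Integer ceiling division: a final partial chunk still costs one iteration.
    return (total_bytes + chunk_size - 1) // chunk_size

one_mb = 1024 * 1024

print(estimated_iterations(one_mb, 1))     # default chunk_size of 1 byte
print(estimated_iterations(one_mb, 8192))  # chunk_size used in the example above
```

With a 1-byte chunk the loop body runs roughly a million times per megabyte, while an 8192-byte chunk brings that down to a few hundred.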
Using the shutil and requests modules to download a large file
This streams the file to disk without using excessive memory, and the code is simple.
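import requests
import shutil

def download_file(url):
    local_filename = url.split('/')[-1]
    with requests.get(url, stream=True) as r:
        with open(local_filename, 'wb') as f:
            # r.raw is the undecoded byte stream: a gzip- or deflate-encoded
            # response is written to disk still compressed.
            shutil.copyfileobj(r.raw, f)
    return local_filename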
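To see why this keeps memory usage low, note that shutil.copyfileobj reads from the source in fixed-size blocks and writes each block out before reading the next. The sketch below illustrates this with in-memory streams standing in for r.raw and the output file. The 16 KiB buffer size is an arbitrary choice for illustration:

```python
import io
import shutil

source = io.BytesIO(b"x" * 100_000)  # stand-in for r.raw
destination = io.BytesIO()           # stand-in for the open output file

# Copy in 16 KiB blocks through the optional length argument,
# so the whole payload is never read in a single call.
shutil.copyfileobj(source, destination, length=16 * 1024)

print(len(destination.getvalue()))
```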
Download a large file using the urllib module
The requests module works well, but large files can also be downloaded with the standard library's urllib module. The following code uses urllib.request.urlretrieve to save a file directly to disk.
from urllib.request import urlretrieve

url = 'http://mirror.pnl.gov/releases/16.04.2/ubuntu-16.04.2-desktop-amd64.iso'
dst = 'ubuntu-16.04.2-desktop-amd64.iso'
urlretrieve(url, dst)
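For a long download it can help to report progress. urlretrieve accepts an optional reporthook callable, which it calls with the block count so far, the block size in bytes, and the total file size. The sketch below is one way to use it. The helper names percent_done, show_progress, and download_with_progress are hypothetical:

```python
from urllib.request import urlretrieve

def percent_done(block_num, block_size, total_size):
    """Return progress as a percentage, or None when the total size is unknown."""
    if total_size <= 0:  # some servers do not report a file size
        return None
    return min(100.0, block_num * block_size * 100.0 / total_size)

def show_progress(block_num, block_size, total_size):
    """Report hook: print the percentage on a single, updating line."""
    pct = percent_done(block_num, block_size, total_size)
    if pct is not None:
        print(f"\r{pct:5.1f}%", end="")

def download_with_progress(url, dst):
    # urlretrieve calls the hook once when the connection is established
    # and once after each block read.
    return urlretrieve(url, dst, reporthook=show_progress)
```

Calling download_with_progress(url, dst) with the url and dst values above would then print a running percentage while the file is saved.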
If you want to save the download to a temporary file instead, try the following code. It streams the response into a temporary file on disk rather than loading it into memory.
from urllib.request import urlopen
from shutil import copyfileobj
from tempfile import NamedTemporaryFile

url = 'http://mirror.pnl.gov/releases/16.04.2/ubuntu-16.04.2-desktop-amd64.iso'
with urlopen(url) as fsrc, NamedTemporaryFile(delete=False) as fdst:
    copyfileobj(fsrc, fdst)
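Because the temporary file is created with delete=False, it stays on disk after the with block ends, so the caller needs to remember its path and remove it when done. A minimal sketch of that bookkeeping, using an in-memory stream as a stand-in for urlopen(url):

```python
import io
import os
from shutil import copyfileobj
from tempfile import NamedTemporaryFile

fsrc = io.BytesIO(b"example payload")  # stand-in for urlopen(url)

with NamedTemporaryFile(delete=False) as fdst:
    copyfileobj(fsrc, fdst)
    temp_path = fdst.name  # keep the path so the file can be found later

size = os.path.getsize(temp_path)  # the file is still on disk here
os.remove(temp_path)               # clean up explicitly when finished
```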
Large file download using the wget module
You can also use the third-party wget module (installed with pip install wget) to download files. Check the code below.
import wget

url = 'http://mirror.pnl.gov/releases/16.04.2/ubuntu-16.04.2-desktop-amd64.iso'
wget.download(url)