How to Download Files from URL in Python

Learn how to use the requests and tqdm libraries to build a powerful file downloader with a progress bar in Python.
3 min read · Updated May 2024 · General Python Tutorials

Downloading files from the Internet is one of the most common tasks performed on the Web, and many applications need to let their users do it.

In this tutorial, you will learn how to download files over HTTP in Python using the requests library.

Related: How to Use Hash Algorithms in Python using hashlib.

Let's get started by installing the required dependencies:

pip3 install requests tqdm

We are going to use the tqdm module here just to print a good-looking progress bar while downloading.

Open up a new Python file and import:

from tqdm import tqdm
import requests
import cgi
import sys

We'll be getting the file URL from the command line arguments:

# the url of file you want to download, passed from command line arguments
url = sys.argv[1]
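
Reading sys.argv[1] directly raises an IndexError when the script is run without a URL. As a rough sketch of a friendlier alternative, the standard argparse module can validate the arguments and print a usage message. The parse_args() helper and the optional -o/--output flag below are hypothetical additions for illustration, not part of the original script:

```python
import argparse


def parse_args(argv=None):
    """Parse the downloader's command line arguments.

    Hypothetical helper: the original script only reads sys.argv[1].
    """
    parser = argparse.ArgumentParser(description="Download a file over HTTP.")
    # the url of the file you want to download (required)
    parser.add_argument("url", help="URL of the file to download")
    # assumed extra option: let the user override the output file name
    parser.add_argument("-o", "--output", default=None,
                        help="file name to save as (default: derived from the response)")
    # argv=None makes argparse fall back to sys.argv[1:]
    return parser.parse_args(argv)
```

With this in place, url = parse_args().url could stand in for the sys.argv lookup.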

The function we are going to use to download content from the web is requests.get(). By default, however, it downloads the whole response body immediately, which we don't want: a large file would be read entirely into memory before we could write any of it to disk. Luckily, requests.get() accepts a stream parameter that we can set to True:

# read 1024 bytes every time 
buffer_size = 1024
# download the body of response by chunk, not immediately
response = requests.get(url, stream=True)
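
One caveat with the call above: requests does not raise an exception on HTTP error statuses and does not apply a timeout unless you ask for one, so a 404 page could end up saved as if it were the file. Here is a minimal sketch of a stricter variant. The open_stream() name and the 10-second default are assumptions made for illustration:

```python
import requests


def open_stream(url, timeout=10):
    """Start a streamed GET request and fail early on HTTP errors.

    Hypothetical helper; the 10-second timeout is an arbitrary example value.
    """
    # stream=True defers downloading the body; timeout bounds how long
    # requests waits for the server to connect and to send data
    response = requests.get(url, stream=True, timeout=timeout)
    try:
        # raises requests.HTTPError for 4xx and 5xx status codes
        response.raise_for_status()
    except requests.HTTPError:
        # release the connection before propagating the error
        response.close()
        raise
    return response
```

The returned object is a regular requests response, so the rest of the script could use it unchanged.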

Now only the response headers have been downloaded and the connection remains open, which lets us control the download ourselves with the iter_content() method. Before we see it in action, we first need to retrieve the total file size and the file name:

# get the total file size
file_size = int(response.headers.get("Content-Length", 0))
# get the default filename
default_filename = url.split("/")[-1]
# get the content disposition header
content_disposition = response.headers.get("Content-Disposition")
if content_disposition:
    # parse the header using cgi
    value, params = cgi.parse_header(content_disposition)
    # extract filename from content disposition
    filename = params.get("filename", default_filename)
else:
    # if the Content-Disposition header is not available, just use the default from the URL
    filename = default_filename

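We get the file size in bytes from the Content-Length response header, falling back to 0 if the server does not send it. We also get the file name from the Content-Disposition header when it is present, but we need to parse it using the cgi.parse_header() function. Note that the cgi module was deprecated in Python 3.11 and removed in Python 3.13, so this part of the script requires Python 3.12 or earlier.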
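
For newer Python versions, one possible replacement for cgi.parse_header() is the standard email.message.Message class, whose get_filename() method reads the filename parameter of a Content-Disposition header. The sketch below is only one way to do it: the pick_filename() helper and the "download" fallback name are hypothetical, and the sketch also strips any directory part and ignores the URL's query string, which the original snippet does not do:

```python
import os
from email.message import Message
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def pick_filename(url, content_disposition=None, fallback="download"):
    """Choose a local file name from the header, else from the URL path.

    Hypothetical helper, not part of the original script.
    """
    if content_disposition:
        # let the email package parse the header parameters
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            # keep only the last path component of a server-supplied name
            name = os.path.basename(name)
            if name:
                return name
    # fall back to the last segment of the URL path, without the query string
    name = PurePosixPath(unquote(urlparse(url).path)).name
    return name or fallback
```

In the script above, this could be called as filename = pick_filename(url, response.headers.get("Content-Disposition")).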

Let's download the file now:

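# progress bar, changing the unit to bytes instead of iterations (tqdm's default)
progress = tqdm(response.iter_content(buffer_size), f"Downloading {filename}", total=file_size, unit="B", unit_scale=True, unit_divisor=1024)
with open(filename, "wb") as f:
    # iterate over progress.iterable rather than progress itself,
    # so tqdm does not also count each chunk as one unit
    for data in progress.iterable:
        # write data read to the file
        f.write(data)
        # update the progress bar manually with the number of bytes written
        progress.update(len(data))
# close the progress bar so its final state is displayed
progress.close()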

The iter_content() method iterates over the response data in chunks, which avoids reading the whole content into memory at once for large responses. We passed buffer_size as the number of bytes it should read into memory on each iteration.

We then wrapped the iterator in a tqdm object, which prints a fancy progress bar. We also changed tqdm's default unit from iterations to bytes.

After that, in each iteration, we read a chunk of data, write it to the opened file, and update the progress bar.
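
The same write loop can also be packaged as a small function. The sketch below is a variation rather than the script's exact code: it creates the tqdm bar from a total instead of wrapping the iterator, uses it as a context manager so the bar is closed automatically, and passes total=None when the size is unknown (for example, when Content-Length is missing). The save_stream() name is hypothetical:

```python
from tqdm import tqdm


def save_stream(chunks, path, total=None):
    """Write an iterable of byte chunks to `path`; return the bytes written.

    Hypothetical helper; `chunks` would typically be
    response.iter_content(buffer_size).
    """
    written = 0
    # total=None tells tqdm the size is unknown, so it shows a running
    # byte counter instead of a percentage bar
    with tqdm(total=total or None, desc=f"Downloading {path}",
              unit="B", unit_scale=True, unit_divisor=1024) as progress:
        with open(path, "wb") as f:
            for data in chunks:
                # write data read to the file
                f.write(data)
                # advance the bar by the number of bytes in this chunk
                progress.update(len(data))
                written += len(data)
    return written
```

With the variables defined earlier, a call would look like save_stream(response.iter_content(buffer_size), filename, file_size).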

Here is my result after trying to download a file. You can choose any file you want; just make sure the URL ends with the file name and extension (.exe, .pdf, etc.), since that is where the default file name comes from:

C:\file-downloader>python download.py https://download.virtualbox.org/virtualbox/6.1.18/VirtualBox-6.1.18-142142-Win.exe
Downloading VirtualBox-6.1.18-142142-Win.exe:   8%|██▍                             | 7.84M/103M [00:06<01:14, 1.35MB/s]

It is working!

Alright, we are done. As you can see, downloading files in Python is pretty easy using powerful libraries like requests. You can now use this in your Python applications. Good luck!

By the way, if you wish to download torrent files, check this tutorial.

Happy Coding ♥
