How to Extract Image Metadata in Python

Abdeladim Fadheli · 3 min read · Updated Apr 2024 · Ethical Hacking · Web Scraping · Digital Forensics


In this tutorial, you will learn how to extract useful metadata from images using the Pillow library in Python.

Devices like digital cameras, smartphones, and scanners use the EXIF standard to save images or audio files. This standard defines many tags that can be useful in forensic investigations, such as the make and model of the device, the exact date and time the image was created, and, on some devices, the GPS location.

Please note that free tools such as ImageMagick and ExifTool can already extract metadata; the goal of this tutorial is to do it with the Python programming language.

To get started, you need to install the Pillow library:

$ pip3 install Pillow

Open up a new Python file and follow along:

from PIL import Image
from PIL.ExifTags import TAGS

This tutorial targets JPEG image files, so take any photo you shot and use it to follow along (if you want to test on my image, you'll find it in the tutorial's repository):

# path to the image
imagename = "image.jpg"

# read the image data using PIL
image = Image.open(imagename)

We loaded the image using the Image.open() function. Before calling the getexif() method, note that Pillow already exposes some basic attributes on the image object; let's print them out:

# extract other basic metadata
info_dict = {
    "Filename": image.filename,
    "Image Size": image.size,
    "Image Height": image.height,
    "Image Width": image.width,
    "Image Format": image.format,
    "Image Mode": image.mode,
    "Image is Animated": getattr(image, "is_animated", False),
    "Frames in Image": getattr(image, "n_frames", 1)
}

for label, value in info_dict.items():
    print(f"{label:25}: {value}")


Now, let's call the getexif() method on the image, which returns its EXIF metadata:

# extract EXIF data
exifdata = image.getexif()

The problem with the exifdata variable is that its field names are just numeric IDs, not human-readable names. That's why we need the TAGS dictionary from the PIL.ExifTags module, which maps each tag ID to human-readable text:

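# iterate over all EXIF data fields
for tag_id in exifdata:
    # get the human-readable tag name, falling back to the numeric ID
    tag = TAGS.get(tag_id, tag_id)
    data = exifdata.get(tag_id)
    # decode bytes; binary fields such as MakerNote may not be valid text,
    # so replace undecodable bytes instead of raising UnicodeDecodeError
    if isinstance(data, bytes):
        data = data.decode(errors="replace")
    print(f"{tag:25}: {data}")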

Here is my output:

Filename                 : .\image.jpg
Image Size               : (5312, 2988)       
Image Height             : 2988
Image Width              : 5312
Image Format             : JPEG
Image Mode               : RGB
Image is Animated        : False
Frames in Image          : 1
ExifVersion              : 0220
ShutterSpeedValue        : 4.32
ApertureValue            : 1.85
DateTimeOriginal         : 2016:11:10 19:33:22
DateTimeDigitized        : 2016:11:10 19:33:22
BrightnessValue          : -1.57
ExposureBiasValue        : 0.0
MaxApertureValue         : 1.85
MeteringMode             : 3
Flash                    : 0
FocalLength              : 4.3
ColorSpace               : 1
ExifImageWidth           : 5312
FocalLengthIn35mmFilm    : 28
SceneCaptureType         : 0
ImageWidth               : 5312
ExifImageHeight          : 2988
ImageLength              : 2988
Make                     : samsung
Model                    : SM-G920F
Orientation              : 1
YCbCrPositioning         : 1
XResolution              : 72.0
YResolution              : 72.0
ImageUniqueID            : A16LLIC08SM A16LLIL02GM

ExposureProgram          : 2
ISOSpeedRatings          : 640
ResolutionUnit           : 2
ExposureMode             : 0
FlashPixVersion          : 0100
WhiteBalance             : 0
Software                 : G920FXXS4DPI4
DateTime                 : 2016:11:10 19:33:22
ExifOffset               : 226
MakerNote                : 0100 
                                Z@P
UserComment              :
ExposureTime             : 0.05
FNumber                  : 1.9
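Several values in the listing above are numeric codes (MeteringMode: 3) or packed strings (DateTimeOriginal). Here is a minimal sketch of turning a few of them into friendlier values. The interpret() helper and the EXIF_CODES table are hypothetical names introduced for illustration, and the table covers only a handful of tags, with labels taken from the EXIF specification:

```python
from datetime import datetime

# Partial code tables for a few tags seen in the output (labels per the EXIF spec).
# EXIF_CODES and interpret() are illustrative names, not part of Pillow.
EXIF_CODES = {
    "MeteringMode": {0: "Unknown", 1: "Average", 2: "Center-weighted average",
                     3: "Spot", 4: "Multi-spot", 5: "Pattern", 6: "Partial"},
    "ExposureProgram": {0: "Not defined", 1: "Manual", 2: "Normal program",
                        3: "Aperture priority", 4: "Shutter priority"},
    "ResolutionUnit": {2: "inches", 3: "centimeters"},
    "ColorSpace": {1: "sRGB", 65535: "Uncalibrated"},
    "WhiteBalance": {0: "Auto", 1: "Manual"},
}
DATE_TAGS = {"DateTime", "DateTimeOriginal", "DateTimeDigitized"}


def interpret(tag, value):
    """Return a friendlier version of an EXIF value, or the value unchanged."""
    if tag in DATE_TAGS:
        try:
            # EXIF timestamps use colons in the date part: "YYYY:MM:DD HH:MM:SS"
            return datetime.strptime(str(value).strip("\x00 "), "%Y:%m:%d %H:%M:%S")
        except ValueError:
            return value  # leave malformed timestamps untouched
    codes = EXIF_CODES.get(tag)
    if codes is None:
        return value
    return codes.get(value, value)
```

With the values from the listing, interpret("MeteringMode", 3) gives "Spot" and interpret("DateTimeOriginal", "2016:11:10 19:33:22") gives a regular datetime object that you can sort or compare.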

That's a lot of useful information. A quick search for the Model value, SM-G920F, shows that this image was taken with a Samsung Galaxy S6. Run the script on images captured by other devices, and you'll see different (maybe more) fields.
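If your own run prints only a handful of fields (Make, Model, ExifOffset, and so on), one likely cause is that recent Pillow versions keep camera settings and GPS data in nested directories (IFDs) rather than in the top-level mapping returned by getexif(). The sketch below assumes a Pillow version that provides Exif.get_ifd(); the collect_exif() helper and the two constants are names introduced here for illustration:

```python
from PIL import Image
from PIL.ExifTags import TAGS, GPSTAGS

EXIF_IFD = 0x8769  # pointer tag for the Exif sub-IFD (printed as ExifOffset)
GPS_IFD = 0x8825   # pointer tag for the GPS sub-IFD (printed as GPSInfo)


def collect_exif(image):
    """Return {tag name: value}; GPS tags, if any, are nested under "GPSInfo"."""
    exif = image.getexif()
    # Top-level (IFD0) tags such as Make, Model and DateTime.
    fields = {TAGS.get(tag_id, tag_id): value for tag_id, value in exif.items()}
    # Camera settings such as ExposureTime and FNumber live in the Exif sub-IFD.
    for tag_id, value in exif.get_ifd(EXIF_IFD).items():
        fields[TAGS.get(tag_id, tag_id)] = value
    # GPS tags have their own ID space, hence the separate GPSTAGS mapping.
    gps = exif.get_ifd(GPS_IFD)
    if gps:
        fields["GPSInfo"] = {GPSTAGS.get(tag_id, tag_id): value
                             for tag_id, value in gps.items()}
    return fields
```

You could call it as collect_exif(Image.open("image.jpg")) and print the resulting dictionary the same way as before.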

Alright, we're done. A good challenge for you is to download all images from a URL, run this tutorial's script on every image you find, and investigate the interesting results!
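As a starting point for that challenge, here is a rough sketch that covers only the second half: it assumes the images are already saved in a local folder and reads the top-level EXIF tags of every JPEG in it. The scan_folder() helper is a hypothetical name, and the download step is left to you:

```python
from pathlib import Path

from PIL import Image
from PIL.ExifTags import TAGS


def scan_folder(folder):
    """Yield (path, {tag name: value}) for every readable JPEG in a folder."""
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in {".jpg", ".jpeg"}:
            continue
        try:
            # The context manager closes the file as soon as the tags are copied.
            with Image.open(path) as image:
                exif = dict(image.getexif())
        except OSError:
            # Pillow's UnidentifiedImageError is a subclass of OSError, so files
            # that are not valid images despite their extension are skipped here.
            continue
        yield path, {TAGS.get(tag_id, tag_id): value for tag_id, value in exif.items()}
```

Looping over scan_folder("images") and printing fields.get("Make") and fields.get("Model") for each path gives a quick overview of which devices produced the pictures; "images" is just a placeholder folder name.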


Learn also: How to Use Steganography to Hide Secret Data in Images in Python.

Happy Coding ♥
