PDF metadata is useful information about a document: its title, author, creation date, last modification date, subject, and much more. Some PDF files carry more metadata than others. In this tutorial, you will learn how to extract PDF metadata in Python.
There are many libraries and utilities in Python that can accomplish this, but I like using pikepdf, as it is actively maintained. Let's install it with pip install pikepdf.
Pikepdf is a Pythonic wrapper around the C++ QPDF library. Let's import it in our script:
We'll also use the sys module to get the filename from the command-line arguments:
Let's load the PDF file using the library, and get the metadata:
The docinfo attribute contains a dictionary of the document's metadata. Here is an example execution:
Output:
Related: How to Split PDF Files in Python.
Here is another PDF file:
Output:
As you can see, not all documents have the same fields; some contain much less information.
Notice that /ModDate and /CreationDate are the last modification date and the creation date, respectively, in the PDF datetime format. To convert this format into Python datetime objects, I copied this code from StackOverflow and edited it a little to run on Python 3:
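As an illustration of the conversion, here is a standard-library-only sketch. It is not the StackOverflow snippet mentioned above; the function name pdf_date_to_datetime and the regular expression are assumptions written for this example. It assumes the usual PDF date layout, D:YYYYMMDDHHmmSS followed by an optional timezone such as +02'00' or Z.

```python
import re
from datetime import datetime, timedelta, timezone

# PDF date strings look like D:YYYYMMDDHHmmSSOHH'mm', where every field
# after the year is optional and O is one of +, - or Z.
PDF_DATE_PATTERN = re.compile(
    r"(?:D:)?"
    r"(?P<year>\d{4})"
    r"(?P<month>\d{2})?"
    r"(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?"
    r"(?P<minute>\d{2})?"
    r"(?P<second>\d{2})?"
    r"(?P<tz_sign>[+\-Zz])?"
    r"(?P<tz_hour>\d{2})?"
    r"'?(?P<tz_minute>\d{2})?'?"
)


def pdf_date_to_datetime(text):
    """Convert a PDF date string to a datetime, or return None if it is invalid."""
    match = PDF_DATE_PATTERN.fullmatch(text.strip())
    if match is None:
        return None
    parts = match.groupdict()
    try:
        # Missing month/day default to 1; missing time fields default to 0
        naive = datetime(
            int(parts["year"]),
            int(parts["month"] or 1),
            int(parts["day"] or 1),
            int(parts["hour"] or 0),
            int(parts["minute"] or 0),
            int(parts["second"] or 0),
        )
    except ValueError:
        # Out-of-range values such as month 13
        return None
    sign = parts["tz_sign"]
    if sign is None:
        return naive  # the string carries no timezone information
    if sign in "Zz":
        return naive.replace(tzinfo=timezone.utc)
    offset = timedelta(
        hours=int(parts["tz_hour"] or 0),
        minutes=int(parts["tz_minute"] or 0),
    )
    if sign == "-":
        offset = -offset
    return naive.replace(tzinfo=timezone(offset))
```

When printing the metadata, you could pass the string form of the /ModDate and /CreationDate values through this function and fall back to the raw value whenever it returns None.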
Here is the same output as before, but with the dates converted to Python datetime objects:
Much better. I hope this quick tutorial helped you to get the metadata of PDF documents with Python.
Check the complete code here.
Here are some PDF-related tutorials:
For more PDF handling guides in Python, check our Practical Python PDF Processing EBook, where we dive deeper into PDF document manipulation with Python. Make sure to check it out here if you're interested!
Learn also: How to Extract Image Metadata in Python
Happy coding ♥