Encryption is the process of encoding a piece of information so that only authorized parties can access it. It is critically important because it allows you to securely protect data that you don't want anyone to see or access.
In this tutorial, you will learn how to use Python to encrypt files or any byte object (also string objects) using the cryptography library.
We will use symmetric encryption, which means the same key used to encrypt the data is also used to decrypt it. There are many encryption algorithms out there; the library we are going to use is built on top of the AES algorithm.
Encryption has many uses in the real world. In fact, if you're reading this, your browser is most likely connected to this website over an encrypted connection. However, encryption can also be used maliciously, for example to build ransomware.
Note: It is important to understand the difference between encryption and hashing algorithms. With encryption, you can retrieve the original data once you have the key, whereas with hashing functions you cannot; that's why hashing is sometimes loosely called one-way encryption.
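To make the distinction concrete, here is a minimal sketch that puts the two side by side. It assumes the cryptography library is already installed and uses SHA-256 from the standard library purely as an example hash function; the variable names are illustrative.

```python
import hashlib

from cryptography.fernet import Fernet

message = b"some secret message"

# Encryption: anyone holding the key can get the original bytes back
key = Fernet.generate_key()
f = Fernet(key)
token = f.encrypt(message)
recovered = f.decrypt(token)

# Hashing: there is no key and no inverse operation, only a fixed-size digest
digest = hashlib.sha256(message).hexdigest()

print(recovered == message)  # True
print(len(digest))           # 64 hex characters, whatever the input size
```

The digest can be used to check that data has not changed, but it cannot be turned back into the message.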
Let's start off by installing cryptography:
pip3 install cryptography
Open up a new Python file, and let's get started:
from cryptography.fernet import Fernet
Fernet is an implementation of symmetric authenticated cryptography; let's start by generating that key and writing it to a file:
def write_key():
    """
    Generates a key and saves it into a file
    """
    key = Fernet.generate_key()
    with open("key.key", "wb") as key_file:
        key_file.write(key)
The Fernet.generate_key() function generates a fresh Fernet key. You really need to keep this in a safe place: if you lose the key, you will no longer be able to decrypt data that was encrypted with it.
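As a quick illustration of what such a key looks like, the sketch below inspects a freshly generated one. It only relies on the fact that a Fernet key is 32 random bytes encoded as URL-safe Base64; the variable names are illustrative.

```python
import base64

from cryptography.fernet import Fernet

key = Fernet.generate_key()
raw = base64.urlsafe_b64decode(key)

print(type(key))  # <class 'bytes'>
print(len(key))   # 44 Base64 characters
print(len(raw))   # 32 bytes of actual key material

# Every call produces a different random key
another_key = Fernet.generate_key()
print(key != another_key)
```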
Since this key is unique, we won't be generating the key each time we encrypt anything, so we need a function to load that key for us:
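def load_key():
    """
    Loads the key from the file `key.key` in the current directory
    """
    # use a context manager so the file handle is closed after reading
    with open("key.key", "rb") as key_file:
        return key_file.read()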
Now that we know how to generate, save and load the key, let's start by encrypting string objects, just to make you familiar with it first.
Generating and writing the key to a file:
# generate and write a new key
write_key()
Let's load that key:
# load the previously generated key
key = load_key()
message = "some secret message".encode()
Since strings have the type str in Python, we need to encode them and convert them to bytes to be suitable for encryption; the encode() method encodes the string using the UTF-8 codec. Initializing the Fernet class with that key:
# initialize the Fernet class
f = Fernet(key)
Encrypting the message:
# encrypt the message
encrypted = f.encrypt(message)
The f.encrypt() method encrypts the data passed to it. The result of this encryption is known as a "Fernet token" and has strong privacy and authenticity guarantees.
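One property of Fernet tokens worth seeing in action: encrypting the same message twice with the same key does not produce the same token, because each token carries its own random initialization vector. A small sketch with illustrative variable names:

```python
from cryptography.fernet import Fernet

f = Fernet(Fernet.generate_key())
message = b"some secret message"

# Two encryptions of identical plaintext under the same key
token_a = f.encrypt(message)
token_b = f.encrypt(message)

print(token_a != token_b)             # True: the tokens differ
print(f.decrypt(token_a) == message)  # True
print(f.decrypt(token_b) == message)  # True
```

This means an observer cannot tell whether two tokens hide the same content just by comparing them.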
Let's see how it looks:
# print how it looks
print(encrypted)

Decrypting it:

decrypted_encrypted = f.decrypt(encrypted)
print(decrypted_encrypted)
b'some secret message'
That's indeed the same message.
The f.decrypt() method decrypts a Fernet token. It returns the original plaintext when decryption succeeds; otherwise, it raises an exception.
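The exception in question is cryptography.fernet.InvalidToken. The sketch below triggers it in two ways, with the wrong key and with a token that was modified after encryption; the helper name try_decrypt is made up for this example.

```python
from cryptography.fernet import Fernet, InvalidToken


def try_decrypt(fernet, token):
    """Return the plaintext, or None if the token is rejected."""
    try:
        return fernet.decrypt(token)
    except InvalidToken:
        return None


right = Fernet(Fernet.generate_key())
wrong = Fernet(Fernet.generate_key())
token = right.encrypt(b"some secret message")

# Change one character in the middle of the token to simulate tampering
replacement = b"A" if token[50:51] != b"A" else b"B"
tampered = token[:50] + replacement + token[51:]

print(try_decrypt(right, token))     # b'some secret message'
print(try_decrypt(wrong, token))     # None (wrong key)
print(try_decrypt(right, tampered))  # None (token was modified)
```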
Now that you know the basics of encrypting strings, let's dive into file encryption. We need a function that encrypts a file given its name and a key:
def encrypt(filename, key):
    """
    Given a filename (str) and key (bytes), it encrypts the file and writes it
    """
    f = Fernet(key)

After initializing the Fernet object with the given key, let's read the target file first:

    with open(filename, "rb") as file:
        # read all file data
        file_data = file.read()

file_data contains the data of the file, encrypting it:

    # encrypt data
    encrypted_data = f.encrypt(file_data)

Writing the encrypted data back under the same name, so it will overwrite the original (don't use this on sensitive information yet; test it on some junk data first):

    # write the encrypted file
    with open(filename, "wb") as file:
        file.write(encrypted_data)
Okay, that's done. Now for the decryption function; it is the same process, except that we call decrypt() instead of encrypt() on the Fernet object:
def decrypt(filename, key):
    """
    Given a filename (str) and key (bytes), it decrypts the file and writes it
    """
    f = Fernet(key)
    with open(filename, "rb") as file:
        # read the encrypted data
        encrypted_data = file.read()
    # decrypt data
    decrypted_data = f.decrypt(encrypted_data)
    # write the original file
    with open(filename, "wb") as file:
        file.write(decrypted_data)
Let's test this. I have a data.csv file and a key in the current directory, as shown in the following figure:
It is a completely readable file. To encrypt it, all we need to do is call the function we just wrote:
# uncomment this if it's the first time you run the code, to generate the key
# write_key()
# load the key
key = load_key()
# file name
file = "data.csv"
# encrypt it
encrypt(file, key)
Once you execute this, you may see that the file has increased in size and is unreadable; you can't even read a single word!
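The size increase is predictable. The sketch below estimates the token length, assuming the layout described in the Fernet specification: a version byte, an 8-byte timestamp, a 16-byte IV, the AES-CBC ciphertext padded to a multiple of 16 bytes, and a 32-byte HMAC, all Base64-encoded. The helper name expected_token_length is made up for this example.

```python
import math

from cryptography.fernet import Fernet


def expected_token_length(plaintext_length):
    """Estimate the Fernet token size for a plaintext of the given length."""
    # PKCS7 padding always adds between 1 and 16 bytes
    ciphertext = (plaintext_length // 16 + 1) * 16
    raw = 1 + 8 + 16 + ciphertext + 32
    # Base64 turns every 3 bytes into 4 characters, including padding
    return 4 * math.ceil(raw / 3)


f = Fernet(Fernet.generate_key())
sizes = {}
for size in (0, 19, 1000):
    token = f.encrypt(b"x" * size)
    sizes[size] = len(token)
    print(size, len(token), expected_token_length(size))
```

In other words, the encrypted file is roughly a third larger than the original, plus a small fixed overhead.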
To get the file back into its original form, just call the decrypt() function:
# decrypt the file
decrypt(file, key)
That's it! You'll see the original file appear in place of the previously encrypted one.
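If you would rather not experiment on a real file, the whole round trip can be checked in a temporary directory. This is a self-contained sketch: it repeats compact versions of the two functions from this tutorial, and the CSV content is made-up sample data.

```python
import os
import tempfile

from cryptography.fernet import Fernet


def encrypt(filename, key):
    """Encrypt the file in place."""
    f = Fernet(key)
    with open(filename, "rb") as file:
        file_data = file.read()
    with open(filename, "wb") as file:
        file.write(f.encrypt(file_data))


def decrypt(filename, key):
    """Decrypt the file in place."""
    f = Fernet(key)
    with open(filename, "rb") as file:
        encrypted_data = file.read()
    with open(filename, "wb") as file:
        file.write(f.decrypt(encrypted_data))


original = b"id,name\n1,alice\n2,bob\n"  # hypothetical sample data
key = Fernet.generate_key()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv")
    with open(path, "wb") as fh:
        fh.write(original)

    encrypt(path, key)
    with open(path, "rb") as fh:
        encrypted = fh.read()

    decrypt(path, key)
    with open(path, "rb") as fh:
        restored = fh.read()

print(encrypted != original)  # True: the file content changed
print(restored == original)   # True: decryption restored it
```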
Instead of randomly generating a key, what if we could generate the key from a password? To do that, we can use algorithms designed for this purpose.
One of these algorithms is Scrypt, a password-based key derivation function created in 2009 by Colin Percival. We will be using it to generate keys from a password.
If you want to follow along, create a new Python file and import the following:
import cryptography
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives.kdf.scrypt import Scrypt

import secrets
import base64
import getpass
First, key derivation functions need random bits added to the password before it's hashed; these bits are called the salt, which helps strengthen security and protect against dictionary and brute-force attacks. Let's make a function to generate that using the secrets module:
def generate_salt(size=16):
    """Generate the salt used for key derivation,
    `size` is the length of the salt to generate"""
    return secrets.token_bytes(size)
We have a tutorial on generating random data. Make sure to check it out if you're unsure about the above cell.
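To illustrate what the salt does, the sketch below derives keys from the same password with the same and with different salts, using the Scrypt class imported earlier. It assumes a recent version of cryptography (where no backend argument is needed); the helper name derive and the password are made up for this example.

```python
import secrets

from cryptography.hazmat.primitives.kdf.scrypt import Scrypt


def derive(password, salt):
    """Derive 32 bytes from a password and a salt."""
    # A Scrypt instance can only be used once, so build a new one per call
    kdf = Scrypt(salt=salt, length=32, n=2**14, r=8, p=1)
    return kdf.derive(password.encode())


password = "correct horse battery staple"  # example password only
salt_a = secrets.token_bytes(16)
salt_b = secrets.token_bytes(16)

key_a1 = derive(password, salt_a)
key_a2 = derive(password, salt_a)
key_b = derive(password, salt_b)

print(key_a1 == key_a2)  # True: same password and salt give the same key
print(key_a1 == key_b)   # False: a different salt gives a different key
```

This is also why the salt has to be stored: without it, the same password can no longer reproduce the key.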
Next, let's make a function to derive the key from the password and the salt:
def derive_key(salt, password):
    """Derive the key from the `password` using the passed `salt`"""
    kdf = Scrypt(salt=salt, length=32, n=2**14, r=8, p=1)
    return kdf.derive(password.encode())
We initialize the Scrypt algorithm by passing:
salt: The salt generated above.
length: The desired length of the key in bytes (32 in this case).
n: CPU/Memory cost parameter, must be larger than 1 and be a power of 2.
r: Block size parameter.
p: Parallelization parameter.
As mentioned in the documentation, n, r, and p can adjust the computational and memory cost of the Scrypt algorithm. RFC 7914 recommends values of r=8 and p=1, while the original Scrypt paper suggests that n should have a minimum value of 2**14 for interactive logins or 2**20 for more sensitive files; you can check the documentation for more information.
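To get a feel for what these values cost, Scrypt's working memory is commonly approximated as 128 * n * r bytes. The sketch below applies that rule of thumb to the two suggested values of n; treat the numbers as rough estimates rather than exact measurements, and note that the helper name is made up for this example.

```python
def approx_scrypt_memory_mib(n, r=8):
    """Rough working-memory estimate for Scrypt, in MiB (128 * n * r bytes)."""
    return 128 * n * r / (1024 * 1024)


for n in (2**14, 2**20):
    print(f"n={n}: about {approx_scrypt_memory_mib(n):.0f} MiB")
# n=16384: about 16 MiB
# n=1048576: about 1024 MiB
```

So moving from the interactive setting to the one suggested for sensitive files raises the memory requirement from tens of megabytes to about a gigabyte.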
Next, we make a function to load a previously generated salt:
def load_salt():
    # load salt from salt.salt file
    with open("salt.salt", "rb") as salt_file:
        return salt_file.read()
Now that we have the salt generation and key derivation functions, let's make the core function that generates the key from a password:
def generate_key(password, salt_size=16, load_existing_salt=False, save_salt=True):
    """
    Generates a key from a `password` and the salt.
    If `load_existing_salt` is True, it'll load the salt from a file
    in the current directory called "salt.salt".
    Otherwise a new salt is generated, and it is saved to "salt.salt"
    if `save_salt` is True
    """
    if load_existing_salt:
        # load existing salt
        salt = load_salt()
    else:
        # generate a new salt, so `salt` is always defined below
        salt = generate_salt(salt_size)
        if save_salt:
            with open("salt.salt", "wb") as salt_file:
                salt_file.write(salt)
    # generate the key from the salt and the password
    derived_key = derive_key(salt, password)
    # encode it using Base 64 and return it
    return base64.urlsafe_b64encode(derived_key)
The above function accepts the following arguments:
password: The password string to generate the key from.
salt_size: An integer indicating the size of the salt to generate.
load_existing_salt: A boolean indicating whether we load a previously generated salt.
save_salt: A boolean to indicate whether we save the generated salt.
After we load or generate a new salt, we derive the key from the password using our derive_key() function, and finally, return the key as Base64-encoded bytes.
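The Base64 step is there because Fernet only accepts keys in that encoded form. The sketch below shows the difference, using a fixed 32-byte value as a stand-in for the output of the key derivation (it is not a real secret and should not be used as one).

```python
import base64

from cryptography.fernet import Fernet

# Stand-in for the 32 raw bytes a key derivation function would return
raw_key = bytes(range(32))

# Passing the raw bytes directly is rejected
try:
    Fernet(raw_key)
    raw_accepted = True
except ValueError:
    raw_accepted = False

# The URL-safe Base64 encoding of the same bytes is a valid Fernet key
encoded_key = base64.urlsafe_b64encode(raw_key)
f = Fernet(encoded_key)
token = f.encrypt(b"some secret message")

print(raw_accepted)      # False
print(len(encoded_key))  # 44
print(f.decrypt(token))  # b'some secret message'
```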
Now we can use the same encrypt() function we defined earlier:
def encrypt(filename, key):
    """
    Given a filename (str) and key (bytes), it encrypts the file and writes it
    """
    f = Fernet(key)
    with open(filename, "rb") as file:
        # read all file data
        file_data = file.read()
    # encrypt data
    encrypted_data = f.encrypt(file_data)
    # write the encrypted file
    with open(filename, "wb") as file:
        file.write(encrypted_data)
For the decrypt() function, we add a simple try-except block to handle the exception raised when the password is wrong:
def decrypt(filename, key):
    """
    Given a filename (str) and key (bytes), it decrypts the file and writes it
    """
    f = Fernet(key)
    with open(filename, "rb") as file:
        # read the encrypted data
        encrypted_data = file.read()
    # decrypt data
    try:
        decrypted_data = f.decrypt(encrypted_data)
    except cryptography.fernet.InvalidToken:
        print("Invalid token, most likely the password is incorrect")
        return
    # write the original file
    with open(filename, "wb") as file:
        file.write(decrypted_data)
    print("File decrypted successfully")
Awesome! Let's use argparse so we can pass arguments from the command line:
if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(description="File Encryptor Script with a Password")
    parser.add_argument("file", help="File to encrypt/decrypt")
    parser.add_argument("-s", "--salt-size", type=int,
                        help="If this is set, a new salt with the passed size is generated")
    parser.add_argument("-e", "--encrypt", action="store_true",
                        help="Whether to encrypt the file, only -e or -d can be specified.")
    parser.add_argument("-d", "--decrypt", action="store_true",
                        help="Whether to decrypt the file, only -e or -d can be specified.")

    args = parser.parse_args()
    file = args.file

    # exactly one of -e / -d must be given; check this before asking for a password
    if args.encrypt == args.decrypt:
        raise TypeError("Please specify whether you want to encrypt the file or decrypt it.")

    if args.encrypt:
        password = getpass.getpass("Enter the password for encryption: ")
    else:
        password = getpass.getpass("Enter the password you used for encryption: ")

    if args.salt_size:
        # generate a new salt of the requested size and save it
        key = generate_key(password, salt_size=args.salt_size, save_salt=True)
    else:
        # reuse the salt stored in salt.salt
        key = generate_key(password, load_existing_salt=True)

    if args.encrypt:
        encrypt(file, key)
    else:
        decrypt(file, key)
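The script above checks the -e and -d flags by hand. As an alternative, argparse can enforce the "exactly one of the two" rule itself through a mutually exclusive group. This is only a sketch of that option, not the approach used in the script; the build_parser helper is made up for this example, and the arguments are passed as a list so the snippet runs without a command line.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="File Encryptor Script with a Password")
    parser.add_argument("file", help="File to encrypt/decrypt")
    parser.add_argument("-s", "--salt-size", type=int,
                        help="If this is set, a new salt with the passed size is generated")
    # required=True: one of the two flags must be present, and never both
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-e", "--encrypt", action="store_true", help="Encrypt the file")
    mode.add_argument("-d", "--decrypt", action="store_true", help="Decrypt the file")
    return parser


args = build_parser().parse_args(["data.csv", "--encrypt", "--salt-size", "16"])
print(args.file, args.encrypt, args.decrypt, args.salt_size)
# data.csv True False 16
```

With this setup, passing both flags or neither makes argparse print a usage error and exit before any password prompt appears.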
Let's test our script by encrypting data.csv as before:
$ python crypt_password.py data.csv --encrypt --salt-size 16
Enter the password for encryption:
You'll be prompted to enter a password; getpass.getpass() hides the characters you type, so it's more secure. You'll also notice that the salt.salt file is generated.
If you open the target data.csv file, you'll see it's encrypted. Now let's try to decrypt it with the wrong password:
$ python crypt_password.py data.csv --decrypt
Enter the password you used for encryption:
Invalid token, most likely the password is incorrect
data.csv remains as is. Let's pass the correct password that was used in the encryption:
$ python crypt_password.py data.csv --decrypt
Enter the password you used for encryption:
File decrypted successfully
Amazing! You'll see that data.csv has returned to its original form.
Note that if you generate another salt (by passing --salt-size) while decrypting, you won't be able to recover the file even with the correct password, because the new salt overwrites the previous one. So make sure not to pass --salt-size when decrypting.
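One way to avoid this pitfall altogether is to store the salt next to the data it protects instead of in a separate salt.salt file. The sketch below is an alternative design, not what the script in this tutorial does: it prepends the salt to the Fernet token. The function names and the 16-byte salt size are assumptions made for this example.

```python
import base64
import secrets

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives.kdf.scrypt import Scrypt

SALT_SIZE = 16  # assumed fixed salt length for this sketch


def key_from_password(password, salt):
    """Derive a Fernet-compatible key from a password and a salt."""
    kdf = Scrypt(salt=salt, length=32, n=2**14, r=8, p=1)
    return base64.urlsafe_b64encode(kdf.derive(password.encode()))


def encrypt_bytes(data, password):
    """Return salt + token, so the salt always travels with the data."""
    salt = secrets.token_bytes(SALT_SIZE)
    token = Fernet(key_from_password(password, salt)).encrypt(data)
    return salt + token


def decrypt_bytes(blob, password):
    """Split off the salt, re-derive the key and decrypt the token."""
    salt, token = blob[:SALT_SIZE], blob[SALT_SIZE:]
    return Fernet(key_from_password(password, salt)).decrypt(token)


blob = encrypt_bytes(b"some secret message", "my password")
print(decrypt_bytes(blob, "my password"))  # b'some secret message'
```

The salt does not need to be kept secret, so storing it in the clear in front of the token does not weaken the scheme. A wrong password still raises InvalidToken, as before.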
Check cryptography's official documentation for further details and instructions.
Note that you need to be careful with large files, as the whole file has to be loaded into memory to be encrypted. For large files, consider splitting the data into chunks or compressing the file first!
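As a rough illustration of the splitting idea, the sketch below encrypts a stream chunk by chunk and writes one Fernet token per line (tokens are Base64, so they never contain a newline). The function names, the 64 KiB chunk size, and the one-token-per-line format are all assumptions made for this example. Keep in mind that each chunk is authenticated on its own, so this simple format does not detect chunks that were removed or reordered.

```python
import io

from cryptography.fernet import Fernet

CHUNK_SIZE = 64 * 1024  # assumed chunk size for this sketch


def encrypt_stream(src, dst, key, chunk_size=CHUNK_SIZE):
    """Read `src` in chunks and write one Fernet token per line to `dst`."""
    f = Fernet(key)
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(f.encrypt(chunk) + b"\n")


def decrypt_stream(src, dst, key):
    """Decrypt a stream produced by encrypt_stream, one token per line."""
    f = Fernet(key)
    for line in src:
        dst.write(f.decrypt(line.rstrip(b"\n")))


key = Fernet.generate_key()
data = b"x" * 200_000  # made-up payload standing in for a large file

encrypted = io.BytesIO()
encrypt_stream(io.BytesIO(data), encrypted, key)

restored = io.BytesIO()
decrypt_stream(io.BytesIO(encrypted.getvalue()), restored, key)

print(restored.getvalue() == data)  # True
```

The same two functions work with real files opened in "rb" and "wb" mode, so only one chunk has to be held in memory at a time.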
Happy Coding ♥