In this tutorial, you will learn how to hide data in images with Python using the OpenCV and NumPy libraries. This technique is called steganography.
Steganography is the practice of hiding a file, message, image, or video within another file, message, image, or video. The word Steganography is derived from the Greek words "steganos" (meaning hidden or covered) and "graphe" (meaning writing).
Hackers often use it to hide secret messages or data within media files such as images, videos, or audio files. Even though there are many legitimate uses for Steganography, such as watermarking, malware programmers have also been found to use it to obscure the transmission of malicious code.
In this tutorial, we will write Python code to hide text messages using the Least Significant Bit technique.
Least Significant Bit (LSB) is a technique in which the last bit of each pixel value is replaced with a data bit. This method only works on images stored in a lossless-compression format, meaning the compression does not lose or modify any pixel data. A lossy format such as JPEG alters pixel values when saving, which would destroy the hidden bits. PNG, TIFF, and BMP are examples of lossless image file formats.
As you may already know, an image consists of several pixels, each containing three values (Red, Green, and Blue); these values range from 0 to 255. In other words, they are 8-bit values. For example, a value of 225 is 11100001 in binary, and so on.
To simplify the process, let's take an example of how this technique works; say I want to hide the message "hi" in a 4x3 image. Here are the example image pixel values:
By looking at the ASCII Table, we can convert the "hi" message into decimal values and then into binary:
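The conversion can be checked in a few lines of Python. This is only an illustrative sketch; the helper name `text_to_bits` is made up for this example and is not part of the tutorial's code.

```python
def text_to_bits(message):
    """Return the decimal ASCII codes and their 8-bit binary strings."""
    codes = [ord(ch) for ch in message]             # 'h' -> 104, 'i' -> 105
    bits = [format(code, "08b") for code in codes]  # zero-padded to 8 bits
    return codes, bits


codes, bits = text_to_bits("hi")
print(codes)  # [104, 105]
print(bits)   # ['01101000', '01101001']
```

Joined together, the message is the 16-bit sequence 0110100001101001, which is why the first data bit is 0 and the second is 1.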
Now, we iterate over the pixel values one by one; after converting each to binary, we replace its least significant bit with the next message bit. For example, 225 is 11100001; we replace its last (rightmost) bit, 1, with the first data bit, 0, which gives 11100000, that is, 224.
After that, we go to the next value, which is 00001100, and replace the last bit with the following data bit (1), and so on until the data is completely encoded.
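The replacement step can be sketched with a bitwise mask. In the snippet below, only the first two values (225 and 12) come from the example; the remaining values are hypothetical placeholders chosen for illustration, and the helper name `set_lsb` is an assumption.

```python
def set_lsb(value, bit):
    """Replace the least significant bit of an 8-bit value with bit (0 or 1)."""
    return (value & 0b11111110) | bit


print(set_lsb(225, 0))  # 224
print(set_lsb(12, 1))   # 13

# Hypothetical pixel values: only 225 and 12 are taken from the text.
pixel_values = [225, 12, 99, 155, 1, 15, 225, 101, 18, 200, 45, 73, 254, 3, 60, 128]
message_bits = "0110100001101001"  # "hi" as 16 bits

# One message bit goes into each value, in order.
encoded = [set_lsb(v, int(b)) for v, b in zip(pixel_values, message_bits)]

# Reading the last bit of each encoded value gives the message back.
recovered = "".join(str(v & 1) for v in encoded)
print(recovered == message_bits)  # True
```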
This changes each pixel value by at most 1, which is not visually noticeable. You can also use the two least significant bits, which changes each pixel value by at most 3 (anywhere from -3 to +3).
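These bounds can be verified by brute force over every 8-bit value. The sketch below is illustrative only; `replace_low_bits` and `max_change` are hypothetical helper names.

```python
def replace_low_bits(value, data, n_bits):
    """Overwrite the n_bits lowest bits of an 8-bit value with data."""
    mask = (1 << n_bits) - 1
    return (value & ~mask & 0xFF) | (data & mask)


def max_change(n_bits):
    """Largest absolute difference any replacement can cause."""
    return max(
        abs(replace_low_bits(value, data, n_bits) - value)
        for value in range(256)
        for data in range(1 << n_bits)
    )


for n in (1, 2, 6):
    print(n, max_change(n))  # 1 1, then 2 3, then 6 63
```

In general, replacing the n lowest bits shifts a value by at most 2^n - 1.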
Here are the resulting pixel values (you can check them on your own):
You can also use the three or four least significant bits when the data you want to hide is a little bigger and won't fit your image if you use only the least significant bit. In the upcoming sections, we will add an option to use any number of bits you want.
Now that we understand the technique we are going to use, let's dive into the Python implementation. We are going to use OpenCV to manipulate the image, but you can use any other imaging library you want (such as PIL):
Open up a new Python file and follow along:
Let's start by implementing a function to convert any type of data into binary, and we will use this to convert the secret data and pixel values to binary in the encoding and decoding phases:
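A minimal sketch of such a conversion function follows. It is one possible shape, not necessarily the original implementation: it avoids a NumPy dependency by accepting lists and tuples for pixel values, and it assumes text characters have code points below 256.

```python
from numbers import Integral


def to_bin(data):
    """Convert text, bytes, a single integer, or a sequence of integers to binary."""
    if isinstance(data, str):
        # Assumes code points below 256 so that each character fits in 8 bits.
        return "".join(format(ord(ch), "08b") for ch in data)
    if isinstance(data, (bytes, bytearray)):
        return "".join(format(byte, "08b") for byte in data)
    if isinstance(data, Integral):
        return format(int(data), "08b")
    if isinstance(data, (list, tuple)):
        # For example one pixel, given as a sequence of channel values.
        return [format(int(value), "08b") for value in data]
    raise TypeError("Type not supported.")


print(to_bin("hi"))           # 0110100001101001
print(to_bin(225))            # 11100001
print(to_bin((225, 12, 99)))  # ['11100001', '00001100', '01100011']
```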
The below function will be responsible for hiding text data inside images:
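A minimal sketch of how such a function might look is shown below. Several things here are assumptions: the `=====` stop marker, the helper name `hide_text`, and the choice to run the core logic on a flat list of values so it can be tried without an image. The `encode()` wrapper imports OpenCV and NumPy lazily and uses only `cv2.imread()` and `cv2.imwrite()`.

```python
STOP_MARKER = "====="  # hypothetical stopping criterion; any agreed marker works


def hide_text(values, secret_data):
    """Return a copy of values (flat list of 0-255 ints) with the text in the LSBs."""
    max_bytes = len(values) // 8  # one bit per value, eight values per byte
    payload = secret_data + STOP_MARKER
    if len(payload) > max_bytes:
        raise ValueError("Insufficient bytes, need a bigger image or less data.")
    bits = "".join(format(ord(ch), "08b") for ch in payload)
    encoded = list(values)
    for index, bit in enumerate(bits):
        # Clear the last bit, then set it to the next data bit.
        encoded[index] = (encoded[index] & 0b11111110) | int(bit)
    return encoded


def encode(image_name, secret_data, output_name):
    """Read an image, hide the text in it, and write the result to output_name."""
    import cv2          # imported here so hide_text() works without OpenCV
    import numpy as np

    image = cv2.imread(image_name)
    if image is None:   # cv2.imread() returns None when the file cannot be read
        raise FileNotFoundError(image_name)
    flat = image.reshape(-1).tolist()
    encoded = hide_text(flat, secret_data)
    result = np.array(encoded, dtype=np.uint8).reshape(image.shape)
    cv2.imwrite(output_name, result)
```

Keeping the bit manipulation separate from the file handling makes the core logic easy to check on a small list before running it on a real image.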
Here is what the encode() function does:
Reads the image using the cv2.imread() function.
Now here is the decoder function:
We read the image and then get the last bits of every image pixel. After that, we keep decoding until we see the stopping criteria we used during encoding.
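The decoding loop described above can be sketched as follows. As before, this is an illustration under assumptions: the `=====` stop marker and the name `reveal_text` are hypothetical, and the function takes a flat list of pixel values (with OpenCV, something like `cv2.imread(image_name).reshape(-1).tolist()` would produce one).

```python
STOP_MARKER = "====="  # must match whatever marker was appended when encoding


def reveal_text(values):
    """Collect the last bit of each value and decode until the stop marker."""
    chars = []
    current = []
    for value in values:
        current.append(str(value & 1))  # keep only the least significant bit
        if len(current) == 8:           # eight bits make one character
            chars.append(chr(int("".join(current), 2)))
            current = []
            if "".join(chars[-len(STOP_MARKER):]) == STOP_MARKER:
                return "".join(chars[:-len(STOP_MARKER)])
    raise ValueError("Stop marker not found; the image may hold no hidden data.")
```

Stopping at the marker avoids decoding the rest of the image, which contains only unrelated pixel bits.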
Let's use these functions:
I have an example PNG image here; use whatever picture you want. Just make sure it is a Lossless-compression image format such as PNG, as discussed earlier.
The above code will take the image.PNG image, encode the secret_data string into it, and save the result as encoded_image.PNG. After that, we use the decode() function, which loads the new image and decodes the hidden message in it.
After the execution of the script, it will write another file, "encoded_image.PNG", which looks precisely the same as the original but has the secret data encoded in it. Here is the output:
So we can encode about 122KB (125028 bytes) in this particular image. This will vary from one image to another based on its resolution.
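The capacity figure follows from simple arithmetic: each pixel has three values, each value carries a fixed number of hidden bits, and eight bits make a byte. The sketch below assumes that formula. The 552 by 604 dimensions are hypothetical, chosen only because they reproduce the 125028-byte figure; the actual image size is not stated.

```python
def max_bytes(height, width, channels=3, n_bits=1):
    """Number of whole bytes that fit when n_bits are hidden in every value."""
    return height * width * channels * n_bits // 8


print(max_bytes(552, 604))            # 125028
print(max_bytes(552, 604, n_bits=2))  # 250056
print(max_bytes(552, 604, n_bits=3))  # 375084
```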
In this section, we will make another script that is more advanced than the previous one, which has the following additional features:
To get started, we import the necessary libraries and the to_bin() function as before:
Now let's make the new encode() function:
This time, secret_data can be a str (hiding text) or bytes (hiding any binary data).
Besides that, we wrap the encoding with another for loop iterating n_bits times. The default n_bits parameter is set to 2, meaning we encode the data in the two least significant bits of each pixel value; we will pass command-line arguments to this parameter. It can be as low as 1 (which won't encode much data) or as high as 6, but at high values the resulting image will look different and a bit noisy.
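One way to picture the extra loop is sketched below. This is a hedged illustration, not the tutorial's actual function: the name `hide_data`, the `=====` stop marker, the flat list of values, and the bit ordering (fill the lowest bit of every value first, then the next bit) are all assumptions.

```python
STOP_MARKER = b"====="  # hypothetical stopping criterion


def hide_data(values, secret_data, n_bits=2):
    """Hide str or bytes in the n_bits lowest bits of a flat list of 0-255 ints."""
    if isinstance(secret_data, str):
        secret_data = secret_data.encode()
    payload = bytes(secret_data) + STOP_MARKER
    capacity = len(values) * n_bits // 8
    if len(payload) > capacity:
        raise ValueError(
            f"Insufficient bytes ({len(payload)}), maximum is {capacity}; "
            "use a bigger image or more bits."
        )
    bits = "".join(format(byte, "08b") for byte in payload)
    encoded = list(values)
    data_index = 0
    for position in range(n_bits):     # outer loop: one pass per bit position
        for i in range(len(encoded)):  # inner loop: every pixel value
            if data_index >= len(bits):
                return encoded         # all data written, stop early
            bit = int(bits[data_index])
            # Clear the target bit, then write the data bit into it.
            encoded[i] = (encoded[i] & ~(1 << position) & 0xFF) | (bit << position)
            data_index += 1
    return encoded
```

With this ordering, small payloads only ever touch the lowest bit, so the higher bits are used only when the data does not fit otherwise.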
For the decoding part, it's the same as before, but we add the in_bytes boolean parameter to indicate whether the hidden data is binary. If so, we use bytearray() instead of a regular string to construct our decoded data:
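A possible sketch of such a decoder is given below. It makes the same assumptions as an encoder that fills the lowest bit of every value first and ends the payload with a `=====` marker; the name `reveal_data` is hypothetical. As a simplification, it always collects bytes in a bytearray() and only converts to text at the end when in_bytes is False.

```python
STOP_MARKER = b"====="  # must match the marker used when encoding


def reveal_data(values, n_bits=2, in_bytes=False):
    """Read the n_bits lowest bits of each value until the stop marker appears."""
    decoded = bytearray()
    current = 0
    count = 0
    for position in range(n_bits):  # same bit order as the encoder
        for value in values:
            current = (current << 1) | ((value >> position) & 1)
            count += 1
            if count == 8:          # a full byte has been collected
                decoded.append(current)
                current, count = 0, 0
                if decoded.endswith(STOP_MARKER):
                    data = bytes(decoded[:-len(STOP_MARKER)])
                    return data if in_bytes else data.decode()
    raise ValueError("Stop marker not found; check the image and the number of bits.")
```

If the number of bits differs from the one used for encoding, the marker is usually never found, so the function raises instead of returning garbage.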
Next, we use the argparse module to parse command-line arguments to pass to the encode() and decode() functions:
Here we added five arguments to pass:
-t or --text: If we want to encode text into an image, this is the parameter we pass to do so.
-f or --file: If we want to encode files instead of text, we pass this argument along with the file path.
-e or --encode: The image we want to hide our data in.
-d or --decode: The image we want to extract data from.
-b or --n-bits: The number of least significant bits to use. If you have larger data, make sure to increase this parameter. I do not suggest going higher than 4, as the image will look visibly distorted, making it apparent that something is wrong with it.
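The five arguments could be declared with argparse roughly as follows. The flag names come from the list above; the description, help strings, and the default of 2 for -b (mirroring the default n_bits of the encoding function) are assumptions, as is the `build_parser` helper name.

```python
import argparse


def build_parser():
    """Create a parser with the five options described above."""
    parser = argparse.ArgumentParser(description="Steganography encoder/decoder")
    parser.add_argument("-t", "--text", help="Text to hide in the image")
    parser.add_argument("-f", "--file",
                        help="File to hide when encoding, or output filename when decoding")
    parser.add_argument("-e", "--encode", help="Image to hide the data in")
    parser.add_argument("-d", "--decode", help="Image to extract the data from")
    parser.add_argument("-b", "--n-bits", type=int, default=2,
                        help="Number of least significant bits to use")
    return parser


# Passing an explicit list parses it instead of the real command line.
args = build_parser().parse_args(["-e", "image.PNG", "-f", "data.csv", "-b", "1"])
print(args.encode, args.file, args.n_bits)  # image.PNG data.csv 1
```

Note that argparse exposes `--n-bits` as `args.n_bits`, replacing the hyphen with an underscore.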
Let's run our code. Now I have the same image (image.PNG) as before:
Let's try to hide the data.csv file into it:
We pass the image using the -e parameter, and the file we want to hide using the -f parameter. I also specified the number of least significant bits to be one. Unfortunately, see the output:
This error is totally expected since using only one bit on each pixel value won't be sufficient to hide the entire 363KB file. Therefore, let's increase the number of bits (-b parameter):
Two bits are still not enough. The maximum we can encode is about 250KB, and we need around 370KB. Increasing to 3:
You'll see now that data.csv is successfully encoded into a new image_encoded.PNG file, which appears in the current directory:
Let's extract the data from the image_encoded.PNG now:
Amazing! This time I have passed the encoded image to the -d parameter. I have also passed data_decoded.csv to -f for the resulting filename to write. Let's recheck our directory:
As you can see, the new file appeared, and it is identical to the original. Note that you must set the same -b parameter when encoding and decoding.
I emphasize that you should only increase the -b parameter when necessary (i.e., when the data is big). I have tried to hide a larger file (over 700KB) into the same image, and the minimum number of least significant bits that worked was 6. Here's what the resulting encoded image looks like:
So there is clearly something wrong with the image: with 6 bits, each pixel value can change by up to 63 in either direction (2^6 - 1), which is a lot.
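Before reaching for a high -b value, the minimum number of bits a payload needs can be estimated ahead of time. The sketch below is only an approximation: it ignores any stop marker appended to the data, treats 1KB as 1024 bytes, and derives the number of pixel values (1,000,224) from the 125028-byte one-bit capacity quoted earlier. The helper name `min_bits_needed` is hypothetical.

```python
def min_bits_needed(data_bytes, n_values):
    """Smallest number of bits per value that can hold data_bytes bytes."""
    total_bits = data_bytes * 8
    return (total_bits + n_values - 1) // n_values  # ceiling division


n_values = 125028 * 8  # values in the example image, from its 1-bit capacity

print(min_bits_needed(363 * 1024, n_values))  # 3
print(min_bits_needed(700 * 1024, n_values))  # 6
```

If the estimate comes out above 4, a higher-resolution cover image is the better fix.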
Awesome! You just learned how you can implement Steganography in Python on your own!
As you may notice, the resulting image will look exactly the same as the original only when the number of least significant bits (the -b parameter) is low, such as one or two. In that case, a person looking at the image won't be able to tell whether there is hidden data within it.
If the data you want to hide is big, use a higher-resolution image instead of increasing the -b parameter above 4, because it would become evident that something is wrong with the picture.
If you're familiar with Linux commands, you can also perform steganography using standard Linux commands.
Here are some ideas and challenges you can do:
Happy Coding ♥