Object detection algorithms, especially those employing sliding window techniques, often generate multiple candidate bounding boxes for the same object. These overlapping detections can clutter the final output, making it difficult to discern individual objects and impacting the overall performance of the detection system.
This is where Non-Maximum Suppression (NMS) comes in. NMS is a critical post-processing step in object detection that addresses a common challenge: multiple detections of the same object. This technique is essential for refining the detection results, ensuring each detected object is represented by a single, most accurate bounding box.
NMS improves the accuracy and performance of object detection systems by eliminating multiple overlapping detections for the same object.
An example of how NMS removes overlapping detections and keeps only the detection with the highest confidence score.
So, you might be wondering how NMS works. Well, the process of NMS can be broken down into the following steps:
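As a rough illustration, here is a minimal plain-Python sketch of the commonly used greedy form of NMS. The function names (iou, greedy_nms), the (x, y, width, height) box format, the sample boxes, and the step numbering in the comments are all assumptions made for this example, not code from the tutorial itself.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping region (zero if the boxes are disjoint)
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold):
    """Return the indices of the boxes kept by greedy NMS."""
    # Step 1: order the candidates by confidence score, highest first
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        # Step 2: keep the most confident remaining box
        best = order.pop(0)
        keep.append(best)
        # Step 3: drop every remaining box whose IoU with it exceeds the threshold
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
        # Step 4: repeat until no candidates are left
    return keep


# Hypothetical detections: two near-duplicates and one separate box
boxes = [(100, 100, 200, 150), (105, 105, 200, 150), (400, 300, 100, 80)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores, iou_threshold=0.5)
print(kept)  # → [0, 2]
```

The second box overlaps the first almost completely, so it is suppressed, while the third box does not overlap anything and survives.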
You may have noticed the emphasis on Intersection-over-Union (IoU) in the third step of the NMS process. This highlights its importance and sets the stage for a deeper dive into IoU and its significance in computer vision.
Intersection-over-Union (IoU) is a critical metric in computer vision, particularly within the realm of object detection. It measures how much two bounding boxes overlap with each other, offering a measure of the accuracy of an object detector in predicting the location of objects.
The IoU between two bounding boxes is calculated by dividing the area of their intersection by the area of their union:
IoU = Area of Intersection / Area of Union
This metric ranges from 0 to 1, where 0 indicates no overlap and 1 signifies perfect overlap.
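To make the calculation concrete, here is a small sketch of an IoU function in plain Python. It assumes boxes are given as (x, y, width, height), the same layout that cv2.dnn.NMSBoxes() accepts; the function name and the sample boxes are made up for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Overlap along each axis; clamped to zero when the boxes do not touch
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    # Union = sum of both areas minus the part counted twice
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


same = iou((0, 0, 10, 10), (0, 0, 10, 10))      # identical boxes
apart = iou((0, 0, 10, 10), (50, 50, 10, 10))   # disjoint boxes
partial = iou((0, 0, 10, 10), (5, 0, 10, 10))   # half of each box overlaps

print(same, apart, round(partial, 3))  # → 1.0 0.0 0.333
```

In the last case the intersection is 50 and the union is 150, which gives an IoU of one third.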
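IoU is not only crucial for NMS but also (as discussed above) serves as a standard for evaluating object detection models. So it serves two primary functions: in NMS, it decides whether two boxes overlap enough to be treated as detections of the same object, and in evaluation, it measures how closely a predicted bounding box matches the ground-truth box.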
Below is an illustration of how Non-Maximum Suppression applies the IoU metric in different overlapping scenarios. These examples highlight when NMS would retain both boxes or choose to remove one, based on the IoU value:
IoU examples for NMS: low overlap (IoU = 0.2), high overlap (IoU = 0.8), and no overlap (IoU = 0), indicating when bounding boxes are kept or removed.
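The image above illustrates three scenarios (suppose our IoU threshold is set to 0.5 and the green bounding box is the bounding box with the highest confidence score):
Low overlap (IoU = 0.2): the IoU is below the threshold, so both boxes are kept.
High overlap (IoU = 0.8): the IoU exceeds the threshold, so the box with the lower confidence score is removed and only the green box remains.
No overlap (IoU = 0): the boxes do not intersect, so both boxes are kept.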
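The same decisions can be reproduced numerically. The sketch below uses made-up boxes in (x, y, width, height) format, chosen so that their IoU with the highest-scoring box comes out at 0.2, 0.8, and 0; the coordinates and variable names are assumptions for illustration only.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


IOU_THRESHOLD = 0.5
best_box = (0, 0, 90, 50)  # stands in for the highest-scoring (green) box

# The second box of each scenario, shifted to produce the desired overlap
scenarios = {
    "low overlap": (60, 0, 90, 50),
    "high overlap": (10, 0, 90, 50),
    "no overlap": (200, 0, 90, 50),
}

decisions = {}
for name, other_box in scenarios.items():
    overlap = iou(best_box, other_box)
    # The lower-scoring box is removed only when the IoU exceeds the threshold
    decisions[name] = "removed" if overlap > IOU_THRESHOLD else "kept"
    print(f"{name}: IoU = {overlap:.1f} -> other box {decisions[name]}")
```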
Related: Real-time Object Tracking with OpenCV and YOLOv8 in Python.
To follow this tutorial, you'll need to install the OpenCV and NumPy libraries. To do so, run the following command in your terminal:
pip install opencv-python numpy
Now that we know what NMS is and how it works, it's time to put theory into practice. In this section, we'll get our hands dirty and implement Non-Maximum Suppression using OpenCV.
Create a new file called nms.py and copy the code below:
First, we import the necessary libraries, then load an image from the disk and copy the original image to draw the bounding boxes on it after applying NMS.
Next, we define a set of bounding boxes. These are intended to mimic the potential outputs of an object detection model. You can check this tutorial, where we actually applied NMS to object detection.
Alongside these boxes, we define their corresponding confidence_scores, quantifying the model's confidence that each bounding box accurately identifies an object.
Finally, we define the threshold variable, which is used to define the IoU threshold for NMS. This threshold determines the minimum level of overlap (measured by the IoU) at which two boxes are considered for suppression. Essentially, if the IoU of two boxes exceeds this threshold, the one with the lower confidence score will be suppressed.
Next, let's draw the bounding boxes on the image:
Overlapping bounding boxes before Non-Maximum Suppression.
The image shows the manually drawn bounding boxes (which simulate what an object detection model might produce before the NMS process).
The next step is to apply NMS to filter out the overlapping boxes and retain the one with the highest confidence score. For this, we'll use the cv2.dnn.NMSBoxes() function provided by OpenCV:
The cv2.dnn.NMSBoxes() function takes 4 parameters:
bboxes: This parameter contains the coordinates of each bounding box that potentially encloses an object.
scores: For each bounding box provided in the bboxes parameter, there is a corresponding confidence score in the scores parameter.
score_threshold: This parameter filters out detections based on their confidence scores before applying NMS. Only detections with a confidence score higher than the score_threshold are considered for NMS.
nms_threshold: This is the threshold for the Intersection over Union (IoU) metric. If the IoU between two boxes is higher than the nms_threshold, the box with the lower confidence score is discarded.
The score_threshold and nms_threshold parameters play distinct roles in the NMS process, and their differences can sometimes be confusing:
score_threshold: This filter is applied first. Detections with a confidence score below the score_threshold are immediately filtered out and not processed further.
nms_threshold: This filter is applied to the detections that remain. Among boxes that overlap each other (IoU higher than the nms_threshold), only the box with the highest score is kept. The rest are suppressed.
The function then returns the indices of the boxes to keep. We then use these indices to select the boxes that form the final list of detections after applying NMS. Finally, we draw the filtered boxes on the image, show the image, and print the filtered boxes on the terminal.
Final bounding box after Non-Maximum Suppression.
As you can see, the NMS process removed the overlapping bounding boxes, and now we have a single, well-defined box (the one with the highest confidence score) around the car.
If you check the terminal, you can see that the filtered box is printed as follows:
The implementation using OpenCV's cv2.dnn.NMSBoxes() function is straightforward. After feeding it our list of bounding boxes and their corresponding confidence scores, along with the thresholds, it returns the indices of the boxes that have made the cut (in our case, we have only one box).
In this tutorial, we've taken a closer look at Non-Maximum Suppression (NMS), a key technique that helps clean up our object detection results. By walking through the implementation with OpenCV and Python, I've shown you how NMS eliminates redundant bounding boxes, leaving us with a single box, the one with the highest confidence score, for our target object.
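Choosing the right score_threshold and nms_threshold is important. It's about striking a balance: a score_threshold that is too low lets weak, likely false detections through, while one that is too high discards true positives; an nms_threshold that is too high leaves duplicate boxes in place, while one that is too low can suppress boxes that belong to distinct, neighboring objects. Again, I recommend you check this tutorial if you want to apply NMS to object detection.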
The code for this tutorial is available here. Hope you enjoyed the article.
Happy coding ♥