Since you are here, I suppose that you have gone through Part 1 and Part 2 of this series and, hopefully, found them useful. As promised in Part 2, here is a new article to help you prepare for your next Computer Vision/Image Processing coding interview.
The two main sections of this article are:
1. Implementing a Gaussian blur that supports grayscale and color images.
2. Finding the closest face to the camera.
The Gaussian blur filter is a smoothing filter. The smoothing effect comes from convolving the image with a kernel sampled from the Gaussian function.
Smoothing filters are usually used to reduce noise in an image. The Gaussian blur filter's width and height must each be positive and odd, but they are not required to match.
For simplicity, our implementation of the Gaussian blur filter in this tutorial will deal with:
- [0–255] values for the pixels.
- Supporting both 1-channel and 3-channel images.
- Using the raw data pointer provided by cv::Mat.
- A square filter (equal width and height).
- Boundary pixels keep their original values (challenge yourself and try to expand the image and apply the filter on boundary pixels as well).
Gaussian Blur Kernel
As mentioned in the previous section, the Gaussian filter works by convolving a kernel with the image. The kernel's width and height can differ, but both must be positive and odd. Our implementation supports a square kernel only (challenge yourself and support different widths and heights. Easy, right? ^_^ ). Looking back at the Gaussian function, you can see that, in addition to the filter size, we need sigma as a parameter, which is the standard deviation of the Gaussian. The standard deviation controls the strength of the smoothing: a larger sigma spreads the weights further from the center and blurs more.
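For reference, the two-dimensional Gaussian function in its standard form is shown below, with x and y measured from the center of the kernel. The constant in front only scales the weights, so it cancels out if the kernel is normalized to sum to 1.

```latex
G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)
```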
Therefore, our function that creates the kernel takes two parameters: kernel_size and sigma. If sigma is zero or negative, it is computed from the following equation (source: OpenCV): sigma = 0.3*((kernel_size-1)*0.5 - 1) + 0.8
The code is self-explanatory. We use a vector to store the kernel values, loop over the filter dimensions, and evaluate the Gaussian function at each position. The final output is a vector of length kernel_size*kernel_size.
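A minimal sketch of such a kernel function, using only the standard library, might look like the following. The name make_gaussian_kernel and its exact signature are assumptions for illustration, and normalizing the weights so that they sum to 1 is a choice made here to keep the image brightness unchanged.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Builds a normalized kernel_size x kernel_size Gaussian kernel, stored row by row.
// Assumption: the name and signature are illustrative, not the article's original code.
std::vector<double> make_gaussian_kernel(int kernel_size, double sigma) {
    assert(kernel_size > 0 && kernel_size % 2 == 1);  // size must be positive and odd

    // Fall back to the OpenCV rule when sigma is zero or negative.
    if (sigma <= 0.0) {
        sigma = 0.3 * ((kernel_size - 1) * 0.5 - 1.0) + 0.8;
    }

    const int half = kernel_size / 2;
    std::vector<double> kernel(static_cast<std::size_t>(kernel_size) * kernel_size);
    double sum = 0.0;

    // Evaluate the Gaussian at every offset (x, y) from the kernel center.
    for (int y = -half; y <= half; ++y) {
        for (int x = -half; x <= half; ++x) {
            const double value = std::exp(-(x * x + y * y) / (2.0 * sigma * sigma));
            kernel[(y + half) * kernel_size + (x + half)] = value;
            sum += value;
        }
    }

    // Normalize so the weights sum to 1; the Gaussian's constant factor cancels here.
    for (double& weight : kernel) {
        weight /= sum;
    }
    return kernel;
}
```

Calling make_gaussian_kernel(3, 0.0) uses the fallback rule, which gives sigma = 0.8 for a 3x3 kernel.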
Gaussian Blur Filter
As with the Sobel filter in Part 2, the kernel is generated using the function described above and then convolved with the image. The only differences here are that we support color images and that the boundary pixels remain unchanged.
The function takes four parameters: the input and output images, the kernel size, and sigma.
- Line 2: Create a Gaussian Kernel using our get_guassian_kernel function.
- Line 4: Check that the input and output images have the same number of channels (both color or both grayscale).
- Line 9–11: Three loops that visit each pixel value in each channel. The real number of columns in memory here is column_size * number_of_channels, so we advance the column index by number_of_channels at each step and then, at Line 11, loop over the channel values of that pixel.
- Line 13–18: If the pixel is a boundary pixel, keep the original value for it.
- Line 20–29: The convolution operation between the kernel and the image pixels. It is very similar to the one in the Sobel filter implementation, except that here we add the channel index to loop through the red, green, and blue values of color images.
*Note: The convolution here is performed on values scaled to the [0–1] range, whereas, as mentioned earlier, our function accepts only images with pixel values in [0–255]. So, we convert each value while performing the operation and convert the result back to [0–255] when writing it to the output. Try adding a few conditions that check the image type to support a wider range of input image types.
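The steps above can be sketched on a plain byte buffer as follows. This is not the original listing: blur_interior and its parameters are hypothetical names, and the buffers are assumed to be tightly packed and interleaved, which is how cv::Mat::data is laid out for a continuous 8-bit Mat.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Blurs an interleaved 8-bit image (rows x cols x channels) with a square kernel.
// Assumptions: src and dst are tightly packed buffers of equal size, and kernel
// holds kernel_size * kernel_size weights that sum to 1.
void blur_interior(const unsigned char* src, unsigned char* dst,
                   int rows, int cols, int channels,
                   const std::vector<double>& kernel, int kernel_size) {
    assert(kernel_size > 0 && kernel_size % 2 == 1);
    assert(kernel.size() == static_cast<std::size_t>(kernel_size) * kernel_size);

    const int half = kernel_size / 2;
    const int stride = cols * channels;  // number of values in one row of memory

    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            for (int ch = 0; ch < channels; ++ch) {
                const int index = r * stride + c * channels + ch;

                // Boundary pixels keep their original value.
                if (r < half || r >= rows - half || c < half || c >= cols - half) {
                    dst[index] = src[index];
                    continue;
                }

                // Weighted sum over the neighborhood, on values scaled to [0, 1].
                double sum = 0.0;
                for (int ky = -half; ky <= half; ++ky) {
                    for (int kx = -half; kx <= half; ++kx) {
                        const double weight = kernel[(ky + half) * kernel_size + (kx + half)];
                        const int neighbor = (r + ky) * stride + (c + kx) * channels + ch;
                        sum += weight * (src[neighbor] / 255.0);
                    }
                }

                // Scale back to [0, 255], round, and clamp before storing.
                const double scaled = std::round(sum * 255.0);
                dst[index] = static_cast<unsigned char>(
                    scaled < 0.0 ? 0.0 : (scaled > 255.0 ? 255.0 : scaled));
            }
        }
    }
}
```

Accumulating in double and rounding once at the end avoids the truncation errors that would build up if each product were stored back into an 8-bit value.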
Let us now run our implementation for Gaussian Blur filter as well as the OpenCV one to compare the results for both colored (3-channel) and gray (1-channel) images.
Yeah!!! The results are almost the same. The remaining difference is at the borders: the OpenCV function pads the image and blurs the boundary pixels as well. Yes, go and add that part to make the results match exactly.
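One possible starting point for that challenge: by default, cv::GaussianBlur extrapolates the border with BORDER_REFLECT_101, which mirrors the image around the edge pixel without repeating it. The helper below is a hypothetical sketch of that index mapping. It could replace the boundary check by remapping out-of-range row and column indices instead of skipping them.

```cpp
#include <cassert>

// Maps any index onto [0, n) by mirroring around the edge pixel without
// repeating it, e.g. for n = 5: -2 -> 2, -1 -> 1, 5 -> 3, 6 -> 2.
// Assumption: the helper name is hypothetical and not part of OpenCV.
int reflect_101(int i, int n) {
    assert(n > 0);
    if (n == 1) {
        return 0;  // a single pixel is its own mirror image
    }
    const int period = 2 * (n - 1);  // the mirrored pattern repeats with this period
    i %= period;
    if (i < 0) {
        i += period;
    }
    return i < n ? i : period - i;
}
```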
Closest Person's Face
Now, let us shift gears and leave those tiring filter implementations for a different task: finding the closest person's face to the camera. The first step is to detect the faces in the image. This is a well-studied task known as Face Detection. There are many ways to solve it, from the OpenCV Haar Cascade Classifier to many Deep Learning models for face detection. The main differences between these methods fall under three factors: model size, accuracy and speed. Since the focus of this exercise is not face detection itself, we will use the Haar Cascade Classifier from OpenCV.
So, how can we get the closest person's face to the camera after detecting all the faces in the image? Check the following animated picture and answer for yourself. An animated image is worth a thousand words, right?
When a person is closer to the camera, the face area in the image is bigger. So, picking the detected face with the largest area is equivalent to picking the closest face to the camera. Of course, this assumption is not 100% true: it breaks down when people are at nearly the same distance from the camera, because real face sizes differ and the detector's boxes are not perfectly accurate. We won’t consider those situations in our implementation.
- Line 3–10: Create an OpenCV CascadeClassifier and load the model. You can find the model here.
- Line 11: Load the image in which we want to find the closest face to the camera.
- Line 16: The detection operation.
- Line 19–26: Loop through the detected faces and find the one with the maximum area. Also, draw a white rectangle around each face the detector was able to find.
- Line 28–33: Draw a red rectangle around the closest face.
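The selection step in the list above boils down to a maximum-area search, sketched below without OpenCV. FaceBox is a hypothetical stand-in for cv::Rect, and find_closest_face is an illustrative name rather than the function from the original listing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cv::Rect so the sketch compiles without OpenCV.
struct FaceBox {
    int x;
    int y;
    int width;
    int height;
};

// Returns the index of the box with the largest area, or -1 if faces is empty.
// When several boxes share the largest area, the first one wins.
int find_closest_face(const std::vector<FaceBox>& faces) {
    int best_index = -1;
    long best_area = -1;
    for (std::size_t i = 0; i < faces.size(); ++i) {
        assert(faces[i].width >= 0 && faces[i].height >= 0);
        const long area = static_cast<long>(faces[i].width) * faces[i].height;
        if (area > best_area) {
            best_area = area;
            best_index = static_cast<int>(i);
        }
    }
    return best_index;
}
```

With OpenCV, the input would be the std::vector<cv::Rect> filled by cv::CascadeClassifier::detectMultiScale, and cv::Rect::area() gives the same width times height product.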
The closest face has the red rectangle, so we achieved our goal. We can also see that the detector missed a few faces, which means that we would need to look into more accurate models for face detection (which is outside the scope of this article).
This is the last article in this series unless someone suggests ideas to write about. Do not hesitate to do that :)
One final note: the solutions I have provided in this article may not be fully optimized. Our goal was to practice together, so do not hesitate to suggest any modifications. Also, feel free to share any ideas or concerns.