What is Perspective Warping? | OpenCV and Python

A step-by-step guide to applying a perspective transformation on images

G SowmiyaNarayanan
Published in Towards AI
6 min read · Sep 17, 2020

Computer vision is all abuzz now. People everywhere are working on some form of deep-learning-based computer vision project. But before the advent of Deep Learning, image processing techniques were used to manipulate and transform images in order to obtain insights that would help with the task at hand. Today, let’s see how we can implement a simple yet helpful technique known as Perspective Projection to warp an image.

But wait! What does warping an image mean? I could explain it with a lot of fancy words and technical jargon. But it is easier to just show the end result so that you can learn by seeing. You are reading a Computer Vision article anyway :)

Base Image (Source) — Subject Image (Source) — Warped Output (Image by Author)

So basically, you take an image and distort it so that it fits into a four-sided region of any desired shape. The other way round is also possible. Now that that’s out of the way, let’s take a look at how we can implement this using OpenCV and our trustworthy friend, Python (❤).

For you people who just want the code, no worries, I got you guys covered :P Here is the link to my GitHub repository.

Before we get into the main parts of the code, we must first import the necessary libraries.

import cv2
import numpy as np

Now, let’s read in the base image and the subject image as follows.

base_image = cv2.imread('base_img.jpg')
base_image_copy = base_image.copy()
subject_image = cv2.imread('subject.jpg')
Base Image(left) — Subject Image(right)

Initialize a list to store the coordinates of the 4 corners within which we want to overlay our subject image. We can choose these 4 points manually using the setMouseCallback() function as shown below.

def click_event(event, x, y, flags, params):
    # Record a corner on each left click; ignore clicks once 4 points are stored
    if event == cv2.EVENT_LBUTTONDOWN and len(points) < 4:
        cv2.circle(base_image_copy, (x, y), 4, (0, 0, 255), -1)
        points.append([x, y])
        cv2.imshow('image', base_image_copy)

points = []

base_image = cv2.imread('base_img.jpg')
base_image_copy = base_image.copy()
subject_image = cv2.imread('subject.jpg')

cv2.imshow('image', base_image_copy)
cv2.setMouseCallback('image', click_event)
cv2.waitKey(0)
cv2.destroyAllWindows()

In the code snippet given above, we define a function called click_event() and pass it as an argument to the setMouseCallback() function. With this method, we first display the base image. We can then manually choose four points within the image to set as our target. Our subject image will be warped onto this target. The coordinates are recorded each time the left mouse button is pressed, and are stored in the points list we initialized earlier. The selected points are highlighted as red dots, as shown below. Once the four points are selected, press any key to close the window and continue.

Selecting Corner Points (GIF by Author)

Each of us might click the 4 points in a different order, so we need a consistent ordering among the chosen points. I chose to order them clockwise, i.e., top-left, top-right, bottom-right, bottom-left. This is achieved by the sort_pts() function shown below. It relies on the fact that the sum of the x- and y-coordinates (x + y) is smallest at the top-left corner and largest at the bottom-right corner. Similarly, the difference y − x, which is what np.diff() computes along each row, is smallest at the top-right corner and largest at the bottom-left corner. Take a moment to verify this for yourself. Keep in mind that for images, the origin is at the top-left corner and y increases downwards.

def sort_pts(points):
    sorted_pts = np.zeros((4, 2), dtype="float32")
    s = np.sum(points, axis=1)
    sorted_pts[0] = points[np.argmin(s)]
    sorted_pts[2] = points[np.argmax(s)]

    diff = np.diff(points, axis=1)
    sorted_pts[1] = points[np.argmin(diff)]
    sorted_pts[3] = points[np.argmax(diff)]

    return sorted_pts

sorted_pts = sort_pts(points)
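
To see the ordering rule in action, here is a small self-contained check with made-up click coordinates (the numbers are purely illustrative). It repeats sort_pts() so the snippet runs on its own. The helper has_four_distinct_corners() is a hypothetical addition, not part of the original code: the sum/difference rule can pick the same point twice when the quadrilateral is rotated close to 45 degrees, so it may be worth checking the result before using it.

```python
import numpy as np

def sort_pts(points):
    """Order 4 points as top-left, top-right, bottom-right, bottom-left."""
    points = np.asarray(points, dtype="float32")
    sorted_pts = np.zeros((4, 2), dtype="float32")

    s = points.sum(axis=1)                    # x + y for every point
    sorted_pts[0] = points[np.argmin(s)]      # top-left: smallest sum
    sorted_pts[2] = points[np.argmax(s)]      # bottom-right: largest sum

    diff = np.diff(points, axis=1)            # y - x for every point
    sorted_pts[1] = points[np.argmin(diff)]   # top-right: smallest y - x
    sorted_pts[3] = points[np.argmax(diff)]   # bottom-left: largest y - x
    return sorted_pts

def has_four_distinct_corners(sorted_pts):
    """Hypothetical sanity check: True if no corner was selected twice."""
    return len(np.unique(sorted_pts, axis=0)) == 4

# Illustrative clicks, deliberately given in a scrambled order
clicks = [[300, 400], [100, 50], [320, 60], [90, 380]]
ordered = sort_pts(clicks)
print(ordered.tolist())
# [[100.0, 50.0], [320.0, 60.0], [300.0, 400.0], [90.0, 380.0]]

# A diamond-shaped selection produces ties, so the rule breaks down
diamond = [[5, 0], [10, 5], [5, 10], [0, 5]]
print(has_four_distinct_corners(sort_pts(diamond)))  # False
```

If the check fails, the simplest remedy is probably to reselect the points or to enter them in clockwise order by hand.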

After sorting the points, let's use them to calculate the transformation matrix. We create a NumPy array called “pts1” which holds the coordinates of the 4 corners of the subject image. Similarly, we create a NumPy array called “pts2” which holds the sorted points. The order of the coordinates in “pts1” must match that of “pts2”, so “pts1” also runs clockwise starting from the top-left corner.

h_base, w_base, c_base = base_image.shape
h_subject, w_subject = subject_image.shape[:2]

pts1 = np.float32([[0, 0], [w_subject, 0], [w_subject, h_subject], [0, h_subject]])
pts2 = np.float32(sorted_pts)

We now compute the transformation matrix required to warp the subject image, using the cv2.getPerspectiveTransform() function. Since we want to transform the subject image so that it fits the box we chose in the base image, “src” should be “pts1” and “dst” should be “pts2”. With the resulting 3×3 matrix, we warp the image using the cv2.warpPerspective() method, as shown in the snippet below. The size of the output image is passed as a (width, height) tuple; we use the dimensions of the base image so that the result lines up with it.

transformation_matrix = cv2.getPerspectiveTransform(pts1, pts2)

warped_img = cv2.warpPerspective(subject_image, transformation_matrix, (w_base, h_base))
cv2.imshow('Warped Image', warped_img)

The warped image looks like this:

Warped Image (Image by Author)
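
For some intuition about what the 3×3 matrix does, the sketch below solves for it with plain NumPy instead of OpenCV. It assumes the usual formulation in which the bottom-right entry of the matrix is fixed to 1, so the four point pairs give eight linear equations in eight unknowns. The function names solve_homography() and apply_homography(), as well as the corner coordinates, are illustrative and not part of the original code; the result should agree with cv2.getPerspectiveTransform() up to floating-point error.

```python
import numpy as np

def solve_homography(src, dst):
    """Solve for the 3x3 perspective matrix mapping 4 src points to 4 dst points."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), rearranged to be linear
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        # v = (h21*x + h22*y + h23) / (h31*x + h32*y + 1), rearranged to be linear
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.array(A, dtype=np.float64),
                        np.array(b, dtype=np.float64))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, pts):
    """Map 2D points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=np.float64)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # (x, y) -> (x, y, 1)
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]             # divide by the third coordinate

# Illustrative values: corners of a 200x100 subject image and a chosen quadrilateral
src = [[0, 0], [200, 0], [200, 100], [0, 100]]
dst = [[100, 50], [320, 60], [300, 400], [90, 380]]

H = solve_homography(src, dst)
print(np.allclose(apply_homography(H, src), dst))  # True
```

The division by the third coordinate is what separates a perspective transformation from an affine one: it lets parallel lines in the subject image converge in the output.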

The next step is to create a mask. We start with a blank (all-black) image that has the same shape as the base image.

mask = np.zeros(base_image.shape, dtype=np.uint8)
Initial Mask (Image by Author)

Onto this blank mask we draw a polygon with corners given by the “sorted_pts” and fill it with white color using the cv2.fillConvexPoly() method. The resultant mask would look like this.

roi_corners = np.int32(sorted_pts)

cv2.fillConvexPoly(mask, roi_corners, (255, 255, 255))
Filled-In Mask (Image by Author)
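
If you are curious how such a filled polygon can be built, here is a rough NumPy-only sketch. It marks a pixel as inside when it lies on the same side of all four edges, which holds only for convex quadrilaterals. The function convex_quad_mask() is a hypothetical name, it produces a single-channel mask for brevity, and pixels on the boundary may not match cv2.fillConvexPoly() exactly.

```python
import numpy as np

def convex_quad_mask(shape, corners):
    """Return a uint8 mask (255 inside, 0 outside) for a convex quadrilateral.

    shape   -- (height, width) of the mask
    corners -- 4 (x, y) points given in clockwise or counter-clockwise order
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]               # pixel coordinate grids
    corners = np.asarray(corners, dtype=np.float64)

    all_non_negative = np.ones((h, w), dtype=bool)
    all_non_positive = np.ones((h, w), dtype=bool)
    for i in range(4):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % 4]
        # Sign of the cross product tells which side of the edge a pixel is on
        cross = (x2 - x1) * (ys - y1) - (y2 - y1) * (xs - x1)
        all_non_negative &= cross >= 0
        all_non_positive &= cross <= 0

    inside = all_non_negative | all_non_positive  # works for either winding order
    return np.where(inside, 255, 0).astype(np.uint8)

# Tiny illustrative example: a square with corners (1, 1) and (4, 4) in a 6x6 mask
small_mask = convex_quad_mask((6, 6), [[1, 1], [4, 1], [4, 4], [1, 4]])
print(small_mask)
```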

Now we invert the mask colors using the cv2.bitwise_not() method.

mask = cv2.bitwise_not(mask)
Inverted Mask (Image by Author)

Now we take the mask and the base image and perform a bitwise-AND operation using the cv2.bitwise_and() method.

masked_image = cv2.bitwise_and(base_image, mask)

This gives us the image shown below. You can see that only the area onto which the subject image is to be placed is black.

Masked Base Image (Image by Author)

The final step is to take the warped image and the masked image and perform a bitwise-OR operation using the cv2.bitwise_or() method. Because the warped image is black outside the selected region and the masked image is black inside it, this produces the fused image we set out to create.

output = cv2.bitwise_or(warped_img, masked_image)
cv2.imshow('Fused Image', output)
cv2.imwrite('Final_Output.png', output)
cv2.waitKey(0)
cv2.destroyAllWindows()

We have done it! We have successfully overlaid one image onto another.

Fused Image (Image by Author)
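
The invert, AND and OR steps are easy to trace on a toy example. The sketch below uses NumPy's bitwise functions on tiny single-channel arrays with made-up pixel values; on uint8 data they give the same element-wise results as the cv2 calls used above.

```python
import numpy as np

# Toy 4x4 "images": a uniform base and a warped subject that is black
# everywhere except the 2x2 target region (values are illustrative)
base = np.full((4, 4), 200, dtype=np.uint8)
warped = np.zeros((4, 4), dtype=np.uint8)
warped[1:3, 1:3] = 90

# White polygon on a black mask, then inverted: the target region becomes 0
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 255
mask = np.bitwise_not(mask)

# AND blacks out the target region of the base image
masked_base = np.bitwise_and(base, mask)

# OR fills that black region with the warped subject
output = np.bitwise_or(warped, masked_base)
print(output.tolist())
# [[200, 200, 200, 200], [200, 90, 90, 200], [200, 90, 90, 200], [200, 200, 200, 200]]
```

Note that the OR step only gives a clean result when every pixel is zero in at least one of the two inputs; where both are non-zero, their bits would get mixed.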

This is a very simple use case of perspective transformation. You can use this for generating a bird’s eye view of an area when you are tracking the movement of objects/persons in the frame. The only limit is your imagination.

The entire code for this article can be found in my GitHub repository through this link.

~~ SowmiyaNarayanan G

PS:

Feel free to reach out to me if you have any questions; I am happy to help. My doors are always open to constructive criticism, so don’t hesitate to let me know what you think of my work. You can also connect with me on LinkedIn.

Mind Bytes:

“Everything you can imagine is real”― Pablo Picasso

Using Deep Learning and Computer Vision to tackle challenges | Diagnosed with Obsessive Coffee Disorder | Motto : Try -> Succeed/Fail -> Persevere