Find Unauthorized Constructions Using Aerial Photography and Deep Learning with Code (Part 1)

Maciej Zieniewicz
Published in Towards AI
6 min read · Dec 7, 2020

Orthophoto, ground truth mask with buildings and predicted outcome of DL model.
The final outcome of our Detector.

Introduction

In this article, we will go through the data gathering and preprocessing phase. It’s an important step that needs to be done before we jump into deep learning models.

We have developed a unique — and more importantly, automated — method of acquisition and preparation of training data. Using publicly available resources, we have collected RGB photos for over 50,000 locations.

Using the Land and Buildings Register information and some image transformations, we prepared binary masks for each location, delimiting the area of all registered buildings.

You can find a link to the discussed notebook at the end of this article.

Problem Statement

A building permit is one of the very first steps when erecting a building, and it is a requirement for everyone. Despite severe penalties, many people decide to build without one.

Is it a serious problem in Poland? In 2011, aerial photos of Gliwice (a medium-sized city in Poland) were taken and used to analyze the legal status of its buildings (manually!). The results are shocking: 1374 discrepancies were detected [link].

The results of this project clearly show that the irregularities that get detected and recorded in official data are only a fraction of the reality.

The main reason for this divergence is the high cost of aerial / satellite imagery and the need for manual image analysis. Technical limitations and the lack of digital skills among local officials make it even harder to monitor the legality of buildings and to catch possible errors.

Data Gathering and Preprocessing

We will use a few typical packages and rasterio for handling raster data — in our case, spatial (orthophoto) images.

from io import BytesIO
import requests
import logging
from datetime import datetime

import numpy as np
import matplotlib.pyplot as plt

import rasterio
from scipy import ndimage as ndi
from skimage.transform import resize
from skimage.segmentation import watershed
from skimage.morphology import closing

logging.basicConfig(level=logging.INFO)

Now we need to set up parameters for our image, including size and coordinates. You can experiment with these attributes freely, but you should follow some rules:

  1. The x and y coordinates have to lie within Polish borders (we are using a specific data source).
  2. The coordinates use the EPSG:2180 projection (used in Poland for small-scale maps) instead of the common EPSG:4326. You can find an appropriate converter here.

# image resolution "on the ground"
meters_by_pixel = 1
img_side = 256 # side length in meters
img_size = (abs(int(img_side/meters_by_pixel)), abs(int(img_side/meters_by_pixel))) # in pixels, for comparing sources

# image coordinates
x_min = 597500
y_min = 657450
x_max = x_min + img_side
y_max = y_min + img_side

logging.info(f"image coordinates: {(x_min, y_min), (x_max, y_max)}, image size: {img_size} resolution: {img_side}/{meters_by_pixel}")
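To scale from one location to many, the same parameters can be wrapped in small helpers. The sketch below is only an assumption about how such tiling might be organized, not the code used to collect our dataset; `make_aoi`, `pixel_size` and `tile_grid` are hypothetical names.

```python
def make_aoi(x_min, y_min, side_m):
    """Return a square bounding box (x_min, y_min, x_max, y_max) in EPSG:2180 meters."""
    return (x_min, y_min, x_min + side_m, y_min + side_m)


def pixel_size(aoi, meters_by_pixel):
    """Return the (width, height) in pixels of a bounding box at the given resolution."""
    x_min, y_min, x_max, y_max = aoi
    return (abs(int((x_max - x_min) / meters_by_pixel)),
            abs(int((y_max - y_min) / meters_by_pixel)))


def tile_grid(x_start, y_start, side_m, n_cols, n_rows):
    """Yield adjacent, non-overlapping square tiles, row by row (hypothetical helper)."""
    for row in range(n_rows):
        for col in range(n_cols):
            yield make_aoi(x_start + col * side_m, y_start + row * side_m, side_m)


# The location used in this article, plus a 2 x 2 block of neighboring tiles
aoi = make_aoi(597500, 657450, 256)
tiles = list(tile_grid(597500, 657450, 256, n_cols=2, n_rows=2))
```

Each tuple in `tiles` has the same layout as the `aoi` argument expected by the download functions later in the article.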

We will use the official geoportal service to get current orthophoto images for the chosen location. The website also supports English if you want to test its capabilities.

To create requests correctly, we need to use a GetMap request from the WMS (Web Map Service) protocol, which is widely used to share geographic data over the internet.
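As a rough illustration of what a GetMap request looks like, the sketch below assembles the query string with the standard library. The helper name `build_getmap_url` and the `example.com` endpoint are placeholders introduced here for illustration; the parameter names follow WMS 1.1.1.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, aoi, size, srs="EPSG:2180"):
    """Assemble a WMS 1.1.1 GetMap URL for a bounding box and an image size in pixels."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in aoi),  # x_min,y_min,x_max,y_max
        "WIDTH": size[0],
        "HEIGHT": size[1],
        "FORMAT": "image/png",
        "TRANSPARENT": "TRUE",
    }
    # safe=",:/" keeps the bbox, SRS and MIME type readable instead of percent-encoding them
    return base_url + "?" + urlencode(params, safe=",:/")


url = build_getmap_url(
    "http://example.com/wms",  # placeholder endpoint, not a real service
    "Raster",
    (597500, 657450, 597756, 657706),
    (256, 256),
)
```

The bounding box and the pixel size together determine the ground resolution: a 256 m square requested at 256 x 256 pixels gives 1 meter per pixel.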

Then BytesIO lets us handle the response content, which is actually a PNG image, and rasterio reads the picture we are interested in as a NumPy array. Because the raw array has a “channel-first” shape (4, 256, 256), we need to transform it into a (256, 256, 4) array and remove the unnecessary alpha channel.
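The axis reordering can be tried in isolation. In this small sketch a zero-filled array stands in for the output of `dataset.read()`, so no request is needed; the array contents are made up for illustration.

```python
import numpy as np

# Stand-in for dataset.read(): 4 bands (R, G, B, alpha) of a 256 x 256 image
raw = np.zeros((4, 256, 256), dtype=np.uint8)
raw[3] = 255  # fully opaque alpha band

channels_last = np.moveaxis(raw, 0, 2)  # (4, 256, 256) -> (256, 256, 4)
rgb = channels_last[:, :, :3]           # keep R, G, B and drop alpha

print(channels_last.shape, rgb.shape)
```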

# GEOPORTAL ORTO
def get_orto_image(aoi, resolution):
    """Download an RGB orthophoto for the bounding box aoi (EPSG:2180) as a NumPy array."""
    x_min, y_min, x_max, y_max = aoi

    ORTO_WMS_URL = "mapy.geoportal.gov.pl/wss/service/img/guest/ORTO/MapServer/WMSServer"

    layer = 'Raster'
    # image size in pixels
    size = (abs(int((x_min - x_max)/resolution)), abs(int((y_min - y_max)/resolution)))

    params = {
        'request': 'GetMap',
        'service': 'WMS',
        'version': '1.1.1',
        'layers': layer,
        'styles': '',
        'width': size[0],
        'height': size[1],
        'srs': 'EPSG:2180',
        'bbox': ','.join((str(x) for x in (x_min, y_min, x_max, y_max))),
        'format': 'image/png',
        'transparent': 'TRUE'
    }

    parsed_url = "http://" + ORTO_WMS_URL + "?" + "&".join([f"{k.upper()}={v}" for k,v in params.items()])

    logging.info(f"Requesting image from get_orto_image({aoi}, {resolution}), img_size: {size}")
    response = requests.get(parsed_url)

    if response.ok:
        img = BytesIO(response.content)
        with rasterio.MemoryFile(img) as memfile:
            with memfile.open() as dataset:
                data_array = dataset.read()
                # channel-first (4, H, W) -> channel-last (H, W, 4)
                data_array = np.moveaxis(data_array, 0, 2)

                # drop the alpha channel
                return data_array[:,:,:3]

    logging.warning(f"Could not get image from get_orto_image({aoi}, {resolution})")
    return None

Picture 1. Orthophoto for the chosen location (geoportal response).

Orthophoto images are the first part of our dataset. Now we need to find the officially registered buildings for these locations using a similar function. The Polish Head Office of Geodesy and Cartography (GUGiK) provides the necessary data.

# BUILDINGS
def get_building_image(aoi, resolution):
    """Download the contours of registered buildings for the bounding box aoi (EPSG:2180)."""
    x_min, y_min, x_max, y_max = aoi
    BUILD_WMS_URL = "integracja.gugik.gov.pl/cgi-bin/KrajowaIntegracjaEwidencjiGruntow"

    layer = 'budynki'
    # image size in pixels
    size = (abs(int((x_min - x_max)/resolution)), abs(int((y_min - y_max)/resolution)))

    params = {
        'request': 'GetMap',
        'service': 'WMS',
        'version': '1.1.1',
        'layers': layer,
        'styles': '',
        'width': size[0],
        'height': size[1],
        'srs': 'EPSG:2180',
        'bbox': ','.join((str(x) for x in (x_min, y_min, x_max, y_max))),
        'format': 'image/png',
        'transparent': 'TRUE'
    }

    parsed_url = "http://" + BUILD_WMS_URL + "?" + "&".join([f"{k.upper()}={v}" for k,v in params.items()])

    logging.info(f"Requesting image from get_building_image({aoi}, {resolution}), img_size: {size}")
    response = requests.get(parsed_url)

    if response.ok:
        img = BytesIO(response.content)
        with rasterio.MemoryFile(img) as memfile:
            with memfile.open() as dataset:
                data_array = dataset.read()
                # channel-first (4, H, W) -> channel-last (H, W, 4)
                data_array = np.moveaxis(data_array, 0, 2)

                # drop the alpha channel
                return data_array[:,:,:3]

    logging.warning(f"Could not get image from get_building_image({aoi}, {resolution})")
    return None

Picture 2. Contours of buildings for the same coordinates (geoportal response).

To effectively compare orthophotos with contours, we need to create a mask by closing and filling all the contours, both closed and open ones. We will combine some existing solutions that the skimage and scipy packages provide.

def get_building_mask(aoi, resolution):
    """Build a boolean mask of registered buildings for the bounding box aoi."""
    logging.info(f"Generating image for get_building_mask({aoi}, {resolution})")
    # use the function arguments rather than the global coordinates
    data_array = get_building_image(aoi, resolution)

    if data_array is not None:
        # keep only the first (red) channel as a boolean contour image
        data_array = data_array[:,:,0] > 0

        # closing contours
        data_array = closing(data_array)

        # filling contours
        data_array = ndi.binary_fill_holes(data_array)

        # filling open contours with watershed
        water_mask = watershed(data_array)
        # the most frequent label is treated as the background
        water_mask = water_mask != np.bincount(water_mask.flatten()).argmax()
        # for boolean arrays, + acts as a logical OR
        data_array = water_mask + data_array

        return data_array

    logging.warning(f"Could not get image for get_building_mask({aoi}, {resolution})")
    return None

First, we extract only one (red) channel, then use the closing transformation to fill in invisible gaps that might have left contours open. We then fill the closed contours with binary_fill_holes, which brings us closer to the final solution.

Picture 3. Result of the mask after using binary_fill_holes.
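The effect of these two steps is easiest to see on a toy contour. The sketch below is not the pipeline from `get_building_mask`: it swaps in scipy's `binary_closing` with an explicit 3 x 3 structuring element (an assumption chosen so that the one-pixel gap visibly closes) in place of skimage's `closing`, and the 9 x 9 image is made up for illustration.

```python
import numpy as np
from scipy import ndimage as ndi

# 9 x 9 toy image: outline of a 5 x 5 building with a one-pixel gap in the top edge
contour = np.zeros((9, 9), dtype=bool)
contour[2, 2:7] = contour[6, 2:7] = True
contour[2:7, 2] = contour[2:7, 6] = True
contour[2, 4] = False

# Filling the open contour directly does nothing: the interior leaks out through the gap
open_filled = ndi.binary_fill_holes(contour)

# Closing first repairs the gap, after which the interior can be filled
closed = ndi.binary_closing(contour, structure=np.ones((3, 3), dtype=bool))
filled = ndi.binary_fill_holes(closed)

print(contour.sum(), open_filled.sum(), closed.sum(), filled.sum())
```

Whether a given gap closes depends on the structuring element, so real contours with larger breaks may still stay open after this step.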

Now it’s time for a trick with a watershed segmentation algorithm. As per this explanation:

The watershed transform floods an image of elevation starting from markers in order to determine the catchment basins of these markers. Watershed lines separate these catchment basins and correspond to the desired segmentation. [link]

In our example, the watershed treats the empty areas enclosed by contours as objects separate from the background and returns three instances: the background and two buildings. The background is returned as value 1 here; the code identifies it more generally as the most frequent label and keeps only the remaining instances.

Picture 4. Result of the mask after using watershed segmentation.
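The background selection can be illustrated without running the watershed itself. In the sketch below, a hand-written label image stands in for the watershed output, so the values are illustrative only.

```python
import numpy as np

# Hand-written label image standing in for the output of watershed():
# 1 = background, 2 and 3 = two building instances
labels = np.array([
    [1, 1, 1, 1, 1, 1],
    [1, 2, 2, 1, 1, 1],
    [1, 2, 2, 1, 3, 1],
    [1, 1, 1, 1, 3, 1],
    [1, 1, 1, 1, 1, 1],
])

# The most frequent label is assumed to be the background
background_label = np.bincount(labels.ravel()).argmax()

# Everything that is not background is treated as a building
building_mask = labels != background_label
```

This relies on the background being the largest region in the tile, which may not hold for a location that is almost entirely covered by buildings.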

Now, we simply add the filled-contour mask and the watershed mask together, and the final mask is ready for training purposes.

Picture 5. The final mask of registered buildings.
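For boolean NumPy arrays, adding two masks is equivalent to a logical OR, so a pixel belongs to the final mask if either step marked it. A minimal sketch with two made-up 2 x 3 masks:

```python
import numpy as np

filled_mask = np.array([[True, False, False],
                        [True, True, False]])
water_mask = np.array([[False, False, True],
                       [False, True, True]])

added = filled_mask + water_mask                   # + on boolean arrays acts as logical OR
combined = np.logical_or(filled_mask, water_mask)  # the same operation, spelled out

print(np.array_equal(added, combined))
```

Writing the combination as `np.logical_or` (or `|`) may be easier to read, and it avoids surprises if one of the masks is ever converted to an integer type.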

What’s Next…?

In the next article, we will focus on training our segmentation model, using the U-Net architecture, to recognize buildings on orthophoto images.

Resources and Authors

Link to the official notebook

Meet our team:
Marta Augustynowicz
Łukasz Sawaniewski
Dariusz Tanajewski
Igor Wieczorek
Maciej Zieniewicz

The project was carried out within the DataWorkshop Foundation community.

DataWorkshop Foundation is primarily about machine learning and people. We focus on pro-social activities, using the potential of machine learning. The goal of the DataWorkshop Foundation is to disseminate knowledge about machine learning and AI through practical activity that leads to solutions for important current problems. As part of the foundation’s activities, we build a knowledge base and gain experience, which we then want to use to apply existing machine learning techniques and tools to innovative solutions.
