ID Cards Segmentation using U-Net

Reza Arrazi
3 min read · Jan 7, 2021


Most people in the deep learning and computer vision communities understand what image classification is: we want our model to tell us what single object or scene is present in the image. Classification is very coarse and high-level.

Many are also familiar with object detection, where we try to locate and classify multiple objects within the image, by drawing bounding boxes around them and then classifying what’s in the box. Detection is mid-level, where we have some pretty useful and detailed information, but it’s still a bit rough since we’re only drawing bounding boxes and don’t really get an accurate idea of object shape.

Semantic segmentation is the most informative of the three: we classify each and every pixel in the image, just like you see in the gif above! Over the past few years, this has been done almost entirely with deep learning.

If you’d like to try out the models yourself, you can check out my Semantic Segmentation for ID Cards repository, complete with TensorFlow training and testing code for U-Net as well as the labeling tools!

https://gitlab.com/rezaarrazi/idcard_segmentation

Model

U-Net was used as the model. U-Net is a convolutional neural network that was developed for biomedical image segmentation at the Computer Science Department of the University of Freiburg, Germany. The network is based on the fully convolutional network, and its architecture was modified and extended to work with fewer training images and to yield more precise segmentations. Segmentation of a 512×512 image takes less than a second on a modern GPU.

architecture of U-Net
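
To make the encoder-decoder structure concrete, the following sketch traces feature-map shapes through a U-Net-style network for a 512×512 input. It is only an illustration under stated assumptions: 'same'-padded convolutions (the original U-Net uses unpadded ones), the filter counts 64 to 1024 from the original architecture, and a single output channel for card versus background. The function name `unet_shapes` and its parameters are hypothetical and are not taken from the repository.

```python
def unet_shapes(size=512, in_channels=3, base_filters=64, depth=4, num_classes=1):
    """Trace (height, width, channels) through a U-Net-style network.

    Assumes 'same'-padded convolutions, 2x2 max pooling on the way down,
    and 2x upsampling plus skip concatenation on the way up.
    """
    if size % (2 ** depth) != 0:
        raise ValueError("input size must be divisible by 2**depth")

    stages = [("input", (size, size, in_channels))]
    skips = []
    filters = base_filters

    # Contracting path: convolve, remember the feature map, then halve it.
    for level in range(depth):
        stages.append((f"encoder {level}", (size, size, filters)))
        skips.append(filters)
        size //= 2
        filters *= 2

    stages.append(("bottleneck", (size, size, filters)))

    # Expanding path: double the resolution and halve the channels,
    # concatenate the matching skip connection, then convolve back
    # down to the skip's channel count.
    for level in reversed(range(depth)):
        skip_filters = skips[level]
        size *= 2
        filters //= 2
        stages.append((f"concat {level}", (size, size, filters + skip_filters)))
        stages.append((f"decoder {level}", (size, size, skip_filters)))

    # A final 1x1 convolution maps every pixel to per-class scores.
    stages.append(("output", (size, size, num_classes)))
    return stages


for name, shape in unet_shapes():
    print(f"{name:12s} {shape}")
```

The skip connections are what set U-Net apart from a plain encoder-decoder: each decoder stage sees both the upsampled coarse features and the full-resolution features saved on the way down, which helps the output mask follow object edges closely.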

Metric

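IoU (Intersection over Union, also known as the Jaccard coefficient) was used to measure the quality of the model. It is the size of the overlap between the predicted mask and the ground-truth mask divided by the size of their union. The closer the coefficient is to 1, the more similar the two masks are. Its minimum value is 0, which means the masks do not overlap at all.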

IoU formula
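
For two binary masks, the metric can be computed in a few lines. This is a minimal NumPy sketch, assuming both masks are arrays of the same shape. The function name `iou` and the convention of returning 1.0 when both masks are empty are illustrative choices, not taken from the repository.

```python
import numpy as np


def iou(pred_mask, true_mask):
    """Intersection over Union (Jaccard coefficient) of two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    # Assumption: two empty masks count as a perfect match.
    if union == 0:
        return 1.0
    return float(intersection) / float(union)


# Toy 4x4 example: the prediction covers a 2x2 block (4 pixels),
# the ground truth a 2x3 block (6 pixels) that fully contains it.
pred = np.zeros((4, 4), dtype=int)
true = np.zeros((4, 4), dtype=int)
pred[1:3, 1:3] = 1
true[1:3, 1:4] = 1

score = iou(pred, true)  # intersection 4, union 6
print(round(score, 3))
```

Here the prediction misses one column of the card, so the score drops to 4/6 even though every predicted pixel is correct.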

Results

Raw image example:

raw image of selfie with id card
raw image of id card photo

Segmentation result:

segmentation result of selfie with id card
segmentation result of id card photo

Perspective transformation:

perspective transformation of selfie with id card
perspective transformation of id card
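
One common way to get from a segmentation mask to a rectified card is to take the four corners of the segmented region and solve for the homography that maps them onto an upright rectangle. The sketch below shows only that last step with NumPy. The corner coordinates, the output size, and the function names are hypothetical, and the repository may implement this step differently.

```python
import numpy as np


def homography_from_corners(src, dst):
    """Solve for the 3x3 matrix H that maps four src points onto four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # From u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v,
        # each point pair contributes two linear equations in h0..h7.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def apply_homography(H, point):
    """Map a single (x, y) point through H using homogeneous coordinates."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return x / w, y / w


# Hypothetical card corners found in the mask, in the order
# top-left, top-right, bottom-right, bottom-left.
corners = [(120, 80), (430, 110), (410, 300), (100, 260)]

# Assumed output size, roughly the 85.60 x 53.98 mm ID-1 card aspect ratio.
width, height = 428, 270
target = [(0, 0), (width, 0), (width, height), (0, height)]

H = homography_from_corners(corners, target)
mapped = [apply_homography(H, c) for c in corners]
```

With OpenCV available, `cv2.getPerspectiveTransform` computes the same matrix from four point pairs, and `cv2.warpPerspective` applies it to resample the whole image.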
