Introduction to Image Augmentation in Deep Learning

Image augmentation is an engineered solution to create a new set of images by applying standard image processing methods to existing images.

This technique is most useful for neural networks, CNNs in particular, when the training dataset is small. However, image augmentation is also used with large datasets as a regularization technique to build a more generalized, robust model.

Deep learning algorithms are powerful not only because they loosely mimic the human brain, but also because they keep improving as they are given more data. In fact, they require a significant amount of data to deliver good performance.

High performance with a small dataset is unlikely. This is where image data augmentation comes in handy: it helps when we have only a small set of images to train on.

Image augmentation techniques create different variations of the existing images using the methods below:

  • Image flipping
  • Rotation
  • Zoom
  • Height and width shifting
  • Perspective and tilt
  • Lighting transformations
  • Crop
  • Noise addition
  • Cutout or erasing
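To make a few of these concrete, here is a minimal NumPy sketch of flipping, shifting, cropping, lighting change, noise addition, and cutout. The function names and parameters are hypothetical, chosen only for illustration, and the image is assumed to be a uint8 array of shape (height, width, channels). Rotation, zoom, and perspective need interpolation and are left out of this sketch.

```python
import numpy as np

def flip_horizontal(img):
    """Mirror the image left to right."""
    return img[:, ::-1]

def shift(img, dy, dx, fill=0):
    """Shift by dy rows and dx columns; exposed pixels are set to `fill`."""
    h, w = img.shape[:2]
    if abs(dy) >= h or abs(dx) >= w:
        raise ValueError("shift must be smaller than the image size")
    out = np.full_like(img, fill)
    # Source and destination windows that stay inside the frame.
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out

def center_crop(img, crop_h, crop_w):
    """Keep the central crop_h x crop_w window."""
    h, w = img.shape[:2]
    top, left = (h - crop_h) // 2, (w - crop_w) // 2
    return img[top:top + crop_h, left:left + crop_w]

def adjust_brightness(img, factor):
    """Scale pixel intensities and clip back to the 0..255 range."""
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    noise = rng.normal(0.0, sigma, size=img.shape)
    return np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)

def cutout(img, top, left, size, fill=0):
    """Erase a size x size square; the input image is left untouched."""
    out = img.copy()
    out[top:top + size, left:left + size] = fill
    return out

# A random array stands in for a real photo here.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

flipped = flip_horizontal(image)
shifted = shift(image, dy=2, dx=-3)
cropped = center_crop(image, 24, 24)
brighter = adjust_brightness(image, 1.3)
noisy = add_gaussian_noise(image, sigma=10.0, rng=rng)
erased = cutout(image, top=8, left=8, size=10)
```

Each function returns a new view or array, so the original image stays available for further variations.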

Why does image augmentation work?

As discussed earlier, deep neural networks work best with a large amount of data. Performance generally improves as the amount of data grows, as shown in the image below.

Here, with image augmentation, we can create numerous image variations by combining several of the transformations mentioned above.

You may create 20, 25, 30, or more variations of each image if needed. This helps meet the need for a fair amount of data, since you can scale your dataset 30 or more times (in a sense) using augmentation techniques.
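As a rough illustration of that scaling, the sketch below builds 30 variations of each image by randomly combining a flip, a brightness change, and a little noise. The helper names (random_variant, expand_dataset) and the parameter ranges are assumptions made for this example, not values from any library.

```python
import numpy as np

def random_variant(img, rng):
    """Apply a random combination of simple transformations to one image."""
    out = img
    if rng.random() < 0.5:
        out = out[:, ::-1]                               # horizontal flip
    factor = rng.uniform(0.7, 1.3)                       # lighting change
    out = out.astype(np.float32) * factor
    out = out + rng.normal(0.0, 5.0, size=out.shape)     # mild Gaussian noise
    return np.clip(out, 0, 255).astype(np.uint8)

def expand_dataset(images, variants_per_image, seed=0):
    """Return variants_per_image augmented copies of every input image."""
    rng = np.random.default_rng(seed)
    return np.stack([
        random_variant(img, rng)
        for img in images
        for _ in range(variants_per_image)
    ])

# Ten random arrays stand in for ten real training images.
images = np.random.default_rng(1).integers(0, 256, size=(10, 32, 32, 3), dtype=np.uint8)
augmented = expand_dataset(images, variants_per_image=30)
print(augmented.shape)  # (300, 32, 32, 3)
```

The "in a sense" matters: the 300 images are variations of the same 10 originals, so they add diversity but not the same information as 300 independently collected images.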

Image data augmentation as a regularization technique

Apart from dataset scaling, image augmentation can be considered a regularization method.

A trained model is required to perform well on unseen real-world data. The images captured in a real-world environment can differ wildly from the images in the training set.

These real-world images vary in lighting conditions, frame position, size of the object of interest, perspective, and more.

To make the learning robust to such variation, we can apply these augmentations during training.

This way, the final model will generalize better. I prefer to use augmentations most of the time, even if the dataset is massive.

What Image Augmentation techniques are we talking about in this blog?

Here, when we talk about data augmentation, we won’t create new images and save them to storage, although that can also be done with a library such as OpenCV.

A library like OpenCV helps increase the number of images in storage, but similar model performance can be obtained with in-memory image augmentation.

No doubt, physical image augmentation (generating augmented images and storing them on disk) provides much more flexibility for image processing and manipulation, but virtual or in-memory image augmentation is enough for training a model.

In-memory image augmentation takes images from storage and modifies them in real time before feeding them to the neural network, while keeping the original images as they are.
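The idea can be sketched with a plain Python generator. This is a simplified NumPy stand-in, not how any particular library implements it; the function name augmented_batches and the flip and brightness settings are assumptions for illustration. Every batch is augmented on the fly, and the source array is never modified.

```python
import numpy as np

def augmented_batches(images, labels, batch_size, rng):
    """Yield freshly augmented (batch, labels) pairs forever.

    `images` is a uint8 array of shape (N, H, W, C) and is never modified.
    """
    n = len(images)
    while True:
        order = rng.permutation(n)                      # reshuffle every epoch
        for start in range(0, n, batch_size):
            pick = order[start:start + batch_size]
            batch = images[pick].astype(np.float32)     # indexing + astype copy the data
            # Flip a random half of the samples left to right.
            flip = rng.random(len(pick)) < 0.5
            batch[flip] = batch[flip][:, :, ::-1]
            # Apply a different brightness factor to each sample.
            factor = rng.uniform(0.8, 1.2, size=(len(pick), 1, 1, 1))
            batch = np.clip(batch * factor, 0, 255)
            yield batch / 255.0, labels[pick]           # scaled to 0..1 for training

# Random arrays stand in for a small labeled dataset.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(8, 16, 16, 3), dtype=np.uint8)
labels = np.arange(8)
original = images.copy()

batches = augmented_batches(images, labels, batch_size=4, rng=rng)
x, y = next(batches)
```

Because the augmentation is random, the model sees a slightly different version of each image in every epoch, while storage still holds only the originals.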

The code is based on the Keras library’s ImageDataGenerator class. After covering all these techniques, we will also cover the code for TensorFlow’s experimental preprocessing layers for image augmentation and fastai’s image data augmentation capability.

Read an in-depth blog about image augmentation code (it also includes custom augmentations).
