Feature transforms for image data augmentation

[abstract]

A problem with convolutional neural networks (CNNs) is that they require large datasets to obtain adequate robustness; on small datasets, they are prone to overfitting. Many methods have been proposed to overcome this shortcoming with CNNs. In cases where additional samples cannot easily be collected, a common approach is to generate more data points from existing data using an augmentation technique. In image classification, many augmentation approaches utilize simple image manipulation algorithms. In this work, we propose some new methods for data augmentation based on several image transformations: the Fourier transform (FT), the Radon transform (RT), and the discrete cosine transform (DCT). These and other data augmentation methods are considered in order to quantify their effectiveness in creating ensembles of neural networks. The novelty of this research is to consider different strategies for data augmentation to generate training sets from which to train several classifiers which are combined into an ensemble. Specifically, the idea is to create an ensemble based on a kind of bagging of the training set, where each model is trained on a different training set obtained by augmenting the original training set with different approaches. We build ensembles on the data level by adding images generated by combining fourteen augmentation approaches, with three based on FT, RT, and DCT, proposed here for the first time. Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method. These networks and several fusions are evaluated and compared across eleven benchmarks. Results show that building ensembles on the data level by combining different data augmentation methods produce classifiers that not only compete competitively against the state-of-the-art but often surpass the best approaches reported in the literature..

Keywords: data augmentation, deep learners, ensembles.

[Full Paper]