Historical document image binarization via style augmentation and atrous convolutions

Hanif Rasyidi, Data61
Salman Khan, Inception Institute of Artificial Intelligence


Historical documents suffer from a variety of degradations, which makes it challenging to recover their original textual content. Image binarization seeks to separate that content from the degradations. In this paper, we present a new binarization technique that accurately learns original text patterns from the limited amount of historical document data available. Our approach is a cascade of a style augmentation network and an image binarization network. The style augmentation network uses random style transfer to increase the variety of the training data by generating new style patterns for existing documents. The binarization network employs an encoder-decoder text segmentation approach with atrous convolutions to preserve spatial details. The resulting segmentations have low noise and smooth texture. Compared with other leading binarization methods from the DIBCO competitions, our proposed method achieves top performance across various evaluation measures.
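
To illustrate why atrous (dilated) convolutions help preserve spatial details, the following minimal NumPy sketch applies a single dilated filter with zero padding. It is not the network described in this paper: the function names `atrous_conv2d` and `receptive_field`, the 3x3 kernel, and the dilation rate of 2 are assumptions chosen only for illustration. The sketch shows that the output keeps the input resolution while the filter covers a wider neighbourhood.

```python
import numpy as np


def atrous_conv2d(image, kernel, rate=1):
    """Dilated (atrous) 2D cross-correlation with zero 'same' padding.

    Hypothetical helper for illustration; assumes an odd-sized kernel.
    """
    kh, kw = kernel.shape
    # Effective extent of the kernel once (rate - 1) gaps sit between taps.
    eff_h = (kh - 1) * rate + 1
    eff_w = (kw - 1) * rate + 1
    pad_h, pad_w = eff_h // 2, eff_w // 2
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)), mode="constant")

    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    # Accumulate one shifted copy of the input per kernel tap.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i * rate:i * rate + h,
                                         j * rate:j * rate + w]
    return out


def receptive_field(kernel_size, rate):
    """Width of the input window seen by one dilated filter."""
    return (kernel_size - 1) * rate + 1


# A single bright pixel in a 9x9 image, filtered with a 3x3 kernel of ones.
image = np.zeros((9, 9))
image[4, 4] = 1.0
kernel = np.ones((3, 3))

dense = atrous_conv2d(image, kernel, rate=1)
dilated = atrous_conv2d(image, kernel, rate=2)
```

With `rate=2`, the same nine weights span a 5x5 window instead of a 3x3 one, and no pooling or striding is needed to enlarge the context, so the output stays at the full 9x9 resolution.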