Efficient Unfolding of Documents Folded in Half Using Neural Networks

  • Document digitization has become crucial, and mobile devices are increasingly used to capture images of documents.
  • However, common OCR methods cannot be applied directly to camera-captured images due to distortions caused by folding or other factors.
  • To address this issue, researchers have proposed rectification techniques, which take an image of a document captured with a mobile device and restore it as if the document had been scanned with a flatbed scanner.
  • In this article, we propose a new deep learning-based algorithm called Unfolder to rectify distorted document images.

The Proposed Algorithm: Unfolder

  • Unfolder is designed to address the problem of rectifying documents folded in half.
  • The algorithm consists of two stages: content recognition and image rectification.
  • In the first stage, Unfolder detects the creases in the document using a convolutional neural network (CNN) and recognizes the contents of the document.
  • In the second stage, Unfolder uses a fully convolutional network (FCN) to rectify the distorted image of the document by unwrapping the texture around the contour of the creases.
  • The rectified image is then cropped to remove any irrelevant information and enhance the quality of the text.
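
The summary above does not give implementation details for the rectification stage, so the following is only a geometric sketch of why locating the crease matters, not the networks described above. It assumes that each half of a document folded in half is roughly planar, so each half can be mapped to a flat rectangle with its own homography. The corner and crease coordinates, the output size, and the function names are all hypothetical; in practice they would come from the detection stage.

```python
import numpy as np


def estimate_homography(src, dst):
    """Direct linear transform: 3x3 matrix H with dst ~ H @ src (4+ point pairs)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector for the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def apply_homography(h, points):
    """Map an (N, 2) list of points through H using homogeneous coordinates."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return homog[:, :2] / homog[:, 2:3]


# Hypothetical detections (pixels) in the photo: outer corners and crease endpoints.
top_left, top_right = (40.0, 60.0), (610.0, 80.0)
bottom_left, bottom_right = (30.0, 820.0), (630.0, 790.0)
crease_top, crease_bottom = (320.0, 20.0), (325.0, 860.0)

# Assumed size of the flat output page; the crease lands on the line x = W / 2.
W, H = 600.0, 800.0

left_src = [top_left, crease_top, crease_bottom, bottom_left]
left_dst = [(0.0, 0.0), (W / 2, 0.0), (W / 2, H), (0.0, H)]
right_src = [crease_top, top_right, bottom_right, crease_bottom]
right_dst = [(W / 2, 0.0), (W, 0.0), (W, H), (W / 2, H)]

# One homography per half, because the two halves lie in different planes.
h_left = estimate_homography(left_src, left_dst)
h_right = estimate_homography(right_src, right_dst)

# A point halfway along the crease in the photo, mapped by both halves.
crease_mid = [((crease_top[0] + crease_bottom[0]) / 2,
               (crease_top[1] + crease_bottom[1]) / 2)]
print("left half maps crease midpoint to:", apply_homography(h_left, crease_mid))
print("right half maps crease midpoint to:", apply_homography(h_right, crease_mid))
```

Both homographies send the crease to the same vertical line in the output, but they are only guaranteed to agree exactly at the two crease endpoints. Interior crease points can land at slightly different heights on the two sides, which is one reason a simple per-half warp may leave a visible seam. Resampling the pixels themselves would need an image-warping routine, which is outside this sketch.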

Advantages of Unfolder

  • Unfolder is fast and efficient, running in less than a second on a smartphone processor.
  • Unfolder can handle documents folded in different ways, including those with creases, tears, or other types of distortions.
  • Unfolder provides accurate results by using deep learning techniques to recognize the contents of the document and rectify the image.
  • Unfolder is open-source, allowing researchers to explore and modify its architecture for further improvement.

Conclusion

  • In this article, we proposed a new deep learning-based algorithm called Unfolder to rectify distorted document images captured with mobile devices.
  • Unfolder demonstrates speed, accuracy, and flexibility in handling different types of distortions, making it well suited to document digitization applications.
  • By leveraging deep learning, Unfolder can help democratize document digitization and make it more accessible to people around the world.