Perspec

App and workflow to perspectively correct images, for example whiteboards, document scans, or facades.

App Workflow

Step | Description | Result
1 | Take photos | Original image
2 | Open Perspec app | Opened Perspec App
3 | Drop the images onto the window | Dropped image
4 | Mark the corners by clicking on them | Marked corners
5 | Click one of the save buttons (or [Enter]) | Corrected image

Installation

WARNING: Perspec currently only works on macOS and Linux. Any help to make it work on Microsoft Windows (Ticket) would be greatly appreciated!

Prebuilt

You can get the current (and previous) versions from the releases page.

The current nightly version can be downloaded from https://github.com/feramhq/Perspec/actions. However, it's necessary to fix the file permissions after download:

chmod +x \
  ./Perspec.app/Contents/MacOS/Perspec \
  ./Perspec.app/Contents/Resources/{perspec,script,imagemagick/bin/magick}

On macOS you can also install it via this Homebrew tap:

brew install --cask ad-si/tap/perspec

From Source

Build it from source with Haskell's stack.

Platypus, with command line tools enabled, is required to build from source.

git clone https://github.com/feramhq/Perspec
cd Perspec
make install

This copies Perspec.app to your /Applications directory and makes the perspec command available on your path. You can then either drop images onto the app window or use it via the CLI, e.g. perspec fix image.jpeg.

Usage via CLI

It's also possible to directly invoke Perspec via the CLI like so:

/Applications/Perspec.app/Contents/Resources/perspec fix path/to/image.jpeg

You can also pass several images and they will all be opened one after another. This is very useful for batch correcting a large set of images.

Photo Digitization Workflow

  1. Take photos
    1. Use a camera app that lets you lock the rotation (e.g. OpenCamera). Otherwise, check out the guide below to fix the rotation.
    2. Use a sound-activated camera to take photos simply by clicking your tongue or snapping your fingers. E.g. with:
  2. Use the perspec rename sub-command to fix the order and names of the scanned files.
  3. Verify that
    • All pages were captured and have the correct filename
    • Images are sharp enough
    • Images have a high contrast
    • Images have correct orientation
  4. Optionally, for best image quality, convert the images to a lossless format (e.g. png), apply rotations, and convert them to grayscale. Attention: Exclude the covers!
    mogrify -verbose -format png \
      -auto-orient -colorspace gray photos/*.jpeg
  5. Use Perspec to crop images
    perspec fix photos/*.png

Additional Steps

Improve colors with one of the following steps:

  1. Normalize dynamic range:
    mogrify -verbose -normalize photos/*.png
  2. Convert to black and white:
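    #! /usr/bin/env bash

    # Binarize every PNG below the current directory.
    # Results are written to the current directory as <name>-fixed.png,
    # so files that already end in -fixed.png are skipped.
    find . -iname "*.png" ! -iname "*-fixed.png" | \
    while IFS= read -r file
    do
      # Subtract a heavily blurred copy (an estimate of the uneven
      # background lighting), invert the result, and binarize it
      # with Otsu's threshold.
      magick \
        -verbose \
        "$file" \
        \( +clone -blur 0x60 -brightness-contrast 40 \) \
        -compose minus \
        -composite \
        -negate \
        -auto-threshold otsu \
        "$(basename "$file" ".png")"-fixed.png
    done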

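To rotate all photos to portrait mode, use one of the two commands below. The > suffix makes ImageMagick rotate only images whose width exceeds their height, and the two commands differ only in the direction of rotation. Use either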

mogrify -verbose -auto-orient -rotate "90>" photos/*.jpeg

or

mogrify -verbose -auto-orient -rotate "-90>" photos/*.jpeg

Features

  • Rescale image on viewport change
  • Handle JPEG rotation
  • Draw lines between corners to simplify guessing of clipped corners
  • Bundle ImageMagick
  • Better error if wrong file format is dropped (images/error-message.jpg)
  • Center Perspec window on screen
  • Drag'n'Drop for corner markers
  • "Submit" button
  • "Convert to Grayscale" button
  • Add support for custom output size (e.g. A4)
  • Manual rotation buttons
  • Zoom view for corners
  • Label corner markers

Algorithms

Perspective Transformation

Once the corners are marked, the correction is equivalent to:

magick \
  images/example.jpg \
  -distort Perspective \
    '8,35 0,0 27,73 0,66 90,72 63,66 67,10 63,0' \
  -crop 63x66+0+0 \
  images/example-fixed.jpg
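
The control-point string in the command above consists of "source destination" pairs: each marked corner is followed by the corner of the output rectangle it should land on. As a minimal sketch of how such a string can be assembled, the helper below takes the four corners (top-left, bottom-left, bottom-right, top-right) plus the target width and height. The function name perspective_points and the argument order are assumptions for illustration and not part of Perspec.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of Perspec): build the control-point
# string for ImageMagick's "-distort Perspective".
# Arguments: the four marked corners in the order top-left, bottom-left,
# bottom-right, top-right (each as "x,y"), then target width and height.
perspective_points() {
  local tl=$1 bl=$2 br=$3 tr=$4 width=$5 height=$6
  # Each pair is "source destination": a marked corner followed by the
  # corner of the output rectangle it is mapped to.
  printf '%s 0,0 %s 0,%s %s %s,%s %s %s,0\n' \
    "$tl" "$bl" "$height" "$br" "$width" "$height" "$tr" "$width"
}

# Corner values taken from the example above
perspective_points 8,35 27,73 90,72 67,10 63 66
# prints: 8,35 0,0 27,73 0,66 90,72 63,66 67,10 63,0
```

The printed string can be passed directly as the argument of -distort Perspective, and the same width and height go into the -crop geometry (63x66+0+0 in the example).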

Grayscale Conversion

Converts the image to grayscale and then normalizes the range of values. (Uses ImageMagick's -colorspace gray -normalize)

BW Conversion

Converts the image to a binary (black and white) image with Otsu's method. (Uses ImageMagick's -auto-threshold OTSU -monochrome)
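
Otsu's method picks the gray level that maximizes the variance between the dark and the bright class of pixels. The following is a minimal sketch of that idea, not ImageMagick's implementation. It assumes a histogram on stdin with one "gray_level pixel_count" pair per line and levels from 0 to 255. The function name otsu_threshold and the input format are assumptions for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: compute Otsu's threshold from a histogram read on
# stdin, one "gray_level pixel_count" pair per line (levels 0..255).
# Pixels with level <= threshold form the dark class.
otsu_threshold() {
  awk '
    { hist[$1] += $2; total += $2; sum += $1 * $2 }
    END {
      best = 0; best_var = -1
      w0 = 0; sum0 = 0
      for (t = 0; t < 256; t++) {
        w0 += hist[t]            # number of pixels in the dark class
        sum0 += t * hist[t]      # sum of their gray levels
        w1 = total - w0          # number of pixels in the bright class
        if (w0 == 0) continue    # no dark pixels yet
        if (w1 == 0) break       # no bright pixels left
        m0 = sum0 / w0           # mean of the dark class
        m1 = (sum - sum0) / w1   # mean of the bright class
        # Between-class variance (without the constant 1/total^2 factor)
        var = w0 * w1 * (m0 - m1) ^ 2
        # On ties the lowest level wins
        if (var > best_var) { best_var = var; best = t }
      }
      print best
    }'
}

# Two equally large clusters at levels 10 and 200
printf '10 50\n200 50\n' | otsu_threshold
# prints: 10
```

Every level from 10 to 199 separates the two clusters equally well in this example. The sketch reports the lowest one, while other implementations may resolve such ties differently.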

Interpolation of Missing Parts

Perspec automatically fills in parts that lie outside the original image by using the closest pixel. (https://www.imagemagick.org/Usage/misc/#edge)
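
Using the closest pixel amounts to clamping a lookup coordinate to the image bounds, which is how the edge setting described on the linked page behaves. The sketch below illustrates this for integer coordinates only. The helper names clamp and closest_pixel are assumptions for illustration and do not appear in Perspec.

```shell
#!/usr/bin/env bash

# Hypothetical helper: clamp an integer coordinate into the range 0..size-1.
clamp() {
  local value=$1 size=$2
  if (( value < 0 )); then
    echo 0
  elif (( value >= size )); then
    echo $(( size - 1 ))
  else
    echo "$value"
  fi
}

# Hypothetical helper: the pixel that stands in for (x, y) in a
# width x height image when the closest edge pixel is used.
closest_pixel() {
  local x=$1 y=$2 width=$3 height=$4
  echo "$(clamp "$x" "$width"),$(clamp "$y" "$height")"
}

closest_pixel -5 20 63 66   # prints: 0,20
closest_pixel 70 80 63 66   # prints: 62,65
```

Coordinates inside the image are returned unchanged, so only the clipped regions are affected.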

Technologies

Related

  • Hasscan - OpenCV document scanner in Haskell.

Check out ad-si/awesome-scanning for an extensive list of related projects.