How ImageMagick Helps Me Organize My Invoices and Receipts

It has become a habit of mine to scan important documents to preserve them better. One type of document that I regularly upload to my file storage is our residence's monthly fees. The setup is pretty simple: we receive an invoice in our mailboxes every month, then we pay it and receive a printed receipt. When I get home, I scan the invoice and receipt into a 2-page PDF file: the first page is the invoice and the second page is the receipt. This file is then stored together with my other important files.

Then the pandemic came, and the process changed: the invoice is still delivered physically to our mailboxes, but payments are now made online. I scan the invoice with my own scanner, while the receipt arrives by email as an already-scanned image. Because of this, the scan quality of the invoice and the receipt is different.

Now that the invoice and receipt come from 2 different scanners or sources, I need to use ImageMagick to combine the files into a single PDF that matches the previous PDFs in my file store.

convert invoice.jpg receipt.jpg result.pdf
Simple ImageMagick convert command to turn multiple JPGs into a PDF.

However, this produces a badly disproportionate PDF in which the first page is really small and the second page is really big. For record-keeping purposes, this does not really matter, since I can just zoom in or out when reading the PDF. Still, this does not sit well with me: I want the new PDFs to look as proper as the previous ones.

How the resulting PDF looks. I need to zoom in to read the invoice and zoom out to read the receipt.

The first thing I noticed is that the dimensions of the images are different. The invoice image has a width of around 2500px and the receipt has a width of around 3100px. With this information, I tried to scale down the receipt image using ImageMagick.

convert receipt.jpg -resize 80% receipt_scaled.jpg
ImageMagick convert command to scale an image to 80% of its size.
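
The 80% figure comes from the ratio of the two widths. As a rough sketch, the percentage can be worked out with plain shell arithmetic. The name scale_percent is a hypothetical helper for illustration, not an ImageMagick command, and the widths are the approximate values mentioned above.

```shell
#!/bin/sh
# scale_percent TARGET_WIDTH SOURCE_WIDTH
# Hypothetical helper: prints the whole-number percentage (rounded down)
# that shrinks SOURCE_WIDTH to roughly TARGET_WIDTH.
scale_percent() {
    echo $(( $1 * 100 / $2 ))
}

# Approximate widths from this post: invoice ~2500px, receipt ~3100px.
scale_percent 2500 3100   # prints 80
```

Rounding down keeps the scaled receipt slightly narrower than the invoice rather than slightly wider.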

Even with this, the PDF still looks weird. For some reason, the created PDF still has disproportionate pages, as if no scaling had been done! I even tried tinkering with different page sizes and scaling options, and nothing seemed to work.

The resulting PDF using the scaled-down receipt. See that the invoice is still really tiny.

Then I wondered: maybe the image densities are different? I tried ImageMagick's identify command to see the image density.

identify invoice.jpg
identify receipt.jpg
Running the identify command to determine additional image information.

Unfortunately, the default output of the identify command does not show the image density.

$ identify invoice.jpg
invoice.jpg JPEG 2550x3200 2550x3200+0+0 8-bit sRGB 1.53711MiB 0.000u 0:00.000

$ identify receipt.jpg
receipt.jpg JPEG 3150x1325 3150x1325+0+0 8-bit sRGB 842776B 0.000u 0:00.000
The result of running the identify command without other flags.

Because of this, we need to format the output to show the image density:

identify -format "%x x %y" receipt.jpg
The -format flag allows you to control which information is shown by the identify command.

Using the identify -format flag, I can now see information that is not shown by default. We can show the image density using the %x and %y specifiers in the format string: %x shows the horizontal density, and %y shows the vertical density.

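Using this command, I saw that the invoice has a density of 300×300 and the receipt has 72×72. Density matters here because it determines how large each pixel is drawn on the page, so two images with similar pixel widths but different densities end up as very differently sized pages. With this information, I know that I need to match the image densities as well as scale the images to similar widths. To do this, I use the -density flag in the convert command: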

convert receipt.jpg -resize 80% -density 300 receipt_scaled.jpg
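
To see why the density matters, here is a back-of-the-envelope calculation. It assumes ImageMagick sizes each PDF page as pixels divided by density (in inches) and that the densities are in pixels per inch; a PDF point is 1/72 of an inch. The name page_points is a hypothetical helper, and the pixel widths are taken from the identify output above.

```shell
#!/bin/sh
# page_points PIXELS DENSITY
# Hypothetical helper: prints the page size in PDF points (1/72 inch),
# assuming page size = pixels / density inches.
page_points() {
    awk -v px="$1" -v dpi="$2" 'BEGIN { printf "%.1f\n", px / dpi * 72 }'
}

page_points 2550 300   # invoice width: prints 612.0 (8.5 inches)
page_points 3150 72    # receipt as received: prints 3150.0
page_points 2520 72    # receipt after -resize 80% only: prints 2520.0
page_points 2520 300   # receipt after -resize 80% -density 300: prints 604.8
```

Under these assumptions, resizing alone still leaves the receipt page about four times wider than the invoice page, which would explain why the earlier attempt looked unchanged.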

Now that the invoice and receipt scans have matching densities and almost matching widths, I can convert them to a PDF file:

convert invoice.jpg receipt_scaled.jpg result.pdf

The resulting PDF now looks the way it did when I was scanning both the invoice and the receipt myself, and it does not need to be zoomed in or out to read!

How the final PDF looks.

In summary...

I used ImageMagick's convert and identify commands to get the job done.

Initially, I simply combined the 2 JPG files into a single PDF using the basic form of the convert command:

convert invoice.jpg receipt.jpg result.pdf

But the resulting PDF had a really small first page and a large second page.

I tried scaling the second image to match the width of the first page using:

convert receipt.jpg -resize 80% receipt_scaled.jpg

Making a PDF with it again still created a disproportionate PDF.

I then looked at the image density, since it might affect the resulting PDF:

identify -format "%x x %y" receipt.jpg

With this, I saw that the images have different densities. That's why I changed the density of the receipt image to match the invoice:

convert receipt.jpg -resize 80% -density 300 receipt_scaled.jpg

Now that the images have the same density and almost the same width, I can make a PDF out of them and it should look nice:

convert invoice.jpg receipt_scaled.jpg result.pdf

Indeed, the resulting PDF looks nice!
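
Since this is a monthly chore, the two final commands could be wrapped in a small shell function. This is only a sketch: run and combine_scans are hypothetical helper names, the DRY_RUN variable is my own convention for previewing the commands, and the percentage and density still have to be supplied by hand. On ImageMagick 7 installations the command may be called magick instead of convert.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of ImageMagick.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# combine_scans INVOICE RECEIPT OUTPUT PERCENT DENSITY
combine_scans() {
    invoice=$1; receipt=$2; output=$3; percent=$4; density=$5
    # receipt.jpg becomes receipt_scaled.jpg
    scaled="${receipt%.*}_scaled.jpg"
    # Shrink the receipt and tag it with the invoice's density.
    run convert "$receipt" -resize "${percent}%" -density "$density" "$scaled" || return 1
    # Combine both images into one PDF, invoice first.
    run convert "$invoice" "$scaled" "$output"
}
```

Calling combine_scans invoice.jpg receipt.jpg result.pdf 80 300 with DRY_RUN=1 set only prints the two convert commands from this post, which makes it easy to check them before running them for real.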
