Adjusting cell phone photo orientation with Go
Programming Snapshot – Rotating Photos with Go
Cell phones often store photos upside down or sideways for efficiency reasons and record the fact in the Exif metadata. However, not all apps can handle this. Mike Schilli turns to Go to make the process foolproof.
Somehow, I seem to hold my cell phone wrong when taking pictures. It often happens that a picture that looks right in the phone's photo gallery is suddenly upside down or on its side when I want to send it to friends on WhatsApp. A quick look at the metadata of the photo in question reveals what's going wrong (Listing 1) [1].
Listing 1
Photo Metadata
$ exiftool pic.jpg | grep Rotate
Orientation: Rotate 180
It appears that my phone is saving images with the wrong orientation, but it is also storing the correction value, which shows up in the JPEG format's Exif header to tell you how the photo actually should be rotated. I don't know who came up with this idea, but assuming that any app looks in the Exif header first and then aligns the image correctly seems like a fundamental flaw to me. There's no doubt in my mind that the original camera app should perform the rotation when the picture is taken, rather than having the work done downstream time and time again by an unmanageable number of photo apps.
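For reference, the "Rotate 180" that exiftool reports corresponds to the numeric value 3 of the Exif Orientation tag. As a rough sketch, the four pure rotations could be mapped to clockwise angles as follows. The function name clockwiseDegrees is made up for illustration, and the mirrored variants (values 2, 4, 5, and 7) are deliberately left out.

```go
package main

import "fmt"

// clockwiseDegrees (hypothetical helper) maps an Exif Orientation value
// to the clockwise rotation a viewer has to apply to show the photo upright.
// Only the pure rotations are covered; mirrored variants return false.
func clockwiseDegrees(orientation int) (int, bool) {
	switch orientation {
	case 1: // normal, no rotation needed
		return 0, true
	case 3: // exiftool: "Rotate 180"
		return 180, true
	case 6: // exiftool: "Rotate 90 CW"
		return 90, true
	case 8: // exiftool: "Rotate 270 CW"
		return 270, true
	}
	return 0, false
}

func main() {
	deg, ok := clockwiseDegrees(3)
	fmt.Println(deg, ok)
}
```

Returning a second boolean value instead of an error keeps the sketch short; a real program would probably want to handle the mirrored cases as well.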
Upside Down World
Figure 1 illustrates that the cell phone still displays the image the right way round in its photo app, even though it was stored incorrectly. The desktop version of the WhatsApp client in the browser, however, obviously ignores the associated Exif header and sends the picture upside down (Figure 2). Imagine the recipient's reaction to receiving a message about my surf trip to Ocean Beach in San Francisco with a photo standing on its head!
High time to write a Go program that correctly orients a photo retrieved from a phone and then deletes the Exif header entirely, so that any downstream application will process the image as-is, very much like GIMP (Figure 3) does when it sees a rotated photo. While I'm at it, I'll throw in some algorithms for rotating images 90 or 180 degrees. After all, someone might ask about them in your next job interview.
A digital photograph is ultimately an m by n matrix of pixel values. To turn it upside down (i.e., to perform a 180-degree rotation), an algorithm simply swaps pixel values above and below the center line. The X values of the pixels in an image traditionally run from left to right, while the Y values increase from top to bottom. This means that the origin of the photo can be addressed as (0,0) at the top left, while the bottom right corner is at the coordinates (w-1,h-1), where w specifies the image width and h the image height, each in pixels.
Double Flip
But be careful: If you simply swap the Y values for the rotation, you will end up with a mirror image of the original photo. Instead, small X values (pixels in the upper left corner) need to be put in the lower right corner for a 180-degree rotation. This means that the rotation algorithm needs to mirror both the Y values and the X values. A pixel located in the upper half of the image with the coordinates (x0,y0) thus ends up in the lower half of the image at position (x1,y1). The distance from y0 to the center line is equal to the distance from y1 to the center line. At the same time, x0 is as far from the left edge of the image as x1 is from the right.
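Expressed as a formula, the double flip boils down to x1 = w-1-x0 and y1 = h-1-y0. The following minimal sketch only computes the target coordinates; the helper name mirror180 is an assumption made for illustration.

```go
package main

import "fmt"

// mirror180 (hypothetical helper) returns the position that the pixel at
// (x0, y0) occupies after a 180-degree rotation of an image that is
// w pixels wide and h pixels high.
func mirror180(x0, y0, w, h int) (x1, y1 int) {
	return w - 1 - x0, h - 1 - y0
}

func main() {
	// In a 4x3 image, the top-left pixel lands in the bottom-right corner.
	fmt.Println(mirror180(0, 0, 4, 3)) // prints "3 2"
}
```

Applying the mapping twice returns every pixel to where it started, which is a handy sanity check for a 180-degree rotation.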
Listing 2 shows the algorithm that turns a JPEG photo upside down. The two nested for loops starting in line 13 work their way through the Y values from 0 to the bottom of the image and through the X values from 0 to the right edge. These coordinates correspond to the x0 and y0 positions of the original pixels. Inside the double loop, line 15 retrieves the original pixel value with jimg.At(x,y). To rotate the image, it stores the value at the mirrored position (x1,y1) in the newly created, modifiable photo dimg. The position is calculated as the distance of x0 from the right edge and of y0 from the bottom edge of the image. In this way, the algorithm mirrors the image at both the horizontal and vertical center lines, turning it upside down as desired.
Listing 2
rotate-180.go
01 package main
02
03 import (
04   "image"
05 )
06
07 func rot180(jimg image.Image) *image.RGBA {
08   bounds := jimg.Bounds()
09   width, height := bounds.Max.X, bounds.Max.Y
10
11   dimg := image.NewRGBA(bounds)
12
13   for y := 0; y < height; y++ {
14     for x := 0; x < width; x++ {
15       dimg.Set(width-1-x, height-1-y, jimg.At(x, y))
16     }
17   }
18
19   return dimg
20 }
Writeable Copy
When Go reads a JPEG photo from disk, the decoded result is an image.Image, an interface that offers no way to modify pixels. But since the algorithm wants to manipulate the photo, line 11 in Listing 2 first uses NewRGBA() to create a writable, initially empty photo with the same dimensions as the original. Then the rot180 function calls jimg.At() for every pixel of the original and transfers the pixels one by one to their mirrored positions in the target photo with Set(x,y,Value). Line 19 returns the finished rotated image as a pointer to the calling main program.
This covers the algorithm for rotating a photo in memory by 180 degrees. But how does the image, which is compressed in JPEG format on disk, find its way into RAM as a pixel matrix in the first place? The imgMod() function in Listing 3 takes the name of the photo file in srcFile, the name of the target file as the second parameter, and, as the third parameter, a callback function that performs the desired rotation of the image in RAM.
Listing 3
imgmod.go
01 package main
02
03 import (
04   "image"
05   "image/jpeg"
06   "log"
07   "os"
08 )
09
10 func imgMod(srcFile string, dstFile string, cb func(image.Image) *image.RGBA) {
11   f, err := os.Open(srcFile)
12   if err != nil {
13     log.Fatalf("os.Open failed: %v", err)
14   }
15
16   jimg, _, err := image.Decode(f)
17   if err != nil {
18     log.Fatalf("image.Decode failed: %v", err)
19   }
20
21   dimg := cb(jimg)
22   if dimg == nil { // the callback returns no error, so check its result
23     log.Fatalf("Modifier failed")
24   }
25
26   f, err = os.Create(dstFile)
27   if err != nil {
28     log.Fatalf("os.Create failed: %v", err)
29   }
30   err = jpeg.Encode(f, dimg, nil)
31   if err != nil {
32     log.Fatalf("jpeg.Encode failed: %v", err)
33   }
34 }
Functions are full-featured data types in Go and can easily be passed to other functions, which basically tells Go: "Here's the algorithm to apply to the original image data." Listing 3 opens the original file for reading in line 11 and decodes the JPEG data with image.Decode() in line 16; importing the standard image/jpeg package is what registers the JPEG decoder. If there are no errors, the result is a pixel matrix referenced by the jimg variable. Line 21 then calls the function previously passed in as the cb parameter, hands it the image data in jimg, and receives the modified data of the rotated image in dimg. All that remains is to create a new destination file, dstFile, in line 26 and write the JPEG-encoded data for the modified photo to it in line 30.