Extracting the Thumbnail from the Photoshop private TIFF Tag

This document explains how to extract the thumbnail included in the Photoshop tag data. Background information and pseudo-code are included. C and Delphi ObjectPascal code interfacing with LibTiff and LibJpeg/VCL will also be made available later.

Photoshop Tag highest-level structure

The Photoshop tag data is a series of so-called 'Image Resource Blocks'. Here is a detailed view of the 'Image Resource Block' structure.

Offset  Datatype                      Value
0       4 Bytes                       Image Resource Block Signature: ASCII '8BIM'
4       Motorola-order Word           Image Resource ID. This identifies the resource type.
6       n Bytes; so-called Pascal     Officially, this is a 'name', but it turns out to be
        string, padded to even size   the null string most of the time. Expect two bytes of 0.
6+n     Motorola-order Long           Size of resource data
10+n    Variable number of Bytes, as  Resource data
        indicated by previous field,
        padded to even size

The first byte of the 'Pascal string' is a length indication; next comes an ASCII sequence of the indicated length. If the length is even, and thus the total of length byte and ASCII sequence isn't, a padding byte is appended.
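
To make the layout concrete, the following Python sketch serializes a single Image Resource Block. It is only an illustration of the structure described above, not code from Photoshop or LibTiff; the function name build_resource_block is hypothetical, and it assumes the size field stores the unpadded length of the resource data, as the structure above indicates.

```python
import struct

def build_resource_block(resource_id: int, name: bytes, data: bytes) -> bytes:
    """Serialize one Image Resource Block (hypothetical helper, for illustration)."""
    if len(name) > 255:
        raise ValueError("a Pascal string holds at most 255 bytes")
    # Pascal string: one length byte followed by the name itself.
    pascal = bytes([len(name)]) + name
    if len(pascal) % 2:
        pascal += b"\x00"                        # pad length byte + name to an even size
    block = b"8BIM"                              # signature
    block += struct.pack(">H", resource_id)      # Motorola-order (big-endian) Word
    block += pascal
    block += struct.pack(">I", len(data))        # Motorola-order Long, unpadded size
    block += data
    if len(data) % 2:
        block += b"\x00"                         # resource data is padded to even size too
    return block

# A block with the usual empty name: the name field is exactly two zero bytes.
print(build_resource_block(1036, b"", b"abc").hex(" "))
# prints: 38 42 49 4d 04 0c 00 00 00 00 00 03 61 62 63 00
```

Note how the empty name still occupies two bytes (a zero length byte plus one padding byte), which is why two bytes of 0 are to be expected at offset 6.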

This means the following pseudo-code will walk through the list of Image Resource Blocks.

let m point to the start of the tag data
while m is not beyond the end of the tag data
    sanity check: there should be at least 7 bytes of tag data available at m (signature, ID, and the length byte of the name)
    sanity check: the first four bytes pointed to by m should read ASCII '8BIM'
    increment m by 4
    let n be the Motorola-order Word value pointed to by m
    increment m by 2
    let o be the Byte value pointed to by m
    increment o
    if o is not even, increment o
    increment m by o
    sanity check: there should be at least 4 bytes of tag data available at m
    let p be the Motorola-order Long value pointed to by m
    increment m by 4
    sanity check: there should be at least p bytes of tag data available at m
    found resource data
        id: n
        location: m
        length: p
    if p is not even, increment p
    increment m by p
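
The pseudo-code above translates fairly directly into Python. The following is a minimal sketch, assuming the complete tag data is already available as a bytes object; the name iter_resource_blocks is hypothetical, and failed sanity checks are reported as ValueError purely as a design choice for this example.

```python
import struct
from typing import Iterator, Tuple

def iter_resource_blocks(tag: bytes) -> Iterator[Tuple[int, int, int]]:
    """Yield (resource id, data offset, data length) for every block in tag."""
    m = 0
    while m < len(tag):
        if len(tag) - m < 7:
            raise ValueError("truncated block header at offset %d" % m)
        if tag[m:m + 4] != b"8BIM":
            raise ValueError("missing '8BIM' signature at offset %d" % m)
        m += 4
        (resource_id,) = struct.unpack_from(">H", tag, m)   # Motorola-order Word
        m += 2
        name_size = tag[m] + 1            # length byte plus the name itself
        name_size += name_size % 2        # padded to even size
        m += name_size
        if len(tag) - m < 4:
            raise ValueError("truncated size field")
        (size,) = struct.unpack_from(">I", tag, m)          # Motorola-order Long
        m += 4
        if len(tag) - m < size:
            raise ValueError("resource data runs past the end of the tag data")
        yield resource_id, m, size
        m += size + size % 2              # skip the data and its padding byte

# Two hand-built blocks: ID 1036 with 3 data bytes, ID 1005 with 2 data bytes.
sample = (b"8BIM\x04\x0c\x00\x00\x00\x00\x00\x03abc\x00"
          b"8BIM\x03\xed\x00\x00\x00\x00\x00\x02hi")
print(list(iter_resource_blocks(sample)))   # prints [(1036, 12, 3), (1005, 28, 2)]
```

A caller interested only in the thumbnail would pick out the entries with ID 1033 or 1036 and slice the tag data at the reported offset and length.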

Thumbnail resources

There are two versions of the thumbnail resource, each with its own ID. Photoshop 4.0 wrote the thumbnail in a resource with ID 1033; Photoshop 5.0 and upward write the thumbnail in a resource with ID 1036. There is only one subtle difference between these two resources, which is explained at the end of this document. The following detailed structure applies to both.

Offset  Datatype                 Value
0       Motorola-order Long      Appears to always equal 1. Officially, this denotes the
                                 'format', and this value could also equal 0, but it is not
                                 clear what that should mean, nor have we ever seen this.
4       Motorola-order Long      Thumbnail width
8       Motorola-order Long      Thumbnail height
12      Motorola-order Long      Scanline size ( = thumbnail width * 3, padded to nearest
                                 multiple of 4)
16      Motorola-order Long      Total decompressed thumbnail memory size ( = scanline size *
                                 thumbnail height)
20      Motorola-order Long      Size of the JFIF data ( = size of resource data - 28)
24      Motorola-order Word      Appears to always equal 24. Officially, this denotes the
                                 number of bits per pixel.
26      Motorola-order Word      Appears to always equal 1. Officially, this denotes the
                                 number of planes.
28      Variable number of       JFIF data
        bytes, as indicated by
        'Size of JFIF data'
        field

Thus, the following pseudo-code performs the necessary sanity checks and locates the JFIF data block:

let m be the location of the thumbnail resource
sanity check: there should be at least 28 bytes of resource data available at m
sanity check: the Motorola-order Long value pointed to by m should equal 1
increment m by 4
let nx be the Motorola-order Long value pointed to by m
increment m by 4
let ny be the Motorola-order Long value pointed to by m
increment m by 4
let o be nx*3, rounded up to the nearest multiple of 4
sanity check: the Motorola-order Long value pointed to by m should equal o
increment m by 4
sanity check: the Motorola-order Long value pointed to by m should equal o*ny
increment m by 4
sanity check: the Motorola-order Long value pointed to by m should equal (resource size - 28)
increment m by 4
sanity check: the Motorola-order Word value pointed to by m should equal 24
increment m by 2
sanity check: the Motorola-order Word value pointed to by m should equal 1
increment m by 2
increment m by 2
found JFIF thumbnail data
    location: m
    length: resource size - 28
    thumbnail width: nx
    thumbnail height: ny
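
As an illustration, the checks above can be sketched in Python as follows. This assumes the resource data of a 1033 or 1036 resource has already been isolated as a bytes object; the names Thumbnail and parse_thumbnail_resource are hypothetical, and raising ValueError on a failed sanity check is just one possible policy.

```python
import struct
from typing import NamedTuple

class Thumbnail(NamedTuple):
    width: int
    height: int
    jfif: bytes

def parse_thumbnail_resource(resource: bytes) -> Thumbnail:
    """Validate the 28-byte header and return the size and JFIF data."""
    if len(resource) < 28:
        raise ValueError("thumbnail resource shorter than its 28-byte header")
    # Six Motorola-order Longs followed by two Motorola-order Words.
    (fmt, nx, ny, scanline, total,
     jfif_size, bits, planes) = struct.unpack_from(">6I2H", resource, 0)
    o = (nx * 3 + 3) // 4 * 4             # nx*3 rounded up to a multiple of 4
    checks = [
        (fmt == 1, "format should equal 1"),
        (scanline == o, "unexpected scanline size"),
        (total == o * ny, "unexpected decompressed size"),
        (jfif_size == len(resource) - 28, "unexpected JFIF size"),
        (bits == 24, "bits per pixel should equal 24"),
        (planes == 1, "number of planes should equal 1"),
    ]
    for ok, message in checks:
        if not ok:
            raise ValueError(message)
    return Thumbnail(nx, ny, resource[28:])

# A synthetic 5x2 header followed by 4 placeholder bytes standing in for JFIF data.
header = struct.pack(">6I2H", 1, 5, 2, 16, 32, 4, 24, 1)
thumb = parse_thumbnail_resource(header + b"\xff\xd8\xff\xd9")
print(thumb.width, thumb.height, len(thumb.jfif))   # prints 5 2 4
```

Real files may deserve a more lenient policy, for instance tolerating a mismatch in the redundant size fields; the strict version here simply mirrors the pseudo-code.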

These nx and ny values, the thumbnail width and height, may seem redundant, since they will also surface when decoding the JFIF. However, some applications may choose to reject thumbnails that are, for instance, too small for their purposes, and the early indication of thumbnail width and height may allow them to do so without setting up JPEG decoding structures first.

Next, the JFIF data can be put through a JPEG decompression cycle. This yields the thumbnail, ready for previewing. At least, this is true for thumbnail resource 1036, written by Photoshop 5.0 and upward.

Photoshop 4.0 seems to swap the R and B channels when feeding the JPEG compressor that builds the JFIF block contained in the 1033 thumbnail resource, and to swap them back after decompression, before display. The regular JFIF structures are present and the regular conversion between RGB and YCbCr is applied; only the channels are swapped. This means that, after decoding thumbnail resource 1033 to an RGB raster with your familiar JPEG library, you still need to swap the R and B channels.
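
The channel swap itself is simple. The following Python sketch assumes the decoder has produced a tightly packed raster of 8-bit samples, three bytes per pixel in R, G, B order and without scanline padding; the function name swap_red_blue is hypothetical. If your JPEG library pads scanlines, the swap would have to be applied per scanline instead.

```python
def swap_red_blue(rgb: bytes) -> bytes:
    """Return a copy of a packed 8-bit RGB raster with R and B exchanged."""
    if len(rgb) % 3:
        raise ValueError("raster length is not a multiple of 3 bytes")
    out = bytearray(rgb)
    out[0::3] = rgb[2::3]     # every R position receives the old B sample
    out[2::3] = rgb[0::3]     # every B position receives the old R sample
    return bytes(out)

# Two pixels, (1, 2, 3) and (4, 5, 6), become (3, 2, 1) and (6, 5, 4).
print(list(swap_red_blue(bytes([1, 2, 3, 4, 5, 6]))))   # prints [3, 2, 1, 6, 5, 4]
```

Only a raster decoded from resource 1033 should go through this step; a raster decoded from resource 1036 is already in the right channel order.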