Extracting the Thumbnail from the Photoshop private TIFF Tag
This document explains how to extract the thumbnail included in the Photoshop tag data. Background information and pseudo-code are included. C and Delphi ObjectPascal code interfacing with LibTiff and LibJpeg/VCL will also be made available later.
Photoshop Tag highest level structure
The Photoshop tag data is a series of so-called 'Image Resource Blocks'. Each Image Resource Block has the following structure.
Offset | Datatype | Value |
---|---|---|
0 | 4 Bytes | Image Resource Block Signature ASCII '8BIM' |
4 | Motorola-order Word | Image Resource ID. This identifies the resource type. |
6 | n Bytes; so-called Pascal string, padded to even size | Name. The first byte of this 'Pascal string' is the length indication, followed by an ASCII sequence of the indicated length. If the length is even, and thus the total of length byte and ASCII sequence is odd, a padding byte is appended. Officially, this is a 'name', but it turns out to be the null string most of the time, so expect two bytes of 0. |
6+n | Motorola-order Long | Size of resource data |
10+n | Variable number of Bytes, as indicated by previous field, padded to even size | Resource data |
This means the following pseudo-code will walk through the list of Image Resource Blocks.
let m point to the start of the tag data
while m is not beyond the end of the tag data
  sanity check: there should be at least 7 bytes of tag data available at m
  sanity check: the first four bytes pointed to by m should read ASCII '8BIM'
  increment m by 4
  let n be the Motorola-order Word value pointed to by m
  increment m by 2
  let o be the Byte value pointed to by m
  increment o
  if o is not even, increment o
  increment m by o
  sanity check: there should be at least 4 bytes of tag data available at m
  let p be the Motorola-order Long value pointed to by m
  increment m by 4
  sanity check: there should be at least p bytes of tag data available at m
  found resource data
    id: n
    location: m
    length: p
  if p is not even, increment p
  increment m by p
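As an illustration only, the walk described above could be sketched in Python along the following lines. The function name `iter_image_resources` and the choice to raise `ValueError` on a failed sanity check are assumptions made for this sketch, not part of any library. It uses the standard `struct` module to read the Motorola-order (big-endian) values.

```python
import struct

def iter_image_resources(data: bytes):
    """Yield (resource_id, name, offset, length) for each Image Resource Block."""
    pos, end = 0, len(data)
    while pos < end:
        # Signature (4 bytes) + resource ID (2 bytes) + name length byte (1 byte).
        if end - pos < 7:
            raise ValueError("truncated Image Resource Block header")
        if data[pos:pos + 4] != b"8BIM":
            raise ValueError("missing '8BIM' signature")
        (resource_id,) = struct.unpack_from(">H", data, pos + 4)
        name_len = data[pos + 6]
        name = data[pos + 7:pos + 7 + name_len]
        name_field = name_len + 1       # length byte plus the characters
        name_field += name_field & 1    # padded to even size
        pos += 6 + name_field
        if end - pos < 4:
            raise ValueError("truncated resource size field")
        (size,) = struct.unpack_from(">I", data, pos)
        pos += 4
        if end - pos < size:
            raise ValueError("resource data runs past the end of the tag data")
        yield resource_id, name, pos, size
        pos += size + (size & 1)        # resource data is padded to even size
```

A caller interested in the thumbnail would typically iterate over the results and keep the entry whose ID is 1033 or 1036, slicing `data[offset:offset + length]` to obtain the resource data.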
Thumbnail resources
There are two versions of the thumbnail resource, each with its own ID. Photoshop 4.0 wrote the thumbnail in a resource with ID 1033; Photoshop 5.0 and upward write the thumbnail in a resource with ID 1036. There is only one subtle difference between these two resources, and we'll explain that later. The following detailed structure applies to both.
Offset | Datatype | Value |
---|---|---|
0 | Motorola-order Long | Appears to always equal 1. Officially, this denotes the 'format', and this value could also equal 0, but it is not clear what that should mean, nor have we ever seen this. |
4 | Motorola-order Long | Thumbnail width |
8 | Motorola-order Long | Thumbnail height |
12 | Motorola-order Long | Scanline size ( = thumbnail width * 3, rounded up to the next multiple of 4) |
16 | Motorola-order Long | Total decompressed thumbnail memory size ( = scanline size * thumbnail height) |
20 | Motorola-order Long | Size of the JFIF data ( = size of resource data - 28) |
24 | Motorola-order Word | Appears to always equal 24. Officially, this denotes the number of bits per pixel. |
26 | Motorola-order Word | Appears to always equal 1. Officially, this denotes the number of planes. |
28 | Variable number of bytes, as indicated by 'Size of JFIF data' field | JFIF data |
Thus, the following pseudo-code will perform the necessary sanity checks and locate the JFIF data block:
let m be the location of the thumbnail resource
sanity check: there should be at least 28 bytes of resource data available at m
sanity check: the Motorola-order Long value pointed to by m should equal 1
increment m by 4
let nx be the Motorola-order Long value pointed to by m
increment m by 4
let ny be the Motorola-order Long value pointed to by m
increment m by 4
let o be nx*3, rounded up to the next multiple of 4
sanity check: Motorola-order Long value pointed to by m should equal o
increment m by 4
sanity check: Motorola-order Long value pointed to by m should equal o*ny
increment m by 4
sanity check: Motorola-order Long value pointed to by m should equal (resource size - 28)
increment m by 4
sanity check: Motorola-order Word value pointed to by m should equal 24
increment m by 2
sanity check: Motorola-order Word value pointed to by m should equal 1
increment m by 2
found JFIF thumbnail data
  location: m
  length: resource size - 28
  thumbnail width: nx
  thumbnail height: ny
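The checks above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `locate_thumbnail_jfif` is hypothetical, the argument is assumed to hold exactly the resource data of a 1033 or 1036 resource, and every sanity check is treated as fatal, which a more lenient reader might not want.

```python
import struct

def locate_thumbnail_jfif(resource: bytes):
    """Return (jfif_offset, jfif_length, width, height) for a thumbnail resource."""
    if len(resource) < 28:
        raise ValueError("thumbnail resource shorter than its 28-byte header")
    # Six Motorola-order Longs followed by two Motorola-order Words: 28 bytes.
    fmt, width, height, row_bytes, total_size, jfif_size, bits, planes = \
        struct.unpack_from(">6I2H", resource, 0)
    expected_row = (width * 3 + 3) // 4 * 4   # width * 3, rounded up to a multiple of 4
    if fmt != 1:
        raise ValueError("unexpected format value")
    if row_bytes != expected_row:
        raise ValueError("unexpected scanline size")
    if total_size != expected_row * height:
        raise ValueError("unexpected decompressed size")
    if jfif_size != len(resource) - 28:
        raise ValueError("JFIF size does not match resource size")
    if bits != 24 or planes != 1:
        raise ValueError("unexpected bits per pixel or number of planes")
    return 28, jfif_size, width, height
```

The returned offset is relative to the start of the resource data, so the JFIF block is `resource[28:28 + jfif_length]`.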
These nx and ny values, the thumbnail width and height, may seem redundant, since they will also surface when decoding the JFIF. However, some applications may choose to reject thumbnails that are, for instance, too small for their purposes, and the early indication of thumbnail width and height may allow them to do so without setting up JPEG decoding structures first.
Next, the JFIF data can be put through a JPEG decompression cycle. This will yield the thumbnail, ready for previewing. At least, this is true for thumbnail resource 1036, written by Photoshop 5.0 and upward.
Photoshop 4.0 seems to swap the R and B channels when feeding the JPEG compressor that builds the JFIF block contained in the 1033 thumbnail resource, and to swap them back after decompression, before display. The regular JFIF structures are present and the regular conversion between RGB and YCbCr is applied; only the channels are swapped. This means that, after decoding thumbnail resource 1033 to an RGB raster with your familiar JPEG library, you then need to swap the R and B channels.
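For a raster decoded from resource 1033, the channel swap could be sketched like this. The sketch assumes the JPEG library has produced a tightly packed buffer of 8-bit samples in R, G, B order with no padding at the end of each row; the function name `swap_red_blue` is hypothetical.

```python
def swap_red_blue(rgb: bytes) -> bytes:
    """Return a copy of a packed 8-bit RGB buffer with R and B exchanged."""
    if len(rgb) % 3:
        raise ValueError("buffer length is not a multiple of 3")
    out = bytearray(rgb)
    out[0::3] = rgb[2::3]   # the stored B samples become R
    out[2::3] = rgb[0::3]   # the stored R samples become B
    return bytes(out)
```

If the decoder pads its rows, the same swap would have to be applied row by row over the pixel bytes only.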