Wednesday, August 18, 2010

A lightweight Java solution to searching image archives

About a month ago, my wife found a collection of dress images, one of her favorite things to search for online, as she is fond of analyzing how clothes are put together and then making her own. The collection was in the form of a Shockwave file, in a neat little scroller app that lets you scan through dresses seen from different angles, find what you like, and go buy it. My wife was unable to save image files from the app, so she enlisted my help in the matter.

After a few minutes, I was able to download the Shockwave file, and take a peek at it in a hex editor. Prominently displayed in the first block was "JFIF", which told me all I needed to know. "Hey," I said, "this file has embedded jpegs; I bet I can just iterate through the file looking for image headers, and start saving a new file each time I hit one..." I mused over the problem for a while, and whipped up a perl script to do just that:

#!/usr/bin/perl -w
use strict;

my $count = 0;
my $fileName = $ARGV[0];
$fileName =~ s/\.\w{3}$//;
open(FILE, '<', $ARGV[0]) or die 'Error opening input file';
binmode(FILE) or die 'Error setting binary mode on input file';
$/ = \1;
my $writeFlag = 0;
my $lastFF = 0;
while (<FILE>) {    # read from the handle that was put into binary mode above
  if ($_ eq "\xD8" && $lastFF) {
    # Start-of-image marker (FF D8): begin a new output file
    $writeFlag = 1;
    open (JPG, '>', $fileName . '_img' . sprintf("%04d",$count) . '.jpg')
      or die "Can't create image file";
    binmode(JPG) or die 'Error setting binary mode on image file';
    print JPG "\xFF";
  }
  print JPG $_ if $writeFlag;
  if ($writeFlag && ($_ eq "\xD9") && $lastFF) {
    # End-of-image marker (FF D9): close the file and advance the counter
    $writeFlag = 0;
    close JPG;
    $count++;
  }
  if ($_ eq "\xFF") { $lastFF = 1 } else { $lastFF = 0 }
}

close FILE;

I used a template I had created earlier for extracting TIFF and EBCDIC data out of X9.37 files ("Check21" in banking lingo), yanked out the text processing, and added in a simplistic attempt at finding headers and trailers. By the end of the evening she had her collection of jpegs extracted, save for a few which for some reason had corruption artifacts introduced. Having finished both a tiring day at work and a second glass of Chardonnay, I was in no shape to analyze the issue any further, and slept on it instead.

When I awoke, I decided that it would be nice to have an application that shows you the embedded images in a file and lets you select which ones you want to extract. So over the next few weeks, while Liberty did her linguistics homework (they use the R programming language to analyze data, very interesting, but another story altogether), I dabbled with creating an interface to do just that.

Since the function of the application would be limited, I wanted the interface to be very slim. No pull-down menus, no toolbars, just enough functionality to be able to cycle through pictures, move them around and zoom, and export the ones you were interested in. I ran into several file handling and UI design problems that I believe I found nice solutions for along the way, which I'll talk about later, but first, here is the interface as it stands right now:

Initially, you are presented with an empty window prompting you to drag and drop a file onto it. The file need only have embedded jpegs, pngs, or gif files in it (the only formats supported right now) to work. PDF files with embedded images work, as do Shockwave files, tarballs (provided they aren't compressed), Open Office odt files, etc.

Once an image collection is dropped onto the program's window, thumbnails of the embedded images appear on the right, and the first image is displayed as large as can fit in the left side, up to its original size. Not shown here is the load process, which gives brief previews of the images as they are parsed - a debugging method I was using, which I decided was cool enough to leave in and call a feature.

Clicking on a thumbnail displays the full-size image in the left pane. When your mouse cursor enters the left pane, translucent save and zoom controls appear. When those controls are in turn moused over, they become opaque. Clicking the single-disk save icon exports the image currently in the left pane, while clicking the multi-disk icon exports them all. At this point, you aren't prompted to give a filename or a location to save the file; an image directory is created as a subdirectory of the original file's location, and the images are just saved with a generic filename plus a count. (Fixing that is one of the many things on my to-do list of enhancements.)

The zoom control is a simple slider. Just click and drag the thumb to the zoom level you want. The scale goes linearly from 1 to 400% of the image's original size, and the longer black line on the track is where 100% is. Note that no matter how zoomed or not the image is, what gets exported is only the image as it was in the collection file. You can also click and drag the image around, as I've done here to position Elvis so he is centered and Nixon isn't showing.

Problems along the way

Bytes, chars, ints, and magic numbers

A "magic number" refers to the first few bytes of a given file type. For example, all GIF files start with "GIF87a" or "GIF89a", while all PNG files start with hex 89, then "PNG", then hex 0D 0A 1A 0A, a rather ingenious way of detecting problems with line-end conversions and file readers. An image reader typically reads in a few bytes of a file to determine what image type it is, and then parses it accordingly. My task was an extension of that: Read a file until you see an image magic number somewhere in it, then parse that image, then look for the next magic number.
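To make the idea concrete, here is a minimal sketch of checking the leading bytes of a buffer against the signatures mentioned above. It is not code from the application; the class name MagicSniffer and its methods are made up for illustration.

```java
final class MagicSniffer {
    // Signatures as described in the text; JPEG has only three constant bytes.
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] GIF87 = {'G', 'I', 'F', '8', '7', 'a'};
    private static final byte[] GIF89 = {'G', 'I', 'F', '8', '9', 'a'};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};

    // True if data begins with every byte of magic, in order.
    static boolean startsWith(byte[] data, byte[] magic) {
        if (data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) return false;
        }
        return true;
    }

    // Returns "png", "gif", "jpeg", or "unknown" based on the leading bytes.
    static String sniff(byte[] data) {
        if (startsWith(data, PNG)) return "png";
        if (startsWith(data, GIF87) || startsWith(data, GIF89)) return "gif";
        if (startsWith(data, JPEG)) return "jpeg";
        return "unknown";
    }
}
```

Calling MagicSniffer.sniff() on the first few bytes of a file is roughly what an image reader does before choosing a parser; the application instead has to run the same comparison at every offset in the file.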

Since I wanted users to be able to export individual images, I needed a method of tagging the start and end points of each image in the file (the alternative of holding the original bytes in a byte array in addition to Java Image objects seemed wasteful), so I quested for an appropriate reader. File readers want to grab 16 bit chars instead of 8 bit bytes, which makes searching for magic numbers painful. Data input readers want to return either 32 bit integers representing 0 - 255, with -1 for end of file, or two's complement bytes, with an exception thrown at end of file.
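The two behaviors are easy to see side by side. The sketch below assumes the standard java.io.DataInputStream is representative of the "data input readers" in question; the class ByteReadDemo and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ByteReadDemo {
    private static DataInputStream open(byte[] data) {
        return new DataInputStream(new ByteArrayInputStream(data));
    }

    // read(): the next byte as 0..255, or -1 once the stream is exhausted.
    static int viaRead(byte[] data) {
        try (DataInputStream in = open(data)) {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // readByte(): the next byte as -128..127; end of stream is an EOFException,
    // reported here as null so the caller can tell it apart from a real byte.
    static Integer viaReadByte(byte[] data) {
        try (DataInputStream in = open(data)) {
            return (int) in.readByte();
        } catch (EOFException e) {
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For the single byte 0xFF, viaRead() gives 255 while viaReadByte() gives -1, which is the same value read() uses to signal end of file. That overlap is what makes mixing the two styles awkward when hunting for byte patterns such as FF D8 FF.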

I didn't particularly like the behavior of any of the readers, and settled on using FileImageInputStream to handle the file, and to return buffered images from it, since it handled all the background business of figuring out what type-specific readers are available, and invoking the needed one. Unfortunately, a side effect of using ImageIO.read() is that your input stream gets closed automatically.

Since I wanted to keep the stream open and continue looking for more images, I came up with a quick hack: extend the input stream to ignore close requests:

class ExportableFileStream extends FileImageInputStream {
  public ExportableFileStream(File f) throws FileNotFoundException, IOException {
    super(f);
  }

  public void close() {
    // Don't actually close
  }

  public void actuallyClose() throws IOException {
    super.close();
  }
}

With a mechanism to be able to control when a stream closed, I next needed a simple object to associate images with their start and end positions in the file. That was a no-brainer:

class ImagePlusPositions {
  Image image;
  long start, end;   // byte offsets of the image within the source file

  public ImagePlusPositions(Image i, long s, long e) {
    image = i;
    start = s;
    end = e;
  }
}

And lastly, I decided to define magic numbers as 32 bit integers representing their first 4 bytes (except for jpgs, where only 3 bytes are always the same). When iterating through the file, I read in a byte at a time, and use bitshifting to build a 32 bit integer representing the last four bytes, and compare it to all my defined magic numbers. Here is a trimmed down version of that class:

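public class ExportableFile extends File {
  static final int jpgMagic = 0xffd8ff;
  static final int gifMagic = 0x47494638;
  static final int pngMagic = 0x89504E47;
  private List<ImagePlusPositions> imageList;

  // constructor, other methods omitted

  private void getImageList() throws IOException {
    imageList = new ArrayList<ImagePlusPositions>();
    ExportableFileStream is = new ExportableFileStream(this);
    int c;
    int last4 = 0;
    while ((c = is.read()) != -1) {
      last4 = (last4 << 8) | c;
      // The branch bodies were lost from this listing; the calls below are
      // reconstructed from the description that follows, and the exact
      // readImage() arguments are an assumption.
      if (last4 == pngMagic || (last4 >>> 8) == jpgMagic) {
        readImage(is);
      } else if (last4 == gifMagic) {
        readImage(is);
        is.setByteOrder(ByteOrder.BIG_ENDIAN);  // undo the GIF reader's change
      }
    }
    is.actuallyClose();
  }
}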

The readImage() method in turn backs up four bytes in the file, and calls ImageIO.read() to do the image parsing, without fear of prematurely closing the input stream. The business with setting big-endian is due to GIF readers explicitly setting the stream's byte order to little-endian. Since the GIF reader assumes it won't be sharing the input stream with anything else, it doesn't set the byte order back to its previous setting. Since CompuServe had a bunch of crazy proprietary tech, it doesn't surprise me that their file format would have an anomaly like that. But that's another story.
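The rolling four-byte comparison can be tried in isolation on an in-memory buffer. This is only a sketch under that assumption, not the application's code: the class MagicScanner and the method findStarts() are hypothetical names, and it reports candidate offsets rather than parsing anything.

```java
import java.util.ArrayList;
import java.util.List;

final class MagicScanner {
    static final int JPG_MAGIC = 0xFFD8FF;    // only three bytes are constant
    static final int GIF_MAGIC = 0x47494638;  // "GIF8"
    static final int PNG_MAGIC = 0x89504E47;  // 0x89 "PNG"

    // Returns the offset at which each candidate image signature begins.
    static List<Integer> findStarts(byte[] data) {
        List<Integer> starts = new ArrayList<>();
        int last4 = 0;
        for (int i = 0; i < data.length; i++) {
            // Shift the previous bytes left and append the new one, unsigned.
            last4 = (last4 << 8) | (data[i] & 0xFF);
            boolean hit = last4 == PNG_MAGIC
                    || last4 == GIF_MAGIC
                    || (last4 >>> 8) == JPG_MAGIC;
            if (i >= 3 && hit) {
                starts.add(i - 3);  // the window covers bytes i-3 .. i
            }
        }
        return starts;
    }
}
```

Because the window always holds four bytes, a JPEG is only recognized once the byte after FF D8 FF has been read, which is why backing up four bytes lands on the start of the image for all three formats. A hit is only a candidate; the same byte pattern can occur by chance inside unrelated data, so the image parser still has the final say.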

Scrolling and Thumbnails

I intended to use vertical scrollbars on the right-side thumbnail pane, whereas in the left pane you simply drag the main image around Google Maps style. Scrollbars in Java turned out to be sort of odd. Swing has a component called a JScrollPane whose function is to contain a component ("view") that can't fit on the screen, and present scrollbars that let you move around. This initially translated to me as needing two containers just to add some damned scrollbars, but on later reflection it seemed sensible.

Somewhere between my initial reaction and later reflection, I decided to create a class for holding thumbnail images in a scrolling window. I did this by extending JScrollPane to create its own view (a JPanel, in this case), manage the view's layout, call the superclass constructor to set vertical scrollbars only, and have a couple methods to manage adding thumbnails to the view with 10px gaps between them. The extra 20 pixel width keeps the centered thumbnail from overlapping the scrollbar.

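public class ThumbnailPanel extends JScrollPane {
  private JPanel view;
  static final Dimension gap = new Dimension(0,10);

  public ThumbnailPanel() {
    this(new JPanel());
  }

  public ThumbnailPanel(JPanel v) {
    // Vertical scrollbar only, as described above
    super(v, ScrollPaneConstants.VERTICAL_SCROLLBAR_AS_NEEDED,
          ScrollPaneConstants.HORIZONTAL_SCROLLBAR_NEVER);
    view = v;
    view.setLayout(new BoxLayout(view, BoxLayout.Y_AXIS));
  }

  // The bodies of the next two methods were cut off in this listing; what
  // follows is a reconstruction from the description above, not the original.
  public void addThumbnail(Thumbnail c) {
    c.setPreferredSize(new Dimension(c.image.getWidth(this) + 20,
                                     c.image.getHeight(this)));
    view.add(c);
    view.add(Box.createRigidArea(gap));  // 10px gap between thumbnails
  }

  public void clearView() {
    view.removeAll();
  }
}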

The Thumbnail class itself is handy. It's defined as a JComponent, and its constructor takes an image and scales it down to a certain size. The paintComponent method centers the image in the component (minus 10 pixels, to account for the scrollbar), allowing the thumbnail panel to draw the collection center justified, which looks nice.

public class Thumbnail extends JComponent {
  Image image;
  int index;
  static final int side = 200;   // longest allowed edge, in pixels

  public Thumbnail(Image image, int index) {
    this.image = getThumbnail(image);
    this.index = index;
  }

  public void paintComponent(Graphics g) {
    if (image == null) return;
    // Center the image, shifted left 10px to leave room for the scrollbar
    int imgX = (getWidth() - image.getWidth(this)) / 2 - 10;
    int imgY = (getHeight() - image.getHeight(this)) / 2;
    g.drawImage(image, imgX, imgY, this);
  }

  private Image getThumbnail(Image i) {
    int x = i.getWidth(this);
    int y = i.getHeight(this);
    if (x < side && y < side) return i;
    // Scale the longer edge down to `side`, preserving the aspect ratio
    if (x > y) {
      y = y * side/x;
      x = side;
    } else {
      x = x * side/y;
      y = side;
    }
    return i.getScaledInstance(x, y, Image.SCALE_FAST);
  }
}
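The sizing arithmetic in getThumbnail() can be checked on its own, without any images. The sketch below pulls it into a hypothetical ThumbnailMath.fit() helper. One deliberate difference from the listing above: it clamps each edge to at least 1 pixel, since Image.getScaledInstance() is documented to reject a zero width or height, which integer division could produce for a very long, thin image.

```java
final class ThumbnailMath {
    // Returns {width, height} scaled so the longer edge is at most `side`,
    // preserving the aspect ratio. Images already smaller are left alone.
    static int[] fit(int width, int height, int side) {
        if (width < side && height < side) {
            return new int[] {width, height};
        }
        if (width > height) {
            // Landscape: pin the width, shrink the height proportionally.
            return new int[] {side, Math.max(1, height * side / width)};
        }
        // Portrait or square: pin the height, shrink the width proportionally.
        return new int[] {Math.max(1, width * side / height), side};
    }
}
```

With side set to 200, an 800x600 image comes out as 200x150, a 600x800 image as 150x200, and a 100x50 image is returned unchanged.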

Zooming and panning

I put some thought into how to handle zooming and make it seem natural. When the displayed image is larger than the application window, the zoom point of reference should be the center of the window. That is, you should be taking a closer look at the item that was already in the center of the window.

When the image is smaller than the window, however, this behavior results in an undesired effect: the image fleeing the window if it wasn't already centered on it. So in cases where the image fits in the window, the zoom happens at the image's center, not the window's. In the code below, an unshown method inits the imgX and imgY variables so that the image is centered in the window.

  int imgX, imgY;   // top-left corner of the displayed image, in panel coordinates

  public void paintComponent(Graphics g) {
    g.drawImage(display, imgX, imgY, this);
  }

  public void stateChanged(ChangeEvent e) {
    // factor = new scale / old scale
    float factor = scale;
    scale = (float) slider.getValue() / 100;
    factor = scale / factor;
    int centerX, centerY;
    if (display.getWidth(null) > getWidth() || display.getHeight(null) > getHeight()) {
      // if scaled image doesn't fit in panel, reference point for zooming is the center of the panel,
      centerX = (getWidth() - getX()) / 2;
      centerY = (getHeight() - getY()) / 2;
    } else {
      // otherwise the zoom reference point is the center of the scaled image.
      centerX = imgX + display.getWidth(null) / 2;
      centerY = imgY + display.getHeight(null) / 2;
    }
    // Scale the offset from the reference point so that point stays put
    imgX = centerX - (int) ((centerX - imgX) * factor);
    imgY = centerY - (int) ((centerY - imgY) * factor);
  }
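The last two assignments are the heart of the zoom: the image's corner is moved so that its distance from the reference point grows or shrinks by the same factor as the image itself. Here is that one-axis calculation as a standalone sketch, with hypothetical names (ZoomMath, zoomOrigin):

```java
final class ZoomMath {
    // Given the image's current origin on one axis, the fixed reference point,
    // and factor = newScale / oldScale, returns the origin after zooming.
    static int zoomOrigin(int origin, int center, float factor) {
        return center - (int) ((center - origin) * factor);
    }
}
```

For example, with the reference point at x = 200 and the image's left edge at x = 100, doubling the scale gives 200 - (100 * 2) = 0. The pixel that sat 100 units into the image, under the reference point, is now 200 units in, and 0 + 200 puts it back at x = 200.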

Panning by dragging the image with the mouse was surprisingly straightforward. The MouseEvent class has a method "getPoint()" that returns the mouse's position at the time an event was fired. The mousePressed() event fires the moment a mouse button is pressed. mouseDragged() fires several times throughout a drag motion.

I only needed to record the original position when mousePressed() fired, and have mouseDragged() do a little math and repaint the panel. The overridden paintComponent() method above handles the rest.

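  Point dragOrigin;   // last known mouse position during a drag

  public void mousePressed(MouseEvent e) {
    dragOrigin = e.getPoint();
  }

  public void mouseDragged(MouseEvent e) {
    Point dragTo = e.getPoint();
    // Move the image by however far the mouse moved since the last event
    imgX += dragTo.x - dragOrigin.x;
    imgY += dragTo.y - dragOrigin.y;
    dragOrigin = dragTo;
    repaint();
  }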

The links here are an early build of the application. It behaves nicely, but has some UI quirks to work out, and it needs an overhaul of the paint methods to make them behave more intelligently. With that caveat, the jar file built at the time of this writing is available here, and its source code is here. This was written by me, and is released under GPL3.



  1. I have been messing with extracting the data and image files from FRB X9.37 files today. I have had the most luck with your code (I believe you wrote the other TIFF version that's floating around here as well).

    However, the TIFFs and JPGs are all unusable. Do you know what artifacts are being introduced into these image files making them "corrupt"? I wouldn't know where to start in identifying the artifacts but you mentioned it in your post so I was wondering if you knew what could be searched for/replaced/etc. I could work to get those out of the image files created.

    Any assistance is much appreciated. Thanks again for pointing me in the right direction with these Perl scripts.


  2. Jeff, the artifacts problem I was having was with pulling jpegs out of Shockwave files. I didn't identify what the problem was, opting to use the java program instead. I didn't experience the same thing pulling images out of Check21 files.

    If the script at doesn't extract everything properly from your X9.37 file, it's possible that there are file conversion problems going on, like EBCDIC to ASCII, or Unix to Windows linefeed conversion... but that's just a guess. It could also be something simple like naming the file .jpg when it's really TIFF data and your image viewer doesn't figure out what's up.

    Are you the recipient of the file? Did you get it via FTP? If so, make sure you pull it down in binary mode.

    If not, and if it's a test file that doesn't have real bank account info in it, shoot it to me in an email and I'll try to give you a better answer.