20 Image Resizing Pitfalls

Posted on May 19, 2009

Dozens of articles on server-side image resizing have been written. If we count other tongues, maybe hundreds. These contributions to the community
have been invaluable to me, and I truly appreciate the time each author spent to share his or her knowledge.

So why am I writing another?

Because each article I have read includes at least one of the errors below, leading readers to write
slow, insecure, or incorrect code. I have discovered many of these pitfalls the hard way.
I hope others won’t have to.

Instead of giving step-by-step instructions, this article will simply list pitfalls and the alternatives. I hope to publish my approach in a Part 2.

Security and Performance Pitfalls

  1. Not using using(){}. You *must* wrap your Graphics, Bitmap, and MemoryStream objects
    in a using(){} clause, or else they will not get cleaned out of memory until the garbage collector eventually finalizes them.
    Under load this can cause *serious* issues. Read “To dispose, or not to dispose, that’s the 1GB question”
    if you have any doubts regarding the severity of this error.

    If you find yourself nesting a lot of using(){} statements, you can also use equivalent try{}finally{} code.

    //Using method. The object must implement IDisposable for this to work.
    //MemoryStream (System.IO) stands in here for any disposable type, such as Bitmap or Graphics.
    using (MemoryStream a = new MemoryStream())
    {
        using (MemoryStream b = new MemoryStream())
        {
            using (MemoryStream c = new MemoryStream())
            {
                //Work with a, b, and c here
            }
        }
    }
    
    //Try finally method
    MemoryStream a = null;
    MemoryStream b = null;
    MemoryStream c = null;
    try
    {
        a = new MemoryStream();
        b = new MemoryStream();
        c = new MemoryStream();
        //Work with a, b, and c here
    }
    finally
    {
        if (a != null) a.Dispose(); //If one of the Dispose methods throws an error, the others will not execute.
        if (b != null) b.Dispose(); //So there is an advantage to using nested using(){} clauses
        if (c != null) c.Dispose(); //You could nest try{} finally{ try{} finally{ try{} finally{}}} to solve that...
        //Two different techniques - take your pick
    }
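
    If the nesting depth is the only objection, C# also lets consecutive using statements share a single block, which keeps one guaranteed Dispose call per object without the extra indentation. A minimal sketch, with MemoryStream and StreamWriter standing in for whatever disposable objects your resizer holds (the class and method names are hypothetical):

    ```csharp
    using System;
    using System.IO;

    public static class StackedUsingExample
    {
        // Writes some text through two stacked disposables and returns the byte count.
        public static long WriteAndMeasure(string text)
        {
            // Each using statement governs the statement that follows it,
            // so only the last one needs braces.
            // Disposal happens in reverse order: writer first, then buffer.
            using (MemoryStream buffer = new MemoryStream())
            using (StreamWriter writer = new StreamWriter(buffer))
            {
                writer.Write(text);
                writer.Flush(); // push buffered characters into the MemoryStream
                return buffer.Length;
            }
        }
    }
    ```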
    
  2. Using on-the-fly image resizing without disk caching! The ASP.NET
    memory cache won’t cut it here, folks – it gets cleaned out on every application restart,
    and besides, you probably have more images than RAM. Resizing an image is fast,
    but it will still flood the CPU if a single user browses a single page with 20 or more
    resized images on it. This is a do-it-yourself DoS attack. On-the-fly resizing is fine if you have disk caching.
  3. Not using on-the-fly resizing. This one bites too. If you decide to convert all
    your images up-front, please realize how difficult it will be to track down the
    originals and resize them again the next time you make a resolution jump. I’ve been
    through this enough, and it’s painful – that’s why I wrote a dynamic image resizer!
  4. Disk-caching without checking for updated (or reverted!) source files. Debugging
    a resized image that won’t update can eat up lots of time. Make sure you set the
    LastWriteTimeUtc on your cached images to match the source image file (and check
    they match) – don’t simply check to see if the source file is newer than the cached
    file, since that will break if you copy an older file over a source image. Always
    use something like RoughCompare() to compare filesystem dates – *never* inequalities.
    Remember that filesystem dates are less precise than DateTime, and get rounded.

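    /// <summary>
    /// Returns true if both dates are equal (to the nearest 200th of a second)
    /// </summary>
    /// <param name="d1">The first date to compare</param>
    /// <param name="d2">The second date to compare</param>
    /// <returns>True if the dates differ by 5 milliseconds or less</returns>
    private static bool RoughCompare(DateTime d1, DateTime d2)
    {
    	//Use TotalMilliseconds, not Milliseconds: the latter is only the 0-999 component, so dates that differ by whole seconds would compare as equal.
    	return (new TimeSpan(Math.Abs(d1.Ticks - d2.Ticks)).TotalMilliseconds <= 5);
    }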
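
    To show where such a comparison fits, here is a rough sketch of the stamp-and-check cycle using File.GetLastWriteTimeUtc and File.SetLastWriteTimeUtc. The class and method names are hypothetical, and the 5 ms tolerance simply mirrors the helper above:

    ```csharp
    using System;
    using System.IO;

    public static class CacheTimestamps
    {
        // Equal within 5 milliseconds, to absorb filesystem rounding.
        private static bool RoughlyEqual(DateTime d1, DateTime d2)
        {
            return Math.Abs(d1.Ticks - d2.Ticks) <= TimeSpan.TicksPerMillisecond * 5;
        }

        // Call right after writing the cached file, so it carries the source's timestamp.
        public static void CopySourceTimestamp(string sourcePath, string cachedPath)
        {
            File.SetLastWriteTimeUtc(cachedPath, File.GetLastWriteTimeUtc(sourcePath));
        }

        // The cached copy is valid only while the two timestamps still match.
        // A source that is newer OR older than the stamp invalidates the cache.
        public static bool IsCacheValid(string sourcePath, string cachedPath)
        {
            if (!File.Exists(cachedPath)) return false;
            return RoughlyEqual(File.GetLastWriteTimeUtc(sourcePath),
                                File.GetLastWriteTimeUtc(cachedPath));
        }
    }
    ```

    Because the check is a match rather than a newer-than test, copying an older file over the source invalidates the cached version just as a fresh edit would.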
    
  5. Disk-caching without cleanup! Another do-it-yourself DoS
    attack, although not quite as bad as the first. Left unchecked, your cache
    directory could grow very large over a few years as orphaned
    image versions accumulate. If a malicious visitor realizes that you have automatic resizing,
    he could try to fill up your hard drive by requesting an endless variety of resolutions
    for a given image. Of course, security-conscious developers will have cache-limiting
    systems in place. I suggest cleaning out the least recently used 10-20% of the cache
    directory whenever the file limit is reached. Handle locked files gracefully.
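
    A minimal sketch of that cleanup pass, assuming the filesystem actually records last-access times (some servers disable those updates) and that a flat cache directory is used. The class name, method name, and parameters are hypothetical:

    ```csharp
    using System;
    using System.IO;

    public static class CacheCleanup
    {
        // Deletes the least recently used files once the directory holds more than maxFiles.
        // fractionToRemove is the share of files to delete, e.g. 0.1 to 0.2.
        // Returns the number of files actually deleted.
        public static int TrimIfOverLimit(string cacheDir, int maxFiles, double fractionToRemove)
        {
            FileInfo[] files = new DirectoryInfo(cacheDir).GetFiles();
            if (files.Length <= maxFiles) return 0;

            // Oldest access time first.
            Array.Sort(files, delegate(FileInfo x, FileInfo y)
            {
                return x.LastAccessTimeUtc.CompareTo(y.LastAccessTimeUtc);
            });

            int toRemove = (int)Math.Ceiling(files.Length * fractionToRemove);
            int removed = 0;
            for (int i = 0; i < toRemove && i < files.Length; i++)
            {
                try
                {
                    files[i].Delete();
                    removed++;
                }
                catch (IOException) { }                 // locked or in use: skip it this time
                catch (UnauthorizedAccessException) { } // read-only or no permission: skip it
            }
            return removed;
        }
    }
    ```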
  6. Checking the cache size for cleanup every image request. This will swamp your I/O.
    Instead of running that directory listing each time, keep a static counter that tracks how many new images have been resized
    since the application started. Run the cache cleanup on the first image request and each time the counter passes the cleanup threshold.
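
    One way to sketch that counter, using Interlocked so concurrent requests don’t need a lock. CleanupCounter and its members are hypothetical names, and the threshold is whatever suits your cache; you would hold a single instance in a static field:

    ```csharp
    using System.Threading;

    public class CleanupCounter
    {
        private readonly int threshold;
        private int newFiles;    // resized images written since the application started
        private int startedFlag; // 0 until the first image request has been seen

        public CleanupCounter(int threshold)
        {
            this.threshold = threshold;
        }

        // Call once per image request. Returns true when a cleanup pass is due:
        // on the very first request, and again on every threshold-th newly written image.
        public bool ShouldCleanUp(bool wroteNewFile)
        {
            bool firstRequest = Interlocked.Exchange(ref startedFlag, 1) == 0;
            bool thresholdHit = wroteNewFile &&
                Interlocked.Increment(ref newFiles) % threshold == 0;
            return firstRequest || thresholdHit;
        }
    }
    ```

    Requests served straight from the cache never touch the directory listing; only the request that crosses the threshold pays for it.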
  7. Disk caching without protecting the cache directory. Unless you want anonymous users to potentially view the same images
    as authorized users, you need your cache directory locked down. A Web.config file in the directory can do this – just verify your URL rewriting rules
    don’t leave another way to access the directory.

    The cache directory needs to stay inside the application to permit request rewriting to the cached files.

  8. Disk caching without proper locking code. This is a minor problem, since the
    consequences are light – but it is good to remember that 2 image requests for the
    same image size could happen at the same time, and (if they aren’t cached), they
    may conflict when trying to write to the same file at the same time. You’ll probably
    get a “The process cannot access the file because it is being used by another process.”
    message if this happens. You can prevent this by creating a locking system so that
    only one thread can save a given resized image at a time. Optimally, you want multiple resizes for different images to occur at the same time. If you’re
    not as concerned about concurrency performance as I was, you could cheat and make the whole resizing method locked. (For new image requests only!)
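
    A rough sketch of per-file locking, assuming a single worker process (these locks do not cross process boundaries). PerFileLock and EnsureCached are hypothetical names, and the lock table is never pruned here, which a real implementation would need to address:

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;

    public static class PerFileLock
    {
        private static readonly object tableLock = new object();
        // One lock object per cached file path. Case-insensitive keys are an
        // assumption that matches Windows file systems.
        private static readonly Dictionary<string, object> locks =
            new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase);

        private static object GetLockFor(string path)
        {
            lock (tableLock) // held only briefly, to look up or create the per-file lock
            {
                object fileLock;
                if (!locks.TryGetValue(path, out fileLock))
                {
                    fileLock = new object();
                    locks[path] = fileLock;
                }
                return fileLock;
            }
        }

        // Runs resizeAndSave only if the cached file is still missing once the lock is held.
        // Returns true if this call did the work. Different paths never block each other.
        public static bool EnsureCached(string cachedPath, Action resizeAndSave)
        {
            lock (GetLockFor(cachedPath))
            {
                if (File.Exists(cachedPath)) return false; // another request already wrote it
                resizeAndSave();
                return true;
            }
        }
    }
    ```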
  9. Writing directly to the output stream. If you’re caching to disk, but still serving
    the image contents in code, you’re only supporting a little bit of the HTTP standard,
    and you’re bypassing all of the work Thomas Marquardt did to bring StaticFileHandler up to snuff.
    Implement your resizer as an HttpModule, not an HttpHandler, or you’re stuck.
  10. Serving a file from disk by loading it into memory. Think about how much RAM your server has,
    how large a single image is, how long
    it has to stay in memory before users finish downloading it, and how many users
    you have requesting images. Don’t load anything into memory after the initial resize.

    WriteFile() serves directly from disk, and is *much* safer and more efficient. However – you
    shouldn’t be using WriteFile() either if you can avoid it. Letting StaticFileHandler
    do its job is a much better choice.

  11. Making an HttpHandler instead of an HttpModule. I actually did this in
    v1.0, and it was a *mess*, as well as being non-optimal from a performance standpoint.
    There are several problems with doing this as an HttpHandler.

    1. It’s very difficult to make an HttpHandler catch only *some* requests (i.e., those
      requesting resizing), for a certain extension. It’s very hard, in fact, and involves
      subclassing DefaultHttpHandler and re-implementing a lot of code. While that’s possible
      on IIS5/6/7 classic, it doesn’t work on IIS7 Integrated. So IIS7 integrated is a
      complete deal-breaker if you want to let standard images alone.
    2. It’s difficult to pass a request from one HttpHandler to another. When building
      an image resizer, we don’t want to be responsible for serving the resized file,
      just making sure the resized version has been cached to disk, and then rewriting
      the request to point to that file. An HttpModule, on the other hand, is perfectly
      suited to checking for image resize requests, caching the results, and rewriting
      the request so StaticFileHandler, or whatever is the default in IIS 8, 9, or 10,
      can take care of it. I do this in PostAuthorizeRequest, by calling context.RewritePath(virtualPath, false);
  12. Not setting context.Response.ContentType properly. You’ll get all kinds of interesting,
    varied, and peculiar results from browsers if you omit this step. Things can be really
    interesting if the format is changed during the resize, since the extension will
    match the original format.
  13. Obvious, but you should have caching enabled for your images, regardless
    of whether they are being resized or not. Disk caching is great, but memory caching allows for even faster responses to frequently requested images, and shouldn’t be omitted.
    In addition, HttpCacheability.Public enables client and proxy caching too, so browsers and some firewalls will cache the result from the server. You can adjust the amount of time
    the files are cached with SetExpires.

    This is the code I use during PreSendRequestHeaders:

    HttpApplication app = sender as HttpApplication;
    HttpContext context = (app != null) ? app.Context : null;
    
    if (context != null && context.Items != null && context.Items["FinalContentType"] != null && context.Items["FinalCachedFile"] != null)
    {
    	//Clear previous output
    	//context.Response.Clear();
    	context.Response.ContentType = context.Items["FinalContentType"].ToString(); //FinalContentType is set to image/jpeg or whatever the image mime-type is earlier in code.
    	//Add caching headers
    	context.Response.AddFileDependency(context.Items["FinalCachedFile"].ToString());
    
    	if (context.Items["ContentExpires"] != null)
    		context.Response.Cache.SetExpires((DateTime)context.Items["ContentExpires"]); //ContentExpires is set to DateTime.Now.AddMinutes(x), where x is how long the clients should locally cache the image before checking for updates.
    
    	//Enables in-memory caching
    	context.Response.Cache.SetCacheability(HttpCacheability.Public);
    	context.Response.Cache.SetLastModifiedFromFileDependencies();
    	context.Response.Cache.SetValidUntilExpires(false);
    }
    
  14. Accepting the file path as a querystring parameter. This mistake makes me cringe
    – I find it amazing each time how much people trust their filtering code to prevent
    abuse of this feature. (If they have path filtering code at all!) Just… don’t…
    do it… please. Do you know how many ways there are to encode filenames and circumvent
    pattern-matching techniques? Yes, there are ways to protect this kind of system, but why?

    Why choose /resizeimage.ashx?path=~%2fimg%2fproducts%2fbox.jpg&maxwidth=100&maxheight=100
    over /img/products/box.jpg?maxwidth=100&maxheight=100 ?

    If you’re stuck in IIS6 and you aren’t allowed to modify handler mappings, you should look for a better host.

Pitfalls in Image Resizing

  1. Using GetThumbnailImage().
    GetThumbnailImage() seems the obvious choice, and many articles recommend its use.
    Unfortunately, it always grabs the embedded jpeg thumbnail if present.
    Some photos have these, some don’t – it usually depends on your camera. You’ll wonder
    why GetThumbnailImage() works well on some photos, but on others is horribly
    blurred. GetThumbnailImage() isn’t reliable for photos larger than 10px by 10px for that reason.
  2. Forgetting to set InterpolationMode, SmoothingMode, CompositingQuality, and PixelOffsetMode.
    With all these set properly, you
    should be able to get resized images indistinguishable from Photoshop results. If
    you don’t, you’ll end up with trash. GDI+ has dumb defaults. (BTW, the low-quality
    settings aren’t always much faster.)
    This article explains why those are needed to make DrawImage compose the image well.

    graphics.InterpolationMode = InterpolationMode.HighQualityBicubic;
    graphics.SmoothingMode  = SmoothingMode.HighQuality;
    graphics.CompositingQuality = CompositingQuality.HighQuality;
    graphics.PixelOffsetMode = PixelOffsetMode.HighQuality;
                
  3. Not maintaining aspect ratio. I see this often, and I’m not sure why – the math
    isn’t too hard. Well, for those who are wondering how, I hope this code is rather
    transparent (no pun intended).

    //Cast to double first: if the dimensions are integers, plain division would truncate.
    double aspectRatio = (double)imageWidth / imageHeight;
    double boxRatio = (double)maxWidth / maxHeight;
    double scaleFactor = 0;
    if (boxRatio > aspectRatio)
     //Use height, since that is the most restrictive dimension of box.
     scaleFactor = (double)maxHeight / imageHeight;
    else
     scaleFactor = (double)maxWidth / imageWidth;
    
    double newWidth = imageWidth * scaleFactor;
    double newHeight = imageHeight * scaleFactor;
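
    The same calculation as a self-contained helper, offered as a sketch: it assumes integer pixel dimensions and casts before dividing so the ratios are not truncated. FitWithinBox and Scale are hypothetical names, and rounding to whole pixels with a 1px minimum is an added assumption. Like the snippet above, it will also scale small images up to fill the box.

    ```csharp
    using System;

    public static class FitWithinBox
    {
        // Computes the largest size that fits inside maxWidth x maxHeight
        // while keeping the image's aspect ratio.
        public static void Scale(int imageWidth, int imageHeight, int maxWidth, int maxHeight,
                                 out int newWidth, out int newHeight)
        {
            double aspectRatio = (double)imageWidth / imageHeight;
            double boxRatio = (double)maxWidth / maxHeight;

            // If the box is proportionally wider than the image, height is the limiting side.
            double scaleFactor = (boxRatio > aspectRatio)
                ? (double)maxHeight / imageHeight
                : (double)maxWidth / imageWidth;

            // Round to whole pixels, never going below 1px (assumption, see above).
            newWidth = Math.Max(1, (int)Math.Round(imageWidth * scaleFactor));
            newHeight = Math.Max(1, (int)Math.Round(imageHeight * scaleFactor));
        }
    }
    ```

    For example, a 1600x1200 photo requested with maxwidth=100 and maxheight=100 comes out at 100x75, and a 600x800 portrait comes out at 75x100.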
    
  4. Not setting the Jpeg quality to 90. You’ll get huge Jpegs from Image.Save unless
    you pass in the proper parameters. 90 seems to be the magic value – great quality
    and much lower file size than 100.

    int quality = 90; //90 is the magic setting - really. It has excellent quality and file size.
    System.Drawing.Imaging.EncoderParameters encoderParameters = new System.Drawing.Imaging.EncoderParameters(1);
    encoderParameters.Param[0] = new System.Drawing.Imaging.EncoderParameter(System.Drawing.Imaging.Encoder.Quality, (long)quality);
    thumb.Save(stream, GetImageCodeInfo("image/jpeg"), encoderParameters);
    
    /// <summary>
    /// Returns the first ImageCodecInfo instance with the specified mime type. Some people try to get the ImageCodecInfo instance by index - sounds rather fragile to me.
    /// </summary>
    /// <param name="mimeType"></param>
    /// <returns></returns>
    public static ImageCodecInfo GetImageCodeInfo(string mimeType)
    {
    	ImageCodecInfo[] info = ImageCodecInfo.GetImageEncoders();
    	foreach (ImageCodecInfo ici in info)
    		if (ici.MimeType.Equals(mimeType, StringComparison.OrdinalIgnoreCase)) return ici;
    	return null;
    }
    
  5. Using the built-in quantization (palette creation) for GIFs, 8-bit PNGs and BMPs.
    The default palette is truly terrible, and while you can specify your own set of
    255 colors – which ones should they be? The process of determining which colors
    to choose for the palette to produce the best quality images is called quantization.
    I recommend the very efficient and decent-quality octree quantization algorithm.
    It does have a number of bugs you will have
    to patch. Follow the transparency patch instructions found in the comments. Use the safe version of the library. Patch the Marshal.ReadInt32() bug (original is ReadByte()).
    Change any casts from IntPtr->int to IntPtr->long to make the code 64-bit safe.

    I’m working on adding adjustable Floyd-Steinberg dithering to the version in my resizer, and
    the results have been very promising so far.

  6. Inheriting the palette from the original image. While at first this seems like an
    easy way to solve the palette problem for GIFs, realize that the bicubic
    resizing will have combined colors, and the new thumbnail may not have any of the colors
    of the original image. Also, any operations performed on the bitmap in 8-bit mode
    will be poor quality, and this won’t allow conversion between image formats. There are
    other ways to keep transparency. This is probably better than leaving the default palette, but YMMV.
  7. Resizing images that don’t request it. Your code should only activate when an image has a querystring with one of the supported
    commands. Pushing all images through your code is unnecessary.
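
    A small sketch of that check as a standalone function. It assumes the only commands are the maxwidth and maxheight parameters from the URL example earlier; the class and method names are hypothetical, and inside ASP.NET you would more likely inspect Request.QueryString than parse the string yourself:

    ```csharp
    using System;

    public static class ResizeRequestDetector
    {
        // Supported command names (assumed here; extend to match your resizer).
        private static readonly string[] Commands = { "maxwidth", "maxheight" };

        // Returns true only if the querystring contains at least one supported command as a key.
        // Keys are compared as-is, without URL-decoding.
        public static bool HasResizeCommand(string queryString)
        {
            if (string.IsNullOrEmpty(queryString)) return false;

            string[] pairs = queryString.TrimStart('?').Split('&');
            foreach (string pair in pairs)
            {
                int eq = pair.IndexOf('=');
                string key = (eq >= 0) ? pair.Substring(0, eq) : pair;
                foreach (string command in Commands)
                {
                    if (key.Equals(command, StringComparison.OrdinalIgnoreCase)) return true;
                }
            }
            return false;
        }
    }
    ```

    Requests that fail this check can be left alone entirely, so plain images keep flowing through the normal static file pipeline.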
  8. And one last piece of advice. Have Good Defaults. Always.

    The output image type should default to the source image type, unless it’s a BMP or TIFF.
    Default behavior should always preserve aspect ratio.

    Many developers stop after making their code configurable. They don’t take that extra 10 minutes to give
    everything smart defaults. Smart defaults distinguish good software from great software.

I hope to post Part 2 soon. I plan on revealing the architecture that has evolved in my resizer,
and how to design an IIS5/6/7-compatible HttpModule.
