<h1><a href="https://www.endpointdev.com/blog/tags/compression/">End Point Dev: posts tagged “compression”</a></h1>
<h2><a href="https://www.endpointdev.com/blog/2022/03/optimizing-image-delivery-with-cloudinary/">Optimizing media delivery with Cloudinary</a></h2>
<p>By Juan Pablo Ventoso · March 1, 2022</p>
<p><img src="/blog/2022/03/optimizing-image-delivery-with-cloudinary/la-cumbrecita-202201.webp" alt="Beautiful cloudy mountain scene with river flowing by lush banks with people swimming, relaxing, and walking towards multistory buildings"></p>
<!-- Photo by Juan Pablo Ventoso -->
<p>I remember how we needed to deal with different image formats and sizes years ago: From using the WordPress-style approach of automatically saving different resolutions on the server when uploading a picture, to using a PHP script to resize or crop images on the fly and return the result as a response to the frontend. Of course, many of those approaches were expensive, and not fully optimized for different browsers or device sizes.</p>
<p>With those experiences in mind, it was a nice surprise to discover <a href="https://cloudinary.com/">Cloudinary</a> while working on a new project a couple of months ago. It’s a cloud service that stores, transforms, and delivers media content, with many management options for us to use. <a href="https://cloudinary.com/pricing">There is a free version</a> with a usage limit: Up to 25K transformations or 25 GB of storage/bandwidth, which should be enough for most non-enterprise websites. The cheapest paid plan is $99 per month.</p>
<p>Here’s a list of the image features we used on that project. I know they offer many other things that can be used as well, but I think this is a good start for anyone who hasn’t used this service yet:</p>
<h3 id="resizing-and-cropping">Resizing and cropping</h3>
<p>When you make a request for an image, you can instruct the Cloudinary API to <a href="https://cloudinary.com/documentation/resizing_and_cropping">retrieve it with a given size</a>, which will trigger a transformation on their end before delivering the content. You can also use a cropping method: fill, fill with padding, scale down, etc.</p>
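<p>Transformations are encoded directly in the delivery URL. As a hedged illustration (the <code>demo</code> cloud name and the <code>sample</code> public ID below are placeholders), requesting a 300×200 fill crop would look something like this:</p>
<pre tabindex="0"><code>https://res.cloudinary.com/demo/image/upload/w_300,h_200,c_fill/sample.jpg
</code></pre>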
<h3 id="gravity-position">Gravity position</h3>
<p>When we specify a <a href="https://cloudinary.com/documentation/resizing_and_cropping#control_gravity">gravity position</a> to crop an image, the service will keep the area of the image we decide to use as the focal point. We can choose a corner (for example, top left), but also—and this is probably one of the most interesting capabilities on this service—we can specify <a href="https://cloudinary.com/documentation/transformation_reference#g_special_position">“special positions”</a>: By using machine learning, we can instruct Cloudinary to use face detection, or even focus on other objects, like an animal or a flower in the picture.</p>
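<p>In URL form, gravity is just another transformation parameter. A sketch with the same placeholder names: <code>g_face</code> centers the crop on a detected face, while <code>g_auto:subject</code> lets the service pick the most interesting subject:</p>
<pre tabindex="0"><code>https://res.cloudinary.com/demo/image/upload/w_300,h_300,c_fill,g_face/sample.jpg
https://res.cloudinary.com/demo/image/upload/w_300,h_300,c_fill,g_auto:subject/sample.jpg
</code></pre>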
<h3 id="automatic-format">Automatic format</h3>
<p>Another cool feature is the <a href="https://cloudinary.com/documentation/transformation_reference#f_auto">automatic format</a>, which will use your request headers to find the most efficient picture format for your browser type and version. For example, if the browser supports it, Cloudinary will return the image in WebP format, which is generally more efficient than standard JPEG, as End Point CTO Jon Jensen demonstrates in his recent <a href="https://www.endpointdev.com/blog/2022/02/webp-heif-avif-jpegxl/">blog post</a>.</p>
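<p>In the delivery URL this is the <code>f_auto</code> parameter, often paired with <code>q_auto</code> for automatic quality selection (again a sketch with placeholder names):</p>
<pre tabindex="0"><code>https://res.cloudinary.com/demo/image/upload/w_300,h_200,c_fill,f_auto,q_auto/sample.jpg
</code></pre>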
<p><img src="/blog/2022/03/optimizing-image-delivery-with-cloudinary/image-response.jpg" alt="Screenshot of Chrome browser dev tools showing network response for a WebP image"><br>
Automatic format in action: Returning a WebP image in Chrome</p>
<h3 id="other-features">Other features</h3>
<p>There are many other options to choose from when querying their API, like setting up a default placeholder for missing images, applying color transformations, or removing red eyes. The <a href="https://cloudinary.com/documentation/transformation_reference">Transformation reference page</a> in their documentation is a great resource.</p>
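<p>As one hedged example (public IDs are placeholders): the <code>d_</code> parameter serves a default image when the requested one doesn’t exist, and the <code>e_redeye</code> effect removes red eyes:</p>
<pre tabindex="0"><code>https://res.cloudinary.com/demo/image/upload/d_placeholder.png/missing-image.jpg
https://res.cloudinary.com/demo/image/upload/e_redeye/portrait.jpg
</code></pre>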
<h3 id="nuxtjs-integration">NuxtJS integration</h3>
<p>The project I mentioned above was a <a href="https://nuxtjs.org/">NuxtJS</a> application with a <a href="https://nodejs.org/">Node.js</a> backend. And since there’s a <a href="https://cloudinary.nuxtjs.org/">NuxtJS module for Cloudinary</a>, it made sense to use it instead of building the queries to the API from scratch.</p>
<p>The component works great, except for one bug we found that prevented us from fully using their image component with server-side rendering enabled. Between that drawback and some issues with the lazy loading setting, we ended up creating our own Vue component that uses a standard image tag instead. But we still used their component to generate most of the API calls and render the results.</p>
<p>Below is an example of using the Cloudinary Image component on a Vue template:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-html" data-lang="html"><<span style="color:#b06;font-weight:bold">template</span>>
<<span style="color:#b06;font-weight:bold">div</span>>
<<span style="color:#b06;font-weight:bold">cld-image</span>
<span style="color:#369">:public-id</span>=<span style="color:#d20;background-color:#fff0f0">"publicId"</span>
<span style="color:#369">width</span>=<span style="color:#d20;background-color:#fff0f0">"200"</span>
<span style="color:#369">height</span>=<span style="color:#d20;background-color:#fff0f0">"200"</span>
<span style="color:#369">crop</span>=<span style="color:#d20;background-color:#fff0f0">"fill"</span>
<span style="color:#369">gravity</span>=<span style="color:#d20;background-color:#fff0f0">"auto:subject"</span>
<span style="color:#369">radius</span>=<span style="color:#d20;background-color:#fff0f0">"max"</span>
<span style="color:#369">fetchFormat</span>=<span style="color:#d20;background-color:#fff0f0">"auto"</span>
<span style="color:#369">quality</span>=<span style="color:#d20;background-color:#fff0f0">"auto"</span>
<span style="color:#369">alt</span>=<span style="color:#d20;background-color:#fff0f0">"An image example with Cloudinary"</span>
/>
</<span style="color:#b06;font-weight:bold">div</span>>
</<span style="color:#b06;font-weight:bold">template</span>>
</code></pre></div><h3 id="alternatives">Alternatives</h3>
<p>Of course, Cloudinary is not the only image processing and CDN service out there: Other companies offer similar products, like <a href="https://www.cloudflare.com/products/cloudflare-images/">Cloudflare Images</a>, <a href="https://www.cloudimage.io/">Cloudimage</a>, or <a href="https://imagekit.io/">ImageKit</a>.</p>
<p>Do you know any other good alternatives, or have you used any other Cloudinary feature that is not listed here? Feel free to add a comment below!</p>
<h2><a href="https://www.endpointdev.com/blog/2022/02/webp-heif-avif-jpegxl/">Image compression: WebP presets, HEIC, AVIF, JPEG XL</a></h2>
<p>By Jon Jensen · February 15, 2022</p>
<p><img src="/blog/2022/02/webp-heif-avif-jpegxl/20211223-224644-sm.webp" alt="Rubble of a demolished house in front of a guilty-looking Caterpillar excavator"></p>
<!-- Photo by Jon Jensen -->
<p>How time flies. Eight years ago I wrote the blog post <a href="/blog/2014/01/webp-images-experiment-on-end-point/">WebP images experiment on End Point website</a> to describe and demonstrate how the WebP image format can store an equivalent-quality image in a much smaller file size than the older JPEG, PNG, and GIF formats.</p>
<p>My WebP examples there were 17–23% of the size of the JPEGs they came from, or about 4–6× smaller. While experimenting with higher levels of compression, I found that WebP tends to leave less-noticeable artifacts than JPEG does.</p>
<p>The main drawback at the time was that, among major browsers, only Chrome and Opera supported WebP, and back then, Chrome was far less popular than it is now.</p>
<h3 id="can-i-use-it">Can I use it?</h3>
<p>Since Apple’s iOS 14 and macOS 11 (Big Sur) became available in late 2020, the WebP image format now works in all the currently supported major operating systems and browsers: Linux, Windows, and macOS, running Chromium, Chrome, Brave, Edge, Opera, Firefox, and Safari. You can see the specifics at the ever-useful site <a href="https://www.caniuse.com/webp">“Can I use”</a>.</p>
<p>It only took about 10 years! 😁</p>
<p>So for those of you hosting websites, all your site visitors can see WebP images and animations, except the vanishing few using Internet Explorer (now long past its end of support by Microsoft and dangerous to use) and people (or their organizations) who intentionally keep their older browsers and operating systems from being updated.</p>
<p>That means you may want to continue using JPEG, PNG, and/or GIF images in places crucial for rendering your main site features, so that everyone can understand the main things your site is trying to communicate.</p>
<p>But for any images that are less essential and primarily making it prettier, or where you don’t mind suggesting that users of old browsers upgrade to see them, WebP can now be your default image format.</p>
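<p>And if you want WebP now with a fallback for the remaining older browsers, the standard HTML <code>&lt;picture&gt;</code> element lets the browser negotiate the format itself (a minimal sketch; the filenames are placeholders):</p>
<pre tabindex="0"><code>&lt;picture&gt;
  &lt;source srcset="photo.webp" type="image/webp"&gt;
  &lt;img src="photo.jpg" alt="Description of the photo"&gt;
&lt;/picture&gt;
</code></pre>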
<h3 id="how-do-i-create-webp-images">How do I create WebP images?</h3>
<p>Mobile phones and digital cameras typically save JPEG, HEIF, or raw (uncompressed) images. Some stock photography collections offer WebP downloads, but many still use JPEG.</p>
<p>So in many cases you won’t be starting with a WebP image, and you’ll convert some other image to WebP, likely after cropping, scaling, and other adjustments.</p>
<p>GIMP (GNU Image Manipulation Program) has supported WebP images since about 2017, and Adobe Photoshop does not support it natively but can use the <a href="https://developers.google.com/speed/webp/docs/webpshop">free WebPShop plugin</a>.</p>
<p>The oldest way to convert images to WebP, and still very useful for batch processing or fine-tuning, is <a href="https://developers.google.com/speed/webp">Google’s WebP converter “cwebp”</a>.</p>
<h3 id="cwebp-settings">cwebp settings</h3>
<p>With the power “cwebp” offers comes some complexity, but it is mostly harmless.</p>
<p>Run this to see its many options:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-sh" data-lang="sh">cwebp -longhelp
</code></pre></div><p>Or read the same thing in its <a href="https://developers.google.com/speed/webp/docs/cwebp">online documentation</a>.</p>
<p>There, among other things, you will see the <code>-z</code> option, which activates preset features for lossless encoding, with an integer level chosen from 0 to 9, where 0 is fastest but compresses less and 9 is slowest but compresses best. Use this to replace PNG files when you want no degradation of the image at all.</p>
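<p>For example, a lossless conversion at the slowest/best level (the filenames here are placeholders):</p>
<pre tabindex="0"><code>cwebp -z 9 diagram.png -o diagram.webp
</code></pre>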
<p>The documentation also shows the useful <code>-preset</code> and <code>-hint</code> options for lossy compression similar to what JPEG does, but better:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">-preset <string> ....... preset setting, one of:
default, photo, picture,
drawing, icon, text
-hint <string> ......... specify image characteristics hint,
one of: photo, picture or graph
</code></pre></div><p>The meaning of a few of those terms, especially “photo” and “picture”, was not clear to me and not defined elsewhere in the documentation that I could see.</p>
<p>To find that out I had to take a quick trip into the source code, where there are comments explaining each option’s use case a bit.</p>
<p>The comments for the <a href="https://chromium.googlesource.com/webm/libwebp/+/refs/heads/main/src/webp/encode.h#159"><code>-preset</code> option</a>:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">WEBP_PRESET_PICTURE, // digital picture, like portrait, inner shot
WEBP_PRESET_PHOTO, // outdoor photograph, with natural lighting
WEBP_PRESET_DRAWING, // hand or line drawing, with high-contrast details
WEBP_PRESET_ICON, // small-sized colorful images
WEBP_PRESET_TEXT // text-like
</code></pre></div><p>And the comments for the <a href="https://chromium.googlesource.com/webm/libwebp/+/refs/heads/main/src/webp/encode.h#88"><code>-hint</code> option</a>:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">WEBP_HINT_PICTURE, // digital picture, like portrait, inner shot
WEBP_HINT_PHOTO, // outdoor photograph, with natural lighting
WEBP_HINT_GRAPH, // Discrete tone image (graph, map-tile etc).
</code></pre></div><p>So in short, “cwebp” considers a “picture” to be indoors and close-up, while a “photo” is outdoors, likely with more distant focus. That’s good to know.</p>
<h3 id="batch-conversion">Batch conversion</h3>
<p>With that in mind, I can convert a pile of screenshots that have been collecting on my computer for later reference. One kind of screenshot I sometimes take is of video meetings, with mostly indoor views of people. I will use the “picture” preset for those.</p>
<p>A simple <code>bash</code> script works well to process many images in a row:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash"><span style="color:#080;font-weight:bold">for</span> infile in Screenshot*.png
<span style="color:#080;font-weight:bold">do</span>
<span style="color:#038">echo</span> <span style="color:#369">$infile</span>
<span style="color:#369">base</span>=<span style="color:#080;font-weight:bold">$(</span>basename <span style="color:#d20;background-color:#fff0f0">"</span><span style="color:#369">$infile</span><span style="color:#d20;background-color:#fff0f0">"</span> .png<span style="color:#080;font-weight:bold">)</span>
cwebp -preset picture -v <span style="color:#d20;background-color:#fff0f0">"</span><span style="color:#369">$infile</span><span style="color:#d20;background-color:#fff0f0">"</span> -o <span style="color:#d20;background-color:#fff0f0">"</span><span style="color:#369">$base</span><span style="color:#d20;background-color:#fff0f0">"</span>.webp
<span style="color:#080;font-weight:bold">done</span>
</code></pre></div><p>If you have many images to convert to WebP and want to do several at the same time to get done faster, you can use <a href="https://www.gnu.org/software/parallel/">GNU parallel</a>.</p>
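<p>For instance, the loop above could be condensed into a single line (a sketch; <code>{.}</code> is GNU parallel’s input-filename-without-extension placeholder):</p>
<pre tabindex="0"><code>parallel 'cwebp -preset picture -v {} -o {.}.webp' ::: Screenshot*.png
</code></pre>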
<p><strong>My screenshots when converted from PNG to WebP consistently take about 3% of the original space, 33–35× smaller! And the quality looks about the same. Amazing.</strong></p>
<p>These screenshots are 2880×1800 pixels, mostly of Google Meet low-bandwidth video streams. The originals of these screenshots don’t look particularly good to begin with, with some blurriness. But exactly because of this, there is no reason for me to keep a larger high-quality original here. The much smaller WebP is fine.</p>
<h3 id="competitors-to-webp">Competitors to WebP</h3>
<p>Other newer image formats have also been in the works for years, chasing some of the same goals. Should we skip WebP and use one of them instead?</p>
<h4 id="heic">HEIC</h4>
<p>The <a href="https://en.wikipedia.org/wiki/High_Efficiency_Image_File_Format#HEIC:_HEVC_in_HEIF">HEIC (High-Efficiency Image Container)</a> subset of the HEIF (High Efficiency Image File Format) standard uses High Efficiency Video Coding (HEVC, H.265) for storing images with a <code>.heic</code> suffix.</p>
<p>Compared to JPEG, HEIC offers the nice advantages of smaller file sizes for the same quality level (roughly half the size of an equivalent JPEG), and animation support (to replace GIF).</p>
<p>On the downside, HEIC is encumbered by patents that limit its use for major commercial projects, even on devices licensed for consumer use. It is also slower to encode/decode. And, concerning for archivists, HEIC shows severe visual damage to the entire image if part of the file is corrupted. In contrast, with corrupted JPEG files the visual damage is typically localized to particular smaller square regions, not the entire image.</p>
<p>Apple’s operating systems have used HEIC since 2017, when iOS 11 and macOS 10.13 (High Sierra) added support, with HEIC capture starting on iPhone 7 hardware. Support was later added to Windows 10, Android 9, and Ubuntu 20.04.</p>
<p>As of this writing, no major browsers <a href="https://caniuse.com/heif">support HEIC natively</a>, not even Apple’s own Safari.</p>
<p>So for now, HEIC is primarily used by Apple to store photos more efficiently on its mobile devices.</p>
<h4 id="avif">AVIF</h4>
<p>The <a href="https://en.wikipedia.org/wiki/AVIF">AVIF (AV1 Image File Format)</a> competes with HEIC and, confusingly, uses the same HEIF container file format that HEIC does. That confusion is reduced in practice by its use of the separate file extension <code>.avif</code>.</p>
<p><a href="https://www.caniuse.com/avif">AVIF is supported</a> in current Chrome, Firefox, and Opera. Support was added to WebKit in 2021, but it still has not made its way into Safari. It also works in newer VLC, GIMP, Windows, Android, etc.</p>
<p>Netflix has a <a href="https://netflixtechblog.com/avif-for-next-generation-image-coding-b1d75675fe4">very detailed blog post</a> comparing AVIF to JPEG and showing AVIF’s many advantages.</p>
<h4 id="jpeg-xl">JPEG XL</h4>
<p>A semi-compatible successor to JPEG has long been in the works, and JPEG XL seems likely to eventually fill that role.</p>
<p>Whereas the other new image formats mentioned above usually lose some quality when recompressing JPEG and other images that were already lossy-compressed, according to the <a href="https://jpeg.org/jpegxl/">Joint Photographic Experts Group (JPEG)</a>:</p>
<blockquote>
<p>Existing JPEG files can be losslessly transcoded to JPEG XL, significantly reducing their size.</p>
</blockquote>
<p>It has been reported that JPEG XL is expected to become available in its final standard form in 2022, and support is already available in preliminary form in some software (see the <a href="https://en.wikipedia.org/wiki/JPEG_XL">Wikipedia JPEG XL article</a>).</p>
<p>That of course means that for now, no major browsers <a href="https://caniuse.com/jpegxl">support JPEG XL natively</a>.</p>
<h3 id="use-webp-now">Use WebP now</h3>
<p>So all the other new options are not yet usable for general web images. WebP is the current obvious choice, whether you want a lossy replacement for JPEG photos, a lossless replacement for PNG images, or a replacement for GIF animations.</p>
<p>Our developers have set up automatic server-side app conversion of high-quality PNG originals to WebP or JPEG on the fly, with the image size dependent on the browser viewport size. And we have worked with Cloudinary, Cloudflare, and other CDNs to use their image conversion services. We are available to help with your projects too.</p>
<h3 id="reference">Reference</h3>
<p>The Mozilla Developer Network (MDN) has excellent documentation of web image format details, filename suffixes, and support in the major browsers in its <a href="https://developer.mozilla.org/en-US/docs/Web/Media/Formats/Image_types">Image file type and format guide</a>.</p>
<h2><a href="https://www.endpointdev.com/blog/2020/01/decreasing-website-load-time/">Decreasing your website load time</a></h2>
<p>By Juan Pablo Ventoso · January 7, 2020</p>
<p><img src="/blog/2020/01/decreasing-website-load-time/mobile-desktop-browsing.jpg" alt="Decreasing our website load time" /> <a href="https://www.flickr.com/photos/johanl/6798184016/">Photo</a> by <a href="https://www.flickr.com/photos/johanl/">Johan Larsson</a>, used under <a href="https://creativecommons.org/licenses/by/2.0/">CC BY 2.0</a></p>
<p>We live in a competitive world, and the web is no different. Addressing latency issues is crucial to any Search Engine Optimization (SEO) strategy, increasing the website’s ranking and organic traffic (visitors from search engines) as a result.</p>
<p>There are many factors that can lead to a faster response time, including optimizing your hosting plan, keeping the server close to your main traffic source, or using a Content Distribution Network (CDN) if you are expecting international visitors. Some of these solutions, and many others, can be implemented with only a couple of hours of coding.</p>
<h3 id="inline-styles-and-scripts-for-the-topmost-content">Inline styles and scripts for the topmost content</h3>
<p>Nobody enjoys waiting for long load times. When opening a Google search link, being met with a blank page or a loading GIF for several seconds can seem agonizing. That’s why optimizing the initial rendering of your page is crucial.</p>
<p>The content that immediately appears to the user without the need to scroll down is referred to as “above-the-fold”. This is where your optimization efforts should be aimed. So here’s a plan to load and display it as quickly as possible:</p>
<ul>
<li>
<p>First, identify the critical styles and scripts you need to render the topmost content, and separate them from the rest of your stylesheets and external script references.</p>
</li>
<li>
<p>Then, <a href="https://www.imperva.com/learn/performance/minification/">minify</a> the separated <a href="https://csscompressor.com/">styles</a> and <a href="https://jscompress.com/">scripts</a>, and insert them directly into your page template, right before the closing <code>&lt;/head&gt;</code> tag.</p>
</li>
<li>
<p>Finally, take the stylesheet and script link references out of the <code>&lt;head&gt;</code> tag (where they’re usually located) and move them to the end of the above-the-fold content.</p>
</li>
</ul>
<p>Now, the user won’t have to wait until all references are loaded before seeing content. <b>Tip</b>: Remember to use the <a href="https://developer.mozilla.org/en-US/docs/Web/HTML/Element/script#attr-async">async</a> attribute on scripts whenever possible.</p>
<p><strong>example.html:</strong></p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-html" data-lang="html"><<span style="color:#b06;font-weight:bold">head</span>>
&lt;<span style="color:#b06;font-weight:bold">style</span>&gt;{<span style="color:#a61717;background-color:#e3d2d2">above-the-fold</span> <span style="color:#a61717;background-color:#e3d2d2">minified</span> <span style="color:#a61717;background-color:#e3d2d2">inline</span> <span style="color:#a61717;background-color:#e3d2d2">styles</span> <span style="color:#a61717;background-color:#e3d2d2">go</span> <span style="color:#a61717;background-color:#e3d2d2">here</span>}&lt;/<span style="color:#b06;font-weight:bold">style</span>&gt;
&lt;<span style="color:#b06;font-weight:bold">script</span> <span style="color:#369">type</span>=<span style="color:#d20;background-color:#fff0f0">"text/javascript"</span>&gt;{above-the-fold critical scripts go here}&lt;/<span style="color:#b06;font-weight:bold">script</span>&gt;
</<span style="color:#b06;font-weight:bold">head</span>>
<<span style="color:#b06;font-weight:bold">body</span>>
<<span style="color:#b06;font-weight:bold">div</span> <span style="color:#369">class</span>=<span style="color:#d20;background-color:#fff0f0">"above-the-fold-content"</span>></<span style="color:#b06;font-weight:bold">div</span>>
<<span style="color:#b06;font-weight:bold">link</span> <span style="color:#369">rel</span>=<span style="color:#d20;background-color:#fff0f0">"stylesheet"</span> <span style="color:#369">href</span>=<span style="color:#d20;background-color:#fff0f0">"{below-the-fold minified stylesheet reference goes here}"</span> />
&lt;<span style="color:#b06;font-weight:bold">script</span> <span style="color:#369">async</span> <span style="color:#369">src</span>=<span style="color:#d20;background-color:#fff0f0">"{below-the-fold minified javascript reference goes here}"</span>&gt;&lt;/<span style="color:#b06;font-weight:bold">script</span>&gt;
<<span style="color:#b06;font-weight:bold">div</span> <span style="color:#369">class</span>=<span style="color:#d20;background-color:#fff0f0">"below-the-fold-content"</span>></<span style="color:#b06;font-weight:bold">div</span>>
</<span style="color:#b06;font-weight:bold">body</span>>
</code></pre></div><h3 id="deferred-loading-of-ads">Deferred loading of ads</h3>
<p>If you’re monetizing your website through Google AdSense or another ad agency that uses scripts to load ads, consider loading ads after the content is fully rendered. This may have a small impact on your revenue, but will improve the user’s experience while optimizing the load speed.</p>
<p>Although there are several ways to achieve this, a technique I have successfully used on many websites is removing all of the Google AdSense script references from the page and loading them only after the page has fully loaded. A short delay can be added as well, to allow some browsing time before showing ads.</p>
<p>Remove script references, the comment, and extra spaces from your original ad code, to convert it from something like this…</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-html" data-lang="html"><<span style="color:#b06;font-weight:bold">script</span> <span style="color:#369">async</span> <span style="color:#369">src</span>=<span style="color:#d20;background-color:#fff0f0">"https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"</span>></<span style="color:#b06;font-weight:bold">script</span>>
<span style="color:#888"><!-- Your ad name --></span>
<<span style="color:#b06;font-weight:bold">ins</span> <span style="color:#369">class</span>=<span style="color:#d20;background-color:#fff0f0">"adsbygoogle"</span>
<span style="color:#369">style</span>=<span style="color:#d20;background-color:#fff0f0">"display:inline-block;width:728px;height:90px"</span>
<span style="color:#369">data-ad-client</span>=<span style="color:#d20;background-color:#fff0f0">"ca-pub-XXXXXXXXXXXXXXXXX"</span>
<span style="color:#369">data-ad-slot</span>=<span style="color:#d20;background-color:#fff0f0">"XXXXXXXXX"</span>></<span style="color:#b06;font-weight:bold">ins</span>>
<<span style="color:#b06;font-weight:bold">script</span>>
(adsbygoogle = <span style="color:#038">window</span>.adsbygoogle || []).push({});
</<span style="color:#b06;font-weight:bold">script</span>>
</code></pre></div><p>… to something like this:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-html" data-lang="html"><<span style="color:#b06;font-weight:bold">ins</span> <span style="color:#369">class</span>=<span style="color:#d20;background-color:#fff0f0">"adsbygoogle"</span> <span style="color:#369">style</span>=<span style="color:#d20;background-color:#fff0f0">"display:inline-block;width:728px;height:90px"</span> <span style="color:#369">data-ad-client</span>=<span style="color:#d20;background-color:#fff0f0">"ca-pub-XXXXXXXXXXXXXXXXX"</span> <span style="color:#369">data-ad-slot</span>=<span style="color:#d20;background-color:#fff0f0">"XXXXXXXXX"</span>></<span style="color:#b06;font-weight:bold">ins</span>>
</code></pre></div><p>A lot shorter, isn’t it? This will create an empty slot in which the ad will be displayed after the page is fully rendered. To accomplish that, a new script like the one below must be added (assuming jQuery is present on the website):</p>
<p><strong>async-ads.js:</strong></p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-javascript" data-lang="javascript"><span style="color:#888">// Create a script reference
</span><span style="color:#888"></span><span style="color:#080;font-weight:bold">function</span> addScript(src, <span style="color:#080;font-weight:bold">async</span>, callback) {
<span style="color:#080;font-weight:bold">var</span> js = <span style="color:#038">document</span>.createElement(<span style="color:#d20;background-color:#fff0f0">"script"</span>);
js.type = <span style="color:#d20;background-color:#fff0f0">"text/javascript"</span>;
<span style="color:#080;font-weight:bold">if</span> (<span style="color:#080;font-weight:bold">async</span>)
js.<span style="color:#080;font-weight:bold">async</span> = <span style="color:#080;font-weight:bold">true</span>;
<span style="color:#080;font-weight:bold">if</span> (callback)
js.onload = callback;
js.src = src;
<span style="color:#038">document</span>.body.appendChild(js);
}
<span style="color:#888">// Called when document is ready
</span><span style="color:#888"></span>$(<span style="color:#038">document</span>).ready(<span style="color:#080;font-weight:bold">function</span>() {
<span style="color:#888">// Wait for one second to ensure the user started browsing
</span><span style="color:#888"></span> setTimeout(<span style="color:#080;font-weight:bold">function</span>() {
(adsbygoogle = <span style="color:#038">window</span>.adsbygoogle || []);
$(<span style="color:#d20;background-color:#fff0f0">"ins.adsbygoogle"</span>).each(<span style="color:#080;font-weight:bold">function</span>() {
$(<span style="color:#d20;background-color:#fff0f0">"<script>(adsbygoogle = window.adsbygoogle || []).push({})</script>"</span>).insertAfter($(<span style="color:#080;font-weight:bold">this</span>));
});
addScript(<span style="color:#d20;background-color:#fff0f0">"https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"</span>, <span style="color:#080;font-weight:bold">true</span>);
}, <span style="color:#00d;font-weight:bold">1000</span>);
});
</code></pre></div><p>This code will wait for one second once the document is ready, and then leave instructions for Google to push a new ad for each slot. Finally, the AdSense external script will be loaded so that Google will read the instructions and start filling all the slots with ads.</p>
<p><b>Tip</b>: Enabling balancing from your AdSense dashboard may improve the average load speed as well as the user’s experience, since ads will not be shown when the expected revenue is negligible. And if you’re still on the fence about showing fewer ads, <a href="https://fatstacksblog.com/adsense-ad-balance-experiment/">try out an experiment</a> like I did. A balance of 50% worked well in my case, but the right balance will depend on your niche and website characteristics.</p>
<h3 id="lazy-load-for-images">Lazy load for images</h3>
<p>Because the user will most likely spend the majority of the visit reading above-the-fold content (and may even leave before scrolling at all), loading all of the below-the-fold images up front is impractical. Implementing a custom lazy-loading script (also referred to as deferred loading or loading on scroll) for images can be an easy process. Even though backend changes will likely be needed, the concept of this approach is simple:</p>
<ul>
<li>
<p>Replacing the <code>src</code> attribute of every image that will be lazy-loaded with a custom attribute such as <code>data-src</code> (this part will probably require backend changes), and setting a custom class on them, like <code>lazy</code>.</p>
</li>
<li>
<p>Creating a script that will copy the <code>data-src</code> content into the <code>src</code> attribute as we scroll through the page.</p>
</li>
</ul>
<p><strong>lazy-load.js:</strong></p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-javascript" data-lang="javascript">;(<span style="color:#080;font-weight:bold">function</span>($) {
$.fn.lazy = <span style="color:#080;font-weight:bold">function</span>(threshold, callback) {
<span style="color:#080;font-weight:bold">var</span> $w = $(<span style="color:#038">window</span>),
th = threshold || <span style="color:#00d;font-weight:bold">0</span>,
attrib = <span style="color:#d20;background-color:#fff0f0">"data-src"</span>,
images = <span style="color:#080;font-weight:bold">this</span>,
loaded;
<span style="color:#080;font-weight:bold">this</span>.one(<span style="color:#d20;background-color:#fff0f0">"lazy"</span>, <span style="color:#080;font-weight:bold">function</span>() {
<span style="color:#080;font-weight:bold">var</span> source = <span style="color:#080;font-weight:bold">this</span>.getAttribute(attrib);
source = source || <span style="color:#080;font-weight:bold">this</span>.getAttribute(<span style="color:#d20;background-color:#fff0f0">"data-src"</span>);
<span style="color:#080;font-weight:bold">if</span> (source) {
<span style="color:#080;font-weight:bold">this</span>.setAttribute(<span style="color:#d20;background-color:#fff0f0">"src"</span>, source);
<span style="color:#080;font-weight:bold">if</span> (<span style="color:#080;font-weight:bold">typeof</span> callback === <span style="color:#d20;background-color:#fff0f0">"function"</span>) callback.call(<span style="color:#080;font-weight:bold">this</span>);
}
});
<span style="color:#080;font-weight:bold">function</span> lazy() {
<span style="color:#080;font-weight:bold">var</span> inview = images.filter(<span style="color:#080;font-weight:bold">function</span>() {
<span style="color:#080;font-weight:bold">var</span> $e = $(<span style="color:#080;font-weight:bold">this</span>);
<span style="color:#080;font-weight:bold">if</span> ($e.is(<span style="color:#d20;background-color:#fff0f0">":hidden"</span>)) <span style="color:#080;font-weight:bold">return</span>;
<span style="color:#080;font-weight:bold">var</span> wt = $w.scrollTop(),
wb = wt + $w.height(),
et = $e.offset().top,
eb = et + $e.height();
<span style="color:#080;font-weight:bold">return</span> eb >= wt - th && et <= wb + th;
});
loaded = inview.trigger(<span style="color:#d20;background-color:#fff0f0">"lazy"</span>);
images = images.not(loaded);
}
$w.scroll(lazy);
$w.resize(lazy);
lazy();
<span style="color:#080;font-weight:bold">return</span> <span style="color:#080;font-weight:bold">this</span>;
};
})(<span style="color:#038">window</span>.jQuery);
$(<span style="color:#038">document</span>).ready(<span style="color:#080;font-weight:bold">function</span>() {
$(<span style="color:#d20;background-color:#fff0f0">'.lazy'</span>).each(<span style="color:#080;font-weight:bold">function</span> () {
$(<span style="color:#080;font-weight:bold">this</span>).lazy(<span style="color:#00d;font-weight:bold">0</span>, <span style="color:#080;font-weight:bold">function</span>() {
$(<span style="color:#080;font-weight:bold">this</span>).load(<span style="color:#080;font-weight:bold">function</span>() {
<span style="color:#080;font-weight:bold">this</span>.style.opacity = <span style="color:#00d;font-weight:bold">1</span>;
});
});
});
<span style="color:#888">// Set the correct attribute when printing
</span><span style="color:#888"></span><span style="color:#080;font-weight:bold">var</span> beforePrint = <span style="color:#080;font-weight:bold">function</span>() {
$(<span style="color:#d20;background-color:#fff0f0">"img.lazy"</span>).each(<span style="color:#080;font-weight:bold">function</span>() {
$(<span style="color:#080;font-weight:bold">this</span>).trigger(<span style="color:#d20;background-color:#fff0f0">"lazy"</span>);
<span style="color:#080;font-weight:bold">this</span>.style.opacity = <span style="color:#00d;font-weight:bold">1</span>;
});
};
<span style="color:#080;font-weight:bold">if</span> (<span style="color:#038">window</span>.matchMedia) {
<span style="color:#080;font-weight:bold">var</span> mediaQueryList = <span style="color:#038">window</span>.matchMedia(<span style="color:#d20;background-color:#fff0f0">'print'</span>);
mediaQueryList.addListener(<span style="color:#080;font-weight:bold">function</span>(mql) {
<span style="color:#080;font-weight:bold">if</span> (mql.matches)
beforePrint();
});
}
<span style="color:#038">window</span>.onbeforeprint = beforePrint;
</code></pre></div><p>This script will search for all <code><img></code> tags with class <code>lazy</code>, and change the <code>data-src</code> attribute to the <code>src</code> attribute once the image becomes visible due to scrolling. It also includes some additional logic to set the <code>src</code> attribute before printing the page.</p>
<h3 id="server-side-caching">Server-side caching</h3>
<p>Instead of performing all the backend rendering calculations every time, server-side caching allows you to output the same content to the clients over a period of time from a temporary copy of the response. This not only results in a decreased response time but also saves some resources on the server.</p>
<p>There are several ways to enable server-side caching, depending on factors such as the backend language and hosting platform (e.g. Windows/IIS vs. Linux/Apache), among other things. For this example, we will use ASP.NET (C#) since I’m mostly a Windows user.</p>
<p>The simplest and most efficient way to do this is by adding a declaration at the top of our ASP.NET page:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-csharp" data-lang="csharp"><%<span style="color:#a61717;background-color:#e3d2d2">@</span> OutputCache Duration=<span style="color:#d20;background-color:#fff0f0">"10"</span> VaryByParam=<span style="color:#d20;background-color:#fff0f0">"id;date"</span> %>
</code></pre></div><p>This declaration tells the compiler that we want to cache the output from the server for 10 seconds (the <code>Duration</code> attribute is expressed in seconds), and that we will save different versions based on the <code>id</code> and <code>date</code> URL parameters. So pages like:</p>
<ul>
<li>https://www.your-url.com/cached-page/?id=1&date=2020-01-01</li>
<li>https://www.your-url.com/cached-page/?id=2&date=2020-01-01</li>
<li>https://www.your-url.com/cached-page/?id=2&date=2020-02-01</li>
</ul>
<p>will be saved and then served from different cache copies. If we only set the <code>id</code> parameter as a source for caching, pages with different dates will be served from the same cache copy (this can be useful when the <code>date</code> parameter is only evaluated by frontend scripts and ignored in the backend).</p>
<p>There are other configurations in ASP.NET to set our output cache policy. The output can be set to be based on the browser, the request headers, or even custom strings. <a href="https://www.c-sharpcorner.com/UploadFile/chinnasrihari/Asp-Net-mvc-framework-server-side-html-caching-techniques/">This page</a> has more useful information on this subject.</p>
<h3 id="gzip-compression">GZip compression</h3>
<p>GZip compression—when the client supports it—allows compressing the response before sending it over the network. In this way, often 70% or more of the bandwidth used for text-based content can be saved. Enabling GZip compression for dynamic and static content on a Windows Server with IIS is simple: Just go to the “Compression” section in the IIS Manager and check the options “Enable dynamic/static content compression”.</p>
<p><img src="/blog/2020/01/decreasing-website-load-time/enabling-compression-iis.jpg" alt="Enabling compression in IIS"></p>
<p>However, if you are running an ASP.NET MVC/WebForms website, this won’t be enough. For all backend responses to be compressed before sending them to the client, some custom code will also need to be added to the <code>global.asax</code> file in the website root:</p>
<p><strong>global.asax:</strong></p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-csharp" data-lang="csharp"><%<span style="color:#a61717;background-color:#e3d2d2">@</span> Application Language=<span style="color:#d20;background-color:#fff0f0">"C#"</span> %>
<script runat=<span style="color:#d20;background-color:#fff0f0">"server"</span>>
<span style="color:#080;font-weight:bold">void</span> Application_PreRequestHandlerExecute(<span style="color:#888;font-weight:bold">object</span> sender, EventArgs e)
{
HttpApplication app = sender <span style="color:#080;font-weight:bold">as</span> HttpApplication;
<span style="color:#888;font-weight:bold">string</span> acceptEncoding = app.Request.Headers[<span style="color:#d20;background-color:#fff0f0">"Accept-Encoding"</span>];
System.IO.Stream prevUncompressedStream = app.Response.Filter;
<span style="color:#080;font-weight:bold">if</span> (app.Context.CurrentHandler == <span style="color:#080;font-weight:bold">null</span>)
<span style="color:#080;font-weight:bold">return</span>;
<span style="color:#080;font-weight:bold">if</span> (!(app.Context.CurrentHandler <span style="color:#080;font-weight:bold">is</span> System.Web.UI.Page ||
app.Context.CurrentHandler.GetType().Name == <span style="color:#d20;background-color:#fff0f0">"SyncSessionlessHandler"</span>) ||
app.Request[<span style="color:#d20;background-color:#fff0f0">"HTTP_X_MICROSOFTAJAX"</span>] != <span style="color:#080;font-weight:bold">null</span>)
<span style="color:#080;font-weight:bold">return</span>;
<span style="color:#080;font-weight:bold">if</span> (acceptEncoding == <span style="color:#080;font-weight:bold">null</span> || acceptEncoding.Length == <span style="color:#00d;font-weight:bold">0</span>)
<span style="color:#080;font-weight:bold">return</span>;
<span style="color:#080;font-weight:bold">if</span> (Request.ServerVariables[<span style="color:#d20;background-color:#fff0f0">"SCRIPT_NAME"</span>].ToLower().Contains(<span style="color:#d20;background-color:#fff0f0">".axd"</span>)) <span style="color:#080;font-weight:bold">return</span>;
<span style="color:#080;font-weight:bold">if</span> (Request.ServerVariables[<span style="color:#d20;background-color:#fff0f0">"SCRIPT_NAME"</span>].ToLower().Contains(<span style="color:#d20;background-color:#fff0f0">".js"</span>)) <span style="color:#080;font-weight:bold">return</span>;
<span style="color:#080;font-weight:bold">if</span> (Request.QueryString.ToString().Contains(<span style="color:#d20;background-color:#fff0f0">"_TSM_HiddenField_"</span>)) <span style="color:#080;font-weight:bold">return</span>;
acceptEncoding = acceptEncoding.ToLower();
<span style="color:#080;font-weight:bold">if</span> (acceptEncoding.Contains(<span style="color:#d20;background-color:#fff0f0">"deflate"</span>) || acceptEncoding == <span style="color:#d20;background-color:#fff0f0">"*"</span>)
{
app.Response.Filter = <span style="color:#080;font-weight:bold">new</span> System.IO.Compression.DeflateStream(prevUncompressedStream,
System.IO.Compression.CompressionMode.Compress);
app.Response.AppendHeader(<span style="color:#d20;background-color:#fff0f0">"Content-Encoding"</span>, <span style="color:#d20;background-color:#fff0f0">"deflate"</span>);
}
<span style="color:#080;font-weight:bold">else</span> <span style="color:#080;font-weight:bold">if</span> (acceptEncoding.Contains(<span style="color:#d20;background-color:#fff0f0">"gzip"</span>))
{
app.Response.Filter = <span style="color:#080;font-weight:bold">new</span> System.IO.Compression.GZipStream(prevUncompressedStream,
System.IO.Compression.CompressionMode.Compress);
app.Response.AppendHeader(<span style="color:#d20;background-color:#fff0f0">"Content-Encoding"</span>, <span style="color:#d20;background-color:#fff0f0">"gzip"</span>);
}
}
</script>
</code></pre></div><p>To make sure our code is working properly, an external tool like <a href="https://www.giftofspeed.com/gzip-test/">this one</a> will tell you whether GZip is enabled or not.</p>
<p><img src="/blog/2020/01/decreasing-website-load-time/gzip-compression-enabled.jpg" alt="It works!"></p>
<h3 id="summary">Summary</h3>
<p>While there are many ways of decreasing the load time of a website, most of them are complex or expensive to implement. However, with a few minor tweaks, we can offer a better user experience in addition to improving our position in the search engine results. Every bit of optimization counts towards the SEO goal. Load time is a very important factor (to both the developer and the user), especially on mobile platforms where users expect to get what they want instantly.</p>
<p>The image below is a Google Analytics report from one of my websites where, over several months, I implemented most of these techniques. A month ago, I made the latest change, deferring ad loading, which had an observable impact on the average loading speed of the page:</p>
<p><img src="/blog/2020/01/decreasing-website-load-time/analytics-average-page-load.jpg" alt="Report from Google Analytics"></p>
<p>Do you have any other page load optimization techniques? <b>Leave a comment below!</b></p>
<h2><a href="https://www.endpointdev.com/blog/2018/12/roundup-of-some-useful-websites/">Roundup of some useful websites</a></h2>
<p>By Jon Jensen · December 21, 2018</p>
<p><a href="/blog/2018/12/roundup-of-some-useful-websites/squoosh-demo-20181220b.png"><img src="/blog/2018/12/roundup-of-some-useful-websites/squoosh-demo-20181220a.jpg" alt="Screenshot of the Squoosh image compressor comparing two versions of a photo" /></a></p>
<p>The world is a big place, and the Internet has gotten pretty big too. There are always new projects being created, and I want to share some useful and interesting ones from my growing list:</p>
<h3 id="squoosh-image-compressor">Squoosh image compressor</h3>
<p>Squoosh, hosted at <a href="https://squoosh.app/">squoosh.app</a>, is an open source in-browser tool for experimenting with image compression, made by the Chrome development team.</p>
<p>With Squoosh you can load an image in your browser, convert it to different image file formats (JPEG, WebP, PNG, BMP) using various compression algorithms and settings, and compare the result side-by-side with either the original image or the image compressed using other options.</p>
<p>The screenshot above demonstrates Squoosh running in Firefox 64 on Linux. Click on it to see a larger, lossless PNG screenshot. The photo was taken by my son Phin in northern Virginia, and is a typical imperfect mobile phone photo. On the left is the original, and on the right I am showing how bad gradients in the sky can look when compressed too much—maybe a quality level of 12 (out of 100) was too low. It does make for a very compact file size, though. 😄</p>
<p>Squoosh’s interface has a convenient slider bar so you can compare any part of the two versions of the image side by side. You can zoom and pan the image as well.</p>
<p>It is neat to see JavaScript tools (in this case TypeScript specifically) doing work in the browser that has traditionally been done by native apps.</p>
<h3 id="nerd-fonts">Nerd Fonts</h3>
<p>If you want access to an amazing number of symbols in a font, check out <a href="https://nerdfonts.com/">nerdfonts.com</a>. There you can mix and match symbols from many popular developer-oriented fonts such as Font Awesome, Powerline Symbols, Material Design, etc.</p>
<p>I probably should have chosen some fun symbols to demonstrate it here, but I could tell that was a rabbit hole I would not soon emerge from!</p>
<h3 id="glotio-code-pastebin">glot.io code pastebin</h3>
<p>There are many public pastebins these days, but <a href="https://glot.io/">glot.io</a> distinguishes itself by allowing you to run real code on their server in nearly 50 languages.</p>
<p>It offers both public and private pastes, has an API, and is open source.</p>
<h3 id="firefox-send">Firefox Send</h3>
<p>Firefox Send at <a href="https://send.firefox.com/">send.firefox.com</a> is a browser-based service for securely sharing files temporarily, for only one download during a maximum of 24 hours.</p>
<p>Handy for keeping unwanted bloat out of email, chat, or shared file storage for ephemeral files.</p>
<h3 id="transfersh-command-line-file-sharing">transfer.sh command-line file sharing</h3>
<p>Similarly, <a href="https://transfer.sh/">transfer.sh</a> is a terminal-based file upload and download tool.</p>
<p>As a command-line tool it easily integrates with other standard tools, so you can pipe output from other programs directly to it. If you have sensitive data to share you don’t need to trust the service—you can pipe your data through gpg or some other encryption tool before it leaves your computer.</p>
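<p>For example, a hedged sketch of that workflow (the upload filename is arbitrary, and the service responds with a download URL):</p>
<pre tabindex="0"><code>gpg -c -o - secrets.txt | curl --upload-file - https://transfer.sh/secrets.txt.gpg
</code></pre>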
<p>transfer.sh is open source and can be self-hosted too.</p>
<p>It even has a Tor onion service so uploads and/or downloads can be as private as possible in hostile environments.</p>
<h3 id="doing-what-you-dont-want-to-do">Doing what you don’t want to do</h3>
<p>And finally, some timeless tips for making our human “software” work.</p>
<p>Often just one or two annoying little things can block us from making progress on larger projects that overall we really enjoy. How can you motivate yourself to push ahead when you have work that needs to be done, but you don’t want to do it?</p>
<p>Read the brief but helpful article <a href="https://zenhabits.net/unwanted/">10 Ways to Do What You Don’t Want to Do</a> by Leo Babauta to get some good ideas. A few of the points mentioned especially resonate with me:</p>
<ul>
<li>Why do I need to do it?</li>
<li>What is stopping me?</li>
<li>Embrace that it won’t be fun and do it anyway.</li>
<li>Set constraints.</li>
</ul>
<p>Then do at least a little bit of the work to get started. As our co-worker <a href="/team/mike-heins/">Mike Heins</a> has said to me on a few occasions over the years, you’ll never finish until you start.</p>
<h2><a href="https://www.endpointdev.com/blog/2017/03/postgres-wal-files-best-compression/">Postgres WAL files: best compression methods</a></h2>
<p>By Greg Sabino Mullane · March 28, 2017</p>
<div class="separator" style="clear: both; text-align: center; float:right"><a href="/blog/2017/03/postgres-wal-files-best-compression/image-0.jpeg" imageanchor="1" style="clear:float; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="/blog/2017/03/postgres-wal-files-best-compression/image-0.jpeg"/></a><br/><small><a href="https://flic.kr/p/9sa2bM">Turtle turtle</a> by WO1 Larry Olson from <a href="https://www.flickr.com/people/familymwr/?rb=1">US Army</a></small>
</div>
<p>The <a href="https://postgresql.org">PostgreSQL database system</a> uses the write-ahead logging method to ensure that
a log of changes is saved before being applied to the actual data. The log
files that are created are known as <a href="https://www.postgresql.org/docs/current/static/wal-intro.html">WAL (Write Ahead Log) files</a>, and by default
are 16 MB in size each. Although this is a small size, a busy system can generate hundreds
or thousands of these files per hour, at which point disk space becomes an issue.
Luckily, WAL files are extremely compressible. I examined different programs to
find one that offered the best compression (as indicated by a smaller size)
at the smallest cost (as indicated by wall clock time). All of the methods tested worked
better than the venerable gzip program, which is suggested in the Postgres
documentation for the <a href="https://www.postgresql.org/docs/9.6/static/continuous-archiving.html">archive_command option</a>. The best overall solution was using the <a href="http://manpages.ubuntu.com/manpages/trusty/man1/pxz.1.html">pxz program</a> inside the <a href="https://www.postgresql.org/docs/current/static/runtime-config-wal.html#GUC-ARCHIVE-COMMAND">archive_command setting</a>, followed closely by use of the <a href="http://manpages.ubuntu.com/manpages/trusty/man1/7z.1.html">7za program</a>. Use of the built-in <a href="https://www.postgresql.org/docs/current/static/runtime-config-wal.html#GUC-WAL-COMPRESSION">wal_compression</a> option was an excellent solution as well, although not as
space-saving as using external programs via archive_command.</p>
<hr>
<p>A database system is a complex beast, involving many trade-offs. An important issue is speed:
waiting for changes to get written to disk before letting the client proceed can be a
very expensive solution. Postgres gets around this with the use of the Write Ahead Log, which
generates WAL files indicating what changes were made. Creating these files is much faster than
performing the actual updates on the underlying files. Thus, Postgres is able to tell the client
that the work is “done” when the WAL file has been generated. Should the system crash before
the actual changes are made, the WAL files are used to replay the changes. As these
WAL files represent a continuous unbroken chain of all changes to the database, they can also
be used for Point in Time Recovery—in other words, the WAL files can be used to rewind the database
to any single point in time, capturing the state of the database at a specific moment.</p>
<p>Postgres WAL files are exactly 16 MB in size (although this size may be changed at compilation time,
it is almost unheard of to do this). These files primarily sit around taking up disk space
and are only accessed when a problem occurs, so being able to compress them is a good
one-time exchange of CPU effort for a lower file size. In theory, the time to decompress
the files should also be considered, but testing revealed that all the programs
decompressed so quickly that it should not be a factor.</p>
<p>WAL files can be compressed in one of two ways. As of Postgres 9.5, the wal_compression
feature can be enabled, which instructs Postgres to compress parts of the WAL file
in-place when possible, leading to the ability to store much more information per 16 MB WAL file,
and thus reducing the total number generated. The second way is to compress with an external
program via the free-form archive_command parameter. Here is the canonical example
from the Postgres docs, showing use of the gzip program for archive_command:</p>
<pre tabindex="0"><code>archive_command = 'gzip < %p > /var/lib/pgsql/archive/%f'
</code></pre><p>It is widely known that gzip is no longer the best compression option for most tasks,
so I endeavored to determine which program was the best at WAL file compression—in terms
of final file size versus the overhead to create the file. I also wanted to examine how these
fared versus the new wal_compression feature.</p>
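<p>For reference, swapping the overall winner into that same canonical example might look like this (a sketch only: the compression level and archive path are placeholders, and a production archive_command should also avoid overwriting existing files):</p>
<pre tabindex="0"><code>archive_command = 'pxz -2 -c %p &gt; /var/lib/pgsql/archive/%f.xz'
</code></pre>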
<hr>
<p>To compare the various compression methods, I examined all of the compression programs that
are commonly available on a Linux system, are known to be stable, and perform at least
as well as the common utility gzip. The contenders were:
<ul>
<li><strong>gzip</strong> — the canonical, default compression utility for many years</li>
<li><strong>pigz</strong> — parallel version of gzip</li>
<li><strong>bzip2</strong> — second only to gzip in popularity, it offers better compression</li>
<li><strong>lbzip2</strong> — parallel version of bzip2</li>
<li><strong>xz</strong> — an excellent all-around compression alternative to gzip and bzip2</li>
<li><strong>pxz</strong> — parallel version of xz</li>
<li><strong>7za</strong> — excellent compression, but suffers from complex arguments</li>
<li><strong>lrzip</strong> — compression program targeted at “large files”</li>
</ul>
<p>For the tests, 100 random WAL files were copied from a busy production Postgres system. Each of
those 100 files were compressed nine times by each of the programs above: from the “least compressed”
option (e.g. -1) to the “best compressed” option (e.g. -9).
The tests were performed on a 16-core system, with plenty of free RAM and nothing else running on the server.
Results were gathered by wrapping
each command with /usr/bin/time --verbose, which produces a nice breakdown of results.
To gather the data, the “Elapsed (wall clock) time” was used, along with size of
the compressed file. Here is some sample output of the time command:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-text" data-lang="text"> Command being timed: "bzip2 -4 ./0000000100008B91000000A5"
User time (seconds): 1.65
System time (seconds): 0.01
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.66
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 3612
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 938
Voluntary context switches: 1
Involuntary context switches: 13
Swaps: 0
File system inputs: 0
File system outputs: 6896
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
</code></pre></div><p>The wal_compression feature was tested by creating a new Postgres 9.6 cluster, then
running the <a href="https://www.postgresql.org/docs/current/static/pgbench.html">pgbench program</a> twice to generate WAL files—once with wal_compression enabled,
and once with it disabled. Then each of the resulting WAL files was compressed using each of the programs above.</p>
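<p>Enabling the feature itself is a one-line change in postgresql.conf (Postgres 9.5 or later), followed by a configuration reload:</p>
<pre tabindex="0"><code>wal_compression = on
</code></pre>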
<hr>
<style><!--
table.gsmt { font-family: Monospace; padding: 0 0 3em 0; margin-left: auto; margin-right: auto; }
table.gsmt table td { padding: 0 0.5em 0 0.2em ; color: #222200; white-space: nowrap; font-size: smaller;}
table.gsmt table th { padding: 0em 0.5em 0em 0.5em; color: black; font-size: smaller; }
--></style>
<table border="0" class="gsmt" style="padding: 0 0 3em 0"><caption><b>Table 1.</b><br/>
Results of compressing 16 MB WAL files—average for 100 files</caption>
<tbody><tr>
<td><table border="1">
<tbody><tr><th>Command</th><th>Wall clock time (s)</th><th>File size (MB)</th></tr>
<tr><td>gzip -1</td><td>0.271</td><td>4.927</td></tr>
<tr><td>gzip -2</td><td>0.292</td><td>4.777</td></tr>
<tr><td>gzip -3</td><td>0.366</td><td>4.667</td></tr>
<tr><td>gzip -4</td><td>0.381</td><td>4.486</td></tr>
<tr><td>gzip -5</td><td>0.492</td><td>4.318</td></tr>
<tr><td>gzip -6</td><td>0.734</td><td>4.250</td></tr>
<tr><td>gzip -7</td><td>0.991</td><td>4.235</td></tr>
<tr><td>gzip -8</td><td>2.042</td><td>4.228</td></tr>
<tr><td>gzip -9</td><td>3.626</td><td>4.227</td></tr>
</tbody></table></td>
<td><table border="1">
<tbody><tr><th>Command</th><th>Wall clock time (s)</th><th>File size (MB)</th></tr>
<tr><td>bzip2 -1</td><td>1.540</td><td>3.817</td></tr>
<tr><td>bzip2 -2</td><td>1.531</td><td>3.688</td></tr>
<tr><td>bzip2 -3</td><td>1.570</td><td>3.638</td></tr>
<tr><td>bzip2 -4</td><td>1.625</td><td>3.592</td></tr>
<tr><td>bzip2 -5</td><td>1.667</td><td>3.587</td></tr>
<tr><td>bzip2 -6</td><td>1.707</td><td>3.566</td></tr>
<tr><td>bzip2 -7</td><td>1.731</td><td>3.559</td></tr>
<tr><td>bzip2 -8</td><td>1.752</td><td>3.557</td></tr>
<tr><td>bzip2 -9</td><td>1.784</td><td>3.541</td></tr>
</tbody></table></td>
<td><table border="1">
<tbody><tr><th>Command</th><th>Wall clock time (s)</th><th>File size (MB)</th></tr>
<tr><td>xz -1</td><td>0.962</td><td>3.174</td></tr>
<tr><td>xz -2</td><td>1.186</td><td>3.066</td></tr>
<tr><td>xz -3</td><td>5.911</td><td>2.711</td></tr>
<tr><td>xz -4</td><td>6.292</td><td>2.682</td></tr>
<tr><td>xz -5</td><td>6.694</td><td>2.666</td></tr>
<tr><td>xz -6</td><td>8.988</td><td>2.608</td></tr>
<tr><td>xz -7</td><td>9.194</td><td>2.592</td></tr>
<tr><td>xz -8</td><td>9.117</td><td>2.596</td></tr>
<tr><td>xz -9</td><td>9.164</td><td>2.597</td></tr>
</tbody></table></td>
</tr>
</tbody></table>
<table border="0" class="gsmt"><caption><b>Table 2.</b><br/>
Results of compressing 16 MB WAL file—average for 100 files</caption>
<tbody><tr>
<td><table border="1">
<tbody><tr><th>Command</th><th>Wall clock time (s)</th><th>File size (MB)</th>
</tr><tr><td>lrzip<br/> -l -L1</td><td>0.296</td><td>5.712</td></tr>
<tr><td>lrzip<br/> -l -L2</td><td>0.319</td><td>5.686</td></tr>
<tr><td>lrzip<br/> -l -L3</td><td>0.341</td><td>5.645</td></tr>
<tr><td>lrzip<br/> -l -L4</td><td>0.370</td><td>5.639</td></tr>
<tr><td>lrzip<br/> -l -L5</td><td>0.389</td><td>5.627</td></tr>
<tr><td>lrzip<br/> -l -L6</td><td>0.901</td><td>5.501</td></tr>
<tr><td>lrzip<br/> -l -L7</td><td>2.090</td><td>5.462</td></tr>
<tr><td>lrzip<br/> -l -L8</td><td>2.829</td><td>5.471</td></tr>
<tr><td>lrzip<br/> -l -L9</td><td>5.983</td><td>5.463</td></tr>
</tbody></table></td>
<td><table border="1">
<tbody><tr><th>Command</th><th>Wall clock time (s)</th><th>File size (MB)</th></tr>
<tr><td>lrzip<br/> -z -L1</td><td>3.582</td><td>3.353</td></tr>
<tr><td>lrzip<br/> -z -L2</td><td>3.577</td><td>3.342</td></tr>
<tr><td>lrzip<br/> -z -L3</td><td>3.601</td><td>3.326</td></tr>
<tr><td>lrzip<br/> -z -L4</td><td>11.971</td><td>2.799</td></tr>
<tr><td>lrzip<br/> -z -L5</td><td>11.890</td><td>2.741</td></tr>
<tr><td>lrzip<br/> -z -L6</td><td>11.971</td><td>2.751</td></tr>
<tr><td>lrzip<br/> -z -L7</td><td>12.861</td><td>2.750</td></tr>
<tr><td>lrzip<br/> -z -L8</td><td>30.080</td><td>2.483</td></tr>
<tr><td>lrzip<br/> -z -L9</td><td>33.171</td><td>2.482</td></tr>
</tbody></table></td>
<td><table border="1">
<tbody><tr><th>Command</th><th>Wall clock time (s)</th><th>File size (MB)</th></tr>
<tr><td>7za -bd -mx=1<br/> a test.7za</td><td>0.128</td><td>3.182</td></tr>
<tr><td>7za -bd -mx=2<br/> a test.7za</td><td>0.139</td><td>3.116</td></tr>
<tr><td>7za -bd -mx=3<br/> a test.7za</td><td>0.301</td><td>3.059</td></tr>
<tr><td>7za -bd -mx=4<br/> a test.7za</td><td>1.251</td><td>3.001</td></tr>
<tr><td>7za -bd -mx=5<br/> a test.7za</td><td>3.821</td><td>2.620</td></tr>
<tr><td>7za -bd -mx=6<br/> a test.7za</td><td>3.841</td><td>2.629</td></tr>
<tr><td>7za -bd -mx=7<br/> a test.7za</td><td>4.631</td><td>2.591</td></tr>
<tr><td>7za -bd -mx=8<br/> a test.7za</td><td>4.671</td><td>2.590</td></tr>
<tr><td>7za -bd -mx=9<br/> a test.7za</td><td>4.663</td><td>2.599</td></tr>
</tbody></table></td></tr>
</tbody></table>
<table border="0" class="gsmt"><caption><b>Table 3.</b><br/>
Results of compressing 16 MB WAL file—average for 100 files</caption>
<tbody><tr>
<td><table border="1">
<tbody><tr><th>Command</th><th>Wall clock time (s)</th><th>File size (MB)</th></tr>
<tr><td>pigz -1</td><td>0.051</td><td>4.904</td></tr>
<tr><td>pigz -2</td><td>0.051</td><td>4.755</td></tr>
<tr><td>pigz -3</td><td>0.051</td><td>4.645</td></tr>
<tr><td>pigz -4</td><td>0.051</td><td>4.472</td></tr>
<tr><td>pigz -5</td><td>0.051</td><td>4.304</td></tr>
<tr><td>pigz -6</td><td>0.060</td><td>4.255</td></tr>
<tr><td>pigz -7</td><td>0.081</td><td>4.225</td></tr>
<tr><td>pigz -8</td><td>0.140</td><td>4.212</td></tr>
<tr><td>pigz -9</td><td>0.251</td><td>4.214</td></tr>
</tbody></table></td>
<td><table border="1">
<tbody><tr><th>Command</th><th>Wall clock time (s)</th><th>File size (MB)</th></tr>
<tr><td>lbzip2 -1</td><td>0.135</td><td>3.801</td></tr>
<tr><td>lbzip2 -2</td><td>0.151</td><td>3.664</td></tr>
<tr><td>lbzip2 -3</td><td>0.151</td><td>3.615</td></tr>
<tr><td>lbzip2 -4</td><td>0.151</td><td>3.586</td></tr>
<tr><td>lbzip2 -5</td><td>0.151</td><td>3.562</td></tr>
<tr><td>lbzip2 -6</td><td>0.151</td><td>3.545</td></tr>
<tr><td>lbzip2 -7</td><td>0.150</td><td>3.538</td></tr>
<tr><td>lbzip2 -8</td><td>0.151</td><td>3.524</td></tr>
<tr><td>lbzip2 -9</td><td>0.150</td><td>3.528</td></tr>
</tbody></table></td>
<td><table border="1">
<tbody><tr><th>Command</th><th>Wall clock time (s)</th><th>File size (MB)</th></tr>
<tr><td>pxz -1</td><td>0.135</td><td>3.266</td></tr>
<tr><td>pxz -2</td><td>0.175</td><td>3.095</td></tr>
<tr><td>pxz -3</td><td>1.244</td><td>2.746</td></tr>
<tr><td>pxz -4</td><td>2.528</td><td>2.704</td></tr>
<tr><td>pxz -5</td><td>5.115</td><td>2.679</td></tr>
<tr><td>pxz -6</td><td>9.116</td><td>2.604</td></tr>
<tr><td>pxz -7</td><td>9.255</td><td>2.599</td></tr>
<tr><td>pxz -8</td><td>9.267</td><td>2.595</td></tr>
<tr><td>pxz -9</td><td>9.355</td><td>2.591</td></tr>
</tbody></table></td></tr>
</tbody></table>
<table border="0" class="gsmt"><caption><b>Table 4.</b><br/>Results of Postgres wal_compression option</caption>
<tbody><tr>
<td><table border="1">
<tbody><tr><th>Modifications</th><th>Total size of WAL files (MB)</th></tr>
<tr><td>No modifications</td><td>208.1</td></tr>
<tr><td>wal_compression enabled</td><td>81.0</td></tr>
<tr><td>xz -2</td><td>8.6</td></tr>
<tr><td>wal_compression enabled PLUS xz -2</td><td>9.4</td></tr>
</tbody></table>
</td></tr></tbody></table>
<hr>
<p>Table 1 shows some baseline compression values for the three popular programs
gzip, bzip2, and xz. Both gzip and bzip2 show relatively little change in file size as the
compression strength is raised. However, xz makes a relatively large jump in compression when going
from -2 to -3, but the time cost also rises to an unacceptable 5.9 seconds. As a starting
point, something well under one second is desired.</p>
<p>Among those three, xz is the clear winner, shrinking the file to 3.07 MB with a compression
argument of -2, and taking 1.2 seconds to do so. Neither gzip nor bzip2 ever reaches that
file size, even with a -9 argument. For that matter, the best compression gzip can
ever achieve is 4.23 MB, which the other programs can beat without breaking a sweat.</p>
<p>Table 2 demonstrates two ways of invoking the lrzip program: the -l option (LZO compression,
described in the lrzip documentation as “ultra fast”) and the -z option (ZPAQ compression,
“extreme compression, extreme slow”). All of those superlatives are supported by the data. The -l
option runs extremely fast: even at -L5 the total clock time is still only 0.39 seconds. Unfortunately,
the file size hovers around an undesirable 5.5 MB, no matter what compression level is used. The -z
option produces the smallest file of all the programs here (2.48 MB) at a whopping cost of over 30
seconds! Even the smallest compression level (-L1) takes 3.5 seconds to produce a 3.4 MB file. Thus,
lrzip is clearly out of the competition.</p>
<div class="separator" style="clear: both; text-align: center; padding-bottom:1em;"><a href="/blog/2017/03/postgres-wal-files-best-compression/image-1.jpeg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" id="jA0EBAMCQV4Upc7MVzZgycARvnUd7ZPBWXA9iXj+1nagDqP0bIKB3vspSuKJKLudmKh2tPgjwLFfxFkN34sgCWCeyYnxTt/zDAR3GpmBoiPQbMmxJO2QBdjcFr6e3R5/tMNH5MsIQklNNOM/EBtZid0PshHrOEooRj4xhSO74FtVZiXNR/hx0tr1QdPHs8XS5qWaKCG4PG69JN/k74CevdILYAAdhENPNZV48aOJwZq5D2A3+65ZYcNWfBXpHrnecboKT607iQUBC7zUnzqPCl31RVmQ0EmRX7zElgin3B3jwho==9+3U" src="/blog/2017/03/postgres-wal-files-best-compression/image-1.jpeg"/></a><br/>Compression in action <small>(<a href="https://flic.kr/p/9sgf6f">photo</a> by <a href="https://www.flickr.com/people/familymwr/?rb=1">Eric Armstrong</a>)</small></div>
<p>The most interesting program is without a doubt 7za. Unlike the others, it is organized around
creating archives, and thus doesn’t do in-place compression. Nevertheless,
the results are quite good. At the lowest level, it takes a mere 0.13 seconds to produce a
3.18 MB file. As it takes xz 1.19 seconds to produce a nearly equivalent 3.07 MB file, 7za
is the clear winner … if we had only a single core available. :)</p>
<p>It is rare to find a modern server with a single processor, and a crop of compression programs has
appeared to support this new paradigm. First up is the amazing pigz, a parallel
version of gzip. As Table 3 shows, pigz is extraordinarily fast on our 16-core box, taking
a mere 0.05 seconds to run at compression level -1, and only 0.25 seconds to run at
compression level -9. Sadly, it still suffers from the fact that gzip is simply not good
at compressing WAL files, as its smallest output was 4.21 MB. This rules out pigz from
consideration.</p>
<p>The bzip2 program has been nipping at the heels of gzip for many years, so naturally it
has its own parallel version, known as lbzip2. As Table 3 shows, it is also amazingly fast.
It is not as fast as pigz, but it finishes in under 0.2 seconds even at the highest compression level.
There is very little variation among the compression levels used, so it is fair to simply state
that lbzip2 takes 0.15 seconds to shrink the WAL file to 3.5 MB. A decent entry.</p>
<p>Of course, the xz program has a parallel version as well, known as pxz. Table 3 shows that
its times still vary quite a bit, reaching into the 9-second range at higher compression levels,
but it does very well at -2, taking a mere 0.17 seconds to produce a 3.09 MB file. This is
comparable to the previous winner, 7za, which took 0.14 seconds to create a 3.12 MB
file.</p>
<p>So the clear winners are 7za and pxz. I gave the edge to pxz, as (1) the file size was
slightly smaller at comparable time costs, and (2) the odd syntax for 7za for both compressing
and decompressing was annoying compared with the simplicity of “xz -2” and “xz -d”.</p>
<p>Now, what about the built-in compression offered by the wal_compression option?
As Table 4 shows, the compression for the WAL files I tested went from 208 MB to
81 MB. This is a significant gain, but it only equates to compressing a single WAL
file to 6.23 MB, which is a poor showing compared to the compression programs above.
Keep in mind that the wal_compression option is sensitive to
your workload, so you may well see greater or lesser compression.</p>
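<p>For reference, enabling the feature is a single setting in postgresql.conf, which can be changed with a reload rather than a full restart:</p>
<pre tabindex="0"><code>wal_compression = on
</code></pre>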
<p>Interestingly, the WAL files generated with wal_compression turned on can be
compressed further by the archive_command, and quite effectively, going from 81 MB of WAL files to 9.4 MB of
WAL files. However, using just xz in the archive_command without wal_compression
still yielded a smaller overall size, and uses less CPU, because the data is only
compressed once.</p>
<p>To be fair, wal_compression has other advantages, and comparing it to
archive_command is not an apples-to-apples comparison; this article is
primarily about the best compression option for storing WAL files long-term.</p>
<p>Thus, the overall winner is “pxz -2”, followed closely by 7za and its bulky
arguments, with honorable mention going to wal_compression. Your particular
requirements might lead you to another conclusion, but hopefully nobody will
simply default to gzip anymore.</p>
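<p>Putting the winner to work just means swapping the new program into the archive_command example from the top of this article. A sketch, shown with plain xz and the same hypothetical archive directory; pxz aims to be argument-compatible with xz, so it should drop in the same way, but verify on your own system:</p>
<pre tabindex="0"><code>archive_command = 'xz -2 --stdout %p > /var/lib/pgsql/archive/%f.xz'
</code></pre>
<p>Decompression for a restore is then simply “xz -d”, as noted above.</p>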
<hr>
<p>Thanks to my colleague Lele for encouraging me to try pxz, after I was already happy with xz. Thanks to the authors of xz, for providing an amazing program that has an incredible number of knobs for tweaking. And a final thanks to the authors of the wal_compression feature, which is a pretty nifty trick!</p>
HTTP/2 is on the way!https://www.endpointdev.com/blog/2015/03/http2-is-on-way/2015-03-13T00:00:00+00:00Jon Jensen
<h3 id="https-and-spdy">HTTPS and SPDY</h3>
<p>Back in August 2014, we made our websites <a href="/">www.endpoint.com</a> and <a href="https://www.visionport.com/">liquidgalaxy.endpoint.com</a> HTTPS-only, which allowed us to turn on <a href="https://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security">HTTP Strict Transport Security</a> and earn a grade of <a href="https://www.ssllabs.com/ssltest/analyze.html?d=endpointdev.com&latest">A+ from Qualys’ SSL Labs</a> server test.</p>
<p>Given the widely-publicized surveillance of Internet traffic and injection of advertisements and tracking beacons into plain HTTP traffic by some unscrupulous Internet providers, we felt it would be good to start using TLS encryption on even our non-confidential public websites.</p>
<p>This removed any problems switching between HTTP for most pages and HTTPS for the contact form and the POST of submitted data. Site delivery over HTTPS also <a href="https://webmasters.googleblog.com/2014/08/https-as-ranking-signal.html">serves as a ranking signal</a> for Google, though presumably still a minor one.</p>
<p>Doesn’t SSL/TLS slow down a website? Simply put, not really these days. See <a href="https://istlsfastyet.com/">Is TLS Fast Yet?</a> for lots of details.</p>
<p>Moving to HTTPS everywhere on our sites also allowed us to take advantage of nginx’s relatively new <a href="https://en.wikipedia.org/wiki/SPDY">SPDY</a> (pronounced “speedy”) capability. SPDY is an enhancement to HTTPS created by Google to speed up web page delivery by compressing headers and multiplexing many requests in a single TCP connection. It is only available over HTTPS, so it also incentivizes sites to stop using unencrypted HTTP in order to get more speed, with security as a bonus. Whereas people once avoided HTTPS because SSL/TLS was slower, SPDY turned that idea around. We began offering SPDY for our sites in October 2014.</p>
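<p>Enabling SPDY in nginx was a tiny change for us. A minimal sketch, assuming an nginx build with SPDY support and an existing HTTPS server block (the certificate paths are placeholders):</p>
<pre tabindex="0"><code>server {
    listen 443 ssl spdy;
    ssl_certificate     /etc/nginx/ssl/example.crt;   # placeholder path
    ssl_certificate_key /etc/nginx/ssl/example.key;   # placeholder path
    # ...the rest of the site configuration is unchanged
}
</code></pre>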
<p>On the browser side, SPDY was initially only supported by Chrome and Firefox. Later, support was added in Opera and Safari 8, and partially in IE 11. So most browsers can use it now.</p>
<p>There is only partial server support: In the open source world, nginx fully supports SPDY now, but Apache’s mod_spdy is incomplete and development on it has stalled.</p>
<p>Is SPDY here to stay? After all it was an experimental Google protocol. Instead of getting on track to become an Internet standard protocol as is, it was used as the starting point for the next version of HTTP, HTTP/2. That sounded like good news, except that the current version HTTP/1.1 was standardized in 1999 and hadn’t really changed since then. Many of us wondered if HTTP/2 would get mired in the standardization process and take years to see the light of day.</p>
<p>However, the skeptics were wrong! HTTP/2 was completed over about 3 years, and its official RFC form is now being finalized. Having it be the next version of HTTP will go a long way toward getting more implementation and adoption, since it is no longer a single company’s project. On the other hand, basing HTTP/2 on SPDY meant that there was a widely-used proof of concept out there already, so discussions didn’t get lost in the purely theoretical. The creators of SPDY at Google were heavily involved in the HTTP/2 standardization process, so their lessons were not lost, and it appears that HTTP/2 will be even better.</p>
<h3 id="what-is-different-in-http2">What is different in HTTP/2?</h3>
<ul>
<li>Request and response multiplexing in a single TCP connection (no need for 6+ connections to the same host!)</li>
<li>Stream prioritization (prioritizing files that the client most needs first)</li>
<li>Server push (of files the server expects the client to need, before the client knows it), and client stream cancellation (in case the server or the client is wrong and wants to abort a stream)</li>
<li>Binary framing (no more hand-typing requests via telnet, sadly)</li>
<li>Header compression (greatly reducing the bloat of large cookies)</li>
<li>Backward-compatibility with HTTP/1.1 and autodiscovery of HTTP/2 support (transparent upgrading for users)</li>
<li>When TLS is used, require TLS 1.2 and minimum acceptable cipher strength (to help retire weak TLS setups)</li>
</ul>
<p>For front-end web developers, these back-end plumbing changes have some very nice consequences. As described in <a href="https://mattwilcox.net/web-development/http2-for-front-end-web-developers">HTTP2 for front-end web developers</a>, you will soon be able to stop using many of the annoying workarounds for HTTP/1.1’s weaknesses: no more sprites, combining CSS & JavaScript files, inlining images in CSS, sharding across many subdomains, etc.</p>
<p>This practically means that the web can largely go back to working the way it was designed: different files for different things, independent caching of small files, and assets served from the same place.</p>
<h3 id="what-is-not-changing">What is <em>not</em> changing?</h3>
<p>Most of HTTP/1.1’s basic semantics remain the same, with most of the changes being to the “wrapping” or transport of the data. All this stays the same:</p>
<ul>
<li>built on TCP</li>
<li>stateless</li>
<li>same request methods</li>
<li>same request headers (including cookies)</li>
<li>same response headers and body</li>
<li>may be unencrypted or layered on TLS (although so far, Chrome and Firefox have stated that they will only support HTTP/2 over TLS, and IE so far only supports HTTP/2 over TLS as well)</li>
<li>no changes in HTML, CSS, client-side scripting, same-origin security policy, etc.</li>
</ul>
<h3 id="the-real-point-speed">The real point: speed</h3>
<p>Speed and efficiency are the main advantages of HTTP/2. It will use less data transfer for both requests and responses. It will use fewer TCP connections, lightening the load on clients, servers, firewalls, and routers.</p>
<p>As clients adapt more to HTTP/2, it will probably provide a faster perceived experience as servers push the most important CSS, images, and JavaScript proactively to the client before it has even parsed the HTML.</p>
<p>See these <a href="https://blog.httpwatch.com/2015/01/16/a-simple-performance-comparison-of-https-spdy-and-http2/">simple benchmarks between HTTP/1.1, SPDY, and HTTP/2</a>.</p>
<h3 id="when-can-we-use-it">When can we use it?</h3>
<p>Refreshingly, Google has announced that they are happy to kill their own creation SPDY: <a href="https://blog.chromium.org/2016/02/transitioning-from-spdy-to-http2.html">they will drop support for SPDY from Chrome in early 2016</a> in favor of HTTP/2.</p>
<p><a href="http://bitsup.blogspot.com/2015/02/http2-is-live-in-firefox.html">Firefox uses HTTP/2 by default</a> where possible, and Chrome has an option to enable HTTP/2. IE 11 for Windows 10 beta supports HTTP/2. You can see if your browser supports HTTP/2 now by using the <a href="https://http2.golang.org/">Go language HTTP/2 demo server</a>.</p>
<p>On the server side, Google and Twitter already have been opportunistically serving HTTP/2 for a while. <a href="https://nginx.com/blog/how-nginx-plans-to-support-http2/">nginx plans to add support this year</a>, and an experimental <a href="https://icing.github.io/mod_h2/">Apache module mod_h2</a> is available now. The H2O open-source C-based web server <a href="http://blog.kazuhooku.com/2015/02/h2o-new-http-server-goes-version-100-as.html">supports HTTP/2 now</a>, as does Microsoft IIS on the Windows 10 beta.</p>
<p>So we probably have at least a year until the most popular open source web servers easily support HTTP/2, but by then most browsers will probably support it and it should be an easy transition, as SPDY was. As long as you’re ready to go HTTPS-only for your site, anyway. :)</p>
<p>I think HTTP/2 will be a good thing!</p>
<h3 id="give-me-more-details">Give me more details!</h3>
<p>I highly recommend that system administrators and developers read the excellent <a href="https://daniel.haxx.se/http2/">http2 explained</a> PDF book by Daniel Stenberg, Firefox developer at Mozilla, and author of curl. It explains everything simply and well.</p>
<p>Other reference materials:</p>
<ul>
<li><a href="https://www.ietf.org/blog/2015/02/http2-approved/">HTTP/2 Approved</a> on the IETF blog, by Mark Nottingham, chair the IETF HTTP Working Group</li>
<li><a href="https://http2.github.io/">HTTP/2 home page</a> with specifications and FAQs</li>
<li><a href="http://chimera.labs.oreilly.com/books/1230000000545/ch12.html">High Performance Browser Networking chapter 12 on HTTP/2</a> by Ilya Grigorik</li>
<li><a href="https://en.wikipedia.org/wiki/HTTP/2">HTTP/2 on Wikipedia</a></li>
<li><a href="https://www.mnot.net/blog/2014/01/30/http2_expectations">Nine Things to Expect from HTTP/2</a> by Mark Nottingham</li>
<li><a href="http://queue.acm.org/detail.cfm?id=2555617">Making the Web Faster with HTTP 2.0: HTTP continues to evolve</a> by Ilya Grigorik of Google</li>
<li><a href="https://daniel.haxx.se/blog/2015/03/06/tls-in-http2/">TLS in HTTP/2</a>: Daniel Stenberg on TLS in HTTP/2 being mandatory in effect if not in the specification, and discusses opportunistic encryption</li>
<li><a href="http://robbysimpson.com/2015/01/26/http2-and-the-internet-of-things/">HTTP/2 and the Internet of Things</a> by Robby Simpson of GE Digital Energy</li>
<li><a href="http://queue.acm.org/detail.cfm?id=2716278">HTTP/2.0—The IETF is Phoning It In: Bad protocol, bad politics</a> by Poul-Henning Kamp, author of Varnish</li>
</ul>
Supporting Apple Retina displays on the Webhttps://www.endpointdev.com/blog/2014/05/supporting-apple-retina-displays-on-web/2014-05-27T00:00:00+00:00Phineas Jensen
<p>Apple’s <a href="https://en.wikipedia.org/wiki/Retina_Display">Retina displays</a> (on Mac desktop & laptop computers, and on iPhones and iPads) have around twice the pixel density of traditional displays. Most recent Android phones and tablets have higher-resolution screens as well.</p>
<p>I was recently given the task of adding support for these higher-resolution displays to our <a href="/">End Point company website</a>. Our imagery had been created prior to Retina displays being commonly used, but even now many web developers still overlook supporting high-resolution screens: it hasn’t been part of the website workflow before, it isn’t simple to cope with, and most people don’t notice any lack of sharpness without comparing low- and high-resolution images side by side.</p>
<p>Most images which are not designed for Retina displays look blurry on them, like this:</p>
<p><a href="/blog/2014/05/supporting-apple-retina-displays-on-web/image-0-big.png" imageanchor="1" style="display:inline"><img border="0" height="266" src="/blog/2014/05/supporting-apple-retina-displays-on-web/image-0.png" width="266"/></a>
<a href="/blog/2014/05/supporting-apple-retina-displays-on-web/image-1-big.png" imageanchor="1" style="display:inline"><img border="0" height="266" src="/blog/2014/05/supporting-apple-retina-displays-on-web/image-1.png" width="266"/></a></p>
<p>The higher-resolution image is on the left, and the lower-resolution image is on the right.</p>
<p>Now, to solve this problem, you need to serve a larger, higher-quality image to Retina displays. There are several ways to do this; I’ll cover a few of them and explain how I implemented one for our site.</p>
<h3 id="retinajs">Retina.js</h3>
<p>As I was researching ways to implement support for Retina displays, I found that a popular suggestion is the JavaScript library <a href="http://imulus.github.io/retinajs/">Retina.js</a>. Retina.js automatically detects Retina screens, and then for each image on the page, it checks the web server for a Retina image version under the same name with @2x before the suffix. For example, when fetching the image background.jpg on a Retina-capable system, it would afterward look for background@2x.jpg and serve that if it’s available.</p>
<p>Retina.js makes it relatively painless to deal with serving Retina images to the correct people, but it has a couple of large problems. First, it fetches and replaces the Retina image <em>after</em> the default image, serving both the normal and Retina images to Retina users, greatly increasing download size and time.</p>
<p>Second, Retina.js does not use the correct image if the browser window is moved from a Retina display to a non-Retina display or vice versa when using multiple monitors. For example, if an image is loaded on a standard 1080p monitor and then the browser is moved to a Retina display, it will show the incorrect, low-res image.</p>
<h3 id="using-css-for-background-images">Using CSS for background images</h3>
<p>Doesn’t the “modern web” have a way to handle this natively in HTML & CSS? For sites using CSS background images, CSS media queries will do the trick:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-css" data-lang="css">@<span style="color:#080;font-weight:bold">media</span> <span style="color:#b06;font-weight:bold">only</span> <span style="color:#b06;font-weight:bold">screen</span> <span style="color:#b06;font-weight:bold">and</span> (<span style="color:#b06;font-weight:bold">-webkit-min-device-pixel-ratio</span>: <span style="color:#b06;font-weight:bold">2</span>), (<span style="color:#b06;font-weight:bold">min-resolution</span>: <span style="color:#b06;font-weight:bold">192dpi</span>) {
.<span style="color:#b06;font-weight:bold">icon</span> {
<span style="color:#080;font-weight:bold">background-image</span>: <span style="color:#038">url</span>(<span style="color:#2b2;background-color:#f0fff0">icon@2x.png</span>);
<span style="color:#080;font-weight:bold">background-size</span>: <span style="color:#00d;font-weight:bold">20</span><span style="color:#888;font-weight:bold">px</span> <span style="color:#00d;font-weight:bold">20</span><span style="color:#888;font-weight:bold">px</span>;
}
}
</code></pre></div><p>But this method only works with CSS background images, so for our site and a lot of other sites, it will only be useful for a small number of images.</p>
<p>Take a look at this <a href="https://css-tricks.com/snippets/css/retina-display-media-query/">CSS-Tricks</a> page for some excellent examples of Retina (and other higher-res display) support.</p>
<h3 id="server-side-checks-for-retina-images">Server-side checks for Retina images</h3>
<p>A very efficient way to handle all types of images is to have browser JavaScript set a cookie that tells the web server whether to serve Retina or standard images. That keeps data transfer to a minimum, with very little trickery required in the browser. You’ll still need to create an extra Retina-resolution image for every standard image on the server, and you’ll need a dynamic web process to run for every image served. The <a href="https://web.archive.org/web/20141109113432/http://retina-images.complexcompulsions.com/">Retina Images</a> open source PHP program shows how to do this.</p>
<h3 id="why-we-didnt-use-these-methods">Why we didn’t use these methods</h3>
<p>There is one reason common to all of these methods which made us decide against them: All of them require you to maintain multiple versions of each image. This ends up taking a lot of time and effort. It also means your content distribution network (CDN) or other HTTP caches will have twice as many image files to load and cache, increasing cache misses and data transfer. It also uses more disk space, which isn’t a big problem for the small number of images on our website, but on an ecommerce website with many thousands of images, it adds up quickly.</p>
<p>We would feel compelled to maintain separate images if it were truly necessary, that is, if the Retina images were much larger and would slow down the browsing experience for non-Retina users for no benefit. But instead we decided on the following solution, which we saw others describe.</p>
<h3 id="serving-retina-images-to-everybody-how-we-did-it">Serving Retina images to everybody (how we did it)</h3>
<p>We read that you can serve Retina images to everyone, but we immediately thought that wouldn’t work out well. We were sure that the Retina images would be several times larger than the normal images, wasting a ton of bandwidth for anyone not using a Retina screen. We were very pleasantly surprised to find out that this wasn’t the case at all.</p>
<p>After testing on a few images, I found I could get Retina images within 2-3 KB of the normal images while keeping the visual fidelity, by lowering the JPEG quality setting. How? Because the images were being displayed at a smaller size than their actual dimensions, the compression artifacts weren’t nearly as noticeable.</p>
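<p>The conversion itself is easy to script. Here is a sketch of the idea using ImageMagick, with made-up file names and dimensions: suppose a photo displays at 266x266, so we render it at double that size while dropping the JPEG quality well below the default (the exact quality value takes some experimentation per image set):</p>
<pre tabindex="0"><code># Generate a 532x532 "Retina" JPEG at a much lower quality setting
convert team_photo_original.png -resize 532x532 -quality 45 team_photo.jpg
</code></pre>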
<p>These are the total file sizes for each image on our <a href="/team/">team page</a>:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">Retina Normal Filename
10K 9.3K adam_spangenthal.jpg
13K 13K adam_vollrath.jpg
12K 11K benjamin_goldstein.jpg
7.6K 4.2K bianca_rodrigues.jpg
14K 13K brian_buchalter.jpg
13K 15K brian_gadoury.jpg
7.5K 8.0K brian_zenone.jpg
9.8K 6.6K bryan_berry.jpg
12K 11K carl_bailey.jpg
6.9K 15K dave_jenkins.jpg
13K 13K david_christensen.jpg
7.7K 21K emanuele_calo.jpg
16K 16K erika_hamby.jpg
13K 11K gerard_drazba.jpg
14K 14K greg_davidson.jpg
14K 12K greg_sabino_mullane.jpg
14K 15K jeff_boes.jpg
14K 12K jon_jensen.jpg
13K 12K josh_ausborne.jpg
13K 14K josh_tolley.jpg
13K 11K josh_williams.jpg
8.9K 9.5K kamil_ciemniewski.jpg
13K 21K kent_krenrich.jpg
15K 12K kiel_christofferson.jpg
9.9K 11K kirk_harr.jpg
7.7K 13K marco_manchego.jpg
12K 13K marina_lohova.jpg
14K 11K mark_johnson.jpg
7.3K 13K matt_galvin.jpg
15K 12K matt_vollrath.jpg
6.6K 14K miguel_alatorre.jpg
13K 14K mike_farmer.jpg
7.1K 19K neil_elliott.jpg
9.9K 9.0K patrick_lewis.jpg
13K 5.6K phin_jensen.jpg
12K 14K richard_templet.jpg
12K 9.9K rick_peltzman.jpg
14K 13K ron_phipps.jpg
9.7K 14K selvakumar_arumugam.jpg
9.3K 15K spencer_christensen.jpg
12K 12K steph_skardal.jpg
15K 18K steve_yoman.jpg
6.7K 15K szymon_guz.jpg
7.5K 6.8K tim_case.jpg
15K 21K tim_christofferson.jpg
9.3K 12K will_plaut.jpg
12K 14K wojciech_ziniewicz.jpg
12K 9.9K zed_jensen.jpg
TOTALS
Retina: 549.4K
Normal: 608.8K
</code></pre></div><p>This is where I found the biggest, and best, surprise. The cumulative size of the Retina image files was <em>less</em> than that of the original images. So now we have support for Retina displays, making our website look nice on modern screens, while actually using less data transfer. We don’t need JavaScript, cookies, or any extra server-side trickery to do this. And best of all, we don’t have to maintain a separate set of Retina images.</p>
<p>Once you’ve seen the difference in quality on a Retina screen or a new Android phone, you’ll wonder how you ever were able to tolerate the lower-resolution images. And at least for our selection of JPEG images, there’s not even a file size penalty to pay!</p>
<h3 id="reference-reading">Reference reading</h3>
<ul>
<li><a href="https://ivomynttinen.com/blog/a-guide-for-creating-a-better-retina-web/">A guide for creating a better retina web</a> by Ivo Mynttinen</li>
<li><a href="https://www.leemunroe.com/designing-for-high-resolution-retina-displays/">5 Things I Learned Designing For High-Resolution Retina Displays</a> by Lee Munroe</li>
<li><a href="https://developer.apple.com/library/safari/documentation/NetworkingInternet/Conceptual/SafariImageDeliveryBestPractices/Introduction/Introduction.html">About Proper Image Delivery on the Web</a> on the Safari Developer Library</li>
<li><a href="https://developer.apple.com/library/safari/documentation/NetworkingInternet/Conceptual/SafariImageDeliveryBestPractices/ServingImagestoRetinaDisplays/ServingImagestoRetinaDisplays.html">Serving Images Efficiently to Displays of Varying Pixel Density</a> on the Safari Developer Library</li>
</ul>
WebP images experiment on End Point websitehttps://www.endpointdev.com/blog/2014/01/webp-images-experiment-on-end-point/2014-01-28T00:00:00+00:00Jon Jensen
<p>WebP is an image format for RGB images on the web that supports both lossless (like PNG) and lossy (like JPEG) compression. It was released by Google in September 2010 with open source reference software available under the BSD license, accompanied by a royalty-free public patent license, making it clear that they want it to be widely adopted by any and all without any encumbrances.</p>
<p>Its main attraction is smaller file size at a similar quality level. It also supports an alpha channel (transparency) and animation for both lossless and lossy images. Thus it is the first image format that offers the transparency of PNG in lossy images at much smaller file size, and animation that was previously only available in the archaic limited-color GIF format.</p>
<h3 id="comparing-quality--size">Comparing quality & size</h3>
<p>While considering WebP for an experiment on our own website, we were very impressed by its file size to quality ratio. In our tests it was even better than generally claimed. Here are a few side-by-side examples from our site. You’ll only see the WebP version if your browser supports it:</p>
<table cellspacing="10">
<tbody><tr>
<td><img alt="" height="133" src="/blog/2014/01/webp-images-experiment-on-end-point/image-0.jpeg" width="133"/><br/>12,956 bytes JPEG</td>
<td><img alt="" height="133" src="/blog/2014/01/webp-images-experiment-on-end-point/image-1.webp" width="133"/><br/>2186 bytes WebP</td>
</tr>
<tr>
<td><img alt="" height="133" src="/blog/2014/01/webp-images-experiment-on-end-point/image-2.jpeg" width="133"/><br/>11,149 bytes JPEG</td>
<td><img alt="" height="133" src="/blog/2014/01/webp-images-experiment-on-end-point/image-3.webp" width="133"/><br/>2530 bytes WebP</td>
</tr>
</tbody></table>
<p>The original PNG images were converted by ImageMagick to JPEG, and by <code>cwebp -q 80</code> to WebP. I think we probably should increase the WebP quality a bit to keep a little of the facial detail that flattens out, but it’s amazing how good these images look for file sizes that are only 17% and 23% of the JPEG equivalent.</p>
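<p>For reference, a single conversion with Google’s reference encoder looks like this; the file names here are placeholders, and -q takes a quality factor from 0 to 100:</p>
<pre tabindex="0"><code>cwebp -q 80 input.png -o output.webp
</code></pre>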
<p>One of our website’s background patterns has transparency, making the PNG format a necessity, but it also has a gradient, which PNG compression is particularly inefficient with. WebP is a major improvement there, at 13% the size of the PNG. The image is large so I won’t show it here, but you can follow the links if you’d like to see it:</p>
<table cellspacing="15">
<tbody><tr>
<td align="right">337,186 bytes</td><td><a href="https://jon.endpointdev.com/blog/container-pattern.png">container-pattern.png</a></td>
</tr>
<tr>
<td align="right">43,270 bytes</td><td><a href="https://jon.endpointdev.com/blog/container-pattern.webp">container-pattern.webp</a></td>
</tr>
</tbody></table>
<h3 id="browser-support">Browser support</h3>
<p>So, what is the downside? WebP is currently natively supported only in Chrome and Opera among the major browsers, though amazingly, support for other browsers can be added via WebPJS, a JavaScript WebP renderer.</p>
<p><strong>Update! As of 2021, all current major browsers support WebP image rendering.</strong></p>
<p>Why don’t the other browsers add support, given the liberal license? You’d especially expect Firefox to support it. In fact a patch has been pending for years, and a debate about adding support still smolders. Why?</p>
<p>WebP does not yet support progressive rendering, Exif tagging, or non-RGB color spaces such as CMYK, and it is limited to 16,384 pixels per side. Some Firefox developers feel that it would do the Internet community a disservice to support an image format still under development and cause uncertain levels of support in various clients, so they will not accept WebP in its current state.</p>
<p>Many batch image-processing tools now support WebP, and there is a free Photoshop plug-in for it. Some websites are quietly using it just because of the cost savings due to reduced bandwidth.</p>
<p>For our first experiment serving WebP images from the End Point website, I decided to serve WebP images only to browsers that claim to be able to support it. They advertise that support in this HTTP request header:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">Accept: image/webp,*/*;q=0.8
</code></pre></div><p>That says explicitly that the browser can render image/webp, so we just need to configure the server to send WebP images. One way to do that is in the application server, by having it send URLs pointing to WebP files.</p>
<p>Let’s plan to have both common format (JPEG or PNG) and WebP files side by side, and then try a way that is transparent to the application and can be enabled or disabled very easily.</p>
<h3 id="web-server-rewrites">Web server rewrites</h3>
<p>It’s possible to set up the web server to transparently serve WebP instead of JPEG or PNG if a matching file exists. Based on some examples other people posted, we used this nginx configuration:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">set $webp "";
set $img "";
if ($http_accept ~* "image/webp") { set $webp "can"; }
if ($request_filename ~* "(.*)\.(jpe?g|png)$") { set $img $1.webp; }
if (-f $img) { set $webp "$webp-have"; }
if ($webp = "can-have") {
add_header Vary Accept;
rewrite "(.*)\.\w+$" $1.webp break;
break;
}
</code></pre></div><p>It’s also good to add to /etc/nginx/mime.types:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">image/webp .webp
</code></pre></div><p>so that .webp files are served with the correct MIME type instead of the default application/octet-stream, or worse, text/plain with perhaps a bogus character set encoding.</p>
<p>Then we just make sure identically-named .webp files match .png or .jpg files, such as those for our examples above:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">-rw-rw-r-- 337186 Nov 6 14:10 container-pattern.png
-rw-rw-r-- 43270 Jan 28 08:14 container-pattern.webp
-rw-rw-r-- 14734 Nov 6 14:10 josh_williams.jpg
-rw-rw-r-- 3386 Jan 28 08:14 josh_williams.webp
-rw-rw-r-- 13420 Nov 6 14:10 marina_lohova.jpg
-rw-rw-r-- 2776 Jan 28 08:14 marina_lohova.webp
</code></pre></div><p>A request for a given $file.png will work as normal in browsers that don’t advertise WebP support, while those that do will instead receive the $file.webp image.</p>
<p>The image is still being requested with a name ending in .jpg or .png, but that’s just a name as far as both browser and server are concerned, and the image type is determined by the MIME type in the HTTP response headers (and/or by looking at the file’s magic numbers). So the browser will have a file called $something.jpg in the DOM and in its cache, but it will actually be a WebP file. That’s ok, but could be confusing to users who save the file for whatever reason and find it isn’t actually the JPEG they were expecting.</p>
<h3 id="301302-redirect-option">301/302 redirect option</h3>
<p>One remedy for that is to serve the WebP file via a 301 or 302 redirect instead of transparently in the response, so that the browser knows it’s dealing with a different file named $something.webp. To do that we changed the nginx configuration like this:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">rewrite "(.*)\.\w+$" $1.webp permanent;
</code></pre></div><p>That adds a little bit of overhead, around 100-200 bytes unless large cookies are sent in the request headers, and another network round-trip or two, though it’s still a win with the reduced file sizes we saw. However, I found that it isn’t even necessary right now due to an interesting behavior in Chrome that may even be intentional to cope with this very situation. (Or it may be a happy accident.)</p>
<h3 id="chrome-image-download-behavior">Chrome image download behavior</h3>
<p>Versions of Chrome I tested only send the Accept: image/webp [etc.] request header when fetching images from an HTML page, not when you manually request a single file or ask the browser to save the image from the page by right-clicking or similar. In those cases the Accept header is not sent, so the server doesn’t know the browser supports WebP, and you get the JPEG or PNG you asked for. That was actually a little confusing to hunt down by sniffing the HTTP traffic on the wire, but it may be a nice thing for users as long as WebP is still less-known.</p>
<h3 id="batch-conversion">Batch conversion</h3>
<p>It’s fun to experiment, but we needed to actually get all the images converted for our website. Surprisingly, even converting from JPEG isn’t too bad, though you need a higher quality setting and the file size will be larger. Still, for best image quality at the smallest file size, we wanted to start with original PNG images, not recompress JPEGs.</p>
<p>To make that easy, we wrote <a href="https://gist.github.com/jonjensen/8677031">two shell scripts</a> for Linux using bash and cwebp. We found a few exceptional images that were larger in WebP than in PNG or JPEG, so the script deletes any WebP file that is not smaller, and our nginx configuration will in that case not find a .webp file and will serve the original PNG or JPEG.</p>
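<p>The core of that logic is small enough to sketch here. This is a simplified stand-in for the actual scripts linked above: convert every PNG in a directory, then keep the WebP file only if it is actually smaller:</p>
<pre tabindex="0"><code>#!/bin/bash
# Convert each PNG to WebP; discard any result that isn't smaller,
# so nginx falls back to serving the original file.
for png in *.png; do
  webp="${png%.png}.webp"
  cwebp -quiet -q 80 "$png" -o "$webp"
  if [ "$(stat -c%s "$webp")" -ge "$(stat -c%s "$png")" ]; then
    rm -f "$webp"
  fi
done
</code></pre>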
<h3 id="full-page-download-sizes-compared">Full-page download sizes compared</h3>
<p>Here are performance tests run by WebPageTest.org using Chrome 32 on Windows 7 on a simulated cable Internet connection. The total download size difference is most impressive, and on a slower mobile network or with higher latency (greater distance from the server) would affect the download time more.</p>
<table id="pagespeeds">
<tbody><tr>
<th align="center" rowspan="2">Page URL</th>
<th align="center" colspan="2">With WebP</th>
<th align="center" colspan="2">Without WebP</th>
</tr>
<tr>
<th align="center">Bytes</th>
<th align="center">Time</th>
<th align="center">Bytes</th>
<th align="center">Time</th>
</tr>
<tr>
<td>https://www.endpointdev.com/</td>
<td align="right">374 KB</td>
<td align="right">2.9s</td>
<td align="right">850 KB</td>
<td align="right">3.4s</td>
</tr>
<tr>
<td>https://www.endpointdev.com/team/</td>
<td align="right">613 KB</td>
<td align="right">3.6s</td>
<td align="right">1308 KB</td>
<td align="right">4.1s</td>
</tr>
</tbody></table>
<h3 id="conclusion">Conclusion</h3>
<p>This article is not even close to a comprehensive shootout between WebP and other image types. There are other sites that consider the image format technical details more closely and have well-chosen sample images.</p>
<p>My purpose here was to convert a real website in bulk to WebP without hand-tuning individual images or spending too much time on the project overall; to see whether the infrastructure is easy enough to set up; to see whether the download size and speed improve enough to make it worth the trouble; and to get real-world experience so we can decide whether to recommend it to our clients, and in which situations.</p>
<p>So far it seems worth it, and we plan to continue using WebP on our website. With empty browser caches, visit <a href="/">www.endpointdev.com</a> using Chrome and then one of the browsers that doesn’t support WebP, and see if you notice a speed difference on first load, or any visual difference.</p>
<p>I hope to see WebP further developed and more widely supported.</p>
<h3 id="further-reading">Further reading</h3>
<ul>
<li><a href="https://developers.google.com/speed/webp/">WebP project home</a></li>
<li><a href="http://en.wikipedia.org/wiki/WebP">Wikipedia on WebP</a></li>
<li>Firefox debate: <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=600919">bug #600919</a>, <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=856375">bug #856375</a></li>
<li><a href="http://webpjs.appspot.com/">WebPJS</a></li>
</ul>
A Performance Case Studyhttps://www.endpointdev.com/blog/2011/02/performance-case-study/2011-02-09T00:00:00+00:00Steph Skardal
<table cellpadding="5" cellspacing="0" width="100%">
<tbody><tr>
<td valign="top">
<p>I’m a sucker for a good performance case study. So, when I came across a general request for performance improvement suggestions at <a href="https://inspiredology.com/">Inspiredology</a>, I couldn’t help but experiment a bit.</p>
<p>The site runs on WordPress and is heavy on the graphics as it’s a site geared towards web designers. I inquired to the site administrators about grabbing a static copy of their home page and using it for a case study on our blog. My tools of choice for optimization were <a href="https://www.webpagetest.org/">webpagetest.org</a> and <a href="http://yslow.org/">YSlow</a>.</p>
<p>Here are the results of a 4-step optimization in visual form:</p>
<table cellpadding="0" celspacing="0" width="100%">
<tbody><tr><td>
<img src="https://chart.apis.google.com/chart?chxl=0:|Step+%234|Step+%233|Step+%232|Step+%231|Original&chxr=0,0,15|1,0,15&chxt=y,t&chbh=a&chs=300x225&cht=bhg&chco=A2C180&chds=0,15&chd=t:13.412,11.957,10.561,10.243,9.212&chtt=First+Request+Load+Time+(seconds)"/>
</td><td>
<img src="https://chart.apis.google.com/chart?chxl=0:|Step+%234|Step+%233|Step+%232|Step+%231|Original&chxr=0,0,15|1,0,15&chxt=y,t&chbh=a&chs=300x225&cht=bhg&chco=A2C180&chds=0,15&chd=t:7.329,3.717,3.278,2.434,2.563&chtt=Repeat+Request+Load+Time+(seconds)"/>
</td></tr></tbody></table>
</td>
<td align="center" valign="top">
<a href="/blog/2011/02/performance-case-study/image-2-big.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img alt="" border="0" id="BLOGGER_PHOTO_ID_5571698267043922738" src="/blog/2011/02/performance-case-study/image-2.png" style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 135px; height: 400px;"/></a>
<p>Inspiredology’s complete homepage.</p>
</td>
</tr>
</tbody></table>
<p>The graph on the left shows the page load time in seconds for a first time view. Throughout optimization, page load time goes from 13.412 seconds to 9.212 seconds. Each step had a measurable impact. The graph on the right shows the page load time in seconds for a repeated view, and this goes from 7.329 seconds to 2.563 seconds throughout optimization. The first optimization step (CSS spriting and file combination) yielded a large performance improvement. I’m not sure why there’s a slight performance decrease between step 3 and step 4.</p>
<p>And here’s a summary of the changes involved in each step:</p>
<ul>
<li>
<p>Step 1</p>
<ul>
<li>Addition of CSS Sprites: I <a href="/blog/2010/09/css-sprites/">wrote about CSS Sprites</a> a while back and A List Apart has an older but still relevant article on CSS Sprites <a href="https://web.archive.org/web/20110217044912/www.alistapart.com/articles/sprites">here</a>. Repeating elements like navigation components, icons, and buttons are suitable for CSS sprites. Article or page-specific images are not typically suitable for CSS sprites. For Inspiredology’s site, I created two sprited images—one with a large amount of navigation components, and one with some of their large background images. You can find a great tool for building CSS rules from a sprited image <a href="https://web.archive.org/web/20110217071401/http://www.spritebox.net/">here</a>.</li>
<li>Combination of JS and CSS files, where applicable. Any JavaScript or CSS files that are included throughout the site are suitable for combination. Files that can’t be combined include third-party JavaScript like Google Analytics or marketing service scripts.</li>
<li>Moved JavaScript requests to the bottom of the HTML. This is recommended because JavaScript requests block parallel downloading. Moving them to the bottom allows page elements to be downloaded and rendered first, followed by JavaScript loading.</li>
</ul>
</li>
<li>
<p>Step 2</p>
<ul>
<li>Image compression with jpegtran, pngcrush, convert. I use pngcrush often. I read about jpegtran in <a href="https://developer.yahoo.com/performance/rules.html">Yahoo’s Best Practices for Speeding Up Your Web Site</a>. I <a href="/blog/2009/12/jpeg-compression-quality-or-quantity/">wrote a bit about image compression</a> a while ago and briefly experimented with image compression using imagemagick on Inspiredology’s images.</li>
</ul>
</li>
<li>
<p>Step 3</p>
<ul>
<li>Addition of expires headers and disabling ETags: These are standard optimization suggestions. <a href="/team/jon-jensen/">Jon Jensen</a> wrote about using these a bit <a href="/blog/2010/11/speeding-up-spree-demo-site/">here</a> and <a href="/blog/2009/10/performance-optimization-of/">here</a>. (See the configuration sketch after this list.)</li>
</ul>
</li>
<li>
<p>Step 4</p>
<ul>
<li>Serving gzipped content with mod_deflate: Also a fairly standard optimization suggestion, sketched after this list. Although, I should note I had some issues gzipping a couple of the files, and since the site was in a temporary location, I didn’t spend much time troubleshooting.</li>
<li>A bit more cleanup of rogue HTML and CSS files. In particular, there was one HTML file requested that didn’t have any content in it, and another that had JavaScript that I appended to the combined JavaScript file.</li>
</ul>
</li>
</ul>
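<p>For reference, here is a minimal sketch of the Apache configuration behind steps 3 and 4. The content types are illustrative, and it assumes mod_expires, mod_headers, and mod_deflate are enabled:</p>
<pre tabindex="0"><code># Step 3: far-future expires headers for static assets, and no ETags
ExpiresActive On
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType image/png  "access plus 1 month"
ExpiresByType text/css   "access plus 1 month"
FileETag None
Header unset ETag

# Step 4: gzip text-based responses with mod_deflate
AddOutputFilterByType DEFLATE text/html text/css application/javascript
</code></pre>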
<p>A side-by-side comparison of webpagetest.org’s original versus step 4 results highlights the reduction of requests in the waterfall and the large reduction in requests on the repeat view:</p>
<table width="100%">
<tbody><tr>
<td valign="top">
<a href="/blog/2011/02/performance-case-study/image-3-big.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img alt="" border="0" id="BLOGGER_PHOTO_ID_5571698261206853250" src="/blog/2011/02/performance-case-study/image-3.png" style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 318px; height: 400px;"/></a>
</td>
<td valign="top">
<a href="/blog/2011/02/performance-case-study/image-4-big.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img alt="" border="0" id="BLOGGER_PHOTO_ID_5571698261191315554" src="/blog/2011/02/performance-case-study/image-4.png" style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 365px;"/></a>
</td>
</tr>
</tbody></table>
<h3 id="what-next">What Next?</h3>
<p>At this point, <a href="https://www.webpagetest.org/">webpagetest.org</a> suggests the following changes:</p>
<ul>
<li>Gzipping the remaining components has the potential of reducing total bytes of the first request by ~10%.</li>
<li>Additional image compression has the potential of reducing total bytes of the first request by ~6%. This metric is based on their image compression check: “JPEG—Within 10% of a photoshop quality 50 will pass, up to 50% larger will warn and anything larger than that will fail.” Quite a few of Inspiredology’s JPEGs did not pass this test and could be optimized further.</li>
<li>Use a CDN. This is a common optimization suggestion, but the cost of a CDN isn’t always justified for smaller sites.</li>
</ul>
<p>I would suggest:</p>
<ul>
<li>Revisiting CSS spriting to further optimize. I only spent a short time spriting and didn’t work out all the kinks. There were a few requests that I didn’t sprite because they were repeating elements, but repeating elements can be sprited together. Another 5 requests might be eliminated with additional CSS spriting.</li>
<li>Server-side optimization: Inspiredology runs on WordPress. We’ve used the <a href="https://wordpress.org/plugins/wp-cache/">wp-cache</a> plugin for a couple of our clients running WordPress, which I believe helps. But note that the case study presented here is a static page with static assets, so there is obviously a huge gain to be had by optimizing how the images, CSS, and JavaScript are served.</li>
<li>Database optimization: Again, there’s no database in play in this static-page experiment. But there’s always room for improvement in database optimization. <a href="/team/josh-tolley/">Josh Tolley</a> recently made performance improvements for one of our clients running Rails with PostgreSQL using <a href="https://bucardo.org/Pgsi/">pgsi</a>, our open source PostgreSQL performance reporting tool, and had outrageously impressive benchmarked improvements.</li>
<li>I just read an article about CSS selector performance. The combined.css file I created for this case study has 2,000 lines. Although there might be only a small win from optimizing selectors, cleanup of that file would surely be beneficial.</li>
<li>I recently wrote about several <a href="/blog/2011/01/jquery-tips-ecommerce/">jQuery tips</a>, including performance optimization techniques. This isn’t going to improve the serving of static assets, but it would be another customer-facing enhancement that can improve the usability of the site.</li>
</ul>
<p>I highly recommend reading <a href="https://developer.yahoo.com/performance/rules.html">Yahoo’s Best Practices on Speeding Up Your Web Site</a>. They have a great summary of performance recommendations, covering the topics described in this article and lots more.</p>
Postgres SQL Backup Gzip Shrinkage, aka Don’t Panic!!!https://www.endpointdev.com/blog/2010/01/postgres-sql-backup-gzip-shrinkage-aka/2010-01-09T00:00:00+00:00David Christensen
<p>I was investigating a recent Postgres server issue, where we had
discovered that one of the RAM modules on the server in question had
gone bad. Unsurprisingly, one of the things we looked at was the
possibility of having to do a restore from a SQL dump: if there had
been any corruption to the data directory, a base backup
would potentially have been subject to the same errors that
we were trying to avoid by restoring.</p>
<p>As it was already the middle of the night (does anyone ever have a server
emergency during normal business hours?), my investigations were
hampered by my lack of sleep.</p>
<p>If there had been data directory corruption, the pg_dump process
would likely have failed partway through the backup, and we’d expect
the dumps to be truncated. Ideally this wasn’t the case: memory
testing had not shown the DIMM to be bad, though the hardware sensor had
alerted us all the same.</p>
<p>I logged into the backup server and looked at the backup dumps; from
the alerts that we’d gotten, the memory was flagged bad on January 3.
I listed the files, and noticed the following oddity:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"> -rw-r--r-- 1 postgres postgres 2379274138 Jan 1 04:33 backup-Jan-01.sql.gz
-rw-r--r-- 1 postgres postgres 1957858685 Jan 2 09:33 backup-Jan-02.sql.gz
</code></pre></div><p>Well, this was disconcerting. The memory event had taken place on the
3rd, but there was a large drop in the size of the dumps between January
1st and January 2nd (more than 400 MB of <em>compressed</em> output, for those of
you playing along at home). This indicated that either the memory
event took place earlier than recorded, or something somewhat
catastrophic had happened to the database; perhaps some large deletion
or truncation of some key tables.</p>
<p>Racking my brains, I tried to come up with an explanation: we’d had a recent maintenance window between January 1 and January 2, in which we’d scheduled a CLUSTER/REINDEX to reclaim some of the bloat in the database itself. But that would only reduce the size of the data directory; the amount of live data would have stayed the same or grown modestly.</p>
<p>Obviously we needed to compare the two files in order to determine
what had changed between the two days. I tried:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"> diff <(zcat backup-Jan-01.sql.gz | head -2300) <(zcat backup-Jan-02.sql.gz | head -2300)
</code></pre></div><p>Based on my earlier testing, this offset into the SQL dumps covered the schema definitions for the database, excluding the data; in particular I was interested to see whether there had been (say) any temporarily created tables which had been dropped during the maintenance window. However, this showed only minor changes (updates to default sequence values). It was time to do a full diff of the data to see whether some such temporary tables had been truncated, or some catastrophic deletion had occurred, or… you get the idea. I tried:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"> diff <(zcat backup-Jan-01.sql.gz) <(zcat backup-Jan-02.sql.gz)
</code></pre></div><p>However, this approach fell down when diff ran out of memory. We decided to unzip the files first and diff them directly, in case the failure had something to do with the parallel unzips. And here was a mystery: after unzipping the dumps in question, we saw the following:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"> -rw-r--r-- 1 root root 10200609877 Jan 8 02:19 backup-Jan-01.sql
-rw-r--r-- 1 root root 10202928838 Jan 8 02:24 backup-Jan-02.sql
</code></pre></div><p>The uncompressed versions of these files showed sizes consistent with slow growth; the Jan 02 backup was slightly larger than the Jan 01 backup. This was really weird! Was there some threshold in gzip where, given a particular file size, it switched to a different compression algorithm? Had someone tweaked the backup script to gzip with a different compression level? Had I just gone delusional from lack of sleep? Since gzip can operate on streams, the first option seemed unlikely, and is something I would have heard about before. I verified that the arguments to gzip in the backup job had not changed, which took that choice off the table. That left the last option, but I had the terminal scrollback history to back me up.</p>
<p>We finished the rest of our work that night, but the gzip oddity stuck with me through the next day. I was relating the oddity of it all to a co-worker when insight struck: since we’d CLUSTERed the tables, similar data (in the form of the tables’ multi-part primary keys) had been reorganized onto the same database pages, so when pg_dump read and wrote the data in page order, gzip had that much more similarity in the same neighborhood to work with, which resulted in the dramatic decrease in the size of the compressed dumps.</p>
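<p>The effect is easy to reproduce at a small scale. Here is a quick sketch (not from the incident itself; the data and filenames are made up) that compresses the same rows in random versus clustered order:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"># Rows with many repeated values, in random vs. clustered order
seq 1 200000 | awk '{print $1 % 100, "some repeated row text"}' | shuf > random.txt
sort -n random.txt > clustered.txt
gzip -c random.txt > random.txt.gz
gzip -c clustered.txt > clustered.txt.gz
ls -l random.txt.gz clustered.txt.gz
</code></pre></div><p>The clustered file should come out noticeably smaller, since gzip keeps finding near-identical neighboring lines within its window.</p>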
<p>So the good news was that CLUSTER will save you space in your SQL dumps as well (if you’re compressing them); the bad news was that it took an emergency situation and a near heart attack for this engineer to figure it all out. Hope I’ve saved you the trouble… :-)</p>
JPEG compression: quality or quantity?https://www.endpointdev.com/blog/2009/12/jpeg-compression-quality-or-quantity/2009-12-24T00:00:00+00:00Daniel Browning
<p>There are many aspects of JPEG files that are interesting to web site developers, such as:</p>
<ul>
<li>The optimal trade-off between quality and file size for any encoder and uncompressed source image.</li>
<li>Reducing the size of an existing JPEG image when the uncompressed source is unavailable, while still finding the same optimal trade-off.</li>
<li>Comparison of different encoders and/or settings for quality at a given file size.</li>
</ul>
<p>Two essential factors are file size and image quality. Bytes are objectively measurable, but image quality is much more nebulous. What to one person is a perfectly acceptable image is to another a grotesque abomination of artifacts. So the quality factor is subjective. For example, Steph sent me some images to compare compression artifacts. Here is the first one with three different settings in ImageMagick: 95, 50, and 8:</p>
<table cellpadding="0" cellspacing="2">
<tbody><tr><td valign="top">
<a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-0.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-0.jpeg" title="size: 27K setting: 95"/></a></td>
<td valign="top"><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-1.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-1.jpeg" title="size: 8.0K setting: 50"/></a></td>
<td valign="top">
<a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-2.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-2.jpeg" title="size: 2.9K setting: 8"/></a></td></tr></tbody></table>
<p>Compare the subtle (or otherwise) differences in the following images (mouseover shows the filesize and compression setting):</p>
<table>
<tbody><tr>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-3.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-3.jpeg" title="size: 30K setting: 95"/></a></td>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-4.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-4.jpeg" title="size: 17K setting: 85"/></a></td>
</tr><tr>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-5.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-5.jpeg" title="size: 7.7K setting: 50"/></a></td>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-6.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-6.jpeg" title="size: 4.1K setting: 20"/></a></td>
</tr></tbody></table>
<p>I think many would find the setting of 8 to have too many artifacts, even though the result is 10 times smaller than the image compressed at a setting of 95. Some would find the setting of 50 to be an acceptable tradeoff between quality and size, since it sends 3.4 times fewer bytes. Additional comparisons are at the end of this post; each image can be opened in a separate browser tab for easy A/B comparison.</p>
<p>Here is the code I wrote to make the comparison (shell script is great for this stuff):</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash"><span style="color:#c00;font-weight:bold">#!/bin/bash
</span><span style="color:#c00;font-weight:bold"></span><span style="color:#369">HTML_OUTFILE</span>=comparison.html
<span style="color:#038">echo</span> <span style="color:#d20;background-color:#fff0f0">'<html>'</span> > <span style="color:#369">$HTML_OUTFILE</span>
write_img_html () {
<span style="color:#369">size</span>=<span style="color:#d20;background-color:#fff0f0">`</span>du -h --apparent-size <span style="color:#369">$1</span> | cut -f 1<span style="color:#d20;background-color:#fff0f0">`</span>
<span style="color:#080;font-weight:bold">if</span> [ -n <span style="color:#d20;background-color:#fff0f0">"</span><span style="color:#369">$2</span><span style="color:#d20;background-color:#fff0f0">"</span> ]; <span style="color:#080;font-weight:bold">then</span>
<span style="color:#369">qual</span>=<span style="color:#d20;background-color:#fff0f0">"setting: </span><span style="color:#369">$2</span><span style="color:#d20;background-color:#fff0f0">"</span>
<span style="color:#080;font-weight:bold">fi</span>
cat <span style="color:#d20;background-color:#fff0f0"><<EOF >>$HTML_OUTFILE
</span><span style="color:#d20;background-color:#fff0f0"><a href="$1"><img src="$1" title="size: $size $qual"></a>
</span><span style="color:#d20;background-color:#fff0f0">EOF</span>
}
<span style="color:#080;font-weight:bold">for</span> name in image1 image2; <span style="color:#080;font-weight:bold">do</span>
<span style="color:#369">orig</span>=<span style="color:#369">$name</span>-original.jpg
<span style="color:#369">resized</span>=<span style="color:#369">$name</span>-300.png
<span style="color:#038">echo</span> Resizing <span style="color:#369">$orig</span> to <span style="color:#00d;font-weight:bold">300</span> on longest side: <span style="color:#369">$resized</span>...
convert <span style="color:#369">$orig</span> -resize 300x300 <span style="color:#369">$resized</span>
write_img_html <span style="color:#369">$resized</span> <span style="color:#d20;background-color:#fff0f0">"lossless"</span>
<span style="color:#080;font-weight:bold">for</span> quality in <span style="color:#00d;font-weight:bold">100</span> <span style="color:#00d;font-weight:bold">95</span> <span style="color:#00d;font-weight:bold">85</span> <span style="color:#00d;font-weight:bold">50</span> <span style="color:#00d;font-weight:bold">20</span> <span style="color:#00d;font-weight:bold">8</span> 1; <span style="color:#080;font-weight:bold">do</span>
<span style="color:#038">echo</span> Creating JPEG quality <span style="color:#369">$quality</span>...
<span style="color:#369">jpeg</span>=<span style="color:#369">$name</span>-300-q-<span style="color:#369">$quality</span>.jpg
convert <span style="color:#369">$resized</span> -strip -quality <span style="color:#369">$quality</span> <span style="color:#369">$jpeg</span>
write_img_html <span style="color:#369">$jpeg</span> <span style="color:#369">$quality</span>
<span style="color:#080;font-weight:bold">done</span>
<span style="color:#080;font-weight:bold">done</span>
</code></pre></div><p>Another factor that often comes into play is how artifacts in the image (e.g. aliasing, ringing, noise) combine with JPEG compression artifacts to exacerbate quality problems. So one way to get smaller file sizes is to reduce the other types of artifacts in the image, thereby allowing higher JPEG compression.</p>
<p>The most common source of artifacts is image resizing. If you are resizing the images, I strongly recommend using a program that has a high-quality filter. IrfanView and ImageMagick are two good choices.</p>
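<p>For example, a resize with ImageMagick might look like this (a minimal sketch; the filenames are illustrative, and Lanczos is one commonly recommended high-quality filter):</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">convert original.png -filter Lanczos -resize 300x300 resized.png
</code></pre></div>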
<p>The ideal situation is this:</p>
<ul>
<li>An uncompressed source image</li>
<li>Full resolution, if you will be handling the resize yourself</li>
<li>No pre-existing artifacts such as aliasing</li>
<li>A resize performed with good software like ImageMagick</li>
<li>JPEG compression chosen based on subjective quality assessment</li>
</ul>
<p>Choosing the trade-off between quality and file size is difficult in part because it varies by image content. Images with lots of small color details (e.g. bright fabric threads, a.k.a. high spatial-frequency chroma) withstand less compression than images that have only medium-sized details and no important, minute color information.</p>
<p>One setting that is important for small web images is removal of the color space profile (e.g. sRGB). The profile is only needed when there is a good reason to use a non-sRGB JPEG, such as when you are certain that your users will have color-managed browsers. Removing it can shave off 5KB or so; software will assume that images without profiles are sRGB. It can be removed with the -strip parameter of ImageMagick.</p>
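<p>For example (a sketch with illustrative filenames; note that converting a JPEG to a JPEG re-encodes the pixel data, so it is best to strip when first creating the file):</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">convert source.png -strip -quality 85 photo.jpg
ls -l photo.jpg   # compare against the same command without -strip
</code></pre></div>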
<p>As for choosing the specific compression settings, keep in mind that there are over 30 different options and techniques that can be used when compressing an image. Most image programs simplify that to a sliding scale from 0 to 100, 1 to 12, or something else. Note that even when programs use the same scale (e.g. 0 to 100), they probably have different ideas of what the numbers mean: 95 in one program may be very different from 95 in another.</p>
<p>If bandwidth is not an issue, then I use a setting of 95 in ImageMagick, because in normal images I can’t tell the difference between 95 and 100. But when file size is an important concern, I consider 85 to be the optimal setting. At 85 the difference from 95 should be visible in the comparison images, but I generally find that cutting the file size in half is worth it. Below 85, the artifacts are too onerous for my taste.</p>
<p>You don’t often hear about web site visitors’ dissatisfaction with compression artifacts, so you might be tempted to reduce file sizes even past the point where the artifacts get noticeable. But I think there is a subliminal effect from the reduced image quality. Visitors may not stop visiting the site immediately, but my gut feeling is that it leaves a certain impression in their minds. I would guess that user testing might produce comments such as “the X web site is not the same high-grade quality as the Y web site”, even if users don’t put it into words as specific as “the compression artifacts make X look uglier than Y”. Even if that pet theory is true, it still has to be balanced against the benefit of faster page loading times.</p>
<p>Ideally, the tradeoff between quality and page loading time would be a choice left to the user. Those who prefer fewer artifacts could set their browser to download larger, less-compressed image files than the default, while users with low bandwidth could set it for more compressed images to get a faster page load at the expense of quality. I could imagine an Apache module and corresponding Firefox add-on some day.</p>
<p>Regarding the situation where you want to reduce the file size of existing JPEGs, my advice is to first try (hard) to get the original source files. You can do better (for any given quality/size tradeoff) from those than you can by manipulating the existing files. If that’s not possible, then suboptimal workflows like jpegtran, jpegoptim, or a full decompress/recompress are the only alternatives, as sketched below.</p>
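<p>A sketch of those fallback options (filenames are illustrative):</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"># Lossless: rewrite the entropy coding and drop metadata, leaving pixels untouched
jpegtran -optimize -copy none existing.jpg > smaller.jpg
# Lossless optimization in place, stripping all markers
jpegoptim --strip-all existing.jpg
# Lossy last resort: full decompress/recompress at a lower quality
convert existing.jpg -strip -quality 85 recompressed.jpg
</code></pre></div>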
<p>As far as comparing different encoders, I haven’t really looked into that except to compare ImageMagick and Photoshop, where I (subjectively) determined they both had about similar quality for file size (and vice-versa).</p>
<p>Here are all the comparison images. The file size and ImageMagick quality setting are in the rollover. I suggest opening images in browser tabs for easy A/B comparison.</p>
<table>
<tbody><tr>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-7.png"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-7.png" title="size: 87K setting: lossless"/></a></td>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-8.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-8.jpeg" title="size: 74K setting: 100"/></a></td>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-0.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-0.jpeg" title="size: 27K setting: 95"/></a></td>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-10.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-10.jpeg" title="size: 15K setting: 85"/></a></td>
</tr><tr>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-1.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-1.jpeg" title="size: 8.0K setting: 50"/></a></td>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-12.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-12.jpeg" title="size: 4.8K setting: 20"/></a></td>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-2.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-2.jpeg" title="size: 2.9K setting: 8"/></a></td>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-14.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-14.jpeg" title="size: 1.7K setting: 1"/></a></td>
</tr></tbody></table>
<table><tbody><tr>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-15.png"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-15.png" title="size: 87K setting: lossless"/></a></td>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-16.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-16.jpeg" title="size: 80K setting: 100"/></a></td>
</tr><tr>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-3.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-3.jpeg" title="size: 30K setting: 95"/></a></td>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-4.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-4.jpeg" title="size: 17K setting: 85"/></a></td>
</tr><tr>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-5.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-5.jpeg" title="size: 7.7K setting: 50"/></a></td>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-6.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-6.jpeg" title="size: 4.1K setting: 20"/></a></td>
</tr><tr>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-21.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-21.jpeg" title="size: 2.2K setting: 8"/></a></td>
<td><a href="/blog/2009/12/jpeg-compression-quality-or-quantity/image-22.jpeg"><img src="/blog/2009/12/jpeg-compression-quality-or-quantity/image-22.jpeg" title="size: 1.3K setting: 1"/></a></td>
</tr></tbody></table>
XZ compressionhttps://www.endpointdev.com/blog/2009/11/xz-compression/2009-11-23T00:00:00+00:00Jon Jensen
<p><a href="https://en.wikipedia.org/wiki/Xz">XZ</a> is a new free compression file format that is starting to be more widely used. The LZMA2 compression method it uses first became popular in the <a href="https://www.7-zip.org/">7-Zip</a> archive program, with an analogous Unix command-line version called <em>7z</em>.</p>
<p>We used XZ for the first time in the <a href="http://www.icdevgroup.org/i/dev">Interchange project</a> in the <a href="http://www.icdevgroup.org/i/dev/news?mv_arg=00039">Interchange 5.7.3</a> packages. Compared to gzip and bzip2, the file sizes were as follows:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">interchange-5.7.3.tar.gz 2.4M
interchange-5.7.3.tar.bz2 2.1M
interchange-5.7.3.tar.xz 1.7M
</code></pre></div><p>Getting that tighter compression comes at a cost: compressing takes about 4 times longer than bzip2. But a bonus is that it decompresses about 3 times faster than bzip2. The combination of significantly smaller file sizes and faster decompression made it a clear win for distributing software packages, leading to it being the format used for packages in <a href="https://docs.fedoraproject.org/release-notes/f12/en-US/html/">Fedora 12</a>.</p>
<p>It’s also easy to use on Ubuntu 9.10, via the standard <em><a href="https://tukaani.org/xz/">xz-utils</a></em> package. When you install that with apt-get, aptitude, etc., you’ll get a scary warning about it replacing <em>lzma</em>, a core package, but this is safe to do because xz-utils provides compatible replacement binaries /usr/bin/lzma and friends (lzcat, lzless, etc.). There is also built-in support in <a href="https://www.gnu.org/software/tar/">GNU tar</a> with the new <code>--xz</code> (aka <code>-J</code>) option.</p>
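<p>Usage is straightforward. For example (the directory name is illustrative):</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">tar cf interchange-5.7.3.tar interchange-5.7.3/
xz -9 interchange-5.7.3.tar        # replaces it with interchange-5.7.3.tar.xz
# or, in one step with a recent GNU tar:
tar cJf interchange-5.7.3.tar.xz interchange-5.7.3/
</code></pre></div>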
Performance optimization of icdevgroup.orghttps://www.endpointdev.com/blog/2009/10/performance-optimization-of/2009-10-23T00:00:00+00:00Jon Jensen
<p>Some years ago Davor Ocelić redesigned <a href="http://www.icdevgroup.org/">icdevgroup.org</a>, Interchange’s home on the web. Since then, most of the attention paid to it has been on content such as news, documentation, release information, and so on. We haven’t looked much at implementation or optimization details. Recently I decided to do just that.</p>
<p><strong>Interchange optimizations</strong></p>
<p>There is currently no separate logged-in user area of icdevgroup.org, so Interchange is primarily used here as a templating system and database interface. The automatic read/write of a server-side user session is thus unneeded overhead, as is periodic culling of the old sessions. So I turned off permanent sessions by making all visitors appear to be search bots. Adding to interchange.cfg:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">RobotUA *
</code></pre></div><p>That would not work for most Interchange sites, which need a server-side session for storing mv_click action code, scratch variables, logged-in state, shopping cart, etc. But for a read-only content site, it works well.</p>
<p>By default, Interchange writes user page requests to a special tracking log as part of its UserTrack facility. It also outputs an X-Track HTTP response header with some information about the visit which can be used by a (to my knowledge) long defunct analytics package. Since we don’t need either of those features, we can save a tiny bit of overhead. Adding to catalog.cfg:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">UserTrack No
</code></pre></div><p>Very few Interchange sites have any need for UserTrack anymore, so this is commonly a safe optimization to make.</p>
<p><strong>HTTP optimizations</strong></p>
<p>Today I ran the excellent webpagetest.org test, and this was the <a href="http://www.webpagetest.org/result/091023_2M8V/">icdevgroup.org test result</a>. Even though icdevgroup.org is a fairly simple site without much bloat, two obvious areas for improvement stood out.</p>
<p>First, gzip/deflate compression of textual content should be enabled. That cuts down on bandwidth used and page delivery time by a significant amount, and with modern CPUs adds no appreciable extra CPU load on either the client or the server.</p>
<p>We’re hosting icdevgroup.org on Debian GNU/Linux with Apache 2.2, which has a reasonable default configuration of mod_deflate that does this, so it’s easy to enable:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">a2enmod deflate
</code></pre></div><p>That sets up symbolic links in /etc/apache2/mods-enabled for deflate.load and deflate.conf to enable mod_deflate. (Use a2dismod to remove them if needed.)</p>
<p>I added two content types for CSS & JavaScript to the default in deflate.conf:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css application/x-javascript
</code></pre></div><p>That used to be riskier when very old browsers such as Netscape 3 and 4 claimed to support compressed CSS & JavaScript but actually didn’t. But those browsers are long gone.</p>
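<p>A quick way to confirm that compression is actually being applied is to request a page with an Accept-Encoding header and inspect the response headers, for example with curl:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">curl -s -H 'Accept-Encoding: gzip' -D - -o /dev/null http://www.icdevgroup.org/ | grep -i content-encoding
</code></pre></div>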
<p>The next easy optimization is to enable proxy and browser caching of static content: images, CSS, and JavaScript files. By doing this we eliminate repeat HTTP requests for these files; the browser won’t even check with the server to see whether it has the current version of a file once it has loaded it into its cache, making subsequent use of those files blazingly fast.</p>
<p>There is, of course, a tradeoff to this. Once the browser has the file cached, you can’t make it fetch a newer version unless you change the filename. So we’ll set a cache lifetime of only one hour. That’s long enough to easily cover most users' browsing sessions at a site like this, but short enough that if we need to publish a new version of one of these files, it will still propagate fairly quickly.</p>
<p>So I added to the Apache configuration file for this virtual host:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">ExpiresActive On
ExpiresByType image/gif "access plus 1 hour"
ExpiresByType image/jpeg "access plus 1 hour"
ExpiresByType image/png "access plus 1 hour"
ExpiresByType text/css "access plus 1 hour"
ExpiresByType application/x-javascript "access plus 1 hour"
FileETag None
Header unset ETag
</code></pre></div><p>This adds the HTTP response header “Cache-Control: max-age=3600” for those static files. I also have Apache remove the ETag header, which is not needed given this caching and the Last-Modified header.</p>
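<p>Note that the ExpiresByType directives depend on mod_expires, and the Header directive on mod_headers; on Debian both can be enabled the same way as mod_deflate above:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">a2enmod expires
a2enmod headers
</code></pre></div>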
<p>There are cases where the above configuration would be too broad, for example, if you have:</p>
<ul>
<li>images that differ but share the same filename, such as CAPTCHAs</li>
<li>static files that vary based on logged-in state</li>
<li>dynamically-generated CSS or JavaScript files with the same name</li>
</ul>
<p>If the website is completely static, including the HTML, or identical for all users at the same time even though dynamically generated, we could also enable caching the HTML pages themselves. But in the case of icdevgroup.org, that would probably cause trouble with the Gitweb repository browser, live documentation searches, etc.</p>
<p>After those changes, we can see the <a href="http://www.webpagetest.org/result/091023_2M91/">results of a new webpagetest.org run</a>: we reduced both the bytes transferred and the delivery time. It’s especially dramatic to see how much faster subsequent page views of the Hall of Fame are, since it has many screenshot thumbnail images.</p>
<p>Optimizing a simple non-commerce site such as icdevgroup.org is easy and even fun. With caution and practicing on a non-production system, complex ecommerce sites can be optimized using the same techniques, with even more dramatic benefits.</p>
rsync and bzip2 or gzip compressed datahttps://www.endpointdev.com/blog/2009/10/rsync-and-bzip2-or-gzip-compressed-data/2009-10-06T00:00:00+00:00Jon Jensen
<p>A few days ago, I learned that gzip has a custom option <code>--rsyncable</code> on Debian (and thus also Ubuntu). This <a href="https://beeznest.wordpress.com/2005/02/03/rsyncable-gzip/">old write-up</a> covers it well, or you can just <code>man gzip</code> on a Debian-based system and see the <code>--rsyncable</code> option note.</p>
<p>I hadn’t heard of this before and think it’s pretty neat. It resets the compression algorithm on block boundaries so that rsync won’t view every block subsequent to a change as completely different.</p>
<p>Because bzip2 has such large block sizes, it forces rsync to resend even more data for each plaintext change than plain gzip does, as <a href="https://web.archive.org/web/20090825013526/http://blog.arithm.com/2008/09/06/rsync-and-bzip2compressed-data/">noted here</a>.</p>
<p>Enter <a href="http://compression.ca/pbzip2/">pbzip2</a>. Based on how it works, I suspect that pbzip2 will be friendlier to rsync, because each thread’s compressed chunk has to be independent of the others. (However, pbzip2 can only operate on real input files, not stdin streams, so you can’t use it with e.g. <code>tar cj</code> directly.)</p>
<p>In the case of <code>gzip --rsyncable</code> and <code>pbzip2</code>, you trade slightly lower compression efficiency (< 1% or so worse) for reduced network usage by rsync. This is probably a good tradeoff in many cases.</p>
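<p>For example, a nightly dump pipeline on a Debian-based system might look like this (the database and path names are illustrative):</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"># rsync-friendly gzip: periodic reset points keep rsync deltas small
pg_dump mydb | gzip --rsyncable > /backup/mydb.sql.gz

# pbzip2 needs a real file, not a stream
pg_dump mydb > /backup/mydb.sql && pbzip2 /backup/mydb.sql
</code></pre></div>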
<p>But even more interesting for me, a couple of days ago Avery Pennarun posted an article about his experimental code to use the same principles to <a href="https://apenwarr.ca/log/?m=200910#04">more efficiently store deltas of large binaries in Git repositories</a>. It’s painful to deal with large binaries in any version control system I’ve used, and most people simply say, “don’t do that”. It’s too bad, because when you have everything else related to a project in version control, why not some large images or audio files too? It’s much more convenient for storage, distribution, complete documentation, and backups.</p>
<p>Avery’s experiment gives a bit of hope that someday we’ll be able to store big file changes in Git much more efficiently. (Though it doesn’t affect the size of the initial large object commits, which will still be bloated.)</p>
SDCH: Shared Dictionary Compression over HTTPhttps://www.endpointdev.com/blog/2009/07/sdch-shared-dictionary-compression-over/2009-07-27T00:00:00+00:00Jon Jensen
<p>Here’s something new in HTTP land to play with: Shared Dictionary Compression over HTTP (SDCH, apparently pronounced “sandwich”) is a new HTTP 1.1 extension announced by Wei-Hsin Lee of Google last September. Lee explains that with it “a user agent obtains a site-specific dictionary that then allows pages on the site that have many common elements to be transmitted much more quickly.” SDCH is applied before gzip or deflate compression, and Lee notes 40% better compression than gzip alone in their tests. Access to the dictionaries stored in the client is scoped by site and path just as cookies are.</p>
<p>The first client support was in the Google Toolbar for Internet Explorer, but it is now going to be much more widely used because it is supported in the Google Chrome browser for Windows. (It’s still not in the latest Chrome developer build for Linux, or at any rate not enabled by default if the code is there.)</p>
<p>Only Google’s web servers support it to date, as far as I know. Someone intended to start a mod_sdch project for Apache, but there’s no code at all yet and no activity since September 2008.</p>
<p>It is interesting to consider the challenge this will pose for HTTP proxies that filter content, since the entire content would not be available to the proxy to scan during a single HTTP conversation. Sneakily-split malicious payloads would then be reassembled by the browser or other client, not requiring JavaScript or other active reassembly methods. <a href="http://prxbx.com/forums/showthread.php?tid=1379&pid=12519">This forum thread</a> discusses this threat and gives an example of stripping the Accept-Encoding: sdch request headers to prevent SDCH from being used at all. Though the threat is real, it’s hard to escape the obvious analogy with TCP filtering, which had to grow from stateless to more difficult stateful packet inspection. New features mean not just new benefits but also new complexity, but that’s no reason to reflexively reject them.</p>
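<p>For the curious, the negotiation looks roughly like this, as I read the spec (a simplified sketch; the header values are illustrative):</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"># First visit: the client advertises SDCH; the server suggests a dictionary
GET /page1 HTTP/1.1
Accept-Encoding: gzip, sdch

HTTP/1.1 200 OK
Get-Dictionary: /dictionaries/site_dict
Content-Encoding: gzip

# Later request: the client names the dictionary it now holds
GET /page2 HTTP/1.1
Accept-Encoding: gzip, sdch
Avail-Dictionary: aBcD1234

HTTP/1.1 200 OK
Content-Encoding: sdch, gzip
</code></pre></div>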
<p>SDCH references:</p>
<ul>
<li><a href="https://groups.google.com/forum/#!forum/SDCH">SDCH Google Group</a> which includes the specification PDF and ongoing discussion</li>
<li><a href="http://assets.en.oreilly.com/1/event/7/Shared%20Dictionary%20Compression%20Over%20HTTP%20Presentation.ppt">Wei-Hsin Lee’s presentation slides on SDCH</a></li>
<li><a href="https://lists.w3.org/Archives/Public/ietf-http-wg/2008JulSep/0441.html">IETF mailing list announcement of SDCH</a> and ensuing discussion thread</li>
<li><a href="http://www.webadminblog.com/index.php/2008/06/24/the-velocity-2008-conference-experience-part-vi/">Velocity 2008 conference notes</a> where the pronunciation of SDCH is given as “sandwich”</li>
<li>Vaporware <a href="https://code.google.com/archive/p/mod-sdch/">Apache mod_sdch project</a></li>
</ul>