How they Almost Worked and What We Need – A List Apart


It’s our job, as designers and developers, to pick apart even the seemingly simplest tasks to find ways to improve them. When Ethan Marcotte coined “responsive web design,” he said that a responsive website is made up of three things: a flexible grid, flexible images and media, and media queries. In doing so, he opened up a world of new and exciting things to obsess over. I chose flexible images.


It’s easy enough to style images so that they scale to fit within a parent container by adding img { max-width: 100%; } to one’s stylesheet. To use this effectively, though, the image must be large enough to scale up to whatever size we can reasonably expect on the largest possible display. This can mean a great deal of overhead. At best it’s just wasteful. At worst, the mobile browser stuffs its hands in its pockets and goes to sulk in the corner, leaving the page partially rendered. A handful of full-bleed images designed for a 13” display could bring a mobile device on an EDGE connection to its knees.

Unfortunately, we can’t test bandwidth in any reliable way—not yet, at least. Testing would likely mean introducing a significant download to measure against, which is a lot like setting something on fire to see exactly how flammable it is. What we can determine with some reliability is the size of a device’s screen—and while we can’t necessarily use screen size to make assumptions about bandwidth, it does directly correlate to what we’re trying to accomplish: while a browser’s window size may change, we’ll never need an image larger than the user’s screen.

While we were working on the new Boston Globe website, we devised a technique to reduce the size of requests for users who may have limited bandwidth. Before I describe it here, I should really warn you up front: it broke. But we planned for that.

How responsive images worked

Scott Jehl brilliantly masterminded Filament Group’s “responsive images” technique. We began with the core philosophy that the technique should err on the side of mobile. With a mobile-first approach, if any part of the process should break down, the user should still receive a representative image, even if it’s a bit smaller, and avoid an unnecessarily large request on a device that may have limited bandwidth. Progressive enhancement writ large.

There are three key components to our responsive images script: the markup, the JavaScript, and a server-side redirect rule.

We started with, perhaps unsurprisingly, an image tag whose src points at the mobile-sized image.

With that as our basis, we’re ensuring that we default to a mobile-sized image. We store the path to the larger image in a data attribute, data-fullsrc, for easy access via JS.

Now that we have both sources in our markup, we need a way to conditionally load the appropriate source. To do that we need to know the size of the user’s screen. Fortunately, there’s a relatively simple way to determine a device’s screen size through JavaScript by way of a property in the browser’s window object: window.screen.width, though even that isn’t entirely reliable.
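A minimal sketch of that check, assuming a browser-like object is passed in so the logic can be exercised outside a browser. The helper name getScreenWidth and the choice to report 0 for a missing or nonsensical value are my own assumptions for illustration; they are picked so that an unreliable reading errs on the side of the smaller image.

```javascript
// Hypothetical helper: read the device's screen width, erring on the side of
// "small" when the value is unavailable or clearly bogus.
function getScreenWidth(win) {
  const width = win && win.screen ? Number(win.screen.width) : NaN;
  // Treat missing, non-numeric, or non-positive values as unknown (0).
  return Number.isFinite(width) && width > 0 ? width : 0;
}

// In a browser this would be called as getScreenWidth(window).
```

Returning 0 for "unknown" keeps the mobile-first default intact: any later comparison against a breakpoint simply fails, and the small image stays in place.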

Here’s where we run up against a major challenge: we need to communicate this size to the server in time to defer the request for the image’s original src, if necessary. The server needs to know the client’s screen size before our images are displayed.

We eventually settled on setting the screen size in a cookie. A cookie set in the head of a document would be ready in time for the parsing of the document’s body, and included along with image requests.
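Roughly, the cookie step might look like the sketch below. The cookie name screen_width and the helper buildScreenCookie are hypothetical, not the names used in the production script; the only real requirement is that the script and the server agree on the name.

```javascript
// Hypothetical cookie name; any name shared by the script and the server works.
const SCREEN_COOKIE = "screen_width";

// Build the cookie string that carries the screen width to the server.
function buildScreenCookie(screenWidth) {
  return SCREEN_COOKIE + "=" + encodeURIComponent(String(screenWidth)) + "; path=/";
}

// In an inline script in the document's head, before any img tags are parsed:
//   document.cookie = buildScreenCookie(window.screen.width);
```

Setting path=/ means the cookie accompanies image requests anywhere on the same host, which is what lets the server see the width when the images are requested.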

I assume that you’re cringing at the idea of cookie-dependent functionality, and I understand—I do. Some of our first iterations involved

Our JavaScript’s second task was far simpler: if the screen was above the size we specified, we swapped the img tag’s original src for the path contained in the data-fullsrc attribute and displayed the larger image in place of the smaller one.
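The swap itself can be sketched in a few lines. This is an illustration rather than the script we shipped: the 480-pixel breakpoint, the function name swapToFullSize, and the file names in the comment are all assumptions.

```javascript
// Assumed markup for each responsive image (file names are hypothetical):
//   <img src="small.r.jpg" data-fullsrc="large.jpg" alt="">

// Hypothetical breakpoint: screens wider than this get the full-size image.
const SWAP_BREAKPOINT = 480;

// Swap src for data-fullsrc on each image when the screen is large enough.
// `images` is any iterable of elements exposing getAttribute/setAttribute.
// Returns the number of images that were swapped.
function swapToFullSize(images, screenWidth, breakpoint = SWAP_BREAKPOINT) {
  let swapped = 0;
  // Small or unknown screen: leave the mobile-sized src alone.
  if (!(screenWidth > breakpoint)) return swapped;
  for (const img of images) {
    const fullSrc = img.getAttribute("data-fullsrc");
    if (fullSrc) {
      img.setAttribute("src", fullSrc);
      swapped += 1;
    }
  }
  return swapped;
}

// In a browser:
//   swapToFullSize(document.querySelectorAll("img[data-fullsrc]"), window.screen.width);
```

Images without a data-fullsrc attribute are skipped, so ordinary images on the page are never touched.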

Since the screen size was now available to the server, we toyed with a server-side solution that would automatically resize the original image to suit the screen. We decided against this for several reasons:

  • We wanted to keep server-side dependency to an absolute minimum, and only implement something that could be easily recreated in various server environments.
  • Rather than simply resizing an existing image, we felt it was more important to have the flexibility to crop and zoom the larger image in a way that fully optimizes it for display on a smaller screen.
  • Any of the back-end solutions we experimented with involved scaling a large image down to suit the screen size, which creeped us right out. If the screen’s width was reported incorrectly or if the front-end scripting should break down, we would run the risk of subjecting users to a massive and unnecessary download.

The src-swapping part of the JavaScript handles the lion’s share of the work, but larger-screened devices still make redundant requests—first for the mobile image, then the full-size image. This results in a pretty jarring visual effect as well: the smaller image may be visible before the larger one snaps into place. Since the original src is fetched as the browser parses the img tag, we can’t really dodge that request from the client side. What we can do is mitigate those requests on the server side.

We wrote some simple Apache rewrite rules to intercept requests for an image and check for the cookie we set earlier. If the breakpoint conditions we specified were met, we redirected the request for the mobile-sized image to a 1×1 spacer GIF. This kept the size of the redundant request low, especially once cached by the browser, and prevented the mobile-sized image from displaying before we swapped it for the full-size image. Since we didn’t want to apply this logic to every image site-wide, we later introduced a second rule that allowed us to flag images as responsive: the logic above only kicks in if the image’s filename contains “.r.”
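The decision those rules make can be modelled as a plain function. To be clear, this is not the Apache configuration itself, only a sketch of the logic in JavaScript; the cookie name screen_width, the 480-pixel breakpoint, and the spacer path are all assumptions.

```javascript
const WIDTH_COOKIE = "screen_width";      // hypothetical; must match the client script
const LARGE_SCREEN_ABOVE = 480;           // hypothetical breakpoint in pixels
const SPACER_PATH = "/images/spacer.gif"; // hypothetical 1x1 spacer GIF

// Extract the screen width from a Cookie request header, or null if absent.
function readScreenWidth(cookieHeader) {
  const pattern = new RegExp("(?:^|;\\s*)" + WIDTH_COOKIE + "=(\\d+)");
  const match = pattern.exec(cookieHeader || "");
  return match ? parseInt(match[1], 10) : null;
}

// Decide which file to serve for a requested image path.
function resolveImageRequest(path, cookieHeader) {
  const filename = path.split("/").pop();
  const isResponsive = filename.includes(".r."); // image flagged as responsive
  const width = readScreenWidth(cookieHeader);
  if (isResponsive && width !== null && width > LARGE_SCREEN_ABOVE) {
    // Large screen: the client script will load the full-size image instead.
    return SPACER_PATH;
  }
  // No cookie, small screen, or unflagged image: serve what was requested.
  return path;
}
```

Every uncertain case falls through to the requested mobile-sized file, which is the mobile-first failure mode described above.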

Thanks to our mobile-first approach, we had ourselves a pretty scrappy little technique. If any part of the equation should fail, no users would be penalized for their context. A failure on the client side—if cookies or JavaScript were unavailable, for example—would result in a smaller, but perfectly representative image. A failure on the server side would mean a request for the smaller image prior to the full-size image, but in a context where we could at least assume greater bandwidth. No one would be left without images regardless of their device, browser, and features.

This is fortunate really, since within a month or so of launching BostonGlobe.com our responsive image approach broke.

Several newer browsers have implemented an “image prefetching” feature that allows images to be fetched before parsing the document’s body. While it’s hard to argue with a quicker overall loading scheme, it goes against the parsing behavior we’ve come to understand. For our purposes, this feature also invalidates all our methods for communicating the screen size to the server before the images are loaded and breaks our server-side redirect. You can see this on BostonGlobe.com right now: without that redirect you’ll briefly see the mobile-size image before the full-size image is loaded, but it may take sharp eyes and a few page refreshes. Fortunately, this additional overhead is only incurred on desktop browsers, where bandwidth is generally less of a concern.

Long after the Boston Globe site launched, we continued to iterate on our approach. Jason Grigsby has done an incredible job documenting the details of those trials and tribulations in a series of blog posts.

This brings us to the present day, with some of the brightest minds on the web looking for something—anything—that will get the job done. Some think that it isn’t a solvable problem right now, and are placing their bets on user agent detection as a temporary solution. While this is a perfectly viable answer in the short term, I maintain that it’s untenable going forward: with the ever-expanding range of mobile phones and tablets in circulation, we could never hope to maintain a reasonable list of browsers and devices for long.

I believe that the ultimate solution shouldn’t hinge on scripting or CSS—and certainly nothing like UA detection, cookies, custom scripting on the front end, or any server-side shenanigans. Our aim is to represent and serve content appropriately, and for that reason I believe that this should be solved in markup.

The img tag isn’t going to cut it for this, though. It’s effective at conveying the hilarious antics of house cats, but it isn’t well suited to complex logic. It does one thing, and it does it well: it takes a single image source, and it puts it on your screen. If we were to modify this behavior at the browser level, we would never be able to guarantee our changes wouldn’t introduce issues in older browsers. We also know from experience that img doesn’t leave us much (if any) room to polyfill this new behavior.

What we need is a new markup pattern—one that allows us to specify multiple source files, but still specify universally-recognized markup as “fallback content” for browsers that don’t recognize the new tag. This should sound familiar, as this pattern already exists: the video and audio tags.

We know that a video tag can contain references to multiple sources, and that we can specify fallback content within the tag that’s only visible to browsers that don’t support video natively—usually a Flash-based video. What you may not know is that there’s already a way to use media queries to determine which video source to use, though browser support is a little spotty.

From there, it doesn’t take much imagination to see how we could use a pattern like this.
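As a rough sketch of that pattern: a hypothetical picture element wraps several source elements, each with an optional media attribute, plus a plain img as fallback content. None of this is specified markup; the element name, attributes, and file names are illustrative assumptions. The small function below only models the idea that the first source whose media condition matches is the one requested, and it understands nothing beyond a min-width condition in pixels.

```javascript
// Hypothetical markup, kept in a string so the sketch is self-contained:
const pictureMarkup = `
<picture>
  <source src="high-res.jpg" media="min-width: 800px">
  <source src="mobile.jpg">
  <!-- Fallback content for browsers that don't recognize picture: -->
  <img src="mobile.jpg" alt="">
</picture>`;

// The same sources as data, in document order.
const sources = [
  { src: "high-res.jpg", media: "min-width: 800px" },
  { src: "mobile.jpg", media: null },
];

// True if a source's media condition holds for the given viewport width.
// Only "min-width: Npx" is understood here; a missing media always matches.
function mediaMatches(media, viewportWidth) {
  if (!media) return true;
  const match = /^\s*\(?\s*min-width:\s*(\d+)px\s*\)?\s*$/.exec(media);
  return match !== null && viewportWidth >= parseInt(match[1], 10);
}

// Return the src of the first matching source, or null if none match.
function pickSource(list, viewportWidth) {
  const chosen = list.find((s) => mediaMatches(s.media, viewportWidth));
  return chosen ? chosen.src : null;
}
```

With these sources, pickSource(sources, 1024) selects the high-res file and pickSource(sources, 320) falls through to the mobile one, so only a single image is ever requested.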



We could have a limitless number of options by using source media queries—higher resolution images for high-res displays over a certain size, for example. If we could reliably detect connection speed, one day we may be able to add media=“connection-speed: edge” or media=“min-speed: 200kbps” to our source elements. If these source elements are implemented per the HTML5 spec, a request will only be sent for the ones that match our media query. What we get is a single, highly-tailored request, with conditional flexibility limited only by a constantly growing roster of media queries.

Once we’ve established that markup as our foundation, we may be able to polyfill the expected behavior for browsers that don’t yet support it. While it’s likely that the polyfills would still involve more than one request, starting with a tried-and-true fallback pattern would allow us to apply polyfills at our discretion.

While we’re at it, I’d also like a pony

As things stand now, a number of developers—myself included—are talking with WHATWG and various browser teams about the details of this new element. A frustrated group of developers pitching a need for a new element is certainly nothing new; we’re not the first, and I’m certain we won’t be the last. In fact, we’re not even the first to reach the exact same conclusion on image delivery: after brainstorming, we learned that a solution much like our own was posted to the W3C’s public mailing list in July of 2007—similar right down to the semantics. This subject has come up multiple times on the WHATWG and W3C discussion lists and quietly died out each time, but never during such a radical shift in browsing context as we’ve experienced over the past year or so, and never in such an exciting context as responsive web design.

While we can’t guarantee that a picture element—or something similar, semantics aside—will ever see the light of day, we’ve recognized that there is a need for such a markup pattern at present, and tremendous potential to such an approach in the future. I’d love to hear your thoughts.
