First, the source image is a photograph, saved as a PNG24 file! They then show how JPEG XR and WebP file sizes compare.
This is a worthless comparison. Spoiler alert: the lossless PNG24 isn't very good at storing photographic data, because it's not designed to.
Had imgix known how to do a proper image comparison study, they would have used the PNG24 to generate a normal, non-progressive JPEG at the "Save for web" quality setting of 70. That's your baseline. They should then generate the JPEG XR, WebP, and Progressive JPEG off the source PNG24, and compare those sizes to the size of the baseline, regular quality 70 JPEG.
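A minimal sketch of that methodology, assuming a Pillow-based pipeline (the file names and the quality setting are placeholders, and JPEG XR is skipped because Pillow has no encoder for it):

    # Generate a quality-70 baseline JPEG plus candidate formats from the same
    # PNG24 master, then report each candidate's size relative to the baseline.
    import os
    from PIL import Image

    source = Image.open("source.png").convert("RGB")  # the lossless PNG24 master

    # Baseline: a normal, non-progressive JPEG at quality 70
    source.save("baseline.jpg", "JPEG", quality=70, progressive=False)

    # Candidates generated from the same PNG24 source
    source.save("progressive.jpg", "JPEG", quality=70, progressive=True)
    source.save("candidate.webp", "WEBP", quality=70)

    baseline = os.path.getsize("baseline.jpg")
    for name in ("progressive.jpg", "candidate.webp"):
        size = os.path.getsize(name)
        print(f"{name}: {size} bytes ({size / baseline:.0%} of baseline)")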
I have seen great performance benefits from using WebP where it makes sense, and I discuss them in this video about Warby Parker [0]. But the imgix guys are going about explaining the benefits the wrong way.
Second, the use of Content Negotiation is a terrible idea as well. You don't want to serve different file types from the same URL, because then the web server uses the Accept header, and potentially the User-Agent header, to determine the response. This means it must send a Vary: Accept or Vary: Accept, User-Agent header with the response, which renders the response essentially uncacheable for shared caches. I discuss this problem here [1], but in the context of the User-Agent header.
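To make the mechanism concrete, here is a minimal sketch of the negotiation logic being criticized (the function and file names are illustrative, not anyone's actual implementation):

    # The representation depends on the Accept header, so the response has to
    # carry Vary: Accept for shared caches to stay correct.
    def negotiate_image(accept_header: str):
        if "image/webp" in accept_header:
            body, ctype = open("photo.webp", "rb").read(), "image/webp"
        else:
            body, ctype = open("photo.jpg", "rb").read(), "image/jpeg"
        headers = {
            "Content-Type": ctype,
            "Vary": "Accept",  # the response differs per Accept value
        }
        return headers, body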
It's clear that imgix is trying to help people, which is awesome. But it's also clear from their advice and analysis that they really don't understand what they are talking about, or can't express themselves properly. Either way, this is bad performance information, and we really don't need any more of that.
Vary: Accept does not make resources uncacheable. If you are experiencing this problem, either your caching headers are misconfigured or your client is behaving incorrectly.
You're forgetting that Accept headers vary among clients; each value that isn't bit-for-bit identical will be cached separately. Vary: User-Agent takes that problem and raises it exponentially. You can play games trying to normalize things, but that makes life hard when using a CDN and increases the risk of buggy proxies creating very hard to diagnose problems.
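A small sketch of that fragmentation, assuming a cache that keys on the URL plus the raw value of each header named in Vary (the Accept strings below are just representative examples):

    # Every distinct Accept string produces a separate cache entry for the same URL.
    accepts = [
        "image/webp,image/apng,image/*,*/*;q=0.8",          # a Chrome-style value
        "image/png,image/svg+xml,image/*;q=0.8,*/*;q=0.5",  # a Firefox-style value
        "*/*",                                               # many bots and apps
    ]
    cache_keys = {("/photo", accept) for accept in accepts}
    print(len(cache_keys), "cache entries for one URL")  # one per distinct value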
The alternative of creating unique URLs is incredibly simple and has worked perfectly since the start of the web. Content negotiation is an interesting idea but it's just not worth the support cost.
We deliver images out of a CDN where we already have handled the proper request normalization. There is no support cost to implementing content negotiation in this case unless you want to put us behind a proxy. At that point, we can work with you to vary correctly without incurring the complexity you are focusing on.
Normalizing the values used in Vary at the CDN level is definitely the right way to go. However, that still leaves problems with transparent proxies at large companies, ISPs, mobile carriers, etc. unless you also have something like Cache-Control: private which is correctly handled.
Founder of imgix here. The comparison image is just meant to be moderately informative. We will be following up with a detailed performance outline as more data comes in. You can compare the progressive JPEG image against the WebP or JPEG XR to get a sample of what the ratios might be for a standard JPEG. Besides, a surprising number of websites still serve wildly uncompressed and unoptimized imagery. We help websites that are serving uncompressed JPEG and PNGs all of the time.
Furthermore, what should be understood (and I will clarify in the post) is that apart from the example image, which is designed to show the comparative compression ratios of the file formats, the data we report is based on the images that these companies are ALREADY serving. We analyzed our logs for the sizes of the images they are currently serving as JPEGs and PNGs, and simply enabled content negotiation for those same-sized images.
With regards to the Vary header: you are right that the cache fragmentation of varying by Accept or User-Agent would be extreme. This is why we do not do this. Instead, we perform normalization on the request and generate a specific "content ID" that takes into account any number of input signals that should be normalized and varied on. We can expose this as a separate header to Vary on if folks want it. Finally, if you are serving directly out of us, none of this really matters. There is no web server involved. We handle all of the hard work on our end.
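As a hypothetical sketch of that normalization idea (purely illustrative, not imgix's actual implementation), the raw request signals collapse into a small set of buckets before they reach the cache key:

    # Collapse Accept and device pixel ratio into a handful of cacheable variants.
    def content_id(accept: str, dpr: float) -> str:
        fmt = "webp" if "image/webp" in accept else "jpeg"  # format-support bucket
        density = "2x" if dpr >= 1.5 else "1x"              # pixel-density bucket
        return f"{fmt}-{density}"

    print(content_id("image/webp,*/*;q=0.8", 2.0))  # -> "webp-2x"
    print(content_id("*/*", 1.0))                    # -> "jpeg-1x"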
I disagree that content negotiation is the wrong approach though. The fragmentation we see across all of the input signals (e.g. format support, device pixel ratio, user agent, etc.) is already so extreme that the best answer is to serve dynamic responses for images. If you want to stay within the browser prefetch stage, which is critical for front-end performance, you have to make decisions about the content you want to serve at the server. This means potentially serving variant content under a single URL. We serve targeted responses for text content all the time across the web. It is not clear from your argument why this same treatment should not apply to imagery.
> On average, our customers have seen images that are
> half of their JPEG equivalent file size using these
> formats! The same quality image, half of the size.
Other than this textual statement, I don't see any comparisons. I only see absolute file sizes.
We used the existing image requests as a baseline. So the numbers we are reporting are specifically for the images that are being served in established formats (e.g. PNG and JPEG) measured against their equivalent requested WebP and JPEG XR variants. We then measured the percentiles of savings for these images. 98% of images in our test customer base saw an 18-74% improvement in compression per image, with an average of 41% savings per image. The data itself is a deduplicated set of hundreds of millions of image requests across several test customers. The one piece of data we should also report is the aggregate total savings per customer across all of their image requests. I will look into calculating that data point.
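For clarity, a minimal sketch of the per-image savings calculation (the byte counts below are made-up placeholders, not real measurements):

    # savings = 1 - (negotiated size / original JPEG/PNG size), then summarize.
    from statistics import mean

    original_bytes   = [120_000, 340_000, 80_000, 510_000]  # existing JPEG/PNG responses
    negotiated_bytes = [60_000, 210_000, 55_000, 260_000]   # WebP / JPEG XR equivalents

    savings = [1 - n / o for o, n in zip(original_bytes, negotiated_bytes)]
    print(f"mean savings: {mean(savings):.0%}")
    print(f"range: {min(savings):.0%} to {max(savings):.0%}")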
[0] - http://zoompf.com/blog/2013/07/how-fast-is-warby-parker
[1] - http://zoompf.com/blog/2012/02/lose-the-wait-http-compressio...