Hello! I am Venkat Kudallur, development lead for Networking in Internet Explorer. We have made several improvements to Networking in Internet Explorer, and in this post I would like to introduce some of the improvements in content caching and decompression, two features that play a key role in speeding up the delivery of pages from a remote web server. If you’re a webmaster, a developer using the IE Networking API, or just curious about IE Networking, I think you’ll find these details interesting.
Content caching eliminates a round trip to the server (or reduces traffic with conditional GETs), and compression effectively increases throughput by shrinking the data sent over the wire. Compression (through standard algorithms such as gzip) plays a role in the speed-up services offered by ISPs such as MSN, AOL, and NetZero, which sell a premium service that ‘speeds up’ dial-up or broadband. Most of these services use dedicated servers and a combination of standard and proprietary compression algorithms, and/or tune TCP/IP parameters on the machine to speed up data transfer. Compression is likely to be a key part of the perceived speed-up because most web content compresses well: typically, HTML generated by ASP compresses about 2X (two-fold), JavaScript (JS) files 2-4X, and CSS style sheets 2-5X. Proprietary algorithms are typically used for other media content, which these IE changes don’t affect.
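To get a feel for why text content is such a good compression candidate, here is a minimal Python sketch that measures a gzip ratio. The sample markup and the function name are hypothetical, and this artificially repetitive sample will compress far better than the 2-5X typical of real pages.

```python
import gzip


def compression_ratio(payload: bytes) -> float:
    """Return the original size divided by the gzip-compressed size."""
    compressed = gzip.compress(payload)
    return len(payload) / len(compressed)


# Hypothetical sample: repetitive markup, similar in spirit to generated HTML.
sample_html = (
    '<tr><td class="item">row</td><td class="price">10</td></tr>\n' * 200
).encode("utf-8")

ratio = compression_ratio(sample_html)

# Decompression restores the exact original bytes.
restored = gzip.decompress(gzip.compress(sample_html))
```

Running the same function over your own HTML, JS, and CSS files gives a quick estimate of what enabling gzip on the server would save.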
A quick introduction to the key IE modules used in Networking is called for:
- WinInet.dll offers a Win32 API for HTTP, HTTPS, and FTP downloads, along with other APIs for caching and parsing. It’s a very popular binary and, in addition to being part of the IE platform, is widely used by Windows client applications for its networking services.
- UrlMon.dll is a utility layer that wraps and generalizes the WinInet API into a more generic and extensible pluggable protocol layer. It provides a COM interface to the HTTP Win32 API offered by WinInet, and has COM-based support for incorporating other protocol implementations into the IE stack. Many download managers available on the web use this mechanism to tap into IE’s download space and pick off certain types of content (such as binaries) to download within the manager.
The key takeaway is that the bulk of the HTTP implementation, including caching, lies within WinInet, while UrlMon provides a COM wrapper around it and allows extension and filtering.
Prior to IE7, decompression happened in the UrlMon layer as a pluggable component. IE’s gzip decompression was exposed through COM and generically plugged in by the UrlMon implementation to work on the compressed data stream exposed by the WinInet Win32 API. The model was appealing because any new decompression format could be plugged in as a COM implementation and registered with UrlMon for use on the compressed data stream. In practice, there were conditions under which this logical separation of decompression from the download complicated the model. For IE7, we have moved decompression into WinInet, where it sits logically just above the download implementation. This approach gives us several benefits:
- It eliminates a round of file system reads and writes.
- It avoids double parsing of caching directives.
- It centralizes and makes consistent caching decisions and timing considerations for compressed and decompressed content.
- It removes the need for COM-related synchronization in the default compression scenarios.
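The benefits listed above come from decoding once, at download time, and keeping the decoded copy in the cache. Here is a minimal Python sketch of that idea. It is an illustration only: the class and method names are hypothetical, and it does not reflect WinInet's actual code or API.

```python
import gzip


class DecodingCache:
    """Toy cache that stores decoded bodies keyed by URL (illustrative only)."""

    def __init__(self):
        self._entries = {}
        self.decode_count = 0  # how many times a body was decompressed

    def store_response(self, url, body, content_encoding=None):
        # Decode at download time so the cache holds one decoded copy.
        if content_encoding == "gzip":
            body = gzip.decompress(body)
            self.decode_count += 1
        self._entries[url] = body

    def read(self, url):
        # Every consumer reads the already-decoded bytes; nothing is decoded here.
        return self._entries[url]


cache = DecodingCache()
cache.store_response(
    "http://example.com/style.css",
    gzip.compress(b"body { margin: 0; }"),
    content_encoding="gzip",
)
reads = [cache.read("http://example.com/style.css") for _ in range(3)]
```

However many consumers read the entry, the body is decompressed only once, and the caching decision is made in the same place as the download.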
I expect these changes to fix a set of issues commonly seen in IE and IE-hosted applications when compression is used, particularly when there is a dependence on the cache file used to store the content on the browsing machine. Developers consuming the UrlMon and WinInet APIs need not be concerned about changes in API behavior resulting from this change in IE7: the UrlMon API continues to decompress compressed data transparently, and the WinInet API, by default, returns compressed data as in prior versions.
WinInet.dll is responsible for a cache that is loaded and synchronized across all the processes and services using it. In addition to serving as a cache for the various types of content downloaded by WinInet, it is also exercised through the WinInet caching API, which provides a URL-based index for storage and retrieval. Its popularity, however, has a downside: any instability (e.g. corruption of the index from a sudden reboot in the middle of a write-through operation) affects all the processes that use it. We have significantly rewritten the WinInet cache index manager in IE7 to ensure that it can gracefully recover from corruption or from a failure to grow the memory mapping of the index file. In addition, we have improved the caching heuristics, extensively scrubbed the API for parameter validation, and now handle Internationalized Resource Identifiers (IRI) more consistently in the API. I expect huge stability and functionality gains from the caching changes made in this release.
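As a rough illustration of a URL-indexed cache combined with the conditional GETs mentioned earlier, here is a minimal Python sketch. Everything in it is hypothetical (the in-memory "server", the function names, the example URL) and it does not describe how the WinInet index is implemented; it only shows how an ETag validator lets a cached entry be revalidated without re-sending the body.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    etag: str
    body: bytes


def origin_server(resources, url, if_none_match=None):
    """Hypothetical server: 304 with no body when the validator matches, else 200."""
    etag, body = resources[url]
    if if_none_match == etag:
        return 304, etag, b""
    return 200, etag, body


def fetch(cache, resources, url):
    """Return (body, bytes_transferred), revalidating a cached entry by its ETag."""
    entry = cache.get(url)
    status, etag, body = origin_server(
        resources, url, entry.etag if entry else None
    )
    if status == 304:
        # Not modified: serve the cached copy, no body crossed the wire.
        return entry.body, 0
    cache[url] = CacheEntry(etag, body)
    return body, len(body)


resources = {"http://example.com/site.css": ('"v1"', b"body { margin: 0; }")}
cache = {}
first = fetch(cache, resources, "http://example.com/site.css")
second = fetch(cache, resources, "http://example.com/site.css")
```

The first fetch transfers the full body; the second sends the stored validator and gets back a bodiless 304, so the cached bytes are reused.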
To read more on the impact of caching and compression on HTTP performance, check out this article by Eric Lawrence, IE Networking Program Manager. I welcome your feedback and suggestions for IE Networking features or for topics you would like us to blog about.
- Venkat
Comments
Anonymous
January 01, 2003
Thanks for the excellent posting! It's amazing all the under-the-hood stuff that is going on for IE7. Keep up the good work!
Anonymous
January 01, 2003
Hey Venkat, it's great to hear this terrific news. IE7 sounds more and more promising with every post to this blog!
Anonymous
January 01, 2003
Hey you guys! You release a cool new product and "I" a Microsoft Consultant in Los Angeles need to blog about it first?
http://spaces.msn.com/members/bhandler/Blog/cns!1pt1v0Q4vD8jSvNS4lqdAuug!507.entry
Silly Developers :-)
Anonymous
January 01, 2003
Blake: Thanks for the mention of the developer toolbar. We mentioned it was about to be released back in September (http://blogs.msdn.com/ie/archive/2005/09/13/465338.aspx) but we haven't published the full post about the toolbar (still in beta) just yet. Stay tuned. :-)
Anonymous
January 01, 2003
Oops. We actually did publish one post about it on 16 Sept: https://blogs.msdn.com/ie/archive/2005/09/16/469686.aspx
Anonymous
January 01, 2003
Does this mean that gzip encoding will actually work in IE7?
That would be cool (and it's about time).
Anonymous
January 01, 2003
> I expect that these changes fix a set of issues commonly seen in IE
Does this mean that the bug with compressed content not sending the If-None-Match header (see http://jpdeckers.blogspot.com/2005/05/ie-still-broken-with-gzipped-content.html) in the request has been fixed? That would be very nice!
Anonymous
January 01, 2003
<<seems to me it should be reading the compressed bytes as that would be much faster>>
Hello Sean,
Whenever the HTTP data is decompressed into the IE cache, all consumers of the data in IE read the decompressed data. There are two good reasons for this approach.
1. It is the fastest approach. Compressing the data improves the effective download rate, but decompressing it has a cost. Decompressing the data once amortizes that cost across all consumers of the data (versus decompressing on demand each time).
2. It leaves compression as a transport detail. Most consumers of the data (especially with the IMoniker model - http://msdn.microsoft.com/library/default.asp?url=/library/en-us/com/html/17f4c1df-7a9c-42ef-a888-70cd8d85f070.asp) neither know nor care where the data came from or how it was delivered - as far as they are concerned, it could just as easily be streamed from a file on disk or a stream in memory.
Anonymous
January 01, 2003
<<Does this mean that the bug with compressed content not sending the If-None-Match header (see http://jpdeckers.blogspot.com/2005/05/ie-still-broken-with-gzipped-content.html) in the request has been fixed ?>>
Yes, it is :).
Anonymous
January 01, 2003
The main issue I hope you've fixed is the cache indexes getting out of sync with the cache contents. This causes all manner of odd behaviour in IE - which can of course be cleaned up by emptying the cache, but that wastes the benefits of doing it.
I recall someone (was it Jeff Davis?) posting a suggested fix to one of the numerous problems of content that had just been downloaded disappearing from the cache, causing either a failed rendering (missing images, incorrect styles due to missing stylesheet) or the inability to save an image in its original format using Save Picture As from the context menu. The fix was apparently to reduce the size of your cache to 60MB or less, because there was a limited number of cache index entries and these were exhausted long before the actual cache capacity was reached. Has this been fixed?
I also hope that, should a power failure occur during a cache management operation, you will no longer invalidate and dump the entire cache! Basically I think I'm saying that the cache should be treated as valuable, since it represents saved bandwidth and time. Cookies are even more valuable (IMO, I know anti-spyware tools don't agree) as they represent saved state data.
Anonymous
January 01, 2003
I'm not sure if this is the correct thread to ask in, so apologies if it is not, but could I ask if the image caching problem with CSS pseudo-classes has been fixed?
For example, when the :hover pseudo-class is used to change a background image, IE will request the image every time the hover is triggered. An example bit of code can simply be:
a:hover {
    background: url(picture.jpg);
}
... and checking the access logs shows that the picture is requested and downloaded for every hover that takes place.
It can also cause flickering when this is used for web site navigation. For example, when the mouse moves over a link the old image will disappear instantly, while the hover image is still waiting to download (again and again).
If this has been fixed, then that's fantastic, but could I also put forward the suggestion of pre-downloading images used within a CSS file, so that the initial flickering is less likely to be seen.
Great post, and great blog by the way. I, along with so many others it seems, really do appreciate the openness and information being presented.
Anonymous
January 01, 2003
Do you have any plans to address the fact that IE has one idea of what "deflate" means while most other browsers have another? As far as I understand, only a couple of header bytes differ, but the net result is that no one can safely use "deflate".
Anonymous
January 01, 2003
To the networking team: any idea whether you will finally implement HTTP/1.1 pipelining properly?
Anonymous
January 01, 2003
ChrisH-- There's a very specific timing issue in play with IE's download scheduler which is likely causing problems for your scenario. Your best bet is to ensure that the image has proper caching headers (see http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwebgen/html/ie_introfiddler2.asp?frame=true).
If you'd like, you can trigger a prefetch of the CSS image by dynamically generating an IMG tag via script and setting its source to the image to be prefetched.
jbg-- I've not seen any indications that DEFLATE is unsafe; in fact, several popular HTTP compression hardware devices deliver deflated content that works in both IE and Firefox.
I believe IE and IIS both match the RFC specification for deflate. I'd love to learn more about any failing scenarios. Do you have a URL that reproduces the problem?
Ian-- We have not implemented pipelining (either properly or improperly) in the IE7 WinINET HTTP stack.
While pipelining can offer significant performance improvements when the end-to-end network path correctly implements pipelining, it fails when the server or intermediary proxies do not support pipelining. We expect to take another look in the IE8 timeframe.
Anonymous
January 01, 2003
With regard to pipelining, I’ve got it enabled in Firefox for a long time and the last time I saw a problem with regard to pipelining was also a long time ago. Which is not really surprising, because even the laziest web host owners are at some point forced to upgrade their software because of customer demand and to avoid falling victim to hackers.
I think the situation is similar to SSLv2, and browsers can by now safely start implementing (and using) pipelining. I’d say it’s quite an improvement to the ‘perceived speed up’.
~Grauw
Anonymous
January 01, 2003
So, um, the untitled.bmp bug is not yet fully resolved? I have seen it this month using an up-to-date IE6 on XP SP2 with certain images...
I dare to call it the worst bug in Internet Explorer.
I've linked to a few images that don't seem to work (one of them appears to work now) and listed some other bugs here:
http://www.livejournal.com/users/tmaster/32935.html
Look for item D.
And thanks for all the improvements, I'm waiting for IE7! Take your time, though ;-)
Anonymous
January 01, 2003
Good list of bugs there, hope someone on the team looks at it. On my system (with all available updates) the second and third images seem to save as jpegs but the others do still come up as bitmaps.
Anonymous
January 01, 2003
Thanks, frandom.
I think the second and third might have been fixed by the August updates. I think I'll be replacing those links when I find more images. If I inserted every image that appears to be broken, I could replace all the words in the entry with links.
Anonymous
January 01, 2003
Thanks for the reply re: pipelining, Eric Law.
Quote:"With regard to pipelining, I’ve got it enabled in Firefox for a long time and the last time I saw a problem with regard to pipelining was also a long time ago."
There are still problems lurking about; Opera has to implement heuristics to enable/disable it, and has felt that some recent load-balancing proxies don't handle pipelining properly. Still, adoption by IE would certainly add pressure for proxies/servers etc. to finally work properly with HTTP/1.1 (all they have to do is return responses in the same order as the requests).
Anonymous
January 01, 2003
Any word how this decompression will be exposed in WinInet? Any docs for this yet?
Anonymous
January 01, 2003
Documentation will be published on MSDN around December.
Essentially, you'll only need to call InternetSetOption to set an additional flag on your hInternet handle, and WinINET will automatically decompress the content and remove the Content-Encoding header.
Anonymous
January 01, 2003
Thanks Eric,
Although, could I ask anybody here how I might go about setting the HTTP headers for images on a web server running Apache that I can't control (my web site is on virtual hosting)?
Thanks.
Anonymous
January 01, 2003
Chris H: The ability to set up cache control headers on images depends on your access level. You don't need to be able to write to the server configuration, but the server needs to be configured to give you a certain level of control via .htaccess files. Some servers are, some are not.
If your server permits you to use the appropriate options in .htaccess files, then a Google search for:
htaccess "cache control"
will tell you what you need to know.
Some admin bribery may be required to get them to enable mod_expires and give appropriate control rights in .htaccess.
Anonymous
January 01, 2003
Craig Ringer, thank you for the heads up. It seems my host does support .htaccess and mod_expires, so I should be able to make use of the cache headers properly.
Thanks again.