Using Cache Control

There are three initial questions most users have when evaluating CacheRight and cache control policies:


What is a cache control policy?

Minimizing round trips over the Web to revalidate cached items can make a huge difference in browser page load times. Perhaps the most dramatic illustration of this occurs when a user returns to a site for the second time, after an initial browser session. In this case, all page objects will have to be revalidated, each costing valuable fractions of a second (not to mention consuming bandwidth and server cycles). On the other hand, proper cache control allows each of these previously viewed objects to be served directly out of the browser's cache without going back to the server. The effect of adding cache control rules to page objects is often visible at page load time, even with a high bandwidth connection, and users may note that your sites appear to paint faster and that "flashing" is reduced between subsequent page loads. Besides improving user perception, cache control relieves the Web server of responding to cache revalidation requests, leaving it better able to serve new traffic.

However, in order to enjoy the benefits of caching, a developer needs to take time to write out a set of carefully crafted cache control policies that categorize a site's objects according to their intended lifetimes. Here is an example of a complete set of cache control policies for a simple e-commerce Web site:

Object Type                                   Duration
Navigational, logo images                     One year
CSS, JavaScript                               Six months
Main header images                            Three months
Monthly special offer image                   One month
Personalized special offer image (private)    Two weeks from first access
All other content (HTML, ASP, etc.)           Do not cache

As you can see, from the point of view of cache control, this site has six different types of objects. Since logos and other corporate branding are unlikely to change, navigational and logo images are treated as virtually permanent. CSS and JavaScript files are given freshness lifetimes to support a regular, semi-annual update schedule. As fresh site content is important in terms of search engine optimization and user experience, the main header images are set up to be changed a bit more frequently. The monthly "special offer" image is, of course, designed to stay fresh for one month. There is also a personalized special offer image that remains fresh in a user's cache for two weeks after the initial visit; note that this category is marked "private" to indicate that it is not to be cached by a shared/proxy cache. Finally, the default policy for everything else on the site states that nothing else should be cached, thus guaranteeing that text and dynamic content are served fresh for each request.
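The policy table can also be written down as data. The following is a minimal sketch, not CacheRight's own mechanism: the category names, the 30-day approximation of a month, and the choice of "no-store" for the "Do not cache" default are all assumptions made for illustration.

```python
# Hypothetical encoding of the example policy table as Cache-Control values.
DAY = 24 * 60 * 60  # seconds in one day

# (lifetime in seconds, scope); a month is approximated as 30 days (assumption).
POLICIES = {
    "navigational_and_logo_images": (365 * DAY, "public"),
    "css_and_javascript": (180 * DAY, "public"),
    "main_header_images": (90 * DAY, "public"),
    "monthly_special_offer_image": (30 * DAY, "public"),
    "personalized_special_offer_image": (14 * DAY, "private"),
}


def cache_control_for(object_type: str) -> str:
    """Return a Cache-Control value for an object type."""
    policy = POLICIES.get(object_type)
    if policy is None:
        # Default policy: all other content (HTML, ASP, etc.) is not cached.
        return "no-store"
    max_age, scope = policy
    return f"{scope}, max-age={max_age}"
```

Keeping the default in the fallback branch mirrors the table: any object that does not fall into a named category is served fresh.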

You need to be very careful not to cache HTML pages, whether or not they are statically generated, unless you really know what you are doing, and if anything, you should use cache control to make sure these pages are not cached. If a user caches your HTML page, and you set a lengthy expiration time, they will not see any content changes you may make until the cached object expires. On the other hand, if you focus on caching dependent objects, such as images, Flash files, JavaScript, and style sheets, you can replace that cached content simply by renaming objects. For example, let's say you have a policy to change your site's logo files once a year, but in the middle of that year, your company makes a significant branding change that needs to be reflected on the site. Fortunately, if you have not set your HTML files to be cached, you can still serve the new logo by renaming the file from logo.gif to newlogo.gif and changing the associated HTML <img> references. As the HTML is parsed, the browser will note that it does not have the new image in its cache and will download it. Of course, the old image will still be in the user's cache for quite some time, but it will no longer be used.
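The renaming step can be scripted when many pages reference the same object. This is a rough sketch under the assumption that the references are plain src attributes in static HTML; the function name is hypothetical, and a real site might instead change the reference in a template.

```python
import re


def rename_image_references(html: str, old_name: str, new_name: str) -> str:
    """Point every src attribute whose file name is old_name at new_name."""
    # Match src="...<optional path>/old_name" with either quote style.
    pattern = re.compile(
        r"""(\bsrc=["'](?:[^"']*/)?)""" + re.escape(old_name) + r"""(["'])"""
    )
    # Keep the attribute prefix and closing quote; swap only the file name.
    return pattern.sub(lambda m: m.group(1) + new_name + m.group(2), html)
```

For example, applying it with logo.gif and newlogo.gif turns `<img src="/images/logo.gif">` into `<img src="/images/newlogo.gif">`, which the browser treats as an object it has never cached.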

A note on browser and proxy caches...

As CacheRight is designed to manage the HTTP header data that browser and proxy caches use to determine the freshness of a file, be aware of browser and proxy cache features when testing and using CacheRight in production. Clear your caches manually, especially during evaluation, to make sure you are getting fresh requests from the Web server with expiry-based cache control data in the HTTP headers once CacheRight is enabled and your rules have been created in a rules.cr file for your unique Web site or application.


How can I tell when caching is happening?

Examine the conversation between a browser and a server at the HTTP level to see if caching is occurring. Since browsers generally do not display protocol-level data, you will need a tool that lets you see what the browser usually keeps hidden. Fortunately, there are a number of options here:

  1. Use a protocol analyzer. Tools such as Trace Plus from WebDetective are like specialized network monitors that focus on specific applications and protocols. WebDetective lets you trace the HTTP requests from, and responses to, your browser, providing much the same view as a full-blown packet sniffer while running on the client side and automatically filtering out irrelevant network traffic.
  2. Use a browser extension such as LiveHTTPHeaders for Firefox or Chrome. Extensions like this provide functionality similar to that found in WebDetective, but they reside directly in the browser itself. They are probably the most convenient way to inspect response headers as you navigate a site. We also wrote a post on other HTTP inspection tools of note.
  3. Use an HTTP troubleshooting tool such as the WebFetch utility from Microsoft or the free ieHTTPHeaders tool to simulate a browser's interactions with the server.
  4. Use a network packet sniffer. Network packet sniffers allow one to inspect network traffic at the protocol and data levels. By focusing on the HTTP portion of this traffic, one can examine the relevant headers as they are exchanged between browser and server in the course of cache testing.
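If none of these tools is at hand, a few lines of script can show the same response headers. The sketch below uses only the Python standard library; the function names are hypothetical, and unlike a browser the script has no cache of its own, so it always shows what the server sends on a fresh request.

```python
from urllib.request import urlopen

# The cache-related response headers discussed in this document.
CACHE_HEADERS = ("Cache-Control", "Expires", "Last-Modified")


def cache_related_headers(headers) -> dict:
    """Pick the cache-related headers out of a mapping of response headers."""
    # Header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        name: lowered[name.lower()]
        for name in CACHE_HEADERS
        if name.lower() in lowered
    }


def fetch_cache_headers(url: str) -> dict:
    """Request url and return its cache-related response headers."""
    with urlopen(url) as response:
        return cache_related_headers(response.headers)
```

Calling `fetch_cache_headers` with the URL of an image on your own site returns a dictionary containing whichever of the three headers the server sent.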

Once you have decided on a header inspection tool, the next step is to use it to trace an HTTP request/response for a cacheable file and inspect the response to verify that the cache enabling headers are present. In this example we've chosen to use the HttpWatch tool to examine the response headers. We will make a request for a company logo image named logo.gif, both before and after a CacheRight rule is in effect.

Without CacheRight

An initial request for logo.gif displays the headers that are returned without CacheRight applying the appropriate cache control headers.

Notice that the only cache related header sent by default without CacheRight is the Last-Modified header. With only the Last-Modified value available, a browser or proxy cache is forced to make a comparison against the modification timestamp of the file on the server, necessitating a round trip to the server every time.
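The comparison behind that round trip is the standard HTTP conditional request: the cache sends the Last-Modified value it stored in an If-Modified-Since request header, and the server answers 304 (Not Modified) or 200 with a fresh copy. A simplified sketch of the server-side decision follows; the function name is hypothetical, and both dates are assumed to be in the usual HTTP date format.

```python
from email.utils import parsedate_to_datetime


def revalidation_status(if_modified_since: str, last_modified: str) -> int:
    """Return the status a server would send for a conditional request."""
    cached_copy_time = parsedate_to_datetime(if_modified_since)
    file_time = parsedate_to_datetime(last_modified)
    # 304: the cached copy is still current, so no body is sent.
    # 200: the file changed after it was cached, so it is sent again.
    return 304 if file_time <= cached_copy_time else 200
```

Even the 304 case costs a full request and response, which is the overhead that explicit expiry headers avoid.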

With CacheRight

Because logo.gif is a company logo that is used widely across our Web site and, for the sake of branding consistency, is not likely to change, we want this file to follow the caching policy laid out above, which makes this type of item cacheable for one year. The following ExpiresByPath rule will do the trick:

ExpiresByPath /images/logo.gif: 1 year after access public

A request for logo.gif with this rule in effect will issue a response with a much richer set of cache control headers.

Now each response to a request for logo.gif will include Cache-Control and Expires headers with values calculated from the rules in the CacheRight rules file. Armed with appropriate Cache-Control and Expires header values, a browser or proxy cache can avoid unnecessary round trips to the server as long as the file remains fresh, which in this case is for one year, or until Tue, 07 Jun 2005.
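The arithmetic behind a "1 year after access" rule can be sketched as follows. This is an illustration only, not CacheRight's implementation: it assumes a year of 365 days, an access on 07 Jun 2004 at an arbitrary time of day, and a particular ordering of the Cache-Control directives, none of which is specified above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

ONE_YEAR = timedelta(days=365)  # assumption: one year counted as 365 days


def expiry_headers(access_time: datetime,
                   lifetime: timedelta = ONE_YEAR,
                   scope: str = "public") -> dict:
    """Compute expiry headers relative to the time of access (UTC)."""
    expires = access_time + lifetime
    return {
        # max-age is the freshness lifetime in seconds.
        "Cache-Control": f"{scope}, max-age={int(lifetime.total_seconds())}",
        # Expires is the same deadline as an absolute HTTP date.
        "Expires": format_datetime(expires, usegmt=True),
    }


# Hypothetical access time, chosen to match the expiry date in the example.
headers = expiry_headers(datetime(2004, 6, 7, 12, 0, 0, tzinfo=timezone.utc))
# headers["Expires"] is "Tue, 07 Jun 2005 12:00:00 GMT"
```

The two headers express the same deadline in different forms: max-age as a relative number of seconds, and Expires as an absolute date.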


How do I use the CacheRight rules file?

CacheRight is installed and enabled at the IIS Web server as a filter, but developers write the rules that manage how cache control policies are applied. These rules should be customized to the specific Web application for which you want to control browser and proxy caching. Upon installation of CacheRight, a sample "rules.cr" file is placed in the root of each Web site or virtual server. You can begin drafting your own cache control rules by editing this rules.cr file. Please review the sample rules file below to get a better idea of how CacheRight works:

CacheRight Sample Rules File (rules.cr)

Welcome to CacheRight, the IIS Web server tool that makes it easy to implement effective cache control policies for your entire Web site.

This is an example of a rules.cr file, where you write the rules that CacheRight then uses to implement your cache control policies. The rules in this file control the HTTP headers (especially "Expires" and "Cache-Control") that tell browser and proxy caches whether and for how long a given file should be cached. This file must be located in the root directory of your Web site. Everything written in this rules file between curly braces is a comment and will be ignored by CacheRight.

A Note on Browser and Proxy Caches

CacheRight is designed to manage the HTTP header data that browser and proxy caches use to determine the freshness of a file. Be aware of browser and proxy cache features when testing and using CacheRight in production. Clear your caches manually, especially during evaluation, to make sure you are getting fresh requests from the Web server with expiry-based cache control data in the HTTP headers once CacheRight is enabled and your rules have been created in a rules.cr file for your unique Web site or application.

Using the Example Rules

The CacheRight rules given here are intended as examples. A good way to get started with CacheRight is to uncomment some example rules and examine the effect of the changes in the HTTP response headers (you may need to create some test files or change some paths in the rules to make this work). To uncomment an example rule, remove the curly braces that surround it.
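To check which rules remain active after uncommenting, it can help to view the file with its comments stripped. The sketch below is an approximation, not the CacheRight parser: it assumes that comments do not nest and that each rule sits on its own line, and the function name is hypothetical.

```python
import re

# A comment is everything between a pair of curly braces (no nesting assumed).
COMMENT = re.compile(r"\{[^{}]*\}")


def active_rules(rules_text: str) -> list:
    """Return the non-blank lines of a rules file that lie outside comments."""
    without_comments = COMMENT.sub("", rules_text)
    return [line.strip() for line in without_comments.splitlines() if line.strip()]
```

This only lists candidate rules; it does not check their syntax.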

Validating the Rules File

Whenever you make any changes to the rules file, you should validate it using the syntax checker provided with CacheRight (cr_syntax.exe is installed with CacheRight by default; a shortcut to cr_syntax.exe is located under Start, Programs, Port80, CacheRight, CacheRight Syntax Checker). If you are working directly on the IIS server, click the Validate button on the CacheRight user interface to launch the Syntax Checker. If you are editing the rules file on another computer, you should keep a copy of cr_syntax.exe on that computer.

Reloading the Rules File

For rule changes to take effect, you must tell CacheRight to reload the rules file. If you are editing rules.cr directly on the IIS server, do this by clicking the Apply button in the CacheRight user interface. If you are editing the rules file remotely and uploading it to your site's root directory, you can reload it by requesting any file in the site using the cr_reset query parameter:

http://hostname/?cr_reset
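When the rules file is uploaded by a deployment script, the reload request can be issued from the same script. A minimal sketch using the Python standard library is shown here; the function names are hypothetical, and what the server returns for this request is not specified above, so the sketch only reports the HTTP status.

```python
from urllib.request import urlopen


def reset_url(hostname: str) -> str:
    """Build the reload URL for a site from the pattern shown above."""
    return f"http://{hostname}/?cr_reset"


def reload_rules(hostname: str) -> int:
    """Request the reload URL and return the HTTP status code."""
    with urlopen(reset_url(hostname)) as response:
        return response.status
```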
