Understanding Web Caching ASP.NET

Let’s get into the details of the various ways to make web caching work in your favor. Web caching needs careful thinking and planning. Sometimes it requires architectural changes to make various parts of the site cacheable. In this section, we will discuss in detail how web caching works and how it can be used for a faster download experience. If you plan your site well, caches can help your web site load faster and reduce the load on your server and Internet link. The difference can be dramatic: a site that is difficult to cache may take several seconds to load, while one that takes advantage of caching can seem instantaneous in comparison. Users will appreciate a fast-loading site and will visit more often.

Basics of Web Caching

Web caches preserve a local version of responses served from origin servers to the client browser. The web cache keeps track of responses served for specific URL requests, and if there are instructions to store the response in the cache, it remembers them for a certain period. Next time when the same URL is requested, the web cache intercepts the request and returns the stored response from its storage (cached response) to the client browser.

Web caching has three benefits:

Reduces latency between request and response
Because content is served from the local store, there’s no need to go to the origin server to fetch the response. The delay between the request and response is always lower than making a call to the origin server.

Saves network bandwidth
The response is served from the web cache, not from the origin server. If your proxy acts as the web cache, then there’s no data transfer between your proxy and origin server. If your browser acts as the web cache, then there’s no network activity at all.

Reduces server load
The origin server doesn’t have to execute the request and produce the response for the same request repeatedly.

Types of Web Caches

There are a number of web caches.

Browser caches
The browser cache is the fastest cache of all because the response is stored right on your computer. All modern browsers have finite storage dedicated for cache, usually about 100 MB. This means the browser can store 100 MB worth of data locally on a user’s computer and not request it again from the origin server. However, this 100 MB is shared among all the web sites the user visits. So, you should store only critical content that is accessed frequently and takes the most download time, e.g., the ASP.NET AJAX Framework JavaScript files.

Proxy caches
Proxies serve hundreds or thousands of users in the same way, and large corporations and ISPs often set them up behind their firewalls or as standalone devices (also known as intermediaries). Proxy caches are a type of shared cache. This means if you have five users coming from the same proxy, the content is delivered to the proxy from the origin server just once for the first user hitting the site. For the other users, the content is served directly from the proxy server although it may be their very first visit to your site.

Gateway caches
Gateway caches are deployed by webmasters between the user’s network and the origin server. They are not part of your production environment, nor are they part of the end user’s network. They work as intermediaries between the user (or proxy servers) and the origin server. Requests are routed to gateway caches by a number of methods, but typically some form of load balancer is used to make one or more of them look like the origin server to clients.

Content delivery networks (CDNs) have cache servers in many different locations. Whenever a client requests a resource that is cached by a CDN, the request goes to the nearest server, which saves network roundtrip time and delivers the resource faster. Moreover, CDNs have very high-speed networks optimized to deliver content as fast as possible. So, storing content on the CDN significantly decreases site load time.

Web Cache Problems

When you implement effective caching, users don’t hit your site like they used to. This means you are being hit less than you should be, so you get an inaccurate traffic report.

Another concern is that caches can serve out-of-date or stale content. Caching requires very careful planning or your users will see old content instead of what you want to serve them. You will learn how to control caching and make sure this does not happen in the upcoming section “Controlling Response Cache.”

How Web Caches Work

Web caches work based on the following conditions:

  • If the response’s headers tell the cache not to keep it, it won’t. The “no cache” mode is the default. But sometimes proxies and browsers cache content if there’s no explicit cache header present.
  • If the request is authenticated or secure, shared caches won’t store it. Proxies cannot read HTTPS content, so they never cache it.
  • A cached representation is considered fresh (that is, able to be sent to a client without checking with the origin server) if:
    • It has an expiry time or other age-controlling header set and is still within the fresh period.
    • A browser cache has recently cached the content and does not need to check it until the next launch.
    • A proxy cache has seen the content recently, but it was modified a relatively long time ago.
  • Fresh cached content is served directly from the web cache. There’s no communication between origin and client.
  • If cached content has become stale, the web cache will forward the request to the origin server transparently and serve fresh content from the origin server. The client browser will not notice what the web cache is doing. It will only experience a delay.
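The fresh-or-stale decision above can be sketched in a few lines of Python. This is only an illustration, not how any particular browser or proxy is implemented: `CachedResponse`, `lookup`, and the `fetch_from_origin` callback are hypothetical names, and freshness is reduced to a single lifetime in seconds.

```python
from dataclasses import dataclass


@dataclass
class CachedResponse:
    body: bytes
    stored_at: float  # time (in seconds) when the cache stored the response
    max_age: int      # freshness lifetime in seconds


def lookup(cache, url, now, fetch_from_origin):
    """Return (body, source), where source is "cache" or "origin"."""
    entry = cache.get(url)
    if entry is not None and now - entry.stored_at < entry.max_age:
        # Fresh: serve from the cache without contacting the origin.
        return entry.body, "cache"
    # Missing or stale: forward the request to the origin transparently.
    body, max_age = fetch_from_origin(url)
    if max_age is not None:  # None stands for "the origin said do not store"
        cache[url] = CachedResponse(body, now, max_age)
    return body, "origin"
```

The caller only ever sees a body. Whether it came from the cache or the origin shows up as a difference in delay, which matches the behavior described in the list.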

Controlling Response Cache

You can define caching at the server level (through IIS Administration) or at the page level using some special tags in the HTML files. With dynamic content, you have complete control over how and when to cache a particular response.

HTML metatags and HTTP headers

HTML authors can put metatags in a document’s <HEAD> section that describe its attributes. Besides describing what’s in the content, metatags can be used to cache pages or prevent pages from being cached. Metatags are easy to use, but aren’t very effective because they’re only honored by a few browser caches (which actually read the HTML), not proxy caches (which almost never read the HTML in the document). You can put a Pragma: no-cache metatag into a web page, but it won’t necessarily prevent it from being cached because an intermediate proxy might be caching the page.

Cache control in response header

HTTP headers give you a lot more control over how browser caches and proxies handle your representations compared to metatags. HTTP headers are not part of the response body, so they do not appear in the HTML, and they are usually generated automatically by the web server. However, you can control them to some degree, depending on the server you use.

HTTP headers are sent by the server before the HTML and are only seen by the browser and any intermediate caches. Typical HTTP 1.1 response headers might look like the following example.

Example of response header that says the response should be cached

The Cache-Control, Expires, Last-Modified, and ETag headers are responsible for controlling how to cache the entire response.
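As a rough illustration of these four headers working together, the Python sketch below assembles them for a response body. It is not the original example from this section. The function name `caching_headers`, the one-hour lifetime in the usage, and the choice of a SHA-256 content hash for the ETag are all assumptions made for the sketch.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def caching_headers(body, last_modified, now, max_age):
    """Build cache-related response headers; datetimes must be in UTC."""
    return {
        # Relative lifetime, in seconds, for HTTP 1.1 caches.
        "Cache-Control": f"public, max-age={max_age}",
        # Absolute expiry as an HTTP date (always GMT).
        "Expires": format_datetime(now + timedelta(seconds=max_age), usegmt=True),
        # When the content last changed on the server.
        "Last-Modified": format_datetime(last_modified, usegmt=True),
        # Opaque validator; a content hash is one possible choice.
        "ETag": '"' + hashlib.sha256(body).hexdigest() + '"',
    }


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    headers = caching_headers(b"<html></html>", now - timedelta(days=1), now, 3600)
    for name, value in headers.items():
        print(f"{name}: {value}")
```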

Pragma HTTP headers

Many people believe that assigning a Pragma: no-cache HTTP header to an HTTP response will make it uncacheable. This is not necessarily true, because the HTTP specification does not set guidelines for Pragma response headers, only for Pragma request headers (the headers that a browser sends to a server). Although a few caches may honor this header, the majority won’t, and it won’t have any effect.

Controlling caches with the Expires HTTP header

The Expires HTTP header is a basic way to control caches; it tells all caches how long the response can be stored in cache. After the expiry date, browsers will ignore what’s on the cache and make a call to the origin server to get the fresh content. Expires headers are supported by practically every cache.

Most web servers allow you to set the expiration in a number of ways. Commonly, they will allow setting an absolute time to expire, or a time relative to either the last time that the client saw the representation (last access time) or the last time a document changed on your server (last modification time).

Using the Expires Header for Static Content

The Expires header is especially good for making static images (like navigation bars and buttons) cacheable. Because they don’t change much, you can set an extremely long expiration time on them, making your site appear much more responsive to your users. It is also useful for controlling the caching of a page that is regularly changed. For instance, if you update a news page once a day at 6 a.m., you can set the representation to expire at that time so caches will know when to get a fresh copy, without users having to hit reload.

The only value valid in an Expires header is an HTTP date; anything else will most likely be interpreted as “in the past,” so that the response is uncacheable. Also, remember that the time in an HTTP date is Greenwich Mean Time (GMT), not local time. For example:

Expires: Fri, 30 Oct 1998 14:19:41 GMT
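How a cache might read such a value can be sketched in Python. The helper name `is_expired` is hypothetical, and treating an unparseable value as already expired is the behavior the text describes as most likely, not something every cache guarantees.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_expired(expires_value, now):
    """Return True if a response with this Expires value is stale at `now`."""
    try:
        expires_at = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):
        # Not a valid HTTP date (for example "0"): treat it as "in the past".
        return True
    if expires_at.tzinfo is None:
        # HTTP dates are GMT, so read a zone-less result as UTC.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now >= expires_at
```

Passing `now` as a timezone-aware UTC datetime keeps the comparison in GMT on both sides, which avoids the local-time mistake mentioned above.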

Although the Expires header is useful, it has some limitations. First, because there’s a date involved, the clocks on the web server and the cache must be synchronized. If they aren’t, the intended results won’t be achieved and the caches might wrongly consider stale content as fresh.

Another problem with absolute Expires is that it’s easy to forget that you’ve set some content to expire at a particular time.

Even if you later change the expiration date, some browsers will not request the content again until the previously set date, because they have already received a response carrying the previous expiration date.

Cache-control HTTP headers

HTTP 1.1 introduced a new class of headers, Cache-Control response headers, to give web publishers more control over their content and to address the limitations of Expires. Useful Cache-Control response headers include:

max-age=[seconds]
Specifies the maximum amount of time that a response will be considered fresh. Similar to Expires, but this directive is relative to the time of the request, rather than absolute. [seconds] is the number of seconds from the time of the request you wish the response to be cached for.

s-maxage=[seconds]
Similar to max-age, except that it applies only to shared (e.g., proxy) caches.

public
Indicates that the response may be cached by any cache (if max-age is not specified), even if it normally would be noncacheable or cacheable only within a nonshared cache.

private
Indicates that all or part of the response message is intended for a single user and must not be cached by a shared cache. This allows an origin server to state that the specified parts of the response are intended for only one user and are not a valid response for requests by other users. A private (nonshared) cache may cache the response unless max-age is defined.

no-cache
Forces caches to submit the request to the origin server for validation before releasing a cached copy, every time. This is useful to ensure that authentication is respected (in combination with public) or to maintain rigid freshness, without sacrificing all of the benefits of caching.

no-store
Instructs caches not to keep a copy of the representation under any conditions.

must-revalidate
If this header is not present, the browser sometimes returns cached responses that have already expired on some special occasions, e.g., when the browser’s back button is pressed. When the response has expired, this header instructs the browser to fetch fresh content no matter what.

proxy-revalidate
Similar to must-revalidate, except that it applies only to proxy caches.

For example:

Cache-Control: public, max-age=3600, must-revalidate, proxy-revalidate

This header tells the browser and proxy servers to cache the content for one hour. After one hour, both the browser and proxy must fetch fresh content from the origin no matter what.
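To make the structure of such a header concrete, here is a minimal Python sketch that splits a Cache-Control value into its directives. The function name `parse_cache_control` is hypothetical, and the parser is deliberately simplified: it splits on commas, so it does not handle quoted arguments that themselves contain commas.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument-or-True} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directives without "=" (public, no-store, ...) are simple flags.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


if __name__ == "__main__":
    header = "public, max-age=3600, must-revalidate, proxy-revalidate"
    print(parse_cache_control(header))
```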

ETag, last-modified headers

The Cache-control header allows you to set the duration of the cache. Once the browser or proxy caches the content for that duration, they won’t make a call to the origin regardless of whether the content has changed or not. So, if you have set a piece of JavaScript to be cached for seven days, no matter how many times you change that JavaScript, the browser and proxies that have already cached it for seven days will not ask for the latest JavaScript. This could be exactly what you want to do in some cases because you want content to be delivered from the cache instantly, but it isn’t always a desired result.

Say you are delivering an RSS feed from your server to the browser. You have set the cache control to cache the feed for one day. However, the feed has already changed and users cannot see it because they are getting cached content no matter how many times they visit the site. So, you want to verify whether there’s a new RSS feed available by hitting the server. If it is available, then fresh content should be downloaded. If not, then the response is served from the cache.

One way to do this is to use the Last-Modified header. When a cache has stored content that includes a Last-Modified header, it can send an If-Modified-Since request to ask the server whether the content has changed since the last time it was fetched. However, the Last-Modified header is applicable only to content that is date-dependent. You cannot use the Last-Modified header unless you have timestamped your content.
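The server side of this exchange can be sketched in Python as follows. The function name `status_for_conditional_get` is hypothetical, and falling back to a full 200 response when the request header is missing or malformed is an assumption made for the sketch.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def status_for_conditional_get(last_modified, if_modified_since):
    """Return 304 if the client's copy is still current, otherwise 200."""
    if if_modified_since is None:
        return 200  # unconditional request: send the full content
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unreadable date: ignore the condition
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates carry whole seconds, so drop sub-second precision first.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200
```

Here `last_modified` is the timestamp the server keeps for the content, as a UTC datetime.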

HTTP 1.1 introduced a new header called the ETag for better control over cache validations. ETags are unique identifiers that are generated by the server and changed every time the content is updated. Because the server controls how the ETag is generated, the server can check if the ETag matches when an If-None-Match request is made.

Browsers will send the ETag for a cached response to the server, and the server can check whether the content has changed or not. The server does this by using an algorithm to generate an ETag out of available content and seeing whether the ETag is the same as what the browser has sent. Some hashing of content can be used to generate the ETag. If the generated ETag does not match with the ETag that the browser sent, then the content has changed in between. The server can then decide to send the latest content to the browser.

In our previous example of caching an RSS feed, we can use the hash of the last item in the feed as an ETag. So, when the user requests the same feed again, the browser will make a call to the server and pass the last known ETag. On the server, you can download the RSS from the feed source (or generate it), check the hash of the last item, and compare it with the ETag. If they match, then there’s no change in the RSS feed, and you can return HTTP 304 to inform the browser to use cached content. Otherwise, you can return the freshly downloaded feed to the browser.
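The RSS scenario above might look like this in Python. `make_etag` and `serve_feed` are hypothetical names, the feed is reduced to a non-empty list of strings, and SHA-256 stands in for whatever hash you prefer.

```python
import hashlib


def make_etag(last_item):
    """Derive a quoted ETag from the last item in the feed."""
    return '"' + hashlib.sha256(last_item.encode("utf-8")).hexdigest() + '"'


def serve_feed(feed_items, if_none_match):
    """Return (status, headers, body) for a feed request."""
    etag = make_etag(feed_items[-1])
    if if_none_match == etag:
        # Nothing changed: tell the browser to reuse its cached copy.
        return 304, {"ETag": etag}, b""
    body = "\n".join(feed_items).encode("utf-8")
    return 200, {"ETag": etag}, body
```

A first request without an If-None-Match value gets a 200 with the ETag. Repeating the request with that ETag returns a 304 and an empty body until the last item changes.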

Principles for Making the Best Use of Cache

Now that you know how caching works and how to control it, here are some tips on how to make best use of cache.

Use URLs consistently
Browsers cache content based on the URL. When the URL changes, the browser fetches a new version from the origin server. The URL can be changed by changing the query string parameters. For example, if /default.aspx is cached on the browser and you request /default.aspx?123, it will fetch new content from the server. The response from the new URL can also be cached in the browser if you return the proper caching headers. In that case, changing the query parameter to something else like /default.aspx?456 will return new content from the server.

So, you need to make sure you use the URL consistently everywhere when you want to get the cached response. From the home page, if you have requested a file with the URL /welcome.gif, make sure you request the same file from another page using the same URL.
One common mistake is to sometimes omit the “www” subdomain from the URL. www.sometargetdomain.com/default.aspx is not the same as sometargetdomain.com/default.aspx. Both will be cached separately.
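One way to avoid the “www” mismatch is to pass every generated link through a small helper that rewrites the bare host to the canonical one. The Python sketch below assumes the www form is canonical. The constants and the `canonical_url` name are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.sometargetdomain.com"  # assumed canonical form
BARE_HOST = "sometargetdomain.com"


def canonical_url(url):
    """Rewrite links to the bare domain so they all use the www host."""
    parts = urlsplit(url)
    if parts.netloc.lower() == BARE_HOST:
        parts = parts._replace(netloc=CANONICAL_HOST)
    return urlunsplit(parts)
```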

Cache static content for a longer period
Static files can be cached for a longer period, like a month. If you are thinking that you could cache for a couple of days and then change the file so users will pick it up sooner, you’re mistaken. If you update a file that was cached by the Expires header, new users will immediately get the new file while old users will see the old content until it expires on their browser. So, as long as you are using the Expires header to cache static files, you should use as high a value as possible to cache the files for as long as possible.

For example, if you have set the Expires header to cache a file for three days, one user will get the file today and store it in cache for the next three days. Another user will get the file tomorrow and cache it for three days after tomorrow. If you change the file on the day after tomorrow, the first user will see it on the fourth day and the second user will see it on the fifth day. So, different users will see different versions of the file. As a result, it does not help to set a lower value and assume all users will pick up the latest file soon. You will have to change the file’s URL to ensure everyone gets the same exact file immediately.
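The day counting in this example can be checked with a tiny Python sketch. It numbers days from 1 (today) and assumes each user visits every day. The function name is hypothetical.

```python
def first_day_seeing_new_file(download_day, cache_days, change_day):
    """Day on which a user first receives the changed file (day 1 = today)."""
    if download_day >= change_day:
        # The user first fetched the file after it had already changed.
        return download_day
    # Otherwise the old copy is served until its cache lifetime runs out.
    return max(download_day + cache_days, change_day)


if __name__ == "__main__":
    # File cached for three days and changed on day 3 (the day after tomorrow).
    print(first_day_seeing_new_file(1, 3, 3))  # first user
    print(first_day_seeing_new_file(2, 3, 3))  # second user
```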

You can set up the Expires header for static files in IIS Manager. You’ll learn how to do this in the “How to Configure Static Content Caching in IIS” section later in this chapter.

Use a cache-friendly folder structure
Store cached content in a common folder. For example, store all images of your site in the /static folder instead of storing images separately under different subfolders. This will help you use consistent URLs throughout the site because you can use /static/images/somefile.gif from anywhere. It’s easier to move to a CDN when you have static cacheable files under a common root folder (see the “Different Types of CDNs” section later in this chapter).

Reuse common graphics files
Sometimes we put common graphics files under several virtual directories to write smaller paths. For example, say you have indicator. gif in the root folder, in some subfolders, and in a CSS folder. You did it because you don’t want to worry about paths from different places and can use the filename as a relative URL. This does not help with caching. Each copy of the file is cached in the browser separately. So, eliminate the duplicates, collect all of the graphics files in the whole solution, put them under the same root static folder, and use the same URL from all the pages and CSS files.

Change filename when you want to expire a cache
When you want to change a static file, don’t just update the file because it’s already cached in the user’s browser. You need to change the filename and update all references everywhere so that the browser downloads the new file. You can also store the filenames in a database or configuration files and use data binding to generate the URL dynamically. This way you can change the URL from one place and have the whole site receive the change immediately.

Use a version number when accessing static files
If you don’t want to clutter your static folder with multiple copies of the same file, use a query string to differentiate versions of the same file. For example, a GIF can be accessed with a dummy query string like /static/images/indicator.gif?v=1. When you change indicator.gif, you can overwrite the same file and update all references to the file to point to /static/images/indicator.gif?v=2. This way you can keep changing the same file and only update the references so they access the graphics using the new version number.
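Keeping the version numbers in one place makes this manageable. In the Python sketch below, the `STATIC_VERSIONS` table and the `versioned_url` helper are hypothetical. In a real site the table might live in a database or configuration file, as suggested earlier.

```python
# Hypothetical central table: bump a number here whenever a file changes.
STATIC_VERSIONS = {
    "/static/images/indicator.gif": 2,
}


def versioned_url(path):
    """Append the current version as a dummy query string, if one is known."""
    version = STATIC_VERSIONS.get(path)
    if version is None:
        return path
    return f"{path}?v={version}"
```

Every page and stylesheet generator then calls `versioned_url` instead of hardcoding the path, so changing one number updates every reference.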

Store cacheable files in a different domain
It’s always a good idea to put static content into a different domain. First of all, the browser can open two additional concurrent connections to download the static files. Another benefit is that you don’t need to send the cookies to the static files. When you put the static files on the same domain as your web application, the browser sends both the ASP.NET cookies and all other cookies that your web application is producing. This makes the request headers unnecessarily large and wastes bandwidth. You don’t need to send these cookies to access the static files. So, if you put the static files in a different domain, those cookies will not be sent. For example, you could put your static files in the www.staticcontent.com domain while your web site is running on www.dropthings.com. The other domain doesn’t need to be a completely different web site. It can just be an alias and share the same web application path.

SSL is not cached, so minimize SSL use
Content served over SSL is encrypted, so shared caches such as proxies cannot cache it. So, you need to put static content outside SSL. Moreover, you should try limiting SSL to only secure pages like the login or payment page. The rest of the site should be outside SSL over regular HTTP. Because SSL encrypts requests and responses, it puts an extra load on the server. Encrypted content is also larger than the original content and takes more bandwidth.

HTTP POST requests are never cached
Caching happens only for HTTP GET requests. HTTP POST requests are never cached. So, any kind of Ajax call you want to make cacheable needs to be HTTP GET-enabled.

Generate Content-Length response header
When you are dynamically serving content via web service calls or HTTP handlers, make sure you emit a Content-Length header. A browser has several optimizations for downloading content faster when it knows how many bytes to download from the response by looking at the Content-Length header. Browsers can use persistent connections more effectively when this header is present. This saves the browser from opening a new connection for each request. When there’s no Content-Length header, the browser doesn’t know how many bytes it’s going to receive from the server and keeps the connection open (as long as bytes are delivered from the server) until the connection closes. So, you miss the benefit of persistent connections, which can greatly reduce the download time of several small files.
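One detail that is easy to get wrong when emitting this header by hand is that the length is counted in bytes of the encoded body, not in characters. A minimal Python sketch, with a hypothetical helper name and UTF-8 assumed as the response encoding:

```python
def with_content_length(body, headers):
    """Encode the body and return (headers including Content-Length, payload)."""
    payload = body.encode("utf-8")
    out = dict(headers)  # copy so the caller's dict is not modified
    # Count bytes after encoding; non-ASCII characters take more than one byte.
    out["Content-Length"] = str(len(payload))
    return out, payload
```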

How to Configure Static Content Caching in IIS

In IIS Manager, the web site properties dialog box has an HTTP headers tab where you can define the Expires header for all requests that IIS handles (see Figure). You can set the content to expire immediately, after a certain number of days, or on a specific date. The Expire after option uses sliding expiration, not absolute expiration, which is useful because it works per request. When someone requests a static file, IIS will calculate the expiration date by adding the number of days or months set in Expire after to the time of the request.

The HTTP header is set for static content to expire in 30 days

For dynamic pages that are served by ASP.NET, a handler can modify the expiration header and override the IIS default setting.
