Mastering Browser Cache: A Comprehensive Guide

Understanding Cache Control Headers

Browser caching is one of the most effective ways to speed up a web application, but several headers influence it, and the way they interact is easy to get wrong. This guide walks through each of them in turn.

Long-Term Caching Strategy

For a single-page application, our goal is to cache JavaScript, CSS, fonts, and image files indefinitely while preventing caching of HTML files and service workers. This strategy is viable since our asset files have unique identifiers in their file names, ensuring that updates can be easily propagated.

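const path = require('path');
module.exports = {
  output: {
    // [chunkhash] changes whenever the chunk's content changes
    filename: '[name].[chunkhash].js',
    path: path.join(__dirname, 'dist')
  }
};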

We can achieve this with webpack by including [hash] or [chunkhash] in the asset file names, as in the output configuration above.

Cache-Control Header

The Cache-Control header is the primary way to control browser cache behavior. The no-store directive instructs the browser not to store the response at all, which makes it a good fit for HTML files and service worker scripts.

Cache-Control: no-store

The no-cache directive, by contrast, lets the browser store the response but requires it to revalidate the stored copy with the server, using the ETag or Last-Modified headers, before every reuse.

Cache-Control: no-cache

Pragma and Expires Headers

The Pragma header, although outdated, is still sent as a precaution for legacy HTTP/1.0 clients and caches that do not understand Cache-Control.

Pragma: no-cache

The Expires header has a slight edge over Pragma because proxies universally understand it. For HTML files, we either disable the Expires header or set it to a past date. For static assets, we manage it together with Cache-Control's max-age through the Nginx expires directive.

location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
  expires 1y;
  add_header Cache-Control "public";
}
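
To make the relationship between Expires and max-age concrete, here is a minimal JavaScript sketch of the two headers that a one-year lifetime translates to. The helper name longTermHeaders and the fixed clock passed to it are assumptions made for illustration; this is not Nginx code.

```javascript
// Hypothetical helper: derive Expires and Cache-Control from a single lifetime.
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60; // 31536000

function longTermHeaders(now = new Date(), maxAgeSeconds = ONE_YEAR_SECONDS) {
  // Expires is an absolute date; max-age is the same lifetime in seconds.
  const expires = new Date(now.getTime() + maxAgeSeconds * 1000);
  return {
    'Expires': expires.toUTCString(), // HTTP dates are expressed in GMT
    'Cache-Control': `public, max-age=${maxAgeSeconds}`
  };
}

console.log(longTermHeaders(new Date('2020-10-21T07:28:00Z')));
```

When both headers are present, caches that understand Cache-Control give max-age precedence over Expires, so the Expires value mainly serves older intermediaries.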

ETags and Last-Modified Headers

ETag and Last-Modified headers are essential for cache validation. An ETag identifies a specific version of a resource, while Last-Modified carries the date the resource last changed.

ETag: "123456789"
Last-Modified: Wed, 21 Oct 2020 07:28:00 GMT

When the browser sends the If-None-Match request header with the ETag of its cached copy, the server answers with either a 200 OK response carrying the new resource or an empty 304 Not Modified response.
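
The exchange can be sketched on the server side in a few lines of Node.js. This is a simplified illustration, not how any particular server implements it: the names makeETag and respond are hypothetical, the SHA-1 digest is just one possible way to derive an ETag, and the comparison ignores weak validators and comma-separated lists in If-None-Match.

```javascript
const crypto = require('crypto');

// Derive an ETag from the response body (assumed scheme: SHA-1 hex digest).
function makeETag(body) {
  return '"' + crypto.createHash('sha1').update(body).digest('hex') + '"';
}

// Decide between 200 and 304 for a request that may carry If-None-Match.
function respond(requestHeaders, body) {
  const etag = makeETag(body);
  if (requestHeaders['if-none-match'] === etag) {
    // The cached copy is still valid: send headers only, no body.
    return { status: 304, headers: { ETag: etag }, body: '' };
  }
  return { status: 200, headers: { ETag: etag }, body };
}

const first = respond({}, '<h1>Hello</h1>');
const second = respond({ 'if-none-match': first.headers.ETag }, '<h1>Hello</h1>');
console.log(first.status, second.status); // 200 304
```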

Debugging and Configuration

When testing cache configurations, debug as close to your server as possible. Every layer between the browser and your application, such as a Dockerized server, a VM, or a Kubernetes cluster, can add, rewrite, or strip headers.

One example of such unexpected behavior is Cloudflare removing the ETag header when Email Address Obfuscation or Automatic HTTPS Rewrites are enabled.
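
One way to spot this kind of interference is to capture the response headers at the origin and again at the public URL, then compare them. The sketch below assumes both header sets are already available as plain objects with lowercase names. The function name diffCacheHeaders, the list of headers, and the sample data are all illustrative.

```javascript
// Headers that influence caching; this list is an assumption, extend as needed.
const CACHE_HEADERS = ['cache-control', 'expires', 'pragma', 'etag', 'last-modified'];

// Return the cache headers whose values differ between origin and edge.
function diffCacheHeaders(originHeaders, edgeHeaders) {
  const changed = {};
  for (const name of CACHE_HEADERS) {
    const originValue = originHeaders[name];
    const edgeValue = edgeHeaders[name];
    if (originValue !== edgeValue) {
      changed[name] = { origin: originValue, edge: edgeValue };
    }
  }
  return changed;
}

// Made-up sample data: the intermediary dropped the ETag.
const originSample = { 'cache-control': 'no-cache', etag: '"123456789"' };
const edgeSample = { 'cache-control': 'no-cache' };
console.log(diffCacheHeaders(originSample, edgeSample));
```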

Nginx and Express Configurations

To put our knowledge into practice, we’ll focus on configuring Nginx and Express to serve a single-page application that supports long-term caching.

http {
 ...
  gzip on;
  gzip_types text/plain application/xml application/json;

  server {
    listen 80;
    server_name example.com;

    location / {
      expires -1;
      add_header Cache-Control "no-store, no-cache, must-revalidate, proxy-revalidate";
    }

    location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
      expires 1y;
      add_header Cache-Control "public";
    }
  }
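}

The Express equivalent relies on the built-in express.static middleware, which generates the ETag and Last-Modified headers:

const express = require('express');
const compression = require('compression');

const app = express();

// Compress responses before they are sent
app.use(compression());
// Serve files from ./public with validators enabled
app.use(express.static('public', {
  etag: true,
  lastModified: true
}));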

app.listen(3000, () => {
  console.log('Server started on port 3000');
});
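
The Express snippet enables validators but does not yet distinguish HTML from hashed assets. Express's built-in express.static middleware accepts a setHeaders callback, and the sketch below shows one way such a callback could apply the long-term caching strategy. The function name setCacheHeaders and the service worker file name sw.js are assumptions for illustration. The demo uses a stub response object so it runs without Express installed.

```javascript
const path = require('path');

const SERVICE_WORKER = 'sw.js'; // hypothetical file name, adjust to your build

// Callback with the (res, filePath) shape that express.static passes to setHeaders.
function setCacheHeaders(res, filePath) {
  const base = path.basename(filePath);
  if (path.extname(base) === '.html' || base === SERVICE_WORKER) {
    // HTML and the service worker must always be fetched fresh.
    res.setHeader('Cache-Control', 'no-store');
  } else {
    // Hashed assets can be cached for a year (same lifetime as "expires 1y").
    res.setHeader('Cache-Control', 'public, max-age=31536000');
  }
}

// Stub response object, used only to demonstrate the callback.
const demo = { setHeader: (name, value) => console.log(name + ': ' + value) };
setCacheHeaders(demo, 'public/index.html');     // Cache-Control: no-store
setCacheHeaders(demo, 'public/main.3f2a1c.js'); // Cache-Control: public, max-age=31536000

// Usage with Express (not executed here):
// app.use(express.static('public', { setHeaders: setCacheHeaders }));
```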

Real-World Examples

Let’s explore how popular services like Twitter, Instagram, and The New York Times utilize caching headers.

  • Twitter uses Express to serve HTML files.
  • Instagram supports long-term caching for CSS and JavaScript files.
  • The New York Times serves server-side rendered pages with real last modification dates.

By examining these examples, we can gain a deeper understanding of how to effectively utilize cache control headers in our own applications.
