
Cache busting a React app for production

Written by Codemzy on November 13th, 2023

If browsers save old code bundles for your React app, it can create errors and problems loading your app for your users. Here are some cache-busting tips for React in production.

When I'm developing in React, making changes and seeing them in the browser is as simple as refreshing the page.

And if you use some kind of hot reload, you don't even need to refresh!

And when you're developing, you know when you make changes, so hitting that refresh button is no big deal.

But here's something that took me a little while to work out:

In production, your users are not going to refresh the page. They don't know when you have done an update. And some of them don't even know what refreshing a page is. Not everyone is tech-literate.

And even though I am a little more tech-literate than some, I still don't think it's a good user experience to run into an error and then refresh the page to get things working again.

So here's how I try to ensure my users always get the latest version of my code, before they run into errors (and to help them if they do!):

  • hash my javascript bundles
  • inject the most recent bundles into my HTML
  • refresh on lazy load if the app tries to load an old bundle
  • refresh on location change if the version has changed
  • add a refresh button to the error boundary

Why do we need cache busting for React?

React is often used as a single-page application. More recently, frameworks like Next.js and Remix have brought React to the server, but I'm mostly using React on the client.

I started building in React back in 2015 before frameworks like Next.js or Remix existed. And most of my setups are still React + Webpack. But some of these cache-busting techniques may have already been solved out of the box if you are using a framework.

If you're bundling your React application with Webpack, you will probably end up with a bunch of JavaScript bundles in production. Let's imagine it's something like this:

📁 dist/
├── 📁 app/
 ├── 📄 index.html
 ├── 📄 app.js
 ├── 📄 checkout.bundle.js
 └── 📄 user.bundle.js

You deploy your app, and your users are delighted. It works, and it solves some important problems for them. They use it regularly, so they leave the tab open.

It's Monday.

You think about some cool new features you can add and start working hard to improve your app.

By Wednesday, you are ready to release the latest version. And you launch it. And then users start getting in touch that they are getting errors and can't use the app! Argh! What's happened?!

Well, on Monday, the user logged in, maybe signed up on your free plan, and loaded the app.js file. But since you care about splitting your code into chunks to keep bundle sizes down, checkout.bundle.js and user.bundle.js are separate files that only load when they're needed.

On Wednesday, after you released a new version, the user navigates to the checkout to upgrade to a paid plan. The browser fetches checkout.bundle.js for the first time, so it gets the latest version, but the app.js already running in the tab is the old one. The old app.js isn't sending the right props (or whatever) that the new checkout.bundle.js expects.

Hash those javascript bundles

Even if your user refreshes the page, they might not get the latest version of app.js - because it's cached by the browser!

See, if you name your webpack chunks (javascript bundles) the same for each version, the browser doesn't know the code has been updated. It sees app.js and thinks "I've already downloaded that, here it is!".

We're going to use [contenthash] to add a unique hash based on the content. That means if the content of the code in that bundle changes so does the hash!
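To make the idea concrete, here's a rough sketch of what "a hash based on the content" means. This is not webpack's actual implementation (webpack picks its own hash function and hashes the compiled chunk), and the contentHash and hashedFilename helpers are made up for illustration. It only shows the principle: the same content gives the same filename, and changed content gives a new one.

```javascript
const crypto = require('crypto');

// Illustrative helper (NOT webpack's implementation):
// derive a short hex hash from the contents of a file.
function contentHash(source) {
  return crypto.createHash('sha256').update(source).digest('hex').slice(0, 20);
}

// Hypothetical helper that builds a filename in the '[name].[contenthash].js' shape.
function hashedFilename(name, source) {
  return `${name}.${contentHash(source)}.js`;
}

const before = hashedFilename('checkout', 'export const price = 10;');
const unchanged = hashedFilename('checkout', 'export const price = 10;');
const after = hashedFilename('checkout', 'export const price = 12;');

console.log(before === unchanged); // true: same content, same filename
console.log(before === after); // false: changed content, new filename
```

Because the filename only changes when the content does, the browser can keep its cached copy of any bundle you didn't touch.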

Here's the example from the how to name a webpack chunk post, with [contenthash] added:

const path = require('path');

module.exports = {
  entry: {
    app: './src/app.js',
  },
  target: ['web', 'es5'],
  output: {
    filename: '[name].[contenthash].js', // 🆕
    chunkFilename: '[name].[contenthash].bundle.js', // 🆕
    path: path.resolve(__dirname, 'dist/app'),
    clean: true, // 🆕
  },
  resolve: {
    extensions: ['.js', '.jsx']
  },
  module: {
    rules: [
      {
        test: /\.(js|jsx)$/,
        exclude: /node_modules/,
        use: 'babel-loader'
      }
    ]
  },
  optimization: {
    splitChunks: {
      name: (module, chunks, cacheGroupKey) => {
        const allChunksNames = chunks.map((chunk) => chunk.name).join('-');
        return allChunksNames;
      },
      cacheGroups: {
        reactVendor: {
          test: /[\\/]node_modules[\\/](react|react-dom|react-router-dom)[\\/]/,
          name: 'vendor-react',
          chunks: 'all',
        },
      },
    },
  },
};

Now we have a content hash added to our files, so they will be something like this:

📁 dist/
├── 📁 app/
 ├── 📄 index.html
 ├── 📄 app.575d644de9307ca8621d.js
 ├── 📄 checkout.9aa7812dc584e7b6ff0c.bundle.js
 ├── 📄 user.ef6cca37c0fc362ff91a.bundle.js
 └── 📄 vendor-react.7c8b137d61bff5cad6e7.bundle.js

Now if we update checkout.js the hash will update. But, we don't want it to update for all our bundles! Bundles like vendor-react contain the React code which shouldn't change as often as our code. And we don't want the users to have to redownload React each time we fix a bug or typo!

I've also added clean: true to the output to remove any old bundles from the directory before creating new ones. This is so that we don't end up with hundreds of old code bundles in our output directory!

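There's one catch: adding or removing a module can shift webpack's internal module IDs, which changes the hash of bundles whose code didn't change. To keep hashes like vendor-react stable between builds, we can add moduleIds: 'deterministic' as an optimization option, and I'm also going to add runtimeChunk: 'single' to split the webpack runtime code into its own file.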

const path = require('path');

module.exports = {
  entry: {
    app: './src/app.js',
  },
  target: ['web', 'es5'],
  output: {
    filename: '[name].[contenthash].js',
    chunkFilename: '[name].[contenthash].bundle.js',
    path: path.resolve(__dirname, 'dist/app'),
    clean: true,
  },
  resolve: {
    extensions: ['.js', '.jsx']
  },
  module: {
    rules: [
      {
        test: /\.(js|jsx)$/,
        exclude: /node_modules/,
        use: 'babel-loader'
      }
    ]
  },
  optimization: {
    moduleIds: 'deterministic', // 🆕
    runtimeChunk: 'single', // 🆕
    splitChunks: {
      name: (module, chunks, cacheGroupKey) => {
        const allChunksNames = chunks.map((chunk) => chunk.name).join('-');
        return allChunksNames;
      },
      cacheGroups: {
        reactVendor: {
          test: /[\\/]node_modules[\\/](react|react-dom|react-router-dom)[\\/]/,
          name: 'vendor-react',
          chunks: 'all',
        },
      },
    },
  },
};

If you're stuck on anything we have done so far, please see the Caching guide for Webpack, which contains a more detailed explanation of the above.

Ok, we have our hashed bundles, but we are not done yet!

Inject the bundles into your HTML file

The browser needs the hashes of the latest bundles to load. For lazy-loaded bundles, this is taken care of by the webpack runtime, which knows the hashed filename of each chunk when React.lazy() triggers the import(). But for your entry bundles, your HTML needs to tell the browser what the new filenames are.

You don't want to manually go and get the hashes, and put them in your HTML file like:

<script src="/runtime.a630a7d0d87315bebaa4.js"></script>
<script src="/app.cda214f54ece02597bfc.js"></script>

Luckily, webpack has the HtmlWebpackPlugin plugin, which will inject these filenames for you.

We can pass it our HTML template and tell it where to output the file with the bundles included.

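const path = require('path');
const HtmlWebpackPlugin = require('html-webpack-plugin'); // npm install --save-dev html-webpack-plugin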

module.exports = {
  entry: {
    app: './src/app.js',
  },
  target: ['web', 'es5'],
  output: {
    filename: '[name].[contenthash].js',
    chunkFilename: '[name].[contenthash].bundle.js',
    path: path.resolve(__dirname, 'dist/app'),
    clean: true,
  },
  resolve: {
    extensions: ['.js', '.jsx']
  },
  plugins: [ // 🆕
    new HtmlWebpackPlugin({
      filename: 'index.html',
      template: 'src/assets/app.html'
    }),
  ],
  module: {
    rules: [
      {
        test: /\.(js|jsx)$/,
        exclude: /node_modules/,
        use: 'babel-loader'
      }
    ]
  },
  optimization: {
    moduleIds: 'deterministic',
    runtimeChunk: 'single',
    splitChunks: {
      name: (module, chunks, cacheGroupKey) => {
        const allChunksNames = chunks.map((chunk) => chunk.name).join('-');
        return allChunksNames;
      },
      cacheGroups: {
        reactVendor: {
          test: /[\\/]node_modules[\\/](react|react-dom|react-router-dom)[\\/]/,
          name: 'vendor-react',
          chunks: 'all',
        },
      },
    },
  },
};

Now when the browser refreshes, it will see that any updated bundles have a new filename (because of the new hash) and load the updated version! Yay!

Refresh if the app tries to load an old bundle

This is a SPA, and if the files are already loaded, the browser won't automatically refresh. And since our code-split chunks are loaded when they are needed, our app might try to lazy load a hashed bundle that no longer exists!

For example, if we have the old app.[oldhash].js it will still try to lazy load checkout.[oldhash].js. But checkout.[oldhash].js won't exist anymore, it's been replaced by checkout.[newhash].js.

Instead of getting the old code, which was our problem at the beginning of this post, we will get no code at all!

Haven't we just made the problem... worse?!

Well you know what they say, no code is better than old code - they don't say that, I just made it up! - but at least the app can now know there's a problem!

A ChunkLoadError problem!

If you use React.lazy() like this:

const UserSettings = React.lazy(() => import(/* webpackChunkName: "userSettings" */ './settings'));

But if that code was created in an old build, it will look for the old bundle. And if that old bundle is gone, you'll get a ChunkLoadError, which literally means there was an error loading the chunk.

I wrap my React.lazy() bundles in a custom lazyRetry() function.

const UserSettings = React.lazy(() => lazyRetry(() => import(/* webpackChunkName: "userSettings" */ './settings')));

And the lazyRetry() function will try a refresh if there's an error loading a chunk:

// a function to retry loading a chunk to avoid chunk load error for out of date code
const lazyRetry = function(componentImport) {
    return new Promise((resolve, reject) => {
        // check if the window has already been refreshed
        const hasRefreshed = JSON.parse(
            window.sessionStorage.getItem('retry-lazy-refreshed') || 'false'
        );
        // try to import the component
        componentImport().then((component) => {
            window.sessionStorage.setItem('retry-lazy-refreshed', 'false'); // success so reset the refresh
            resolve(component);
        }).catch((error) => {
            if (!hasRefreshed) { // not been refreshed yet
                window.sessionStorage.setItem('retry-lazy-refreshed', 'true'); // we are now going to refresh
                return window.location.reload(); // refresh the page
            }
            reject(error); // Default error behaviour as already tried refresh
        });
    });
};

I've written a guide on fixing ChunkLoadErrors with the lazyRetry() function if you want to understand it better.
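One possible refinement, sketched here as an assumption rather than as the function above: keep a separate flag per chunk, so a refresh triggered by one chunk doesn't use up the retry for another. The name parameter and the deps object (which lets you swap in a fake storage and reload for testing) are hypothetical additions for this sketch.

```javascript
// Variant of lazyRetry with one sessionStorage flag per chunk.
// `name` and `deps` are hypothetical parameters added for this sketch.
function lazyRetryNamed(componentImport, name, deps = {}) {
  const storage = deps.storage || window.sessionStorage;
  const reload = deps.reload || (() => window.location.reload());
  const key = `retry-${name}-refreshed`;

  // check if the window has already been refreshed for this chunk
  const hasRefreshed = storage.getItem(key) === 'true';

  return componentImport()
    .then((component) => {
      storage.setItem(key, 'false'); // success, so reset the flag
      return component;
    })
    .catch((error) => {
      if (!hasRefreshed) {
        storage.setItem(key, 'true'); // remember we are about to refresh
        reload();
        // The page is reloading, so leave this promise pending
        // instead of letting React render an error.
        return new Promise(() => {});
      }
      throw error; // already tried a refresh, fall through to the error boundary
    });
}

// Usage, mirroring the React.lazy() call above:
// const UserSettings = React.lazy(() =>
//   lazyRetryNamed(() => import(/* webpackChunkName: "userSettings" */ './settings'), 'userSettings')
// );
```

The behaviour is the same as before: one automatic refresh, then the normal error if the chunk still can't be loaded.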

Refresh if the app version updates

Your React app is probably connected to the back-end in some way. Maybe you have a server, or some serverless functions that return data to your application.

If it is, then another thing I like to do is return the latest app version with every server response.

For example on Monday, maybe the server response is 2.0.0 but on Wednesday when you release an update it's 2.0.1.

With this information, I can check if the version from the server matches the version of the app that's running. If it doesn't, and a new version is available, we can refresh the browser to force the client to reload the latest version.

I'm using Node.js on the back end, so I get the version from my package.json file. When I build my React apps, I can tell the app what version it is with an environment variable in my webpack.config.js.

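const webpack = require('webpack');

module.exports = {
  entry: {
    "app": './app/app.js'
  },
  // ...
  plugins: [
    new webpack.DefinePlugin({
      // npm_package_version is set by npm when the build runs through an npm script
      'process.env.VERSION': JSON.stringify(process.env.npm_package_version)
    })
  ],
};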
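On the server side, one way to return the version with every response is a small middleware. This is a minimal sketch, assuming an Express-style (req, res, next) signature. The X-App-Version header name and the createVersionMiddleware helper are assumptions for illustration, so use whatever fits your own API.

```javascript
// Hypothetical middleware factory for Express/Connect-style servers.
// The 'X-App-Version' header name is an assumption, not a standard.
function createVersionMiddleware(version) {
  return function versionMiddleware(req, res, next) {
    res.setHeader('X-App-Version', version); // attach the version to every response
    next();
  };
}

// Usage, assuming an Express app started through an npm script
// (so that npm sets npm_package_version from package.json):
// app.use(createVersionMiddleware(process.env.npm_package_version));
```

If the server isn't started through an npm script, reading the version field from package.json directly would be an alternative.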

Now every time you get a server response, you can check if the version has changed, and refresh with window.location.reload();.

You can choose to either display a modal to the user notifying them of the update with a button to refresh or force a reload when they perform an action like moving to a different route.
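Here's a minimal sketch of the "reload on route change" option. It assumes the server sends the version in a hypothetical X-App-Version response header and that process.env.VERSION was injected at build time. The hasVersionChanged and createVersionTracker helpers are made-up names for illustration.

```javascript
// Hypothetical helper: true when the server reports a different version
// from the one this bundle was built with. Missing versions are ignored.
function hasVersionChanged(runningVersion, serverVersion) {
  return Boolean(serverVersion) && serverVersion !== runningVersion;
}

// Hypothetical tracker: remembers whether an update has been seen,
// and reloads only when asked (for example, on a route change).
function createVersionTracker(runningVersion) {
  let updateAvailable = false;
  return {
    // call with the version from each server response
    record(serverVersion) {
      if (hasVersionChanged(runningVersion, serverVersion)) {
        updateAvailable = true;
      }
    },
    // call when the location changes; `reload` can be swapped out for testing
    reloadIfStale(reload = () => window.location.reload()) {
      if (updateAvailable) reload();
      return updateAvailable;
    },
  };
}

// Usage in the browser (the URL and header name are assumptions):
// const tracker = createVersionTracker(process.env.VERSION);
// fetch('/api/user').then((res) => {
//   tracker.record(res.headers.get('X-App-Version'));
//   return res.json();
// });
// ...then call tracker.reloadIfStale() whenever the route changes.
```

Waiting for a route change means the reload happens at a natural break, rather than in the middle of something the user is typing.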

I've written a more detailed post for reloading the latest single-page application version with code examples you can use.

Add a refresh button to the error boundary

Everything we have done so far should help get the latest version of our code running in production without the user seeing any errors.

But as a last resort, I also add a "Try Again" button to the error boundary. This is a fallback so that the user can refresh the browser themselves. If any cached code is causing the issue, this should hopefully fix it.

<button onClick={() => window.location.reload(true)}>Try Again</button>
