Skip to content

Commit

Permalink
Add new Navigations post
Browse files Browse the repository at this point in the history
  • Loading branch information
csswizardry committed Aug 19, 2024
1 parent 9b17ac4 commit 3f787fd
Showing 1 changed file with 312 additions and 0 deletions.
312 changes: 312 additions & 0 deletions _posts/2024-08-19-cache-grab-how-much-are-you-leaving-on-the-table.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,312 @@
---
layout: post
title: "Cache Grab: How Much Are You Leaving on the Table?"
date: 2024-08-19 13:17:11
categories: Web Development
main: "https://csswizardry.com/wp-content/uploads/2024/08/devtools.png"
meta: "Quantifying the importance of caching just got a lot easier"
---

For the longest time now, I have been _obsessed_ with caching. I think every
developer of any discipline would agree that caching is important, but I do tend
to find that, particularly with web developers, gaps in knowledge leave a lot of
opportunities for optimisation on the table. How does it affect you?

<div class="c-highlight mb">

<p>Want to know everything (and more) about HTTP cache? <a
href="https://slideslive.com/39021005/cache-rules-everything">Check out my
recent talk at WebExpo.</a></p>

</div>

## Navigation Information

The CrUX Report has begun including [navigation
information](https://developer.chrome.com/blog/crux-navigation-types) which
tells you _how_ people visited your pages: did they land afresh? Reload the
page? Arrive via the back button? This new data not only gives us insights as to
how people visit and traverse our websites, but also offers up opportunities for
optimisation.

The possible navigation types are:

* **Navigate:** A hard navigation that resulted in an HTML page being fetched
from the network.
* **Cache:** An HTML response returned from the HTTP cache.
* **Back–Forward:** A navigation incurred by hitting the back or forward button.
* **bfcache:** A navigation incurred by hitting back or forward button that was
served by the browser’s [back/forward cache](https://web.dev/articles/bfcache)
(bfcache).
* **Reload:** A user reloaded the page.
* **Restore:** The page was restored either as the result of a crash, a browser
restart, or the page had previously been unloaded by the browser due to memory
constraints, etc.
* **Prerender:** The page was prerendered, usually as a result of the new
[Speculation Rules
API](https://developer.mozilla.org/en-US/docs/Web/API/Speculation_Rules_API).

An important thing to note is that each of these navigation types are exclusive:
a bfcache hit will have been triggered by someone hitting their back button, but
does not also get counted in the Back–Forward navigation type.



<small><strong>N.B.</strong> As with all CrUX data, this only applies to hard
navigations—SPAs are not very well represented in the report.</small>

## Insights

We can infer a lot from the data. Some of it is pretty matter-of-fact whereas
other aspects depend a little on how you view them—you’d probably want to
cross-reference a few statistics with other analytics tools to check whether an
inefficiency has been highlighted. Let’s take a look at the BBC as an example.

<figure>
<img src="{{ site.cloudinary }}/wp-content/uploads/2024/08/bbc.png" alt="" width="1200" height="662">
<figcaption><a href="https://treo.sh/sitespeed/www.bbc.co.uk">Treo</a>’s free <i>Sitespeed</i> tool is a great way to view origin-level Core Web Vitals data.</figcaption>
</figure>

### Navigate

**70% of page views are _Navigations_**, which means a user landed on them via
a hard navigation and the file was fetched from the network. This is what we
imagine most cold-start page views to look like by default, and nothing is
particularly out of the ordinary. That said, they are comparably slow, so can we
move some of these page views elsewhere?

### Cache

Only **0.6% of page views came from _Cache_**. This isn’t a bad thing
in-and-of itself, but let’s consider two competing points:

1. **If a lot of navigations are from cache, our caching strategy must be pretty
good!** I’m of the opinion that almost all responses can be cached for at
least a little but, so serving absolutely zero files from cache would be
cause for concern. Are we leaving opportunities on the table here?
2. **If a lot of navigations are from cache, why are users hitting the same
pages over and over again?** The only way a file can be served from browser
cache is if it’s been visited before, and a high percentage of _Cache_
navigation types would indicate that people are revisiting the same URLs(s)
repeatedly. Does this match behaviour you would expect of your user base? Or
does this potentially point at holes in your IA?

I would consider this entry in context of your own project to determine what
issues it highlights. Sites where we would expect users to hit the same URL
multiple times would benefit from moving more navigations into this bucket;
sites that would not expect to see many repeat page views might have other
problems.

### Back–Forward

**5.4% of navigations are from hitting the _back or forward_ button** but couldn’t
be served from the bfcache specifically. Remember, these buckets are mutually
exclusive. These navigations were triggered by a user hitting their browser
controls, but we don’t know if the response itself came from the network or
cache. What we do know is that they didn’t come from bfcache, and this is one of
the first opportunities for improvement we’ll find—we want to move as many of
the _Back–Forward_ navigations into the _bfcache_ bucket as possible. So, a high
number of _Back–Forward_ navigations tell us a lot about how users traverse our
site, and also that we aren’t serving these navigation types as quickly as we
perhaps could.

### bfcache

**13.7% of navigations are _bfcache_.** This is promising. We’d always prefer
this number to be larger than _Back–Forward_, because bfcache navigations are
triggered the same way, only served faster. [bfcache is
better.](https://web.dev/articles/bfcache) Again, these navigation types are
exclusive, so URLs that enter the bfcache bucket will not also be counted in
Back–Forward. Ideally, we’d move all of the entries from _Back–Forward_ into
_bfcache_ by fixing any issues that prevent the back/forward cache from being
used.

The **71.7%** figure listed is a nice touch from Treo, and it shows us that of
all navigation-types initiated by the back or forward buttons, how many were of
the much faster _bfcache_ variant? This is your hit-rate, and we can see that
the BBC serve the majority of it’s back/forward button page views from the
back/forward cache. The figure is determined by:

`hit rate = bfcache / (bfcache + back–forward)`, or:

`13.7 / (13.7 + 5.4) = 0.717277487`

The reason the bfcache is so much faster is that, where traditional back/forward
navigations should hopefully retrieve most of their (sub)resources from HTTP
cache, the bfcache restores pages from memory—near-instant page loads! I’d
recommend [reading up on bfcache](https://web.dev/articles/bfcache) and how to
gain access to it.

As it stands, the BBC serves more of its back/forward initiated navigations via
the much faster bfcache—this is a good thing, but there is still the remaining
28% that could be fixed.

### Reload

**7.5% of navigations are _Reloads_.** Again, this would need viewing in
context. A lot of reloads could be the symptom of broken-looking sites or sites
where content has failed to load. Or, it could be quite typical of your site in
general—let’s say you provide breaking news, or live flight status, or you’re
Hacker News. It may well be a site that expects to have users refreshing pages,
which might therefore mean there no cause for concern. If this isn’t what you’d
expect, I would try to cross reference this behaviour with in-browser error
tracking, etc.

### Restore

**0.2% or page views are a _Restore_.** These are navigations that are initiated
by either a browser restart, or a tab being restored after previously having
been unloaded for memory-preservation reasons. Again, this information is less
performance-facing and might point to other issues with your site, so cross
reference this data with other sources of user behaviour.

<small>Chrome for Android does not currently emit <i>Restore</i> navigation
types and instead <a href="https://issues.chromium.org/issues/40268039">includes
them in the previous <i>Reload</i> type</a>.</small>

### Prerender

Finally, we see that **2.5% of navigations are _Prerender_.** These are among
the more interesting navigation types for the performance engineer and, as with
bfcache, prerendered pages provide near-instant page loads. Prerendered pages
are navigations that have been preemptively fetched and assembled and are
completed, ideally, before a user actually requests them.

The most current mechanism for prerendering URLs is the [Speculation Rules
API](https://developer.mozilla.org/en-US/docs/Web/API/Speculation_Rules_API),
a declarative way to instruct supporting browsers to fetch certain responses
ahead of time. We could opt to `prefetch` responses, or go in with the much more
fuller-featured `prerender`:

* **`prefetch`:** This simply fetches the URL in question and drops it into the
HTTP cache. None of the target URL’s subresources are fetched, and no
processing of the response is done. `prefetch` is much more light-touch than…
* **`prerender`:** This will fetch the target URL and its subresources, as well
as parse and render the resulting page, and process timers, scripts, etc.
`prerender` is much more intensive.

Both of these are faster than doing nothing at all, but they do have drawbacks.
These include, but are not limited, to increased and potentially wasted resource
usage on both the server and the client as more requests are made than a user
might utilise, and, if also `prerender`ing, more pages are being constructed
than may potentially be viewed. While `prefetch` is generally safer than
`prerender`, it’s vital that you fully understand the implications of both.

On this page right now, I use the new Speculation Rules API to `prerender` the
previous and next articles that are available via the pagination component:

```html
{% raw %}<script type=speculationrules>
{
"prerender": [
{
"urls": [
{% if page.next.url %}
"{{ page.next.url }}",
{% endif %}
{% if page.previous.url %}
"{{ page.previous.url }}"
{% endif %}
]
}
]
}
</script>{% endraw %}
```

I could also extend this to `prefetch` the documents at the end of any links
I hover:

```html
<script type=speculationrules>
{
...
"prefetch": [
{
"source": "document",
"where": {
"selector_matches": "a[href]"
},
"eagerness": "moderate"
}
]
}
</script>
```

Be sure to read [the
documentation](https://developer.mozilla.org/en-US/docs/Web/API/Speculation_Rules_API)
and [Chrome’s own
announcement](https://developer.chrome.com/docs/web-platform/prerender-pages) to
see what configuration and options we have available, as well as important
considerations for safe implementation.

#### Debugging Prerenders

There are a number of different tools for debugging and observing Speculative
Loads. Firstly, the _Speculative loads_ section in Chrome’s DevTools is likely
to be the most useful when working locally to create your Speculation Rules:

<figure>
<img src="{{ site.cloudinary }}/wp-content/uploads/2024/08/devtools.png" alt="" width="750" height="433">
<figcaption>Chrome’s Speculative loads section tells you about documents that
loaded or were loaded by any currently active Speculation Rules.</figcaption>
</figure>

Secondly, any requests made as the result of either a `prefetch` or
a `prerender` would carry the following request headers _respectively_:

```
Sec-Purpose: prefetch
```

Or:

```
Sec-Purpose: prefetch;prerender
```

<figure>
<img src="{{ site.cloudinary }}/wp-content/uploads/2024/08/headers.png" alt="" width="750" height="434">
<figcaption>The <code>Sec-Purpose</code> header shows why a file was requested.</figcaption>
</figure>

You could use your own logs to determine how many `prefetch` or `prerender`
requests you received, but this wouldn’t tell you how many `prefetch`ed or
`prerender`ed pages were actually served to your visitors. CrUX and/or custom
monitoring would be needed for that.

## How Far Can We Take This?

If you really want to see someone overachieve in this space, check out [Tim
Vereecke](https://x.com/timvereecke)’s
[Scalemates](https://www.scalemates.com/):

<figure>
<img src="{{ site.cloudinary }}/wp-content/uploads/2024/08/scalemates.png"
alt="CrUX navigation data for scalemates.com" width="750" height="671">
<figcaption>Stunning numbers!</figcaption>
</figure>

A third of navigations prerendered and almost a fifth from cache! 83% of all
back/forward-type navigations were restored from the much faster bfcache. Just
over half of Tim’s page views were served near-instantly.

## Key Takeaway

From the performance engineer’s perspective, what we’d really like to do is
increase the count of _bfcache_ and _Prerender_ navigation types, and if your
site fits the bill, perhaps move a few more URLs into _Cache_ as well. The key
thing to remember is that navigations can only come from cache if a user is
hitting the URL for a subsequent time: this means that they still likely
incurred a relatively expensive _Navigation_ type for their visit, and also
their repeat views might point at other IA-type issues. Having a solid caching
strategy is very important, but it’s equally important to understand how and why
your users have ended up in that situation in the first place.

New web platform APIs can grant near-instant page loads for our users, and they
are relatively trivial to implement. I would encourage all developers to look
where speculative loads fit into their sites, and also how and where users can
be granted access to bfcache navigations when they hit the back or forward
button. These are much cheaper optimisations than we’re commonly told to
implement.

0 comments on commit 3f787fd

Please sign in to comment.