Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report cache status to RUM #4537

Merged
merged 2 commits into from
Jan 13, 2025
Merged

Report cache status to RUM #4537

merged 2 commits into from
Jan 13, 2025

Conversation

richardTowers
Copy link
Contributor

@richardTowers richardTowers commented Jan 9, 2025

What

Refactor the custom RUM measurement code, and add some new code to report whether or not a navigation was a cache hit.

Why

It would be great if we could segment our performance statistics into cache hits and cache misses. This would help us quantify how much benefit the CDN is giving us, and also how much of an impact cache miss performance has on users.

If this shows that (a) quite a lot of real user requests are cache misses and (b) cache misses are much slower than cache hits then we'd be more strongly incentivised to improve cache hit rate and origin performance. On the other hand, if overall web performance isn't significantly degraded by cache misses, we might be more willing to change the architecture of origin in ways that might make it's performance less good. It would also be good to measure the impact of any such moves, which this should help us do.

How

I've split this into two commits - the first refactors the custom RUM data so it's in its own module, which simplifies a lot of the checks it was doing. The second adds the code I actually want. This uses data from the Server-Timing header, which is populated by Fastly using some VCL code I added in alphagov/govuk-fastly#130 (links with more context in that PR).

Visual Changes

None

@govuk-ci govuk-ci temporarily deployed to components-gem-pr-4537 January 9, 2025 17:43 Inactive
@govuk-ci govuk-ci temporarily deployed to components-gem-pr-4537 January 10, 2025 10:29 Inactive
@govuk-ci govuk-ci temporarily deployed to components-gem-pr-4537 January 10, 2025 11:03 Inactive
@govuk-ci govuk-ci temporarily deployed to components-gem-pr-4537 January 10, 2025 11:13 Inactive
@govuk-ci govuk-ci temporarily deployed to components-gem-pr-4537 January 10, 2025 11:21 Inactive
@govuk-ci govuk-ci temporarily deployed to components-gem-pr-4537 January 10, 2025 11:35 Inactive
@richardTowers richardTowers changed the title Rum cache state Report cache status to RUM Jan 10, 2025
@govuk-ci govuk-ci temporarily deployed to components-gem-pr-4537 January 10, 2025 11:46 Inactive
@richardTowers richardTowers marked this pull request as ready for review January 10, 2025 11:46
Copy link
Contributor

@sihugh sihugh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK to me 👍

@govuk-ci govuk-ci temporarily deployed to components-gem-pr-4537 January 13, 2025 10:06 Inactive
Copy link
Contributor

@AshGDS AshGDS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Nitpick: Can the last two commits be merged together?

It's a bit surprising to have a bunch of our code tacked on to the end
of a vendored file. Our docs say the lux-measurer file "shouldn't ever
change", but that feels like a bit of a shaky assumption.

In any case, it feels much cleaner to have our code in its own file,
instead of muddling it with vendored code.

I've use script type=module for two reasons:

1) Like the design system, we can use it as a check that we're in a
   modern browser, which means we can use const / arrow functions and
   functions like `.some()` which might not be present in old browsers.
   For the same reason, we can simplify a lot of the checks around whether
   the performance API will be there, and whether it will work correctly.
2) It defers the execution of the script, so we'll automatically wait
   for the HTML to be fully parsed before our code runs. Although we're
   not interacting with the HTML, I think this should more or less
   guarantee that (a) the nextHopProtocol timing property is available
   and (b) the LUX global will already have been defined by lux-measurer.js
   Just in case (b) isn't always true, I've checked that LUX is defined
   before attempting to use it.

I'm also removing the try / catch code - it doesn't seem like it's doing
anything useful even if there is an error, and running as a module the
code shouldn't have any reference errors / surprisingly undefined APIs.
We added the cache state to the Server-Timing header in alphagov/govuk-fastly#130

This uses the browser's performance API to check if there's a cacheHit /
cacheMiss value in the Server-Timing header, and if there is, report it
to RUM.

This will allow us to separate the performance of GOV.UK for cache hits
from cache misses, which will give us some useful insights into how much
the cache speeds things up, and how many requests are actually cached in
@govuk-ci govuk-ci temporarily deployed to components-gem-pr-4537 January 13, 2025 10:34 Inactive
@richardTowers richardTowers merged commit 5e49f36 into main Jan 13, 2025
12 checks passed
@richardTowers richardTowers deleted the rum-cache-state branch January 13, 2025 10:53
@AshGDS AshGDS mentioned this pull request Jan 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants