Journal tags: backend

6

The Invisibles

When I was talking about monitoring web performance yesterday, I linked to the CrUX data for The Session.

CrUX is a contraction of Chrome User Experience Report. CrUX just sounds better than CEAR.

It’s data gathered from actual Chrome users worldwide. It can be handy as part of a balanced performance-monitoring diet, but it’s always worth remembering that it only shows a subset of your users; those on Chrome.

The actual CrUX data is imprisoned in some hellish Google interface so some kindly people have put more humane interfaces on it. I like Calibre’s CrUX tool as well as Treo’s.

What’s nice is that you can look at the numbers for any reasonably popular website, not just your own. Lest I get too smug about the performance metrics for The Session, I can compare them to the numbers for WikiPedia or the BBC. Both of those sites are made by people who prioritise speed, and it shows.

If you scroll down to the numbers on navigation types, you’ll see something interesting. Across the board, whether it’s The Session, Wikipedia, or the BBC, the BFcache—back/forward cache—is used around 16% to 17% of the time. This is when users use the back button (or forward button).

Unless you do something to stop them, browsers will make sure that those navigations are super speedy. You might inadvertently be sabotaging the BFcache if you’re sending a Cache-Control: no-store header or if you’re using an unload event handler in JavaScript.

I guess it’s unsurprising the BFcache numbers are relatively consistent across three different websites. People are people, whatever website they’re browsing.

Where it gets interesting is in the differences. Take a look at pre-rendering. It’s 4% for the BBC and just 0.4% for Wikipedia. But on The Session it’s a whopping 35%!

That’s because I’m using speculation rules. They’re quite straightforward to implement and they pair beautifully with full-page view transitions for a slick, speedy user experience.

It doesn’t look like WikiPedia or the BBC are using speculation rules at all, which kind of surprises me.

Then again, because they’re a hidden technology I can understand why they’d slip through the cracks.

On any web project, I think it’s worth having a checklist of The Invisibles—things that aren’t displayed directly in the browser, but that can make a big difference to the user experience.

Some examples:

A web application manifest JSON file to help people install your website.
A service worker, even if it’s just to cache static assets.
Open Graph meta elements in the head of documents so they “unroll” nicely when the link is shared.
An appropriate content security policy header.
A Speculation-Rules header that points to a JSON file.
An OpenSearch XML file if your site has search functionality.
An icon, or as us olds call it, a favicon.
A lang attribute on the body of every page.

If you’ve got a checklist like that in place, you can at least ask “Whose job is this?” All too often, these things are missing because there’s no clarity on whose responsible for them. They’re sorta back-end and sorta front-end.

Federation syndication

I’m quite sure this is of no interest to anyone but me, but I finally managed to fix a longstanding weird issue with my website.

I realise that me telling you about a bug specific to my website is like me telling you about a dream I had last night—fascinating for me; incredibly dull for you.

For some reason, my site was being brought to its knees anytime I syndicated a note to Mastodon. I rolled up my sleeves to try to figure out what the problem could be. I was fairly certain the problem was with my code—I’m not much of a back-end programmer.

My tech stack is classic LAMP: Linux, Apache, MySQL and PHP. When I post a note, it gets saved to my database. Then I make a curl request to the Mastodon API to syndicate the post over there. That’s when my CPU starts climbing and my server gets all “bad gateway!” on me.

After spending far too long pulling apart my PHP and curl code, I had to come to the conclusion that I was doing nothing wrong there.

I started watching which processes were making the server fall over. It was MySQL. That seemed odd, because I’m not doing anything too crazy with my database reads.

Then I realised that the problem wasn’t any particular query. The problem was volume. But it only happened when I posted a note to Mastodon.

That’s when I had a lightbulb moment about how the fediverse works.

When I post a note to Mastodon, it includes a link back to the original note to my site. At this point Mastodon does its federation magic and starts spreading the post to all the instances subscribed to my account. And every single one of them follows the link back to the note on my site …all at the same time.

This isn’t a problem when I syndicate my blog posts, because I’ve got a caching mechanism in place for those. I didn’t think I’d need any caching for little ol’ notes. I was wrong.

A simple solution would be not to include the link back to the original note. But I like the reminder that what you see on Mastodon is just a copy. So now I’ve got the same caching mechanism for my notes as I do for my journal (and I did my links while I was at it). Everything is hunky-dory. I can syndicate to Mastodon with impunity.

See? I told you it would only be of interest to me. Although I guess there’s a lesson here. Something something caching.

JavaScript

A recurring theme in my writing and talks is “lay off the JavaScript, people!” But I have to make a conscious effort to specify that I mean client-side JavaScript.

I thought it would be obvious from the context that I was talking about the copious amounts of JavaScript being shipped to end users to download, parse, and execute. But nothing’s ever really obvious. If I don’t explicitly say JavaScript in the browser, then someone inevitably thinks I’m having a go at JavaScript, the language.

I have absolutely nothing against JavaScript the language. Just like I have nothing against Python or Ruby or any other language that you might write with on your machine or your web server. But as soon as you deliver bytes over the wire, I start having opinions. It just so happens that JavaScript is the universal language for client-side coding so that’s why I call for restraint with JavaScript specifically.

There was a time when JavaScript only existed in web browsers. That changed with Node. Now it’s possible to write code for your web server and code for web browsers using the same language. Very handy!

But just because it’s the same language doesn’t mean you should treat it the same in both circumstance. As Remy puts it:

There are two JavaScripts.

One for the server - where you can go wild.

One for the client - that should be thoughtful and careful.

I was reading something recently that referred to Eleventy as a JavaScript library. It really brought me up short. I mean, on the one hand, yes, it’s a library of code and it’s written in JavaScript. It is absolutely technically correct to call it a JavaScript library.

But in my mind, a JavaScript library is something you ship to web browsers—jQuery, React, Vue, and so on. Whereas Eleventy executes its code in order to generate HTML and that’s what gets sent to end users. Conceptually, it’s like the opposite of a JavaScript library. Eleventy does its work before any user requests a URL—JavaScript libraries do their work after a user requests a URL.

To me it seems obvious that there should an entirely different mindset for writing code intended for a web browser. But nothing’s ever really obvious.

I remember when Node was getting really popular and npm came along as a way to manage all the bundles of code that people were assembling in their Node programmes. Makes total sense. But then I thought I heard about people using npm to do the same thing for client-side code. “That can’t be right!” I thought. I must’ve misunderstood. So I talked to someone from npm and explained how I must be misunderstanding something.

But it turned out that people really were treating client-side JavaScript no different than server-side JavaScript. People really were pulling in megabytes of other people’s code to ship to end users so that they could, I dunno, left pad numbers or something.

Listen, I don’t care what you get up to in the privacy of your own codebase. But don’t poison the well of the web with profligate client-side JavaScript.

Mental models

I’ve found that the older I get, the less I care about looking stupid. This is remarkably freeing. I no longer have any hesitancy about raising my hand in a meeting to ask “What’s that acronym you just mentioned?” This sometimes has the added benefit of clarifying something for others in the room who might have been to shy to ask.

I remember a few years back being really confused about npm. Fortunately, someone who was working at npm at the time came to Brighton for FFConf, so I asked them to explain it to me.

As I understood it, npm was intended to be used for managing packages of code for Node. Wasn’t it actually called “Node Package Manager” at one point, or did I imagine that?

Anyway, the mental model I had of npm was: npm is to Node as PEAR is to PHP. A central repository of open source code projects that you could easily add to your codebase …for your server-side code.

But then I saw people talking about using npm to manage client-side JavaScript. That really confused me. That’s why I was asking for clarification.

It turns out that my confusion was somewhat warranted. The npm project had indeed started life as a repo for server-side code but had since expanded to encompass client-side code too.

I understand how it happened, but it confirmed a worrying trend I had noticed. Developers were writing front-end code as though it were back-end code.

On the one hand, that makes total sense when you consider that the code is literally in the same programming language: JavaScript.

On the other hand, it makes no sense at all! If your code’s run-time is on the server, then the size of the codebase doesn’t matter that much. Whether it’s hundreds or thousands of lines of code, the execution happens more or less independentally of the network. But that’s not how front-end development works. Every byte matters. The more code you write that needs to be executed on the user’s device, the worse the experience is for that user. You need to limit how much you’re using the network. That means leaning on what the browser gives you by default (that’s your run-time environment) and keeping your code as lean as possible.

Dave echoes my concerns in his end-of-the-year piece called The Kind of Development I Like:

I now think about npm and wonder if it’s somewhat responsible for some of the pain points of modern web development today. Fact is, npm is a server-side technology that we’ve co-opted on the client and I think we’re feeling those repercussions in the browser.

Writing back-end and writing front-end code require very different approaches, in my opinion. But those differences have been erased in “modern” JavaScript.

The Unix Philosophy encourages us to write small micro libraries that do one thing and do it well. The Node.js Ecosystem did this in spades. This works great on the server where importing a small file has a very small cost. On the client, however, this has enormous costs.

In a funny way, this situation reminds me of something I saw happening over twenty years ago. Print designers were starting to do web design. They had a wealth of experience and knowledge around colour theory, typography, hierarchy and contrast. That was all very valuable to bring to the world of the web. But the web also has fundamental differences to print design. In print, you can use as many typefaces as you want, whereas on the web, to this day, you need to be judicious in the range of fonts you use. But in print, you might have to limit your colour palette for cost reasons (depending on the printing process), whereas on the web, colours are basically free. And then there’s the biggest difference of all: working within known dimensions of a fixed page in print compared to working within the unknowable dimensions of flexible viewports on the web.

Fast forward to today and we’ve got a lot of Computer Science graduates moving into front-end development. They’re bringing with them a treasure trove of experience in writing robust scalable code. But web browsers aren’t like web servers. If your back-end code is getting so big that it’s starting to run noticably slowly, you can throw more computing power at it by scaling up your server. That’s not an option on the front-end where you don’t really have one run-time environment—your end users have their own run-time environment with its own constraints around computing power and network connectivity.

That’s a very, very challenging world to get your head around. The safer option is to stick to the mental model you’re familiar with, whether you’re a print designer or a Computer Science graduate. But that does a disservice to end users who are relying on you to deliver a good experience on the World Wide Web.

Indy web

It was Indie Web Camp Brighton on the weekend. After a day of thought-provoking discussions, I thoroughly enjoyed spending the second day tinkering on my website.

For a while now, I’ve wanted to add maps to my monthly archive pages (to accompany the calendar heatmaps I added at a previous Indie Web Camp). Whenever I post anything to my site—a blog post, a note, a link—it’s timestamped and geotagged. I thought it would be fun to expose that in a glanceable way. A map seems like the right medium for that, but I wanted to avoid the obvious route of dropping a load of pins on a map. Instead I was looking for something more like the maps in Indiana Jones films—a line drawn from place to place to show the movement over time.

I talked to Aaron about this and his advice was that a client-side JavaScript embedded map would be the easiest option. But that seemed like overkill to me. This map didn’t need to be pannable or zoomable; just glanceable. So I decided to see if how far I could get with a static map. I timeboxed two hours for it.

After two hours, I admitted defeat.

I was able to find the kind of static maps I wanted from Mapbox—I’m already using them for my check-ins. I could even add a polyline, which is exactly what I wanted. But instead of passing latitude and longitude co-ordinates for the points on the polyline, the docs explain that I needed to provide …cur ominous thunder and lightning… The Encoded Polyline Algorithm Format.

Go to that link. I’ll wait.

Did you read through the eleven steps of instructions? Did you also think it was a piss take?

Take the initial signed value.

Multiply it by 1e5.

Convert that decimal value to binary.

Left-shift the binary value one bit.

If the original decimal value is negative, invert this encoding.

Break the binary value out into 5-bit chunks.

Place the 5-bit chunks into reverse order.

OR each value with 0x20 if another bit chunk follows.

Convert each value to decimal.

Add 63 to each value.

Convert each value to its ASCII equivalent.

This was way beyond my brain’s pay grade. But surely someone else had written the code I needed? I did some Duck Duck Going and found a piece of PHP code to do the encoding. It didn’t work. I Ducked Ducked and Went some more. I found a different piece of PHP code. That didn’t work either.

At this point, my allotted time was up. If I wanted to have something to demo by the end of the day, I needed to switch gears. So I did.

I used Leaflet.js to create the maps I wanted using client-side JavaScript. Here’s the JavaScript code I wrote.

It waits until the page has finished loading, then it searches for any instances of the h-geo microformat (a way of encoding latitude and longitude coordinates in HTML). If there are three or more, it generates a script element to pull in the Leaflet library, and a corresponding style element. Then it draws the map with the polyline on it. I ended up using Stamen’s beautiful watercolour map tiles.