Static site by HTML rewriting

Setting up static site generators is fun; making changes later is not. Having tried several, and always ending up hacking around other people’s code, I decided to write my own based on HTML rewriting.

Elevator pitch:

  1. Write blog post content in HTML (no Markdown or whatever)
  2. Do all preprocessing with a simple, single-pass HTML rewriter

Concretely:

  1. Posts are stored in ./blog/slug/index.html, at their desired path
  2. A post starts with a <meta> tag with metadata like title, date
  3. The rest is the post body, in HTML

For example, the source of this post (full source below) reads:

<meta type = post
      title = "Static site by HTML rewriting"
      date = 2025-12-27>
<p>Setting up static site generators...
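
To make the metadata convention concrete, here is a minimal sketch of reading the attributes of that leading <meta> tag. The helper name parsePostMeta is hypothetical, and the regex-based parsing is a stand-in chosen to keep the example dependency-free; it is not the script that builds this site, which could just as well read the attributes through the HTML rewriter.

```javascript
// Sketch: read the attributes of a post's leading <meta> tag.
// parsePostMeta is a hypothetical helper, not taken from the actual script.
// Limitation: a ">" inside a quoted attribute value is not supported.
function parsePostMeta(html) {
  // The post must start with a <meta ...> tag (leading whitespace allowed).
  const tag = /^\s*<meta\b([^>]*)>/i.exec(html);
  if (!tag) return null;

  const meta = {};
  // Matches name = "quoted", name = 'quoted', or name = unquoted,
  // with optional whitespace around the equals sign.
  const attr = /([^\s=]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))/g;
  for (const m of tag[1].matchAll(attr)) {
    // Exactly one of the three value groups is defined per match.
    meta[m[1].toLowerCase()] = m[2] ?? m[3] ?? m[4];
  }
  return meta;
}

const example = `<meta type = post
      title = "Static site by HTML rewriting"
      date = 2025-12-27>
<p>Setting up static site generators...`;

console.log(parsePostMeta(example));
```

The function returns null when the file does not start with a <meta> tag, which a build script could use to skip files that are not posts.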

SSGs always end up processing HTML in one way or another; I just cut out the middleman. Writing in HTML directly does not exclude fancy features: for example, I have a rule that replaces <img type=code src=in.c> with a syntax-highlighted code snippet. There is also good support for an HTML-oriented approach: almost every programming language has a library for processing HTML, and many support CSS selector syntax (example: img[type=code]).
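
A rule like the one described above could be sketched as an element handler. This is an illustration under assumptions, not the actual rule: codeSnippetRule, escapeHtml, and the readSource callback are hypothetical names, and plain HTML escaping stands in for real syntax highlighting. The handler only relies on getAttribute and replace, so it can be exercised with a stub element.

```javascript
// Escape text so it can be placed inside HTML content or attribute values.
function escapeHtml(text) {
  return text
    .replaceAll("&", "&amp;")
    .replaceAll("<", "&lt;")
    .replaceAll(">", "&gt;")
    .replaceAll('"', "&quot;");
}

// Build a handler for elements matching img[type=code].
// readSource is a hypothetical callback: (src) => file contents as a string.
function codeSnippetRule(readSource) {
  return {
    element(el) {
      const src = el.getAttribute("src");
      const lang = el.getAttribute("lang") ?? "plaintext";
      // A syntax highlighter could be applied here instead of plain escaping.
      const code = escapeHtml(readSource(src));
      // Swap the <img> for the rendered snippet, inserted as raw HTML.
      el.replace(
        `<pre><code class="language-${escapeHtml(lang)}">${code}</code></pre>`,
        { html: true },
      );
    },
  };
}

// Stub element for trying the handler without a rewriter.
const stub = {
  attrs: { src: "in.c", lang: "c" },
  replaced: null,
  getAttribute(name) { return this.attrs[name] ?? null; },
  replace(content, options) { this.replaced = { content, options }; },
};
codeSnippetRule(() => "if (a < b) return &x;").element(stub);
console.log(stub.replaced.content);
```

Under Bun, a handler of this shape would be registered on the rewriter with the selector from the text, along the lines of new HTMLRewriter().on("img[type=code]", codeSnippetRule(readSource)).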

I decided to implement my SSG in JavaScript, targeting Bun, which has a builtin HTML rewriting library (based on lol-html), as well as builtin APIs to quickly implement a local preview server. The whole thing is a few hundred lines, including code for the blog index, Atom feed, and a local preview server that compiles pages on-the-fly and triggers browser reloads when files change. I have a better drafting experience now than I had with off-the-shelf SSGs.
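
As an illustration of how little code a feed needs, here is a minimal sketch of generating an Atom feed from post metadata. The function names and the field names (site.title, site.url, site.author, post.title, post.date, post.path) are assumptions for this example and not taken from the actual script; dates are assumed to be in YYYY-MM-DD form, as in the <meta> tag above.

```javascript
// Escape text for use in XML content and attribute values.
function escapeXml(text) {
  return String(text)
    .replaceAll("&", "&amp;")
    .replaceAll("<", "&lt;")
    .replaceAll(">", "&gt;")
    .replaceAll('"', "&quot;");
}

// site:  { title, url, author }            (hypothetical field names)
// posts: [{ title, date: "YYYY-MM-DD", path: "/blog/slug/" }]
function atomFeed(site, posts) {
  // Newest first; ISO dates sort correctly as plain strings.
  const sorted = [...posts].sort((a, b) => b.date.localeCompare(a.date));
  // Atom wants full timestamps; midnight UTC is an arbitrary choice here.
  const stamp = (date) => escapeXml(`${date}T00:00:00Z`);
  const newest = sorted[0]?.date ?? "1970-01-01";

  const entries = sorted.map((post) => {
    const url = escapeXml(site.url + post.path);
    return [
      "  <entry>",
      `    <title>${escapeXml(post.title)}</title>`,
      `    <id>${url}</id>`,
      `    <link href="${url}"/>`,
      `    <updated>${stamp(post.date)}</updated>`,
      "  </entry>",
    ].join("\n");
  });

  return [
    '<?xml version="1.0" encoding="utf-8"?>',
    '<feed xmlns="http://www.w3.org/2005/Atom">',
    `  <title>${escapeXml(site.title)}</title>`,
    `  <id>${escapeXml(site.url)}/</id>`,
    `  <link href="${escapeXml(site.url)}/"/>`,
    `  <updated>${stamp(newest)}</updated>`,
    `  <author><name>${escapeXml(site.author)}</name></author>`,
    ...entries,
    "</feed>",
    "",
  ].join("\n");
}

const feed = atomFeed(
  { title: "A & B", url: "https://example.com", author: "Jane Doe" },
  [
    { title: "Old", date: "2024-01-01", path: "/blog/old/" },
    { title: "New <post>", date: "2025-12-27", path: "/blog/new/" },
  ],
);
console.log(feed);
```

The post path doubles as the entry id and link, which fits the convention of storing each post at its desired path.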

Recently, the company behind Bun was acquired by a large VC-backed AI company… So I’m not sure if I would recommend it still.

I still use some self-contained dependencies like highlight.js, which is another benefit of staying in the JavaScript ecosystem. But the dependency list is nowhere near the 5000-line package-lock.json I had with Metalsmith.

In some sense, the time is ripe for more minimal static site generators. Where ten years ago I needed a CSS preprocessor for nesting and calculations, nowadays both are part of standard CSS. Writing HTML by hand is much nicer than in the XHTML days: attribute values almost never have to be quoted, and the most-used prose elements like <p> and <li> often do not need end tags. If custom elements didn’t require JavaScript, I would have been tempted to do away with the static site generator entirely!

My Metalsmith-based blog lasted nine years. Hopefully, this new script holds out for another ten. And my on-and-off blogging!

Bonus: source of this post
<meta type = post
      title = "Static site by HTML rewriting"
      date = 2025-12-27>
<p>
Setting up static site generators is fun;
making changes later is not.
Having tried several, and always ending up hacking around other people’s code,
I decided to write my own based on HTML rewriting.

<p>
Elevator pitch:
<ol>
	<li>Write blog post content in HTML (no Markdown or whatever)
	<li>Do all preprocessing with a simple, single-pass HTML rewriter
</ol>

<p>
Concretely:
<ol>
	<li>Posts are stored in ./blog/slug/index.html, at their desired path
	<li>A post starts with a <code>&lt;meta&gt;</code> tag
		with metadata like title, date
	<li>The rest is the post body, in HTML
</ol>

<p>
For example, the source of this post (full source below) reads:
<img type=code src=example lang=html>

<p>
<abbr title="Static Site Generator">SSG</abbr>s always end up processing HTML in one way or another;
I just cut out the middleman.
Writing in HTML directly does not exclude fancy features:
for example, I have a rule that replaces
<code>&lt;img type=code src=in.c&gt;</code>
with a syntax-highlighted code snippet.
There is also good support for an HTML-oriented approach:
almost every programming language has a library for processing HTML,
and many support CSS selector syntax
(example: <code>img[type=code]</code>).

<p>
I decided to implement my SSG in JavaScript,
targeting <a href=https://bun.com>Bun</a>,
which has a builtin
<a href=https://bun.com/docs/runtime/html-rewriter>HTML rewriting library</a>
(based on <a href=https://github.com/cloudflare/lol-html>lol-html</a>),
as well as builtin APIs
to quickly implement a local preview server.
The whole thing is a few hundred lines,
including
code for the blog index,
Atom feed,
and a local preview server
that compiles pages on-the-fly
and triggers browser reloads when files change.
I have a better drafting experience now than I had with off-the-shelf SSGs.

<side-note>
Recently,
the company behind Bun was acquired by a large VC-backed AI company&hellip;
So I’m not sure if I would recommend it still.
</side-note>

<p>
I still use some self-contained dependencies
like <a href=https://highlightjs.org/>highlight.js</a>,
which is another benefit of staying in the JavaScript ecosystem.
But the dependency list is nowhere near the 5000-line package-lock.json I had with Metalsmith.

<p>
In some sense, the time is ripe for more minimal static site generators.
Where ten years ago I needed a CSS preprocessor for nesting and calculations,
nowadays both are part of standard CSS.
Writing HTML by hand is much nicer than in the XHTML days:
attribute values almost never have to be quoted,
and the most-used prose elements like <code>&lt;p&gt;</code> and <code>&lt;li&gt;</code>
<a href=https://html.spec.whatwg.org/multipage/syntax.html#optional-tags>often do not need end tags</a>.
If custom elements didn’t require JavaScript,
I would have been tempted to do away with the static site generator entirely!

<p>
My Metalsmith-based blog lasted nine years.
Hopefully, this new script holds out for another ten.
And my on-and-off blogging!

<details>
	<summary>Bonus: source of this post</summary>
	<img type=code src=index.html lang=html>
</details>