Engineering · 12 Nov 2025

Designing for the millisecond

Why I treat response time as a first-class design constraint, and the techniques that keep an interface feeling instant from database query to first paint.

Speed is not a feature you bolt on at the end. It is a design constraint you carry from the first sketch to the last deploy — and like every constraint worth keeping, it shapes the thing you build into something better than it would otherwise be.

The budget is the brief

Before a single component is drawn, I write down a number. Not a vague aspiration — a hard budget, in milliseconds, for the moment between a tap and a visible response. Everything downstream answers to it: the query plan, the payload size, the animation curve, the order in which bytes arrive.

A budget you can't measure is a wish. A budget you measure on every commit is a contract.

When the number is explicit, trade-offs stop being arguments and start being arithmetic. You don't debate whether a 400 ms waterfall is acceptable — you look at the budget and you know.
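
To make that arithmetic concrete, here is a minimal sketch of a budget check. The stage names and timings are invented for illustration, the stages are assumed to run strictly one after another so their durations simply add up, and `checkBudget` with its report shape is a hypothetical helper rather than part of any library.

```typescript
// Hypothetical stages of one request. The numbers are made up to add up to
// the 400 ms waterfall mentioned above.
interface Stage {
  name: string;
  ms: number;
}

interface BudgetReport {
  totalMs: number;
  budgetMs: number;
  overByMs: number; // 0 when the total fits the budget
  withinBudget: boolean;
  slowest: Stage | undefined; // undefined only for an empty stage list
}

// Assumption: stages are sequential, so the total is a plain sum.
function checkBudget(stages: Stage[], budgetMs: number): BudgetReport {
  const totalMs = stages.reduce((sum, stage) => sum + stage.ms, 0);
  const slowest = stages.reduce<Stage | undefined>(
    (worst, stage) => (worst === undefined || stage.ms > worst.ms ? stage : worst),
    undefined,
  );
  return {
    totalMs,
    budgetMs,
    overByMs: Math.max(0, totalMs - budgetMs),
    withinBudget: totalMs <= budgetMs,
    slowest,
  };
}

const report = checkBudget(
  [
    { name: "query", ms: 180 },
    { name: "serialize", ms: 40 },
    { name: "transfer", ms: 120 },
    { name: "render", ms: 60 },
  ],
  120,
);

if (!report.withinBudget) {
  console.warn(
    `over budget by ${report.overByMs}ms, slowest stage: ${report.slowest?.name}`,
  );
}
```

Fed with real measurements instead of invented ones, a check along these lines could fail a build instead of starting a debate.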

A waterfall chart with the latency budget drawn as a hard ceiling line

The chart above is the whole argument in one image: every bar that pokes above the line is a conversation you get to have before it ships, not an incident you get to explain after.

Where the milliseconds hide

Most perceived latency is not compute. It is waiting: for a round trip, for a font, for a layout pass that blocks paint. The fix is rarely "make it faster" and usually "make it sooner" — move the work earlier, or move it off the critical path entirely.
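
One way to read "make it sooner" in code is to start independent requests together rather than awaiting them one at a time. The sketch below assumes two loaders that do not depend on each other; `User`, `Feed`, `loadUser` and `loadFeed` are placeholder names for illustration only.

```typescript
interface User {
  id: string;
  name: string;
}

interface Feed {
  items: string[];
}

type Loader<T> = () => Promise<T>;

// Waterfall: the feed request does not start until the user request has
// finished, so the total wait is roughly the sum of both round trips.
async function loadInSequence(
  loadUser: Loader<User>,
  loadFeed: Loader<Feed>,
): Promise<{ user: User; feed: Feed }> {
  const user = await loadUser();
  const feed = await loadFeed();
  return { user, feed };
}

// Sooner: both requests are started before anything is awaited, so the
// total wait is roughly the slower of the two round trips.
async function loadTogether(
  loadUser: Loader<User>,
  loadFeed: Loader<Feed>,
): Promise<{ user: User; feed: Feed }> {
  const [user, feed] = await Promise.all([loadUser(), loadFeed()]);
  return { user, feed };
}
```

Neither function does less work than the other. The second one simply stops waiting for things it does not need yet.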

  • Prefetch the likely next step before the user asks for it.
  • Stream the shell so the frame exists while data is still in flight.
  • Defer the inessential — analytics, third-party widgets, anything below the fold.
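
For the first item in that list, a minimal sketch of prefetching: start the likely next request early and hand the same in-flight promise to whoever asks for it later. `createPrefetcher` and `load` are hypothetical names, and evicting failed requests so they can be retried is an assumption made here, not a rule.

```typescript
type Load<T> = (key: string) => Promise<T>;

interface Prefetcher<T> {
  prefetch(key: string): void;
  get(key: string): Promise<T>;
}

function createPrefetcher<T>(load: Load<T>): Prefetcher<T> {
  // Entries live as long as the prefetcher does; a production cache would
  // also need an expiry policy, which is left out of this sketch.
  const inFlight = new Map<string, Promise<T>>();

  function get(key: string): Promise<T> {
    const cached = inFlight.get(key);
    if (cached !== undefined) return cached;

    const request = load(key);
    inFlight.set(key, request);
    // Assumption: a failed request is evicted so a later call can retry.
    request.catch(() => {
      if (inFlight.get(key) === request) inFlight.delete(key);
    });
    return request;
  }

  return {
    // Fire and forget: warm the cache, for example on hover or focus.
    prefetch: (key: string): void => {
      void get(key);
    },
    get,
  };
}
```

Calling `prefetch("next-page")` when the user hovers a link, then `get("next-page")` on click, means the click waits only for whatever part of the round trip is still outstanding.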

A small helper I reach for constantly is a timing wrapper that surfaces the budget right in the logs:

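// Hard budget, in milliseconds, for any operation wrapped in timed().
const BUDGET_MS = 120;

// Runs fn, passes its result (or error) through, and warns if it was slow.
async function timed<T>(label: string, fn: () => Promise<T>): Promise<T> {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    // finally runs on success and on failure, so a slow error path is reported too.
    const elapsed = performance.now() - start;
    if (elapsed > BUDGET_MS) {
      console.warn(`[slow] ${label} took ${elapsed.toFixed(1)}ms (budget ${BUDGET_MS}ms)`);
    }
  }
}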

It is almost embarrassingly simple, but a warning in the console the moment a path blows its budget catches regressions far earlier than any dashboard.
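
If different paths need different budgets, one possible variation is a small factory that takes the budget, the clock and the reporter as options. This is only a sketch: `makeTimed`, `TimedOptions` and the example budgets of 40 ms and 16 ms are assumptions introduced here, and the injectable clock exists mainly so the wrapper can be exercised in a test without real waiting.

```typescript
interface TimedOptions {
  budgetMs: number;
  now?: () => number; // clock in milliseconds, injectable for tests
  report?: (message: string) => void; // where warnings are sent
}

function makeTimed(options: TimedOptions) {
  const budgetMs = options.budgetMs;
  const now = options.now ?? (() => performance.now());
  const report = options.report ?? ((message: string) => console.warn(message));

  return async function timed<T>(label: string, fn: () => Promise<T>): Promise<T> {
    const start = now();
    try {
      return await fn();
    } finally {
      // Same rule as the single-budget helper: report slow successes and slow failures.
      const elapsed = now() - start;
      if (elapsed > budgetMs) {
        report(`[slow] ${label} took ${elapsed.toFixed(1)}ms (budget ${budgetMs}ms)`);
      }
    }
  };
}

// Hypothetical per-path budgets.
const timedQuery = makeTimed({ budgetMs: 40 });
const timedRender = makeTimed({ budgetMs: 16 });
```

A call then reads `await timedQuery("load-profile", () => fetchProfile())`, where `fetchProfile` stands in for whatever async work is being measured.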

First paint is a promise

The first frame is the interface making a promise: I heard you, and I'm working on it. Honour that promise within ~100 ms and the rest of the load can take its time — the user is already reassured. Miss it, and no amount of raw throughput will make the thing feel responsive.
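
In code, keeping that promise mostly means acknowledging the input synchronously, before any network or database work is awaited. A minimal sketch, assuming a hypothetical `View` interface with `acknowledge`, `render` and `fail` methods (none of these names come from a real framework):

```typescript
interface View<T> {
  acknowledge(): void; // cheap, synchronous: pressed state, skeleton, spinner
  render(data: T): void; // the real content, whenever it arrives
  fail(error: unknown): void; // honest error state if loading fails
}

function respondTo<T>(view: View<T>, load: () => Promise<T>): Promise<void> {
  // Acknowledge first, in the same tick as the input event, so the next
  // frame can already show that the tap was heard.
  view.acknowledge();

  // Only then start the slow part. The second callback handles load
  // failures; an exception thrown by render itself is not swallowed.
  return load().then(
    (data) => view.render(data),
    (error) => view.fail(error),
  );
}
```

The ordering is the point: nothing slow sits between the user's input and the first visible change.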

That is the whole discipline, really. Not "be fast everywhere," but "be instant where it counts, and honest everywhere else."