Featured · Distributed Systems · Postmortem

Why we replaced our message queue (and what we learned)

After two years of running a custom message broker, we finally pulled the plug and migrated everything to NATS JetStream. The migration took six weeks of careful planning, three weekends of late-night cutovers and one production incident that we are still apologising for.

Alex Chen · Jun 2, 2026 · 9 min read

Latest writing

Engineering Culture

Feature flags are not a substitute for testing

Every team eventually discovers the joy of shipping behind a flag. Fewer teams discover the misery of cleaning up flags two years later when the original author has long since left the company. A short rant and a longer guide.

Priya Raman · Apr 24, 2026 · 5 min read
Internal Tools · UX

Designing an admin panel that engineers do not hate

Internal tools tend to rot the moment they leave the founding team. We rebuilt our admin from scratch around three principles: every action should be auditable, nothing should require a tribal-knowledge runbook, and there should be exactly one button that resolves the on-call page.

Mei Watanabe · May 11, 2026 · 8 min read
Go · Tooling

Building a tiny static site generator in 200 lines of Go

Sometimes the right answer is not Hugo or Astro but a file you can read in one sitting. We walked through every line at a guild lunch and somehow shipped a new docs site by the end of the day.

Sam Lee · May 27, 2026 · 7 min read
Postgres · Kubernetes

Notes on running Postgres on Kubernetes in 2026

Five years ago this would have been a controversial post. Today we have battle-tested operators, decent storage classes and a community that has finally agreed on backup tooling. Here is the stack we landed on for a fleet of around forty clusters.

Hassan Idris · Apr 27, 2026 · 10 min read
Go · Observability

Profiling Go services in production without breaking everything

Continuous profiling sounds great until you turn it on in a fleet of three hundred pods and your p99 latency doubles. Here is how we instrumented pprof endpoints behind a feature flag, sampled at one percent and shipped flame graphs straight to Grafana.

Priya Raman · May 11, 2026 · 7 min read
Databases · Operations

A practical guide to writing migrations you can actually run on Friday

Most database migrations fail not because the SQL is wrong but because nobody tested them against a realistic dataset. We share the rollout checklist we use for every schema change, including the awkward question of when to lock a table and when to live with the inconsistency.

Sam Lee · Apr 14, 2026 · 11 min read