blog
Engineering, incidents, and the tools between them.
Technical deep dives on causal tracing, pre-arm buffers, and distributed systems. War stories from real incidents. Practical SRE content you can use today.
The Incident Is the Outlier
Sampling-based tracing is optimized for showing you normal requests on normal days. Incidents are neither. Why sampling as a primitive fails at incident reconstruction.
Migrating from OpsGenie: What Every Alternative Misses
Every OpsGenie alternative covers on-call routing and Slack integration. None of them fix the gap between the team being assembled and the team actually investigating the same thing.
The First 20 Minutes of Every Incident Are Wasted
Five engineers. Five dashboards. Zero agreement. The problem isn't observability — it's that nobody is looking at the same thing.
One Middleware, Eight Ghosts
What happens when you add distributed tracing to a single service — ghost services, caller-side metrics, and a dependency map you never drew.