
Your team.
One causal truth.
Before the war room starts guessing.

Monitoring tools help you investigate. Incidentary helps your team converge first.

When an alert fires, teams don't lack dashboards. They lack agreement. Incidentary captures the pre-alert causal chain and delivers it as a shared replayable artifact — so the room starts from one picture, not five.

pre-alert causal trace · assembled in < 2s
trace · onboarding-quickstart · 1.8s

  service / operation      timeline (ms)
  api-gateway (root)                1847
  checkout-svc                      1588
  payment-svc  ERR                   996
  payment-svc                        848
  inventory-svc                      442

5 spans · 1 error · pre-alert
root cause: payment-svc · pg_pool exhaustion
  • < 2s · trace assembly time
  • 1 artifact · shared by the whole room
  • 60 sec · pre-alert window captured
  • No lock-in · open-source capture layer

Built for the teams who feel the pain of
distributed incidents.

just split the monolith?

You went from one service to three. Now incidents involve services you didn't even know called each other. Incidentary shows the causal chain across every boundary — before the war room starts guessing.

running distributed services?

When five engineers are looking at five dashboards, agreement takes longer than the fix. Incidentary delivers one shared artifact so the room converges before anyone opens a terminal.

One SDK install.
Every dependency revealed.

Install the SDK on a single service. Incidentary observes every outbound call and surfaces uninstrumented dependencies as ghost services — services you depend on but don't have data from. One install reveals your entire dependency topology.

No teammates required. No config. The anomaly feed catches latency spikes and error bursts before they become incidents. The coverage scorecard shows you where to instrument next.

You get value in five minutes, not after the next outage.
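The ghost-service idea can be pictured in a few lines: a service is a "ghost" when it appears as the target of observed outbound calls but never reports spans of its own. The record shapes and function below are illustrative assumptions, not the actual SDK.

```typescript
// Hypothetical sketch: deriving ghost services from observed spans.
// A "ghost" is a service we see calls TO, but no telemetry FROM.
interface ObservedSpan {
  service: string;   // the instrumented service that emitted the span
  target?: string;   // the downstream service it called, if any
}

function ghostServices(spans: ObservedSpan[]): string[] {
  const instrumented = new Set(spans.map((s) => s.service));
  const called = spans.flatMap((s) => (s.target ? [s.target] : []));
  // Anything called but never reporting its own spans is a ghost.
  return [...new Set(called)]
    .filter((t) => !instrumented.has(t))
    .sort();
}
```

With one install on a single service, every downstream it calls shows up in the topology immediately, instrumented or not.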

  • 1 service · See what your service depends on
  • 2–5 services · Map your topology as you instrument
  • 6–15 services · Cross-service incidents, clearly
  • 15+ services · Team convergence and shared traces

The 60 seconds before the alert.
Usually reconstructed at 3am.
Now already waiting.

Signal correlators watch your telemetry streams continuously. The moment anomalies appear, pre-arm sequences begin — assembling the causal path, linking related events, and tagging the break before the alert fires.

By the time PagerDuty wakes your team, the causal prelude is already rendered. Not a guess. Not an AI summary. A deterministic trace built from what your services actually reported to each other.

T-90s · checkout-svc latency ↑
T-72s · payment-svc 5xx rate spike
T-55s · db-pool exhaustion detected
T-38s · trace assembly started
T-20s · root cause isolated
T-8s · runbook linked
alert fires → context ready

Four steps. No black boxes.

01

instrument

Drop in the SDK. One middleware call wraps your HTTP handlers and propagates incident context automatically. No distributed config files. No sampling tuning. No OpenTelemetry collector to maintain.

import { incidentary } from '@incidentary/sdk-node';

app.use(incidentary.middleware());
02

ingest

Spans, errors, and structured logs flush over a persistent gRPC stream. No sampling. No dropped events at the boundary. No gaps caused by buffer timeouts.

// spans flushed automatically
// errors captured at boundary
// logs correlated by trace-id
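As a rough illustration of the trace-id correlation above, log lines can be bucketed under the trace that produced them. Record shapes here are illustrative, not the wire format.

```typescript
// Sketch: grouping log lines by the trace id they carry.
interface LogLine {
  traceId: string;
  msg: string;
}

function groupByTrace(logs: LogLine[]): Map<string, string[]> {
  const byTrace = new Map<string, string[]>();
  for (const line of logs) {
    const bucket = byTrace.get(line.traceId) ?? [];
    bucket.push(line.msg);
    byTrace.set(line.traceId, bucket);
  }
  return byTrace;
}
```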
03

assemble

The correlator builds a causal graph in real time. When anomaly thresholds breach, the pre-arm ring buffer locks: the 60 seconds before the alert are already captured and attached to the incident artifact.

// pre-arm triggered at T-90s
// root cause isolated T-20s
// runbook linked T-8s
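One way to picture the pre-arm ring buffer: keep a rolling window of recent events, and freeze a snapshot the moment anomaly thresholds breach. This is a minimal sketch under assumed names, not the correlator's real implementation.

```typescript
// Sketch: a rolling pre-alert window that "locks" on anomaly.
interface CapturedEvent {
  at: number;      // epoch millis
  detail: string;
}

class PreArmBuffer {
  private events: CapturedEvent[] = [];

  constructor(private readonly windowMs: number) {}

  push(event: CapturedEvent): void {
    this.events.push(event);
    // Evict anything that has aged out of the rolling window.
    const cutoff = event.at - this.windowMs;
    while (this.events.length > 0 && this.events[0].at < cutoff) {
      this.events.shift();
    }
  }

  // Called when thresholds breach: snapshot the window so the
  // pre-alert seconds survive even as new events keep arriving.
  lock(): CapturedEvent[] {
    return [...this.events];
  }
}
```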
04

respond

The alert fires with a direct link. Your team opens one artifact — the complete causal trace, shared across the room. Not five dashboards. Not one senior engineer narrating to four others. One picture.

// alert fires
// one shared trace lands in Slack
// cause before dashboards, not after
82:00 · avg MTTR before Incidentary
< 8:00 · avg MTTR with Incidentary

The war room used to start by figuring out what happened.
Now it starts by acting on what happened.

Teams using Incidentary report an average MTTR drop from 82 minutes to under 8. The difference is not heroism — it is convergence happening in the first 90 seconds instead of the first 20 minutes.

Ready to see your MTTR drop? Start free →

Same incident. Different first minute.

Without Incidentary

  • Alert fires
  • Responders open different tools
  • Each person sees a different symptom
  • Cause and fallout get confused
  • One engineer synthesizes the story for everyone else
  • 10 to 20 minutes spent aligning before real debugging begins

With Incidentary

  • Alert fires with a direct link to the shared trace
  • The room opens one artifact
  • Everyone sees what broke first, how it propagated, and where coverage is missing
  • Responders align in minutes on shared evidence, not narration
  • Datadog becomes the second step, not the first

The incident is the product demo.

One engineer shares a trace link in Slack. Teammates see the causal chain without installing anything. They notice the ghost service gaps — services where Incidentary knows a call was made but can't see inside. The product sells itself through its own gaps.

01 · one engineer installs: SDK on one service, 3 minutes. Ghost services and the anomaly feed appear immediately.
02 · first incident shared: A trace link lands in Slack. Teammates see the causal chain — and the ghost service gaps.
03 · teammates instrument: Ghost services become real services. The coverage scorecard tracks progress toward full visibility.
04 · team converges: Every incident starts from one shared artifact. MTTR drops. The coverage scorecard turns green.

Every library. Zero config. One causal chain.

auto-instrumentation

The SDK detects libraries in your dependency tree and patches them at startup. No manual span creation. No config files. If OpenTelemetry already patched a library, the SDK skips it.

node: express · fastify · koa · pg · ioredis · bullmq · amqplib · kafkajs · grpc
python: fastapi · flask · django · psycopg2 · asyncpg · celery · kombu
go: gin · echo · chi

19 libraries · 3 ecosystems · zero config required
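The skip-if-already-patched behaviour can be sketched with a marker on the wrapped function. Names here are hypothetical, not the SDK's internals.

```typescript
// Sketch: wrap a library function once; if another instrumentation
// layer (e.g. OpenTelemetry) already wrapped it, leave it alone.
const PATCHED = Symbol('incidentary.patched'); // illustrative marker

function patchOnce<T extends (...args: any[]) => any>(
  fn: T,
  onCall: () => void,
): T {
  if ((fn as any)[PATCHED]) return fn; // already instrumented: skip
  const wrapped = ((...args: any[]) => {
    onCall(); // in reality: open a span, record timing, etc.
    return fn(...args);
  }) as T;
  (wrapped as any)[PATCHED] = true;
  return wrapped;
}
```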

database query capture

Query timing and connection metadata captured automatically. No parameters. No full query text. No sensitive data.

pg · ioredis · psycopg2 · asyncpg
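A rough picture of metadata-only capture: keep the operation and the timing, drop anything that could carry data. The shape below is an assumption for illustration, not the SDK's record format.

```typescript
// Sketch: record query timing without parameters or full SQL text.
interface QueryEvent {
  op: string;        // SELECT / INSERT / ... — never the full statement
  durationMs: number;
}

function recordQuery(sql: string, durationMs: number): QueryEvent {
  // Keep only the leading verb; everything after it could hold data.
  const op = sql.trim().split(/\s+/)[0].toUpperCase();
  return { op, durationMs };
}
```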

queue instrumentation

Publish-consume pairs linked causally. Async workflows traced end-to-end without manual context propagation.

bullmq · amqplib · kafkajs · celery · kombu
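Causal linkage across a queue generally means carrying the trace id in message headers, so the consumer joins the publisher's trace instead of starting a new one. Header and field names below are illustrative assumptions.

```typescript
// Sketch: publish/consume causal linkage via a header-borne trace id.
interface QueueMessage {
  headers: Record<string, string>;
  body: string;
}

function publish(body: string, traceId: string): QueueMessage {
  return { headers: { 'x-trace-id': traceId }, body };
}

function consume(msg: QueueMessage): { traceId: string; body: string } {
  // The consumer span attaches to the publisher's trace.
  return { traceId: msg.headers['x-trace-id'], body: msg.body };
}
```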
grpc: full causal linkage · all sdks
opentelemetry: zero-code ingest from collector
custom events: webhooks · jobs · custom ops
rest api: 10K req/min · cursor pagination

Plugs into the tools your team already uses.

  • slack

    notifications + slash commands

    Incident URL posted automatically. /incidentary slash command to open traces inline.

  • pagerduty

    incident url in timeline

    Webhook fires on alert. Causal trace URL injected into PagerDuty incident timeline.

  • opsgenie

    webhook triggers

    Webhook integration triggers artifact assembly. Link back into OpsGenie alert.

  • opentelemetry

    zero-code ingest from existing collector

    Send existing OTel spans to Incidentary via OTLP. No SDK install needed. Coexists with Incidentary SDKs in the same trace.

  • shared links

    no login · token-based · read-only

    Paste in Slack, email, or Jira. Anyone with the link sees the trace. No account needed.
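For the zero-code OpenTelemetry path above, sending existing spans to Incidentary would look like adding an OTLP exporter to the collector config. The endpoint and header names below are placeholders, not documented values.

```yaml
# Sketch: collector fragment (endpoint and header are assumptions)
exporters:
  otlphttp/incidentary:
    endpoint: https://ingest.incidentary.example   # placeholder URL
    headers:
      x-api-key: ${env:INCIDENTARY_API_KEY}

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlphttp/incidentary]
```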

trust posture
# incidentary trust posture

privacy:
  data_boundary:    metadata-only
  request_bodies:   never captured
  query_parameters: never captured
  headers:          never captured

completeness:
  labels:           full | partial | low
  topology_aware:   true

retention:
  windows:          14d | 30d | 90d
  deletion:         hard delete at expiry

pre_arm:
  signals:          5xx rate · slow success
                    in-flight pileup · retry onset
  thresholds:       configurable per service

One middleware call.

No distributed config files. No sampling tuning. No OpenTelemetry collector to maintain. The SDK is a single middleware — it handles context propagation, event capture, and span flushing.

Your services keep running. Incidentary keeps watching.

$ npm install @incidentary/sdk-node
full quickstart guide →
checkout-svc/index.ts
import { incidentary } from '@incidentary/sdk-node';
import express from 'express';

const app = express();

// Wrap once — all routes instrumented
app.use(incidentary.middleware({
  apiKey: process.env.INCIDENTARY_API_KEY,
  serviceName: 'checkout-svc',
}));

app.post('/checkout', async (req, res) => {
  // spans, errors, and slow queries captured automatically
  const order = await processOrder(req.body);
  res.json(order);
});

Start in minutes.
The SDKs are yours. The infrastructure is ours.

The capture SDKs are Apache 2.0 licensed. Read every line of source. Fork freely. No proprietary agent. No lock-in at the instrumentation layer.

Incidentary runs as a managed cloud service. No infrastructure to provision, no database cluster to operate, no retention policies to tune. Install the SDK, point it at your workspace, and the shared causal trace is there when the next alert fires.

Start free — 5 minutes to first trace
view sdk on github →

First 20 teams get a direct Slack channel with the founder for feature requests and priority support.