
alert-event-receiver

Overview

An Alertmanager webhook receiver that turns alert lifecycle transitions into immutable OpenTelemetry log events, with Redis/Valkey for deduplication and state tracking.

Why this receiver?

Many teams struggle with alert fatigue:

  • alerts are not always treated as urgent incidents
  • response times are slow or inconsistent
  • people compensate by increasing repeat_interval, silencing alerts, or ignoring them

Over time this can reduce trust in alerting and create noisy monitoring setups.

Alertmanager is excellent at reliable notification delivery and is designed around a simple assumption: if nobody acted, keep reminding. That behavior is exactly right for urgent incidents, but it can be noisy for signals that still matter without requiring immediate action.

This receiver keeps Alertmanager as the notification engine while turning alert lifecycle transitions into immutable OpenTelemetry log events. Those events can be stored in log backends such as VictoriaLogs, Parseable, or Loki and queried later with rich label-based filters.

That makes it easier to keep truly urgent alerts for human response while still retaining searchable history for lower-priority signals, review, reporting, and work planning.


How it works

Alertmanager (send_resolved: true)
      │
      │  POST /webhook
      ▼
alert-event-receiver
  ├─ reads alert.Fingerprint as the alert identity key
  ├─ checks Redis / Valkey for state
  ├─ emits OTLP log record (firing | resolved)
  └─ updates state in Redis / Valkey
      │
      │  OTLP/gRPC or OTLP/HTTP
      ▼
OTel Collector (optional)  →  VictoriaLogs / Loki / Grafana Cloud / …
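
On the Alertmanager side, a minimal receiver configuration pointing at this service could look like the following sketch (the receiver name and URL are placeholders; send_resolved: true is what delivers resolved transitions):

receivers:
  - name: event-webhook
    webhook_configs:
      - url: http://alert-event-receiver:9011/webhook
        send_resolved: true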

Configuration

All configuration is via environment variables.

HTTP

  • ADDRESS (default: :9011) - HTTP listen address (for example :9011 or 127.0.0.1:9011)

Redis / Valkey

  • REDIS_ADDR (default: localhost:6379) - Redis/Valkey address
  • REDIS_PASSWORD (default: empty) - Password
  • REDIS_DB (default: 0) - Database number
  • CLOSED_STATE_TTL (default: 24h) - TTL for closed-state tombstones (prevents duplicate resolved events)
  • IDEMPOTENCY_TTL (default: 168h, i.e. 7 days) - TTL for idempotency keys
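
If you run against a dedicated Valkey instance with authentication, the corresponding environment might look like this sketch (host, password, and DB number are placeholders):

export REDIS_ADDR=valkey.internal:6379
export REDIS_PASSWORD=change-me
export REDIS_DB=1
export CLOSED_STATE_TTL=24h
export IDEMPOTENCY_TTL=168h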

OpenTelemetry

The OTel SDK reads its standard environment variables automatically. You only need to set the relevant ones for your deployment:

  • OTEL_EXPORTER_OTLP_PROTOCOL (default: grpc) - grpc or http/protobuf
  • OTEL_EXPORTER_OTLP_ENDPOINT (default: http://localhost:4317) - Collector or backend endpoint
  • OTEL_EXPORTER_OTLP_HEADERS (default: empty) - Auth headers (e.g. Authorization=Bearer ...)
  • OTEL_SERVICE_NAME (default: alert-event-receiver) - Service name on emitted records
  • OTEL_RESOURCE_ATTRIBUTES (default: empty) - Extra resource attributes (e.g. cluster=prod-eu1,tenant=platform)

Redis / Valkey storage model

The receiver stores only working state in Redis/Valkey. It does not store the full alert history there. Full lifecycle history is emitted as OTLP log records.

There are two key families:

Alert state hash

One hash per alert fingerprint:

alertstate:{fingerprint}

Example:

alertstate:4a4f0d2b7c9e1a23

Fields currently stored:

  • status → firing or closed
  • first_firing_at → first known firing timestamp for the current lifecycle
  • last_seen_at → last time a firing notification was seen
  • starts_at → Alertmanager startsAt value
  • alertname → copied from alert labels
  • label.* → every alert label is stored with a label. prefix

Example hash content:

status=firing
first_firing_at=2026-04-15T09:22:58Z
last_seen_at=2026-04-15T09:23:14Z
starts_at=2026-04-15T09:22:58Z
alertname=HighErrorRate
label.severity=warning
label.service=checkout
label.instance=checkout-7d8f9

Behavior:

  • When an alert is firing, the hash is written/updated and made persistent.
  • When an alert is resolved, the same hash is marked status=closed and a TTL is applied using CLOSED_STATE_TTL.
  • That short-lived closed tombstone prevents duplicate late resolved deliveries from creating extra resolved events.
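
A minimal sketch of those two writes, assuming the go-redis v9 client (the repository's actual store lives in internal/state and may differ):

import (
    "context"
    "time"

    "github.com/redis/go-redis/v9"
)

// markFiring writes or updates the state hash and makes it persistent
// while the alert is open.
func markFiring(ctx context.Context, rdb *redis.Client, fingerprint string, fields map[string]interface{}) error {
    key := "alertstate:" + fingerprint
    if err := rdb.HSet(ctx, key, fields).Err(); err != nil {
        return err
    }
    // Open alerts must not expire: drop any TTL left over from a
    // previous closed tombstone.
    return rdb.Persist(ctx, key).Err()
}

// markClosed flips the hash to a tombstone that expires after CLOSED_STATE_TTL.
func markClosed(ctx context.Context, rdb *redis.Client, fingerprint string, closedTTL time.Duration) error {
    key := "alertstate:" + fingerprint
    if err := rdb.HSet(ctx, key, "status", "closed").Err(); err != nil {
        return err
    }
    return rdb.Expire(ctx, key, closedTTL).Err()
}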

Idempotency keys

One short-lived string key per transition delivery:

alertidemp:{fingerprint}:{transition}:{unix_timestamp}

Examples:

alertidemp:4a4f0d2b7c9e1a23:firing:1776244978
alertidemp:4a4f0d2b7c9e1a23:resolved:1776246061

Behavior:

  • The key is written with Redis SET NX.
  • If it already exists, the receiver treats the delivery as a duplicate and drops it.
  • The TTL is controlled by IDEMPOTENCY_TTL.
  • The stored value is currently just 1; the key name carries the useful information.
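
The SET NX check itself is small. A sketch with the same assumed go-redis client (imports as in the previous sketch, plus fmt):

// isDuplicate reports whether this transition delivery was already seen.
func isDuplicate(ctx context.Context, rdb *redis.Client,
    fingerprint, transition string, ts int64, ttl time.Duration) (bool, error) {

    key := fmt.Sprintf("alertidemp:%s:%s:%d", fingerprint, transition, ts)
    // SET NX succeeds only for the first writer of this key.
    created, err := rdb.SetNX(ctx, key, "1", ttl).Result()
    if err != nil {
        return false, err
    }
    return !created, nil // key already existed: duplicate delivery
}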

Deduplication and TTL behavior

Deduplication happens in two layers:

1. Identity: alert fingerprint

The receiver uses alert.Fingerprint from the Alertmanager webhook payload as the identity of one alert instance.

That means:

  • repeated notifications for the same alert instance reuse the same Redis state key
  • a new fingerprint is treated as a different alert instance
  • the receiver does not recompute identity from labels itself

2. Delivery deduplication: idempotency keys

For each transition, the receiver creates an idempotency key:

alertidemp:{fingerprint}:{transition}:{unix_timestamp}

Timestamp source:

  • firing → startsAt (or now if missing)
  • resolved → endsAt (or now if missing)

The key is written with SET NX:

  • if the key is new, the event is processed
  • if the key already exists, the event is dropped as a duplicate delivery

This protects against webhook retries and repeated deliveries for the same transition.

3. State-based suppression

The receiver also uses the alert state hash to suppress repeated lifecycle events:

  • if state is already firing and another firing arrives, no new event is emitted
  • if state is closed and a new firing arrives, that is treated as a new cycle and a new firing event is emitted
  • if a resolved arrives without matching open state, the receiver still emits a resolved orphan event and writes a short-lived closed tombstone
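
Taken together, these rules form a small decision table. The sketch below is plain Go with hypothetical names, not the repository's actual types:

// Decision is what the receiver does with an incoming notification, given
// the previous status stored in the alert state hash ("" if no state exists).
type Decision int

const (
    EmitFiring         Decision = iota // new lifecycle: emit a firing event
    EmitResolved                       // normal close: emit a resolved event
    EmitResolvedOrphan                 // resolved without open state
    Suppress                           // duplicate or repeat: emit nothing
)

func decide(prevStatus, incoming string) Decision {
    switch incoming {
    case "firing":
        if prevStatus == "firing" {
            return Suppress // state already firing: no new event
        }
        return EmitFiring // no state, or a closed tombstone: new cycle
    case "resolved":
        switch prevStatus {
        case "firing":
            return EmitResolved
        case "closed":
            return Suppress // late duplicate resolve hits the tombstone
        default:
            return EmitResolvedOrphan
        }
    }
    return Suppress
}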

TTLs used by the receiver

Open alert state: no TTL

When an alert is open (status=firing), the receiver removes any TTL from alertstate:{fingerprint}.

Reason:

  • open alerts are active working state and should not disappear while the alert is still firing

Implication:

  • if Redis is cleared or keys are evicted externally, the receiver may later treat a resolved event as an orphan

CLOSED_STATE_TTL

Applies to:

  • alertstate:{fingerprint} when the alert has been marked closed

Default:

  • 24h

Reason:

  • keep a short-lived tombstone so repeated late resolved deliveries do not create duplicate resolved events
  • keep enough recent closed state to distinguish a genuine reopen from a duplicate late resolve

If you increase it:

  • better protection against very late duplicate resolved deliveries
  • more Redis memory used by recently closed alerts
  • closed state entries stay around longer in operational queries and metrics

If you decrease it:

  • less Redis memory used by closed alerts
  • greater chance that a late duplicate resolved arrives after the tombstone expired and is emitted again as a resolved orphan

IDEMPOTENCY_TTL

Applies to:

  • alertidemp:{fingerprint}:{transition}:{unix_timestamp}

Default:

  • 7d

Reason:

  • suppress duplicate deliveries and retries over a longer time window than the closed-state tombstone alone

If you increase it:

  • better protection against delayed retries or replayed webhook deliveries
  • more Redis memory used by idempotency keys

If you decrease it:

  • less Redis memory used by idempotency keys
  • greater chance that the same transition is accepted again after the idempotency key expires

TTL tuning guidance

As a rule of thumb:

  • increase CLOSED_STATE_TTL if late resolved deliveries are common
  • increase IDEMPOTENCY_TTL if webhook retries/replays can happen over long periods
  • decrease them only if Redis memory pressure matters more than long-window duplicate suppression

If you tune these values, the trade-off is simple:

  • longer TTLs = more duplicate protection, more Redis retention
  • shorter TTLs = less Redis retention, more risk of duplicate lifecycle events

Inspecting data with redis-cli

Connect to Redis/Valkey:

redis-cli -h localhost -p 6379

If you use a password:

redis-cli -h localhost -p 6379 -a "$REDIS_PASSWORD"

If you use a non-default DB:

redis-cli -h localhost -p 6379 -n 2

List alert state keys

redis-cli --scan --pattern 'alertstate:*'

Read one alert state hash

redis-cli HGETALL 'alertstate:4a4f0d2b7c9e1a23'

Read specific fields only

redis-cli HMGET 'alertstate:4a4f0d2b7c9e1a23' status first_firing_at last_seen_at starts_at alertname

Check the remaining TTL on a closed alert tombstone

redis-cli TTL 'alertstate:4a4f0d2b7c9e1a23'

Interpretation:

  • -1 → key exists and has no TTL (typically an open/firing alert)
  • -2 → key does not exist
  • positive integer → seconds until the closed tombstone expires

List idempotency keys

redis-cli --scan --pattern 'alertidemp:*'

Inspect one idempotency key

redis-cli GET 'alertidemp:4a4f0d2b7c9e1a23:resolved:1776246061'
redis-cli TTL 'alertidemp:4a4f0d2b7c9e1a23:resolved:1776246061'

Find all keys for one fingerprint

redis-cli --scan --pattern 'alertstate:4a4f0d2b7c9e1a23'
redis-cli --scan --pattern 'alertidemp:4a4f0d2b7c9e1a23:*'

Show only stored labels for one alert

redis-cli HGETALL 'alertstate:4a4f0d2b7c9e1a23' | grep '^label\.'

End-to-end example (webhook -> Redis keys)

The example below shows a minimal lifecycle for one alert fingerprint.

Step 1: Firing webhook alert

{
  "groupKey": "{}:{alertname=\"HighErrorRate\"}",
  "receiver": "event-webhook",
  "externalURL": "https://alertmanager.example",
  "alerts": [
    {
      "status": "firing",
      "fingerprint": "4a4f0d2b7c9e1a23",
      "startsAt": "2026-04-15T09:22:58Z",
      "labels": {
        "alertname": "HighErrorRate",
        "severity": "warning",
        "service": "checkout",
        "instance": "checkout-7d8f9"
      },
      "annotations": {
        "summary": "Checkout error rate is high"
      }
    }
  ]
}
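
To exercise the receiver locally, save the payload above as firing.json and replay it with curl (assuming the default listen address):

curl -s -X POST http://localhost:9011/webhook \
  -H 'Content-Type: application/json' \
  -d @firing.json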

Expected Redis writes:

  • alertstate:4a4f0d2b7c9e1a23 hash with status=firing and label.* fields
  • alertidemp:4a4f0d2b7c9e1a23:firing:1776244978 with value 1 and IDEMPOTENCY_TTL

Inspect with redis-cli:

redis-cli HGETALL 'alertstate:4a4f0d2b7c9e1a23'
redis-cli TTL 'alertstate:4a4f0d2b7c9e1a23'
redis-cli GET 'alertidemp:4a4f0d2b7c9e1a23:firing:1776244978'
redis-cli TTL 'alertidemp:4a4f0d2b7c9e1a23:firing:1776244978'

At this point, state key TTL should usually be -1 (open alert, persisted).

Step 2: Resolved webhook alert

{
  "groupKey": "{}:{alertname=\"HighErrorRate\"}",
  "receiver": "event-webhook",
  "externalURL": "https://alertmanager.example",
  "alerts": [
    {
      "status": "resolved",
      "fingerprint": "4a4f0d2b7c9e1a23",
      "startsAt": "2026-04-15T09:22:58Z",
      "endsAt": "2026-04-15T09:41:01Z",
      "labels": {
        "alertname": "HighErrorRate",
        "severity": "warning",
        "service": "checkout",
        "instance": "checkout-7d8f9"
      }
    }
  ]
}

Expected Redis writes/updates:

  • alertstate:4a4f0d2b7c9e1a23 updated to status=closed
  • the same alertstate:4a4f0d2b7c9e1a23 key gets a TTL from CLOSED_STATE_TTL
  • alertidemp:4a4f0d2b7c9e1a23:resolved:1776246061 with value 1 and IDEMPOTENCY_TTL

Inspect with redis-cli:

redis-cli HMGET 'alertstate:4a4f0d2b7c9e1a23' status first_firing_at last_seen_at starts_at alertname
redis-cli TTL 'alertstate:4a4f0d2b7c9e1a23'
redis-cli GET 'alertidemp:4a4f0d2b7c9e1a23:resolved:1776246061'
redis-cli TTL 'alertidemp:4a4f0d2b7c9e1a23:resolved:1776246061'

If you send the exact same transition again, the existing alertidemp:* key causes it to be dropped as a duplicate.

Troubleshooting notes

  • If an alert is currently open, expect status=firing and TTL = -1.
  • If an alert was recently resolved, expect status=closed and a positive TTL.
  • If a duplicate delivery was suppressed, look for a matching alertidemp:* key.
  • If no alertstate:* key exists for a resolved alert, that can still be valid: the receiver emits a resolved orphan event and writes a short-lived closed tombstone.

Running locally

Prerequisites

  • Go 1.23+
  • Redis or Valkey
  • An OTLP-compatible log backend (or OTel Collector)

Build the binary:

go build -o alert_event_receiver ./cmd/server

Quick start

# Start Redis
docker run -d -p 6379:6379 redis:7

# Start an OTel Collector or any OTLP-compatible backend, then:
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_RESOURCE_ATTRIBUTES="cluster=local,tenant=dev"

go run ./cmd/server

Endpoints

  • POST /webhook - Alertmanager webhook receiver
  • GET /metrics - Prometheus metrics (RED pattern)
  • GET /healthz - Health check
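
A quick liveness check against a locally running instance (default listen address assumed):

curl -s http://localhost:9011/healthz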

Observability

Prometheus metrics

  • alertreceiver_webhook_requests_total (labels: status = 2xx/4xx/5xx) - Webhook request rate
  • alertreceiver_webhook_duration_seconds (labels: status) - Webhook latency histogram
  • alertreceiver_events_emitted_total (labels: transition = firing/resolved) - Events emitted via OTLP
  • alertreceiver_emit_errors_total (labels: transition) - OTLP emit failures
  • alertreceiver_redis_errors_total (labels: operation) - Redis operation errors
  • alertreceiver_redis_ops_total (labels: operation, result) - Redis operation outcomes
  • alertreceiver_redis_op_duration_seconds (labels: operation) - Redis operation latency histogram
  • alertreceiver_redis_idemp_setnx_total (labels: result = set/exists/error) - Idempotency SET NX outcomes
  • alertreceiver_duplicates_dropped_total - Events dropped by the idempotency check
  • alertreceiver_state_writes_total (labels: result) - Alert transition outcomes such as opened, reopened, closed, resolved_without_open_state, already_firing
  • alertreceiver_state_entries (labels: status = firing/closed) - Current Redis-backed alert state entries, maintained from write-path updates
  • alertreceiver_idempotency_keys_created_total (labels: transition) - Idempotency keys created successfully
  • alertreceiver_resolved_orphans_total - Resolved alerts seen without matching open state
  • alertreceiver_closed_ttl_seconds - TTL applied to closed alert tombstones
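
For dashboards or meta-alerts on the receiver itself, example PromQL over these metrics could look like the following sketch (ranges and matchers are illustrative):

# Non-2xx webhook responses per second
sum(rate(alertreceiver_webhook_requests_total{status!~"2.."}[5m]))

# Rate at which duplicate deliveries are being suppressed
rate(alertreceiver_duplicates_dropped_total[5m])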

Internal logs

Structured JSON logs are written to stdout. All Redis and OTLP backend errors are logged with context (fingerprint, alertname, operation, error).


Running tests

go test ./...

Project structure

cmd/server/              main — wires dependencies and starts HTTP server
internal/
  config/                env var config loading
  models/                Alertmanager webhook payload + LifecycleEvent types
  processor/             alert state transition logic
  state/                 Redis/Valkey Store interface + implementation
  telemetry/
    logger.go            JSON slog logger (stdout)
    metrics.go           Prometheus RED metrics
    otel.go              OTel SDK init (log provider)
    emitter.go           OTelEmitter — maps LifecycleEvent → OTel LogRecord
  webhook/               HTTP handler for Alertmanager webhook
docs/
  architecture.md        Full design document

Log storage

VictoriaLogs

VictoriaLogs supports direct OTLP ingestion over HTTP. Configuration example:

export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:9428/insert/opentelemetry
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
# Define the fields that should be used as stream labels - by default they are determined from the log resource attributes
export OTEL_EXPORTER_OTLP_HEADERS="VL-Stream-Fields=service.name,am.external_url,alert.alertname,alert.fingerprint,alert.status,alert.transition"
# Alternatively use the logs-specific var (takes precedence over OTEL_EXPORTER_OTLP_HEADERS):
# export OTEL_EXPORTER_OTLP_LOGS_HEADERS="VL-Stream-Fields=alert.alertname,alert.label.severity,alert.state_write_result,alert.status,alert.transition"

Query examples in VictoriaLogs

VictoriaLogs queries use LogsQL.

All firing events for one service in the last hour

_time:1h alert.label.service:checkout alert.transition:firing

All resolved events longer than 15 minutes

alert.transition:resolved alert.duration_seconds:>900

Top flapping alerts

Assuming one event per transition:

_time:24h event.kind:alert_transition
| stats by (alert.alertname, alert.label.service) count() as transitions
| sort by (transitions desc)
| limit 20

Average duration by alertname

_time:7d alert.transition:resolved
| stats by (alert.alertname) avg(alert.duration_seconds) as avg_duration, count() as total
| sort by (avg_duration desc)

Alerts resolved without ticket

If there is a ticket_id label:

alert.transition:resolved alert.label.ticket_id:""
