CNV-80608: PR12 add alerts_effective_active_at_timestamp_seconds metric#16
CNV-80608: PR12 add alerts_effective_active_at_timestamp_seconds metric#16sradco wants to merge 1 commit into
Conversation
760bab8 to
f62057b
Compare
5d89003 to
efd5a1a
Compare
f62057b to
bad1af3
Compare
e8fff27 to
56481c7
Compare
|
Please note, I found a bug with the metric value. It was based on the StartAt label of Alertmanager, which I thought holds the alert start time, but it actually holds the time the alert was collected to Alertmanager, which is nt what we need. |
bad1af3 to
6ff63c9
Compare
56481c7 to
ba86543
Compare
6ff63c9 to
6b2d009
Compare
ba86543 to
6cfc7e4
Compare
6b2d009 to
1c95614
Compare
6cfc7e4 to
5c81f95
Compare
1c95614 to
6ace03c
Compare
6ec1d22 to
8ad2c3a
Compare
| enrichActiveAt(alerts, promAlerts) | ||
| return append(alerts, filterAlertsByState(promAlerts, "pending")...), nil |
There was a problem hiding this comment.
why not filter first then enrich?
| if promErr != nil { | ||
| return nil, promErr | ||
| } |
There was a problem hiding this comment.
this should be done immediately after the failed call
| promAlerts, promErr := pa.getAlertsViaProxy(ctx, namespace, promRouteName, source) | ||
|
|
||
| if amErr == nil { | ||
| if promErr == nil { |
There was a problem hiding this comment.
if we handled the error, this check wouldnt be needed
| if promErr == nil { | ||
| enrichActiveAt(amAlerts, promAlerts) | ||
| } | ||
| pending := filterAlertsByState(promAlerts, "pending") |
There was a problem hiding this comment.
we are only filtering if no amErr
why would we return all promAlerts if no amErr
but return only the pending if amErr
|
|
||
| // alertFingerprint builds a stable identity key from an alert's labels, | ||
| // excluding metadata labels injected by this plugin (source, backend). | ||
| func alertFingerprint(labels map[string]string) string { |
There was a problem hiding this comment.
whats the difference between using this implementation and the one for the id label?
| managementRouter := managementrouter.New(managementClient) | ||
| router.PathPrefix("/api/v1/alerting").Handler(managementRouter) | ||
|
|
||
| metricsHandler, err := managementClient.MetricsHandler(ctx, k8sconfig) |
There was a problem hiding this comment.
maybe pkg/monitoring/management/metrics/alerts_collector.go
or even
pkg/management/metrics/alerts_collector.go
There was a problem hiding this comment.
in pkg/management/management.go
"github.com/openshift/monitoring-plugin/pkg/management/metrics" -> metrics.NewHandler(ctx, c, kubeConfig)
instead of
"github.com/openshift/monitoring-plugin/pkg/metrics" -> metrics.NewHandler(ctx, c, kubeConfig)
| // backend, component, layer) plus "alertstate". Thanos-sourced alerts are | ||
| // filtered out to avoid duplicates. Annotations are excluded because they | ||
| // are available from the alert rule definition. | ||
| type AlertsCollector struct { |
There was a problem hiding this comment.
should only be instantiated if isLeader
and not receive a function to check
There was a problem hiding this comment.
should not belong in metrics package
metrics, specifically the new collector, depends on being the leader
not the other way around
leader election is a server responsability
server should manage the leader election
and then if takes the lead, instantiate the collector
6ace03c to
53c6cf9
Compare
ce1decd to
dc78a65
Compare
f99316a to
2aeb592
Compare
dc78a65 to
0147856
Compare
0e3ceaa to
5f87dde
Compare
497e2db to
0398569
Compare
5f87dde to
f5bda9f
Compare
0398569 to
cb7aa1d
Compare
f5bda9f to
7276833
Compare
cb7aa1d to
cfc4f9f
Compare
7276833 to
528338b
Compare
cfc4f9f to
3edd517
Compare
b6094a8 to
a262f88
Compare
db8d155 to
5c7c6d4
Compare
a262f88 to
f2f4bbe
Compare
5c7c6d4 to
d24450d
Compare
f2f4bbe to
6014450
Compare
d24450d to
8cbd67c
Compare
6014450 to
598abd6
Compare
8cbd67c to
a217801
Compare
598abd6 to
5b090e4
Compare
a217801 to
59c226e
Compare
672bc3b to
a24fa21
Compare
59c226e to
31921ea
Compare
a24fa21 to
ed02e88
Compare
31921ea to
e2a91c0
Compare
ed02e88 to
6a001cd
Compare
e2a91c0 to
66459e8
Compare
Expose a Prometheus gauge metric whose value is the activeAt Unix timestamp for every effective alert (firing, pending, silenced). Labels include all alerts labels after relabeling plus enrichment labels and alertstate. Annotations are excluded since they are available from the alert rule definition. Signed-off-by: Shirly Radco <sradco@redhat.com> Co-authored-by: AI Assistant <noreply@cursor.com>
Expose a Prometheus gauge metric whose value is the activeAt Unix
timestamp for every effective alert (firing, pending, silenced).
Labels include all alerts labels after relabeling plus enrichment labels
and alertstate.
Annotations are excluded since they are available from the alert rule definition.
Signed-off-by: Shirly Radco sradco@redhat.com
Co-authored-by: AI Assistant noreply@cursor.com