feat(metrics): user-defined metrics catalog + monitoring (tripl-dxhp) #27
Merged
First foundational piece of the metrics catalog + monitoring epic (tripl-dxhp.1).
MetricDefinition is a global, project-scoped catalog entity (no branch_id), mirroring EventType/Event, with a simple draft/active/archived lifecycle and a hybrid collection binding: sql/fact_aggregation carry their own data_source_id + interval; event_composition references canonical events via numerator/denominator FKs.
Adds MetricKind/MetricStatus/MetricAggregation/MetricComposition enums and migration d6e7f8a9b0c1 (4 native PG enum types + metric_definitions table).
5 model tests; mypy + ruff clean; single alembic head; revision-graph test passes.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Metrics epic (tripl-dxhp) backend slices .2/.3/.4:
- .2 value storage: MetricValue + MetricValueBreakdown models (metric-scoped,
Float value, optional scan_config_id grid alignment), dialect-aware UPSERT and
window-delete helpers in metric_rows.py, migration e9f0a1b2c3d4.
- .3 adapters: get_time_bucketed_aggregate (+breakdown) across ClickHouse,
Postgres and BigQuery (count/sum/avg/min/max/count_distinct); new
measure_validator (identifier allowlist, SQL-fragment/SELECT safety, UNION ban,
paren-depth FROM detection) preserving the no-bound-params escaping model;
ClickHouse interval validation on new and pre-existing count methods. Existing
count path SQL unchanged.
- .4 catalog API: kind-discriminated Pydantic schemas with every identifier/SQL
field gated through measure_validator at the boundary; service (CRUD, list,
bulk, reorder, move) and thin /projects/{slug}/metrics router (EditorUserDep +
audit).
openapi.json regenerated (additive: 5 routes, 16 schemas). 255 tests pass; mypy
and ruff clean. Security re-verify: user-controlled SQL injection surface closed.
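To make the validator bullet in .3 concrete, here is a minimal sketch of three of the checks it names: an identifier allowlist, a UNION ban, and paren-depth FROM detection. Only `validate_identifier` is a name that appears in this PR; `reject_union`, `has_top_level_from`, the regexes and the exact rules are assumptions for illustration, and the sketch deliberately ignores string literals and SQL comments, which a real validator would have to handle.

```python
import re

# Assumed allowlist: plain or dotted identifiers made of letters, digits, underscores.
_IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$")
_UNION_RE = re.compile(r"\bunion\b", re.IGNORECASE)
# Tokens of interest: parentheses and whole words (so "from_date" is one token).
_TOKEN_RE = re.compile(r"[()]|\w+")


def validate_identifier(name: str) -> str:
    """Return the identifier unchanged if it matches the allowlist, else raise."""
    if not _IDENTIFIER_RE.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def reject_union(sql: str) -> str:
    """Raise if the UNION keyword appears anywhere in the SQL text."""
    if _UNION_RE.search(sql):
        raise ValueError("UNION is not allowed")
    return sql


def has_top_level_from(sql: str) -> bool:
    """Return True if a FROM keyword appears at parenthesis depth 0."""
    depth = 0
    for match in _TOKEN_RE.finditer(sql):
        token = match.group(0)
        if token == "(":
            depth += 1
        elif token == ")":
            depth -= 1
        elif depth == 0 and token.lower() == "from":
            return True
    return False
```

Because the adapters keep the no-bound-params escaping model, rejecting anything outside a narrow allowlist at the API boundary is what keeps user-supplied identifiers from reaching SQL unescaped.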
Metrics epic (tripl-dxhp) wave 1 — slices .5 and .7:
- .5 collection: collect_metric_definitions worker task. fact/sql kinds run via
the M3 adapter aggregations over the metric data source on interval-aligned
chunks (window-delete-then-upsert into metric_values/breakdowns); event_composition
uses a pure evaluator (worker/analyzers/metric_composition) that densifies
numerator/denominator onto the shared scan grid, divide-by-zero -> None,
per_distinct_user denominator via warehouse count_distinct. Beat dispatch
check_metric_definitions_due with a dedicated advisory lock + stale-running reap;
inline last_collected_at/status/error.
- .7 series read: metric_series_service (densify + anomaly join + forecast, reusing
metrics_service helpers read-only) exposing GET series/breakdowns/versions under
/projects/{slug}/metrics; catalog list enriched with latest value + signal + spark
via batched window-function queries. Float-shaped response models preserve <1 ratios.
openapi.json regenerated (additive: 3 routes, 6 schemas, enriched list item).
278 tests pass; mypy + ruff clean.
Carried to .6 (value-kind work): metric_values.value is NOT NULL so divide-by-zero
buckets are skipped; .6 should add a value-kind flag (nullable value + null-aware
zero-fill) and the metric anomaly scope (scope_ref=metric_definition_id, plus a
(scope_ref,bucket) index).
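The composition evaluator in .5 and the carried-over NOT NULL limitation can be illustrated with a small sketch. It assumes that densifying means treating a bucket missing from either operand as zero, and that the storage step simply drops `None` results; `build_grid`, `evaluate_ratio` and `storable_rows` are hypothetical names, not the functions in worker/analyzers/metric_composition.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple


def build_grid(start: datetime, end: datetime, step: timedelta) -> List[datetime]:
    """Build the half-open [start, end) list of bucket start times."""
    grid = []
    current = start
    while current < end:
        grid.append(current)
        current += step
    return grid


def evaluate_ratio(
    grid: List[datetime],
    numerator: Dict[datetime, float],
    denominator: Dict[datetime, float],
) -> Dict[datetime, Optional[float]]:
    """Densify both operands onto the grid and divide; divide-by-zero yields None."""
    result: Dict[datetime, Optional[float]] = {}
    for bucket in grid:
        num = numerator.get(bucket, 0.0)  # assumption: missing bucket counts as 0
        den = denominator.get(bucket, 0.0)
        result[bucket] = None if den == 0 else num / den
    return result


def storable_rows(
    values: Dict[datetime, Optional[float]],
) -> List[Tuple[datetime, float]]:
    """Drop None buckets, mirroring the NOT NULL constraint on metric_values.value."""
    return [(bucket, value) for bucket, value in values.items() if value is not None]
```

With a three-hour grid where the denominator only has data for the first hour, only that hour produces a stored row; the other two buckets stay as gaps.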
…/detail (.8)
Metrics epic (tripl-dxhp) wave 2, slices .6 and .8:
- .6 anomaly scope + alerting: MetricScopeType += metric (migration a1b2c3d4e5f6: ALTER TYPE in an autocommit block, a partial unique index on (scope_type,scope_ref,bucket) WHERE scan_config_id IS NULL, nullable MetricAnomaly.scan_config_id + (scope_ref,bucket) index, project_anomaly_settings.detect_metrics, alert_rules.include_metrics). detect.py recomputes metric-scope anomalies (project-global, NULL scan_config_id, idempotent via the partial index); a value-kind helper (is_count_shaped) gates zero-fill / min_expected_count so ratio/avg metrics don't false-fire, and the series densify renders fractional gaps as null. Scope threaded through signals/dispatch/alert_payload/alerting_matching/get_active_signals; include_metrics gates delivery (default OFF); per-project canonical AlertRuleState avoids multi-scan-config duplicate deliveries.
- .8 frontend: metricsCatalogApi + types, MetricsPage (MiniStat + table, latest value/signal + sparkline, filters), MetricForm with kind-specific config sections and per-kind validation mirroring the backend discriminated union (kind/config immutable on edit), metric drilldown via a new 'metric' MonitoringScope reusing MonitoringDetailPage, nav item + lazy routes. Anomaly row visuals gated on signal state.
Integration: regenerated openapi.json (additive: include_metrics/detect_metrics, metric enum value, nullable MetricSignalResponse.scan_config_id) and frontend api.gen.ts; tightened the catalog list-enrichment anomaly query to filter scope_type='metric'.
Full backend suite 801 passed; frontend tsc + eslint clean, 53 vitest pass.
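The value-kind gate described in .6 can be sketched as follows. The classification (count and count_distinct are count-shaped; sum/avg/min/max and ratios are fractional) is the one stated later in this PR for fact metrics; the function signatures and the `densify` helper are assumptions, not the real `is_count_shaped` API.

```python
from typing import Dict, List, Optional

# Aggregations whose missing buckets can safely be read as zero.
COUNT_SHAPED_AGGREGATIONS = frozenset({"count", "count_distinct"})


def is_count_shaped(aggregation: Optional[str], is_ratio: bool) -> bool:
    """Return True only for non-ratio metrics with a count-style aggregation."""
    return not is_ratio and aggregation in COUNT_SHAPED_AGGREGATIONS


def densify(
    grid: List[int],
    values: Dict[int, float],
    count_shaped: bool,
) -> List[Optional[float]]:
    """Fill missing buckets with 0.0 for count-shaped metrics, None otherwise."""
    fill = 0.0 if count_shaped else None
    return [values.get(bucket, fill) for bucket in grid]
```

Zero-filling an average or a ratio would inject artificial drops to zero that the detector could flag, which is why fractional metrics render gaps as null instead.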
Metrics epic (tripl-dxhp) wave 3, slice .9:
- backend/src/tripl/tests/test_metrics_pipeline_e2e.py: 6 end-to-end tests driving definition -> collection -> anomaly recompute -> alert dispatch for every kind (fact_aggregation, sql, event_composition single/ratio/per_distinct_user), including the ratio divide-by-zero gap and the include_metrics opt-in gate (delivers when True, none when False). Green.
- docs: documented Metrics as a first-class concept alongside Events across concepts.md, feature-reference.md, architecture.md, anomaly-detection.md and alerting.md (three kinds, global/project-scoped lifecycle, collection + scheduling, count-shaped vs fractional value-kind, detect_metrics / include_metrics).
Browser E2E (Playwright) deferred: the repo has no Playwright harness and there is no live stack in this environment; the catalog list/form/drilldown flow is covered by the wave-2 vitest suite + MonitoringDetailPage tests.
No API drift (openapi regenerated, unchanged).
Fact-tables epic (tripl-ysji) wave 1, slices .1 and .6:
- .1 FactTable foundation: new FactTable model (project-scoped: data_source_id, sql SELECT, timestamp_column, columns/identifier_columns/row_filters JSON), 'fact' added to the MetricKind enum, metric_definitions.fact_table_id FK, migration a2b3c4d5e6f7 (ALTER TYPE ADD VALUE in an autocommit block + fact_tables table + FK; single head). The collect task dispatch now fails with a clear NotImplementedError for kinds without a collector instead of a cryptic KeyError. (Fact tables are defined separately; metrics will be built on top of them in later slices. The legacy inline fact_aggregation kind is being removed in .4.)
- .6 metrics UI polish: metric drilldown shows "Back to metrics" + a Metrics breadcrumb (was Monitors/Back to events); MetricForm clears stale validation on kind change; the misleading fractional-metric forecast tail is dropped for catalog-metric series; lazy metric routes get a Suspense fallback so they no longer flash the previous page.
openapi.json + frontend api.gen.ts regenerated (additive: MetricKind gained 'fact').
Backend ruff/mypy + 26 tests + contract test green; frontend tsc/eslint + 43 vitest green.
…(ysji.2/.3)
Wave 2 of the fact-tables epic:
- Introspection service: runs a fact-table SELECT via the warehouse adapter,
buckets column types (number/string/bool/timestamp), derives identifier
candidates, returns JSON-safe sample rows. Scope-checks the data source to
the project (via ScanConfig), re-validates SELECT safety as defense in depth,
buckets Postgres interval as string, coerces non-finite floats to null, and
redacts driver exceptions from the WARNING log tier.
- FactTable CRUD: kind-mirrored schemas (SQL/fragment/identifier validated via
measure_validator), service (409 on duplicate name, project-scoped data
source binding, append-at-end ordering), router /projects/{slug}/fact-tables
with create/list/get/patch/delete + POST /preview.
- openapi.json regenerated (additive: 3 paths, FactTable* schemas).
Gates: ruff + mypy clean (248 files), 71 fact-table tests + openapi contract
pass. Review: 2 HIGH (order-null 500, cross-project source binding) + 4 MEDIUM
fixed; 1 MEDIUM deferred to ysji.4 (column-name validation belongs at the
metric measure/distinct column gate where it reaches SQL).
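A rough sketch of the introspection service's type bucketing and JSON-safe sampling follows. It classifies from Python sample values rather than from driver type metadata, which is an assumption, as are the names `bucket_type` and `json_safe`; the real service may inspect column types reported by the warehouse adapter instead.

```python
import math
from datetime import date, datetime, timedelta
from typing import Any


def bucket_type(value: Any) -> str:
    """Map a sample value to one of: number, string, bool, timestamp."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, (datetime, date)):
        return "timestamp"
    # Everything else, including intervals (timedelta), is bucketed as string.
    return "string"


def json_safe(value: Any) -> Any:
    """Convert a sample value into something json.dumps can serialize."""
    if isinstance(value, float) and not math.isfinite(value):
        return None  # NaN / +-Infinity are not valid JSON numbers
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if isinstance(value, timedelta):
        return str(value)
    return value
```

Coercing non-finite floats to null matters because NaN and Infinity are not valid JSON, so a single such cell could otherwise break the whole preview response for strict clients.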
…ggregation (ysji.4)
Wave 3 of the fact-tables epic.
ADD 'fact' kind: a metric built on a FactTable.
- Schema FactMetricCreate (single: fact_table_id + aggregation + measure/distinct/row_filter; ratio: numerator/denominator FactOperands, denominator may use a different fact table).
- Service validation: fact-table existence + project ownership; measure_column/distinct_column must pass validate_identifier AND be a column of the referenced fact table; row_filter must name a stored filter; per-aggregation required-field rules; data source comes from the FactTable.
- Collection _collect_fact (single/ratio) reuses the adapter get_time_bucketed_aggregate(+breakdown); ratio divides via evaluate_composition (divide-by-zero -> gap). Value-kind: count/count_distinct count-shaped; sum/avg/min/max and ratio fractional.
REMOVE fact_aggregation entirely (owner request, no back-compat):
- Drop MetricKind.fact_aggregation via PG enum type-recreation migration b3c4d5e6f7a8 (single head after a2b3c4d5e6f7; pre-flight DELETE makes the USING cast safe; restartable via DROP TYPE IF EXISTS; PG-guarded).
- Delete FactAggregationConfig/MetricCreate, the collector, all docstrings, and the frontend surface; regenerate openapi.json + api.gen.ts.
Hardening (review findings): row-filter fragments now reject SELECT/WITH subqueries; runtime fact-table project-scope check in collection; empty-columns guard; asserts -> explicit raises; idempotent enum migration; loud FE guard for the deferred fact form.
Gates: ruff + mypy clean (248 files), 221 backend tests + openapi contract + single alembic head, frontend tsc/eslint/vitest green. fact_aggregation: 0 refs.
Wave 4 (final epic slice), frontend.
- factTablesApi + src/types/factTables.ts (aliasing the generated FactTable* schemas) mirroring the metrics-catalog client/types.
- FactTables page: list + create/edit form with a SQL editor, a 'Preview columns' action (POST /fact-tables/preview) that renders sampled columns/types and persists them into the create payload, and a named row-filters editor.
- Rich 'fact' metric form: fact-table picker -> column/aggregation/filter dropdowns; composition toggle single vs ratio (numerator/denominator operands, each may reference a different fact table); client validation mirrors the backend required-field rules. Removes the wave-3 placeholder.
- Routes (/p/:slug/fact-tables[/new|/:id/edit]) + 'Fact tables' nav item + breadcrumb.
Quality: a11y (accessible row-filter labels, aria-required, no double tab stop), stable list keys, exhaustive kind dispatch, type-guarded aggregation parsing.
Gates: tsc + eslint clean, 22 vitest tests. Epic tripl-ysji complete (6/6).
Extend create_demo_project to showcase the metrics catalog out of the box:
- an 'orders' FactTable (synthetic read-only SELECT, 6 introspected columns, a named 'completed' row filter);
- four MetricDefinitions, one per kind: a sql metric (Active Sessions), an event_composition ratio (Purchase conversion), a fact single (Revenue completed = sum(amount) filtered), and a fact ratio (Average order value = sum / count);
- fabricated MetricValue series (7d hourly for the ratio on the scan grid; 30d daily for the sql/fact metrics) so the catalog and metric drilldowns render with data without the worker ever running.
Definitions are built from the Pydantic create schemas (to_create_values) so all validators run and the config JSON is exact, then persisted as ORM rows to preserve the seeder's single end-of-function commit.
Tests: 2 new demo assertions (catalog returns >=4 defs covering all kinds with a spark; fact table exposes its named filter). ruff + mypy clean, 8 demo tests pass.
The event_composition ratio was seeded as purchases/screen-views where both use the same sinusoid base, so the ratio cancelled to a near-flat constant and the catalog sparkline read as a dead line. Seed a gentle upward trend + daily ripple instead (~0.04-0.11) so the demo ratio renders as a live series.
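One way to produce such a series is a linear trend plus a small 24-hour sinusoid. The sketch below is illustrative only: `seed_ratio_series` and its constants are assumptions chosen so the values stay inside the ~0.04-0.11 band, not the seeder's actual code.

```python
import math
from typing import List


def seed_ratio_series(
    hours: int = 168,      # 7 days of hourly buckets
    low: float = 0.05,     # assumed trend start
    high: float = 0.10,    # assumed trend end
    ripple: float = 0.01,  # assumed amplitude of the daily ripple
) -> List[float]:
    """Return an hourly ratio series: gentle upward trend plus a daily ripple."""
    values = []
    for hour in range(hours):
        trend = low + (high - low) * hour / (hours - 1)
        daily = ripple * math.sin(2 * math.pi * (hour % 24) / 24)
        values.append(trend + daily)
    return values
```

Because the ripple is added to the ratio itself rather than to both operands, it cannot cancel out the way the shared sinusoid base did.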
… fact table)
Collect all fact metrics of a fact table in a single multi-aggregate query instead of one query per metric.
Adapter contract (base.py): AggregateSpec + get_time_bucketed_multi_aggregate and _breakdown, implemented across ClickHouse / Postgres / BigQuery. A per-metric row filter becomes a per-aggregate CONDITIONAL aggregate (sumIf/countIf/-If; FILTER (WHERE) / CASE) so differing filters share one scan. Empty conditional groups are guarded to NULL (count sentinel in ClickHouse, NULLIF for count-style in PG/BQ) so per-bucket values stay byte-identical to the per-metric path.
Collection: collect_fact_metrics_batch builds dedup'd specs per fact table, chunks the covering window by the smallest replay_chunk_interval (disjoint-bucket merge), runs one multi-aggregate query per fact table + one per breakdown dimension, and assembles each metric: single, same-table ratio, and cross-table ratio (operands gathered across fact tables in the same interval group, divided via evaluate_composition, divide-by-zero -> gap). Per-metric upsert + isolated error capture.
The scheduler groups due fact metrics by interval into one batch dispatch; sql/event_composition stay per-metric; per-metric _collect_fact is kept as the manual-recollect fallback.
Gates: ruff + mypy clean (248 files), full backend suite 915 passed; 180 impacted tests reverified. 1 CRITICAL + 3 HIGH review findings fixed (value-identity).
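As an illustration of the conditional-aggregate idea on the Postgres side, the sketch below renders one select-list expression per spec, using FILTER (WHERE ...) and wrapping count-style aggregates in NULLIF so an empty conditional group yields NULL rather than 0. `AggregateSpec` is named in this PR, but the fields shown here, the `render_postgres_aggregate` helper and the exact SQL shape are assumptions; identifiers and filter fragments are assumed to be validated before they reach this point.

```python
from dataclasses import dataclass
from typing import Optional

_ALLOWED = ("count", "count_distinct", "sum", "avg", "min", "max")
_COUNT_STYLE = ("count", "count_distinct")


@dataclass(frozen=True)
class AggregateSpec:
    """Hypothetical field layout; the real AggregateSpec may differ."""
    alias: str
    aggregation: str
    column: Optional[str] = None      # pre-validated identifier
    row_filter: Optional[str] = None  # pre-validated SQL fragment


def render_postgres_aggregate(spec: AggregateSpec) -> str:
    """Render one aggregate expression for a shared multi-aggregate SELECT."""
    if spec.aggregation not in _ALLOWED:
        raise ValueError(f"unsupported aggregation: {spec.aggregation!r}")
    if spec.aggregation == "count":
        expr = "COUNT(*)"
    elif spec.aggregation == "count_distinct":
        expr = f"COUNT(DISTINCT {spec.column})"
    else:
        expr = f"{spec.aggregation.upper()}({spec.column})"
    if spec.row_filter is not None:
        expr = f"{expr} FILTER (WHERE {spec.row_filter})"
        if spec.aggregation in _COUNT_STYLE:
            # A filtered count over zero matching rows is 0; map it to NULL so it
            # matches a per-metric query whose WHERE clause would drop the bucket.
            expr = f"NULLIF({expr}, 0)"
    return f"{expr} AS {spec.alias}"
```

SUM, AVG, MIN and MAX already return NULL when no rows pass the filter, so only the count-style aggregates need the extra guard in this sketch.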
UX: a fact table is a reusable data definition (read-only SELECT + introspected columns) that fact metrics are built on — a modeling primitive, not an observation surface. Move it from Observe into Plan alongside events/variables, and reorder Observe to Live activity / Metrics / Monitors / Anomalies / Alerting so 'define' and 'watch' surfaces no longer interleave. resolveNavLocation derives the breadcrumb area from the group, so Fact tables now reads 'Plan > Fact tables'.
UX: fact tables exist only to back fact metrics, so they no longer warrant a standalone top-level nav item. Metrics becomes a tabbed shell (Catalog | Fact tables) with URL-driven tabs (deep-linkable, back/forward works); the catalog body is extracted to MetricsCatalog.tsx and the fact-tables list to FactTablesList.tsx (FactTablesPage removed). The primary action is contextual (New metric / New fact table) and the H1 stays 'Metrics'.
Routing: fact-table routes move under /metrics/fact-tables[/new|/:id/edit]; the old /fact-tables* routes redirect (param-preserving) so links/bookmarks survive. The standalone 'Fact tables' sidebar item is removed; the Metrics nav item matches /metrics* so it stays active on the tab, and the breadcrumb reads 'Observe > Metrics'. Back-links + edit links repointed; tests updated.
Gates: tsc + eslint clean, 54 tests pass across metrics/fact-tables/navigation.
Summary
Ships the Metrics catalog + monitoring epic (tripl-dxhp, 9/9 slices): a metric is like an event but user-defined, monitored with the same anomaly detection + alerting.
A MetricDefinition (global, project-scoped; lifecycle draft/active/archived) is one of three kinds:
…with breakdowns, app-version/platform splits, anomaly monitoring and opt-in alerting.
What's included
- MetricDefinition + 4 enums; metric_values / metric_value_breakdowns; metric anomaly scope (MetricScopeType.metric, nullable MetricAnomaly.scan_config_id, partial unique index). 3 migrations, single head a1b2c3d4e5f6.
- get_time_bucketed_aggregate (+breakdown) across ClickHouse/Postgres/BigQuery; measure_validator (identifier allowlist, SQL-fragment/SELECT safety, UNION ban) preserving the no-bound-params escaping model.
- /projects/{slug}/metrics router.
- collect_metric_definitions worker + composition evaluator (divide-by-zero → gap) + distinct-user denominator; beat check_metric_definitions_due.
- detect_metrics, per-rule include_metrics (opt-in, default off).
Quality
- openapi.json regenerated (additive); graphify graph current.
Test plan
- include_metrics delivers a metric anomaly
Follow-ups (filed)