feat: add overdue periods gauge #400

Draft
silent-cipher wants to merge 9 commits into main from fix/stalled-data-retention-counters

@silent-cipher
Collaborator

@silent-cipher silent-cipher commented Mar 26, 2026

Summary

Adds a new Prometheus gauge metric, pdp_provider_overdue_periods, that tracks the estimated number of unrecorded overdue proving periods per provider in real time. This gauge complements the existing cumulative counters by giving immediate visibility into providers that are behind on submitting proofs, even before the subgraph confirms the faults.

Changes

New Metric

  • pdp_provider_overdue_periods (Gauge): Estimates overdue proving periods by calculating (currentBlock - (nextDeadline + 1)) / maxProvingPeriod for each proof set whose deadline has passed
  • Naturally resets to 0 when providers submit proofs and the subgraph catches up
  • Independent of cumulative counter baselines, emitted on every poll
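
The estimate described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `OverdueProofSet` interface and the `estimateOverduePeriods` function are hypothetical names, and summing the per-proof-set results into one per-provider value is an assumption about how the gauge aggregates.

```typescript
// Hypothetical shape of one overdue proof set as returned by the subgraph.
interface OverdueProofSet {
  nextDeadline: bigint;
  maxProvingPeriod: bigint;
}

// Estimates overdue proving periods for one provider.
// Assumption: per-proof-set estimates are summed into a single value.
function estimateOverduePeriods(
  currentBlock: bigint,
  proofSets: OverdueProofSet[],
): bigint {
  let total = 0n;
  for (const { nextDeadline, maxProvingPeriod } of proofSets) {
    // Ignore proof sets whose deadline has not passed, and guard
    // against division by zero on a malformed period.
    if (nextDeadline >= currentBlock || maxProvingPeriod <= 0n) continue;
    // BigInt division truncates, so only whole elapsed periods count.
    total += (currentBlock - (nextDeadline + 1n)) / maxProvingPeriod;
  }
  return total;
}
```

For example, with currentBlock = 1000, a proof set with nextDeadline = 700 and maxProvingPeriod = 100 contributes (1000 - 701) / 100 = 2 after truncation, while a proof set with nextDeadline = 999 contributes 0.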

Subgraph Query Enhancement

  • Extended GET_PROVIDERS_WITH_DATASETS query to fetch proofSets with overdue deadlines
  • Added blockNumber parameter to filter proof sets where nextDeadline < currentBlock
  • Fetches nextDeadline and maxProvingPeriod per proof set
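
A rough sketch of what such a query might look like is shown below. Only `proofSets`, `nextDeadline`, `maxProvingPeriod`, and `blockNumber` come from this description; the query name, the `providers` entity, the `address` field, the `$addresses` variable, and the `_in` / `_lt` filter suffixes are assumptions based on common subgraph conventions, not the actual GET_PROVIDERS_WITH_DATASETS query.

```typescript
// Hypothetical query text. Everything except proofSets, nextDeadline,
// maxProvingPeriod and blockNumber is an assumption for illustration.
const PROVIDERS_WITH_OVERDUE_PROOF_SETS = `
  query ProvidersWithOverdueProofSets($addresses: [Bytes!], $blockNumber: BigInt!) {
    providers(where: { address_in: $addresses }) {
      address
      proofSets(where: { nextDeadline_lt: $blockNumber }) {
        nextDeadline
        maxProvingPeriod
      }
    }
  }
`;

// Builds the variables object for the query above.
// BigInt values are serialized as strings so they survive JSON encoding.
function buildOverdueQueryVariables(
  addresses: string[],
  currentBlock: bigint,
): { addresses: string[]; blockNumber: string } {
  return { addresses, blockNumber: currentBlock.toString() };
}
```

Filtering on the subgraph side keeps the response limited to proof sets that can contribute to the gauge, so the poller does not need to discard non-overdue entries itself.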

closes #374

Copilot AI review requested due to automatic review settings March 26, 2026 08:33
@FilOzzy FilOzzy added this to FOC Mar 26, 2026
@github-project-automation github-project-automation bot moved this to 📌 Triage in FOC Mar 26, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds a real-time Prometheus gauge to complement existing PDP data-retention counters by estimating overdue proving periods per provider from subgraph deadlines.

Changes:

  • Introduces pdp_provider_overdue_periods gauge and emits it on every data-retention poll.
  • Extends PDP subgraph providers query to include overdue proofSets filtered by blockNumber.
  • Updates validation/types and adds/extends unit tests for the new query fields and gauge behavior.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| docs/checks/events-and-metrics.md | Documents the new pdp_provider_overdue_periods metric. |
| docs/checks/data-retention.md | Describes overdue estimation logic and how the new gauge differs from counters. |
| apps/backend/src/pdp-subgraph/types.ts | Adds blockNumber option and proofSets typing/validation. |
| apps/backend/src/pdp-subgraph/types.spec.ts | Extends validation tests for proofSets. |
| apps/backend/src/pdp-subgraph/queries.ts | Adds blockNumber variable and proofSets selection/filtering. |
| apps/backend/src/pdp-subgraph/pdp-subgraph.service.ts | Threads blockNumber through provider fetch requests and retries. |
| apps/backend/src/pdp-subgraph/pdp-subgraph.service.spec.ts | Updates service tests for the new query variable and response shape. |
| apps/backend/src/metrics-prometheus/metrics-prometheus.module.ts | Registers the new gauge metric provider. |
| apps/backend/src/data-retention/data-retention.service.ts | Computes overdue estimate and emits gauge; adds safe BigInt gauge setter. |
| apps/backend/src/data-retention/data-retention.service.spec.ts | Adds tests for gauge emission, cleanup removal, and large-value handling. |
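
The "safe BigInt gauge setter" mentioned for data-retention.service.ts is not shown on this page. One plausible shape is sketched below, under the assumption that out-of-range values are clamped to the safe integer range before being written. `GaugeLike`, `toSafeGaugeValue`, and `setGaugeFromBigInt` are hypothetical names; `GaugeLike` only mirrors the `set(labels, value)` call shape of a Prometheus gauge client so the sketch stays self-contained.

```typescript
// Minimal stand-in for a gauge that accepts labels and a numeric value.
interface GaugeLike {
  set(labels: Record<string, string>, value: number): void;
}

// Converts a bigint to a number without losing integer precision.
// Assumption: values outside the safe range are clamped, not rejected.
function toSafeGaugeValue(value: bigint): number {
  const max = BigInt(Number.MAX_SAFE_INTEGER);
  const min = BigInt(Number.MIN_SAFE_INTEGER);
  if (value > max) return Number.MAX_SAFE_INTEGER;
  if (value < min) return Number.MIN_SAFE_INTEGER;
  return Number(value);
}

// Writes a bigint estimate to the gauge using the clamped value.
function setGaugeFromBigInt(
  gauge: GaugeLike,
  labels: Record<string, string>,
  value: bigint,
): void {
  gauge.set(labels, toSafeGaugeValue(value));
}
```

Clamping keeps the poll loop from throwing on pathological inputs; whether the actual implementation clamps, skips, or logs such values is not stated here.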

@silent-cipher silent-cipher requested a review from SgtPooki March 26, 2026 15:15
@BigLep BigLep moved this from 📌 Triage to 🔎 Awaiting review in FOC Mar 26, 2026
Collaborator

@SgtPooki SgtPooki left a comment

very quick look before vacation, but mostly lgtm.. @juliangruber can you take a peek as well?

@BigLep BigLep requested a review from juliangruber March 31, 2026 00:16
@github-project-automation github-project-automation bot moved this from 🔎 Awaiting review to ⌨️ In Progress in FOC Mar 31, 2026
@silent-cipher
Collaborator Author

We should hold off on merging this until the pdp-subgraph url in infra is updated to support dataset lifecycle tracking.

@juliangruber juliangruber marked this pull request as draft April 3, 2026 14:36
@juliangruber
Member

> We should hold off on merging this until the pdp-subgraph url in infra is updated to support dataset lifecycle tracking.

Converted to draft to prevent accidental merge. Is there an issue for the mentioned work?

@silent-cipher
Collaborator Author

Raised a PR in infra to update to the latest subgraph URL: https://github.com/FilOzone/infra/pull/104

Projects

Status: ⌨️ In Progress

Development

Successfully merging this pull request may close these issues.

Detect stalled data-retention counters (no NextProvingPeriod fired)

6 participants